Unicode formatting being stripped from LO4.1

Hi there,

First of all, I am aware of the other Unicode threads out there. I have looked into This and This on here, but I have a slightly different problem. My main problem is that LO Writer is actually displaying the characters wrong. Specifically, the Mathematical Fraktur characters (𝖀 and 𝕻, Unicode: U+1D580, U+1D57B) are being stripped to their Latin equivalents (U and P). This happens in all fonts within LO and persists when exported to PDF; however, if I save as plain text the characters display correctly. As you can see, they also show up perfectly within the browser.

I am using UberStudent 3.0 Plato (Ubuntu 13.04 + XFCE) x64 with LO 4.1.1.2 from the repos, as I wanted the font embedding option (I use the Ubuntu font family a lot as I find it one of the nicest-looking sans fonts with accented Greek available, and my counterparts use Windows). I did think that this option might be the culprit, so I have tried it with and without font embedding, with no luck.

Is this a bug in the latest version of LO, or am I missing something here? When I was using LO 3.5 these characters displayed perfectly. I would upload some sample files for you to have a look at, but seeing as this is my first post and my karma is <3, I cannot…

Thanks for your help.

I have an interest in this sort of question. Hopefully an upvote will get you some luv. :slight_smile: You could always post files on dropbox or the like in the meantime. This sounds in the first instance like a system-level issue but the symptoms don’t quite point that way, do they?

This is a bug in 4.1 on Linux where some characters missing from the font are rendered using their Unicode compatibility decomposition equivalent (the compatibility equivalent of 𝖀 is plain U) instead of falling back to a different font that supports the character. This is already fixed in fdo#66715 and the fix should be available in the upcoming 4.1.2 release.
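For anyone curious what "compatibility decomposition" means here, a minimal Python sketch follows. It only uses the standard `unicodedata` module to look up the two code points from the question and apply NFKD normalization. The helper name `compat_equivalent` is made up for illustration, and this is not how LibreOffice or HarfBuzz implement fallback; it just shows the Unicode mapping that the buggy code path ended up displaying.

```python
import unicodedata


def compat_equivalent(ch: str) -> str:
    """Return the NFKD (compatibility) decomposition of a single character.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFKD", ch)


if __name__ == "__main__":
    # The two code points mentioned in the question.
    for ch in ("\U0001D580", "\U0001D57B"):
        name = unicodedata.name(ch)
        print(f"U+{ord(ch):04X} {name} -> {compat_equivalent(ch)!r}")
```

Note that the official names of both characters contain "BOLD FRAKTUR", so they belong to the bold Fraktur part of the Mathematical Alphanumeric Symbols block.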

Thanks, It’s nice to know I’m not going nuts! I will await the bugfix eagerly.

This means waiting for 3 weeks according to the release plan: https://wiki.documentfoundation.org/ReleasePlan

If you badly need this functionality then I suggest downgrading to LibreOffice 4.0.5: https://www.libreoffice.org/download/?version=4.0.5

Your problem is a font one and basically the same as described in my answer here. I downloaded the Ubuntu font family v0.80 to check and it only appears to include characters up to U+10048 (and a couple of extras for the mono-width versions). Neither of the characters you indicate is included in this font family. I imagine you are only seeing them as a result of font substitution (probably from FreeSerif, as very few fonts encode this Unicode range).
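If you want to check which installed fonts claim to cover these code points, something along these lines may help on a Linux system with fontconfig. It is only a sketch: it assumes the `fc-list` tool is on the PATH and uses its `:charset=` pattern, and the function names `fc_list_command` and `fonts_covering` are made up for illustration. fontconfig reports what a font declares, so a font that encodes a position without a usable glyph could still be listed.

```python
import shutil
import subprocess


def fc_list_command(codepoint: int) -> list[str]:
    """Build an fc-list query for fonts whose charset contains the code point."""
    return ["fc-list", f":charset={codepoint:x}", "family"]


def fonts_covering(codepoint: int) -> list[str]:
    """Return sorted family names reported by fc-list, or [] if it is unavailable."""
    if shutil.which("fc-list") is None:
        # Assumption: fontconfig is not installed, so nothing can be queried.
        return []
    result = subprocess.run(
        fc_list_command(codepoint),
        capture_output=True,
        text=True,
        check=False,
    )
    families = {line.strip() for line in result.stdout.splitlines() if line.strip()}
    return sorted(families)


if __name__ == "__main__":
    for cp in (0x1D580, 0x1D57B):
        found = fonts_covering(cp)
        print(f"U+{cp:04X}:", found if found else "no installed font reports it")
```

The results depend entirely on which fonts are installed, so the list will differ from machine to machine.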

Maybe worth experimenting with Free Sans - in my brief tests (Ubuntu 13.04) it is the only one of the fonts listed by Alan Wood to include glyphs for the Mathematical Alphanumeric Symbols range natively.

FreeSans (latest version dated 20120503) does not include the indicated range (although the positions are encoded), only FreeSerif does. In this particular case, Khaled’s answer is ultimately right as the bug was with the HarfBuzz font fallback method (decomposition) - the comments in that bug are enlightening to anyone with an interest in font handling.

Thanks for this. Indeed, if I change the font to FreeSerif it does work! Which is fine for the doc I am writing at the moment, as the body text is in Lib. Serif anyway. I don’t know how I missed that font. As the Fraktur P in particular is critical in ancient manuscript textual criticism (what I’m currently doing my paper on), it is nice to have this workaround until the bugfix is pushed! Thanks again…

@ScottC : You might want to have a look at Apparatus SIL, which is a Unicode font too, specialized for this kind of application. The Fraktur Math range is intended for mathematics, of course. Not that anyone would care. :slight_smile:

@oweng : I had Free Serif at first, then edited to Sans in light of OP’s preferences. I should have left it!

@dajare : Apparatus SIL is perfect for what I’m doing, as is its sister, Charis SIL.

I probably should have made my first post more clear (it was late). I generally use Ubuntu for accented Greek. As far as most English documents, and particularly this one, I use a serif, such as Lib. Serif or similar. The Apparatus/Charis pair will suit my needs for Textual Criticism work perfectly. Thanks again.