Aramaic White Square Characters

Note: This question was edited to change the offensive term ‘tofu’ to ‘white square.’

Sometime in the past 2 days my system lost the ability to properly render Aramaic in LibreOffice. Everything is fine in my other programs, but LO Writer is rendering ‘white square,’ and LO Calc is rendering ‘white square’ about half the time and the proper characters the rest of the time.

I’ve used this system for ancient languages for years, and other than regular updates nothing has changed on the system. I’ve run some tests, and this is also affecting Phoenician, but not Hebrew, Arabic, or other RTL scripts.

Does anyone know what could be causing this?

Additionally, when I select the ‘white square’ in LO Writer and go to Format->Character->Font, the proper character is rendered in the pop-up, meaning LO is internally recognizing the character but not rendering it in the document. I assumed there was a problem with the settings in my document; however, older files that show the Aramaic text turn into a string of white squares as soon as I try editing them.

And now for the strangest part: it does not affect doubled characters, meaning (AA) will render on screen, but not (A) or (AB), where A and B stand for Aramaic letters.

System Info:
Linux Mint 19.1 Cinnamon /
Cinnamon Version 4.0.10 /
LibreOffice Version: 5.3.11 /
also confirmed the same problems on LibreOffice Version: 6.4.2.2

Also tested on the following with identical results:
Pop!_OS 20.04 /
KDE Plasma Version 5.18.5 /
KDE Frameworks Version: 5.68.0 /
Qt Version: 5.12.8 /
LibreOffice Version 6.4.5.2

If anyone has experienced this before, I would appreciate any pointers on how I can resolve it without reinstalling my OS. I’ve read everything I could find in ask.libreoffice.org, and several other sites that seemed even vaguely related to the issue, but so far nothing seems relevant. Thanks in advance.

To the site admin, sorry the ‘question’ is not a ‘question.’ I’m not sure how to simplify this question. Feel free to change the ‘question’ to a ‘question’ if you can figure out what that question is.

Tofu is a Japanese dish made from soybeans. How does it relate to Aramaic?

Wikipedia: Tofu, or tofu character, used as Internet slang for any of: glyph for undisplayable character (□), substitution character (�) or substitute character (␚). Sometimes incorrectly used for Mojibake (the garbled text).

@J.Stornoway: don’t use slang! Most contributors are not English-language natives. I initially thought that “tofu” was the name of an Aramaic letter or diacritic.

Select your “tofu”, press Alt+X. You get U+9999. Is it the correct hexadecimal encoding for the expected character? If so, is the correct font selected? By that I mean have you added some direct formatting which would override the font? Try to clear it on a selection with Ctrl+M.
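For anyone who wants to double-check the code point outside LibreOffice, here is a minimal Python sketch of the same idea as Alt+X. It assumes the text was pasted into a Python string; the sample characters (Hebrew alef, Imperial Aramaic aleph, Phoenician alf) are only illustrations and are not taken from the asker's document. It also reports how many UTF-16 code units each character needs. Whether that has anything to do with the rendering problem discussed here is not established in this thread.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, UTF-16 code units) for each character."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"
        # Fall back to a placeholder if this Python build has no name for the character.
        name = unicodedata.name(ch, "<unnamed>")
        # Characters above U+FFFF need two UTF-16 code units (a surrogate pair).
        utf16_units = len(ch.encode("utf-16-le")) // 2
        rows.append((code_point, name, utf16_units))
    return rows

# Illustrative sample only: Hebrew alef, Imperial Aramaic aleph, Phoenician alf.
sample = "\u05D0\U00010840\U00010900"
for row in describe(sample):
    print(*row)
```

If the code point printed for a "white square" falls in the expected block (Imperial Aramaic is U+10840 to U+1085F, Phoenician is U+10900 to U+1091F), the character itself is stored correctly and the problem is on the font side.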

Even if we all here were native speakers of English, using slang is utterly stupid when discussing technical issues because slang words and expressions lack preciseness. Clear language is always better.

My apologies to everyone unfamiliar with the term tofu. I thought it was the common term for the □ character.
@ajlittoz: Yes, the correct hexadecimal encoding is showing, and I was using the same font I have used for years: Liberation Serif. Additionally, as I work in several languages, I have fonts for most languages installed. However, my system has stopped automatically switching to the default font it used to use, Lohit Devanagari, when I copy and paste a section of CTL text. Lohit Devanagari does not include the Unicode range, but it would fall back to another font that did. I haven’t removed any fonts.

Just thought I’d add that this appears to be related somehow to Java, as I got a Java error asking me to select my Java environment at Options=>Advanced. When I turned off Java, it sort-of started working again, but not as it was before.

Previously I had been using the font Liberation Serif, which is still not rendering the letters in question. After turning off Java (in Writer), I can now get the Noto Sans fonts for the Unicode ranges in question to render the text, but I have to set the font manually for each section of text when I switch languages. The ranges are not being automatically detected and rendered by an appropriate font.

Does anyone know how to set (or reset) the default ranges for LibreOffice? I am still having no issues with other software rendering the characters, so I continue to suspect the issue is originating with LO somehow.

Using a font is a personal choice. Writer will never interfere with your choice and change it based on criteria such as script block. Fonts are not required to cover the full Unicode range (~1.1 M characters). The font renderer in the OS will eventually make a substitution when it detects that the font doesn’t contain the needed character, or use the “missing character” glyph otherwise.

You should use consistent paragraph and character styles. I fear from your description that you “direct format” your document, which is the most reliable way to make it next to impossible to format.

  1. Show a sample file.
  2. Is Lohit Devanagari set for CTL text in the respective paragraph style properties?
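One way to answer the second point without clicking through every style is to read the style definitions straight out of the file. The sketch below assumes a standard .odt package (a ZIP archive containing styles.xml and content.xml) and looks for the ODF attribute style:font-name-complex, which holds the CTL font of a style. The function name ctl_fonts is made up for this illustration, and the sketch ignores fonts set only through inheritance.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF namespace for style:* elements and attributes.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

def ctl_fonts(odt_file):
    """Map each style that sets a CTL font to that font's name."""
    found = {}
    with zipfile.ZipFile(odt_file) as archive:
        members = archive.namelist()
        # styles.xml holds named styles; content.xml holds automatic (direct-formatting) styles.
        for member in ("styles.xml", "content.xml"):
            if member not in members:
                continue
            root = ET.fromstring(archive.read(member))
            for tag in ("default-style", "style"):
                for style in root.iter(f"{{{STYLE_NS}}}{tag}"):
                    # Default styles have no name, so label them by family instead.
                    label = style.get(f"{{{STYLE_NS}}}name") or (
                        "default:" + style.get(f"{{{STYLE_NS}}}family", "?")
                    )
                    for props in style.iter(f"{{{STYLE_NS}}}text-properties"):
                        font = props.get(f"{{{STYLE_NS}}}font-name-complex")
                        if font:
                            found[label] = font
    return found
```

Calling ctl_fonts("sample.odt") on a document (the file name is a placeholder) returns a dictionary such as style name to CTL font name. Automatic styles with names like T1 or P1 that carry their own CTL font are a hint that direct formatting is overriding the paragraph style.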

The font renderer in LO changed between 5.3.1.1 and current 7.x. I don’t remember at which version, and this could be the cause of the problem if you don’t style your document correctly.

@gabix. Thank you for link to the Clear Language document, however, it does not clarify what the □ should be called on this particular website. I used the term I knew.

It is called “officially” the missing character glyph in all articles related to fonts and in Unicode terminology.

Sure, the document clarifies that you should use correct terms, not some slang. For the character in question it is merely white square. For special characters, it is always a good idea to search the web (or Wikipedia) for their hexadecimal codes (see ajlittoz’s advice on Alt + X).

@ajlittoz: Thank you. I did not know that OS could override the font used in Writer when it detects a missing character. That is likely where the issue lies.

Regarding formatting: In the past I never needed to select the font as I worked my way through the document. I simply selected Liberation Serif at the beginning, and if I copied any CTL text into the document, I selected it and changed the font to Liberation Serif, and everything always looked right. The .odt files were saving the info correctly, as I was able to generate .pdfs correctly as well. Now, I have to select each section of text separately and set the font.

As LibreOffice wasn’t changed, I’ll look into recent updates from Debian and Ubuntu to see if I can trace the issue. Thanks for informing me about the OS switching fonts.

I have removed the term tofu from the question, and replaced it with the term ‘white square’ as gabix provided a link supporting his claim that this is the name of the symbol. Thanks gabix. In retrospect, I should have googled the symbol before posting, but I thought it was actually named tofu.

Liberation Serif seems to contain only Hebrew (of CTL scripts). Apparently, as suggested above, there was some font substitution that does not work now. Check your paragraph style settings and explicitly select a suitable font for CTL languages.
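On Linux, fontconfig can tell you which installed fonts actually cover a given character. The following is a rough sketch, assuming the fc-list tool from fontconfig is installed and that its ":charset=" pattern takes the code point in hexadecimal. The helper names are invented for this example, and the two code points are just the first letters of the Imperial Aramaic and Phoenician blocks.

```python
import shutil
import subprocess

def fc_list_command(code_point):
    """Build an fc-list call listing font families that cover one code point."""
    # fontconfig expects the code point as hexadecimal without a prefix.
    return ["fc-list", f":charset={code_point:x}", "family"]

def families_covering(code_point):
    """Return sorted family names, or None when fc-list is not available."""
    if shutil.which("fc-list") is None:
        return None
    result = subprocess.run(
        fc_list_command(code_point),
        capture_output=True,
        text=True,
        check=False,
    )
    # One family (or comma-separated family list) per output line; drop duplicates.
    return sorted({line.strip() for line in result.stdout.splitlines() if line.strip()})

if __name__ == "__main__":
    for label, cp in (("Imperial Aramaic aleph", 0x10840), ("Phoenician alf", 0x10900)):
        print(label, families_covering(cp))
```

If the font selected for CTL text in the paragraph style is missing from the printed list, the text only renders when a fallback font is substituted, which matches the behaviour described above.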

Just thought I’d post what I think was the solution to the problem for anyone else that might encounter it. If it wasn’t the actual solution, it resolved the problem either way.

After a few days of looking for issues with the OS (in my case Debian/Ubuntu/Linux Mint) and not finding anything that seemed to be the cause, it occurred to me that one or more fonts may have become corrupted, so I reinstalled all the fonts I use. After a restart, the system is performing like it was before.

Thanks to everyone that has contributed to old OpenOffice and LibreOffice over the years, and thanks to ajlittoz and gabix for offering suggestions on what the problem was.