rendering right-to-left Latin text in logical order

Hello there,

I’m currently building an OpenType typeface for the Engsvanyáli script, as found in the ConScript Unicode Registry. Like the Arabic script, Engsvanyáli is written right-to-left, except for its numbers, which are written left-to-right. The script is used to write the Tsolyáni constructed language. All of the Engsvanyáli characters are in the primary Unicode Private Use Area (PUA) in the Basic Multilingual Plane, between U+E100 and U+E14F. Unlike Arabic characters, PUA characters have no inherent right-to-left directionality: the Unicode Character Database gives them the default left-to-right bidirectional class (L), the same class that Latin letters have.

To ease entry of Engsvanyáli text, the typeface has a ‘ccmp’ OpenType feature, with many rules to substitute Latin glyphs for Engsvanyáli glyphs. For example, the Tsolyáni word “ngangmuru” (“hello”) is typed with those nine Latin letters, and is rendered as seven Engsvanyáli glyphs, viz: ng.init a.comb ng.medi m.medi u.comb r.fina u.comb. [Like the Arabic script, most consonants in the Engsvanyáli script have four forms: init(ial), medi(al), fina(l), and isol(ated). Its vowels have two forms, init(ial) and comb(ining marks for non-initial). For a given Engsvanyáli consonant, the same PUA character is used to represent all of its forms.]
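To make the substitution concrete, here is a minimal Python sketch of the idea. It is not the typeface's actual ‘ccmp’ lookup code: the digraph list, the vowel set, and the form rules (first consonant init, last consonant fina, others medi, lone consonant isol; vowels init only at the start of a word) are assumptions inferred from the single “ngangmuru” example, and the function names are hypothetical.

```python
# Hypothetical model of the Latin-to-Engsvanyáli substitution.
# Inventory and rules are assumptions inferred from one example word.

DIGRAPHS = ("ng",)       # assumed: letter pairs that map to a single glyph
VOWELS = set("aeiou")    # assumed vowel letters


def tokenize(word):
    """Split a Latin string into base letters, matching digraphs greedily."""
    tokens, i = [], 0
    while i < len(word):
        for digraph in DIGRAPHS:
            if word.startswith(digraph, i):
                tokens.append(digraph)
                i += len(digraph)
                break
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def shape(word):
    """Return glyph names such as 'ng.init' for a word in logical order."""
    tokens = tokenize(word)
    consonants = [i for i, t in enumerate(tokens) if t not in VOWELS]
    glyphs = []
    for i, token in enumerate(tokens):
        if token in VOWELS:
            form = "init" if i == 0 else "comb"
        elif len(consonants) == 1:
            form = "isol"
        elif i == consonants[0]:
            form = "init"
        elif i == consonants[-1]:
            form = "fina"
        else:
            form = "medi"
        glyphs.append(f"{token}.{form}")
    return glyphs


print(" ".join(shape("ngangmuru")))
# ng.init a.comb ng.medi m.medi u.comb r.fina u.comb
```

Note that the digraph match depends on the order of the letters: `tokenize("urumgnagn")` yields nine single letters, because “gn” is not “ng”.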

Since PUA characters have no right-to-left directionality of their own, neither LibreOffice’s Right-To-Left text mode (entered with ⇧⌘D on macOS) nor the insertion of the U+200F RIGHT-TO-LEFT MARK (RLM) character can be used to cause “ngangmuru” to be rendered in right-to-left order as “urumgnagn”.

However, the insertion of a U+2067 RIGHT-TO-LEFT ISOLATE (RLI), U+202B RIGHT-TO-LEFT EMBEDDING (RLE), or U+202E RIGHT-TO-LEFT OVERRIDE (RLO) character can be used to render “ngangmuru” as “urumgnagn”, depending on operating system support. [The U+2069 POP DIRECTIONAL ISOLATE (PDI) character marks the end of the right-to-left area for RLI, and U+202C POP DIRECTIONAL FORMATTING (PDF) character marks the end of the right-to-left area for both RLE and RLO.]
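For anyone who wants to reproduce this, the following Python sketch builds the wrapped strings and prints the bidirectional class that Python's bundled Unicode database reports for each character. The helper names are hypothetical, and U+E100 is used only because it is the first code point of the Engsvanyáli range mentioned above; what an application actually does with these controls still depends on its own bidi support.

```python
import unicodedata

# Directional formatting characters named in the post.
RLI, PDI = "\u2067", "\u2069"                 # isolate and its terminator
RLE, RLO, PDF = "\u202b", "\u202e", "\u202c"  # embedding, override, terminator


def wrap_rlo(text):
    """Wrap text in RIGHT-TO-LEFT OVERRIDE ... POP DIRECTIONAL FORMATTING."""
    return f"{RLO}{text}{PDF}"


def wrap_rli(text):
    """Wrap text in RIGHT-TO-LEFT ISOLATE ... POP DIRECTIONAL ISOLATE."""
    return f"{RLI}{text}{PDI}"


def describe(text):
    """List (code point, bidi class) pairs as reported by unicodedata."""
    return [(f"U+{ord(c):04X}", unicodedata.bidirectional(c)) for c in text]


# Inspect an RLO-wrapped Latin string plus one PUA code point.
for code_point, bidi_class in describe(wrap_rlo("ng") + "\ue100"):
    print(code_point, bidi_class)
```

The wrapped string can be pasted into a document to test which of the three mechanisms a given LibreOffice and operating system combination honours.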

Because I use LibreOffice 6.2.8.2 on OS X 10.9 Mavericks—the latest version of each that my old computer supports—RLI and RLE are not supported, but RLO is. When I type “ngangmuru” in a Latin typeface, and change the typeface to the Engsvanyáli one, the Engsvanyáli glyphs are faithfully rendered, but in left-to-right order, since the underlying characters are Latin. When I type “ngangmuru” in a Latin typeface followed by the (invisible) PDF character, and then insert the (invisible) RLO character at the beginning of the string, the text is correctly rendered as “urumgnagn”. However, when I then change the text’s typeface to the Engsvanyáli one, the Engsvanyáli glyphs are rendered incorrectly, as the nine glyphs u.init r.init u.comb m.medi g.medi n.medi a.comb g.fina n.fina.

It seems as though LibreOffice’s text parser is applying OpenType features to right-to-left Latin text in the rendering order rather than in the text’s logical order. This isn’t surprising, since a Latin typeface with a “fi” ligature would typically display e.g. the reversed word “fin” with the three glyphs n i f rather than the two glyphs n fi. However, without the ability to render reversed text in logical order, a right-to-left script in a PUA such as Engsvanyáli would be unusable in LibreOffice by entering left-to-right characters and simply changing the typeface.
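The difference between the two orders can be illustrated with a toy substitution pass in Python. This is only a model of the behaviour I am describing, not LibreOffice's or any shaping library's actual code; the ligature table and the glyph name `f_i` are hypothetical.

```python
# Toy model: apply a ligature substitution either before or after reversal.

LIGATURES = {"fi": "f_i"}  # hypothetical ligature table


def substitute(chars):
    """Replace known two-letter sequences with ligature glyphs, left to right."""
    glyphs, i = [], 0
    while i < len(chars):
        pair = chars[i:i + 2]
        if pair in LIGATURES:
            glyphs.append(LIGATURES[pair])
            i += 2
        else:
            glyphs.append(chars[i])
            i += 1
    return glyphs


logical = "fin"

# Substitute in logical order, then reverse the glyphs for display.
shape_then_reverse = substitute(logical)[::-1]

# Reverse the characters first, then substitute (the behaviour observed above).
reverse_then_shape = substitute(logical[::-1])

print(shape_then_reverse)  # ['n', 'f_i']
print(reverse_then_shape)  # ['n', 'i', 'f']
```

For Engsvanyáli, only the first ordering would keep digraphs such as “ng” and the positional forms intact.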

Is there a way to direct LibreOffice to render right-to-left Latin text according to its logical order rather than its rendering order?

I think you should submit a bug report, as your request largely exceeds the skills of the volunteer users answering here. It is possible that this affects not only LO but also the underlying font renderer(s); I am not sure that the same font renderer is used on all platforms.

It’s certainly possible that rendering right-to-left Latin text in logical order might require changes to both LibreOffice and the font-rendering libraries that it depends on across its supported platforms. Since I’m familiar with neither the LibreOffice source code nor which font-rendering libraries it uses, though, I can’t state that as a fact.

It seems to me that if such changes are needed, it would be more of a feature request than a bug report. Are feature requests also submitted through the Bugzilla system that you’d linked to?

[As an aside, in this commenting system, it seems as though paragraph spacing in replies is different from paragraph spacing in original comments.]

Yes. TDF Bugzilla is THE channel to reach developers on identified bugs, feature requests, enhancements, …

The site engine is Discourse, which is HTML-based. HTML rules state that any sequence of whitespace is collapsed to a single “whitespace”. Therefore, several spaces end up as a single space; a mixture of spaces and linebreaks ends up as a single linebreak (the “stronger” form of the two).

If you want some spacing between your paragraphs, prefix the second one with <br>. And since <br> is not “whitespace” (this HTML tag is made of 4 non-space characters), a sequence of <br> will never be reduced to a single occurrence.
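As a rough illustration of that rule, here is a small Python sketch. It is a simplified model, not Discourse's actual processing: it assumes every run of whitespace is replaced by a single space, and the function name is hypothetical.

```python
import re


def collapse_whitespace(text):
    """Simplified model: replace each run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text)


plain = "First paragraph.\n\n\nSecond paragraph."
with_breaks = "First paragraph.<br><br>\n\nSecond paragraph."

print(collapse_whitespace(plain))
# First paragraph. Second paragraph.
print(collapse_whitespace(with_breaks))
# First paragraph.<br><br> Second paragraph.
```

The blank lines disappear in both cases, but the `<br>` tags survive because they are ordinary characters as far as the collapsing step is concerned.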

My original comment does not seem to have been fully treated as HTML. I’d used an <a> element for its only link, so that component was interpreted as HTML, but two consecutive newlines between each of its paragraphs were consistently treated as plaintext rather than as HTML sequences of whitespace. (I’d copied its text from a text editor, so perhaps that had something to do with how it was treated.)

Yes, because pure HTML would allow XSS attacks, for example. Consequently, several tags are filtered out. Links can be added using the toolbar button that looks like chain links. This allows Discourse to make a few checks (though it does not completely block spam). Underline is also stripped (I don’t understand why).

There is a configuration glitch in AskLO: answers (solutions) are not handled the same way as comments. In answers, multiple linebreaks are kept, while in comments the W3C merge rule applies.

Then it would seem as though original comments (“questions”?) are also affected by the same configuration glitch that answers (solutions) are.

Perhaps surrounding each paragraph in a reply with a `<‌p>` element would be a more semantic approach than adding `<‌br>` elements between them?


[Nope—it didn’t work above.]

While it is a good idea on the one hand, it will not help @gratisserie, as I don’t think any changes will be made for his version:

It’d be highly unlikely; many software developers don’t consider that older versions of their software are still in active use, and so no longer care to support those versions.


However, if this were added as a new feature to current versions of LibreOffice (and the font rendering libraries that it uses), it could help people that are able to use current versions.

LibreOffice Release Plan

I don’t have the money to hire a LibreOffice Certified Developer to update my “living dead” version of LibreOffice.

Is upgrading to a version that might receive the enhancement not an option?

A lot of users have “old” Macs, where you can get no newer OS from Apple, and LO-version and OS-version are correlated.
@gratisserie already mentioned

I expect the same for Win10 at some point in the future, as I have already found some questions concerning Win7, for which LO support was dropped this year.

Thank you for confirming developers have no idea of the “outside world”. I was recently told similar stuff.
I thought the reason for dropping Win7 support sounded understandable. (LO contains components of Python, and the Python community dropped Win7 support. So TDF would need to take on the job that Python developers will not do… Work spent there may be better invested in other areas.) macOS has similar constraints.
But with free speech etc., you are welcome to just think “we” don’t care. (I don’t develop LO code, but I’m a programmer, therefore “we”.)

Ah, I missed that in all the wordage here :slight_smile: