Typing Direction Options

Jaqaliah · January 15, 2016, 12:02am

One of my avocations is linguistics, which has long been ill-supported by word processing programs, all of which assume either a Roman style of left-to-right or a Semitic style of right-to-left composition. Not all languages work this way, and for artistic and poetic purposes what I am about to propose would also give writers more room for expression. It may also help in reproducing and using traditional languages around the world. With a language becoming extinct roughly every few weeks, this is potentially critical support for slowing or halting that trend. If I could code I would do this myself, but I cannot.

What would be helpful is a formatting control in the same place as the left, center, right, and justified alignment options, but with more choices. The option sets would be: Horizontal or Vertical; Top to Bottom or Bottom to Top; and Boustrophedon, Right to Left, or Left to Right.

Languages and types of languages that would benefit from this attention:

Ancient Berber: R->L in columns B->T.
Batak: R->L in columns B->T.
Chinese: R->L in columns T->B, L->R in rows, or R->L in rows. In publications with columnar text headlines and titles appear R->L across the top of the text.
Chữ-nôm: R->L in columns T->B.
Egyptian Hieroglyphics: Columns T->B, L->R in rows, or R->L in rows.
Elamite, Old: L->R in columns T->B.
Etruscan: Boustrophedon and R->L horizontal.
Hanuno’o: R->L in columns B->T.
Japanese: Originally R->L in columns T->B. Horizontal forms began after 1868.
Korean: Originally written R->L in columns T->B.
Kulitan: R->L in columns T->B.
Linear B: Boustrophedon.
Manchu: L->R in columns T->B.
Mayan: written in columns each containing two glyphs and each glyph comprised of various combined parts.
Meroïtic (Hieroglyphic script): R->L in columns T->B.
Mongolian: L->R in columns T->B.
Nushu: R->L in columns T->B.
Oirat Clear Script: L->R in columns T->B.
Phags-pa: L->R in columns T->B.
Rongo Rongo: Boustrophedon.
Sabaean: Boustrophedon.
Sogdian: L->R in columns T->B.
Sutton Sign Writing: L->R in columns T->B.
Székely-Hungarian Rovás: Boustrophedon.
Tagbanwa: R->L in columns B->T.
Tangut: R->L in columns T->B.
Uyghur: L->R in columns T->B.

KEY to Abbreviations:
Right to Left = R->L
Left to Right = L->R
Top to Bottom = T->B
Bottom to Top = B->T

oweng · March 7, 2016, 10:49am

Thanks for raising this issue. While I completely sympathise with the lack of adequate support for language directionality, especially given the diversity in the language spectrum and the rate of extinction you indicate, this is a difficult issue. Ultimately the problem or lack lies at the file specification (ODF) level, rather than the implementation (LO) level. Typesetting of historical texts is distinctly challenging and often outside the scope of the committees that work to specify the related standards. A basic comparison of the 400 years of typesetting against the HTML specification is revealing. I doubt there will be any adequate answer offered on this forum to the issues touched on. Please continue to raise awareness of these limitations though.

The question here is related to handling LTR/RTL directionality.

petermau · March 7, 2016, 5:24pm

An interesting question, as @oweng has already commented.

LibreOffice, like the Internet, uses Unicode, which currently has about 111,000 registered characters and can potentially cope with more than a million. Some of the languages you cite (Mongolian, Etruscan, Chinese and Japanese, to name a few) are supported in Unicode. Those not currently supported would either need to be added or be separately defined in the Private Use Area, like Klingon or Tengwar. Mayan is the subject of much discussion. Otherwise the languages are, I regret to suggest, actually a non-starter for LibO, except using graphical tools.
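
As a rough illustration of what "supported in Unicode" means in practice, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `describe` and the three sample code points are my own choices for illustration, not anything LibO provides. A character that Unicode has assigned has a name; a private-use code point has no name and reports the category `Co`.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (category)' for a single character."""
    # Unassigned and private-use code points have no name, so pass a default.
    name = unicodedata.name(ch, "<no name>")
    return f"U+{ord(ch):04X} {name} ({unicodedata.category(ch)})"

print(describe("\u1820"))      # a Mongolian letter
print(describe("\U00010300"))  # Old Italic block, used for Etruscan
print(describe("\uE000"))      # first code point of the private-use range
```

Whether any of these actually display depends on the installed fonts, which is a separate question from whether the character is encoded.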

You may find THE UNICODE STANDARD manual a useful reference, as it deals with a number of the issues you have raised. Boustrophedon is a challenging issue. However, some of the top to bottom, bottom to top etc. issues can be assisted by using standard L->R or R->L layout and rotating the characters, often backwards, through 90 degrees. I understand that Mongolian is an example. In that case LibO could provide a partial solution. Otherwise I regret you are looking for a more specialised option than LibO can offer.
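
To make the two layouts concrete, here is a small plain-text sketch in Python. It is only a toy under stated assumptions: the function names `boustrophedon` and `vertical_columns` are hypothetical, it works on a fixed-width character grid, and it reverses strings code point by code point, so it does not mirror glyphs or handle combining marks the way real typesetting would have to.

```python
def boustrophedon(text: str, width: int) -> list[str]:
    """Break text into lines of `width` characters, alternating direction.

    Even-numbered lines (0, 2, ...) read left to right; odd-numbered lines
    are reversed and right-aligned so they read right to left.
    """
    lines = [text[i:i + width] for i in range(0, len(text), width)]
    return [
        line if n % 2 == 0 else line[::-1].rjust(width)
        for n, line in enumerate(lines)
    ]

def vertical_columns(text: str, height: int, right_to_left: bool = True) -> list[str]:
    """Lay text out in top-to-bottom columns of `height` characters.

    Returns the rows of the resulting grid. With right_to_left=True the
    first column is placed at the right-hand edge.
    """
    columns = [text[i:i + height].ljust(height) for i in range(0, len(text), height)]
    if right_to_left:
        columns.reverse()
    return ["".join(col[row] for col in columns) for row in range(height)]

for row in boustrophedon("ABCDEFGHIJ", 4):
    print(row)
for row in vertical_columns("ABCDEF", 3):
    print(row)
```

The bottom-to-top variants in the original list would follow the same pattern by reversing each column before building the rows.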

Do you have a specific language in mind, or is this just a general topic for you? I hope you have success… Peter

Update 8 March 2016

UTF-8 is the full Unicode, not a subset. When necessary, UTF-8 uses multiple 8-bit bytes to represent characters beyond the 7-bit ASCII range. This saves space and avoids a performance loss for users of plain Latin text. UTF-32, on the other hand, uses 32 bits for every character, including US-ASCII and ISO 8859-1, padding out each one. Provided the font contains the characters, LibO's insert special character function will let you insert them.
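
A quick way to see the size difference is to encode a few characters and count the bytes. This is just a sketch in Python; the helper name `encoded_sizes` and the sample characters are my own, and the `-le` codec variants are used so that no byte order mark is added to the counts.

```python
def encoded_sizes(ch: str) -> dict[str, int]:
    """Return the number of bytes `ch` occupies in each encoding."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(ch.encode(enc)) for enc in encodings}

for ch in ("A", "\u00e9", "\u1820", "\U00010300"):
    # ASCII letter, Latin e-acute, Mongolian letter, Old Italic letter
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Every character costs four bytes in UTF-32, while UTF-8 ranges from one byte for ASCII up to four bytes for characters outside the Basic Multilingual Plane.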

Linux systems are based around UTF-8, and you can use Ctrl+Shift+U followed by the hexadecimal code (xxxxxx) to insert any Unicode character. This works even if the font does not support the character inserted; a little rectangle is displayed for missing characters. This is not the same as � (U+FFFD), the replacement character shown for invalid data.
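
The distinction can be shown in a few lines of Python. This is a sketch only, and the helper names `from_hex` and `decode_lossy` are hypothetical. A valid code point entered by its hex value is a perfectly good character whether or not any font can draw it, whereas U+FFFD appears when the underlying bytes could not be decoded at all.

```python
def from_hex(code: str) -> str:
    """Turn a hex code point such as '1820' into the character itself."""
    return chr(int(code, 16))

def decode_lossy(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for any byte that is not valid."""
    return data.decode("utf-8", errors="replace")

mongolian_a = from_hex("1820")         # valid character; display depends on the font
damaged = decode_lossy(b"abc\xff")     # 0xFF can never occur in valid UTF-8

print(repr(mongolian_a))
print(repr(damaged))
```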

LibO 5.1 has added this function for Windows, hence the Windows-specific comment in the What's New for 5.1. … Peter

rautamiekka · March 7, 2016, 5:46pm

LO uses full UTF-8, right?

oweng · March 9, 2016, 9:28am

Apologies. I really should not post after only 2hrs sleep. I got confused over the UTF-16 representation used under Windows. Thanks for the reminder.

petermau · March 9, 2016, 10:52am

No need. It is a complicated question to summarise in a short space. I do read your and other comments with interest. There are huge areas of LibO I have little knowledge of but interest in…Peter

AlexKemp · August 31, 2020, 8:53am