Adding too many Word Joiners

Kusb_Pooh · January 24, 2025, 1:48am

We are from the Hong Kong Society for the Blind, Centralised Braille Production Centre.
We transcribe textbooks into Cantonese braille for visually impaired students in Hong Kong.
Currently, we are testing LibreOffice to edit and format the braille content.

In our testing, we found that some characters wrongly wrap onto the following line.
Here is our example:
[Screenshot: sc1]

This phrase has been transcribed into Cantonese braille. (Example 1)
Characters are delimited meaningfully using no-width optional breaks (shown in pink).

However, when the text approaches the end of a line, some delimited groups are wrongly split across two lines. (Example 2)
You can see that w%1 is split wrongly.

To resolve this, we need to add a word joiner in front of the % symbol. (Example 3)

However, this is not the only case. It also happens with many other characters such as $, ?, !, etc.
Some need the word joiner in front, some need it behind.
This adds extra time to proofreading the braille content.

Is there any setting/option to disable this line-breaking behaviour?
Please advise.
Thank you.

Best Regards,
Sharon

ajlittoz · January 24, 2025, 10:45am

Please give your OS name, exact LO version and save format.

Which character exactly is the “no-width optional break”? Give its Unicode code point if it is not an internal technical formatting mark.

I experimented with Insert>Formatting Mark>Zero-width Space and it gives the expected result.

If you know the encoding for your “delimiter”, you can replace it with U+200B ZERO WIDTH SPACE.
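To find out which invisible characters a piece of text actually contains, a short script can list them. This is only a sketch: the helper names `describe_invisibles` and `normalise_delimiter` are made up for illustration, and it assumes the text has already been copied out of the document into a Python string.

```python
import unicodedata

def describe_invisibles(text):
    """List every format-category (Cf) character with its position,
    code point and Unicode name. Zero-width marks fall in this category."""
    found = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            found.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return found

def normalise_delimiter(text, delimiter):
    """Replace a custom delimiter with U+200B ZERO WIDTH SPACE."""
    return text.replace(delimiter, "\u200b")

sample = "w\u200b%1\u2060?"
for position, code_point, name in describe_invisibles(sample):
    print(position, code_point, name)
```

If the delimiter turns out to be something other than U+200B, `normalise_delimiter(text, delimiter)` swaps it in one pass.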

The Text Flow properties of your paragraph style may also play a role in the behaviour you see. The best thing to do is to attach a sample file (1 page max.) for analysis.

Kusb_Pooh · January 27, 2025, 2:46am

Hello,

My OS is Windows 10 Professional (64-bit).
LO version is 7.3.4.2 (x64).
Save format is .odt.
Please find attached Adding_WJ.odt as a sample file.

Our delimiter is \u200b, the no-width optional break.
In the sample file, the words in bold are the cases where we need to add a word joiner (\u2060).

I have prepared 2 paragraphs. One with word joiner and the other one without.
An extra red “1” is added at the start of each paragraph to illustrate that, without a word joiner, the % character wrongly wraps onto the following line.

As in our example, other characters such as $, ? and ! also need a word joiner.
Some need the word joiner in front, some need it behind.
This adds extra time to proofreading a 100-page braille document.

Is there any other setting/option to prevent a specific character from breaking into the following line wrongly?
Please advise.
Thank you.

Adding_WJ.odt (18.5 KB)

ajlittoz · January 27, 2025, 10:29am

OK, I now understand the issue though I could not find information on the encoding.

I wonder if Writer is the right tool for this. AFAIK Braille rendering uses a monospaced font, so the maximum number of glyphs in a line is constant and known. I don’t think you will ever need variants like bold, italic, font size changes, … Therefore, the added value of Writer is very limited unless you also have non-Braille text in the same document.

Inside Writer, the problem arises from the definition of a word. Since you encode through ASCII characters, the common “Western” definition of a word is used. A word is made of alphanumeric characters. A “boundary” is implicitly defined at both ends of such sequences, e.g. between c and ?, w and %, § and a, … However there is no boundary between 1 and m.
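If the word joiners have to stay, inserting them could at least be automated before the text reaches Writer. The sketch below is based on the simplified model described above (a boundary wherever an ASCII alphanumeric character meets a non-alphanumeric one); it is not the actual line-breaking algorithm Writer uses, so it may add more joiners than strictly needed. The function name `glue_cells` is hypothetical.

```python
ZWSP = "\u200b"  # intended break opportunity (the delimiter)
WJ = "\u2060"    # word joiner: forbids a break at this position

def is_word_char(ch):
    """Simplified 'Western word' test: ASCII letters and digits only."""
    return ch.isascii() and ch.isalnum()

def glue_cells(text):
    """Insert a word joiner at every implicit alphanumeric/symbol boundary,
    leaving explicit delimiters and spaces untouched."""
    out = []
    for prev, cur in zip(text, text[1:]):
        out.append(prev)
        if prev in (ZWSP, WJ) or cur in (ZWSP, WJ):
            continue  # already delimited or already glued
        if prev.isspace() or cur.isspace():
            continue  # ordinary spaces remain break opportunities
        if is_word_char(prev) != is_word_char(cur):
            out.append(WJ)
    out.append(text[-1:])  # last character (empty string if text is empty)
    return "".join(out)

glued = glue_cells("a\u200bw%1")
print([f"U+{ord(c):04X}" for c in glued])
```

Running it twice on the same text adds nothing new, because positions that already carry a joiner are skipped.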

A text editor is an easier tool because it assumes nothing about the file contents, but you have to manage the end of line yourself.

If you want to keep using Writer (perhaps because you have other ordinary text), mimic what a text editor would do. Apply a dedicated paragraph style to your Braille encoding (you already have one, and I suppose the font has been temporarily set to Courier New to enter your text). This style requests a monospaced font (already done). Manage the end of line yourself: this requires less work than adding word joiners everywhere.

The “end of line” can be a line break (Shift+Enter) to keep the logic of paragraphs, or a paragraph break (Enter) if you don’t mind having “paragraphs” that make no sense from a meaning or sentence point of view.
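Managing the end of line yourself can be scripted, since the line width is a fixed number of cells. A minimal sketch, assuming breaks are allowed only at the U+200B delimiters, that spaces count as ordinary cells, and that the width (5 here) is just an example value; `wrap_cells` is a made-up name:

```python
ZWSP = "\u200b"

def wrap_cells(text, width):
    """Greedy wrap for fixed-width braille lines.
    Breaks only at ZWSP delimiters, never inside a delimited group.
    A group longer than `width` is left on its own overlong line."""
    lines, current = [], ""
    for group in text.split(ZWSP):
        if current and len(current) + len(group) > width:
            lines.append(current)  # group does not fit: close the line
            current = group
        else:
            current += group
    if current:
        lines.append(current)
    return lines

for line in wrap_cells("ab\u200bcd\u200bw%1\u200be", 5):
    print(line)
```

The delimiters are dropped from the output because each line is already complete; the resulting lines can then be joined with whichever break (line or paragraph) suits the document.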

You didn't tell how your ASCII encoding was created. I suspect it is an automated process based on an obsolete standard (which maps Braille glyphs to Braille ASCII). This mapping is creating the problem because ASCII characters are associated primarily with English and imply the aforementioned definition of a *word*. Unicode has a dedicated Braille Patterns block, U+2800 to U+28FF, of which the first 64 positions contain the same patterns as Braille ASCII **but in a different order** (so that you can't simply offset your present encoding).
If you could revise your present generator to remap the encoding, this would probably eliminate part of the problem (but I feel you would still need to handle line wrap yourself).
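Such a remapping is a table lookup rather than an offset. The sketch below assumes the generator emits the common North American Braille ASCII table and treats lowercase letters like uppercase; if your Cantonese braille generator uses a different table, the `BRAILLE_ASCII` string must be adjusted. The names are illustrative only.

```python
# Braille ASCII characters ordered by dot pattern:
# dot 1 = bit 0, dot 2 = bit 1, ... dot 6 = bit 5, matching U+2800..U+283F.
BRAILLE_ASCII = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="
assert len(BRAILLE_ASCII) == 64

TO_UNICODE = {ch: chr(0x2800 + i) for i, ch in enumerate(BRAILLE_ASCII)}

def to_unicode_braille(text):
    """Map Braille ASCII to Unicode braille patterns.
    Characters outside the table (e.g. U+200B delimiters) pass through."""
    out = []
    for ch in text:
        key = ch.upper() if ch.isascii() else ch
        out.append(TO_UNICODE.get(key, ch))
    return "".join(out)

converted = to_unicode_braille("w%1")
print(converted, [f"U+{ord(c):04X}" for c in converted])
```

Note that a space becomes U+2800 BRAILLE PATTERN BLANK in this sketch; whether that or an ordinary space is wanted for line wrapping is a separate decision.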