Soft-hyphen management

You can hint hyphenation in Writer by inserting U+00AD SOFT HYPHEN (Ctrl+-). If the break opportunity is not used, the soft hyphen does not print. If line wrapping chooses to hyphenate a word flagged this way, a hyphen is printed at the end of the line to signal the hyphenation to the reader.

I am presently working on an old font which was created in the ISO-8859 era and was converted to Unicode without further consideration. Consequently, many character slots do not abide by Unicode semantics and must be reallocated to where they now belong.

One of the problematic allocations is the pair U+002D HYPHEN-MINUS and U+00AD SOFT HYPHEN, which are drawn with very uncommon “dashes”.

When hyphenation occurs, does Writer replace SOFT HYPHEN with HYPHEN-MINUS in the output or does it use the glyph provided at U+00AD?

In other words, is SOFT HYPHEN only a function indicator or a real printing character?

An answer to this question could greatly simplify my upgrade task.

In my test, Writer replaces soft hyphens with U+002D in the output. To check that, I inserted a soft hyphen in a short document and then exported the text to PDF. I then copied the resulting hyphen from the PDF and checked it both in Writer itself and in KCharSelect: it is a hyphen-minus.
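For anyone who wants to repeat this check without KCharSelect, here is a minimal sketch that reports the code point and Unicode name of each character in a pasted string. The function name `describe` and the sample string are only illustrations; paste your own text copied from the PDF instead.

```python
import unicodedata

def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# Hypothetical sample: replace with the characters copied from the exported PDF.
pasted = "hyphen-"
for code, name in describe(pasted):
    print(code, name)
```

Note that this only shows what the PDF viewer hands over on copy, which is not necessarily proof of which glyph was drawn on the page.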

@RGB-es: thanks for experimenting. I’d prefer to have an answer from a developer, though.

To make sure, I changed the font and put a fancy glyph at U+00AD, but FontForge seems to enforce consistency and copies the glyph at U+002D into U+00AD when generating the font. I therefore have no visual feedback.
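One way to confirm what ended up in the generated font is to look at its character map directly. The following is only a sketch: it assumes the third-party fontTools package is installed, and the file name in the usage comment is hypothetical. `glyphs_for` is an illustrative helper, not part of any library.

```python
def load_cmap(font_path):
    """Read the preferred Unicode cmap of a font file (needs the fontTools package)."""
    from fontTools.ttLib import TTFont  # lazy import: only needed when a file is read
    return TTFont(font_path).getBestCmap()

def glyphs_for(cmap, codepoints=(0x002D, 0x00AD)):
    """Map each code point to its glyph name, or None if the font has no entry for it."""
    return {f"U+{cp:04X}": cmap.get(cp) for cp in codepoints}

# Usage, with a hypothetical file name:
# print(glyphs_for(load_cmap("MyOldFont.ttf")))
```

If both code points report the same glyph name, the two slots share one glyph in the generated file. Different names do not by themselves prove that the outlines differ.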

Section 6.2 of the Unicode Standard says that SOFT HYPHEN is a format control character and thus has no visible glyph of its own. Section 23.2 seems to hint that application software is allowed to use HYPHEN-MINUS when it decides to break a word at that point, but this is not mandatory.
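The format-control status can also be read from the Unicode character database. This is a small sketch using Python's standard `unicodedata` module; the helper name `categories` is just for illustration, and U+2010 HYPHEN is included only for comparison.

```python
import unicodedata

def categories(chars):
    """Return the Unicode general category of each character, keyed by code point."""
    return {f"U+{ord(ch):04X}": unicodedata.category(ch) for ch in chars}

# HYPHEN-MINUS, SOFT HYPHEN, and HYPHEN for comparison
for code, cat in categories("\u002d\u00ad\u2010").items():
    print(code, cat)
```

SOFT HYPHEN is reported as `Cf` (format), whereas HYPHEN-MINUS and HYPHEN are `Pd` (dash punctuation). This says nothing about what Writer actually draws at a break, only how Unicode classifies the characters.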

All in all, this is not of great help, but it is likely that I can ignore the SOFT HYPHEN/HYPHEN-MINUS dilemma in my font reorganisation job.