The "Find" box doesn't accept the same font and Keyman combinations as the body text does

I am typing a Mon-based script using the Padauk font and a Keyman keyboard. The language uses the Mon/Burmese characters but doesn’t follow the Burmese language rules. I have used this combination, including the “Find” function, for years, but for some reason the “Find” box no longer lets me type many of the word combinations. When I search a document, it seems to match on just the consonants, so it returns a lot of matches that don’t actually match the word. That makes the “Find” function pretty much useless.

You didn’t mention which OS you’re on, your LO version, or the save format. If your documents have been edited for years under Windows with “special” fonts, you may be facing an issue with “converted legacy fonts”. Before Unicode, Windows used custom 256-character fonts. When the switch to Unicode happened, those fonts were converted quick’n’dirty by moving the upper 128 characters into an encoding range called the Private Use Area, likely with patches to the keyboard driver so as not to break habits. However, this was supposed to be temporary, because Unicode defines encoding ranges for all scripts in the world (and even for historical scripts like hieroglyphs or cuneiform).

This legacy trick is doomed to fail in this Unicode era.

Without more details on your configuration and document, it is impossible to diagnose anything. Please attach your document or a reduced sample of it for analysis.

Sorry, this is all new to me. I’m using Windows 11 and LO 7.6, and I save the documents as .odt. I’ll attach a sample, but I don’t know if it will even display correctly. I only use the Padauk font, and the Keyman keyboard was made for me by the man who headed up the creation of the Padauk font. Everything should be in Unicode. When I say I have used it for years, I guess it would be five or six years now. My problems developed about a year ago. For example, if I type လေါဟ် in the Find box, it comes up as လေဟ်. If I search for ဖိုဟ်, it “matches” ဖၠဟ် and ဖေါဟ် as well.
Sample.odt (69.8 KB)

I checked that your text is indeed Unicode-compliant. All its characters are taken from the Myanmar Unicode block U+1000 to U+109F. I also checked that the Padauk font is not a legacy font but a real Unicode-compliant font.
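
For anyone who wants to repeat this kind of check on their own text, here is a minimal Python sketch. It only counts how many characters fall in the Myanmar block versus the Private Use Area. The helper name `classify` and the sample strings are mine, purely for illustration; it is not how LibreOffice or Keyman inspect text.

```python
# Hypothetical helper for illustration; not part of LibreOffice or Keyman.
MYANMAR_BLOCK = range(0x1000, 0x10A0)     # U+1000..U+109F
PRIVATE_USE_AREA = range(0xE000, 0xF900)  # U+E000..U+F8FF (BMP Private Use Area)


def classify(text):
    """Count characters by range: Myanmar block, Private Use Area, anything else."""
    counts = {"myanmar": 0, "pua": 0, "other": 0}
    for ch in text:
        cp = ord(ch)
        if cp in MYANMAR_BLOCK:
            counts["myanmar"] += 1
        elif cp in PRIVATE_USE_AREA:
            counts["pua"] += 1
        else:
            counts["other"] += 1
    return counts


# Assumed code point sequence for one of the words quoted in this thread.
sample = "\u101c\u1031\u102b\u101f\u103a"
print(classify(sample))
```

A non-zero `pua` count would point to a converted legacy font; a text made only of `myanmar` characters (plus spaces and punctuation under `other`) is what a Unicode-compliant document should give.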

In your first example (ဆ်ုမါလေ), what gets retained in the find box (same in Edit>Find & Replace) is the string without U+102B MYANMAR VOWEL SIGN TALL AA, even if I paste the string.
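
To see exactly which character goes missing, it helps to list the code points of what was typed and of what the box retained. Below is a small sketch using Python's standard `unicodedata` module. The function name `dump_code_points` and the two strings are assumptions of mine (the strings are my reading of the လေါဟ် example, written as escapes), not output captured from LibreOffice.

```python
import unicodedata


def dump_code_points(text):
    """Return one 'U+XXXX NAME' string per character of the text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# Assumed sequences: what is typed, and what the Find box appears to keep.
typed = "\u101c\u1031\u102b\u101f\u103a"
retained = "\u101c\u1031\u101f\u103a"

for line in dump_code_points(typed):
    print(line)

# Characters present in the typed string but absent from the retained one.
dropped = [line for line in dump_code_points(typed)
           if line not in dump_code_points(retained)]
print("dropped:", dropped)
```

Pasting the text copied from the document and from the Find box into such a script would make a bug report much easier to reproduce.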

I suspect the Myanmar script is not correctly handled by Writer: it is likely that all combining characters (a technical term for accents and other diacritics) are removed in this dialog and the matching algorithm also ignores them.
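
To illustrate why that would produce false matches, here is a rough sketch of the suspected behaviour. It is only a model of my hypothesis, not LibreOffice's actual search code: the function `strip_marks` is hypothetical and simply drops every character whose Unicode general category is a mark (Mn, Mc or Me). The two words are assumed code point sequences for ဖိုဟ် and ဖေါဟ်.

```python
import unicodedata


def strip_marks(text):
    """Drop every combining mark (general category starting with 'M')."""
    return "".join(ch for ch in text
                   if not unicodedata.category(ch).startswith("M"))


# Two different words that share the same consonants (assumed sequences).
word_a = "\u1016\u102d\u102f\u101f\u103a"
word_b = "\u1016\u1031\u102b\u101f\u103a"

print(word_a == word_b)                            # the words differ
print(strip_marks(word_a) == strip_marks(word_b))  # but their consonant skeletons agree
```

If the dialog compared strings in a way similar to this, any two words with the same consonants would count as a match, which is what you describe.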

Submit a bug report and mention its number in a comment below as tdf#1234 where “1234” is the bug number. Attach your sample file to the report and explain in detail the difference between what you expect and what is returned so that developers can reproduce it and analyse the case.

Note that the Find box doesn’t show the U+102B, but it is there.
I can’t see the symbol in the text, but I can find it (it is later converted to U+102B).
There are also some other words without U+102B (shown with a yellow background).