
Bidirectional text and closing bracket bug. [closed]

asked 2013-06-12 17:42:57 +0100

jenka1980

updated 2015-08-31 08:03:07 +0100

Alex Kemp

When writing a sentence that mixes LTR and RTL languages, there is a problem with the closing bracket. It occurs when the closing bracket belongs to text whose direction is opposite to the paragraph direction, and the text immediately after the bracket is in the paragraph direction.

EDITED: Here is an image produced on Windows 7 with LO 4.0.3.3 using the "Lucida Sans Unicode" TTF font: [image]

When using the "Linux Biolinum G" font, things go wild. (Here I specifically used the Hebrew characters "דהו" and the English "def", which are supposed to be inside the parentheses.) [image]

And here is how it should be (using MS Word): [image]

BiDi text (English\Hebrew for testing):

LTR paragraph

abc אבג (דהו)

abc אבג (דהו) abc

abc (אבג)

abc אבג (דהו) אבג

RTL paragraph

אבג abc (def)

אבג abc (def) אבג

אבג (abc)

אבג abc (def) abc
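
One possible reading of why the closing bracket in the second LTR test line gets misplaced: under the neutral-resolution rules N1 and N2 of the Unicode Bidirectional Algorithm (UAX #9), as they stood before paired-bracket handling was added in Unicode 6.3, a run of neutral characters takes the direction of the surrounding strong characters only when both sides agree, and otherwise takes the paragraph direction. The Python sketch below is a deliberate simplification: it treats every non-strong character as neutral and ignores weak-type rules, embeddings, and mirroring. `resolve_neutrals` is a hypothetical helper name, and nothing here is a claim about how LibreOffice actually lays out text.

```python
import unicodedata


def resolve_neutrals(text, paragraph_dir):
    """Simplified N1/N2: give each non-strong character a direction.

    paragraph_dir is "L" or "R". Returns one "L"/"R" entry per character.
    Assumption: anything that is not L, R or AL is treated as neutral.
    """
    classes = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            classes.append("L")
        elif bidi in ("R", "AL"):
            classes.append("R")
        else:
            classes.append(None)  # neutral for the purposes of this sketch

    resolved = list(classes)
    i, n = 0, len(classes)
    while i < n:
        if classes[i] is not None:
            i += 1
            continue
        # Find the end of this run of neutrals.
        j = i
        while j < n and classes[j] is None:
            j += 1
        # Paragraph boundaries count as the paragraph direction.
        before = classes[i - 1] if i > 0 else paragraph_dir
        after = classes[j] if j < n else paragraph_dir
        direction = before if before == after else paragraph_dir
        for k in range(i, j):
            resolved[k] = direction
        i = j
    return resolved


# Second LTR test line, written with escapes to keep logical order obvious:
# "abc" + Hebrew alef-bet-gimel + "(" + Hebrew dalet-he-vav + ")" + "abc"
line = "abc \u05d0\u05d1\u05d2 (\u05d3\u05d4\u05d5) abc"
dirs = resolve_neutrals(line, "L")
print(dirs[line.index("(")])  # R: Hebrew on both sides of the "(" run
print(dirs[line.index(")")])  # L: Hebrew before, English after, so paragraph direction
```

In this simplified model the opening bracket joins the RTL run while the closing bracket falls back to the LTR paragraph direction, which matches the symptom described above.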


Closed by Alex Kemp for the following reason: the question is answered, right answer was accepted.
Close date: 2016-03-01 21:28:58.437012

Comments

Can you please indicate your platform, LO version, and particular font you are experiencing this issue with? Thanks.

oweng (2013-06-13 03:50:47 +0100)

I'm using Windows 7, LO 4.0.3.3, and all Hebrew-supporting TTF fonts (Aharoni, Arial, David, FrankRuehl, Gisha, Levenim MT, Lucida Sans Unicode, Miriam, Narkisim, Rod, Tahoma, Times New Roman). There is a special problem with the Linux Biolinum G and Linux Libertine G fonts, as you can see in the screenshots that I added to the question.

jenka1980 (2013-06-13 11:36:59 +0100)

Plus, I'll try it on Linux Mint when I'm back home.

jenka1980 (2013-06-13 11:49:22 +0100)

Thank you for expanding on your question and for providing the graphics. Much clearer. Based on the fonts you are using I imagine the experience under Linux Mint will be similar, but I will be interested in your findings either way.

oweng (2013-06-13 12:29:42 +0100)

1 Answer


answered 2013-06-13 03:49:40 +0100

oweng

updated 2014-08-09 14:53:57 +0100

This is a problem with a difficult history. It appears to depend on several factors, including the platform, locale, font (and font technology), and in particular the entry sequence of Unicode directional formatting codes. I have almost completely rewritten my original pair of answers, both for better accuracy and to provide a better understanding of how Complex Text Layout (CTL) is "handled" within LO/ODF and what your particular issue may be.

While there are associated bugs, I am no longer certain that this is a bug. It seems more likely to me that the particular issue raised by this question relates to how characters are input and the correct sequence for this. This is, however, a technical matter with scope for great variance and probable improvement, at least in terms of User Experience (UX). I refer here in particular to the comments in fdo#61795, which indicate that a reliance on Unicode directional formatting codes, while technically correct, is not necessarily the most user-friendly approach.

Bugs

This is a summary, and probably not a comprehensive one, of related bugs at the present time. They appear to be inter-related as well as varied in their relation to the font and Unicode directional formatting code aspects. To what degree each is a genuine bug is IMO a matter for the developers.

  • fdo#33302, RTL text: parentheses and brackets "(...) [...]" inverted to ")...( ]...[" with some fonts. MacOSX only. Appears to be the original bug and contains lots of detail. Refer to comments #15, #18, #19, #22-23, and #35. Verified as fixed for v4.1beta1 (as a result of patches to other bugs, e.g., fdo#59892).
  • fdo#56408, Brackets are not handled correctly with mixed English/Latin and Hebrew/Arabic texts. Still open with no clear resolution.
  • fdo#60533, Brackets (..),{..},[..] inverted )..(,}..{,]..[ when switch to RTL text direction with all fonts. Still open and present status is unclear. A patch was submitted for v4.0.3 but it caused other bugs so was reverted.
  • fdo#60534, Brackets (..),{..},[..] inverted to )..(,}..{,]..[ when switch to RTL text direction with Graphite fonts only. Companion bug to fdo#60533 but unlike fdo#33302 this one applies to all platforms. Still open and present status indicates this appears to be a difficult bug that apparently requires a rework of the LO Graphite integration code.[1]
  • fdo#61795, Weak Characters (like brackets) are mispositioned with mixed RTL and LTR. Still open and bug is marked as relating to fdo#56408 but it is unclear to what degree it is a duplicate of the other bugs listed here. It is worth noting that brackets are not classified as "weak" by Unicode, but rather as "neutral" so the title of this bug at least is wrong. Comment #1 indicates that there may be a shortcoming in the ODF specification for handling CTL at the span element level i.e., "Put simply, HTML has <div dir="rtl"> and <span dir="rtl">, and OpenDocument only has something like <div dir="rtl">, but not <span dir="rtl ...

Comments

Thanks for such a comprehensive answer. I guess you are right: my case is the last one.

jenka1980 (2013-06-13 11:46:37 +0100)

That's an awesome answer, @oweng. It won't add much but, FWIW, here's my experience with Linux Biolinum G, and also with Taamey David which is available from the Culmus Project. This is OS Linux Mint 13/Xfce and LibO 4.0.3.3. Btw, in Windows, just put a number character at the bracket boundary with LTR, and brackets go whacky, too. (Or did, in Word 2010 on XP.)

David (2013-06-15 14:14:34 +0100)

TBH my answer is not that "awesome" as CTL is "complex" for a reason (and I wish I understood it better). Numbers and most punctuation marks are classified as Weak, while space and brackets are Neutral (TR9). The RLM/LRM I indicated are for use with individual characters while RLE (U+202b) and LRE (U+202a) are used to surround strings. I have tested this and the results are far worse. I will re-edit my answer none-the-less.
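
For anyone who wants to check these classifications themselves, here is a small Python sketch using the standard library's `unicodedata.bidirectional`. The grouping of bidi classes into strong, weak, and neutral follows my reading of the bidirectional character type table in TR9; `bidi_category` is a hypothetical helper name, and the exact class reported depends on the Unicode version bundled with your Python.

```python
import unicodedata

# Grouping of bidi classes as I understand the TR9 character type table.
STRONG = {"L", "R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}
NEUTRAL = {"B", "S", "WS", "ON"}


def bidi_category(ch):
    """Return (category, bidi_class) for a single character."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in STRONG:
        return "strong", bidi
    if bidi in WEAK:
        return "weak", bidi
    if bidi in NEUTRAL:
        return "neutral", bidi
    # Explicit formatting codes such as LRE, RLE and PDF end up here.
    return "explicit", bidi


for ch in ["(", ")", " ", "1", ",", "a", "\u05d0", "\u200f", "\u202b"]:
    print(f"U+{ord(ch):04X}", bidi_category(ch))
```

Note that RLM (U+200F) reports as a strong R character, while RLE (U+202B) has its own explicit formatting class, which is consistent with the mark acting like an invisible letter and the embedding acting as a bracket around a string.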

oweng (2013-06-15 23:35:22 +0100)

Not that "awesome"? Maybe--but it shows a level of engagement with OP's issue that is exemplary, and you did a real service to a wider readership in the thoroughness of your reply. That's how I see it, anyway. :) Btw, for Windows, a Unicode editor called Babelpad shows character-at-cursor in the status bar (see mid-bottom of screenshot at that link). Very helpful feature for sorting bidi issues. Would be useful in Writer: or does it already?

David (2013-06-15 23:57:35 +0100)

@David, thanks for your encouragement. I loved Babelpad when I ran Windows. These days I mainly start Win7 for testing only. I have re-written my answer and the conclusion is now different. If you and @jenka1980 get the time, please verify the use of the RLE+PDF / LRE+PDF code pairs as I now indicate. I think this will solve the issue.
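
To make verification easier, here is a minimal Python sketch that builds a test string with the RLE+PDF / LRE+PDF pairs instead of typing the control characters by hand. The helper names `embed_rtl` and `embed_ltr` are hypothetical, and whether the wrapped string then renders correctly in a given LO version and font is exactly what needs checking.

```python
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def embed_rtl(text):
    """Wrap text in an RLE ... PDF pair (for RTL text in an LTR paragraph)."""
    return RLE + text + PDF


def embed_ltr(text):
    """Wrap text in an LRE ... PDF pair (for LTR text in an RTL paragraph)."""
    return LRE + text + PDF


# LTR paragraph: the Hebrew words and their brackets go inside one embedding.
hebrew = "\u05d0\u05d1\u05d2 (\u05d3\u05d4\u05d5)"
sample = "abc " + embed_rtl(hebrew) + " abc"

# List the code points so the invisible controls can be seen.
for ch in sample:
    if ch in (LRE, RLE, PDF):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The resulting string can be pasted into a Writer paragraph to compare against the same text without the embedding codes.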

oweng (2013-06-16 07:45:08 +0100)
