Correct auto-indent for Chinese text (not sure if this is a bug or a setting)

Hello,

I enable the Automatic option for the first-line indent in the “Indents & Spacing” tab of the paragraph style, and set the paragraph language to Chinese.

Currently this results in a first-line indent that is 1 Chinese glyph wide (the width of 2 Latin glyphs), but the formal indent for a Chinese paragraph is equal to the width of 2 Chinese glyphs.

I reported this as a bug: https://bugs.freedesktop.org/show_bug.cgi?id=64975

I am still looking for someone to explain the behavior to me. If no one is available to fix it, I may be able to find someone who can code.

Thanks.

Even if the 2 vs 1 character value for the Automatic check box setting is not a bug (it sounds like it is), there is a problem with using a character as a unit (e.g., “ch”) as you have pointed out and I confirm in my answer. Thanks for raising this issue.

This is partly an answer supporting the Automatic setting of the indent for Chinese text being 2 characters rather than 1 character, and partly a demonstration of how using a character (“ch”) as a unit of indentation has other problems. Both of these issues have been pointed out by @jiero in the comments beneath the answer by @ROSt53. Hopefully this provides further support for the argument as put forth.

I am attaching an example (rename from ODT to ZIP) to show how the current implementation for indenting Chinese paragraph text varies with type size (10.5 pt and 72 pt). Illustrations cover:

  • Automatic setting for First line.
  • 2.00ch setting for First line.
  • Two ideographic spaces (U+3000) at paragraph beginning.

The third test above is effectively the control group that shows how Chinese text would normally be expected to behave. @jiero if you can confirm that what I have done here is OK I will attach it to the bug (and confirm it).
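For anyone who wants to reproduce the control case outside Writer, here is a minimal Python sketch. It assumes plain text with one paragraph per line; the function name `indent_paragraphs` is hypothetical and is only meant to illustrate prefixing each paragraph with two U+3000 characters.

```python
IDEOGRAPHIC_SPACE = "\u3000"  # full-width space used in CJK text

def indent_paragraphs(text, count=2):
    """Prefix every non-empty paragraph (one per line) with `count` ideographic spaces."""
    prefix = IDEOGRAPHIC_SPACE * count
    lines = []
    for line in text.split("\n"):
        # Drop any existing leading spaces so the indent is not doubled.
        stripped = line.lstrip(IDEOGRAPHIC_SPACE + " ")
        lines.append(prefix + stripped if stripped else "")
    return "\n".join(lines)

sample = "第一段文字。\n\n第二段文字。"
print(indent_paragraphs(sample))
```

Because the indent is made of ordinary characters, it scales with the type size of the paragraph, which is why it works as a control.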

The attachment is fairly self-explanatory so I won’t bother repeating the full details here. Summary: The “character” (i.e., “ch” unit) as implemented for indenting appears to be a fixed unit (set at whatever the default text size is, such as 10.5 pt) rather than a character of the current type size. This does not occur when the Automatic (i.e., single character) setting is used, but that has the problem of being one, rather than two, characters as mentioned.
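The summary can be modelled with a little arithmetic. The following Python sketch is only an illustration of the observed behavior, not LibreOffice code. It assumes a Chinese glyph is as wide as the type size (one em) and that the default text size is 10.5 pt; `indent_widths` and its dictionary keys are hypothetical names.

```python
DEFAULT_SIZE_PT = 10.5  # assumed default text size that the "ch" unit appears to be tied to

def indent_widths(font_size_pt):
    """Return the first-line indent in points for each of the three test cases."""
    return {
        # Automatic: one character of the current type size.
        "automatic": 1 * font_size_pt,
        # 2.00ch: appears fixed at the default size, ignoring the current size.
        "2.00ch": 2 * DEFAULT_SIZE_PT,
        # Two U+3000 characters: real text, so it scales with the type size.
        "two U+3000": 2 * font_size_pt,
    }

for size in (10.5, 72):
    print(size, indent_widths(size))
```

Under these assumptions the “2.00ch” and U+3000 cases agree at 10.5 pt (21 pt each), while at 72 pt the “2.00ch” indent stays at 21 pt and the control grows to 144 pt.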

EDIT: The related OpenOffice bug for the 2 characters vs 1 character indent is AOO#85257. It is probably also worth noting that the CJK macro package for LaTeX introduced a \CJKindent macro in v4.5.1 on 17 June 2002 that was set to two ideographic spaces. I think you can mention this in your bug @jiero.

Yes, thank you, please confirm the bug. You’ve made a wonderful explanation (plus example). Great research, too. I will update the bug report. Cheers.

@oweng - great answer and great work in the background on the matter.

We know from @jiero that Chinese requires 2 characters for the first line of a paragraph as default.

I would like to propose that the (default) requirements for Korean and Japanese be asked about as well. Both languages have roots in Chinese and use double-byte characters. Therefore, native speakers of both languages should be consulted.

If any Korean or Japanese native speaker reads this thread, could you please give advice?

I see @jiero you have updated your bug. Good luck with it. @ROSt53 the W3C gives a poor account of Hangul indentation. They stipulate “the value of the character width in the specific paragraph is used as the default unit for indentation” and yet the example shown is slightly less than a full character. Also, these settings (per the Automatic option) should be language-based, thus in this case, specific to Chinese.

I am still wondering which component this belongs to. I want to ask someone to help.

I think “Writer” is the correct component. There are no other more suitable options. Ideally it would be good to have this in Calc / Draw / Impress as well. It really depends on whether this is handled in a shared piece of code or not. I imagine it is, but I am not certain.

As we have no comment yet from a Japanese native speaker, I just checked about 10 Japanese books, written both vertically and horizontally. I always see an indent of only 1 (double-byte) character. Thus, to me, the default indent for Japanese is correct.

Thank you, @ROSt53. I’m sorry to have included CJK in the title, and I will change my report in accordance with your comment. :)

@ROSt53 I don’t think there is any danger of the Japanese default indentation being altered as it is defined in standards JIS Z 8125, JIS X 4051-1995, and TR X 0010/E. The W3C cites these standards e.g., TR/2012/NOTE-jlreq-20120403/ and TR/1999/WD-i18n-format-19990910/. The burden is on us to find a Chinese standard equivalent.

@jiero, I am struggling to find a Guojia Biaozhun (国家标准) document that specifies a paragraph indent of two characters. There are plenty of references on the internet (look at any scientific paper) to using two characters, but neither GB7713-87 nor GB7714-2005 mentions indentation. I am using the search terms first line (第一线), paragraph indent (段落缩进), and two characters (两个字符). Does that seem right?

I searched in Chinese and found this
http://www.sasac.gov.cn/n1180/n1271/n6716920/n6716950/n6723785/6723886.html

8.2.3 公文正文

主送机关名称下一行,每自然段左空2字,回行顶格。数字、年份不能回行。

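每自然段左空2字 = leave 2 characters of space at the start of the first line of each paragraph

In full, section 8.2.3 (“Body text of official documents”) says: beginning on the line below the name of the main recipient organization, each paragraph is indented by 2 characters on the left, and continuation lines are set flush left. Numbers and years must not be broken across lines.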

Hope this can be solved in LibreOffice 4.2.*

Thanks for finding this document Jiero. I have updated the bug with a link to the 2012 version of the same standard as well as your link.

Open your paragraph style window
Select Indent and Spacing
Set First line to “2 ch”, then click OK, and all should be set.

Please note that the “2 ch” entry might change to 0.74 cm after you click Apply or OK and reopen. However in the document it should be correct.
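The 0.74 cm value is consistent with each “ch” being treated as 10.5 pt, which matches what others in this thread observed. A quick Python check follows. It is a sketch only: the 10.5 pt figure is an assumption taken from the discussion here, and `ch_to_cm` is a hypothetical helper, not anything in LibreOffice.

```python
CM_PER_INCH = 2.54
PT_PER_INCH = 72

def ch_to_cm(n_ch, char_size_pt=10.5):
    """Convert a number of "ch" units to centimetres, assuming a fixed character size in points."""
    points = n_ch * char_size_pt
    return points / PT_PER_INCH * CM_PER_INCH

print(round(ch_to_cm(2), 2))  # 0.74
```

Two characters at 10.5 pt are 21 pt, and 21 pt / 72 × 2.54 cm is about 0.74 cm.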

I tested with Japanese on XPProf/SP3 and LibO 3.6.6.2


Thanks for the suggestion, but I really want to make it work with “Automatic”. Otherwise the layout will still change if the text in the document is scaled.

@ROSt53, under v4.0.3.3 the default for Chinese text, when Asian typography is turned on, shows the “ch” unit, which makes it easier.

@oweng, “ch” is a fake unit. It actually works only for 10.5 pt text. I don’t know who added it, but giving the feature such a misleading name was a serious mistake. Few Chinese users can express themselves in English, so even though the bug is old, it seems no one has ever reported it. See http://forum.openoffice.org/zh/forum/viewtopic.php?f=6&t=607 or, earlier, “OpenOffice 4怎么不出来?” (Linux新手园地, Chinaunix). “缩进” means indent.

@jiero, I defer to your expertise as a native speaker / writer in this instance. In fact, I would tend to agree with you that using a character (ch) as a unit would seem awkward as the width is subject to vary with the font being used. What happens in a paragraph of mixed languages / fonts? Thanks for posting the links. I had to translate them, but I understand what is being discussed. I will have a look tomorrow for a related FDO bug.

@oweng - Thanks for your information about the ch-unit. In general (= in Latin-character languages; as for CTL I cannot judge) it would be good to have this unit displayed if entered.

I also ran a test in 3.6.6.2 and 4.0.3.3, switching to the Japanese UI to see if the ch-unit remains. In both versions the ch-unit disappeared.

However I found that all katakana characters are wrongly displayed. fdo#65151