Typographic issues with Table of Contents and unbreakable Paragraph in table

Hi. I am working on a project that converts Asciidoc (with Asciidoctor) to odt. There are two problems with typography I can’t cope with.

Here in my project I made an example:

pdf from odt: https://courseorchestra.github.io/asciidoctor-open-document/ttest.pdf

  1. As you can see, the Table of Contents looks rather ugly. A long entry can push the leader dots to the next line or, worse, the text can stick to the page number. I even have examples where the dots remain on the same line while the text and page number go to the next line without right alignment. In MS Word the problem can be solved by setting an After text indent while preserving the tab stop: the last line becomes longer and the tab with the page number fits well. But in LibreOffice, tab stops that exceed the right paragraph margin are reduced to this margin.

  2. The answer in “Endless problems with table in Writer - #3 by ajlittoz” (@ajlittoz) assumes that LibreOffice respects the Don’t split paragraph setting for paragraphs inside tables. But in this example it definitely doesn’t. Is this the correct behaviour, or am I doing something wrong?

Please help with these issues.

  • Issue #1: long TOC entries

You are in a corner case. TOC entries should not be very long (but, as usual, we can’t foresee all use cases). Unfortunately, paragraph styles have no setting similar to Word’s.

  • Issue #2: page breaks in tables

You didn’t read correctly what I hinted at in the question you reference (in your defence, the answer was very terse), so I’ll explain it better.

Text entered in a table cell constitutes in fact a sub-document. It is no longer inside a page. It is now inside a cell, whose environment is the only one the paragraph manager sees. Consequently, there is no “page break” event sent to the paragraph manager.

The page break event is sent to the outer object, the one inserted in the page, i.e. to the table. If you don’t want any paragraph contained in a table cell to be split between pages, this flag must be set for table rows or for the whole table if it is short.
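
For readers working at the file-format level, here is a minimal sketch of what that row flag might look like in ODF markup. The style name `Table1.1` is only a placeholder, and the sketch assumes the standard `fo:keep-together` attribute on `style:table-row-properties`, which is, as far as I know, what the Allow row to break across pages checkbox maps to.

```xml
<!-- automatic row style; the name "Table1.1" is hypothetical -->
<style:style style:name="Table1.1" style:family="table-row">
  <!-- "always" asks the layout not to split this row between pages -->
  <style:table-row-properties fo:keep-together="always"/>
</style:style>

<!-- the row refers to the style by name -->
<table:table-row table:style-name="Table1.1">
  <table:table-cell office:value-type="string">
    <text:p>Cell text that should stay on one page.</text:p>
  </table:table-cell>
</table:table-row>
```

Whether Writer honours the flag for every row is a separate matter, as the next paragraph notes.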

To be honest, when checking on an example, I could not “freeze” any row beyond the first one. So, if your table is only one row long, you should be OK.

Anyway, describe the purpose of your table: what will be in the second column? There may be another solution than a table.

Thank you for the prompt answer.

  1. On Table of Contents issue #1: I’ve found this: “In a table of contents, how to set the page number off to the right of justified entry text?”. Witty :) So I shouldn’t have bothered you. Sorry. You are right, section headings should be short, but I have to make a converter that produces a typographically print-ready result in any case. And this corner case is not so rare, especially if we are autogenerating documentation. Anyway, in my case the odt is created programmatically, so I’ll try to find a workaround with a manually generated TOC (I mean its layout). And here comes another question, #3.

I could use a table layout, but I have not found a “keep with next” attribute for a row. Am I right that it doesn’t exist? MS Word has a solution: mark the last paragraph of the last cell in a row with the “keep with next” setting. Rather an ugly solution, in my opinion (as is letting a tab stop fall beyond the paragraph margin), but it still works.

If not, then putting the page number in a frame looks like the only option.

  2. On paragraph issue #2: I am really testing a corner case, and in the given example I assume that the whole table should move to the next page. If the cell contents didn’t know about page breaks, then we could have a line of text cut across pages, but this doesn’t happen. So it looks like they know, but only to a limited extent. Really, if you allow a row to break across pages, then the paragraph text flow settings should be respected. It would be strange for a two-line paragraph to break between pages. It is especially ugly in the last row of a table.

And here a workaround is literally impossible. Perhaps I could create a special table role in Asciidoc (last-row-keep-together), just to mitigate the problem.

Don’t know if your question #3 is about page number frame in TOC.

If so, it will not work with the automated TOC collector engine because TOC entries are whole paragraphs made of “substituends” (see the Entries tab when creating a TOC). You can position “keywords” like E, E#, T or # for the page number. There is no provision to split the structure template into the current entry paragraph and a subordinate object like a frame where you could insert #.

These keywords are not available outside the TOC dialog. This means you can’t insert them in a frame.

But, if your conversion engine explicitly creates a TOC (by collecting some adequate information from the text), you can do whatever you want, e.g. anchoring a small frame to every entry and positioning it in the right margin. However, this means the TOC is no longer updated by Writer.

Table constraints

If your table has a single row, you can prevent the row from being split. And if the table is some kind of heading for what follows it, you can give it attribute Keep with next paragraph in Table>Properties. This could mitigate your difficulty.

correct.odt (21.5 KB)

Hi. Thank you for your time.

Issue #1. You are right, I decided to generate the table of contents myself. The whole TOC won’t refresh automatically, but the page numbers will, and that is good enough. Moreover, even section names will update automatically. See the example. My question about the row was not clear. When we generate a TOC we follow a rule: if the next TOC entry’s level is higher, then it should be on the same page as the current one. I’ve studied ODF 1.3. There is no keep-with-next attribute for a row, but there is one for the table. That is why I use a table for each TOC entry.

Still, here is some research on the native LibreOffice TOC.
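
A rough sketch of one such per-entry table follows. It is written under assumptions and is not taken from the attached file: the names `TocEntry1` and `sec1` and the two-cell layout are placeholders. The parts that matter are `fo:keep-with-next` on `style:table-properties` and the `text:bookmark-ref` fields, which are one way to get a title and a page number that refresh when fields are updated.

```xml
<!-- table style for one TOC entry; "TocEntry1" is a hypothetical name -->
<style:style style:name="TocEntry1" style:family="table">
  <!-- keep this one-row table on the same page as whatever follows it -->
  <style:table-properties fo:keep-with-next="always"/>
</style:style>

<table:table table:name="TocEntry1" table:style-name="TocEntry1">
  <table:table-column table:number-columns-repeated="2"/>
  <table:table-row>
    <table:table-cell office:value-type="string">
      <!-- section title taken from a bookmark named "sec1" (assumed) -->
      <text:p><text:bookmark-ref text:reference-format="text" text:ref-name="sec1">Section title</text:bookmark-ref></text:p>
    </table:table-cell>
    <table:table-cell office:value-type="string">
      <!-- page number of the same bookmark -->
      <text:p><text:bookmark-ref text:reference-format="page" text:ref-name="sec1">1</text:bookmark-ref></text:p>
    </table:table-cell>
  </table:table-row>
</table:table>
```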

The case where the section name sticks to the page number can be resolved by adding a non-breaking space between the tab (T) and the page number (#) in the Entries tab of the Table of Contents dialog. Ctrl-Shift-Space doesn’t work in this dialog, but pasting does.

The first case (where the tab goes to the next line) looks like a LibreOffice bug. A tab character shouldn’t go to the second line if the tab stop position is to the right of, or coincides with, the end of the preceding character. MS Word follows this behaviour and it is quite logical.

Look at the example. In the first paragraph, the word “lo” should go to the next line together with the tab character.

You may claim it to be a feature: the text after the tab doesn’t fit the remaining space, so we’ve got uncertainty here.

But look at the 3rd, 4th, 5th and 6th paragraphs. They are the same; the only difference is the spacing between the letters “l” and “o” (shown in comments). Here there is no text after the tab, so why the next line? As you see, the difference between the 3rd and 6th lines amounts to 2pt (about 0.7mm). The font of the example is 44 pt. If we decrease the font size, the gap will be proportionally smaller. It looks like LibreOffice assumes the tab character has some width of its own, and this contradicts the definition of a tab character and unprettifies the final result.

Issue #2. I have updated the example to clarify my case. I think you would agree that typography rules are violated. My only idea is to make the last row unbreakable.

I had a look at your attached file.

Why do you number manually your chapters instead of using Tools>Chapter Numbering?

I slightly modified your “native” TOC using the trick I describe in my other answer, which you referenced above. For that:

  • I changed the structure line in the Entries tab of TOC description
  • I created a TOC page number character style so that the page number is not confused with the heading
  • I customised Contents 1 to indent the second and subsequent lines of long entries

I know the page number position is not “traditional” but this could solve your issue with long headings and also allow to build automatically the TOC.

See correct-ajl.odt (17.4 KB)

@ajlittoz thank you for your help. All my questions are clarified and workarounds are found. Your TOC idea fits some cases.

On the chapter numbers: Asciidoctor has its own routine to calculate the chapter number. It was logical to use it.

On the tab issue: a tab character with non-zero width looks like a bug. I can create an issue.

On the paragraph typography settings in table cells: is this the expected behaviour, or just a feature we can expect in the future?

I don’t understand your last two points: non-zero tab characters?? typography settings in cells??

I’ll try to clarify these points.

Tab characters
In the following picture there are four identical paragraphs that consist of some text and a right-aligned tab character. The difference is in the length of the text, which is varied through the spacing between the “l” and “o” characters in the word “lon”. This length is shown in comments. The tab stop position is set to the right of the paragraph end, so, as we already know, it coincides with the right paragraph margin.

tab-length-short

In the first paragraph, the text and the tab character fit on one line.
In the second paragraph, the tab character goes to the next line. But this shouldn’t happen. If the letter “n” doesn’t go to the next line, then its end is to the left of, or coincides with, the right paragraph margin (in fact to the left, see the 3rd paragraph). But the tab stop coincides with the right paragraph margin, so the tab character should remain on the first line.
In the third paragraph the situation is the same, but the line is 1.8 points wider.
In the fourth paragraph, the letter “n” doesn’t fit on the first line and goes to the next one together with our tab character.

It looks like the tab character in this case has a width of almost 2 points (0.7mm). But by definition a tab character only defines the position of the next character. The flow of the 2nd and 3rd paragraphs is incorrect.

Typography settings in paragraphs inside cells

Look at the picture. The last paragraph in the table has orphan and widow control set to 2. As you see, the words “that contain spaces” are moved to the next page alone. They should have moved to the next page together with the previous line, “quotes are needed to specify constraint names”.

Even if I check the “Don’t split paragraph” option, the result is the same. The ODF 1.3 specification does not state that fo:keep-together should not be applied inside table cells.
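
For reference, a minimal sketch of how these two settings are usually expressed on a paragraph style in ODF. The style names `CellTextWidowOrphan` and `CellTextKeep` are hypothetical and not taken from the attached files; the attributes `fo:orphans`, `fo:widows` and `fo:keep-together` are the ones the specification defines for `style:paragraph-properties`.

```xml
<!-- variant A: orphan and widow control set to 2 lines each -->
<style:style style:name="CellTextWidowOrphan" style:family="paragraph">
  <!-- orphans: minimum lines left at the bottom of a page;
       widows: minimum lines carried to the top of the next page -->
  <style:paragraph-properties fo:orphans="2" fo:widows="2"/>
</style:style>

<!-- variant B: never split the paragraph at all -->
<style:style style:name="CellTextKeep" style:family="paragraph">
  <style:paragraph-properties fo:keep-together="always"/>
</style:style>
```

The question in this thread is whether Writer applies either variant to a paragraph that sits inside a table cell.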

In my LO 7.1.5.2 on Windows 10, I do not see the extended character spacing in your file. Maybe you changed the page size and did not update the index afterwards so that it could reflow? Right-click on a blank part and select Update index.

The text in the table is constrained by the table; if you set the row height to 30mm then the text flows as specified.
If you don’t want odd-looking rows broken across pages, you can instead change the setting: untick Allow row to break across pages and columns. Cheers, Al

Tab characters

I checked your attached file again to see how you format things. There are two locations where you typed “long long long” paragraphs: TOC entries styled as Contents 1 and the “fake” TOC styled as Table Contents.

In both cases, I see “anomalies” which could cause conflicts and result in this behaviour.

  1. Contents 1 paragraph style

    The structure template (Entries tab) defines a T tab stop which is aligned right. But you also added a tab stop aligned right at 17cm in Contents 1 definition. They may interfere. However, I have not fully explored the interactions between the structure line and paragraph style definition with regard to tabs.

  2. Table Contents paragraph style

    Here there is no tab stop definition in the paragraph style, and a right-aligned stop has been manually added in the ruler beyond the right margin. This position can never be reached; therefore data after the tab character will always be set on the next line.

    Apparently you enter a tab character only to get the leader line. Whether the leader is printed on the same line or on the next one depends on the exact layout position reached. If you set your tab stops manually, you can’t be sure of the position due to rounding between screen location and distances. The only way to control tab stop positions accurately is to define them numerically in paragraph styles. But even there, you can stumble on rounding approximations.
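
As an illustration of a tab stop defined in the paragraph style rather than on the ruler, here is a minimal ODF sketch. The style name `TocLine` is hypothetical; the 17cm position echoes the value mentioned above for Contents 1, and the attributes are the standard `style:tab-stop` ones.

```xml
<!-- paragraph style with an explicit tab stop; "TocLine" is a hypothetical name -->
<style:style style:name="TocLine" style:family="paragraph">
  <style:paragraph-properties>
    <style:tab-stops>
      <!-- right-aligned stop at 17cm with a dotted leader -->
      <style:tab-stop style:position="17cm" style:type="right"
                      style:leader-style="dotted" style:leader-text="."/>
    </style:tab-stops>
  </style:paragraph-properties>
</style:style>
```

Defining the position as a number in the style avoids the screen-to-distance rounding of a stop dragged on the ruler, although it does not rule out rounding inside the layout engine itself.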

Here is a clean example. I manually deleted all unnecessary styles. I would attach an fodt file to make it easier to check all measurements, but that is not allowed. Still, the measurements are accurate.

ttest_1.odt (13.6 KB)

  1. My research shows that a tab stop placed beyond the right paragraph margin is treated as coinciding with the paragraph margin. So whether it is set manually or in a style doesn’t matter if it lies beyond the right paragraph margin.
  2. But, so as not to mix issues, I inserted a table 170mm wide and added a tab stop at exactly 170mm. The problem persists.
  3. 19.519.3 style:tab-stop states clearly: “right: text is right aligned with a tab stop.” In the 2nd and 3rd paragraphs the text can be aligned, but isn’t.
  4. As far as I remember, Donald Knuth made TeX do its calculations only with integers. OK, even without integers, a rounding approximation of 0.7mm (almost half the width of the letter “l”) looks like a bug, although not a critical one.

Much cleaner now :wink:

At first, I suspected some nasty trick by direct formatting; so, I changed your file a bit, using exclusively paragraph and character styles.

I now suspect some subtle behaviour of the font renderer. A character glyph always has some tiny space at the left and right of its shape. Usually this spacing is invisible. But if you type with a huge font size, you see that the glyph’s black pixels do not start flush with the left margin (or, when justified/right aligned, do not end flush with the right margin).

The glyph bounding rectangle contains this tiny spacing.

When you request extra character spacing, font metrics are internally adjusted (but where? left? right? both?).

What probably happens in your case is that you have already reached the right margin due to the extra spacing set in the Position tab, but you don’t see it. When the tab character is considered, the active position is already beyond the tab stop and a new line is forced.

I’m afraid there is no solution to your problem. You’re pushing Writer to its limits: a tab stop is usually expected to be followed by some data. In your initial document, you emit a tab character to “sync” between two table cells which are independent from a layout point of view.

I explored another avenue, putting two TOCs in a 1-row×2-column table with varying TOC data in the columns. I nearly succeeded in simulating a tab stop to the right of the right margin. However, since it is unpredictable which entry will take more than one line, I can’t synchronise both TOCs reliably.

Eureka. You’ve suggested a great workaround.

If we do it this way, the gap is reduced. Yes, the problem remains, but in the range of 0.03mm, not 0.7mm, which is more acceptable.

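<!-- automatic text style: tighten the spacing of the last character and the tab -->
<style:style style:name="T5" style:family="text">
  <style:text-properties fo:letter-spacing="-2pt"/>
</style:style>

<!-- body: the span covers the final "n" and the tab character -->
<text:p text:style-name="P1">The lng long long long lo<text:span text:style-name="T5">n<text:tab/></text:span></text:p>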

To make things clear for others: do you assign a reduced character spacing to the last character + the tab character?

Yes, exactly. To achieve this result one should select the character before the tab and the tab itself, and reduce the character spacing.

To fully close the topic, what do you think about the initial issues:

  1. Is this the correct behaviour?

new_line_tab

ttest_2.odt (10.6 KB)

OK, if yes. But in my opinion the word “line” should be pushed to the second line. I can do this manually only with a soft line break. But if my text grows (for example, after changing the font size in a style), then the paragraph will break badly and I won’t even spot it.

There should be something like an unbreakable tab then. But a tab is unbreakable in its essence: it gives the position of the next character on the same line.

  2. Should widow control be applied to paragraphs inside table cells:

ttest_3.odt (19.8 KB)

@EarnestAl’s advice won’t work if we’ve got more than one paragraph in a cell.

The current behaviour contradicts 20.228 fo:widows: the widow setting should be respected for all paragraphs.

Preventing a row from splitting across pages will solve the widow/orphan issue.
Manually increasing the height of the row (if it splits across pages) to allow room for widow and orphan splits will also solve the issue.
Otherwise, there is bug tdf#53088, inherited from OOo, but it rather contradicts bug tdf#130287.

I don’t like the workaround because it shouldn’t be needed. Applying it is just another direct formatting while, like you, I’d expect Writer to do things correctly without being pressed.


Point 1 (personal opinion)

Correct behaviour would start the leader dots after the word “line” and continue on the next line up to the page number. Therefore some leader is missing on the first line. Unfortunately I have no example on my bookshelves to check whether typographic tradition does it like that.

The last word “line” should not be pushed to the next line because that would leave too much space at the end of the first line.

Point 2 (again personal opinion)

Definitely, widow control should be active in table cells. After all, cells create sub-documents in their own right. If rows are allowed to split between pages and columns, then all formatting/layout properties should be effective.


I didn't check LO Bugzilla to see if bugs or enhancements have been filed on these topics.

OK, thank you. I’ve checked LO Bugzilla thanks to @EarnestAl comment.

https://bugs.documentfoundation.org/show_bug.cgi?id=130287 is not a bug at all. See my comment https://bugs.documentfoundation.org/show_bug.cgi?id=130287#c7. On the contrary, it is an example of correct LibreOffice behaviour regarding widow/orphan control in tables. Luckily, https://git.libreoffice.org/core/commit/c81d766dd4ff7d8b580b7fdc79db6e68c5f14204, which “as if” solved the problem, didn’t break this behaviour.

See the attached file, widow/orphan is fully respected. I’ve checked, widow alone and orphan alone are also respected.

table-orphan.odt (22.3 KB)

Updated https://bugs.documentfoundation.org/show_bug.cgi?id=53088 with this example.

On the tabs issue, your point makes sense, although I would prefer another behaviour. It looks like the ODF spec should cover this issue. Clarity is most important here.

Hm, I was too fast with #130287. It is about the default style settings, not about the behaviour of the style. I don’t think it is a good idea to enforce different behaviour in different cases, but that is a matter of taste. The example in this ticket is the only one with correct widow and orphan behaviour, and I tried to understand what makes it work.

Please look at the file.

ttest_4.odt (12.6 KB)

Case #1 is absolutely correct. Cases #2, #3 and #3.1 are buggy. By the way, @EarnestAl, it looks like your advice to increase the row size doesn’t work, as shown in Case #3.1. The yellow font shows the difference from Case #1.

It looks like orphan and widow control is sometimes applied to paragraphs inside table cells. But the rules are unobvious and dubious: the behaviour is inconsistent.

Could you check this file, in case I’ve missed something? If it is correct, I’ll update #53088. I don’t want to post messy comments: it turned out that they can’t be deleted or edited in LO Bugzilla.

There is also the more general #86909. Quick research shows that keep with next paragraph and do not split paragraph are never respected in table cells.

Sorry if I bother you, but I’m sure LO is the most advanced text processor (as of now). I mean not in every feature, but generally. For me as a technical writer and analyst, it is very important to have such a tool sharpened. Orphan and widow control is a must as well as keep with next paragraph and do not split paragraph. Many people use them, and they should work properly.

Besides, I’ve spent a great deal of time tuning Asciidoc-Docbook-Pdf, Asciidoc-Docbook-TeX, Asciidoc-PDF and others. All these are either very complicated or flawed. That is why I started Asciidoctor-OpenDocument that heavily relies on LO. And LO is really magnificent in terms of tuning the appearance, specification clarity, containerization, conversion to this same PDF.