Convert text to table with other character does not work

Earendil · January 30, 2022, 6:48pm

Hi,
I am trying to convert a book with headings into a table in order to perform a number of tasks on the source book and translation while keeping the two texts at the same level.
If I simply paste them into two cells, the text will rarely be at the same height, and I want to keep the styles, so it has to be in Writer.

This is why I want to separate it at heading level, do the same with the translation, and then have them side by side in the same document. If I simply separate at paragraph level, the two books will not match, since the translator changed the number of paragraphs.

In order to separate at the H1, H2 etc. level, I inserted a symbol (≈) before each heading, and strangely two things happened: only the headings were moved to a cell to the right (a second column was created), while the text was split at paragraph level.

So I put the symbol after the headings. This creates a single column, but the text is still split at the paragraph level, not at the symbol level.

Any ideas why this is happening or how to solve it?

Many thanks!

Version: 7.1.8.1 / LibreOffice Community
Build ID: e1f30c802c3269a1d052614453f260e49458c82c
CPU threads: 8; OS: Mac OS X 10.16; UI render: default; VCL: osx
Locale: it-IT (it.UTF-8); UI: en-US
Calc: threaded

Grantler · January 31, 2022, 2:06pm

An uploaded sample file could be helpful!

Earendil · January 31, 2022, 3:34pm

Thanks @Grantler!
I am attaching:

Source Text (3 titles single paragraphs)
Target Text (3 titles multiple paragraphs)
Undesired result
Desired result
Recording of what I do (Link)

Source Text.odt (16.3 KB)
Target Text.odt (17.2 KB)
Source and Target side by side-Undesired-Result.odt (21.8 KB)
Source and Target side by side-Desired-Result.odt (23.8 KB)

Link to recording: Text-to-Table-and-Result.mov - Google Drive

ajlittoz · January 31, 2022, 3:48pm

From what I see, the flaw in the procedure is changing the number of paragraphs in Target Text. This “may” be legitimate in some circumstances if target grammar requires that some sentences be isolated in paragraphs. But usually, paragraph structure results from the author’s intent, and changing it betrays the author.

Should the case arise, you must manually add empty cells in the source column. There is no way to automate this because structure has been “damaged” in target document.

You could possibly cope with this through other tools, such as a macro-generator that could detect your “=” flag. But macro-generators handle only plain text, and you’ll have to restore all the formatting manually afterwards.

Earendil · January 31, 2022, 4:08pm

Thank you @ajlittoz! I agree that it is a flaw.

The fact is that some German books I am handling have very long paragraphs and translators have opted for splitting them into smaller ones.

There are quite a number of books I was hoping to be able to work on, though.

Isn’t there a workaround of some sort to make it split at Title level? Such as adding a symbol after titles, replacing paragraph marks with something else and then re-replacing them with paragraph marks after the conversion is done?

Do you know why the document is split at paragraph level when I am using the split at flag option? Does it only work with plain text, as you were saying about the macro?

ajlittoz · January 31, 2022, 4:42pm

Indeed, German sentences can be very long. German grammar, with all its declensions and cases, allows for very complex yet unambiguous sentences. I understand now why you want to make shorter sentences from one long one. But is it correct to create new paragraphs?

IMHO, you’re much better off working at paragraph level rather than at “title level” when translating. This means you may have to manually add rows and, more generally, manage the table yourself. I think there is a net benefit in staying at paragraph level. If you don’t like empty cells in the German column, you can merge them with the preceding non-empty cell (assuming the extra paragraphs in the translation come from this one). You then have one cell in the German column associated (and perhaps vertically centered) with several cells in the translation column.

Built-in help explains how Table>Convert>Text to Table works. Basically, in any circumstance, the root object is a paragraph. When you specify a separator, you tell Writer that your paragraph is multi-column. Paragraphs define how many rows are created. Separators cause the creation of columns. Consequently, with your “=” at the end of headings, you get a column with the heading plus an empty cell, resulting in a 2-column table. As an experiment, specify comma “,” as a separator. The number of columns will depend on the paragraph containing the highest number of commas.
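
To make the rule concrete, here is a minimal Python sketch that models it on plain strings. It is only an illustration of "paragraphs make rows, separators make columns", not Writer's actual implementation; the function name `text_to_table` and the sample paragraphs are made up for the example.

```python
def text_to_table(paragraphs, separator):
    """Toy model: one row per paragraph, one column per separator-delimited field."""
    # Each paragraph becomes a row; the separator only splits *within* a paragraph.
    rows = [p.split(separator) for p in paragraphs]
    # The table is as wide as the paragraph with the most fields.
    width = max((len(r) for r in rows), default=0)
    # Shorter rows are padded with empty cells.
    return [r + [""] * (width - len(r)) for r in rows]


# Hypothetical sample: a marker placed *before* the heading.
paragraphs = ["≈Heading 1", "First paragraph.", "Second paragraph."]
table = text_to_table(paragraphs, "≈")

for row in table:
    print(row)
```

In this model the marker in front of the heading produces an empty first field, so the heading ends up in the second column while every body paragraph still gets its own row. That matches the behaviour described in the first post: the separator never joins paragraphs, it can only add columns.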

If you want/need a sophisticated split procedure, the only way I know of is to resort to macro-generators, but such tools have no notion of ODF. Unless you design a complex set of macros (in effect an ODF parser, good luck!), you can work only with plain unstyled text, which makes the task of recognising headings quite hard, unless you explicitly tag your headings with some signature.
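
As a rough idea of what such a plain-text approach could look like, here is a small Python sketch. It assumes both texts have been exported as lists of plain paragraphs and that every heading has been tagged with a leading signature (here `≈`, purely as an assumption). The helper names `split_sections` and `side_by_side` are hypothetical, and all styling is lost, exactly as warned above.

```python
from itertools import zip_longest

MARK = "≈"  # assumed heading signature; any character absent from the text would do


def split_sections(paragraphs, mark=MARK):
    """Group plain-text paragraphs into sections, each starting at a tagged heading."""
    sections = []
    for p in paragraphs:
        # A tagged heading (or the very first paragraph) opens a new section.
        if p.startswith(mark) or not sections:
            sections.append([])
        # Store the paragraph without the signature.
        sections[-1].append(p[len(mark):] if p.startswith(mark) else p)
    return sections


def side_by_side(source, target, mark=MARK):
    """Pair source and target sections row by row, one section per cell."""
    src = split_sections(source, mark)
    tgt = split_sections(target, mark)
    # If one text has more sections, the other side gets an empty cell.
    return [("\n".join(s), "\n".join(t))
            for s, t in zip_longest(src, tgt, fillvalue=[])]


# Hypothetical sample: the translation splits one paragraph into two.
source = ["≈Kapitel 1", "Ein langer Absatz.", "≈Kapitel 2", "Noch ein Absatz."]
target = ["≈Chapter 1", "A long", "paragraph.", "≈Chapter 2", "Another paragraph."]
rows = side_by_side(source, target)

for left, right in rows:
    print(repr(left), "|", repr(right))
```

Because rows are built per section rather than per paragraph, the extra paragraph in the translation stays inside its own cell and the headings remain aligned. Getting the result back into a styled Writer table is the part this sketch does not solve.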

Earendil · January 31, 2022, 6:08pm

Thank you @ajlittoz for the proposed solution and explanation. I think that I’ll need to stick to styled text.

So I followed your hint and split the two documents at paragraph level. Then I proceeded to move the content of the extra cells of the target into the preceding cell of the target in order to still have the correspondence with the longer source cell.

When I do that, I am left with empty cells in the target. And I need to shift up the target or shift down the source to keep correspondence.

I tried to merge the empty target cells, hoping that the following cells would shift up, but this does not happen. So I copied the source from that point to the bottom, cut it and then pasted it two cells lower. Then I deleted the empty cells above. I cannot have any extra empty cells in these documents because the corresponding paragraph marks after transformation are meaningful for further conversions (html, epub, etc.).

Am I doing things right, or is there a better way to move the cells around?

ajlittoz · January 31, 2022, 6:32pm

Writer is not a spreadsheet app like Calc and offers features fit only for textual tables. So, indeed, reorganising cells within a table is harder than in Calc. The problem here is that you can’t delete cells individually, only rows or columns.

I’m afraid there is no user-friendly solution. I had a look at tables within a table, but we lose vertical sync between side-by-side tables.