How could Writer be compiled to include an application such as Lokalize?

lbrtchx · November 17, 2017, 4:30am

https://userbase.kde.org/Lokalize

I am a bilingual teacher and keeping my work in sync becomes a burden.

In general (thinking beyond Lokalize), what I have in mind is some sort of multilingual text management system that maps the translation memories line by line between languages (L1 <-> L2 ...) and, most importantly, keeps them in sync, so that if I edit L1 in any way, exactly the corresponding line in L2 is marked for editing. Ideally, it would keep the parse tree for each sentence and reset it whenever the sentence is edited.

I think such functionality calls for flattening the ODT file format (zipped XML) into and out of a DBMS on the fly, so that the per-line syncing can be managed as records by the DBMS itself.
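To make the per-line record idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is only an illustration under stated assumptions: the table name `segment`, the column `l2_stale`, and the helper functions are all hypothetical, and nothing here touches a real ODT file. Each aligned L1/L2 line is one row, and a trigger flags the L2 side whenever the L1 text of that row changes.

```python
import sqlite3

# Hypothetical schema: one row per aligned line pair (all names are illustrative).
SCHEMA = """
CREATE TABLE segment (
    id        INTEGER PRIMARY KEY,        -- position of the line in the document
    l1_text   TEXT NOT NULL,              -- line in the first language
    l2_text   TEXT NOT NULL,              -- corresponding line in the second language
    l2_stale  INTEGER NOT NULL DEFAULT 0  -- 1 = the L2 line needs re-editing
);
-- Flag the translation whenever the L1 text of that row actually changes.
CREATE TRIGGER mark_l2_stale
AFTER UPDATE OF l1_text ON segment
WHEN NEW.l1_text <> OLD.l1_text
BEGIN
    UPDATE segment SET l2_stale = 1 WHERE id = NEW.id;
END;
"""


def open_store(pairs):
    """Create an in-memory store from a list of (l1_line, l2_line) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO segment (id, l1_text, l2_text) VALUES (?, ?, ?)",
        [(i, l1, l2) for i, (l1, l2) in enumerate(pairs, start=1)],
    )
    conn.commit()
    return conn


def edit_l1(conn, seg_id, new_text):
    """Edit one L1 line; the trigger marks the matching L2 line if the text changed."""
    conn.execute("UPDATE segment SET l1_text = ? WHERE id = ?", (new_text, seg_id))
    conn.commit()


def stale_segments(conn):
    """Return the ids of the lines whose L2 side is marked for editing."""
    rows = conn.execute("SELECT id FROM segment WHERE l2_stale = 1 ORDER BY id")
    return [row[0] for row in rows]
```

For example, after `conn = open_store([("Hello.", "Hola."), ("Goodbye.", "Adiós.")])`, calling `edit_l1(conn, 2, "Bye.")` leaves only line 2 in `stale_segments(conn)`. Keeping the flag in a trigger means the database itself does the bookkeeping, which is the point of the DBMS approach described above.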

If one scrolls the L1 text in a view, then when the L2 tab is clicked, the L2 view should automatically scroll to the corresponding section (this behavior should be adjustable).

There are plenty of translation memory repositories out there, but AFAIK none of them are open source.

I could imagine it could get way more sophisticated than that, but that is enough for a start.

Most probably other users have either thought of that or come across those needs, but I couldn’t find anything when I searched. I have little time for such matters, but I could start working on such a thing using Writer’s Java UNO interface, to then be plugged in as part of LibreOffice. So, other users probably have their own ideas about how such functionality should be implemented.

lbrtchx

mikekaganski · November 17, 2017, 5:15am

(A passing-by comment:) there’s FODT (Flat ODT), which is a plain single-file XML format.
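As a rough illustration of why the flat format is convenient, the sketch below pulls the paragraph texts out of a flat ODT document using only Python's standard library. The function name `paragraphs` is hypothetical; the two namespace URIs are the standard OpenDocument `office` and `text` namespaces. It is a simplification: special elements such as `text:s` (repeated spaces) and `text:tab` carry no character data and are simply dropped here.

```python
import xml.etree.ElementTree as ET

# Standard OpenDocument namespaces used by flat ODT files.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def paragraphs(fodt_xml):
    """Return the plain text of every text:p inside the document body, in order."""
    root = ET.fromstring(fodt_xml)
    body = root.find("office:body/office:text", NS)
    if body is None:
        return []
    p_tag = "{%s}p" % NS["text"]
    # itertext() also collects text nested in child elements such as text:span.
    return ["".join(p.itertext()) for p in body.iter(p_tag)]
```

The resulting list of lines is the kind of per-line data that could then be stored or compared, without unzipping an .odt package first.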

gabix · November 17, 2017, 11:58am

Your post has so many words, yet it is not clear to me. What are you trying to achieve?

lbrtchx · November 17, 2017, 2:09pm

I am talking about what seems to be an RFE and may become new functionality in LibreOffice, which may be very useful for multilingual people and/or those with such social needs. What is not clear to you? I didn’t claim to be a mind reader.

gabix · November 17, 2017, 4:23pm

Nothing is clear. Again, what do you want to achieve? Synchronous scrolling of two documents? It has been requested, as far as I know. But what does it have to do with Lokalize?

sveinki · November 19, 2017, 4:35pm

I think what corresponds best with your description would be OmegaT, an open-source CAT (computer-assisted translation) tool. It’s a segment-based tool, so it handles longer texts more gracefully than Lokalize.

There used to be an extension called Anaphraseus from OpenOffice.org 2 (similar to Wordfast) that could be used for CAT inside of LibreOffice Writer. Not sure if it works anymore, but anyway, it is a bit different from what you describe.

gabix · November 20, 2017, 6:19am

Anaphraseus does work with LO 5.4.2.2.

lbrtchx · November 23, 2017, 3:18pm

sveinki wrote: "It's a segment-based tool, so it handles longer texts more gracefully"

I am curious as to why or how such an implementation would be "more (or less) graceful" if you flattened the ODT file format into a DBMS (indexing the markup in a CMS way, so that rendering a certain format back and forth could be done on the fly ...). This way the DBMS would transactionally manage syncing, concurrency, searching and crowd editing for you.

lbrtchx · November 23, 2017, 3:18pm

Imagine including a “File > Open multiL-Text” option based on the strategy pattern. Users would set their Ls in their preferred order on a project or global basis, and each time they open a file, its counterparts in the user’s languages would be opened as well in other tabs …

lbrtchx · November 23, 2017, 3:19pm

The thing is that except for culturally and/or politically monolingual people like “pure bred” gringos in the U.S. and the French (who only speak their “important” languages) most people in the world are more or less socially multilingual. Those kinds of capabilities should be naturally available in editors. I am working on a “proof of concept” implementation for you to then give me more ideas, so that it hopefully becomes a project within LibreOffice.

sveinki · November 29, 2017, 6:28pm

@lbrtchx: Sorry for the delay. I used the expression “graceful” just to point out that matching segments (OmegaT) is perhaps more suitable than full string comparison (Lokalize) in some cases, especially the case described by the OP.
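The difference can be sketched in a few lines of Python. This is not how OmegaT or Lokalize actually work internally; it is only a toy comparison using the standard `difflib` module, with a naive sentence splitter, a hypothetical translation memory held in a dict, and an arbitrary similarity threshold of 0.75.

```python
import re
from difflib import SequenceMatcher


def split_segments(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def best_match(segment, memory, threshold=0.75):
    """Return (source, translation, score) for the closest memory entry, or None.

    `memory` is a hypothetical translation memory: {source_text: translation}.
    """
    best = None
    for source, translation in memory.items():
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best


def pretranslate(text, memory):
    """Look up every segment of `text` separately in the memory."""
    return [(seg, best_match(seg, memory)) for seg in split_segments(text)]
```

With a memory containing "The cat sleeps." and "It is raining.", the text "The cat sleeps. It is raining today." finds a match for each of its two segments through `pretranslate`, while `best_match` on the whole text as a single string finds nothing above the threshold. That is the sense in which segment matching copes better with longer texts.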

lbrtchx · December 4, 2017, 11:52am

OK, I decided to ask the OmegaT dev folks, since they keep a mental map of such TM-related developments:

https://sourceforge.net/p/omegat/feature-requests/1357/

Let’s see how they react to it.

OmegaT uses Python and Perl code and third-party software. They also seem to use a plain file-based approach to their data access strategy:

http://omegat.org/en/resources.html

I don’t know if/how that would influence such an alignment with LibreOffice Writer. We may have to rewrite that Python and Perl code in Java, if it is used as part of the runtime environment, in order to make it more kosher to the Java UNO runtime interface.

lbrtchx

gabix · December 4, 2017, 1:42pm

OmegaT is Java-only and does not use Python or Perl.

AlexKemp · February 8, 2021, 11:22am