Word 2010 vs LO formatting: text overlapping at page ends, pinpointing the difference

Please read this description below prior to commenting…

OK, I was trying LO. I opened my Word .docx file with LO and the formatting was good, though by no means identical, e.g. it added 40 pages to the overall document size.

So I reopened the doc in Word and saved it in the .odt format. I saw that some changes were not compatible and proceeded anyway.

I then compared opening the new .odt document in Word and in LO. Interestingly, opening it in Word added 2 pages compared to the original .docx document, whilst LO preserved the original number of pages. Also, the LO formatting of the .odt was closer to the original .docx version (in Word) than simply opening the .docx version in LO…

So I reopened the .docx version in Word to compare it to the .odt version in LO. It is very, very close. I have found only a single instance of formatting within the text that differs. What also differs, though, is the text at the page boundaries. The latter is the more serious ‘conflict’ that I would like to try and pinpoint. Firstly I checked the page size, and indeed it was out by a few tenths of a centimetre in the LO .odt file, so I corrected it to match the original .docx paper size exactly (margins were correct). Why it was slightly out in the first place I’m not sure.

Anyway, the text still overruns (or runs short) in some instances… Interestingly, overall it does seem to keep in sync, e.g. some pages early on carry a sentence over onto the next page, whilst several pages later the text lines up again, as if the difference has magically corrected itself…

So what I am trying to find out is which aspect of formatting is causing the text at the page boundaries to go out of sync while the overall structure (as noted, later pages line up again) remains in sync. The original included inserted images, shapes and headers, which is all I can think of at the moment, though I’ve tried deleting various ones and it seems not to affect this issue.

Can anyone help iron out why LO handles the edges of pages differently to Word 2010?

LO Writer will likely never handle the DOCX format well enough to display the content identically to Word. Saving your existing DOCX as an ODT via Word can easily result in differences in appearance, as the underlying specifications (OOXML and ODF) are not the same. The OpenDocument (ODF) XML generated by Word will not be as clean as that generated natively by Writer. There are several potential issues here: page margins (top and bottom), page header/footer, paragraph orphan/widow handling, line endings (the line-breaking algorithm), fonts, style handling, inserted object handling, etc.
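If you want to rule out page geometry without trusting either application's dialog, you can read the stored values straight out of the files, since both DOCX and ODT are zip archives of XML. The following is only a rough sketch, assuming the page size is stored in `w:pgSz` inside `word/document.xml` (in twentieths of a point) for the DOCX and in `style:page-layout-properties` inside `styles.xml` for the ODT. The function names and the file names in the usage comment are made up for illustration.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"

TWIPS_PER_CM = 1440 / 2.54  # DOCX lengths are twentieths of a point
CM_PER_UNIT = {"cm": 1.0, "mm": 0.1, "in": 2.54, "pt": 2.54 / 72}


def read_member(path, member):
    """Return the raw bytes of one file inside a DOCX/ODT zip archive."""
    with zipfile.ZipFile(path) as zf:
        return zf.read(member)


def docx_page_sizes(document_xml):
    """Return (width_cm, height_cm) for every w:pgSz in word/document.xml."""
    root = ET.fromstring(document_xml)
    sizes = []
    for pg in root.iter(f"{{{W}}}pgSz"):
        w, h = pg.get(f"{{{W}}}w"), pg.get(f"{{{W}}}h")
        if w is None or h is None:
            continue  # skip incomplete entries rather than guess
        sizes.append((round(int(w) / TWIPS_PER_CM, 3),
                      round(int(h) / TWIPS_PER_CM, 3)))
    return sizes


def length_to_cm(value):
    """Convert an ODF length such as '21.001cm' or '8.5in' to centimetres."""
    m = re.fullmatch(r"([0-9.]+)(cm|mm|in|pt)", value.strip())
    if m is None:
        raise ValueError(f"unsupported length: {value!r}")
    return round(float(m.group(1)) * CM_PER_UNIT[m.group(2)], 3)


def odt_page_sizes(styles_xml):
    """Return (width_cm, height_cm) for every page layout in styles.xml."""
    root = ET.fromstring(styles_xml)
    sizes = []
    for props in root.iter(f"{{{STYLE}}}page-layout-properties"):
        w = props.get(f"{{{FO}}}page-width")
        h = props.get(f"{{{FO}}}page-height")
        if w and h:
            sizes.append((length_to_cm(w), length_to_cm(h)))
    return sizes


# Hypothetical usage (file names are placeholders):
# print(docx_page_sizes(read_member("original.docx", "word/document.xml")))
# print(odt_page_sizes(read_member("converted.odt", "styles.xml")))
```

Margins sit next to these values (`w:pgMar` in the DOCX, `fo:margin-top` and friends on the same ODT element), so the same approach should extend to them. Matching numbers here would at least remove page geometry from the list of suspects.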

It is impossible to say exactly where the differences are coming from, but I would imagine it is a combination of the things mentioned above. Any document containing a combination of elements soon develops complex XML in order to accurately represent the content. I would first look at your Text Flow settings in both applications (perhaps this attribute is not being translated correctly?). I would also set the text to use a ‘neutral’ font, such as SIL Gentium, to remove this aspect from the equation.
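To see whether the orphan/widow part of Text Flow survived the conversion, the stored settings can be listed from both files. This is a sketch under assumptions, not a definitive check: it assumes the ODT keeps `fo:orphans` / `fo:widows` on `style:paragraph-properties` in `styles.xml`, and that the DOCX marks the setting with `w:widowControl` inside a style's `w:pPr` in `word/styles.xml`. The function names are invented for illustration, and settings applied directly to paragraphs (rather than via styles) are not covered.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"


def odt_widow_orphan(styles_xml):
    """Map style name -> (orphans, widows) for ODT styles that set either one.

    Default styles have no name and are reported under the key '(default)'.
    """
    root = ET.fromstring(styles_xml)
    result = {}
    for tag in ("style", "default-style"):
        for st in root.iter(f"{{{STYLE}}}{tag}"):
            props = st.find(f"{{{STYLE}}}paragraph-properties")
            if props is None:
                continue
            orphans = props.get(f"{{{FO}}}orphans")
            widows = props.get(f"{{{FO}}}widows")
            if orphans is not None or widows is not None:
                name = st.get(f"{{{STYLE}}}name", "(default)")
                result[name] = (orphans, widows)
    return result


def docx_widow_control_off(styles_xml):
    """Return the IDs of DOCX styles that explicitly switch widow control off."""
    root = ET.fromstring(styles_xml)
    off = []
    for st in root.iter(f"{{{W}}}style"):
        wc = st.find(f"{{{W}}}pPr/{{{W}}}widowControl")
        if wc is not None and wc.get(f"{{{W}}}val") in ("0", "false", "off"):
            off.append(st.get(f"{{{W}}}styleId"))
    return off
```

Both functions take the raw XML bytes, which can be read from the archives with Python's `zipfile` module. If a style has widow control switched off in the DOCX but carries `2`/`2` in the ODT (or the reverse), that would be one candidate for a sentence moving across a page boundary.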

Another possible issue is the implementation of the line-breaking algorithm varying between Word and Writer. If it does, there is little you can do about this.