I had a look at your original and copy documents. The problem comes from two successive conversions: PDF to .docx, then .docx to ODF.
Let’s talk about the first conversion (the .docx document – unfortunately I have no Word here and could display it only with Writer, so my explanation may not be 100% exact).
Remember that PDF is a page description format and does not keep text flow. Text is broken into lines (or parts of lines) and these segments are independently positioned on the page.
Your converter application did quite a good job of collecting the line chunks into a vertical text flow. It synthesized paragraphs within columns but failed to recognise that text flow continued to the next column. And this is not the only failure.
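To make this more concrete, here is a minimal sketch of what a converter has to do. It is not how your converter works (I don't know its internals); the `Segment` class, the `column_gap` threshold and the sample coordinates are all made up for illustration. It only shows that positioned segments carry no reading order: sorting by vertical position interleaves the columns, and the order comes out right only if columns are detected first and then chained.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    x: float   # left edge of the segment on the page (hypothetical units)
    y: float   # distance from the top of the page
    text: str


def naive_order(segments):
    """Read strictly top to bottom, then left to right: columns get interleaved."""
    return " ".join(s.text for s in sorted(segments, key=lambda s: (s.y, s.x)))


def group_into_columns(segments, column_gap=50.0):
    """Cluster segments by left edge; each cluster approximates one column."""
    columns = []
    for seg in sorted(segments, key=lambda s: s.x):
        if columns and seg.x - columns[-1][-1].x < column_gap:
            columns[-1].append(seg)
        else:
            columns.append([seg])
    # Inside a column, top-to-bottom order restores the line sequence.
    return [sorted(col, key=lambda s: s.y) for col in columns]


def column_aware_order(segments):
    """Chain the columns left to right so text continues into the next column."""
    return " ".join(s.text for col in group_into_columns(segments) for s in col)


# Four segments in arbitrary storage order, laid out as two columns.
page = [
    Segment(300, 100, "continues in"),
    Segment(40, 100, "The text starts"),
    Segment(300, 120, "the right column."),
    Segment(40, 120, "on the left and"),
]

print(naive_order(page))
print(column_aware_order(page))
```

Your converter apparently got as far as the grouping step (paragraphs inside each column) but never did the chaining step, which is why each column ended up in its own isolated frame.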
It failed to identify “Oxford J. of Archaeology p. X” as being a footer. Similarly, it missed “Basil Blackwell …” as the publisher name and address.
It was disturbed by the varying number of columns between article text, title and illustration. Consequently (here I don’t know if the converter or Word/docx format is to be blamed), the document is spread over frames which are not related at all between themselves, meaning text does not flow from one to the other. Text in any frame is self-contained; if you change frame size, text will not spill over to another frame. Sometimes, data which has nothing to do semantically with the frame contents is included in the frame (e.g. “Oxford J. of Archaeology” and/or page number).
Worse, all frames are anchored To character which implies there is some paragraph somewhere in the “standard text flow” to attach the frame to.
When you copy the .docx document to paste it into an .odt, you paste the content without change. But if your pages are not exactly identical, i.e. same size, same margins, same font, you may get a slightly different spacing of the pasted elements, enough to free up space for one line of text on the preceding page. The supporting “empty” paragraph for the last page frames can then shift to that preceding page. The attached frames are laid out with the ones already there, resulting in overlap. I tried to fix this by unchecking the Allow overlap flag for frames but it didn’t work because the frames are positioned with absolute coordinates in the page (thus they cannot move to prevent overlap). The only way was to add a manual page break but it is not always obvious to find the correct location for it.
You have other problems:
- in Writer all frames have borders (too many frames to fix easily)
- the conversion process (probably .docx → .odt) created one page style per page (once again too many styles to fix)
- the .odt document ends up direct-formatted: no way to fix it conveniently, it needs to be fully reformatted first
- the last page (bibliography) is offset to the left and is partly inside the left margin
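If you want to measure the extent of the damage before deciding whether a cleanup is worth attempting, an .odt file is a zip archive and can be inspected with a short script. This is only a sketch under assumptions: the function name `summarize_odt` is mine, it counts `draw:frame` elements by their `text:anchor-type` attribute in `content.xml`, and it counts `style:master-page` elements in `styles.xml` as a stand-in for Writer page styles. A frame whose anchor type is set in its style rather than on the frame itself is reported as "unspecified".

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# ODF namespaces used by the elements and attributes inspected here.
NS = {
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
}


def summarize_odt(path_or_file):
    """Return (Counter of frame anchor types, number of master pages)."""
    with zipfile.ZipFile(path_or_file) as odt:
        content = ET.fromstring(odt.read("content.xml"))
        styles = ET.fromstring(odt.read("styles.xml"))

    anchor_attr = "{%s}anchor-type" % NS["text"]
    anchors = Counter(
        frame.get(anchor_attr, "unspecified")
        for frame in content.iter("{%s}frame" % NS["draw"])
    )
    # Assumption: each Writer page style is saved as one style:master-page.
    master_pages = sum(1 for _ in styles.iter("{%s}master-page" % NS["style"]))
    return anchors, master_pages


if __name__ == "__main__":
    import sys

    # Usage: python summarize_odt.py yourfile.odt
    if len(sys.argv) > 1:
        anchors, master_pages = summarize_odt(sys.argv[1])
        print("frames by anchor type:", dict(anchors))
        print("page styles (master pages):", master_pages)
```

A large count under `char` and a number of master pages close to the number of pages would confirm what I saw in Writer. The script only reports; it does not fix anything.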
Your problem has no solution because the “original data” is not text-flow compliant and .docx format has brought its load of compatibility issues.
PS: Apparently, the PDF was acquired through some form of OCR processing, as shown by the “glitches” on pages 369 and 370.
To show the community your question has been answered, click the ✓ next to the correct answer, and “upvote” by clicking on the ^ arrow of any helpful answers. These are the mechanisms for communicating the quality of the Q&A on this site. Thanks!
In case you need clarification, edit your question (not an answer, which is reserved for solutions) or comment on the relevant answer.