Pasting text from PDF without losing formatting

So I use LibreOffice Writer to annotate material that originated in a PDF; the way I go about it is to use Ctrl+A, Ctrl+C on the PDF and then Ctrl+V into LibreOffice. The problem is that when I paste the text the formatting is lost: paragraph breaks, page breaks, just different things, and it makes it really hard to read the material.

I would like to paste the copied text into LibreOffice exactly as it appears in the PDF.

Is there a way to resolve this?

Win11, 7.6.0.3. Odt, FF Pdf Viewer

You can’t. PDF is a “frozen” display format. All formatting directives are converted into positioning coordinates and removed. It is strictly equivalent to an image.

The only thing you can do is to retrieve the text as a string of characters. Then you must recreate its logical structure, i.e. remove the spurious paragraph breaks at the end of each line to leave only one at the real paragraph end. You must also reformat it (apply paragraph and character styles where appropriate).
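The cleanup step can be partly automated before pasting. Below is a minimal Python sketch, not a definitive tool. The function name `unwrap_paragraphs` and the `short_ratio` threshold are made up for illustration, and the rule it uses is an assumption: a line counts as a real paragraph end if it is followed by a blank line, or if it is clearly shorter than the widest line and ends in sentence punctuation. Real PDFs will break this guess in places, so check the result by hand.

```python
import re


def unwrap_paragraphs(text: str, short_ratio: float = 0.75) -> str:
    """Join hard-wrapped lines into paragraphs (heuristic, not exact)."""
    lines = [ln.strip() for ln in text.splitlines()]
    # Use the widest line as an estimate of the column width in the PDF.
    width = max((len(ln) for ln in lines), default=0)
    paragraphs, current = [], ""
    for ln in lines:
        if not ln:
            # A blank line is treated as a paragraph boundary.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if current.endswith("-") and ln[0].islower():
            # Rejoin a word hyphenated across lines. This also merges
            # genuine compounds such as "well-" + "known" (known limitation).
            current = current[:-1] + ln
        elif current:
            current += " " + ln
        else:
            current = ln
        # Assumption: a short line ending in sentence punctuation
        # is probably the last line of a paragraph.
        if len(ln) < short_ratio * width and re.search(r'[.!?:]["”’)]?$', ln):
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current)
    return "\n".join(paragraphs)
```

The output has one line per guessed paragraph, so pasting it into Writer gives one paragraph per line, ready for paragraph and character styles to be applied.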

PS: when asking here, always mention OS name, LO version and save format.

I know this may be traveling into software other than LibreOffice, but is there any way at all of extracting the formatting through some sort of conversion?

Absolutely none when you consider the PDF definition.

Writer is missing functionality here: it should work much harder to re-create the original text. Yes, that information is not present in the PDF and recovering it needs a lot of guesswork, but the status quo in Writer PDF import, which produces the same mess of text boxes as Draw instead of reasonable body text, doesn’t make sense at all. This was discussed at the LibreOffice Conference 2023 (see the respective talk from @EyalRozenberg). In the post-talk discussion we shared some ideas on the problem, as well as on the terminology. The “editor” term used in the talk was a bit provocative, but reasonable from the user’s PoV, and we agreed that we need to work on improving PDF import support while continuing to educate users on the correct expectations about this import’s never-to-become-perfect results.

MS Word basically does it right.

There are also tools that do OCR and similar jobs, e.g. ABBYY FineReader.

OK, so in my habit of not searching before I ask questions: Adobe does this, and so do a few others. A lot of them charge, but they do work. The only problem I’m seeing with the converted files is that I can’t highlight any more for some reason.

But that does not mean it is impossible for the formatting to be kept when copying-and-pasting from PDF into LibreOffice. This is possible if the PDF reader exports this information to the clipboard in a format which LibreOffice is able to use.

However, as others suggest, that’s not the way to go with PDF annotation. Consider the following:

  1. A PDF “reader” which also supports adding annotation.
  2. A program such as Xournal++, which is FOSS, and treats the PDF as page backgrounds which you can’t touch, letting you do some scribbling, add text, images etc. on top of it.
  3. (Not really recommended) LibreOffice Draw, but using Insert Image, so that the PDF is rendered as a raster image and you use it as the background for some editing.