# PDF to Draw to Writer

... and various other documents. I find that there is a missing link between a Draw (.odg) file and a Writer file.

A LibreOffice Draw file is extremely resource-hungry: it nearly brings the system to a standstill and can't be used for regular editing of even a normal PDF file.

Version: 4.0.3.3 (Build ID: 400m0(Build:3)) on Knoppix 7.2.0

===Edit1 2018-09-11===
I came back to this old thread by accident. Since I had meanwhile analysed the task again out of interest and written some code for it, I now attach this demo containing that code. The focus has moved from PDF files to the 'Draw' document, which may or may not have been created by opening a PDF. In fact the demo shows some graphic shapes created in Draw and containing text, some of them grouped...
The main enhancement is that the DrawingDocument, its DrawPages, the ShapeCollection(s) and the Shape(s) contained therein are now resolved recursively. The contained text is first collected in an array together with the original position of each shape, and then sorted by that position before being output to a new TextDocument. Thus the order of the text is no longer defined by the logical order of the shapes but, basically, by their visual position. Doing this even more precisely would require a few corrections to the position coordinates based on the shapes' properties.
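(Editor's illustration, not the code from the attached demo.) The idea described above, resolving groups recursively and then ordering the collected texts by position, can be sketched in plain Python. The `Shape` class here is a hypothetical stand-in for a Draw shape, not the LibreOffice API, and the sketch assumes that every shape, including one inside a group, carries a page-absolute position:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Shape:
    """Hypothetical stand-in for a Draw shape (not the LibreOffice API)."""
    x: int
    y: int
    text: str = ""
    # A non-empty list marks this shape as a group of further shapes.
    children: List["Shape"] = field(default_factory=list)


def collect_texts(shapes: List[Shape]) -> List[Tuple[int, int, str]]:
    """Recursively walk shapes and groups, returning (y, x, text) records."""
    records = []
    for shape in shapes:
        if shape.children:
            # Resolve the group instead of skipping it.
            records.extend(collect_texts(shape.children))
        elif shape.text:
            records.append((shape.y, shape.x, shape.text))
    return records


def texts_in_visual_order(pages: List[List[Shape]]) -> List[str]:
    """Per page, order texts top to bottom, then left to right."""
    ordered = []
    for page in pages:
        records = collect_texts(page)
        records.sort(key=lambda r: (r[0], r[1]))
        ordered.extend(text for _, _, text in records)
    return ordered


# Logical order on the page: footer, a group of two headings, body.
page = [
    Shape(x=100, y=500, text="footer"),
    Shape(x=0, y=0, children=[
        Shape(x=300, y=100, text="right heading"),
        Shape(x=50, y=100, text="left heading"),
    ]),
    Shape(x=50, y=300, text="body"),
]
print(texts_in_visual_order([page]))
# ['left heading', 'right heading', 'body', 'footer']
```

Sorting on the raw `(y, x)` pair is the crude part: shapes that sit on the same visual line but differ slightly in `y` would need the kind of coordinate correction mentioned above.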
You may play with the attached demo by rearranging the shapes on the second slide and comparing the results.
The included sorting algorithm was written from scratch. It is optimised not for the sorting itself but for reducing the number of transpositions needed. (It is also not optimised for the specific case where no transpositions are needed at all.)
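(Editor's illustration.) The demo's own algorithm is not reproduced in this thread. As a rough idea of what "few transpositions" can mean, here is selection sort, a standard technique swapped in for illustration, which never performs more than n-1 swaps. The function name and the `(y, x, text)` record layout are assumptions:

```python
from typing import List, Tuple


def selection_sort_counting_swaps(items: List[Tuple[int, int, str]]) -> int:
    """Sort (y, x, text) records in place; return the number of swaps done."""
    swaps = 0
    n = len(items)
    for i in range(n - 1):
        # Find the smallest remaining record by tuple comparison.
        smallest = i
        for j in range(i + 1, n):
            if items[j] < items[smallest]:
                smallest = j
        # Swap only when the record is not already in place.
        if smallest != i:
            items[i], items[smallest] = items[smallest], items[i]
            swaps += 1
    return swaps


records = [(500, 100, "footer"), (100, 300, "right"), (100, 50, "left"), (300, 50, "body")]
print(selection_sort_counting_swaps(records))  # 2
print([text for _, _, text in records])        # ['left', 'right', 'body', 'footer']
```

Like the algorithm described above, this one gains nothing from input that is already in order: it performs no swaps then, but still makes every comparison.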
As always, I am interested in criticism and suggestions. (I personally never made much use of the procedure.)
The code posted below is from my original answer. It was not changed during editing.
===Edit1 End===

Even when you open a PDF the way @Mike Kaganski pointed out, it will be "unmanageable", as he also said. In the very rare cases where you urgently need to import the text content of a PDF without any formatting, you may open it in LibO Draw and then apply a "macro" that collects the texts.

Since similar requests have come up now and then, I once sketched a very raw piece of code for the purpose in BASIC.
You may use it and enhance it as needed at your own risk.

```basic
REM  *****  BASIC  *****
REM Wolfgang Jäger (Lupp); 2016-09-05; Copyleft 0
Option Explicit

REM This procedure was sketched because questions about moving the textual
REM content from pdf files opened in 'Draw' into an actual text file come up
REM now and then, and there was not offered a solution yet, as far as I know.
REM
REM Of course, this provisional code cannot replace a thorough solution
REM to the problem (if actually needed at all).
REM In specific there is NOT MADE AN ATTEMPT TO RESOLVE GROUPS or to process
REM the 'Draw' objects regarding their position. The sequencing of texts goes
REM along the logical order of the objects.
REM For a PDF automatically imported by 'Draw' this should work.

Sub experimentalExportTextFromDrawToWriterDoc(optional pNum as Long)
Dim doc0 As Object, page As Object, shape As Object, shapeText As String
Dim doc1 ...
```

Lupp, Sir! You are just marvellous! Other users need to know about this code as well! I will try it ASAP and get back to you. To enhance the code I would first need the expertise to understand it! I will definitely try! Warm regards.

( 2017-03-16 08:22:34 +0100 )

@Lupp would it be possible to convert this function into some kind of extension "Join selected text frames into one paragraph" for semi-automatic conversion? That would be even more useful, IMHO.

( 2018-12-04 10:04:51 +0100 )

Quoting @mcepl: "...would it be possible to convert this function into some kind of extension "Join selected text frames into one paragraph" for semi-automatic conversion?"
I doubt that offering the one-paragraph approach as the only option would be a good idea, but the answer is yes. Any code that runs in LibO and is based on its API should be convertible into an extension. Simply do it.
I personally am not an author of extensions.
See https://wiki.documentfoundation.org/D...

( 2018-12-04 11:10:14 +0100 )

Anyway: neither the code I presented in my answer above nor the code in the attachment to the part inserted later handles text content of the TextFrame type in Writer. It works on shapes.

( 2018-12-04 11:21:40 +0100 )

While I do agree with @JohnHa that Writer isn't good for editing PDFs, I do have some positive experience with Draw and simple PDFs.

If, however, you need to use Writer to open PDFs, then you don't need an intermediate step (Draw). Instead, use LibreOffice's File->Open... dialog and select PDF - Portable Document Format (Writer) (*.pdf) in the file type drop-down list. This will import the PDF into Writer right away... and you will surely see that it's unmanageable as well...
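(Editor's illustration.) The same Writer import filter can, as far as I know, also be requested without the dialog through the `soffice` command-line options `--infilter` and `--convert-to`. The helper below only builds such a command. The function name is hypothetical, and whether the `writer_pdf_import` filter is available depends on the installed version and on the PDF import component being present:

```python
from typing import List


def pdf_to_odt_command(pdf_path: str, out_dir: str = ".") -> List[str]:
    """Build (but do not run) a soffice call that imports a PDF via the
    Writer PDF filter and saves the result as .odt in out_dir."""
    return [
        "soffice",
        "--headless",                    # no GUI
        "--infilter=writer_pdf_import",  # assumed filter name for Writer PDF import
        "--convert-to", "odt",
        "--outdir", out_dir,
        pdf_path,
    ]


print(" ".join(pdf_to_odt_command("report.pdf", "/tmp/out")))
```

To execute it, the list could be passed to `subprocess.run(...)`. The resulting document will be just as unmanageable as one opened through the dialog, because only the way of invoking the filter changes.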

Oh! Then it's unmanageable?!

( 2017-03-16 09:07:25 +0100 )

Unfortunately yes. It's usually imported as separate text boxes, and it's very difficult to edit it unless you intend to replace one character with another... :(

( 2017-03-16 09:10:17 +0100 )

Anything else is a very poor workaround and you will tear your hair out with frustration. You will probably kick your poor cat to death as you struggle for hour after wasted hour.

No! There's absolutely no need to pay for Acrobat when free software will do the same!

( 2017-03-10 21:25:07 +0100 )

Dear Sir rautamiekka, could you please elaborate?

( 2017-03-16 09:08:48 +0100 )