1

Pdf to Draw to Writer

asked 2015-07-16 07:38:10 +0200

I have read the following links:

link 1

link 2

link 3

and various other documents. I find that there is a missing link between a Draw .odg file and a Writer file.

Can this missing link be bridged, please?

Editing a LibreOffice Draw file is extremely resource-hungry: it nearly brings the system to a standstill, so Draw can't be used for routine editing of even an ordinary pdf file.

Version: 4.0.3.3 (Build ID: 400m0(Build:3)) in Knoppix 7.2.0

3 Answers

4

answered 2017-03-10 12:05:23 +0200

Lupp

updated 2018-09-11 15:07:41 +0200

===Edit1 2018-09-11===
I came back to this old thread by accident. Since I had meanwhile analysed the task again out of interest, and had written and arranged some code for it, I now attach this demo containing that code. The focus has moved from the pdf files to the 'Draw' document, which may or may not have been created by opening a pdf. In fact, the demo shows some graphic shapes created in Draw and containing text, some of them grouped...
The main enhancement is that the DrawingDocument, its DrawPages, the ShapeCollection(s) and the Shape(s) contained therein are now resolved recursively. The contained text is first collected in an array together with information about the original position of the respective shape, and is then sorted by that information before being output to a new TextDocument. Thus the order of the text is no longer defined by the logical order of the shapes but, basically, by their visual position. Doing this even more precisely would require a few corrections to the position coordinates based on the shapes' properties.
You may play with the example by rearranging the shapes on the second slide and comparing the results.
The included sorting algorithm was written from scratch. It is optimised not for the sorting itself but for reducing the number of transpositions needed. (It is also not optimised for the specific task, where no transpositions at all would be needed.)
As always, I am interested in criticism and suggestions. (I personally never made much use of the procedure.)
The code posted below is from my original answer. It was not changed during editing.
===Edit1 End===
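
To illustrate the idea described in the edit above (resolve groups recursively, collect each text together with its position, then sort by position), here is a minimal sketch in plain Python. It is not the code from the attached demo and it does not use the LibreOffice API: the `Shape` class, its fields and the function names are hypothetical stand-ins, and positions are assumed to be absolute page coordinates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shape:
    """Hypothetical stand-in for a Draw shape (NOT the UNO API)."""
    x: int = 0
    y: int = 0
    text: str = ""
    # A non-empty list marks this shape as a group of other shapes.
    children: List["Shape"] = field(default_factory=list)


def collect_texts(shapes, out=None):
    """Walk shapes and nested groups, gathering (y, x, text) entries."""
    if out is None:
        out = []
    for shape in shapes:
        if shape.children:
            # Resolve the group recursively instead of skipping it.
            collect_texts(shape.children, out)
        elif shape.text:
            out.append((shape.y, shape.x, shape.text))
    return out


def texts_in_visual_order(pages):
    """Return all texts, page by page, top-to-bottom then left-to-right."""
    result = []
    for page in pages:
        entries = collect_texts(page)
        entries.sort(key=lambda e: (e[0], e[1]))  # sort by y, then by x
        result.extend(text for _, _, text in entries)
    return result


# One page whose logical order differs from its visual order.
page = [
    Shape(x=500, y=3000, text="third"),
    Shape(children=[
        Shape(x=4000, y=1000, text="second"),
        Shape(x=500, y=1000, text="first"),
    ]),
]
print(texts_in_visual_order([page]))  # ['first', 'second', 'third']
```

The sketch uses Python's built-in sort rather than a hand-written algorithm, and it compares exact y values. Real shapes on the same visual line rarely share exactly the same y coordinate, so a tolerance would probably be needed in practice.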

Even if you open a pdf the way @Mike Kaganski described, it will be "unmanageable", as he also said. In the very rare cases where you urgently need to import the text content of a pdf without any formatting, you may open it in LibO Draw and then apply a "macro" that collects the texts.

Since similar requests recurred now and then, I once sketched a very raw piece of code in BASIC for the purpose.
You may use it and enhance it as needed at your own risk.

REM  *****  BASIC  *****
REM Wolfgang Jäger (Lupp); 2016-09-05; Copyleft 0 
Option Explicit

REM This procedure was sketched because questions about moving the textual
REM content from pdf files opened in 'Draw' into an actual text file come up
REM now and then, and no solution had been offered yet, as far as I know.
REM
REM Of course, this provisional code cannot replace a thorough solution
REM to the problem (if one is actually needed at all).
REM Specifically, NO ATTEMPT IS MADE TO RESOLVE GROUPS or to process
REM the 'Draw' objects with regard to their position. The texts are sequenced
REM in the logical order of the objects.
REM For a PDF automatically imported by 'Draw' this should work.

Sub experimentalExportTextFromDrawToWriterDoc(optional pNum as Long)
    Dim doc0 As Object, page As Object, shape As Object, shapeText As String
    Dim doc1 ...

Comments

1

lupp, Sir! You are just marvellous! Other users also need to know about this code as well! I will try it ASAP and get back to you. To enhance the code I would need to have the expertise to understand it first! I will try to, but definitely try! Warm regards.

bkpsusmitaa (2017-03-16 08:22:34 +0200)

@Lupp would it be possible to convert this function into some kind of extension "Join selected text frames into one paragraph" for semi-automatic conversion? That would be even more useful, IMHO.

Also, https://bugs.documentfoundation.org/s...

mcepl (2018-12-04 10:04:51 +0200)

Quoting @mcepl: "...would it be possible to convert this function into some kind of extension "Join selected text frames into one paragraph" for semi-automatic conversion?"
I doubt that offering the one-paragraph behaviour as the only option would be a good idea; the answer, however, is yes. Any code that runs in LibO based on its API should be convertible to an extension. Simply do it.
I personally am not an author of extensions.
See https://wiki.documentfoundation.org/D...

Lupp (2018-12-04 11:10:14 +0200)

Anyway: Neither the code I presented in my answer above nor the code in the attachment to the part inserted later handles text content of the TextFrame type in Writer. It works on shapes.

Lupp (2018-12-04 11:21:40 +0200)
1

answered 2017-03-10 06:51:38 +0200

While I do agree with @JohnHa that Writer isn't good for editing PDFs, I do have some positive experience with Draw and simple PDFs.

If, however, you need to use Writer to open PDFs, you don't need an intermediate step (Draw). Instead, use LibreOffice's File->Open... dialog and select "PDF - Portable Document Format (Writer) (*.pdf)" in the file type drop-down list. This will import the PDF into Writer right away... and you will surely see that it's unmanageable as well...
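
For those who prefer the command line, the same Writer import can probably be requested when starting LibreOffice. The following is only a sketch under assumptions: it assumes that a `soffice` executable is on the PATH, that the installed version accepts the `--infilter` option, and that the Writer PDF import filter is named `writer_pdf_import`. The helper name `writer_pdf_import_command` is made up for illustration.

```python
def writer_pdf_import_command(pdf_path, soffice="soffice"):
    """Build the argument list for opening a PDF via the Writer import filter.

    Assumptions: `soffice` is on the PATH and the filter is called
    'writer_pdf_import' in the installed LibreOffice version.
    """
    return [soffice, "--infilter=writer_pdf_import", pdf_path]


cmd = writer_pdf_import_command("example.pdf")
print(" ".join(cmd))  # soffice --infilter=writer_pdf_import example.pdf

# To actually launch LibreOffice with this command:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The result is expected to be just as unmanageable as with the File->Open... dialog; only the way of starting the import differs.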

Comments

Oh! Then it's unmanageable?!

bkpsusmitaa (2017-03-16 09:07:25 +0200)

Unfortunately yes. It's usually imported as separate text boxes, and it's very difficult to edit it unless you intend to replace one character with another... :(

Mike Kaganski (2017-03-16 09:10:17 +0200)
0

answered 2017-03-09 21:32:51 +0200

JohnHa

updated 2017-03-09 21:33:55 +0200

If you need to edit a PDF, buy Adobe Acrobat.

Anything else is a very poor workaround and you will tear your hair out with frustration. You will probably kick your poor cat to death as you struggle for hour after wasted hour.

Comments

No! There's absolutely no need to pay for Acrobat when free software will do the same!

rautamiekka (2017-03-10 21:25:07 +0200)

Dear Sir rautamiekka, could you please elaborate?

bkpsusmitaa (2017-03-16 09:08:48 +0200)