Automate inserting hyperlinks to a table of contents inside an imported PDF (LoDraw)

Hi.

I have imported a PDF datasheet for an electronics part into LibreOffice Draw; its table of contents has no hyperlinks. It is a 300-page document and it imports perfectly into LODraw, except that the fonts may not be 100% identical since the document was created on a Windows machine and I’m running Draw on Manjaro Linux. Anyway, this is not the reason I am asking here.

I’d like to automate the task of creating a hyperlink on each page number in the TOC. The TOC is about ten pages long, and each page contains a few dozen entries. Although generating a PDF back again seems to work flawlessly, doing it all manually is extremely tedious, as I need more than a dozen keystrokes to update a single link. It is also error-prone.

What I’d like is to automate it all. A typical TOC line appears as follows:

<section ID> <text> ......................... <page n°>

Fields are separated with spaces, although it doesn’t really matter.
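For illustration only, here is a minimal Python sketch of how a line in that layout could be split into its three fields. The pattern `TOC_LINE` and the helper `parse_toc_line` are hypothetical names, and the sample lines are made up; the real datasheet may need a looser or stricter pattern.

```python
import re

# Hypothetical pattern for "<section ID> <text> ....... <page n°>".
# The title is matched lazily so the dot leaders and the trailing
# page number are not swallowed into it.
TOC_LINE = re.compile(
    r"^\s*(?P<section>\S+)\s+(?P<title>.*?)[\s.]*(?P<page>\d+)\s*$"
)

def parse_toc_line(line):
    """Return (section, title, page) or None if the line does not match."""
    m = TOC_LINE.match(line)
    if m is None:
        return None
    return m.group("section"), m.group("title"), int(m.group("page"))

# Made-up sample line, not taken from the datasheet.
print(parse_toc_line("3.2 Electrical characteristics ......... 47"))
```

Lines that do not end in digits simply return `None`, so they can be skipped without raising an error.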

So I’d like to turn each page number into a hyperlink to the corresponding page. I can think of two scenarios:

  • double-clicking the page number inside the text frame, then running a macro that inserts the hyperlink to the required page (the keyboard equivalent is: Ctrl+K, Tab, Tab, Tab, Tab, Enter or Space, , Enter, Tab, Enter, Tab, Enter; the last two Tab/Enter pairs close the dialog boxes);
  • the macro scans all the text frames on pages 2 to 10 and inserts the hyperlink wherever a given pattern matches.

The latter option would be ideal for me.

I have never programmed under LibreOffice but I am a developer. How could/would I implement such a functionality?

Here are the operations I suppose I would need in order to achieve this:

  • enumerate text frames in a document (whose names follow a known pattern);
  • retrieve the text;
  • look at the last characters and make sure they’re digits (ideally, digits that match a page number);
  • insert the hyperlink, preferably programmatically instead of showing the dialogue.
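The four operations above could be sketched in a Python macro roughly as follows. This is an untested outline under several assumptions, not a known-good solution: that each TOC entry's page number sits at the end of a shape supporting `com.sun.star.drawing.Text`, that the printed page number plus a fixed `offset` equals the Draw page index, that shapes are not nested in groups, and that an in-document link has the form `"#" + page.Name` (compare with a link created by hand via Ctrl+K before relying on it). `link_toc_pages`, `default_url` and `run` are hypothetical names.

```python
import re

# A trailing run of digits, optionally followed by whitespace.
TRAILING_PAGE = re.compile(r"(\d+)\s*$")

def default_url(page):
    # ASSUMPTION: internal links look like "#<page name>"; verify this
    # against a hyperlink inserted manually in the same document.
    return "#" + page.Name

def link_toc_pages(doc, first_page=2, last_page=10, offset=0,
                   make_url=default_url):
    """Replace trailing page numbers with URL fields; return how many."""
    pages = doc.getDrawPages()
    total = pages.getCount()
    linked = 0
    for p in range(first_page - 1, min(last_page, total)):
        page = pages.getByIndex(p)
        for s in range(page.getCount()):
            shape = page.getByIndex(s)
            # Skip lines, images and other shapes that carry no text.
            if not shape.supportsService("com.sun.star.drawing.Text"):
                continue
            match = TRAILING_PAGE.search(shape.getString())
            if match is None:
                continue
            target = int(match.group(1)) + offset  # 1-based Draw page
            if not 1 <= target <= total:
                continue
            # Select the digits (and trailing whitespace) from the end.
            cursor = shape.createTextCursor()
            cursor.gotoEnd(False)
            cursor.goLeft(len(match.group(0)), True)
            # ASSUMPTION: this service name is accepted by Draw documents.
            field = doc.createInstance("com.sun.star.text.TextField.URL")
            field.Representation = match.group(1)
            field.URL = make_url(pages.getByIndex(target - 1))
            # True = replace the selected text with the field.
            shape.insertTextContent(cursor, field, True)
            linked += 1
    return linked

def run(*args):
    # Entry point for Tools > Macros; XSCRIPTCONTEXT is supplied by
    # LibreOffice when the script runs inside the office.
    link_toc_pages(XSCRIPTCONTEXT.getDocument())
```

Running it on a copy of the document first, and on a single TOC page (`first_page=2, last_page=2`), would be a sensible way to check whether the imported text frames are really laid out as assumed.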

The thing is I have no idea how to do any of that.

Thanks a lot in advance for any hint/suggestion/guideline.

C.F.

I appreciate your comments, but Draw is a drawing tool, not a typewriter or a text program or a PDF-Editor.

Workaround suggestion:

Scan your PDF and enter the raw text in a Writer document. Then use styles, e.g. “Body Text” for the normal paragraphs and the existing “Heading 1” to “n”.

If you use the built-in headings, you can create an automated table of contents.


Getting started with professional text composition in Writer

English documentation


What?

What is the point of programming inside LibreOffice?
What is the point of this forum if the only answer I get is to go back to the stone age?

I don’t have a printer, all I have is a computer and a programming interface! I’m asking here because I cannot figure out by reading the documentation how to get to the goal.

Well, thanks for your suggestion, guess I’ll have to suck it up and do it on my own…

Sometimes, I swear… I regret asking questions.

Don’t bother, I’m closing this account. I’ll look elsewhere.

Scanning is just a generic term, you do not need a printer for this.
There are programs that can convert a PDF with OCR into a simple text, this is just one example:

As you have noticed yourself, your described procedure could be very error-prone.
I have therefore suggested a workaround that will lead you to a safe destination.


My answer does not have to be the only one. There are many other users here who can certainly help you.


If you would like to contribute to the development of LibreOffice, take a look here:

It could be very complicated, because some PDFs have strange parts, not only easily detected text frames. Can you upload a sample part of your PDF?

@KamilLanda

It’s not about editing a PDF but a LibreOffice Draw document (which happens to result from successfully importing a PDF datasheet) to automate the process of inserting hyperlinks, which I unambiguously described in my initial post.

@Hrbrgr

You’re just cherry picking.

WHAT IS THE POINT OF HAVING A COMPLETE DEVELOPMENT ENVIRONMENT EMBEDDED IN LIBREOFFICE THAT IS SUPPOSED TO ALLOW AUTOMATING SUCH TASKS IF ALL YOU HAVE TO ANSWER IS NOT MORE ELABORATE THAN ASKING THE PUBLISHER TO UPDATE THEIR DOCUMENT?

WHAT IS THE POINT OF HAVING A COMPLETE DEVELOPMENT ENVIRONMENT IF NOT TO AUTOMATE THOSE TASKS?

You’re basically telling me to ignore the presence of a tool, which implies that tool is completely useless, which then suggests LibreOffice is a bloated software suite. If I cannot use Basic/Macros/Python (whatever) to automate these kinds of tasks, what’s the point of macros at all?

Your responses are just unacceptable, they’re a slap in the face of the user.

But I guess you want to export to a working PDF afterwards. So I would describe this as modifying a PDF, but it is your problem, so you choose your way…
The problem is that after two conversions (document to PDF, then to Draw’s internal structures) you have to find out which objects you are actually editing. So I guess I can only recommend the basics:
There is an introduction to macro programming by Andrew Pitonyak, “Macros Explained”. An object inspector like MRI will also be quite useful…

https://www.pitonyak.org/oo.php

Maybe somebody here is willing to provide more detailed help, but as @KamilLanda already asked: please upload a sample PDF. There is no “standard” for how a PDF is structured. The format was not designed to be edited, even if editing is possible.

I use several macros for Writer, Calc and Base, so the tool can be useful, even if it may not solve your problem.

Yes, for a PDF-Editor it is quite bloated - but your intended use is not the typical one.

Considering you are on a site where volunteers answer questions for free, your tone may not encourage many to help you…


Perhaps OP didn’t take into account that not everyone on this website is a native English speaker.
And apparently OP doesn’t know that capitalization is perceived as shouting.
And maybe he also didn’t take into account that there is a general netiquette (Wikipedia).


Have you ever tried editing the imported PDF document in Draw? The problem is that some PDFs contain a variety of text boxes, and some of them aren’t normally visible. In other words: in my experience, PDFs often contain strange extra empty text boxes.

So if you don’t upload the example PDF, I could still try to make a macro that adds the hyperlinks to an ODG file, but of course it will not cover your preferred option (the macro scans all the text frames on pages 2 to 10), because I could only test it on some simple text frames in an ODG file.

Here is an example of how to insert hyperlinks into an ODG file via a macro:
link-in-Draw.odg (90.3 kB)