Error! Reference source not found converting doc to pdf with writer

asked 2020-09-11 16:39:24 +0200

beachrunner24

updated 2020-09-11 22:16:16 +0200

Hi,

I am using your API to convert MS Word docs to PDF. The Word doc has bookmarks in it. When I convert, I get "Error! Reference source not found" on all the bookmarked fields.

Is there a flag or method call to get those exported with the conversion to PDF?

Thanks,

Eric Clarke

Attachment: test-doc.doc


Comments

What happens if you open the .doc manually into LO Writer? Are the same errors displayed? If so, are you sure the bookmarks are really defined? If you only have references to absent bookmarks, nothing can be done.

Does the same .doc open correctly in MS Word?

By the way, please give your OS name (assumed to be Windows) and LO version.

ajlittoz ( 2020-09-11 17:05:21 +0200 )

The screenshot is useless. Attach the original file (reduced to 1 or 2 pages provided it still exhibits the problem).

To attach a file: edit your question (you can't attach to a comment), enter at least 2 blank lines at end, use the "paper clip" tool to select a file.

ajlittoz ( 2020-09-11 19:38:53 +0200 )

I can't upload the file because it contains PII and HIPAA laws prevent me from sending it. If I remove the PII, all the document references disappear. I just need to know whether your API has a flag or method, that you know of, that can suppress the references or skip updating them when we open the file programmatically.

beachrunner24 ( 2020-09-11 21:41:35 +0200 )

Can you try this file? I changed most of the data and kept the references. Let me know if this works.

beachrunner24 ( 2020-09-11 21:47:49 +0200 )

File is attached: 0A34339F-8A33-436E-84C4-039DA42C7782.doc

beachrunner24 ( 2020-09-11 21:48:37 +0200 )

Updated link: test-doc.doc

beachrunner24 ( 2020-09-11 22:17:38 +0200 )

2 Answers

1

answered 2020-09-12 11:17:52 +0200

ajlittoz

updated 2020-09-14 17:01:00 +0200

I opened your sample file in Writer (because I don't have Word).

I get Error: Reference source not found.

The Navigator (F5 or sidepane) shows no bookmark, which is confirmed by looking at the bookmark dictionary with Insert>Bookmark.

The target bookmark is NG_MACRO. With such a name, you may have attempted to define the bookmark with a macro, but the macro dictionary is also empty.

Macros will not be converted from .doc to Writer because the macro languages are not the same.

If you're looking for a workaround, edit your question to describe the intent of this bookmark reference (from a user point of view, i.e. it should echo such part of the document where the data is created in this way). Give the goal of the document and how you use it. This will help to understand your workflow and to suggest an alternative.

EDIT 2020-09-14

I had a more careful look at the attached file with a hexadecimal editor. What we find in the binary .doc is something like

 Date of call # REF NG_MACRO "STANDARD" "tc_date" # 08/02/2016 #

where I use # to represent various binary bytes. The binary bytes are likely to be function encodings for the strings which follow. I am not familiar with the DOC format, but it is likely that the field/bookmark name is NG_MACRO, containing a value of type "tc_date" with "STANDARD" formatting. The last updated value is stored here as 08/02/2016, for use when the source is unavailable or the field is not updated.
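To check this across many files without opening each one in a hex editor, the layout above can be scanned for directly. The following is a minimal sketch, assuming the # bytes are the Word binary format's field marks (0x13 field begin, 0x14 separator, 0x15 field end) and that the text is stored as 8-bit characters; neither assumption is confirmed in this thread. The function name cached_ref_fields is hypothetical. A robust reader would parse the OLE container and the document's text table instead of scanning raw bytes.

```python
import re

# Assumed field marks of the Word binary format:
# 0x13 = field begin, 0x14 = field separator, 0x15 = field end.
_FIELD_RE = re.compile(rb"\x13([^\x13\x14\x15]*)\x14([^\x13\x14\x15]*)\x15")


def cached_ref_fields(raw: bytes, encoding: str = "cp1252"):
    """Return (bookmark_name, cached_result) pairs for REF fields in raw bytes.

    Heuristic only: nested fields and 16-bit text runs are not handled.
    """
    found = []
    for match in _FIELD_RE.finditer(raw):
        # Field instruction, e.g. ' REF NG_MACRO "STANDARD" "tc_date" '
        instruction = match.group(1).decode(encoding, errors="replace").split()
        # Cached field result, e.g. ' 08/02/2016 '
        result = match.group(2).decode(encoding, errors="replace").strip()
        if len(instruction) >= 2 and instruction[0].upper() == "REF":
            found.append((instruction[1], result))
    return found


# Synthetic bytes mimicking the fragment quoted above (not a real .doc file).
sample = b'Date of call \x13 REF NG_MACRO "STANDARD" "tc_date" \x14 08/02/2016 \x15'
print(cached_ref_fields(sample))  # [('NG_MACRO', '08/02/2016')]
```

On a real file you would pass the result of open(path, "rb").read(). If nothing is found, the text may be stored as 16-bit characters, which this sketch does not cover.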

I opened the .doc file in Writer and saved it as .fodt. I examined the resulting XML with a text editor. The field is translated as:

 <text:bookmark-ref text:reference-format="text" text:ref-name="NG_MACRO">Error: Reference source not found</text:bookmark-ref>

Note that it is considered as a bookmark. A cross-reference would have been translated as text:reference-ref. The formatting code is not kept, nor the type. After all, a bookmark is a shortcut for a location in the document and can't have a time value while a field can:

 <text:date style:data-style-name="N37" text:date-value="2020-09-14T16:43:21.129685132" text:fixed="true">09/14/20</text:date>

In this example, note that type, formatting and last used value are clearly mentioned.

What I didn't show is how a bookmark reference is translated in XML. An ODF reference to a bookmark doesn't cache the bookmark target (why should it because the bookmark is supposed to be defined in the same document?)
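The same check can be run against the .fodt output: list every text:bookmark-ref whose target is not defined anywhere in the document. This is a minimal sketch using only the Python standard library; the function name dangling_bookmark_refs and the sample XML are illustrative, and it assumes bookmarks appear as text:bookmark or text:bookmark-start elements in the ODF text namespace.

```python
import xml.etree.ElementTree as ET

# ODF "text" namespace, as declared in flat ODT (.fodt) files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def dangling_bookmark_refs(fodt_xml: str):
    """Return names used by text:bookmark-ref that no bookmark defines."""
    root = ET.fromstring(fodt_xml)
    defined = set()
    # Point bookmarks and the start marks of range bookmarks both carry text:name.
    for tag in ("bookmark", "bookmark-start"):
        for node in root.iter(f"{{{TEXT_NS}}}{tag}"):
            defined.add(node.get(f"{{{TEXT_NS}}}name"))
    missing = []
    for ref in root.iter(f"{{{TEXT_NS}}}bookmark-ref"):
        name = ref.get(f"{{{TEXT_NS}}}ref-name")
        if name not in defined and name not in missing:
            missing.append(name)
    return missing


# Illustrative document: one reference without a target, one with a target.
sample = """<office:document
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:body><office:text>
    <text:p>Date of call <text:bookmark-ref text:reference-format="text"
      text:ref-name="NG_MACRO">Error: Reference source not found</text:bookmark-ref></text:p>
    <text:p><text:bookmark text:name="intro"/><text:bookmark-ref
      text:reference-format="text" text:ref-name="intro">Intro</text:bookmark-ref></text:p>
  </office:text></office:body>
</office:document>"""

print(dangling_bookmark_refs(sample))  # ['NG_MACRO']
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring. A non-empty result means Writer has no bookmark to resolve, which matches the empty Navigator described above.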

My opinion is the original document erroneously used the bookmark feature for a field reference (such as current date of insertion, fixed = not updatable). Writer will not transform a bookmark ref into something else, even if DOC data may suggest otherwise to us humans.

Another possibility is the original file is incomplete: some part containing the bookmark is missing. Writer cannot regenerate the missing part. Who could?

If it opens correctly in Word, you should try to convert to PDF from Word ...


Comments

Thanks for the reply,

These Word docs are sent to us by a client. We have 25-30 million of them. Our intent is to use the LibreOffice API to convert each one to a PDF. Unfortunately, we get the same error. Does the LibreOffice API have anything to remove the bookmark (NG_MACRO) programmatically before we call the convert method to convert to a PDF?

beachrunner24 ( 2020-09-12 13:48:31 +0200 )

I've never written a Writer macro, preferring to focus my attention on styles. Hopefully a macro guru will come by this question.

Meanwhile, if you have Word, try to see if the document contains macros and what they do. This could give a hint about getting rid of the bookmark before converting.

ajlittoz ( 2020-09-12 14:27:34 +0200 )

I work with beachrunner and just wanted to try to clear up some confusion. We don't care about the macro. If you open the file in Notepad, you can see the value has already been calculated and saved in the document before it was sent to us. We want to IGNORE the reference/bookmark/macro and simply display the value that is already in the document, instead of trying to resolve a bookmark/macro that does not exist.

jhegel ( 2020-09-14 15:45:20 +0200 )

Then attach a meaningful sample file. The one provided in the question is reduced to the cross-reference to the missing bookmark. The sample file should be representative of its usage, i.e. if I understand correctly, it should contain the "value" in addition to the faulty field. I'm waiting for it to experiment.

ajlittoz ( 2020-09-14 15:59:07 +0200 )

The attached file does have the date.

Here is a screenshot comparing LibreOffice, Word, and what the contents of the file look like in Notepad: https://gyazo.com/5e0f4745f01e4e3cb97...

jhegel ( 2020-09-14 16:10:08 +0200 )

Also, we are using the LibreOffice API via Java to do these conversions. There is an option called "UpdateDocMode" that can be set to "NO_UPDATE" (i.e. a value of 0). We figured this would cause the reference not to be resolved and the existing field data to be used instead, but it does not seem to do that, which is why we came to the forums.

jhegel ( 2020-09-14 16:17:50 +0200 )

@ajlittoz - were you able to see the issue? Any ideas?

Edit: I just noticed your edit, my apologies. We cannot just print to PDF; I am trying to build an API that does conversions. This project covers over 50 million documents that have this issue, and I was using LibreOffice to help. It looks like we may have to go with a different converter, as most converters use the existing value just fine; LibreOffice is one of the few that doesn't. The problem with a lot of other converters, however, is that their output does not look very good (i.e. tables and other text formatting don't convert well), which is why I was really hoping LibreOffice would work, especially with the "NO_UPDATE" option, as that appears to be its purpose, though perhaps it wasn't extended to bookmarks. I may try to open an issue ticket to see if it's something ...

jhegel ( 2020-09-21 16:03:25 +0200 )

Have you considered my suggestion to use a print-to-file driver, which would produce a PDF file without a dedicated converter? You request Word to print the document "as usual" and the driver does the conversion on the fly. This way, formatting is done by Word, guaranteeing correct positioning. Another approach, again with a print-to-file driver, would be to create a PostScript file and use a PS-to-PDF converter.

Requesting Word to print documents can be done through scripting.

ajlittoz ( 2020-09-21 16:49:58 +0200 )
0

answered 2020-09-11 19:19:17 +0200

beachrunner24

It opens fine in MS Word. Attached (bookmarked_page.docx) is a screenshot of the bookmark. If I remove the bookmark name, then I get the same error in the Word doc.

It shows the same errors in LibreOffice Writer.

The fields were updated before the .doc was saved, so we just want to open the doc without it trying to update from the references, and display the existing/current values.

Is this possible with the LibreOffice API?


Comments

Please note for the future:
1. The answer box is reserved for answers.
2. For communication, use the comments.
3. To add additional information or clarification, please edit your original question.

This helps keep the site usable for everyone.
Thanks.


Ask/Getting Started - The Document Foundation Wiki https://wiki.documentfoundation.org/A...

igorlius ( 2020-09-11 21:27:42 +0200 )