Converting DOCX to PDF covers fields with grey blocks when they are MS Word input fields

Hi all,

We updated our LibreOffice from 7.3.4 to 7.6.4, which has solved many problems. After converting some documents we see that there are grey blocks in the PDF.

When you click the grey blocks, you will see the value underneath and you get the option to even select a different value.
These fields are date and dropdown fields from MS Word.

We are using the DocumentConverter implementation of jodconverter, and I have yet to find anyone with a similar problem or a solution.
greyblocks.docx (33.9 KB)
greyblocks.pdf (50.5 KB)

Your “fields” are not Writer fields. I don’t know what they are because the Navigator reports no fields (as Writer native objects) are used in the document. In addition, conversion from DOCX created a document I’d qualify as “non-standard”, i.e. one with a lot of direct formatting which can’t be controlled with styles (a lot more than usual) and (IMHO) bad choice of “object” type.

Since I can’t identify what your “fields” end up as in Writer, I’d recommend you switch to full Writer format if you’re the document’s author and recreate them to be ODF-compliant.

Welcome!
Okay, I saw what you get as a result of the conversion. What did you expect to receive? I mean, what do you want to see in the PDF file: text from the current “field” values without gray blocks, or a data entry field?

I’m not afraid to seem ignorant - I don’t hide the fact that I don’t know some things. For example, I’ve never used jodconverter. Please tell us what it is, where you got it, what’s inside it and how it works. And why did you choose this to solve your problems?

If you are forced to convert DOCX, then why not winword.exe or DocTo ?

If you use LibreOffice for conversion, are you doing everything correctly?

Note that e.g. in v.7.5, Word’s content controls support was improved in Writer, and in particular, their export was implemented as form fields.

So if you do not need your exported PDFs to behave like forms, you need to disable the “Create PDF form” option in File → Export As → Export as PDF.

Wrt the color of these fields - I guess that you may use dark theme, and be affected by one of the problems where the theme colors affect PDF export. See e.g. tdf#150786 - which is itself solved, but maybe your case is similar but different?

@ajlittoz : The fields are “Content Controls”. You can reach them in Writer through Form → Content Controls. These controls were added in one of the first LO 7 versions.

To edit a control, place the cursor in it and open Form → Content Controls → Properties.
The first listbox isn’t well formatted, so no content is shown for its first entry.

I tested the document by exporting it to *.pdf (with Create PDF form enabled, the default here). The controls then appear as form controls. I see none of the reported design problems in Okular, the viewer here on KDE, but this might be different in Adobe Acrobat Reader or with a dark theme.

Hi John, we use jodconverter as a framework to send documents to an executable on a server, which uses LibreOffice for the conversion.

It runs the same commands as you added in your answer from Stack Overflow. However, we left everything at the defaults: no extra parameters or config settings for LibreOffice.

This made me think that something has changed in the default conversion between 7.3.4 and 7.6.4. We can’t pinpoint where we have to make changes to get the same conversion result as with 7.3.4.

EDIT:
When we installed/updated LibreOffice on the server, we used the following commands:

RUN tar -xvf LibreOffice_7.6.4_Linux_x86-64_deb.tar.gz
RUN rm LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libobasis7.6-firebird_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libobasis7.6-kde-integration_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libobasis7.6-onlineupdate_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libobasis7.6-postgresql-sdbc_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libobasis7.6-python-script-provider_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libobasis7.6-gnome-integration_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libreoffice7.6-debian-menus_7.6.4-1_all.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libreoffice7.6-dict-es_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libreoffice7.6-dict-fr_7.6.4.1-1_amd64.deb \
        LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/libreoffice7.6-draw_7.6.4.1-1_amd64.deb
RUN dpkg -i LibreOffice_7.6.4.1_Linux_x86-64_deb/DEBS/*.deb
RUN ln -s /opt/libreoffice7.6/program/soffice /usr/local/bin/soffice
RUN ln -s /opt/libreoffice7.6/program/soffice /usr/local/bin/lowriter

I read there might be missing libs, but I can’t pinpoint if this may be the reason.

Thanks, I think I’m starting to understand the process. But I still don’t know what the PDF file should look like after conversion - with input fields or with plain text?

Isn’t that what @mikekaganski wrote about?

greyblocks-converted-from-7-3-4.pdf (40.8 KB)
I have added the conversion from version 7.3.4 as the expected result.

Indeed as @mikekaganski has said, the default conversion gives me form fields instead of plain text.

Either the input documents have to be edited, which is something we would like to avoid as we have a lot of them, or there is a setting that converts the fields to plain text.
I pray for the latter.

I just did an experiment. I opened an arbitrary document, selected File - Export as PDF from the menu and made sure that the checkbox that @mikekaganski wrote about was checked.

After that, the command

soffice.exe --convert-to pdf:writer_pdf_Export greyblocks.docx

gave me this file - greyblocks_as_form.pdf (48.9 KB).

Then I repeated this command, after unchecking this checkbox, and received this file - greyblocks.pdf (46.2 KB).

I did all this with 7.6.3.2 (X86_64)

Hi John,

Thanks for experimenting. If I export the PDF through LibreOffice Writer with the checkbox unchecked, it works as I also get plain text.

This has given me an idea.

If you disable “Create PDF Form”, this will add a config in the registrymodifications.xcu:

<item oor:path="/org.openoffice.Office.Common/Filter/PDF/Export">
       <prop oor:name="ExportFormFields" oor:op="fuse">
        <value>false</value>
       </prop>
   </item>

We replace the registrymodifications.xcu together with the LibreOffice package during installation/update.
I tried the same setting with jodconverter, and sadly it did not give the same result.
I double-checked my registrymodifications.xcu and the entry was still present.

FYI, the full registrymodifications.xcu:

<?xml version="1.0" encoding="UTF-8"?>
<oor:items xmlns:oor="http://openoffice.org/2001/registry" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <item oor:path="/org.openoffice.Office.Common/Misc">
      <prop oor:name="FirstRun" oor:op="fuse">
         <value>false</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Common/Misc">
      <prop oor:name="Persona" oor:op="fuse">
         <value>no</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Common/Misc">
      <prop oor:name="PersonaSettings" oor:op="fuse">
         <value />
      </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Common/Misc">
      <prop oor:name="UseOpenCL" oor:op="fuse">
         <value>false</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Common/Filter/HTML/Export">
      <prop oor:name="Encoding" oor:op="fuse">
         <value>76</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Common/Filter/HTML/Export">
      <prop oor:name="LocalGraphic" oor:op="fuse">
         <value>false</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Common/Filter/HTML/Export">
      <prop oor:name="Warning" oor:op="fuse">
         <value>false</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Common/Filter/PDF/Export">
       <prop oor:name="ExportFormFields" oor:op="fuse">
        <value>false</value>
       </prop>
   </item>
   <item oor:path="/org.openoffice.Office.Logging/Settings">
      <node oor:name="unopkg" oor:op="replace">
         <prop oor:name="LogLevel" oor:op="fuse">
            <value>2147483647</value>
         </prop>
         <prop oor:name="DefaultHandler" oor:op="fuse">
            <value>com.sun.star.logging.FileHandler</value>
         </prop>
         <node oor:name="HandlerSettings">
            <prop oor:name="FileURL" oor:op="fuse">
               <value>$(userurl)/$(loggername).log</value>
            </prop>
         </node>
         <prop oor:name="DefaultFormatter" oor:op="fuse">
            <value>com.sun.star.logging.PlainTextFormatter</value>
         </prop>
         <node oor:name="FormatterSettings" />
      </node>
   </item>
   <item oor:path="/org.openoffice.Office.Recovery/RecoveryInfo">
      <prop oor:name="SessionData" oor:op="fuse">
         <value>false</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Setup/L10N">
      <prop oor:name="ooLocale" oor:op="fuse">
         <value>en-US</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Setup/Office">
      <prop oor:name="LastCompatibilityCheckID" oor:op="fuse">
         <value>728fec16bd5f605073805c3c9e7c4212a0120dc5</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Setup/Office">
      <prop oor:name="OfficeRestartInProgress" oor:op="fuse">
         <value>false</value>
      </prop>
   </item>
   <item oor:path="/org.openoffice.Setup/Office">
      <prop oor:name="ooSetupInstCompleted" oor:op="fuse">
         <value>true</value>
      </prop>
   </item>
</oor:items>

I am checking which command jodconverter uses. It may be a command that ignores the registrymodifications…
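
For anyone who wants to confirm that a profile really carries the override before looking elsewhere, here is a minimal sketch that reads a registrymodifications.xcu like the one above and returns the value stored for a property under the PDF export path. The class and method names (XcuCheck, pdfExportValue) are made up for illustration. It only inspects the file content and says nothing about whether the converter honours it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of LibreOffice or JODConverter.
final class XcuCheck {
    static final String PDF_EXPORT_PATH = "/org.openoffice.Office.Common/Filter/PDF/Export";

    // Returns the text of the <value> stored for propName under the PDF export
    // path, or null if no such entry exists in the given XCU content.
    static String pdfExportValue(String xcuXml, String propName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xcuXml.getBytes(StandardCharsets.UTF_8)));
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                if (!PDF_EXPORT_PATH.equals(item.getAttribute("oor:path"))) {
                    continue; // some other configuration node
                }
                NodeList props = item.getElementsByTagName("prop");
                for (int j = 0; j < props.getLength(); j++) {
                    Element prop = (Element) props.item(j);
                    if (propName.equals(prop.getAttribute("oor:name"))) {
                        return prop.getTextContent().trim();
                    }
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XCU content", e);
        }
    }
}
```

Calling XcuCheck.pdfExportValue(content, "ExportFormFields") on the file content should return the stored string when the entry is present, and null when it is missing.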

My idea would be command-line parameters:
https://help.libreoffice.org/latest/en-US/text/shared/guide/pdf_params.html?&DbPAR=SHARED&System=UNIX
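
A minimal sketch of that idea, assuming the JSON filter-options syntax that the help page describes for --convert-to is available in the installed LibreOffice version. The class and method names (PdfConvertCommand, build) are made up for illustration, and the code only builds the argument list without starting soffice.

```java
import java.util.List;

// Hypothetical helper, not part of LibreOffice or JODConverter.
final class PdfConvertCommand {

    // Builds the soffice argument list for a DOCX to PDF conversion with the
    // ExportFormFields filter option set explicitly. Nothing is executed here.
    static List<String> build(String sofficePath, String inputFile, boolean exportFormFields) {
        // Filter options as JSON: {"<Name>":{"type":"<type>","value":"<value>"}}
        String filterOptions =
            "{\"ExportFormFields\":{\"type\":\"boolean\",\"value\":\"" + exportFormFields + "\"}}";
        return List.of(
            sofficePath,
            "--headless",
            "--convert-to",
            "pdf:writer_pdf_Export:" + filterOptions,
            inputFile);
    }
}
```

The resulting list can be handed to new ProcessBuilder(cmd). Passing the arguments as a list avoids shell quoting problems with the braces and double quotes in the JSON.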

Okay, I have found the solution to my problem.

Within JODConverter I use DocumentFormat, which specifies the options for how the target document should be converted.

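// Filter options passed to the PDF export filter; false exports plain text instead of form fields
final Map<String, Object> pdfOptions = new HashMap<>();
pdfOptions.put("ExportFormFields", false);
final DocumentFormat pdfFormat = DocumentFormat.builder()
    .from(DefaultDocumentFormatRegistry.PDF) // assumption: start from JODConverter's stock PDF format
    .storeProperty(DocumentFamily.TEXT, "FilterData", pdfOptions)
    .build();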

JODConverter ignores the registry and only uses the options given in the export function.

I would like to thank you all for pointing me in the right direction.
This community is awesome!