Save as HTML does not always preserve the font

Hi all again!

I have two docx files. If I save them as HTML, one keeps the same, correct font as the docx, while the other falls back to the browser's default font and does not match its docx counterpart.

I inspected the HTML and I saw that the correct one has a lot of font-face inline styling in each paragraph, which explains why the font is forced.
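
For anyone who wants to repeat that inspection without reading the markup by hand, here is a minimal sketch in Python. It assumes the forced font shows up either as a `<font face="...">` tag or as `font-family` inside a `style` attribute. The names `FontUsageCounter` and `count_font_usage` are made up for this example.

```python
from html.parser import HTMLParser


class FontUsageCounter(HTMLParser):
    """Counts tags that carry an explicit font declaration."""

    def __init__(self):
        super().__init__()
        self.font_face_tags = 0      # <font face="..."> elements
        self.inline_font_family = 0  # any tag with font-family in style="..."

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "font" and attributes.get("face"):
            self.font_face_tags += 1
        # Attribute values can be None, so fall back to an empty string.
        style = attributes.get("style") or ""
        if "font-family" in style.lower():
            self.inline_font_family += 1


def count_font_usage(html):
    """Return (number of <font face> tags, number of inline font-family styles)."""
    counter = FontUsageCounter()
    counter.feed(html)
    counter.close()
    return counter.font_face_tags, counter.inline_font_family
```

Running it on both exported HTML files should show high counts for the working file and counts near zero for the other one, if the observation above holds.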

As in my other topic, I use jodconverter with LibreOffice as the engine, which means that anything converted manually through LibreOffice Writer gives the same result as what I convert programmatically.

Main question: What determines whether the font gets forced as a font face, so that the font stays the same after conversion?
I hope it is some setting in LibreOffice, but I can’t put my finger on it.

I can’t attach the HTML files, but if you open the docx files below in LibreOffice Writer and use Save as HTML, you will get the same result.

Attachments:
example-docx-to-html-correct-working.docx (16.4 KB)
example-docx-to-html-not-working.docx (11.0 KB)

The format properties and the formatting method are different in your sample files. (I am using the extension “Test of missing fonts” to examine the properties.)


I do not have the font Aptos installed. Missing fonts are substituted with something that looks as close as possible. And (maybe) the browser uses a different font for the substitution than LO does.

Hi Zizi,

Thanks for opening this perspective!

The thing about the documents is that our end users deliver them and I am 99% sure they use MS Word to create and alter docx files.
If you open them in MS Word, both are configured as Arial size 10, which is what I would expect to see after conversion.

In MS Word the settings are misleading, and they can’t simply be fixed by pressing CTRL + A and selecting one font for everything.

Is there a common way to align the format properties/method?
I’ve seen many suggestions like extracting the font from the source docx, but if your extension shows that there are missing fonts in the properties, I will need to think a bit harder.

We just don’t export (some?) paragraph style formatting to HTML. Thus, the document with the font face applied as direct formatting gets it exported, while the document where the font face is only set through styles does not.

The missing fonts issue is a red herring, unrelated.
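
A rough way to check which of the two cases a given docx falls into is to look at where its font declarations live. The sketch below assumes the usual docx layout, where direct formatting ends up as `w:rFonts` elements in `word/document.xml` and style-based fonts sit in `word/styles.xml`. The function name `count_font_declarations` is made up for this example, and the counts are only a heuristic.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_font_declarations(docx_path):
    """Count w:rFonts elements in the document body and in the styles part."""
    parts = (("direct", "word/document.xml"), ("styles", "word/styles.xml"))
    counts = {}
    with zipfile.ZipFile(docx_path) as archive:
        names = set(archive.namelist())
        for label, part in parts:
            if part not in names:
                counts[label] = 0
                continue
            root = ET.fromstring(archive.read(part))
            counts[label] = sum(1 for _ in root.iter(f"{{{W_NS}}}rFonts"))
    return counts
```

If the explanation above applies, the working sample should report a clearly non-zero `direct` count, while the other sample should have its font declared mostly or only under `styles`.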

Hi Mike,

I checked and tried adding the missing fonts, but it didn’t bring any positive or new results.

It is still a mystery to me why a font face is sometimes included and sometimes not. I just can’t find anything online about it.

For now I guess I will try to force a font onto the HTML result by extracting it from the source docx. It feels hackish, because sometimes LibreOffice does it by itself…
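
A minimal sketch of what such a workaround could look like in Python: it injects one CSS rule into the converted HTML just before `</head>`. The function name `force_font` is made up for this example, and the defaults (Arial, 10 pt) are simply the values mentioned earlier in this thread, not something read from the docx.

```python
import re


def force_font(html, family="Arial", size_pt=10):
    """Insert a <style> rule that sets the font for common text elements."""
    rule = (
        "<style>body, p, td, th, li { "
        f"font-family: {family}, sans-serif; font-size: {size_pt}pt; "
        "}</style>"
    )
    closing_head = re.compile(r"</head\s*>", re.IGNORECASE)
    if closing_head.search(html):
        # Place the rule right before the first closing head tag.
        return closing_head.sub(lambda m: rule + m.group(0), html, count=1)
    # No head element found: prepend the rule as a fallback.
    return rule + html
```

Whether this rule wins over font markup that LibreOffice already wrote into the file depends on that markup, so it is mainly meant for the files that come out without any font information.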

Try reading Mike’s comment again:

The solution for now is to copy the entire content of the faulty docx into the working docx; the font is then also passed on as a font face.