Exhaustive solution for HTML export

The facts

LibreOffice has long had two separate sets of features pertaining to HTML:

  • A traditional concept of different HTML implementations, as reflected by the HTML Compatibility options, which allow one to export to “Mozilla Firefox”, “Microsoft Internet Explorer” or “LibreOffice Writer”. Until recently, the menu had a “Netscape” entry instead of “Mozilla Firefox” and even had an “HTML 3.2” entry. OpenOffice 3.4.1 still has these two ancient options.

  • A more modern approach of exporting to XHTML through the File > Export menu. This exports XHTML 1.1 with MathML. The code is clean, but it has inline CSS and does not export everything (footnotes are not present in the output, for example).

Even worse, this page from the OpenOffice website suggests that the XHTML export filter is actually based on the former Star Writer XML format and only works with ODT because it internally maps SWX to ODT.

The question

How come a new export filter from ODT to XHTML was never fully developed, considering the importance of being able to publish to the Web?

Furthermore, would this mean that the HTML version of the ODF 1.2 standard relies on an export filter that was not even designed to handle ODF? That would be seriously inappropriate. I assume said export filter was indeed used to produce this document, based on information in the head of the page’s source code: there is the comment <!--This file was converted to xhtml by OpenOffice.org - see http://xml.openoffice.org/odf2xhtml for more info.-->. Following the link in this comment, one finds a page that is more up to date than the previously mentioned one. On that page one can read:

Now as a sample filter, it is an optional installation component of OpenOffice.org 3.x, installed together with the ‘XSLT sample filter’ package. Within the Office the filter can be used when choosing the XHTML export from File->Export…

Most fortunately, the article gives the impression that the XSLT filter was indeed updated for ODF. However, it is considered an optional sample.

Why does this situation persist? Why is there no all-encompassing solution to impeccably export ODF 1.2 to XHTML 1.1 with full control over the mapping of elements and attributes? Of course, there is the Writer2XHTML extension, but this seems to be stagnant at the moment and the problem is that this kind of functionality should be available by default, not as part of an extension that most people will never think of looking for.
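To make concrete what “control over the mapping of elements and attributes” could look like, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how LibreOffice or Writer2XHTML actually work: the functions map_element and convert are hypothetical names, only headings and paragraphs are handled, inline formatting is flattened to plain text, and the output is a bare body fragment rather than a complete XHTML 1.1 document.

```python
import xml.etree.ElementTree as ET

# ODF namespaces as used in content.xml.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def map_element(src):
    """Hypothetical mapping rule: return an XHTML tag name, or None to skip."""
    if src.tag == f"{{{TEXT_NS}}}h":
        # Clamp the ODF outline level to the h1..h6 range.
        level = int(src.get(f"{{{TEXT_NS}}}outline-level", "1"))
        return f"h{min(max(level, 1), 6)}"
    if src.tag == f"{{{TEXT_NS}}}p":
        return "p"
    return None  # spans, footnotes, tables, etc. are not handled in this sketch


def convert(content_xml):
    """Convert the headings and paragraphs of an ODF content string."""
    root = ET.fromstring(content_xml)
    body = ET.Element("body")
    for src in root.iter():
        tag = map_element(src)
        if tag is not None:
            out = ET.SubElement(body, tag)
            # Flatten inline children (e.g. text:span) to plain text.
            out.text = "".join(src.itertext())
    return ET.tostring(body, encoding="unicode")


# A tiny hand-written sample, not taken from a real document.
SAMPLE = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    '<text:h text:outline-level="2">Title</text:h>'
    "<text:p>Hello <text:span>world</text:span></text:p>"
    "</office:text></office:body></office:document-content>"
)

if __name__ == "__main__":
    print(convert(SAMPLE))
```

A real filter would need a rule for every ODF element and style, which is exactly the part a user would want to be able to configure.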

What do we do? I once read somewhere (don’t remember where) that the LibreOffice project was downplaying the importance of XSLT and was trying to rely on it less and less, just as it is trying to rely less on Java. Will the LibreOffice project then develop full-featured XHTML support in another language? Which one would it be? Why not use XSLT, which is meant for exactly that kind of task?

Could anyone provide any practical insight? What are the solutions in the short and long terms?

Hi @CyanCG,

Looks like we don’t have any good answers for you on this front. I agree that HTML export is definitely something that could use some improvement (we’ve got a number of q’s about improving export, simplifying HTML styles, etc…), but it doesn’t look like this enhancement is currently on the top of anyone’s priority queue.

For further discussion, you could try one of the mailing lists – perhaps someone will be interested in riffing with you on this topic :)

Best,

Ok, thank you. I think I’ll send an email to the developer of Writer2XHTML too; do you know if anyone in the development team or on this Ask site knows him/works with him/knows whether he contributes to LO? I am worried that he no longer actively develops his extensions, because the 1.2 versions have been in beta for a long time. They’re invaluable tools, they just need a boost!

@CyanCG – I’m not sure I recognize the dev’s name, but the last update to the Sourceforge project was 2012-03-16, so it appears that the extensions may still be maintained.

Okay, I will try to get more info. I guess subscribing to SourceForge and posting on the extension’s site would be a good start.

File an enhancement bug.

I shall, but I wanted to know, among other things, if someone had found another satisfactory alternative or managed to tame Writer2LaTeX and bind it to their will, etc. I would like to have insights, assessments and witnesses, so that I can file an enhancement bug knowing the whole picture.

I found this list of filters. The name that you should write is the value of the oor:name attribute: http://cgit.freedesktop.org/libreoffice/core/tree/filter/source/config/fragments/filters?h=libreoffice-4-0-4
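As a rough illustration of reading the oor:name attribute from one of those filter fragments, here is a small Python sketch. Several things in it are assumptions: the sample fragment is made up (“Example Filter” is not a real filter), it declares the oor namespace itself (a raw fragment from the repository may need that declaration added before it parses), and describe_filter and convert_to_argument are hypothetical helper names. The last helper assumes the name is meant for the extension:filter form accepted by soffice --convert-to.

```python
import xml.etree.ElementTree as ET

# Namespace used by the oor: prefix in LibreOffice configuration files.
OOR_NS = "http://openoffice.org/2001/registry"
OOR_NAME = f"{{{OOR_NS}}}name"

# Made-up fragment for illustration; real fragments carry more properties.
SAMPLE_FRAGMENT = f"""
<node xmlns:oor="{OOR_NS}" oor:name="Example Filter" oor:op="replace">
  <prop oor:name="Flags"><value>EXPORT ALIEN</value></prop>
  <prop oor:name="Type"><value>example_type</value></prop>
</node>
"""


def describe_filter(fragment_xml):
    """Return (filter name, {property name: value}) for one fragment."""
    node = ET.fromstring(fragment_xml.strip())
    props = {}
    for prop in node.findall("prop"):
        value = prop.find("value")
        props[prop.get(OOR_NAME)] = value.text if value is not None else None
    return node.get(OOR_NAME), props


def convert_to_argument(extension, filter_name):
    """Build the 'extension:filter name' string (assumed use of the name)."""
    return f"{extension}:{filter_name}"


if __name__ == "__main__":
    name, props = describe_filter(SAMPLE_FRAGMENT)
    print(name, props)
    print(convert_to_argument("html", name))
```

Note that the fragment only names and describes the filter; it does not contain the element mappings themselves.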

I do not understand what those filters do exactly: I only see some generic properties in them and nothing about mappings of XML elements to HTML elements.

I’ve successfully installed the Writer2XHTML extension from SourceForge.

It does provide a better HTML export, though I don’t much like the CSS method it uses.

The extension is available on the openoffice extensions website

Cheers