Conversion to odt from html: not the desired output

Hi there.

I am trying to convert an HTML file produced by a Java transformation program.

LibreOffice displays this file correctly if I open it via the File/Open menu and choose HTML as the file type.

In that case, LibreOffice shows the rendered web page in the editor.
Now I want to convert this file to an odt file.

I tried this command to obtain an odt file with the same name as the input:

soffice --headless --convert-to odt cleanedHtml.xml.html

When I open the resulting .odt file directly in LibreOffice Writer, it displays the raw HTML, whereas I want the rendered HTML page.
If I instead open cleanedHtml.xml.html in LibreOffice as an HTML file, it is rendered correctly, as said before. If I then use Save As to write an odt file, LibreOffice creates an .odt file that contains the rendered content when I open it in Writer.

I need a command that, given the input HTML file, produces the same result as opening the file as HTML and saving it as odt. The goal is an odt file that shows the rendered page, not the raw HTML code, when I open it in Writer.

Thank you for your attention.

The html file needs to start with <!DOCTYPE html or <html so that LibreOffice treats it as HTML. Your filename contains xml, so perhaps it starts with an XML declaration instead.
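
A quick way to check this is to look at the first non-blank line of the file. The sketch below is only an illustration: starts_like_html is a hypothetical helper name, not something LibreOffice provides, and it only inspects the beginning of the file. It does not account for a byte-order mark.

```shell
# Hypothetical helper (not part of LibreOffice): classify how a file begins.
starts_like_html() {
    # First non-blank line, leading whitespace stripped, lowercased.
    first=$(grep -m1 -v '^[[:space:]]*$' "$1" \
        | sed 's/^[[:space:]]*//' \
        | tr '[:upper:]' '[:lower:]')
    case "$first" in
        '<?xml'*)                    echo "xml-declaration" ;;
        '<!doctype html'*|'<html'*)  echo "html" ;;
        *)                           echo "other" ;;
    esac
}
```

Running starts_like_html cleanedHtml.xml.html then tells you which of the three cases applies to your file.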

You can save to ODT in two ways, which differ in the export filter named after the colon:

  soffice --headless --convert-to odt:writerweb8_writer cleanedHtml.xml.html
  soffice --headless --convert-to odt:writer8 cleanedHtml.xml.html
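
If the file does begin with an XML declaration, one option is to convert a copy with that declaration removed. This is a minimal sketch under that assumption: strip_xml_decl is a hypothetical helper, cleaned.html is a made-up output name, and the sed expression only handles a declaration on the first line with no byte-order mark in front of it.

```shell
# Hypothetical helper: copy $1 to $2 without a leading XML declaration.
strip_xml_decl() {
    # Remove "<?xml ... ?>" from line 1, then drop line 1 if it is now empty.
    sed -e '1s/^[[:space:]]*<?xml[^>]*?>[[:space:]]*//' \
        -e '1{/^$/d;}' "$1" > "$2"
}
```

The copy can then be passed to one of the commands above:

  strip_xml_decl cleanedHtml.xml.html cleaned.html
  soffice --headless --convert-to odt:writerweb8_writer cleaned.html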