Bad gif when --convert-to html:"HTML (StarWriter)" on Linux [closed]

asked 2013-06-13 18:39:02 +0200

Cameron

updated 2015-11-02 18:34:13 +0200

Alex Kemp

OS: Open SUSE 12.1

LibreOffice Version: 4.0.3.3

I use the following command to convert a DOC file to HTML:

libreoffice4.0 --headless --convert-to html:"HTML (StarWriter)" maths.doc

In the output, some formulas are displayed as black bars, like this: (image)

If I open LibreOffice and use the "Save as" HTML function, the same formula is displayed normally, like this: (image)

The two images I attached above are from the same formula in the doc.

In theory, I would expect both methods to call the same conversion function in core. Why would they generate different GIFs?

I tried the command on a Mac and it works well: both methods generate normal GIFs.

In fact, I don't want to use gif at all. I would like to use --convert-to html to embed all in one html file. However, most of the embedded images are corrupted on both Mac and Linux with default HTML filter. This is the same situation as the one described in the last comment by Luc.Tartier on the following question: http://ask.libreoffice.org/en/question/7209/export-formulas-to-mathml/

Could someone help? Thanks a lot!

Cameron

(Some more info to help to reproduce the issue)

Here is the original doc (but it includes Chinese): http://www.eguidedog.net/tmp/maths8/maths.doc

Here is the version converted with libreoffice4.0 --headless --convert-to html:"HTML (StarWriter)" maths.doc: http://www.eguidedog.net/tmp/maths8/maths.html

Here is the version converted with "Save as" in GUI menu: http://www.eguidedog.net/tmp/maths8/maths2.html
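One way to compare the two outputs is to list the image files that each HTML page references. The following is a minimal sketch, assuming both pages have been downloaded locally and that images are referenced through double-quoted src attributes. The function name list_images is hypothetical and not part of LibreOffice.

```sh
#!/bin/sh
# list_images FILE
# Print the value of every src="..." attribute found in FILE, one per line.
# Assumption: attribute values are double-quoted. Matching is
# case-insensitive so both SRC= and src= are caught.
list_images() {
    grep -io 'src="[^"]*"' "$1" | sed 's/^[^"]*"//; s/"$//'
}
```

For example, running list_images maths.html and list_images maths2.html and comparing the two lists shows whether both exports reference the same number of images.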


Closed for the following reason: the question is answered, right answer was accepted (closed by Alex Kemp)
close date 2015-11-02 18:34:40.472304

Comments

Thanks for providing the original file. The objects that are appearing as blacked-out GIFs are graphic objects (not formula objects) in the original DOC. They appear to be vector objects that the filter cannot understand. If I save the DOC in DOCX/ODT format they are included as WMF files. There are known issues with how LO handles vector objects. It is steadily improving, although Windows vector formats (WMF/EMF) may never be fully supported, as SVG is the preferred format for such objects.

oweng ( 2013-06-14 07:11:34 +0200 )

What I don't understand is that it can be correctly converted with the Mac version or through the "Save as" menu. Aren't they using the same filter that generates the blacked-out GIFs?

Cameron ( 2013-06-16 16:37:23 +0200 )

I can confirm that under MacOS 10.6.8 LO v4.0.3.3 using File > Save As... > HTML Document (Writer) produces normal (low quality) graphics. I have never been able to get headless mode to work under MacOS so did not test that. I will update my answer.

oweng ( 2013-06-17 02:27:01 +0200 )

Thank you for the detailed updates, although this issue is still mysterious to me. I am wondering whether this may be caused by some font libraries that are only loaded via the GUI. By the way, the LO I am using on Open SUSE Linux 12.1 is from the LO website. The LO version in Open SUSE 12.1 is 3.4.

Cameron ( 2013-06-17 14:50:38 +0200 )

Yeah, me too. The fact that we are both experiencing the same thing suggests (to me) that you may want to consider raising a bug for this. Include as much detail as you can and link this thread. It is probably going to take a developer or someone more familiar with the UI/headless differences to explain why this is happening. If you do raise a bug, please report it back here in the form "fdo#123456". Thanks.

oweng ( 2013-06-18 06:08:28 +0200 )

I've raised a bug for this issue (fdo#65918). Thanks!

Cameron ( 2013-06-19 05:14:39 +0200 )

Thanks Cameron. I have confirmed the bug. We shall see what the developers make of it.

oweng ( 2013-06-19 12:13:38 +0200 )

1 Answer


answered 2013-06-14 03:03:46 +0200

oweng

updated 2013-06-17 03:53:29 +0200

Is your version of LO from the OpenSUSE repository or the LO website? I don't have an answer to why the same filter appears to behave differently via headless mode and the UI. Under Crunchbang 11 running TDF/LO v4.0.3.3 I managed to obtain the blacked-out graphics using your suggested command:

$ soffice --headless --convert-to html:"HTML (StarWriter)" maths.doc

I thought I tried this originally and it worked OK, but I must have gotten confused amongst the various tests I did, because it certainly produces the same blacked-out graphics you are seeing:

Crunchbang 11 formulas, blacked out

Sorry for any confusion. Using the File > Save As... > HTML Document (Writer) menu method the graphics appear OK:

Crunchbang 11 formulas, OK

I don't know why this is, as both methods would appear to be using the "HTML (StarWriter)" filter, presumably HTML__StarWriter_.xcu and HTML__StarWriter__ui.xcu. I can't get LO to run in headless mode under MacOS so could not test this. Under MacOS 10.6.8 running LO v4.0.3.3, using the File > Save As... > HTML Document (Writer) menu method, the graphics appear OK (slightly better quality than under GNU/Linux, although still fairly poor):

MacOS formulas, OK

... most of the embedded images are corrupted on both Mac and Linux with default HTML filter.

I had a look at the embedded formulas in your original DOC. The objects that are displaying as blacked-out GIFs are graphic objects (not formula objects). They appear to be vector graphics that the filter cannot understand (although why this works via the UI and not in headless mode I don't know).

If I save the DOC in DOCX/ODT format these formulas are included as WMF format graphics. There are known issues with how LO handles vector objects. It is steadily improving although Windows vector formats (WMF/EMF, which is possibly what the original objects are) may never be fully supported, as SVG is the preferred format for such objects. The ability to store an equation as MathML (DOCX) rather than a graphic (DOC) offers significantly better quality of output.
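To check which graphics end up in the converted file, the ODT package can be listed, since an ODT file is a ZIP archive. The following is a minimal sketch, assuming unzip is installed and that maths.odt was produced by saving the DOC as ODT. The helper name filter_vector_entries is hypothetical, and the list of extensions is an assumption based on the WMF/EMF formats mentioned above.

```sh
#!/bin/sh
# filter_vector_entries
# Read archive entry names on stdin and keep only those ending in a
# Windows metafile extension (.wmf or .emf), matched case-insensitively.
filter_vector_entries() {
    grep -iE '\.(wmf|emf)$'
}
```

A possible invocation is unzip -Z1 maths.odt | filter_vector_entries, which should print the names of any WMF/EMF entries in the package and nothing if there are none.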

I don't want to use gif at all. I would like to use --convert-to html to embed all in one html file.

The "XHTML Writer File" filter (same as File > Export... > XHTML file type) embeds the equation in the HTML as you desire. Unfortunately, though, this appears to work better for a DOCX source than a DOC source ...
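For the command line, the same filter can probably be selected with --convert-to. The following is a minimal sketch, not tested against LO 4.0.3.3: it assumes the filter name "XHTML Writer File" is accepted after the output extension, and it uses a hypothetical SOFFICE variable to override the binary name (for example libreoffice4.0, as in the question). The function name convert_xhtml is also hypothetical.

```sh
#!/bin/sh
# convert_xhtml INPUT OUTDIR
# Run a headless conversion of INPUT with the "XHTML Writer File" filter
# and write the result into OUTDIR.
# Assumption: SOFFICE names the LibreOffice binary (default: soffice).
convert_xhtml() {
    "${SOFFICE:-soffice}" --headless \
        --convert-to 'html:XHTML Writer File' \
        --outdir "$2" "$1"
}
```

For example, convert_xhtml maths.docx out would write the converted page into the out directory, which keeps it separate from the output of the "HTML (StarWriter)" filter.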

