Convert PPTX to HTML while preserving text

asked 2015-05-28 17:12:28 +0200

Arie

updated 2015-10-19 12:38:28 +0200

Alex Kemp

Windows 8.1, LibreOffice Impress 4.4.3.2

I need to convert PowerPoint PPTX files to HTML from the command line. The HTML file should preserve text as text (not as an image), while converting all shapes, SmartArt, and graphs into images (or, if not images, then SVG).

If I open Impress and export a PPTX file to HTML, all slides get converted to images, including the text. If I export to PDF, the text is preserved and all shapes are converted fine, but I need HTML, not PDF.

Using the command line (on Windows 8), I tried all HTML filters that I found on: link text

Most of them simply didn't work. The closest I could find:

soffice.exe --headless --convert-to html --outdir d:\temp d:\temp\presentation.pptx

converted the text fine, but for some reason all shapes and graphs are missing from the converted HTML file.
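
For anyone who wants to reproduce this on several files, the call above can be scripted roughly as follows. This is only a sketch: the helper names `build_convert_command` and `convert` are made up for illustration, and it assumes `soffice.exe` is on the PATH. It uses only the flags from the command above, so it reproduces the same result (text converted, shapes and graphs missing) and is not a fix.

```python
import subprocess


def build_convert_command(source, outdir, target_format="html", soffice="soffice.exe"):
    """Build the argument list for a headless LibreOffice conversion.

    Hypothetical helper: it only assembles the same flags used in the
    command above (--headless, --convert-to, --outdir).
    """
    return [
        soffice,
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(source),
    ]


def convert(source, outdir, target_format="html", soffice="soffice.exe"):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    cmd = build_convert_command(source, outdir, target_format, soffice)
    return subprocess.run(cmd, check=True)


# Show the command that would be executed, without running LibreOffice.
print(" ".join(build_convert_command(r"d:\temp\presentation.pptx", r"d:\temp")))
```

Calling `convert(r"d:\temp\presentation.pptx", r"d:\temp")` runs the conversion itself; passing `target_format="pdf"` gives the PDF export for comparison.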

How can I solve my problem?
