I have some HTML like this:
<h1>Chapter One</h1>
<p>Our story begins...</p>
<p>And then he said...</p>
<p>More stuff happened...</p>
<h1>Chapter Two</h1>
<p>You get the idea...</p>
Dead simple. (This is produced by my JavaScript code, where the book is chapters = [{ title: "Foo", paragraphs: ["Our story...", "And then..."] }, { title: "Bar", paragraphs: ["Baz", "and so on"] }].)
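Roughly, the generation step looks like the sketch below. The names renderBook and escapeHtml are hypothetical, chosen for illustration only; the sketch assumes each chapter has a title string and a paragraphs array, as in the structure described above.

```javascript
// Hypothetical helper: escape characters that are special in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical renderer: one <h1> per chapter, one <p> per paragraph.
function renderBook(chapters) {
  const lines = [];
  for (const chapter of chapters) {
    lines.push(`<h1>${escapeHtml(chapter.title)}</h1>`);
    for (const paragraph of chapter.paragraphs) {
      lines.push(`<p>${escapeHtml(paragraph)}</p>`);
    }
  }
  return lines.join("\n");
}

const chapters = [
  { title: "Foo", paragraphs: ["Our story...", "And then..."] },
  { title: "Bar", paragraphs: ["Baz", "and so on"] },
];

const html = renderBook(chapters);
console.log(html);
```

The output is a flat sequence of h1 and p elements with no wrapping markup, which is the shape of the HTML shown at the top.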
My goal is to create a .doc file from it, the old-school MS Word '97-'03 variety, using only the command line or software libraries (JavaScript, Go, or Ruby preferred, but I’ll learn Python if necessary), no GUI. Obviously each h1 would be a Heading 1 and each p would be a paragraph following it.
soffice --convert-to doc:"MS Word 97" mybook.html
works a treat, but… it doesn’t support generating a table of contents, and I need that.
What should I do here? I asked on the #libreoffice IRC channel, and a helpful person recommended using PyUNO to script this – but researching it, that seems a fairly big project, learning the tooling, the language, the OpenOffice API. I’m willing to put the time in if it’s necessary, but I thought I’d ask here and get some advice just in case there’s another simpler option.
Thanks to anyone who can offer advice!