How do I install filters for the `soffice` command? [closed]

asked 2013-03-09 06:45:00 +0200

Geoff Oxberry

I'd like to convert LibreOffice files (Writer, Impress) to readable text so that I can meaningfully diff separate versions in Git. I am using LibreOffice 4.0.1 on Mac OS X 10.8.2.

I stumbled upon http://ask.libreoffice.org/en/question/1671/console-app-to-convert-to-text-file/ but its recommendation to use

soffice --headless --convert-to <TargetFileExtension>:<NameOfFilter> file_to_convert.xxx

didn't quite work: every time I invoke the command with the "Text" filter, the following error message is returned:

/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to txt:"Text" example.odp 
convert example.odp -> example.txt using Text
Overwriting: example.txt
Error: Please reverify input parameters...

I poked around to see whether any of the filters listed in http://cgit.freedesktop.org/libreoffice/core/tree/filter/source/config/fragments/filters were in the LibreOffice.app folder, but I couldn't find any of them. I suspect their absence might be the cause of the error message. Is there a way to install these filters?

Alternatively, is there another way to convert these files that doesn't require me to compile anything? I'll be sharing these files with other people who also want to diff the files, and distributing a Python script would be much easier than having to walk them through a build process. I found the odt2txt.py script, which is a step in the right direction, but it strips out all images in presentations. It's better than nothing, but something that preserves some useful information about the images would be a great improvement.
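
For illustration only, here is a minimal sketch of the kind of Python script described above. It is not odt2txt.py and not an official tool. It assumes the usual OpenDocument packaging, in which an .odt or .odp file is a zip archive whose body lives in `content.xml`, and it emits one line per paragraph or heading plus a marker line for each embedded image. The function name `odf_to_lines` and the `[image: ...]` marker format are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard OpenDocument namespace URIs.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"


def odf_to_lines(path):
    """Return one line per paragraph/heading, plus a marker line per image.

    `path` may be a filename or a file-like object holding the ODF package.
    """
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("content.xml"))

    lines = []
    # root.iter() walks the tree in document order, so image markers land
    # roughly where the image sits relative to the surrounding text.
    for element in root.iter():
        if element.tag in (f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"):
            text = "".join(element.itertext()).strip()
            if text:
                lines.append(text)
        elif element.tag == f"{{{DRAW_NS}}}image":
            # Hypothetical marker format; the href is the path inside the zip.
            href = element.get(f"{{{XLINK_NS}}}href", "?")
            lines.append(f"[image: {href}]")
    return lines
```

Calling `print("\n".join(odf_to_lines("example.odp")))` would then produce text suitable for diffing. One known limitation of this sketch: a paragraph nested inside another paragraph (for example in a text box) has its text reported twice.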

Closed for the following reason: the question is answered, right answer was accepted by Alex Kemp
close date 2015-10-26 15:43:47.007264

Comments

@qubit: Thanks for the help! The flat XML format is exactly what I was looking for. I'd upvote you if I had enough karma. To follow up, is there any way to convert .od files to .fod files on the command line, in case I screw up and accidentally commit changes to .od files multiple times?

Geoff Oxberry ( 2013-03-09 19:59:59 +0200 )

@Geoff Oxberry -- qubit throws some karma at you

Hmm... I think you can convert them on the command-line like this:

$ ./soffice --headless --convert-to fodt:"OpenDocument Text Flat XML" embed-image-test.odt
qubit ( 2013-03-09 20:16:29 +0200 )
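
To recover from a batch of accidentally committed binary files, the same command could be driven from a small Python script. The following is only a sketch under some assumptions: the `fodt` filter name is the one from the comment above, while the `fodp` and `fods` filter names are assumptions that should be checked against your installation, and `build_command` and `convert_tree` are hypothetical helper names. `--outdir` is the soffice option that sets the output directory.

```python
import subprocess
from pathlib import Path

# Flat XML target per source extension. Only the "fodt" filter name appears in
# this thread; the other two filter names are assumptions to verify locally.
FLAT_TARGETS = {
    ".odt": ("fodt", "OpenDocument Text Flat XML"),
    ".odp": ("fodp", "OpenDocument Presentation Flat XML"),
    ".ods": ("fods", "OpenDocument Spreadsheet Flat XML"),
}


def build_command(source, soffice="soffice"):
    """Build the soffice argument list that converts one file to flat XML."""
    source = Path(source)
    extension, filter_name = FLAT_TARGETS[source.suffix.lower()]
    return [
        soffice,
        "--headless",
        "--convert-to",
        # No shell quoting needed: the list is passed to the process directly.
        f"{extension}:{filter_name}",
        "--outdir",
        str(source.parent),
        str(source),
    ]


def convert_tree(root, soffice="soffice"):
    """Convert every supported document below root, one soffice call per file."""
    for source in sorted(Path(root).rglob("*")):
        if source.suffix.lower() in FLAT_TARGETS:
            subprocess.run(build_command(source, soffice), check=True)
```

On the Mac from the question, this might be invoked as `convert_tree(".", soffice="/Applications/LibreOffice.app/Contents/MacOS/soffice")`. Each converted file is written next to its source.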

2 Answers

answered 2013-03-09 19:11:43 +0200

qubit

updated 2013-03-09 19:22:19 +0200

Hi @Geoff Oxberry,

I'm able to convert an ODT (Writer) file to text...

qubit@loopbackoffice$ ./soffice --headless --convert-to txt:"Text" embed-image-test.odt 
convert /home/qubit/embed-image-test.odt -> /home/qubit/embed-image-test.txt using Text

I get the same error as you when I try to convert an ODP (Impress) file to text...

qubit@loopbackoffice$ ./soffice --headless --convert-to txt:"Text" img-test.odp
convert /home/qubit/img-test.odp -> /home/qubit/img-test.txt using Text
Error: Please reverify input parameters...

It's possible that there's just no text export filter for Impress. I'll ping someone and ask...

...okay, I pinged people and it looks like Impress doesn't have a text output filter (there's a lot of non-text stuff in an Impress file, for one).

Have you considered using the flat XML file formats for storing content in version control?

The .fod* file formats give you the ability to diff your file changes, plus you can work directly with these files in LO and don't have to do any extra conversion steps before checking them into your VCS!
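
To illustrate why flat XML diffs well: the document is plain text, so a line-based diff shows the changed paragraph directly. Git does this natively. The sketch below just reproduces the idea with Python's standard `difflib` on two made-up fragments; `flat_xml_diff` is a hypothetical helper, and real .fodt files contain far more markup than these two lines.

```python
import difflib


def flat_xml_diff(old_text, new_text, name="document.fodt"):
    """Return a unified diff between two versions of a flat XML document."""
    return "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
        )
    )


# Made-up fragments standing in for two revisions of the same file.
old = "<text:p>Draft title</text:p>\n<text:p>Body</text:p>\n"
new = "<text:p>Final title</text:p>\n<text:p>Body</text:p>\n"
print(flat_xml_diff(old, new))
```

The printed diff marks the "Draft title" line as removed and the "Final title" line as added, while the unchanged "Body" paragraph appears only as context.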

Comments

Hallelujah!

This worked for me.

shee-eet ( 2015-05-08 12:35:57 +0200 )

answered 2013-03-09 08:42:29 +0200

Mark12547

I don't know if either of these two ideas will help, but one of them might help while waiting for the command-line option. (Note: I am on Windows 7 running LibreOffice 4.0.0.3, so there might be some differences.)

In LibreOffice Writer, with a document open, choose "File" | "Save as". The "Save as type" pull-down menu offers "Text (.txt) (*.txt)" as one of its choices. This saves the file as plain text with no formatting, each "paragraph" becoming one line.

There are a couple of other text options, so some testing might be called for to see if one of them meets your needs.

Another thing LibreOffice Writer has is a "Compare Documents..." ("Edit" | "Compare documents..."), but I don't think that will meet your needs since it doesn't write any diff files.
