# How do I install filters for the soffice command? [closed]

I'd like to convert LibreOffice files (Writer, Impress) to readable text so that I can meaningfully diff separate versions in Git. I am using OpenOffice 4.0.1 on Mac OS X 10.8.2.

I stumbled upon http://ask.libreoffice.org/en/question/1671/console-app-to-convert-to-text-file/, but its recommendation to use

soffice --headless --convert-to <TargetFileExtension>:<NameOfFilter> file_to_convert.xxx


didn't quite work, because every time I invoke the command with the "Text" filter, the following error message is returned:

/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to txt:"Text" example.odp
convert example.odp -> example.txt using Text
Overwriting: example.txt


I poked around to see whether any of the filters listed in http://cgit.freedesktop.org/libreoffice/core/tree/filter/source/config/fragments/filters were in the LibreOffice.app folder, but I couldn't find any of them. I suspect their absence might be the cause of the error message. Is there a way to install these filters?

Alternatively, is there another way to convert these files that doesn't require me to compile anything? I'll be sharing these files with other people who also want to diff them, and distributing a Python script would be much easier than having to walk them through a build process. I uncovered the odt2txt.py script, which is a step in the right direction, but it strips out all images in presentations. It's better than nothing, but something that preserves some sort of useful information related to images would be a great improvement.


### Closed for the following reason: "the question is answered, right answer was accepted" by Alex Kemp. Close date: 2015-10-26 15:43:47.007264

@qubit: Thanks for the help! The flat XML format is exactly what I was looking for. I'd upvote you if I had enough karma. To follow up, is there any way to convert .od* files to .fod* files on the command line, in case I screw up and accidentally commit changes to .od* files multiple times?

( 2013-03-09 19:59:59 +0200 )

@Geoff Oxberry -- qubit throws some karma at you

Hmm... I think you can convert them on the command-line like this:

$ ./soffice --headless --convert-to fodt:"OpenDocument Text Flat XML" embed-image-test.odt

( 2013-03-09 20:16:29 +0200 )

## 2 Answers

I'm able to convert an ODT (Writer) file to text...

qubit@loopbackoffice$ ./soffice --headless --convert-to txt:"Text" embed-image-test.odt
convert /home/qubit/embed-image-test.odt -> /home/qubit/embed-image-test.txt using Text


I get the same error as you when I try to convert an ODP (Impress) file to text...

qubit@loopbackoffice$ ./soffice --headless --convert-to txt:"Text" img-test.odp
convert /home/qubit/img-test.odp -> /home/qubit/img-test.txt using Text


It's possible that there's just no text export filter for Impress. I'll ping someone and ask...

...okay, I pinged people and it looks like Impress doesn't have a text output filter (there's a lot of non-text stuff in an Impress file, for one).
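Since there is no text filter for Impress, one possible workaround is to read the document package directly. An .odt or .odp file is a zip archive whose `content.xml` holds the text, so a short Python script can pull out the paragraphs and, instead of dropping images, emit a placeholder line for each image reference. The following is a minimal sketch under those assumptions, not a tested replacement for odt2txt.py. The function name `odf_to_lines` and the `[image: ...]` placeholder format are made up for illustration, and the sketch ignores details such as tabs, line breaks, and text boxes nested inside paragraphs (whose text may appear twice).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by OpenDocument content.xml.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"


def odf_to_lines(source):
    """Return one line per non-empty paragraph/heading and per image reference.

    `source` may be a file path, a binary file object, or the raw bytes of
    an .odt/.odp package. (Hypothetical helper, for illustration only.)
    """
    if isinstance(source, bytes):
        source = io.BytesIO(source)
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("content.xml"))

    lines = []
    # root.iter() walks the tree in document order.
    for element in root.iter():
        if element.tag in (f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"):
            text = "".join(element.itertext()).strip()
            if text:
                lines.append(text)
        elif element.tag == f"{{{DRAW_NS}}}image":
            href = element.get(f"{{{XLINK_NS}}}href")
            if href:
                # Keep a diffable marker instead of silently dropping the image.
                lines.append(f"[image: {href}]")
    return lines
```

Writing the result of `"\n".join(odf_to_lines("img-test.odp"))` to a text file gives something a line-based diff can work with. Because the image path inside the package is part of the output, a replaced image would likely show up as a changed line.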

Have you considered using the flat XML file formats for storing content in version control?

The .fod* file formats give you the ability to diff your file changes, plus you can work directly with these files in LO and don't have to do any extra conversion steps before checking them into your VCS!
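If existing .od* files need to be converted to flat XML in bulk, the `--convert-to` invocation can be wrapped in a small Python helper. This is only a sketch: the names `FLAT_XML_TARGETS`, `flat_xml_command`, and `convert_to_flat_xml` are invented here, and of the filter names below only "OpenDocument Text Flat XML" comes from this thread. The Presentation and Spreadsheet names are assumed by analogy and should be checked against your own installation.

```python
import subprocess
from pathlib import Path

# Maps a source extension to (target extension, export filter name).
# Only the .odt entry is confirmed in this thread; the others are assumptions.
FLAT_XML_TARGETS = {
    ".odt": ("fodt", "OpenDocument Text Flat XML"),
    ".odp": ("fodp", "OpenDocument Presentation Flat XML"),  # assumption
    ".ods": ("fods", "OpenDocument Spreadsheet Flat XML"),   # assumption
}


def flat_xml_command(path, soffice="soffice"):
    """Build the argument list that converts `path` to its flat XML form."""
    path = Path(path)
    extension, filter_name = FLAT_XML_TARGETS[path.suffix.lower()]
    return [
        soffice,
        "--headless",
        "--convert-to",
        # No shell quoting is needed here: each list item is one argument.
        f"{extension}:{filter_name}",
        "--outdir",
        str(path.parent),
        str(path),
    ]


def convert_to_flat_xml(path, soffice="soffice"):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(flat_xml_command(path, soffice), check=True)
```

On a Mac, the `soffice` argument would presumably be the full path inside the application bundle, as in the question. Given the Impress example above, where the command printed a normal-looking message without producing usable output, it is probably worth checking that the .fod* file actually exists after the call rather than trusting the exit status alone.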


Hallelujah!

This worked for me.

( 2015-05-08 12:35:57 +0200 )

I don't know if either of these two ideas will help, but one of them might help while waiting for the command-line option. (Note: I am on Windows 7 running LibreOffice 4.0.0.3, so there might be some differences.)

In LibreOffice Writer, with a document open, choose "File" | "Save as". The "Save as type" pull-down menu includes the choice "Text (.txt) (*.txt)". This saves the file with each "paragraph" as one line of plain text, with no formatting.

There are a couple of other text options, so some testing might be called for to see if one of them meets your needs.

Another thing LibreOffice Writer has is a "Compare Documents..." feature ("Edit" | "Compare documents..."), but I don't think it will meet your needs since it doesn't write any diff files.
