I saved my LibreOffice Impress file as .pptx instead of .odp and lost the text inside. Is there any way to get it back?

Libre office impress

I just tried this with a test file (ODP, LibO 4.2.4.2, Linux Mint 13): I saved to PPTX and the text was broken; re-saving as ODP did not fix it. Looks like a known bug (?): fdo#57279, fdo#58709. It’s a bad one! Work-around: use PPT if you need to. I can’t work out an easy way to “undo” this if you haven’t kept the original file.

I was curious why you lost text, so I ran the same test, but on an XP machine with LibO 4.2.4.2.

  • Original: odp file with graphics and text
  • Saved as pptx
  • Opened in Impress
  • Graphics are mostly ugly
  • Text remains in NORMAL view mode
  • Text completely gone in OUTLINE view mode

Maybe there is a chance to salvage a big part of your work.
I would start with a new presentation with the template you want, and copy slides from the pptx-version over into the new odp-version from slide sorter into slide sorter. Doing so will force the pptx-slides to take the format of the odp-slides.

@dajare - can you confirm my observations with your Linux Mint based results?

In general it is always better to use ODF formats; for presentations that means odp. Only when there is no way around it do I additionally save in ppt or pptx, and then only at the very end. @dajare is right, ppt is better than pptx.

To avoid what happened to you, LibO can warn you: Tools > Options > Load & Save > General > Default file format and ODF settings > check “Warn when not saving in ODF or default format”.

Another hint: when you need to give a presentation on a PC with only MS PowerPoint, put LibO Portable and your presentation on a USB stick and present from there. But run a test first so you are familiar with the setup. I always do this when I cannot use my own PC, and I always carry such a USB stick with me as a backup solution.

“can you confirm my observations with your Linux Mint based results?” = yes, that’s a more precise statement of what I got. Also, same behaviour under Ubuntu 14.04 with LibO 4.2.4.2 (just checked now). Impress will display the text in the “Normal” view, but it is difficult to select, not really recognized as “text”, and impossible to access in the “Outline” view (which is just wrong!).

P.S. I did look into the embedded XML files in both the PPTX and ODP versions. Extracting text from them is a terrible job, and PPTX is worse than ODP, since the former saves content in individual XML files per slide, whereas the latter has all content in a single content.xml file.
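
To see that layout difference without unzipping by hand, here is a minimal Python sketch (not something used in this thread; it assumes Python 3 is installed). The helper name list_text_parts is made up for illustration. It only lists the archive members where the text is expected to live: the per-slide files under ppt/slides/ for PPTX, or content.xml for ODP.

```python
import zipfile


def list_text_parts(source):
    """Return the archive members that are expected to hold slide text.

    Hypothetical helper for this sketch. `source` is a path or a
    file-like object pointing at a .pptx or .odp file (both are ZIP
    archives).
    """
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
    return sorted(
        name
        for name in names
        # ODP: one content.xml for the whole presentation.
        if name == "content.xml"
        # PPTX: one XML file per slide (the .rels files are skipped).
        or (name.startswith("ppt/slides/") and name.endswith(".xml"))
    )
```

Calling list_text_parts("test.pptx") should give one entry per slide, while the ODP version of the same file should give just content.xml. The file name here is only an example.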

@dajare - I agree with you concerning the embedded files. I also unzipped the pptx version. I would use this approach only if the text is from someone else and this is the ONLY chance to get the text. If it were my own text, I would just rewrite it. - Hope @minky87 reports what he did and what he achieved.

OK, it’s harder with Windows, but it does work. (Have you tried installing the free MS Powerpoint Viewer, by the way? I don’t know if it would help, but it might.)

An earlier question about PPTX suggested making use of “grep”. This does work, and is simple in Linux, but will take a bit more work in Windows. Here are the steps:

  • You need your file in PPTX format, so use “Save as…” and convert if it’s currently in ODP.
  • change the PPTX file extension to ZIP, and extract the archive
  • go to /ppt/slides directory (full of slide*.xml files)
  • from terminal run (in Linux - for Win, see below):
    grep -oP '(?<=\<a:t\>).*?(?=\</a:t\>)' slide*.xml > text.txt
    • use -oPh instead of -oP to omit the file name from the beginning of each line of the extract
  • the resulting file, text.txt, will have a dump of the presentation text
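
If installing grep is a hurdle, the steps above can also be sketched in Python, which runs the same way on Linux and Windows. This is an alternative I am swapping in, not the method tested in this thread, and it assumes Python 3 is available. The function name extract_pptx_text is hypothetical. It opens the PPTX directly as a ZIP archive (no renaming needed), reads each ppt/slides/slideN.xml, and collects the contents of the a:t elements, the same ones the grep pattern targets.

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace that the "a:" prefix in <a:t> refers to.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
A_T = f"{{{A_NS}}}t"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def extract_pptx_text(source):
    """Return a list of (slide_number, [text runs]) read from a PPTX.

    Hypothetical helper for this sketch. `source` is a path or a
    file-like object.
    """
    slides = []
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            match = SLIDE_RE.match(name)
            if not match:
                continue  # not a slide file
            root = ET.fromstring(archive.read(name))
            runs = [el.text for el in root.iter(A_T) if el.text]
            slides.append((int(match.group(1)), runs))
    # Sort numerically so slide10 comes after slide9, not after slide1.
    slides.sort(key=lambda item: item[0])
    return slides


if __name__ == "__main__":
    # Usage (script name is an example): python pptx_text.py file.pptx > text.txt
    if len(sys.argv) > 1:
        for number, runs in extract_pptx_text(sys.argv[1]):
            print(f"--- slide {number} ---")
            for run in runs:
                print(run)
```

Unlike the grep dump, this keeps the slides in numeric order and marks where each slide starts. It can only recover text that is still stored as text in the slide XML.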

Extra for Windows

This requires the installation of a grep tool. In my experiment (Win 7), GNU Grep for Windows worked best. Open the “Command prompt” box, navigate to the \ppt\slides directory if need be, and enter the command line in this form:

grep -oP "(?<=\<a:t\>).*?(?=\</a:t\>)" slide*.xml > text.txt

(Notice the use of double quotes for delimiting the pattern string.) You may also need to specify the full path to the executable, so the full command line might end up being something like:

"C:\Program Files (x86)\GnuWin32\bin\grep.exe" -oPh "(?<=\<a:t\>).*?(?=\</a:t\>)" slide*.xml > text.txt

This was successful in extracting all the text from the PPTX in my experiment. I hope it works for you, or is enough to get you going.


Reference: “Tools to extract text from powerpoint pptx in linux?”

"I would start with a new presentation with the template you want, and copy slides from the pptx-version over into the new odp-version from slide sorter into slide sorter. Doing so will force the pptx-slides to take the format of the odp-slides. "

Tried this and it doesn’t work :((

Is there really no way to salvage the text?

I might have a solution (at least partial) for you - but we need to know what operating system you’re using - Win? Linux? or… what?

@minky87 - I am sorry to hear that my workaround idea did not work. Let’s see what @dajare has to offer. ------ In which view do you see what? Can you describe it or attach the file? (Be careful about confidential content!)

Hi @dajare, I’m using Win!

@ROSt53, I received a warning asking whether to continue saving in pptx, saying that I might lose data. I continued and lost the text inside, so I’m unable to view any text even after changing it to odp format now. :(

@minky87 - When you save in pptx you always lose some information, and the same happens the other way round. So far I have only observed formatting losses, never text losses. (I assume you write only Latin characters!) When you compare your pptx file and odp file, both in Normal view, what do you see in which file? In which file do you see your text?