How can you extract text from a pptx? [closed]

asked 2013-10-17 19:18:38 +0200

lesshaste

updated 2013-10-17 19:19:19 +0200

I have tried the command line libreoffice --headless --convert-to txt:Text file.pptx, but it seems to do nothing at all. From within LibreOffice I can save as HTML, but that mostly produces images of the slides, whereas I want the text.


Closed for the following reason: question is not relevant or outdated (closed by Alex Kemp)
close date 2015-11-12 02:20:14.458988

2 Answers


answered 2013-10-17 21:25:50 +0200

David

I'm not sure if it is exactly what you want, but there is a technique for getting text from Impress, although it is rightly described as "low-tech".

Btw, I assume by "pptx" you mean a "presentation", i.e., an Impress document in LibO. If you really do mean MS PowerPoint .pptx ... you're at the wrong site. ;) You can also have a look at this StackOverflow Q&A on ppt/pptx text extraction to see if there's any help there.


Comments

Thank you. I did mean PowerPoint .pptx, which, as I am on Linux, I import into LibreOffice to view. I don't have Microsoft Office at all.

lesshaste (2013-10-17 23:25:09 +0200)

answered 2013-10-19 02:51:49 +0200

oweng

updated 2013-10-19 02:52:26 +0200

Under Linux you could script a solution using unzip, grep, etc., since a .pptx file is a ZIP archive of XML files. For example, this will print the XML of slide 1 to the screen:

$ unzip -p /path/to/my_pres.pptx ppt/slides/slide1.xml

You should be able to loop through slide numbers and grep paragraph text / list items from that.
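
As a rough sketch of that loop (not a tested recipe from this thread): the functions below assume Info-ZIP unzip (for the -Z1 name listing) and a grep that supports -o, and they assume the slide text sits in <a:t> elements of each ppt/slides/slideN.xml. The names slide_text and pptx_text are made up for illustration. This is a regex pass rather than real XML parsing, so only the &lt;, &gt; and &amp; entities are decoded.

```shell
#!/bin/sh
# Hypothetical helper: read slide XML on stdin and print one line per
# <a:t> text run. Regex-based, so this is an approximation, not an XML parser.
slide_text() {
    grep -o '<a:t>[^<]*</a:t>' |
        sed -e 's/<[^>]*>//g' \
            -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&amp;/\&/g'
}

# Hypothetical helper: print the text of every slide in the .pptx given as $1.
pptx_text() {
    pptx=$1
    # List archive members, keep only ppt/slides/slideN.xml, and sort by N.
    for n in $(unzip -Z1 "$pptx" |
                   sed -n 's|^ppt/slides/slide\([0-9][0-9]*\)\.xml$|\1|p' |
                   sort -n); do
        echo "--- slide $n ---"
        unzip -p "$pptx" "ppt/slides/slide$n.xml" | slide_text
    done
}
```

After sourcing these, pptx_text /path/to/my_pres.pptx should print the text runs of each slide under a "--- slide N ---" line. Note that the slide file numbers do not necessarily match the order in which the slides are shown.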


Comments

See also a related Q&A, which now includes the steps needed to use the "grep" solution.

David (2014-06-02 21:27:33 +0200)
