Converting from PDF to Excel is not working

Of course you can define your own custom logic. E.g., you may decide what to do with the main text flow (say, you decide to put every paragraph into the next cell); what to do with frames, or with footnotes, or with headers/footers … it’s just not a “normal” file conversion; it’s the creation of a new file by applying a custom procedure.
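
To make the "every paragraph into the next cell" idea concrete, here is a minimal sketch in Python. It assumes the text has already been pulled out of the PDF by some other tool, and it writes CSV rather than XLSX only to stay within the standard library. The names `paragraphs_to_rows` and `write_csv` are made up for this illustration; frames, footnotes and headers/footers would each need their own rule.

```python
import csv
import io
import re


def paragraphs_to_rows(text):
    """Turn extracted text into spreadsheet rows, one paragraph per row.

    Assumption: paragraphs are separated by at least one blank line,
    which is not guaranteed for text extracted from an arbitrary PDF.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    # Collapse line breaks inside a paragraph so it fits in a single cell.
    return [[" ".join(block.split())] for block in blocks if block.strip()]


def write_csv(rows, stream):
    """Write the rows to any text stream (a file or an in-memory buffer)."""
    csv.writer(stream).writerows(rows)


# Hypothetical sample standing in for text extracted from a PDF.
sample = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph.\n\n\nThird."
rows = paragraphs_to_rows(sample)

buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

The resulting CSV opens directly in Calc or Excel. Changing the custom logic (for example, one sentence per cell) means changing only `paragraphs_to_rows`.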

Actually I don't expect anything :slight_smile:, but someone is working on tools that convert between PDF and XLSX, or between PDF and PPTX, in both directions. For that reason I'm trying to find the simplest way to do that, but it looks like even the online tools do the same thing: reading and writing, nothing special. I think I should do the same without making it complicated.

got u, thanks

Actually, the technology for converting a PDF to any other file format is copyrighted, and the copyright holder is Adobe, if I am right. Adobe will not open it up for anyone, specifically not for Linux, as it is a THREAT to Windows, a.k.a. Microsoft, and Adobe has a partnership with Microsoft, not with Linux. So, as of now, it’s Adobe’s territory, and if you don’t have Windows as your OS and Acrobat DC (I am neither asking you to buy a license nor endorsing a cracked version), online solutions are the only options. If that also doesn’t work properly, then … you find a solution. :wink:

… [you] would provide sources backing [your] assertions.

problem here being rather like OCR-ization.

No. There are a lot of individual solutions available. (Tabula and Tesseract are referenced in this thread.) But as mentioned above: this depends on the available data. Not all PDF files are the same. And as also stated above: after saving as PDF, usually a lot of information is lost (formulas, fields, semantic markup). Even Adobe cannot re-create this (unless we cheat a bit when “converting to” PDF: we can integrate the source as an embedded stream. Then xlsx>pdf>ods can be done as xlsx>ods.)