Where can I find the magic numbers of encrypted and non-encrypted .odt files?
Thank you~~~~
Do you want to count them in a specific directory?
Without any hint about what you want to do, it is difficult to answer. But, IMHO, if you want to distinguish between encrypted and non-encrypted documents, this will lead you nowhere. The “outer” structure of an ODF file is the same as a zip. Encryption is a second level in the hierarchy. There are several directories and files in the container. One of the files describes the contents (the manifest? I don’t remember). This is where you’ll find the encryption information (without the key, of course, in case this is the next question).
Hi,
I seem to remember that encrypted LibreOffice documents don’t contain the Thumbnails
directory. Thus searching for this directory in the document tree might be of help.
If you open the ZIP, it’s easier and 100% reliable to inspect META-INF/manifest.xml, as @ajlittoz and @robleyd suggested, than to rely on something tangential, not holding e.g. for ODTs saved by MS Word.
ODT is a ZIP file (thus, the first step is to check ZIP magic number). The ODF standard requires that the first entry in that ZIP is uncompressed mimetype. Thus, its content - which is application/vnd.oasis.opendocument.text in case of ODT - appears in conformant ZIP as a plain text near the beginning of the file.
However, not all files follow the "mimetype file shall be the first file of the zip file. It shall not be compressed" piece.
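For a conformant package, the check described above can be sketched in Python roughly as follows. This is only an illustration: the function name `sniff_odt` is made up, the offsets come from the ZIP local file header layout, and the sketch returns False for packages that do not store mimetype first and uncompressed (or that leave the size fields zero in the local header).

```python
import struct

ZIP_LOCAL_HEADER_MAGIC = b"PK\x03\x04"
ODT_MIMETYPE = b"application/vnd.oasis.opendocument.text"


def sniff_odt(head: bytes) -> bool:
    """Return True if `head` (the leading bytes of a file) looks like a conformant ODT.

    Hypothetical helper: it only inspects the first ZIP local file header.
    """
    # A ZIP local file header is 30 bytes plus the entry name and extra field.
    if len(head) < 30 or head[:4] != ZIP_LOCAL_HEADER_MAGIC:
        return False
    # Offsets in the local file header: compression method at 8,
    # compressed size at 18, name length at 26, extra field length at 28.
    (method,) = struct.unpack_from("<H", head, 8)
    (size,) = struct.unpack_from("<I", head, 18)
    name_len, extra_len = struct.unpack_from("<HH", head, 26)
    name = head[30:30 + name_len]
    data_start = 30 + name_len + extra_len
    data = head[data_start:data_start + size]
    # Method 0 means "stored" (uncompressed), as the ODF standard requires.
    return method == 0 and name == b"mimetype" and data == ODT_MIMETYPE
```

Reading the first 128 bytes or so of the file is enough for this particular check, because only the first entry is examined. It says nothing about encryption.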
Usual encrypted packages are not different in this regard. You can’t detect if it’s encrypted, using “magic number” technique. For all applications that don’t really read ODF, there would be no difference between unencrypted and encrypted ODF packages.
In 24.2, a new “wholesome encryption” appeared. It differs in how the data is stored: beside the normal mimetype, there is also an encrypted-package stream in it, whose presence you can also detect by the name in plain text near the beginning of the file. So these (new, still experimental, not widespread) files will allow you to discriminate.
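If you can open the ZIP rather than scan raw bytes, looking for that stream is a short check. The sketch below assumes the entry is named exactly encrypted-package and sits at the top level of the package, as described above; `has_encrypted_package` is a made-up name.

```python
import zipfile


def has_encrypted_package(path_or_file) -> bool:
    """Return True if the package lists a top-level 'encrypted-package' entry.

    Assumption: the entry name is exactly 'encrypted-package'.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        return "encrypted-package" in zf.namelist()
```

This only covers the new 24.2-style packages. Usual password-protected files have no such entry.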
Hi Mike
I appreciate your detailed explanation, and I apologize for the misunderstanding: the encryption I mentioned is the “Save with password” option under “Save as” in LibreOffice.
What I am looking for is a unique magic number in an .odt file that lets me identify whether the file is protected by a password or not.
Thank you~~~~~
I explained that in my answer. You may only reliably detect encryption not using any magic number, but opening the ZIP, and reading the META-INF/manifest.xml.
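A minimal Python sketch of that approach, for illustration only: it opens the package, parses META-INF/manifest.xml, and reports whether any entry carries a manifest:encryption-data element. The function name `is_odf_encrypted` is made up, and treating a package without a manifest as "not encrypted" is an assumption of this sketch.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF manifest elements.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def is_odf_encrypted(path_or_file) -> bool:
    """Return True if the manifest declares encryption data for any entry."""
    with zipfile.ZipFile(path_or_file) as zf:
        try:
            raw = zf.read("META-INF/manifest.xml")
        except KeyError:
            # No manifest at all: assumed here to mean "not an encrypted ODF".
            return False
    root = ET.fromstring(raw)
    # Encrypted entries have a manifest:encryption-data child element.
    return root.find(f".//{{{MANIFEST_NS}}}encryption-data") is not None
```

Called as `is_odf_encrypted("some.odt")`, this needs the whole file, not just its leading bytes, because the ZIP directory is at the end of the file.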
Hi Mike
I opened a password-protected and a non-protected .odt file with the 7zip application to compare them, and I can see that the content of META-INF/manifest.xml is different between them. I also noticed that the file Thumbnails\thumbnail.png exists in all the non-password-protected .odt files.
Thank you for providing such a great solution.
Testing for the presence of a thumbnail in a non-encrypted file is not an assured method of detection, as generation of a thumbnail can be turned off in Expert Configuration.
And also that was discussed earlier - see Magic number of encrypted .odt? - #7 by mikekaganski, where I mentioned that e.g. MS Word does not generate the thumbnails (when writing ODF). It is really wrong to assume something like that - using unreliable hacks. Additionally: where would you look for the "Thumbnails/thumbnail.png" string? Are you sure that reading N leading bytes (for an arbitrary N) would necessarily cover the whole ZIP header? The number of elements in the package may be arbitrary, including macros, embedded databases, images, OLE objects and their replacements, … - so what chunk is OK for this unreliable hack to look semi-reliable? Also: do you consider the generators that may produce these strings in wrong case (which is accepted by many programs on Windows)? Or that use backslash \ instead of standard slash /? These are real-life situations (see e.g. tdf#76115, tdf#96401, and tdf#131575).
HI Mike
where would you look for the “Thumbnails/thumbnail.png” ?
A: Use 7z software to open a non-password protected .odt file.
The purpose of this question is that our customer is looking for a way to detect password-protected .odt files with their DLP product. The DLP product provides a custom script solution to detect some special file types by magic number.
DLP website
So that is why I keep trying to analyze the .odt file.
Lol. This makes me wonder whether you know what “magic number” means. I know how to see it. I asked what the reliable way is to find it as a magic number.
But well, good luck.
Hi @mkil2000,
It looks like you didn’t read carefully or did not understand what has already been written.
An encrypted ODF file is not the same as an “encrypted” ZIP. Encrypted ZIP relies on an encryption algorithm built in the zip utility. It operates on file level.
An encrypted ODF is different. Unless I’m wrong, only content.xml is encrypted to hide the contents of the document. To avoid misinterpretation of these contents (and likely their rejection), the encryption and some of its parameters are declared in manifest.xml.
And, as already mentioned, existence of thumbnail.png is not the correct criterion to decide whether document is encrypted or not.
DLP will not detect ODF encrypted files. So, we’re back to a crucial question: why do you want to detect encrypted files? Are you in an automated workflow where encrypted files need to be routed to a human operator so that the key can be entered? Some other reason?