Hi @Uglyface200, The comparison is very interesting. Could you please attach the example files that you describe above so that we could see how they differ?
Thanks!
I don’t know how to attach files.
Hi @Uglyface200, You have enough karma to upload files, so that shouldn’t be an issue
When creating a new Answer (or editing an existing one), please click the paperclip icon in the toolbar to upload.
Cheers!
I don’t have enough new info to merit an “answer”, so I’ll post this observation as a comment. I had a .docx file that I created in Word, brought home, and opened in LibreOffice. I ended up with a 4.5 MB file, and LibreOffice was crashing/recovering frequently whenever I went into the Formula Editor (doing math homework). When I opened a new Writer document and copy-pasted the contents from the Word document, the resulting file was 500 KB.
I took a look at the odt files you provided. To see it yourself, rename these to .zip files and unpack.
This seems to leave only 6.5 KB of overhead. When I deleted the Thumbnails directory from the zip file, I got:
13.9 KB Writer zip vs 12.3 KB Word zip, which is a 1.6 KB difference.
(A fun note: with a completely white first page, the same document was 14.9 KB.)
styles.xml has some extra styles for graphics, tables, even if these are not used in the document. The largest chunk is outline settings (how 10 levels of headings are positioned and numbered).
settings.xml contains things like printing settings and where your cursor was when you saved. Word’s settings file is empty; it contains nothing useful at all.
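If you would rather not rename and unpack by hand, the same inspection can be sketched in a few lines of Python using the standard zipfile module. This is only a rough sketch: the function names `list_entries` and `print_report` are made up for illustration, and it assumes the .odt or .docx file is an ordinary ZIP package.

```python
import zipfile


def list_entries(path):
    """Return (name, uncompressed_size, compressed_size) for each entry in a ZIP-based document."""
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in archive.infolist()
        ]


def print_report(path):
    """Print the entries of the package, largest compressed size first."""
    entries = list_entries(path)
    for name, size, packed in sorted(entries, key=lambda e: e[2], reverse=True):
        print(f"{packed:>10} {size:>10}  {name}")
    total = sum(packed for _, _, packed in entries)
    print(f"{total:>10} bytes of compressed data in total")
```

Calling `print_report` on the Writer file and on the Word file should show where the extra kilobytes go (for example Thumbnails/thumbnail.png, styles.xml and settings.xml). The total is slightly smaller than the size on disk, because it does not count the ZIP headers themselves.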
I don’t like the thumbnail feature. It’s not very useful, because it’s only ever seen at small sizes, sizes at which the first pages of most files don’t look very different. Also, I suppose this partially explains why Word can’t directly open the ODT files I create with Writer.
Its usefulness is questionable and depends on your documents. But as for Word’s inability to open ODT files, this is just bad engineering on their side (or worse, if it is intentional). The reason is certainly not the previews: an extra file inside a ZIP cannot break anything if it’s not used.
Is it possible to turn the thumbnail feature off?
@benny, not that I know of. Nowadays pretty much nobody cares about file sizes below a few MB. If you archive a lot of ODT files, you could remove the thumbnail yourself.
Also, fun fact: unzipped (e.g. TAR-ed) ODT files would actually compress better if you compressed more than one of them together like this.
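For anyone who wants to try removing the thumbnail from archived files, here is a minimal Python sketch using the standard zipfile module. The function name `strip_thumbnails` is hypothetical, and the sketch assumes that simply leaving out the Thumbnails/ entries is acceptable. It does not touch META-INF/manifest.xml, which may still mention the thumbnail, so try it on copies first and check that the result still opens.

```python
import zipfile


def strip_thumbnails(src_path, dst_path):
    """Copy a ZIP-based document to dst_path, leaving out everything under Thumbnails/.

    src_path and dst_path must be different files.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            if info.filename.startswith("Thumbnails/"):
                continue  # skip the preview image and its directory entry
            # Keep the original entry order and per-entry compression method,
            # so an uncompressed "mimetype" entry at the start stays that way.
            dst.writestr(info, src.read(info.filename), compress_type=info.compress_type)
```

For example, `strip_thumbnails("report.odt", "report-nothumb.odt")` writes a second file and leaves the original untouched (both file names here are placeholders).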
I thought I’d mention an interesting postscript to this question. I had a ten-megabyte .docx file of a book that I was working on. I saved it in .odt format from Word 2010, and the file size absolutely plummeted to 645 kilobytes! That is a decrease of roughly 94% (the file is about 16 times smaller)! So, by all means, use the .odt format, even if it is marginally inefficient at small sizes.
Off the top of my head, I can think of at least two reasons why the .docx file is so big: 1) it probably keeps a revision history even though this has not been requested (I have experienced this many times with .docx files sent by colleagues, and sometimes obsolete content shows up at the end of the document even though it has been deleted); 2) OOXML’s tag names are shorter, but harder to compress efficiently because they are much more numerous (see Rob Weir’s blog for explanations of that).
Have you analysed what happens with different sizes?
Please take a look at this thread.
There are no images involved in either file. And performance is equal at all sizes.
This is not an answer to the question.
At the request of qubit, I created a document containing the Mozilla Public License in Microsoft Word. I saved the file in .odt and .docx format.
Then I opened the .odt file with Writer and saved it as a new file, and it ballooned from 13 KB to 22 KB. Just for the heck of it, I opened the .docx file with LibreOffice and saved it as a new file. It shrank from 22 KB to 10 KB!
The involved .odt files are attached (I wanted to attach the others, but why on Earth does this site not allow .docx files?). No content was changed after the initial save.
@Uglyface200, Good question about docx uploads. Let me run the idea past the other admins and see if there’s a specific reason why that support isn’t enabled.
Regarding the comparison of file sizes, lemme resolve the file-upload issue first, then we can get back to that
The fair way to compare is to create a new document with the same content in each office suite. Opening an .odt document generated by MSO 2010 in LO 4 and then saving it as a new .odt is unfair, because some generic markup from MSO stays embedded in the file.
I don’t have MSO 2010, so I can’t see how the results differ between creating a new document and saving as .odt from MSO 2010.
Here’s a suggestion on how you could present the results:
Created In - Format - Size
Converted to: New filename as link
Converted by - New format - New size
So:
Word 2010 - docx - 22k
Converted to: MPL text.odt
Writer 4.0.0.3 - odt 1.2 - 10k
Converted again to: MPL text.doc
Word 2013 - odt 1.2 - 20.3k
Hi @Uglyface200,
I’m not sure if/when we can make docx file upload available on this site. Looks like things are pretty busy in IT land for the foreseeable future. I’m pretty sure that we can upload docx files at bugzilla, so here are a couple of options:
File an enhancement request bug and attach some example files. I’d concentrate on the same file type first – so if (given the same input) MS-Word creates smaller ODT files than LO-Writer, then report that.
Punt on this question for a while, until we have some time to clear up our IT work queue.
Thanks!