Possible bug in Calc involving saving embedded fonts and filesize oddities

Hi,

I’ve recently started using the features to embed fonts in a Calc spreadsheet - selecting both options in the properties-fonts tab. Usually I get a filesize of ~12MB when saving this document. But every once in a while it will come out as ~1.3MB. If I resave without closing the document, it will save at the smaller size again. I have to exit out of the document, reopen it, then resave it - and it will then come out as ~12MB again.

Sadly, I have not been able to find a way to reproduce it on command though. So I haven’t tried filing a bug report yet.

I started using the feature around LO 24.2.4 or so, and the problem has shown up in every version since: LO 24.2.5, and even the beta LO 24.8.0.1. The main font I’m using is Calgary.

Is anyone else able to find such filesize oddities when saving embedded fonts?

I wonder why you want to embed a font for a spreadsheet?

For me, embedding fonts increases the size of my test spreadsheet from 31 KB to 26163 KB as I have too many fonts installed. I suppose the checkbox to embed only the used fonts isn’t working properly. You should report a bug; see “How to Report Bugs in LibreOffice” on The Document Foundation Wiki.

I’m not sure whether embedding a font in a spreadsheet counts as a template, but it sure increases the licence fee for Calgary if it does: it goes from $15 to $500.

Where did you get these strange versions from?

Oops, I meant the beta 24.8… Corrected in OP.

It’s a spreadsheet for public use, so I want to try and make sure it looks the way I intended - for wherever it might go.

imho it is always better to publish a PDF.

It needs to be a spreadsheet. It’s a calculator and database of sorts.

I was about to try doing that, but it seems like they almost required reproducible steps - so I opted out of that.
For now, I’m going to try using just the one option for embedding all, as opposed to the “only fonts used in document” option. I think that might be what I want anyways since I’m under the suspicion that they mean only the characters of the fonts currently being used, if I would check that 2nd option. The filesize went from ~12MB to ~15MB. But, storage is cheap.

No we don’t. We mean font files as they exist on your system (the only place where subsetting works is in PDFs).

Well, without these steps, you are free to file a bug of course - but it would be useless.

I’m not up to par on font lingo. But there are a whole ton of fonts on my system. My document only uses two fonts. So why is there only a difference of about 3MB when I add the second option to it? I would think the file would be far bigger, as there seems to be about a hundred fonts on my system.

Those were my thoughts, exactly.

Because neither option is about all the fonts on your system (why would anybody think that we might pull all kinds of unrelated stuff from your computer into the document!). Both are about fonts that are mentioned somewhere in your file; the difference is where. The second option, “Only embed fonts that are used in documents” (what an awful name!), restricts the list to fonts that are not just mentioned in styles but are actually applied to content (directly or through styles). I.e., if you have styles that are not applied to any content (e.g., cells), as happens when you create templates with styles that could potentially be used, then the fonts of those styles won’t be taken into account when embedding fonts if the second option is enabled. Note that the content (like cells) doesn’t have to contain actual text: the fact that an empty cell has some font applied already makes that font used.
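To make the difference between the two options concrete, here is a toy model in Python. It is only a sketch of the rule as described above, not LibreOffice's actual code; the `Style` class, the `fonts_to_embed` function and the two example styles are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """Hypothetical stand-in for a document style."""
    name: str
    fonts: tuple  # font names the style mentions


def fonts_to_embed(styles, applied_style_names, only_used):
    """Toy model: collect fonts from all styles, or only from applied ones."""
    result = set()
    for style in styles:
        # With the second option on, skip styles not applied to any content.
        if only_used and style.name not in applied_style_names:
            continue
        result.update(style.fonts)
    return result


styles = [
    Style("Default", ("Calgary",)),
    Style("Heading", ("Forum",)),  # defined in the file, applied to no cell
]
# An empty cell carrying the "Default" style is enough to make it "applied".
applied = {"Default"}

all_mentioned = fonts_to_embed(styles, applied, only_used=False)
only_applied = fonts_to_embed(styles, applied, only_used=True)
print(sorted(all_mentioned))
print(sorted(only_applied))
```

In this model, fonts installed on the system never enter the picture; only fonts named by styles in the file can be embedded.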

Ah ok, I think I’m starting to get it now. I’ll have to digest this again later, as I’m about to go to bed - so I’m a bit fuzzy. And yeah, it certainly sounds like a bad name now.

Saw that you’re a developer, and I just got to thank you for what all you guys have done with LO. It’s quite breathtaking what has been accomplished. :+1:

Maybe change to Spectral (Extra Light) or the Forum font from Google Fonts.

I use 24.2.7 on Ubuntu. And this “Only embed fonts that are used in documents” option does not work as intended. It embeds font files completely unrelated to the contents of the spreadsheet. The embedded fonts don’t belong to a style that I can find in the document. I made a testfile with two cells, both use Fira Code Light. The ods file is 31.8 MB in size and has 4 embedded fonts (none of them are Fira Code Light).

the fonts are of the Noto Sans family: CJK_SC_{1,2}, Devanagari_{1,2}

I did not use them. This feature is definitely broken in some way.
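One way to check which font files ended up inside a saved file is to list the package contents, since an .ods file is a ZIP archive. The sketch below assumes that embedded fonts are stored under a `Fonts/` folder inside the package; that folder name is an assumption here, so adjust the `prefix` argument if your file differs. The function name `list_embedded_fonts` is made up for this example.

```python
import zipfile


def list_embedded_fonts(path, prefix="Fonts/"):
    """Return (entry name, uncompressed size) for entries under `prefix`.

    The "Fonts/" default is an assumption about where embedded fonts live.
    """
    with zipfile.ZipFile(path) as package:
        return [
            (info.filename, info.file_size)
            for info in package.infolist()
            if info.filename.startswith(prefix) and not info.is_dir()
        ]
```

Calling `list_embedded_fonts("your-file.ods")` and printing the result shows which font files account for the size of the document.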

It is not; it’s just a wrong idea of what “used” means in this context.
The fonts are used when they are mentioned in either styles or direct formatting that are referenced from text. And the fact that you do not see them doesn’t mean they are not there - you may save as FODS (or FODT, …) and parse the XML to check whether there is (or, more likely, there is not) a wrong “used” detection.

Note that I write this without seeing your document; indeed, there may be bugs, but this is the feature I touched only recently (the other week), so I know its state and behavior pretty well.
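For anyone who wants to try the FODS route, a minimal Python sketch of the parsing step could look like this. It only lists the `style:font-face` names declared in a flat ODF file; the function name `declared_font_faces` and the sample XML are made up for illustration, and a real file contains far more markup.

```python
import xml.etree.ElementTree as ET

# ODF namespace for the "style:" prefix.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def declared_font_faces(fods_xml):
    """Return the sorted style:name values of all style:font-face elements."""
    root = ET.fromstring(fods_xml)
    tag = f"{{{STYLE_NS}}}font-face"
    attr = f"{{{STYLE_NS}}}name"
    return sorted({el.get(attr) for el in root.iter(tag) if el.get(attr)})


# Tiny hand-written sample, not taken from a real document.
SAMPLE = """<office:document
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0">
  <office:font-face-decls>
    <style:font-face style:name="Fira Code Light"/>
    <style:font-face style:name="Noto Sans CJK SC"/>
  </office:font-face-decls>
</office:document>"""

print(declared_font_faces(SAMPLE))
```

For a real file, read it in binary mode and pass the bytes to the function, e.g. `declared_font_faces(open("test.fods", "rb").read())`.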

Hi,
I have looked at the styles.xml file of my test document (30 MB is too big to upload it here). So, I will deactivate font embedding and give you the smaller version of the file (if you want to have a look).
font-embedding-test.ods (9.8 KB)

One big problem is that it is very hard to browse styles. The style preview function does not work for me. So, I clicked through all the styles and “edit style” - I could not find a style that references these other fonts. The styles.xml file is hard to read, and FODS is even harder to read. I could not find a mapping between styles and the elements they apply to. If the FODS file has references to styles that I didn’t set, then I cannot know whether the file is bugged or I made a mistake somewhere. So, it doesn’t help me.

There is no overview that I could find that summarizes styles. And there are too many of them to keep track of. If there is a style somewhere that I missed while clicking through the styles, then maybe you are right.

I would have never chosen anything that is related to CJK or Devanagari in anything that I write (I have no such use-case).
This also doesn’t explain why Fira Code Light itself was not embedded, even though all styles that I saw used that font.

And finally, embedding fonts referenced in unused styles is bad, and you shouldn’t do it: if I cannot have an overview of which font is used by which style, then I can never diagnose this problem myself. I am fairly sure that I would never click through all of these styles again. I did it because this really bugs me, so I wanted to check thoroughly.

So, I am 99% sure that there is a bug in some sense and what you said is just how it is intended to work. If you are sure that this feature indeed works as intended, then I would just never embed fonts. I do not want 30 MB files for 8 bytes of data, and I cannot debug the list of styles for weird, unused things.

Please list the font files that get embedded for you; then I will try to explain why they were embedded. For Fira Code Light, I will check whether it gets embedded and, if not, why.

But in general, you need to understand that any style (and any direct formatting) defines three fonts: one for “Western” scripts, one for “Asian” scripts, and one for “Complex” writing systems. And even if you do not see them (because of your settings that hide non-Western features), they are defined; and whenever you use a style where a font is defined, that font is considered used - even if the text with that style does not contain a single character in Asian or Complex scripts.
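The three fonts per style can be made visible by reading the style definitions in a FODS file (or in styles.xml). The Python sketch below assumes the fonts appear as the `style:font-name`, `style:font-name-asian` and `style:font-name-complex` attributes of `style:text-properties`; it looks only at named `style:style` elements and ignores defaults and direct formatting. The function name `fonts_per_style` and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

# Script group -> attribute on style:text-properties.
SCRIPT_ATTRS = {
    "western": "font-name",
    "asian": "font-name-asian",
    "complex": "font-name-complex",
}


def fonts_per_style(xml_text):
    """Map each named style to its Western/Asian/Complex font names."""
    root = ET.fromstring(xml_text)
    result = {}
    for style in root.iter(f"{{{STYLE_NS}}}style"):
        props = style.find(f"{{{STYLE_NS}}}text-properties")
        if props is None:
            continue  # this style sets no text properties
        result[style.get(f"{{{STYLE_NS}}}name")] = {
            script: props.get(f"{{{STYLE_NS}}}{attr}")
            for script, attr in SCRIPT_ATTRS.items()
        }
    return result


# Hand-written sample, not taken from a real document.
SAMPLE = """<office:document
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0">
  <office:styles>
    <style:style style:name="Default" style:family="table-cell">
      <style:text-properties style:font-name="Fira Code Light"
        style:font-name-asian="Noto Sans CJK SC"
        style:font-name-complex="Noto Sans Devanagari"/>
    </style:style>
  </office:styles>
</office:document>"""

for name, fonts in fonts_per_style(SAMPLE).items():
    print(name, fonts)
```

A style that shows only one font in the dialog can still carry all three attributes in the file, which is why the Asian and Complex fonts count as used.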

So, if I understand your answer correctly, then

  1. the Asian and Complex fonts are set invisibly and then embedded (perhaps system default values, or similar).
  2. there is no way for me to know about this, because even if I meticulously define a style it will have a kind of fall-back
  3. It is very hard to write a file that will look correct for another person (assuming that they don’t have the font I used)

These are the fonts that are embedded, taken from the fods file:

  <style:font-face style:name="Noto Sans CJK SC" svg:font-family="&apos;Noto Sans CJK SC&apos;" style:font-family-generic="system" style:font-pitch="variable"/>
  <style:font-face style:name="Noto Sans Devanagari" svg:font-family="&apos;Noto Sans Devanagari&apos;" style:font-family-generic="system" style:font-pitch="variable"/>

I think that I can kind of understand your answer (even though it blows my mind that this is intended). Maybe this feature is useful in some cases.

You can disable exporting groups of fonts in the font embedding settings. But yes, this is intended. A style is how text should look; and a used style is a style appearing in the document. A font is embedded to allow the user to continue typing and get text that looks as intended.

And Asian and Complex fonts aren’t fallbacks - they are the primary fonts used when you (or another person, to whom you sent your file) type a Chinese character or a Hebrew character - the same way as e.g. your Fira Code font is used when you type a Latin character. The program has no idea which character you will type next; but it knows that you created a style (or a direct formatting) that has a “Fira Code+Noto Sans CJK+Noto Sans Devanagari” combo, that the style is used, and that you asked to embed fonts from styles that are used.

In my testing, Fira gets embedded OK… Maybe your font is some version that has the wrong license, one that prevents embedding?

Hmm, the license of Fira Code seems OK to me:

This Font Software is licensed under the SIL Open Font License, Version 1.1. This license is available with a FAQ at: http://scripts.sil.org/OFL

I tried switching it to Fira Mono (Open Font License) and that did work: it was embedded! This is interesting and kind of worrisome. There was no warning about a licensing failure.

But… I didn’t define a combo. I never selected or typed this: “Fira Code+Noto Sans CJK+Noto Sans Devanagari”.

This is true only in a very distant sense. If I open the style in libreoffice:

The program definitely knows that I am typing in English because somehow it knew to show me only one font, not all three.

I understand the thing you say about what another person may type, and they could very well type a Japanese character, for sure. But, do you agree that this setting is completely invisible to the user? I have no influence over this at all. Who set these fonts? I promise you that I didn’t, that’s why I called it a fallback (a value used when none is defined by the user). “Default” is maybe a better word.

I saw the checkboxes that deactivate Asian fonts and Complex writing systems. I left them on for the experimentation (otherwise I had 0 embedded fonts) - so this is close to workable, thank you.

Still… this situation isn’t good. I don’t know what a smart solution would be, but I think it’s along the lines of: someone types a Korean character into the sheet, then they have to define what the font for it should be and add that to the file.

I am kind of half satisfied that font embedding can sometimes work without creating huge files – thanks again.