Proliferation of styles just from simple edits of a new document, making tracking in source control look fugly

  1. I tried pretty hard to search for this yesterday and today. Maybe not hard enough but.
  2. Make a one sentence document
  3. Save it as “Too many styles - Step 1.fodt”
  4. Copy it to “Too many styles - Step 2.fodt”
  5. Edit “Too many styles - Step 2.fodt”
  6. Put two words in the middle of the sentence
  7. Save it

It goes from this:
<text:p text:style-name="P1">This is a test document in Default Paragraph Style.</text:p>
To this:
<text:p text:style-name="P1">This is a test document <text:span text:style-name="T1">ADDED THIS </text:span>in Default Paragraph Style.</text:p>

The more I edit, the more it gets chopped up with apparently needless styles, sometimes with different names. I have a 23 page prose document with like 200 identical styles in it, differing only by numerical identifier.

When I check this into git I see so many changes that I didn’t make that I give up and reformat the whole document to “No character style”.

Is there something I’m doing wrong? I am dying.

The styles T1 and P1 are so-called autostyles (automatic styles), i.e. the mechanism ODF uses to record ad hoc formatting properties, as opposed to proper named styles.

In this case, the styles differ only by rsids: identifiers that serve the goal of better document comparison; see Random Number to improve accuracy of document comparison. If you disable that option, new autostyles will stop being created. If you save to a non-extended ODF format, the existing rsids will also get removed.
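To check how many autostyles in a flat .fodt file are duplicates apart from their rsid, something like the following rough sketch could help. It assumes the rsids are stored as `officeooo:rsid` and `officeooo:paragraph-rsid` attributes in the `http://openoffice.org/2009/office` namespace; the function name is made up for illustration, and it only reports, it does not change the file.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
# Assumption: LibreOffice writes rsids in its "officeooo" extension namespace.
OFFICEOOO_NS = "http://openoffice.org/2009/office"
RSID_ATTRS = {
    f"{{{OFFICEOOO_NS}}}rsid",
    f"{{{OFFICEOOO_NS}}}paragraph-rsid",
}


def group_autostyles_ignoring_rsid(fodt_xml):
    """Group automatic style names by their properties, with rsids left out."""
    root = ET.fromstring(fodt_xml)
    auto = root.find(f"{{{OFFICE_NS}}}automatic-styles")
    if auto is None:
        return {}
    groups = defaultdict(list)
    for style in auto.findall(f"{{{STYLE_NS}}}style"):
        props = []
        for child in style:
            # Keep every property attribute except the rsid ones.
            attrs = tuple(sorted(
                (k, v) for k, v in child.attrib.items() if k not in RSID_ATTRS
            ))
            if attrs:
                props.append((child.tag, attrs))
        key = (
            style.get(f"{{{STYLE_NS}}}family"),
            style.get(f"{{{STYLE_NS}}}parent-style-name"),
            tuple(props),
        )
        groups[key].append(style.get(f"{{{STYLE_NS}}}name"))
    return dict(groups)
```

Any group holding more than one name is a set of autostyles that are identical once the rsid is ignored.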

But really, you never explained what real problem you are trying to solve: i.e., how these autostyles make your life harder…

Thanks for the help. The reasons are twofold:

  1. Checking into git. Source control runs on diffs or deltas, and when I check in a change I want to see the fifty words I added, inserted or changed, not the two words I added mixed randomly in with an equivalent amount of XML elements that tell me nothing about my changes.
  2. Document size. I’m scared that when my document runs to 500 pages it will slow to a crawl processing all these apparently pointless XML elements, and that I will have to abandon it. I’d rather do that now at 23 pages than later, when I might have to reapply all my emphasis by hand in an entire book.
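For the git side, one possible workaround is to have git diff the text of the document instead of its XML, using git's textconv mechanism. The script below is only a minimal sketch: the file name `fodt2txt.py` and the function name are hypothetical, it prints one line per paragraph or heading, and it ignores details such as `text:s` space elements, tabs and nested paragraphs.

```python
import sys
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARAGRAPH_TAGS = (f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h")


def plain_text(root):
    """Return the body text of a flat ODF document, one paragraph per line."""
    body = root.find(f"{{{OFFICE_NS}}}body")
    scope = body if body is not None else root
    lines = []
    for elem in scope.iter():
        if elem.tag in PARAGRAPH_TAGS:
            # itertext() joins the text of the paragraph and all spans inside it,
            # so span boundaries and style names drop out of the result.
            lines.append("".join(elem.itertext()))
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) == 2:
    print(plain_text(ET.parse(sys.argv[1]).getroot()))
```

Assuming the script is saved as `fodt2txt.py` somewhere git can find it, a line `*.fodt diff=fodt` in `.gitattributes` together with `git config diff.fodt.textconv "python3 fodt2txt.py"` makes `git diff` show changed words rather than changed spans. The stored file is unaffected; only the displayed diff changes.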

Thanks

tdf#85660

This is a completely made-up idea. Whether or not these things are there, the imported paragraphs will contain their properties, and the rsids will not affect anything performance-wise. They have other negative effects (e.g., they may break ligatures, and that’s a real bug), but not that.

Note that the in-memory document model is not XML-based.

Thanks for the bug lookup and the pointers. I’ll try to do better about searching there.

Does it mean you don’t use character styles? Every style category has a semantic role in describing the logical structure of your document. And IMHO, this is more important than a few added rsids.

By not using character styles, you’ll create far more single-use autostyles than you would by properly marking your text with character styles. With a character style you have only a single style definition shared by all similar sequences, while direct formatting implies the creation of one style per occurrence. And this will surely increase your document size much more than rsids do.

Yesterday I realized that, and resolved to switch from ctrl-I for italics to using character style Emphasis, and it looks great. In order to do that I had to go through my document paragraph by paragraph, doing control-M on each to clear direct formatting, and reapplying the Emphasis style wherever I had previously used italics. So far so good until I got the random spans just from adding two words.

  • Tried the RSID thing and it’s an improvement: Tools - Options - Load/Save - ODF format version
    to “1.3” rather than the default of “1.3 extended (recommended)”
  • But still get this when I apply Emphasis style to “forty five” and then apply No Character Style to the word “five”. It didn’t “pull in” the right hand Emphasis, it sorta sub-sectioned the second word:
    <text:p text:style-name="Standard">This is a test <text:span text:style-name="Emphasis">forty </text:span>five<text:span text:style-name="Emphasis"> </text:span>in Default XXXXX Paragraph Style.</text:p>
  • I think I might be looking for a markup language?

You surely included the space after “five” when you manually selected the emphasis range the first time. This happens if you use Ctrl+Shift+<right_arrow> to select groups of words, because Ctrl+<right_arrow> moves to the start of the next word, so the selection keeps the spacing before it. It is easy to miss this extra space when applying a style.

In your example, five is not styled, so the forty and the trailing space, which are styled, are marked up correctly? What was your expectation here?

My expectation, probably unrealistic, was “I am changing from two italicized words to one italicized word, so please pull the Emphasis style in tight so there are a minimum number of XML elements in my stuff.” From the point of view of looking at what has changed, an Emphasis section applied to a blank space is icky. I don’t want to be constantly aware of these little bits creeping into my stuff, I’m trying to write a story. But if I don’t stay on top of it and clean up the formatting of everything I did that day, which might be edits across the whole document, then I will see all these things as diffs in my git checkin, and I will not actually be able to see the words I changed very well.
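Short of having Writer tighten the spans itself, stray spans like the Emphasis-on-a-space one can at least be listed before a checkin. This is a small sketch under the assumption that the document is saved as flat .fodt; the function name is invented for illustration, and it only reports the spans, it does not rewrite the file.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def whitespace_only_spans(root):
    """List (style name, paragraph text) for spans that contain only whitespace."""
    hits = []
    for para in root.iter(f"{{{TEXT_NS}}}p"):
        for span in para.iter(f"{{{TEXT_NS}}}span"):
            text = "".join(span.itertext())
            # isspace() is False for an empty string, so empty spans are skipped.
            if text.isspace():
                hits.append((
                    span.get(f"{{{TEXT_NS}}}style-name"),
                    "".join(para.itertext()),
                ))
    return hits
```

Calling it as `whitespace_only_spans(ET.parse("story.fodt").getroot())` (the file name is just an example) returns one entry per stray span, with enough paragraph text to find it again in Writer.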

I’m checking out Emacs with Org-mode as a markup solution.

Change your mind about how you write. Don’t try to outsmart Writer at its own encoding. You’re an author, behave as an author. As you point out, the important point is the story. If you don’t want to be distracted by formatting, write everything as undecorated Body Text.

Then at the end of the day, become a “graphic artist” and apply styles: Heading 1 for your chapter titles; various character styles like built-in Emphasis or Strong Emphasis and other custom ones like Irony, Slang, Sotto Voce, Thought, …

Don’t describe the visual effect in your style name. Give each style a semantic value, a significance, because today you think Slang should be italic but tomorrow you may think it better to be red with a different font face. For now Emphasis and Slang have the same visual effect, but having two styles allows you to change one semantic value without modifying the other.

In Writer, styles are the mark-up solution you’re looking for.

And let Writer take care of the internals. Modern computers don’t really care for a few kbytes added or removed. It is much more important that you focus on your author’s creativity.

PS: Writer handles very easily heavily styled documents up to ~1000 pages without appreciable impact on performance (this is not the same story with direct formatting).

I hear what you’re saying. But I really want to do clean checkins to git for diffing, and I don’t want to be constantly conscious of little lost style spans. The two goals may be incompatible for me. There are a few more things I can try.

Thanks for your help though.

Have you already tested the Track Changes feature? I find it sufficient for my needs (which are not very sophisticated).

I will check it out, thanks.