2

Writer: clarification needed about character attributes

asked 2013-08-27 14:30:20 +0200

ajlittoz

updated 2013-08-27 21:20:00 +0200

When using character styles, it is not clear whether the attributes changed by the style override or augment those of the paragraph style.

It is even more unclear if several character styles can be simultaneously applied on the same run of characters.

Take, for instance, "Source Text" which forces a monospaced font. If you subsequently apply "Emphasis", you end up with monospaced italic. By contrast, if you apply a full user-defined style, then another user-defined style (with non-conflicting attributes), the second style replaces the first one.

The (minor) consequence of this is: you'd better apply "Default" style first to be sure to have only the effects defined in the second style.

In addition to this forcing/augmenting dilemma, there is no notion of toggling attribute.

Say you want to emphasise a sequence with "Outline" (I know it is ugly, but this is for a simple example). You can easily define a style forcing outline character shape. Now, your paragraph uses outline by default and you want to emphasise a single word by negating "Outline" style. In my LO experience, I can do that only by defining a new style "Removed Outline" where I force the absence of the effect. I end up with 2 paragraph styles and I must take care to explicitly use the right one. Moreover, if I change my mind afterwards about style "Outline", I must manually replace every "Removed Outline" style with "Outline".

The situation is even more complex with basic stylistic variations like italic or bold.

These variations are in fact different typefaces in the same font family. Consequently, "toggling" is meaningless. Setting or unsetting italic is equivalent to choosing a different font.

Coming back to the previous example, how to toggle italics in an italic-default paragraph without defining 2 context-dependent character-styles?

To sum up my concern, attributes can be forcing (0 or 1), toggling (logical xor) or augmenting (logical or).
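The three behaviours can be written down as a small truth function. This is only a sketch of the distinction drawn above, not a description of what LO does internally; `Mode` and `combine` are names made up for the illustration.

```python
from enum import Enum


class Mode(Enum):
    FORCE = "force"      # the style dictates the value (0 or 1)
    TOGGLE = "toggle"    # logical xor with the inherited value
    AUGMENT = "augment"  # logical or with the inherited value


def combine(inherited: bool, style_value: bool, mode: Mode) -> bool:
    """Return the effective on/off state of a single attribute.

    `inherited` is what the paragraph already gives (hypothetical model),
    `style_value` is what the character style asks for.
    """
    if mode is Mode.FORCE:
        return style_value
    if mode is Mode.TOGGLE:
        return inherited != style_value
    return inherited or style_value


# The paragraph already uses outline and the character style also says "outline".
for mode in Mode:
    print(mode.value, combine(True, True, mode))
```

Only the toggling mode turns outline off for the emphasised word in an outline paragraph; with forcing or augmenting semantics a second "Removed Outline" style is needed.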

What is the underlying model in LO? Is it consistent?


Comments

1

Your question is very important and I have been investigating such issues for some time. I think that ideally, what we’d like is a system that behaves consistently and intelligently the way a system like LaTeX does (I’m pretty partial about LaTeX, but I strongly believe in LibO’s potential too ;-) ). For example, how can we get a conditional emphasis character style dependent on the underlying paragraph style?

CyanCG (2013-08-28 23:36:48 +0200)

I’ll try to come up with a decent answer to your question in a few hours, stay tuned. This is a good opportunity to write a small summary of “my understanding so far”.

CyanCG (2013-08-28 23:37:24 +0200)

3 Answers

4

answered 2013-08-29 04:12:03 +0200

oweng

updated 2013-08-31 02:06:13 +0200

I am sure @CyanCG will provide a good answer here, but what I am commenting on is (IMO) one related aspect of why this character vs paragraph style override (OR / XOR) exists. The v3.5 series of LO did not abuse usage of the <text:span> element to the same extent as the v4.x series does. The ODF specification v1.2, with respect to this element, states:

6.1.7 <text:span>

The <text:span> element represents the application of a style to the character data of a portion of text. The content of this element is the text which uses that text style. The <text:span> element can be nested.

Here is a brief example of how use of this element has become problematic. Under GNU/Linux running TDF/LO v4.1.0.4 these steps produce the indicated underlying XML (double quotation marks are not entered, but are merely indicative of text to be entered):

  1. Open Writer.
  2. Enter "Here is some text."
  3. Save as a1.odt and exit Writer.
  4. XML shows:

    <text:p text:style-name="P1">Here is some text.</text:p>

  5. Re-open a1.odt.

  6. At the end of the previous text enter " It is now set to Text Body paragraph style."
  7. With the cursor still at the end of the text double click on the Text Body paragraph style.
  8. Save as a2.odt and exit Writer.
  9. XML shows:

    <text:p text:style-name="Text_20_body">Here is some text. <text:span text:style-name="T1">It is now set to Text Body paragraph style.</text:span> </text:p>

  10. Re-open a2.odt.

  11. At the end of the previous text enter " Same paragraph, but now I am going to go back and italicise the name of the style."
  12. Highlight "Text Body" and set it to use the Emphasis character style.
  13. Click at the end of the text and continue typing " I used the character style ‘Emphasis’ to do so and continued typing here afterwards."
  14. Save as a3.odt and exit Writer.
  15. XML shows:

    <text:p text:style-name="Text_20_body">Here is some text. <text:span text:style-name="T1">It is now set to </text:span> <text:span text:style-name="Emphasis"> <text:span text:style-name="T1">Text Body</text:span> </text:span> <text:span text:style-name="T1"> paragraph style. Same paragraph, but now I am going to go back and italicise the name of the style. I used the character style ‘Emphasis’ to ...


Comments

I do agree that simplicity in XML structure eases things and facilitates later updates. How does such a regression (personal opinion, no offense intended) happen? Some feature addition?

I don't understand the translation of a2.odt editing: a1 paragraph was "Standard". Adding text at the end uses the styles active at that location (in my understanding). Setting "Text Body" should style the whole paragraph content unless 'Revision tracking' is enabled by default (or at least some hidden feature gives the possibility to regenerate the revision history).

I already noticed user-visible differences between 3.x and 4.x. This fundamental one makes me hesitate to switch to 4.x for production.

ajlittoz (2013-08-29 08:36:47 +0200)

@oweng, this sums up the XML aspect very well. I’ll try to account for what happens inside the application itself, i.e. in LibO’s internal data structures (at least, the part of them that I think I understand). The abuse of span elements with automatic styles in 4.0 and 4.1 troubles me very much, among other reasons because it makes conversion to other XML formats (XHTML) and TeX formats (LaTeX, ConTeXt) much less clean and less semantic.

CyanCG (2013-08-29 16:01:04 +0200)

My memory is failing me, but I read somewhere (Bugzilla? new release feature page?) that this new use of span elements was meant to facilitate interoperability with OOXML: indeed, in Microsoft’s format, all text portions of a paragraph are part of something called a text run, even if no special formatting is applied to them (one of the many reasons why this format is awful). It might be related to change tracking. In any case, I don’t think this is a good reason to abuse span’s.

CyanCG (2013-08-29 16:05:30 +0200)

I agree with all your comments and I too find this matter of <text:span> elements troubling. ISO/IEC 29500-1:2012(E) §17.3.2.25 on p.293 outlines the <w:r> (Text Run) element. I can understand that <text:span> is used as a way of mapping to this element to cater for interoperability, but the behaviour of adding to existing text should not necessitate this. It should be checked whether the element is required. I cannot find a related bug or information on this change from v3.x to v4.x.

oweng (2013-08-30 01:40:05 +0200)
1

Update: I found a related LO User ML thread which points to a related bug: fdo#68183. Unfortunately the answer appears to be related to revision tracking (the officeooo:paragraph-rsid property, which is what @CyanCG suggested, i.e. it is an OOXML compatibility feature). In my examples above I did not have revision tracking turned on, so I would think this unnecessary.

oweng (2013-08-30 01:53:37 +0200)
1

This is the bug I had in mind. The comment by Holger Schmithüsen nails it pretty well. Who do we need to convince to see this bug addressed? This officeooo:rsid attribute is a hack and is absolutely unnecessary for those who actually use the OpenDocument format because of its specific virtues. I think that’s how the issue should be presented. OOXML compatibility should never have negative side-effects for those who choose ODF.

CyanCG (2013-08-30 21:39:02 +0200)
2

Well, I did a bit more research and it appears I was wrong in my initial conjecture that this was an OOXML-related change. The change relates to the feature for comparing documents. I have updated my answer to be clearer about this. Bug fdo#52028 provides the details. This is still little comfort if you would like the underlying XML to be cleaner. All I can suggest is raising a bug to address this, but unfortunately you will need to be incredibly specific in your detail of the problem.

oweng (2013-08-31 02:10:15 +0200)
1

Good, at least this appears to be a better reason for introducing those rsid’s. I might eventually raise a bug and describe the rationale, the use cases, the best practices etc. Maybe I’ll ask for advice on TeX.SE first by asking a question along the lines of “is it advisable to define \newcommands for marking up subsequent additions and revisions to a document?”. That would give us some food for thought!

CyanCG (2013-08-31 15:22:34 +0200)
4

answered 2013-08-31 16:12:05 +0200

CyanCG

updated 2013-09-03 16:57:01 +0200

Applying more than one text (character) style to a portion

I am using LibreOffice 4.1 on OS X 10.8.

It is indeed possible to apply more than one character style to a given portion of text. Take the following example with the two styles you mention (Source Text and Emphasis):

<text:p text:style-name="P1">This is what
<text:span text:style-name="T1">emphasized </text:span>
<text:span text:style-name="Source_20_Text">
<text:span text:style-name="Emphasis">code</text:span></text:span>
looks like.</text:p>

In this first example, I have added the string emphasized afterwards, so that it is enclosed in its own span element with an automatic text style T1. Reminder: styles that apply to text strings are called character styles in LibO and text styles in the ODF spec. The T1 style is defined thus:

<style:style style:name="T1" style:family="text">
  <style:text-properties officeooo:rsid="000c8b60"/>
</style:style>

The only defined attribute is an officeooo:rsid, which shows that this style’s only purpose is document comparison (this makes me grumpy). Apart from that, we can see that it is quite possible to apply two character styles to the same portion. In fact, there are two ways to say it:

  • LibO speak: It is possible to apply multiple different character styles to one text portion;
  • Spec speak: It is possible to enclose a given text node in multiple nested span elements with different text:style-name attributes.
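To see how many of a document's automatic text styles exist only to carry an rsid, something like the following Python sketch could be run against the style definitions. It is a rough illustration, not a tool I have validated on real files: the sample XML is built by hand from the T1 definition above (T2 is invented for contrast), `rsid_only_styles` is a hypothetical helper, and the `officeooo` namespace URI is the one I assume LibO writes into content.xml.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO_NS = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
# Assumption: the URI LibreOffice binds to the officeooo prefix.
OOO_NS = "http://openoffice.org/2009/office"

# Hand-made sample: T1 is the style shown above, T2 is invented for contrast.
SAMPLE = f"""<root xmlns:style="{STYLE_NS}" xmlns:fo="{FO_NS}"
      xmlns:officeooo="{OOO_NS}">
  <style:style style:name="T1" style:family="text">
    <style:text-properties officeooo:rsid="000c8b60"/>
  </style:style>
  <style:style style:name="T2" style:family="text">
    <style:text-properties fo:font-style="italic" officeooo:rsid="000c8b60"/>
  </style:style>
</root>"""


def rsid_only_styles(xml_text: str) -> list:
    """Return names of text styles whose only properties are rsid markers."""
    root = ET.fromstring(xml_text)
    names = []
    for style in root.iter(f"{{{STYLE_NS}}}style"):
        if style.get(f"{{{STYLE_NS}}}family") != "text":
            continue
        props = style.find(f"{{{STYLE_NS}}}text-properties")
        if props is None or not props.attrib:
            continue
        # Keep the style only if every attribute name ends in "rsid".
        if all(key.endswith("rsid") for key in props.attrib):
            names.append(style.get(f"{{{STYLE_NS}}}name"))
    return names


print(rsid_only_styles(SAMPLE))
```

Spans that reference such styles add no formatting, so a converter to XHTML or LaTeX could presumably unwrap them without changing the rendered result.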

Remark on LibO’s behaviour: the effects are cumulative, so that the word code in this example is displayed in monospaced oblique type (for a monospaced font, the proper term is oblique, as there is no italic shape to speak of, but of course LibO does what is expected and chooses the oblique font).
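The nesting in the example above can be made explicit by walking the paragraph and recording which styles enclose each piece of text. A minimal Python sketch, assuming the standard ODF text namespace; `runs` is a helper written for this illustration and only lists the styles, it does not resolve their attributes.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
STYLE_ATTR = f"{{{TEXT_NS}}}style-name"

# The paragraph from the example above, with the namespace declared.
PARAGRAPH = (
    f'<text:p xmlns:text="{TEXT_NS}" text:style-name="P1">This is what '
    '<text:span text:style-name="T1">emphasized </text:span>'
    '<text:span text:style-name="Source_20_Text">'
    '<text:span text:style-name="Emphasis">code</text:span></text:span>'
    ' looks like.</text:p>'
)


def runs(element, stack=()):
    """Yield (text, styles) pairs, outermost style first."""
    stack = stack + (element.get(STYLE_ATTR),)
    if element.text:
        yield element.text, stack
    for child in element:
        yield from runs(child, stack)
        # Text after a child's closing tag belongs to the parent element.
        if child.tail:
            yield child.tail, stack


for text, styles in runs(ET.fromstring(PARAGRAPH)):
    print(repr(text), styles)
```

The word "code" comes out with the stack P1, Source_20_Text, Emphasis, which matches the cumulative rendering described above, while "emphasized " carries only P1 and the rsid-only T1.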

Remark on implementation: according to the ODF 1.2 spec (part 1, section 19.770), a text:class-names attribute exists for the purpose of applying more than one text style to a node:

A text:class-names attribute specifies a white space separated list of style names. The referenced styles are applied in the order they are contained in the list.

If both text:style-name and text:class-names are present, the style referenced by the text:style-name attribute is applied before the styles referenced by text:class-names attribute. If a conditional style is specified together with a text:class-names attribute, but without a text ...


Comments

1

@CyanCG: Congratulations on the depth of this answer

  1. Character styles are cumulative ("or" in my wording): then how do we remove an attribute, like contour? (I don't take bold since font family may come with several bold values, e.g. Univers). Re: your remark about conditional emphasis.
  2. My usage of character style is close to semantic markup. I experiment afterwards with visual attributes in the style definition until the distinctions are "visible" (simultaneously trying to keep the traditional typographic usages).
  3. Ergonomics: from a user point of view, there should be some highlighting in the style navigator to show which character styles are active in the selection, not the last (?) one only (the StarBasic hack is not a solution). Presently ...
ajlittoz (2013-08-31 18:08:43 +0200)

I agree, and my solution for now is also to apply “Default” and then re-apply the style I really need. If a single attribute is removed with direct formatting (an automatic style in the spec terminology) then it takes precedence over any applied style (be it user-defined or application-defined). That complicates things further.

CyanCG (2013-08-31 23:12:06 +0200)
1

Terrific analysis with which I agree. A few small things (all near the beginning): "the string emphasis afterwards" should read "the string 'emphasized' afterwards"; near the bullet points perhaps "multiple character styles" rather than "two character styles" (twice); rather than "slanted" I think the correct term is "oblique". The commentary about reverting to Default formatting and direct format overrides I also agree with (in despair). I too cannot obtain text:class-names from testing.

oweng (2013-09-01 02:59:05 +0200)

Answer amended. Slanted is also in common usage, but the article on Wikipedia suggests that oblique is indeed the preferred term :-).

CyanCG (2013-09-03 16:59:14 +0200)
0

answered 2019-05-09 14:24:25 +0200

Luke Kendall

From the user's point of view this introduces some massive problems. I like using LibreOffice for writing my books. In most respects it's great. However, one area which causes me big problems is producing the variant editions of a book. To be specific, I have an A-format edition which uses 9pt text for the Chapter Body paragraph style; the same para style uses 10.5pt text for the B-format edition. Similarly for other para styles, like Chapter Heading. However, when I copy the body of the MS from one file into the other (e.g. to create the A-format from the B-format) to create the other edition, it seems a random set of paragraphs fails to take the font size from the para style of the target document. In addition, a small but (to the user) random amount of text is copied but loses the italic style. Coupled with bugs in finding italic text, and bugs in comparing documents, the underlying problem of unexpected changes to the copied text's format (font size, italics), is very difficult and time-consuming to fix. I just thought I'd make a note of the issue here while I now go and look to see if there's a bug report.

I initially came here from https://bugs.documentfoundation.org/s... in the hope of discovering the recipe that would avoid at least the problem of the italic property being lost on text apparently randomly. I assume it's a problem that sometimes appears when using direct formatting. I'm unclear what direct formatting is, but I have picked up hints that it's bad and causes troubles. But I don't know what the preferred method of formatting is that avoids the troubles. Since I always apply emphasis the same way (via selecting the text to be emphasised and then using Ctrl-I), I don't understand why 90%+ of the text keeps the italic attribute, but some text doesn't. I want to be a good user, using paragraph and page and character styles correctly to support my workflows, but I haven't found a solution to this problem yet.


Comments

As pointed out in the bug report and elsewhere, direct formatting is the usual cause of the problem. Direct formatting is any action aimed at changing text attributes without styles, such as keyboard shortcuts (for italics, bold, …) or toolbar buttons (same + lists, …). These seem "natural" because M$ Word does it this way, having no character style.

Direct formatting is "sticky" and "invisible": in the layered styles model, it sits on top and has no hint in the various style panels and menus. It survives copy from area to area or document to document. The only way to get rid of it is to select a wide range of text and Ctrl+M or Format>Clear Direct Formatting.
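The layering can be pictured as a lookup through three stacked dictionaries. This is a toy model for explanation only, not Writer's actual data structure; `effective` is a made-up helper, and the 9pt and 10.5pt values are borrowed from the two editions described in the answer on the assumption that the pasted text carried its size as direct formatting.

```python
from collections import ChainMap


def effective(paragraph: dict, character: dict, direct: dict) -> dict:
    """Resolve attributes; ChainMap returns the first mapping that has a key."""
    return dict(ChainMap(direct, character, paragraph))


# Target document: the paragraph style asks for 9pt (A-format edition).
paragraph_style = {"font-size": "9pt", "font-style": "normal"}
# Character style applied to the run, e.g. Emphasis.
character_style = {"font-style": "italic"}
# Assumed leftover from the source document, invisible in the style panels.
direct_formatting = {"font-size": "10.5pt"}

pasted = effective(paragraph_style, character_style, direct_formatting)
print(pasted["font-size"], pasted["font-style"])

# Ctrl+M removes only the top layer; the styles underneath are untouched.
cleared = effective(paragraph_style, character_style, {})
print(cleared["font-size"], cleared["font-style"])
```

In this model the pasted run keeps 10.5pt until the direct layer is cleared, after which the paragraph style's 9pt shows through and the character style's italic is still there.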

I have a similar need to yours (though not exactly the same). My solution is to avoid direct formatting and exclusively use styles (para, character, page and list). This leans a very ...(more)

ajlittoz (2019-05-09 15:43:37 +0200)

(continued 1) unless you reconfigure LO Writer in depth to transfer the usual Ctrl+I or B to your character styles.

However, you cannot fully forbid direct formatting because some actions have no style equivalent, e.g. resetting list numbering. These events are rare enough in my documents I accept the risk of living with it.

My workflow is to consider that I put a (semantic) markup on the text with styles, i.e. I don't request bold or italics but I mark a sequence as "important" or "outstanding" (Emphasis or Strong emphasis are two candidate built-in styles). Then, afterwards (in fact rather beforehand when I designed my template), I decide whether such marked sequences should display bold or red. My goal is to separate the contents and its semantics from the appearance or presentation. This imposes many constraints but eliminates problems when reviewing or preparing for another output medium ...(more)

ajlittoz (2019-05-09 15:53:51 +0200)

(continued 2) … the possibility to mark a sequence with more than one character style (as can be done in Quark XPress®), e.g. a sequence may be marked up as "comment" and I want to put "emphasis" on a word without losing the "comment" markup. Presently I solve the issue with another character style merging both original ones (not satisfactory).

Similarly, I'd like to be able to negate an attribute: Emphasis is usually coded "bold", but if base style of paragraph style is already "bold", typographic rules say this emphasis should revert to "Roman". Can't be done today in Writer apart from creating a "complementary" style.

Despite these shortcomings, I haven't experienced the random non-updates, probably because I struggle to avoid direct formatting. As I wrote, it is not very user-friendly while typing but it is rewarding on editing.

ajlittoz (2019-05-09 16:02:25 +0200)

That's a very helpful answer, thank you. It's depressing however, as it means there's a severe and ongoing usability problem, as well as a subtle and largely invisible trap for most users. If the bug in being able to find text by attribute were fixed (Find & Replace, using Format, makes F&R unreliable), then a workaround might be to Find All italics, then simply choose one's Character Emphasis style and apply that. Does that sound practical? Or does Writer's "layered attribute" (?) model of text mean that finding text by the visible attribute the user is able to detect, can never be reliable?

Luke Kendall (2019-05-23 15:39:54 +0200)

I use Find & Replace sparingly, probably because of my careful use of styles. I just looked at the F&R dialog to refresh my memory. The Format button opens a font selection dialog. When you choose Italic in there, you're in fact telling LO Writer to look for a font variant. If you applied your italics with direct formatting, i.e. Ctrl+I, toolbar button or menu equivalent, I'm not sure what gets recorded in the XML or internal representation.

Styling with a style involving a real italic font should not create problems and should be relatively reliable. Direct formatting works even for a font without the specific variant (when rendering, the font engine "manufactures" a synthetic italic or bold version of the font). The XML encoding is probably not the same, meaning the search strategy doesn't consider the same "keys" as in the previous case.

The erratic behaviour ...(more)

ajlittoz (2019-05-23 17:28:46 +0200)

(continued) when F&R sees a direct formatting. Then, it does not revert to the initial "pure" strategy and begins to get confused. That is pure speculation on my part.

Maybe the layered style architecture is also a factor. That's why I keep away from direct formatting to get one layer out of the game. Consider direct formatting is OK for experimenting but should never be used for production-quality documents.

ajlittoz (2019-05-23 17:33:04 +0200)

Thanks. When the italic font exists for the regular font you're using, I think it's reasonable to expect that searching for the italic font will find occurrences of text that Writer shows (via text display and the Font toolbar), to be that italic font. I think that expectation is reasonable regardless of whether the italic font text was produced via a style or via direct formatting.

Given other bugs, I believe Writer does not properly "understand" or operate on its own representation. Hopefully these issues can be addressed. I can do my part by providing one or two more bug reports to help.

I understand what you're saying about the problems of direct formatting, caused IMHO by Writer's model and UI and documentation. I doubt I could convince the devs to redesign the model, but hopefully they can fix the bugs in its implementation, and I ...(more)

Luke Kendall (2019-05-24 07:07:05 +0200)