Why are non-English words in a different font?

My paragraph style uses Source Serif Pro. I have cleared any direct formatting (ctrl-M). Everything looks fine in LO Writer. However, when I import the document into Scribus, certain words in German and Polish appear in a different font, Microsoft Sans Serif, which — perhaps not coincidentally — is the specified font in both Default Paragraph and Default Character styles.

The content.xml file shows the foreign words with a different style, <text:span text:style-name="T1">, but as I said, LO displays all document content in Source Serif Pro, the way it should be. Does style-name="T1" refer to the default character style?

What exactly is happening here? Can I avoid this unwanted behavior? Is there some obscure (to me) setting I need to adjust? I do have Polish and German dictionary extensions installed, if that is relevant, and I would like to keep them active if possible.

LO ver. (x64); OS: Windows 10.0 Build 18362.

Thank you!

[edit: Adding LO and OS info. Also, correcting font names to Source Serif Pro and Microsoft Sans Serif. User error.]

The XML is quite difficult to read because the id in style-name="T1" is an indirect reference. You should look at the style dictionary to see which user named style is designated by “T1”.
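For anyone who wants to do this lookup without reading the raw XML, here is a minimal Python sketch. It assumes the indirect ids such as "T1" are automatic styles listed under office:automatic-styles in content.xml, which is where Writer normally records direct formatting. The function names are made up for illustration; only the standard library is used.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF namespaces used in content.xml
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
}


def read_content_xml(odt_path):
    """Return the raw content.xml bytes from an .odt file (an .odt is a zip archive)."""
    with zipfile.ZipFile(odt_path) as archive:
        return archive.read("content.xml")


def automatic_text_styles(content_xml):
    """Map each automatic character style id (T1, T2, ...) to its text properties."""
    root = ET.fromstring(content_xml)
    family_attr = "{%s}family" % NS["style"]
    name_attr = "{%s}name" % NS["style"]
    result = {}
    for style in root.findall("office:automatic-styles/style:style", NS):
        if style.get(family_attr) != "text":
            continue  # skip paragraph, table, ... automatic styles
        props = style.find("style:text-properties", NS)
        attrs = {}
        if props is not None:
            for key, value in props.attrib.items():
                # keys look like "{namespace}language"; keep only the local name
                attrs[key.split("}")[-1]] = value
        result[style.get(name_attr)] = attrs
    return result
```

Calling `automatic_text_styles(read_content_xml("sample.odt"))` (the file name is a placeholder) gives a dictionary in which an entry carrying only language and country attributes points to direct formatting that sets the language and nothing else.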

Attach a sample file (1 page max) with the issue to your question. Attaching a file can only be done in a question (not a comment): use the edit link under the question and the paperclip in the toolbar. Please mention the LO version and OS name.

Never heard of San Serif Pro. Thus, the question: does that font have letters with diacritics that are necessary for German and, especially, Polish? Such as ł, ą, ś etc. If not, change the paragraph style font.

Please attach a file. This is the only way to technically analyse it.

Gabix, I managed to get both font names wrong. It’s Source Serif Pro, and yes it does support all the Central European diacriticals (which is one reason I chose it for this project).

Ajlittoz, see the attached file, Hoff-00-Table Contents.odt. I had a hard time replicating the issue in a fresh, concise document, so I will offer this instead. Still only one page. Also, I’ve added the version info to the original post.

Thanks, both of you!

Using an answer, but this is not a solution; I break the rule because of limited space in a comment.

Your sample file is plagued with direct formatting. Instead of using a custom character style, you tagged words as Polish with Format>Character, forcing the language to {pl}.

  • Direct formatting explains the style-name="T999" in the XML. T999 is used for character style. When you look at the style dictionary, you see that this style id has no name (this is an indication of direct formatting).
  • Language name {pl} is not the correct designation. It should have been chosen from the drop down menu. But it does not seem to make a difference because after correction, the "T2" character style still shows fo:language="pl".

Not all Polish names are tagged with the character style. It does not relate to the problem unless even in this case they look different. Check all your markings in the sample file.
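To check the markings systematically rather than by eye, a short Python sketch can list every text:span in content.xml with the style id it carries. This is only an illustration: `tagged_spans` is a made-up helper name, and it expects the content.xml text already extracted from the .odt (which is an ordinary zip archive).

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def tagged_spans(content_xml):
    """Return (style id, text) pairs for every text:span, in document order."""
    root = ET.fromstring(content_xml)
    span_tag = "{%s}span" % TEXT_NS
    style_attr = "{%s}style-name" % TEXT_NS
    pairs = []
    for span in root.iter(span_tag):
        # itertext() also collects text from elements nested inside the span
        pairs.append((span.get(style_attr), "".join(span.itertext())))
    return pairs
```

Words that should be tagged but do not show up in the returned list are the ones still using the paragraph's own language and font settings.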

My Linux box has no Source Serif Pro, so it is substituted, as is Microsoft Sans Serif. I could find no mention of Microsoft Sans Serif while scanning the text; however, it is set in paragraph style Default Style. This means the font may get in the way if something is not configured correctly in the paragraph styles, because all styles inherit from Default Style if an attribute is not overridden.
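The inheritance rule can be checked mechanically. The sketch below is a simplified illustration, not how Writer itself resolves fonts: it walks the style:parent-style-name chain in styles.xml and reports the first style that sets style:font-name. It assumes the font is given through that attribute only and ignores style:default-style entries; `effective_font` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def _q(tag):
    """Qualify a local name with the ODF style namespace."""
    return "{%s}%s" % (STYLE_NS, tag)


def effective_font(styles_xml, style_name, family="paragraph"):
    """Return (font name, style that sets it), following parent styles upward.

    Returns (None, None) if no style in the chain sets style:font-name.
    """
    root = ET.fromstring(styles_xml)
    styles = {}
    for style in root.iter(_q("style")):
        if style.get(_q("family")) == family:
            styles[style.get(_q("name"))] = style

    seen = set()  # guards against a (malformed) circular parent chain
    current = style_name
    while current in styles and current not in seen:
        seen.add(current)
        style = styles[current]
        props = style.find(_q("text-properties"))
        if props is not None and props.get(_q("font-name")):
            return props.get(_q("font-name")), current
        current = style.get(_q("parent-style-name"))
    return None, None
```

If the style in use for body paragraphs resolves to a font set higher up the chain rather than in the style itself, the inherited font is the one that will show wherever the expected override is missing.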

Not related to the problem at hand: do you know you can generate your TOC automatically if you style your headings with Heading n? Questions on this site provide tricks to have chapters numbered “Chapter 99” and restart numbering with “Appendix 99”, both sequences ending in the same TOC. Anyway, you didn’t configure paragraph indents which would solve some editing issues. Also you inconsistently use line breaks (Shift+Enter) and paragraph breaks (Enter) to handle multi-line headings.

I notice you have a plethora of Body text (99) and Heading # 9 (9). It might be worth thinking over the design of your styles. There is no reason to define one style per paragraph. A style is usually meant for a family of paragraphs belonging to the same semantic field. I guess the various PageStyle9 result from the need to have a different header in different chapters. This can be solved with field insertion, so that a single page style handles several chapters.

There are also several sections. What are they for? Sections are legitimate when you want to temporarily change the number of columns with regard to the current page style. Other than that, they are a nuisance. I see them frequently in documents coming from the M$ Word realm. Sections in Writer are not the same as in Word. Sections in Word are rather close to Writer page styles. Using the same word for different concepts is unfortunate.

I could not find the origin of the problem. When you put the cursor in a Polish word, does the font drop-down menu in the toolbar still display Source Serif Pro?

If you want more assistance, post a screenshot of the difference in font with an extensive description of the phenomenon (because I do not have the fonts you chose).

Yes, to say it is “not a solution” is an understatement. I feel I have been bullied, reading your screed about all the things that are wrong with my sample document that you requested.

The table of contents belongs to a 200-page typewritten translation of a Holocaust survivor’s account of that horrible era. I scanned the document and have been trying to wrestle it into proper format. Abbyy FineReader Pro 12 did a surprisingly good job of OCR, but in the process it applied a lot of formatting to the output. Some was appropriate — the author used italics extensively — and some was random, such as switching to bold or using a different font. I wanted to preserve the italics mainly, so I saved the OCR as an .odt file and have been editing that for style and format.

I have been using LO (and OOo before it) for a decade or so, and I am well acquainted with the use of styles. In fact, I don’t know how we managed in the days BS (Before Styles). continued …

Somehow the second part of my comment disappeared. I’ll just add that I do know how to get LO Writer to make a Table of Contents, but in this case I wanted to preserve the original pagination and format.

I figured out a workaround: change the default styles (paragraph and character) to match the style in use. Or maybe in some parts of the document I can go back to the source and strip all formatting from it, which I probably should have done in the first instance. Thank you for your time.

You should have mentioned that the document came from OCR. This is very important information for the problem at stake. It explains a huge part of the formatting, notably wrong decisions about formatting breaks.

When a document is acquired “mechanically”, manually checking and fixing the result is required. I understand your desire to preserve the original aspect. I think it would be simpler if you chose a monospace font, since typewriter characters all have the same width. But perhaps I misinterpret the word “typewritten” because you say there are italics.

Anyway, you’re right that you have to tune the styles. For that, you must choose between full styles and full direct formatting. The number of styles with similar names and the lack of character styles suggest you’re closer to direct formatting than styling. Styling may require deleting the empty paragraphs that the OCR process added to reproduce vertical spacing.

I would have mentioned the “mechanical” origin of the material had it occurred to me that it might be relevant. Only in retrospect does it now appear obvious to me.
The original material was actually a photocopy of what looks like a typewritten manuscript, using a monospaced Courier-like font. It’s undated but probably ca. 1988. On closer inspection (magnifying glass), it looks like it came out of a 24-pin dot-matrix printer, which explains how it contains italic (actually slanted) portions.