There is a strange line in the Word Index at the end of my document and I can’t get rid of it. The issue is immediately after the yatrakāmāvasāyitva line in the index. I think that it came from having a blank line at the end of my .sdi file. However, getting rid of the blank line in the .sdi file doesn’t get rid of it. I even tried to delete the last non-blank line yatrakāmāvasāyitva;;;;0;0 from the file, deleted the index from the document and re-inserted the index again. When I did this the yatrakāmāvasāyitva;;;;0;0 line was still there, even though it wasn’t in the sdi file anymore. I expected the index to be recreated from the .sdi file but it wasn’t. There must be something that I don’t understand about deleting and re-adding the index. What am I doing wrong? Thank you.
I’ve attached a view of the issue in the index for you to see.
To be specific about the problem, it’s the line starting with “5, 6, …” after the yatrakāmāvasāyitva line.
Can’t reproduce the issue.
How do you manage the SDI file? Do you edit it from Writer or do you update it with an external text editor? I’ve done my experiment from inside Writer, and Writer does a pretty good job of ignoring the “noise” ;;;;;0 lines. The index marks are correctly removed for suppressed words.
Without specific information about your text it is quite difficult to help you.
As usual: OS name, LO version, save format.
PS: a much better look for multi-line entries can be achieved when customising paragraph style Index 1 for right alignment, left indent and negative first line indent.
I created and edited the .sdi file in Notepad. That’s the only way I could get it to work the first time I was adding it months ago. Since it seems to work for you, I must be doing something wrong. I’ve tried following the help to delete and re-add the index but it doesn’t work for me. There was no line with “noise” ;;;;;0 in my file. There was just a blank line at the end of the file at first. Then I deleted the last line so the final character was the zero at the end of the yatrakāmāvasāyitva line. Deleting and re-adding that line completely didn’t cause it to be removed from the index. I must be doing something wrong but I don’t know what.
OS is Windows 11, LO 7.3.1.3 (x64), save format is .odt for the document. Since I edited the .sdi file in Notepad, the format for that is text.
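For anyone who wants to check a concordance file outside Writer, here is a minimal Python sketch that flags blank lines and malformed lines. It assumes the layout described in the LibreOffice help (Search term;Alternative entry;1st key;2nd key;Match case;Word only, with # starting a comment line). The function name check_concordance is made up for illustration and is not part of LibreOffice.

```python
def check_concordance(text):
    """Return (entries, problems) for the text of a concordance (.sdi) file.

    Assumed layout of each line, as described in the LibreOffice help:
    Search term;Alternative entry;1st key;2nd key;Match case;Word only
    """
    entries, problems = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("#"):
            continue  # comment line, ignored
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        fields = line.split(";")
        if len(fields) != 6:
            problems.append((number, f"expected 6 fields, found {len(fields)}"))
            continue
        if not fields[0].strip():
            problems.append((number, "empty search term"))
            continue
        entries.append(fields[0])
    return entries, problems


# Two valid entries followed by a trailing blank line, as in the question.
sample = "yama;;;;0;0\nyatrakāmāvasāyitva;;;;0;0\n\n"
entries, problems = check_concordance(sample)
print(entries)
print(problems)
```

This only inspects the file. It says nothing about index marks that are already stored in the document.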
Here are the instructions that I followed to delete and re-add the index.
Click in your document where you want to insert the index.
Choose Insert - Table of Contents and Index - Table of Contents, Index or Bibliography.
On the Type tab, select "Alphabetical Index" in the Type box.
If you want to use a concordance file, select Concordance file in the Options area, click the File button, and then locate an existing file or create a new concordance file.
Set the formatting options for the index, either on the current tab, or on any of the other tabs of this dialog. For example, if you want to use single letter headings in your index, click the Entries tab, and then select Alphabetical delimiter. To change the formatting of levels in the index, click the Styles tab.
Click OK.
To update the index, right-click in the index, and then choose Update Index or Table of Contents.
As to the formatting of the entries, I hadn’t noticed that they were ugly since I was only interested in fixing up the entries.
Since the index shows “page 188”, your document is likely to be huge, so it is no use attaching it. But I feel you may have some stray “real” index entry somewhere in your document. What is really strange is the fact that pages 129 and 185 are listed before pp. 5, 6, 8, … So either you have an invisible item sorting between yatra… and agamah with occurrences between pp. 5 and 180, or there is a contorted bug. My first guess would be some bad data in your doc.
Can you prepare a sample keeping only pp. 1-10, 129 and 185? See if the same issue occurs. If positive, attach this sample.
I’ll try that, but did you notice that there’s no comma after the 129, 185? It looks to me like the line starting with 5,6 is for a different ‘word’. The yatra word only appears twice in the document, on pages 129 and 185.
Does this make a difference in your diagnosis?
I’ve also noticed in the past that deleting a line from the .sdi file doesn’t remove that word from the index, but adding words works for me. I just remembered that.
I tried deleting all the pages except 1-10, 129, and 185. When I try to update the index, or delete and re-add it, Writer crashes. I tried it a few times. I sent in a crash report.
The index markings that you create from the .sdi file, aren’t removed when you delete the index. You will have to delete them manually or with a macro. You may want to scan the pages referenced by a blank index item for index markings. Make sure to enable field shadings and field names when you do that.
No. Your screenshot is clipped too close to the strict printing area.
Then you do have an index entry with a non-printable character which sorts between “y” U+0079 and “a”+macron U+0101 (as a precomposed glyph). My guess would be an index entry on NO-BREAK SPACE U+00A0.
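The code point argument can be checked with a short Python sketch. It only compares raw Unicode code points. Writer presumably sorts index entries with locale-aware collation, so this illustrates the reasoning and does not reproduce Writer's sort. The sample strings are placeholders.

```python
import unicodedata

# "\u0101" is the precomposed LATIN SMALL LETTER A WITH MACRON.
candidates = ["\u0101gamah", "\u00a0", "yatra"]

# Python compares strings by code point, so sorted() orders by first character here.
ordered = sorted(candidates)

for entry in ordered:
    first = entry[0]
    print(f"U+{ord(first):04X} {unicodedata.name(first)}")
```

In code point order the no-break space (U+00A0) lands after “y” (U+0079) and before “ā” (U+0101), which matches where the unexplained line appears.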
Try to limit the sample to pages 1-10.
I just checked and my screenshot does have the entire line. The dark background is the entire screen width. If you notice, on the yama line (three before the yatra line) there is a comma at the end. There is also one at the end of the line after the yatra line. But I agree that there is probably a screwy index entry between y and the a with a line over it.
You said that in your testing, the index marks were removed for lines that were deleted from the .sdi file. The index marks are not a concern for me. I have two problems, 1) when I delete a line from the .sdi file, that word still shows up in the index and I don’t think that it should. How did you get them to not show up in the index? I don’t really care if the removed words are still highlighted in the document, but I don’t want them in the index.
And 2) the entry between the yatra and agamah. Where did that come from? I don’t see it in the .sdi file, which I’ve uploaded. So where did it come from? It seems to me that the only things that should show up in the index are words in the .sdi file. I’ve attached the .sdi file. I had to add a .odt extension to upload it.
Word Index.sdi.odt (20.6 KB)
I will try the 10 page test and let you know.
I managed to get the 10 page test to work, and it shows an unknown line in the index in the same place as before. I had to go back to the original .sdi file without the empty line at the end to get it to work. I’ve uploaded both the 10 page file and the .sdi file that doesn’t cause a crash.
a.odt (39.0 KB)
Word Index.sdi.odt (2.4 KB)
This time your SDI file is correctly reported as UTF-8 text.
I notice the spurious entry (void key) points to page 3 in the sample, i.e. inside the first page of the index. However, I was misled by your clumsy tweaking of the page number format. Apparently, you changed the format in the footer field, forcing Roman numerals. This changes only the display of the page number, not its “type” which is defined in the page style.
Here, your page style requests standard numerals, and index references are captured according to the format defined in the page style, not how the number is locally displayed. Consequently, the faulty index entry should be looked for on page iii!
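To map a page number shown in the index (Arabic, from the page style) to the one displayed in the footer (Roman, from the field format), a small conversion helper is enough. This is a generic sketch and not anything Writer exposes. The name to_roman is hypothetical.

```python
def to_roman(number, lower=True):
    """Convert a positive integer to a Roman numeral string."""
    if number <= 0:
        raise ValueError("Roman numerals need a positive integer")
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    result = []
    for value, symbol in pairs:
        # Take each symbol as many times as it fits into what is left.
        count, number = divmod(number, value)
        result.append(symbol * count)
    text = "".join(result)
    return text.lower() if lower else text


# The index reports page 3, while the footer of that page displays this:
print(to_roman(3))
```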
I edit SDI files inside Writer:
- right-click in the index and choose Edit index
- under Concordance file, select Edit from the drop-down menu
- a window pops up with all the SDI fields in tabular form
However, it looks like you have “damaged” your SDI file (probably with Notepad) because the file command describes it as an OpenDocument Text. If I double-click on it, it opens in LibreOffice Writer while it should open in a text editor (KWrite here).
An SDI is a plain text file (“ASCII” or rather Unicode file). Yours is a full-fledged Writer document with all the meta-data for formatting. This is confirmed by the procedure I outlined above where all keys are gibberish Klingon as if there were an encoding error.
To fix your problems, save your SDI file as plain text. Then load it in Writer as I explained above.
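Where the file command is not available (for example on Windows), a rough equivalent check can be sketched in Python. It only tells a ZIP container (which is what an .odt really is) from a file that decodes as UTF-8. It does not inspect the ODF mimetype the way file does. The function name describe_file and the demo file names are assumptions for illustration.

```python
import os
import tempfile
import zipfile


def describe_file(path):
    """Roughly classify a file as a ZIP container, UTF-8 text, or neither."""
    if zipfile.is_zipfile(path):
        return "zip container (for example an ODF document)"
    with open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not UTF-8 text"
    return "UTF-8 text"


# Demo with two throwaway files: a real text file and a fake "document".
with tempfile.TemporaryDirectory() as folder:
    text_path = os.path.join(folder, "Word Index.sdi")
    with open(text_path, "w", encoding="utf-8") as handle:
        handle.write("yatrakāmāvasāyitva;;;;0;0\n")

    zip_path = os.path.join(folder, "Word Index.sdi.odt")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("content.xml", "<placeholder/>")

    text_kind = describe_file(text_path)
    zip_kind = describe_file(zip_path)

print(text_kind)
print(zip_kind)
```

A plain text file that was merely renamed to .odt would still be reported as UTF-8 text, since the check looks at the content and not the extension.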
EDIT
I think I found something interesting in your document. The problem doesn’t seem to be with the SDI file (after all there is always an empty “line” at end of the SDI editing window).
I had a look at the XML encoding of the document and discovered that all your index entries are faulty! This can be shown by unticking the Combine identical entries box in the Index configuration dialog. Then your keys are listed 14 times for every occurrence in text. This is apparent in the XML.
Deleting the corresponding key in the SDI has no effect because the index marks are already present as explicitly added (when? by whom?). The “void” entry is attached to “Samskrtam” in addition to the word itself.
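To count the index marks stored in a document without opening it, one option is to read content.xml straight from the .odt archive. The sketch below assumes the ODF element names text:alphabetical-index-mark and text:alphabetical-index-mark-start. The sample XML is invented for the demo, and which form Writer writes for concordance-generated marks is not confirmed here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def count_index_marks(odt_file):
    """Count alphabetical index marks found in content.xml of an .odt file."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    wanted = {
        f"{{{TEXT_NS}}}alphabetical-index-mark",
        f"{{{TEXT_NS}}}alphabetical-index-mark-start",
    }
    counts = Counter()
    for element in root.iter():
        if element.tag in wanted:
            # Point marks carry their key in text:string-value; range marks do not.
            key = element.get(f"{{{TEXT_NS}}}string-value", "(range mark)")
            counts[key] += 1
    return counts


# Invented miniature document, built in memory for the demo.
sample_xml = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
    "<text:p>"
    '<text:alphabetical-index-mark text:string-value="yama"/>'
    '<text:alphabetical-index-mark text:string-value="yama"/>'
    '<text:alphabetical-index-mark-start text:id="IMark1"/>yatra'
    '<text:alphabetical-index-mark-end text:id="IMark1"/>'
    "</text:p></office:document-content>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("content.xml", sample_xml)
buffer.seek(0)

counts = count_index_marks(buffer)
print(counts)
```

For a real document, pass the path of the .odt instead of the in-memory buffer. Running it on a copy is the safer choice.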
Since you are using a concordance file, you have no need for the explicit index tags. Delete them all. Unfortunately, there is no command to delete all entries at once. You must right-click on a shaded entry, choose Index Entry, and proceed one by one. Good luck!
Note 1: right-click access to Index Entry works only if there is no higher-priority warning attached to the word, like a spelling warning. So, first select a tagged word and choose Tools > Language > Language for selection > None.
Note 2:
- For a 10+ pages document, strict styling is mandatory.
- You seem to ignore character styles and what they can bring you. Instead you apply direct formatting to change attributes (e.g. About the Author, where “the Author” is in a different font face).
- You space vertically your text with empty paragraphs instead of integrating this space in the paragraph style. You erroneously use Index Heading for chapter-like headings while this style is used to format TOC, index, tables of … headings.
- I guess you’re American because you type two spaces after the end of sentences. This habit dates back to the mechanical typewriter era and is no longer recommended with computer office suites, though it is still taught in the US. The suites do justification for you, and typing several spaces in a row will have an adverse effect on formatting. You’re lucky that your text does not contain “pathological” sentences which would dramatically show this catastrophic effect. If you prefer a wider space at the end of a sentence, you should enable pair kerning between dot and space. However, configuring pair kerning seems only possible for certain types of fonts (which ones?). Unless you have a strong incentive, avoid double spaces.
I am using a plain text file that I edit in Notepad. I had to add the .odt extension to the file that I uploaded so that the upload feature works. When I open it in Notepad it looks just fine. When I edit it as you mentioned, it also looks fine. If you rename the file to get rid of the .odt it works just fine; at least it does for me.
When I use the instructions you gave to edit the file, it looks fine and I don’t see anything weird in it. Every line looks fine. There is, however, a blank line at the end of the list of words in the edit dialog box. Is this normal? I’ve uploaded a screenshot of what I’m seeing. So where is the weird entry after the yatra line coming from? I don’t see any index highlights for the weird entry in the 10 page sample that I sent earlier.
I’ll be unavailable until this evening but I’ll try to look at your response then.
Thanks for your response. I got home earlier than I expected. I do have some questions on your response though.
1) I have absolutely no idea how the index tags were explicitly added: I only used a concordance file for this. How do I know that I’ve removed 13 of the 14 instances of every one? It’s unclear which of your instructions fixes this and how it fixes it. What is an index mark and how do I know they’re there? I tried unticking Combine identical entries and didn’t see any change in the Word Index at the end. How do I tell if I’ve missed some? I’m really confused here.
1.5) Do I need to delete the Word Index at the end?
2) By “You must right-click on a shaded entry and Index Entry and proceed one by one”, do you mean that I have to go through all 200 pages of the document, right-click on every highlighted index word, set the language to None via Tools…, then click on Index Entry, for every instance of every index word in the whole document? What does clicking on Index Entry do? What should I see when I do it?
3) What do you mean by “So, first select a tagged word and Tools > Language > Language for selection > None”? What does this do? What should I see? Is this to get rid of the red spelling warning line below the Samskrtam words?
3.5) I saw something about creating my own dictionary. Is that easy and could I just put all my Samskrtam words in it to remove the spelling warning?
4) I missed the character font for the Heading 1 style paragraphs that I used. I just fixed that.
5) On improperly using Heading 1, I am using it to populate the TOC. That was the only way I could figure out how to get the two levels in the TOC. What should I use instead? How should I have known that I shouldn’t use Heading 1?
6) I used paragraph styles and tweaked the space before and after to give the spacing. At least I thought I did. There are a few instances where it was easier to just add a blank line between paragraphs, like on the title page. Everywhere else I used a style. I actually thought I was doing a good job of using styles. Apparently not.
7) As to the numbering at the bottom of the pages before the TOC, I spent several hours trying to get the Roman numerals to show up instead of Arabic. After the TOC the page numbers are in Arabic, which is what I wanted. What I did was the only way that I could get it to do what I wanted. How do I do this the ‘right’ way?
8) I am 64 years old, and an American, so I was taught to use two spaces. I never heard that I shouldn’t do that. But that’s easy enough to fix. It will only take a minute or two.
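The double-space cleanup can be expressed as a single regular expression. Here is a minimal Python sketch of the idea, assuming only runs of spaces directly after a full stop, exclamation mark or question mark should be collapsed. Something similar should be possible from Writer's Find & Replace dialog with regular expressions enabled, though that is not tested here.

```python
import re


def collapse_sentence_spaces(text):
    """Replace two or more spaces after '.', '!' or '?' with a single space."""
    return re.sub(r"(?<=[.!?]) {2,}", " ", text)


before = "First sentence.  Second sentence!   Third?  Done."
after = collapse_sentence_spaces(before)
print(after)
```

Runs of spaces elsewhere (for example inside manually aligned text) are left alone on purpose, so they can be reviewed separately.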
When you build an index from a concordance file, Writer inserts index fields in the document for each item it finds in the document. When you remove items from the concordance file later on, the inserted items are still there. You can enable display of fields in your documents (Field Shadings and Field Names in the View menu). It’s sort of mandatory to display non-printing characters and field shadings while you are editing any text, so you know what you are doing. With those turned on, your index markings will be visible as gray rectangles. When you right-click on any of them, the pop-up menu will have an item Index Entry. If you don’t want that item, delete it.
Many questions to answer!
- This will become clearer, I hope, after my answers. Don’t delete the Word Index for the time being.
To have a better view of what is really in your document, enable View > Formatting Marks and all boundary indicators. Also enable View > Field Shadings (but not Field Names!). The latter will put a gray background where you have metadata associated with words, like the presence of an index entry/mark. These visual clues don’t print, so you can keep them always enabled.
For your 2), you can now see where you have index entries/marks. Set only the first one to language “None”. Now you can Edit index, which opens an “index navigator”. You can delete or edit the index entry. In your case press Delete. When you have deleted the last entry at the present position, the cursor will scroll to the next entry. In case you want to skip some entry, press the button with a black triangle pointing to the right (next entry) or left (previous entry).
As I already mentioned, there is unfortunately no simple command to delete all entries. You must do it one after the other.
- By setting the language to None for the first index entry, you dismiss the red wavy line for the spelling error. This simultaneously unhides Index entry in the right-click menu.
I have no experience with private dictionaries. But I am familiar with multi-lingual documents. When I initially write such a document, I am very careful to style paragraphs and non-local words (belonging to a language different from the paragraph language) with adequate styles declaring the language, even if I haven’t installed the corresponding dictionaries. As usual, it is easier on initial writing than on later revisions.
- OK
- Heading n is the standard way to populate the TOC, where n is the hierarchical level of the heading. I didn’t say you erroneously used Heading 1. Your error is the Index Heading usage, where probably Heading 1 or perhaps Heading 2 would have been a good choice. In addition, Tools > Chapter Numbering allows automatic heading numbering, while I suspect you have numbered your chapters manually. Your appendices are alphabetically numbered. You can’t mix Arabic and alphabetical numbering with Heading n. Achieving that is advanced usage of styles where you create a parallel hierarchy for your appendices (similar to Heading n styles).
- Space above and below are effectively configured in paragraph styles. But if you “tweaked” the spacing with Format > Paragraph, this doesn’t modify the style, but adds over it a direct-formatting layer which takes precedence over the style settings. This can be tolerated when you know what you’re doing and accept the consequences. But usually, in the end, direct formatting plays nasty tricks behind your back and leads you to a “formatting nightmare”.
- Writer allows very smart tricks by decoupling definition from display. This is very nice but full of traps and you fell into one. The page definition is in the page style, Page tab. You find there a drop-down menu to select the page numbering system.
You can insert the page number in the footer as field Page Number. If you do nothing more, you get the page number as it was defined. For your convenience, you can alter the field format, but it acts only on the field display. This is why the page number in the index was not a Roman numeral: the definition in your page style was Arabic numerals.
Fix your page style.
- Yes, this is part of US culture. It is so “natural” and rooted in everyday life there that nearly nobody questions it. It was a good custom when you had fixed-pitch tools like mechanical typewriters or dumb terminals with only upper-case characters, to highlight the end of a sentence and allow the reader to spot immediately the start of the next one. But nowadays, fonts are proportional and spaces may expand or shrink for justification. There are other ways to see the start of a sentence, like capitalisation. I saw degenerate cases where multiple spaces resulted in weird layout.
By the way, I’m older than you.
As always, thank you for your help. I’ll look at doing what you suggested later tonight. I do have another question though. I didn’t know about the 14 times for every occurrence since it doesn’t show up on the screen, or when printing to a printer, or a PDF. So my question is, what harm is this doing? It will take days to go through the document and get rid of them.
And the rest of the document, especially after the TOC, is all done with styles. I think you had pointed that out to me a while ago and I spent several hours/days fixing things. The title page isn’t done so I’m not too worried about that right now.