xml:id usage in ODF files

I’m investigating the use of the attribute xml:id in ODT files (*).

Grepping some samples, I see:

<text:p xml:id="id3893364801" text:style-name="New_20_Chapter"><draw:frame draw:style-name="fr3" draw:name="LibreOffice Logo" text:anchor-type="as-char" svg:width="12.959cm" svg:height="3.318cm" draw:z-index="2"><draw:image draw:mime-type="image/png">
<text:p xml:id="id3179432551" text:style-name="Guide_20_Name"><text:user-defined style:data-style-name="N0" text:name="Guide Name">Calc Guide</text:user-defined><text:s/><text:user-defined style:data-style-name="N0" text:name="LibreOffice Version">7.6</text:user-defined></text:p>
 <text:list xml:id="list475153334" text:style-name="Heading_20_Note">
<text:list xml:id="list3093533769" text:style-name="List_20_1">
<text:list xml:id="list211419328837578" text:continue-list="list475153334" text:style-name="Heading_20_Note">
<text:list xml:id="list211418814938759" text:continue-list="list3093533769" text:style-name="List_20_1">
<text:list xml:id="list211418180261711" text:continue-list="list211419328837578" text:style-name="Heading_20_Note">
<text:p xml:id="id3183077804" text:style-name="Text_20_body">Release Notes are here: <text:a xlink:type="simple" xlink:href="https://wiki.documentfoundation.org/ReleaseNotes/7.6" office:name="https://wiki.documentfoundation.org/ReleaseNotes/7.6" text:style-name="Internet_20_link" text:visited-style-name="Visited_20_Internet_20_Link">https://wiki.documentfoundation.org/ReleaseNotes/7.6</text:a>.</text:p>

I see xml:id applied to lists, which is OK.

But xml:id is also used on some text:p elements, though not on every text:p in the file.
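
To go beyond grepping, the observation above can be quantified with a small script. This is a minimal sketch, assuming the document is a regular ODT package whose body lives in content.xml; the function names paragraph_ids and paragraph_ids_in_odt are hypothetical helpers, not part of any LibreOffice tooling.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of text:p as declared in ODF files, and the predefined xml: namespace.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def paragraph_ids(content_xml):
    """Return (ids, missing): the xml:id values found on text:p elements,
    in document order, and the number of text:p elements without one."""
    root = ET.fromstring(content_xml)
    ids, missing = [], 0
    for paragraph in root.iter("{%s}p" % TEXT_NS):
        value = paragraph.get(XML_ID)
        if value is None:
            missing += 1
        else:
            ids.append(value)
    return ids, missing


def paragraph_ids_in_odt(path):
    """Same as paragraph_ids, reading content.xml out of an ODT package."""
    with zipfile.ZipFile(path) as odt:
        return paragraph_ids(odt.read("content.xml"))
```

Calling paragraph_ids_in_odt on a sample file gives the list of paragraph ids plus a count of paragraphs that carry none.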

Question 1: What is the current rule for using xml:id on text:p?

Question 2: How can I read and set xml:id for text:p in a macro fragment?

Thank you

(*) Open Document Format for Office Applications (OpenDocument) Version 1.3. Part 3: OpenDocument Schema

(**) LibreOffice Developer's Guide: Chapter 6 - Office Development - The Document Foundation Wiki

What do you want to do with the xml:id in a macro? Are you aware that xml:id values need not be stable over the document lifetime? The only sure thing is that they are unique in the document.

Hi Regina

If I can ensure uniqueness for each translatable piece of ODT (ODF) content in the file, I believe I’m halfway to improving the use of ODF for translation jobs with computer-aided translation (CAT) tools such as Weblate.

Stability is important: once set, the ID has to stay fixed unless it is explicitly changed. If a save-close-open cycle cannot guarantee the same ID, I’m back to square one.
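
One way to check this in practice is to compare the ids before and after such a cycle. The following is only a sketch: it assumes the two content.xml streams have already been extracted from the ODT packages (for instance with Python's zipfile module), and ids_by_element and compare_ids are hypothetical names.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def ids_by_element(content_xml):
    """Map every xml:id value in the stream to the tag of its element."""
    root = ET.fromstring(content_xml)
    return {
        element.get(XML_ID): element.tag
        for element in root.iter()
        if element.get(XML_ID) is not None
    }


def compare_ids(before_xml, after_xml):
    """Report which xml:id values survived, vanished or appeared."""
    before = ids_by_element(before_xml)
    after = ids_by_element(after_xml)
    return {
        "kept": sorted(before.keys() & after.keys()),
        "lost": sorted(before.keys() - after.keys()),
        "new": sorted(after.keys() - before.keys()),
    }
```

An empty "lost" list after a save-close-open cycle would suggest the ids were roundtripped for that document, though it proves nothing about other documents or versions.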

Our experience is that content segmentation by the current extraction tools underperforms. I’m checking whether we can have more confidence in string extraction by ensuring a fixed earmark on each string.

(We have that in the current LibreOffice Help.)

Cheers
Olivier

There has been an attempt in the ODF TC to make it stable, but the corresponding issue was closed without result.
https://issues.oasis-open.org/browse/OFFICE-3788

Although the standard does not require the xml:id value to be stable over the document lifetime, LibreOffice might keep it stable, but I don’t know. I would ask Michael Stahl about that.


There are 3 kinds of id attributes that LO writes into ODF files:

  1. legacy ids, generated:
    For example on text:list; a new number is generated every time you export the document, so these are useless for your purpose.
    A few of these are xml:id but mostly they are in other namespaces like text:id or draw:id…

  2. legacy ids, persistent:
    For these a value is generated the first time, then it will be roundtripped.
    But there is a problem: there isn’t a check that the same value isn’t in use elsewhere in the document…
    A few of these are xml:id but mostly they are in other namespaces like text:id or draw:id…

  3. xml:ids for RDF:
    For these, a value is generated by calling an API function, and then it will be roundtripped.
    The value is guaranteed to be unique in the document, at least among other RDF xml:ids (checking against legacy ids isn’t possible).
    The list of such ids is given in the DevGuide (i.e. if it’s not listed here it’s one of the legacy ids):
    LibreOffice Developer's Guide: Chapter 6 - Office Development - The Document Foundation Wiki

To create an id for such an element (3.), get the UNO service of the element, query it for com.sun.star.rdf.XMetadatable (needed only when using static types), then call ensureMetadataReference(). To get the value, call getStringValue() or getLocalName().
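
As a rough illustration of those steps in a Python macro: the sketch below assumes doc is a Writer document (for example from XSCRIPTCONTEXT.getDocument()), that body paragraphs implement XMetadatable, and that the MetadataReference attribute of XMetadatable is a StringPair holding the stream name and the xml:id. The function name ensure_paragraph_ids is hypothetical. In Python no interface query is needed.

```python
def ensure_paragraph_ids(doc):
    """Make sure each top-level paragraph has an RDF xml:id and return
    a list of (stream name, xml:id) pairs in document order."""
    ids = []
    elements = doc.Text.createEnumeration()
    while elements.hasMoreElements():
        element = elements.nextElement()
        # The body text also enumerates tables; only paragraphs are handled.
        if not element.supportsService("com.sun.star.text.Paragraph"):
            continue
        # Creates an id only if the paragraph does not have one yet.
        element.ensureMetadataReference()
        # Assumed layout: First = stream (e.g. content.xml), Second = xml:id.
        reference = element.MetadataReference
        ids.append((reference.First, reference.Second))
    return ids
```

This only walks the paragraphs directly in the document body; paragraphs inside tables, frames or sections would need their own enumeration. Whether the generated values suit a translation workflow still depends on the roundtrip behaviour described above.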
