[Solved] XML parsing in LO Basic : unable to get node values

Hi all,
I’m migrating some Excel workbooks with VBA to LO Base and Basic.
One of the functionalities I need is to load an XML file and extract node values from it (e.g. for inserting/updating some table rows).
I’ve been googling around and reading several sources (LO Documentation, LibreOffice 24.2 SDK API Reference, the Pitonyak docs …) and managed to build some code that can load an XML file and navigate the XML DOM tree …

However, when I try to obtain a node’s value using .getNodeValue(), I get an empty string (whereas .getLocalName() does return the node’s name …).
By contrast, when a node has attributes in its opening tag, I do manage to get the attribute value using the same .getNodeValue() on the node representing the attribute …

I have compiled a sample odb & xml file demonstrating the issue :
XMLhandling.zip (4.8 KB)
Just unzip the file (keep odb and xml files in the same folder), open the odb and run the ParseXMLfile macro from the Basic IDE …

And last but not least : running LO 24.2.7.2 on Linux Mint 22 …

Any assistance pointing me in the right direction would be greatly appreciated !

Found it !!!

Apparently, in the DOM tree built by com.sun.star.xml.dom.DocumentBuilder, for a ‘simple’ node like <tag>somevalue</tag>, the value is not stored on the element node itself, but in a child text node !!!
This post put me on the right track : https://stackoverflow.com/questions/12413450/domdocument-getnodevalue-returns-null-contains-an-output-escaped-string

So for oNode containing <tag>somevalue</tag> :

  • oNode.getNodeValue() returns an empty string in Basic (the DOM defines an element node's value as null)
  • oNode.getFirstChild().getNodeValue() returns somevalue
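
This is standard DOM behaviour rather than a LibreOffice quirk, so it can be reproduced outside LO as well. Below is a minimal sketch using Python's xml.dom.minidom (not the com.sun.star.xml.dom API); the XML snippet, the attribute name and the text_of helper are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical snippet: one element with an attribute and a text value.
doc = minidom.parseString('<root><tag attr="x">somevalue</tag></root>')
node = doc.getElementsByTagName("tag")[0]

# The element node itself carries no value ...
element_value = node.nodeValue
# ... the text lives in a child text node.
child_value = node.firstChild.nodeValue
# Attribute nodes, in contrast, do carry their value directly.
attr_value = node.getAttributeNode("attr").nodeValue


def text_of(element):
    """Join the data of all direct text/CDATA children of an element."""
    wanted = (minidom.Node.TEXT_NODE, minidom.Node.CDATA_SECTION_NODE)
    return "".join(c.data for c in element.childNodes if c.nodeType in wanted)


print(repr(element_value), child_value, attr_value, text_of(node))
```

Joining all text children is a bit more robust than taking only the first child, since the first child may be a comment or another element when the content is mixed.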

For those interested, here is my working sample:
XMLhandling_WORKING.zip (4.9 KB)

With Python it's trivial.

import xml.etree.ElementTree as ET

def test_xml():
    path_xml = '/home/elmau/Downloads/XMLhandling/Sample.xml'
    tree = ET.parse(path_xml)
    root = tree.getroot()

    version = root.find('version')

    print(version.tag, version.text)

    for node in root.find('export'):
        print(node.tag, node.text)

elmau@oficina ~> soffice --calc
version V4.0
environment TST
system Library Management
date 2025-01-08 10:25:51
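
For anyone who wants to try this without the attached files, here is a self-contained sketch of the same idea. The XML layout below is only guessed from the code and output above (the root element name in particular is a placeholder), and read_values is a hypothetical helper, not part of any library. It also tolerates missing nodes instead of raising an error.

```python
import xml.etree.ElementTree as ET

# Assumed layout, reconstructed from the output above; the real Sample.xml may differ.
SAMPLE = """<root>
  <version>V4.0</version>
  <export>
    <environment>TST</environment>
    <system>Library Management</system>
    <date>2025-01-08 10:25:51</date>
  </export>
</root>"""


def read_values(xml_text):
    """Return the version and the children of <export> as a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    # findtext returns the default when the node does not exist.
    values = {"version": root.findtext("version", default="")}
    export = root.find("export")
    if export is not None:
        for node in export:
            # node.text is None for empty elements, so fall back to "".
            values[node.tag] = (node.text or "").strip()
    return values


values = read_values(SAMPLE)
for tag, text in values.items():
    print(tag, text)
```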

Thank you for your contribution, elmau.

Unfortunately, getting the data from XML is only a small part of the functionality to migrate. The rest involves Base forms & reports, Base - Calc exchanges, combining Base data & Writer templates into PDFs etc… All of this I intend to ‘glue together’ using LO Basic …
So I’m not looking forward to integrating Basic & Python into the same odb …
Also, my final goal is to hand this odb over to a LO ‘novice’, so I would prefer to stick to Basic for the macros …