Read HTML Data from Clipboard

ScottCLO · February 16, 2023, 11:09pm

Hello. I’m trying to find a way to read data from inside divs from HTML data in my clipboard. The divs have id attributes like

<div id="name.1">Bill</div>
<div id="name.2">Bob</div>

Is there a way to get the string “Bill” from this data with the UNO API?
Something like…

html = oClip.GetContents
myName = html.getElementById("name.1").innerHTML

Wanderer · February 17, 2023, 7:04am

“Is there a way” can often be answered with “yes”: import it as text (with a Python macro), then feed the HTML through one of the available HTML parsers (BeautifulSoup is well known, but quite big) and ask the parser for your div.

On the other hand: when LibreOffice pastes from the clipboard, the contents are integrated into the “DOM” of the module (Writer, Calc…) and the document you are currently using, so I guess your <div> is no longer available after import.

To import HTML there has to be some “understanding” of HTML in the source code, but I expect that to be in the import filters, not in the UNO API…
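
To make the parser suggestion a bit more concrete, here is a minimal sketch that uses only Python's standard library (html.parser) instead of BeautifulSoup. It assumes you already have the clipboard HTML as a plain string; fetching it from the clipboard is not shown. The names DivTextExtractor and div_text are made up for this example.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target div; 0 means outside
        self.found = False  # True once the target div has been opened
        self.done = False   # True once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag != "div":
            return
        if self.depth > 0:
            # A nested div inside the target: track it so we stop at the right </div>.
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1
            self.found = True

    def handle_endtag(self, tag):
        if self.depth > 0 and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def div_text(html, target_id):
    """Return the text of the div with the given id, or None if it is absent."""
    parser = DivTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() if parser.found else None


sample = '<div id="name.1">Bill</div>\n<div id="name.2">Bob</div>'
my_name = div_text(sample, "name.1")
```

With the two divs from the question, div_text(sample, "name.1") gives the text of the first div. For messy real-world HTML, a full parser such as BeautifulSoup is likely to be more forgiving.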

ScottCLO · February 17, 2023, 2:27pm

I didn’t know you could use Python for macros. I’ll have to look into that.