How to fetch data from a website?

Hey!

I’m searching for a way to fetch some data from a website using Calc.

I was using Google Sheets with the IMPORTXML function before, but because it’s very unreliable I want to use Calc instead.

My functions there looked like this:

=IMPORTXML(E2; “//h3[@class=‘product-name’]”)

=IMPORTXML(E2; “//span[@class=‘price’]”)

As you can already guess, the URL was in E2 (f.i. http://www.killis.at/gin/monkey-47-gin-distiller-s-cut-2016-0-5-lt.html).

After reading the Calc help and some related topics, I suppose, that FILTERXML could help in this case, but I’ve never managed to get it to work.

So what would be the equivalent for this function in Calc and how would the appropriate command for the product name and price look like?

Many thanks in advance!

Use =WEBSERVICE(E2) to obtain the data, for example in E3, and =FILTERXML(E3;expression) to apply an XPath expression. The example you gave should probably work, but note that the data must be XML, not some arbitrary HTML website which it is in your example URL.

I’ve tried with =filterxml(webservice(E2),"//h3[@class='product-name']"), but I’m getting an error code 501 (Invalid character). And with =filterxml(webservice(E2);"//h3[@class='product-name']") I get #NAME?. I would prefer to have the function in one line, if that’s possible.

Whether you can use a , comma as parameter separator depends on your settings, this is why we prefer to use ; semicolon in examples because it works in any locale. If the result for you is #NAME? then probably you are either using a very old version that doesn’t know WEBSERVICE and FILTERXML yet, it’s in since 4.2 already, or a localized UI with different names. Literally copying your example works for me (but just gives #VALUE! error because as said the URL does not deliver XML).

My LO’s version is 6.0.4.2 with German locale. I use English function names with “;” as a separator.
Following your recommendation, I’ve split this function into two, but now I’m getting an error code 507. I’ve captured the whole process as a gif here. Would you mind to have a look on this, since it’s a mystery to me at the moment?

The second FILTERXML parameter is expected to be a string, your input was =FILTERXML(A2;//h3[@class='product-name']) instead of

=FILTERXML(A2;"//h3[@class='product-name']")

If I put //h3[@class='product-name'] in double quotes, I get #VALUE! as a result :frowning:
This is how it looks: =FILTERXML(A2;"//H3[@class='product-name']")

I already wrote that above.

Thanks to a kind soul I’ve found the answer here.