We will be migrating from Ask to Discourse on the first week of August, read the details here

Ask Your Question

How to fetch data from a website?

asked 2018-05-15 14:22:54 +0200

DMT gravatar image

updated 2018-05-15 14:38:06 +0200


I'm searching for a way to fetch some data from a website using Calc.

I was using Google Sheets with the IMPORTXML function before, but because it's very unreliable I want to use Calc instead.

My functions there looked like this:

=IMPORTXML(E2; "//h3[@class='product-name']")

=IMPORTXML(E2; "//span[@class='price']")

As you can already guess, the URL was in E2 (f.i. http://www.killis.at/gin/monkey-47-gi...).

After reading the Calc help and some related topics, I suppose, that FILTERXML could help in this case, but I've never managed to get it to work.

So what would be the equivalent for this function in Calc and how would the appropriate command for the product name and price look like?

Many thanks in advance!

edit retag flag offensive close merge delete

2 Answers

Sort by » oldest newest most voted

answered 2018-05-15 15:17:07 +0200

erAck gravatar image

Use =WEBSERVICE(E2) to obtain the data, for example in E3, and =FILTERXML(E3;expression) to apply an XPath expression. The example you gave should probably work, but note that the data must be XML, not some arbitrary HTML website which it is in your example URL.

edit flag offensive delete link more


I've tried with =filterxml(webservice(E2),"//h3[@class='product-name']"), but I'm getting an error code 501 (Invalid character). And with =filterxml(webservice(E2);"//h3[@class='product-name']") I get #NAME?. I would prefer to have the function in one line, if that's possible.

DMT gravatar imageDMT ( 2018-05-15 18:41:36 +0200 )edit

Whether you can use a , comma as parameter separator depends on your settings, this is why we prefer to use ; semicolon in examples because it works in any locale. If the result for you is #NAME? then probably you are either using a very old version that doesn't know WEBSERVICE and FILTERXML yet, it's in since 4.2 already, or a localized UI with different names. Literally copying your example works for me (but just gives #VALUE! error because as said the URL does not deliver XML).

erAck gravatar imageerAck ( 2018-05-16 13:18:50 +0200 )edit

My LO's version is with German locale. I use English function names with ";" as a separator. Following your recommendation, I've split this function into two, but now I'm getting an error code 507. I've captured the whole process as a gif here. Would you mind to have a look on this, since it's a mystery to me at the moment?

DMT gravatar imageDMT ( 2018-05-17 13:14:38 +0200 )edit

The second FILTERXML parameter is expected to be a string, your input was =FILTERXML(A2;//h3[@class='product-name']) instead of

erAck gravatar imageerAck ( 2018-05-17 13:28:47 +0200 )edit

If I put //h3[@class='product-name'] in double quotes, I get #VALUE! as a result :( This is how it looks: =FILTERXML(A2;"//H3[@class='product-name']")

DMT gravatar imageDMT ( 2018-05-17 14:52:07 +0200 )edit

I already wrote that above.

erAck gravatar imageerAck ( 2018-05-17 21:28:33 +0200 )edit

answered 2018-05-21 23:32:04 +0200

DMT gravatar image

updated 2018-05-21 23:33:06 +0200

Thanks to a kind soul I've found the answer here.

edit flag offensive delete link more
Login/Signup to Answer

Question Tools

1 follower


Asked: 2018-05-15 14:22:54 +0200

Seen: 3,468 times

Last updated: May 21 '18