Search code examples
google-sheetsweb-scrapingxpathgoogle-sheets-formula

How to figure out proper xpath for IMPORTXML in Google Sheets - N/A Error?


I'm trying to use the IMPORTXML function on Google Sheets.

For example: =IMPORTXML("https://www.tiktok.com/@charlidamelio?lang=en", XMLPATH) should return "54.3M"

I used the Chrome inspector to copy the xpath, which gives me:

/html/body/div[1]/div/div[2]/div/div[1]/div/header/h2[1]/strong[2]

When I try this in Google Sheets it returns an error: #N/A (Import Content is Empty).

P.S. I'm open to other ways to get the data I need into the google sheet, it doesn't have to use the IMPORTXML function.


Solution

  • How about this answer?

    In this answer, IMPORTXML and REGEXEXTRACT are used. And also, it supposes that the URL of https://www.tiktok.com/@charlidamelio?lang=en is put in a cell "A1".

    Pattern 1:

    In this pattern, "followerCount" is retrieved.

    Sample formula:

    =REGEXEXTRACT(IMPORTXML(A1,"//script[@id='__NEXT_DATA__']"),"followerCount"":(\d+)")
    
    • "followerCount" is retrieved from the script.
    • In this case, when =VALUE(REGEXEXTRACT(IMPORTXML(A1,"//script[@id='__NEXT_DATA__']"),"followerCount"":(\d+)")) is used, the retrieved value can be used as the number.

    Result:

    enter image description here

    Pattern 2:

    In this pattern, "followerCount" is retrieved.

    Sample formula:

    =REGEXEXTRACT(IMPORTXML(A1,"//meta[@name='description']/@content")," ([\w\d.]+) Fans")
    
    • The value of "54.4M Fans" is retrieved from the metadata.

    Result:

    enter image description here

    References: