Search code examples
xpathweb-scrapinggoogle-sheetsgoogle-sheets-querygoogle-sheets-formula

Looking to import selective data 'country-wise' from specific part of website into Google Spreadsheets


I'm trying to import some data from IATA's website onto Google Spreadsheet / Excel. I tried using XCOPY method, but this page is constantly getting updated and the spreadsheet would eventually give 'N/A' error.

googlesheets error

I basically want a spreadsheet to import specific country's data into my sheet's cell(s). I am looking for it to be country-specific.

For example, I only want to import COSTA-RICA's data or CANADA's data from this website as shown below into my spreadsheet,

Example

Following is the spreadsheet link I'm editing : https://docs.google.com/spreadsheets/d/1anrXXHuVtuP3iCDBsX8GcP7egDBmA75eeG4-wa1ov7E/edit#gid=1624254662


Solution

  • for the whole list:

    =INDEX(TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div/div/div/div/div/div/div/div")),,1)
    

    for canada use:

    =QUERY(QUERY(INDEX(TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div/div/div/div/div/div/div/div")),,1), "limit "&
     MATCH("CAYMAN ISL.", INDEX(TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div/div/div/div/div/div/div/div")),,1), 0)-
     MATCH("CANADA", INDEX(TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div/div/div/div/div/div/div/div")),,1), 0)&" offset "&
     MATCH("CANADA", INDEX(TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div/div/div/div/div/div/div/div")),,1), 0)-1, 0), "where Col1 is not null", 0)
    

    enter image description here


    for costa rica change:

    • CANADA to COSTA RICA
    • CAYMAN ISL. to COTE D'IVOIRE

    UPDATE:

    full list:

    =ARRAYFORMULA(QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div[@id='contentwrapper']//p[01]|//div[@id='contentwrapper']//p[02]|
      //div[@id='contentwrapper']//p[03]|//div[@id='contentwrapper']//p[04]|
      //div[@id='contentwrapper']//p[05]|//div[@id='contentwrapper']//p[06]|
      //div[@id='contentwrapper']//p[07]|//div[@id='contentwrapper']//p[08]|
      //div[@id='contentwrapper']//p[09]|//div[@id='contentwrapper']//p[10]|
      //div[@id='contentwrapper']//p[11]|//div[@id='contentwrapper']//p[12]|
      //div[@id='contentwrapper']//p[13]|//div[@id='contentwrapper']//p[14]"))
     ,,9^9)),,9^9), "♦"))), "where Col1 is not null", 0))
    

    portion:

    =ARRAYFORMULA(QUERY(
    QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div[@id='contentwrapper']//p[01]|//div[@id='contentwrapper']//p[02]|
      //div[@id='contentwrapper']//p[03]|//div[@id='contentwrapper']//p[04]|
      //div[@id='contentwrapper']//p[05]|//div[@id='contentwrapper']//p[06]|
      //div[@id='contentwrapper']//p[07]|//div[@id='contentwrapper']//p[08]|
      //div[@id='contentwrapper']//p[09]|//div[@id='contentwrapper']//p[10]|
      //div[@id='contentwrapper']//p[11]|//div[@id='contentwrapper']//p[12]|
      //div[@id='contentwrapper']//p[13]|//div[@id='contentwrapper']//p[14]"))
     ,,9^9)),,9^9), "♦"))), "where Col1 is not null", 0), "limit "&
     MATCH("PALAU",
     QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div[@id='contentwrapper']//p[01]|//div[@id='contentwrapper']//p[02]|
      //div[@id='contentwrapper']//p[03]|//div[@id='contentwrapper']//p[04]|
      //div[@id='contentwrapper']//p[05]|//div[@id='contentwrapper']//p[06]|
      //div[@id='contentwrapper']//p[07]|//div[@id='contentwrapper']//p[08]|
      //div[@id='contentwrapper']//p[09]|//div[@id='contentwrapper']//p[10]|
      //div[@id='contentwrapper']//p[11]|//div[@id='contentwrapper']//p[12]|
      //div[@id='contentwrapper']//p[13]|//div[@id='contentwrapper']//p[14]"))
     ,,9^9)),,9^9), "♦"))), "where Col1 is not null", 0), 0)-
     MATCH("PAKISTAN",
     QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div[@id='contentwrapper']//p[01]|//div[@id='contentwrapper']//p[02]|
      //div[@id='contentwrapper']//p[03]|//div[@id='contentwrapper']//p[04]|
      //div[@id='contentwrapper']//p[05]|//div[@id='contentwrapper']//p[06]|
      //div[@id='contentwrapper']//p[07]|//div[@id='contentwrapper']//p[08]|
      //div[@id='contentwrapper']//p[09]|//div[@id='contentwrapper']//p[10]|
      //div[@id='contentwrapper']//p[11]|//div[@id='contentwrapper']//p[12]|
      //div[@id='contentwrapper']//p[13]|//div[@id='contentwrapper']//p[14]"))
     ,,9^9)),,9^9), "♦"))), "where Col1 is not null", 0), 0)&" offset "&
     MATCH("PAKISTAN",
     QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     "//div[@id='contentwrapper']//p[01]|//div[@id='contentwrapper']//p[02]|
      //div[@id='contentwrapper']//p[03]|//div[@id='contentwrapper']//p[04]|
      //div[@id='contentwrapper']//p[05]|//div[@id='contentwrapper']//p[06]|
      //div[@id='contentwrapper']//p[07]|//div[@id='contentwrapper']//p[08]|
      //div[@id='contentwrapper']//p[09]|//div[@id='contentwrapper']//p[10]|
      //div[@id='contentwrapper']//p[11]|//div[@id='contentwrapper']//p[12]|
      //div[@id='contentwrapper']//p[13]|//div[@id='contentwrapper']//p[14]"))
     ,,9^9)),,9^9), "♦"))), "where Col1 is not null", 0), 0)-1, 0))
    

    0

    spreadsheet demo


    shorter formula:

    =ARRAYFORMULA(QUERY(QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     JOIN("|", "//div[@id='contentwrapper']//p["&ROW(A1:A15)&"]"))),,9^9)),,9^9), "♦"))), 
     "where Col1 !=''", 0), "limit "&
     MATCH("PALAU", QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     JOIN("|", "//div[@id='contentwrapper']//p["&ROW(A1:A15)&"]"))),,9^9)),,9^9), "♦"))),
     "where Col1 !=''", 0), 0)-
     MATCH("PAKISTAN", QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     JOIN("|", "//div[@id='contentwrapper']//p["&ROW(A1:A15)&"]"))),,9^9)),,9^9), "♦"))),
     "where Col1 !=''", 0), 0)&" offset "&
     MATCH("PAKISTAN", QUERY(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY("♦"&TRANSPOSE(IMPORTXML(
     "https://www.iatatravelcentre.com/international-travel-document-news/1580226297.htm", 
     JOIN("|", "//div[@id='contentwrapper']//p["&ROW(A1:A15)&"]"))),,9^9)),,9^9), "♦"))),
     "where Col1 !=''", 0), 0)-1, 0))