regex, web-scraping, google-sheets, google-sheets-formula, google-query-language

QUERY does not filter out two values separated by | in a NOT CONTAINS string


I'm using the formula:

=ARRAYFORMULA(QUERY(TRIM(IMPORTXML("https://www.livescores.com/","//div[@class='content']//div[contains(@class,'row-gray')]")),"Where not Col1 contains 'Postp|Canc' "))

But for some reason 'Postp|Canc' is not removing the rows that contain those values. What am I doing wrong?



Solution

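  • | is regex alternation, and in QUERY only the matches operator interprets its argument as a regular expression; contains looks for the literal text 'Postp|Canc'. Because matches has to match the whole cell, each alternative needs .* on both sides. Use: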

    =ARRAYFORMULA(QUERY(TRIM(IMPORTXML("https://www.livescores.com/",
     "//div[@class='content']//div[contains(@class,'row-gray')]")),
     "where not Col1 matches '.*Postp.*|.*Canc.*'"))
    

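    or, keeping contains, with one condition per value joined by and (a row is kept only if it contains neither value):

    =ARRAYFORMULA(QUERY(TRIM(IMPORTXML("https://www.livescores.com/",
     "//div[@class='content']//div[contains(@class,'row-gray')]")),
     "where not Col1 contains 'Postp'
        and not Col1 contains 'Canc'"))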
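
To see the difference between the two operators outside of Sheets, here is a rough Python sketch. It assumes that contains behaves like a literal substring test and that matches behaves like a whole-string regex match. The helper names and the sample rows are made up for illustration and are not data from the site.

```python
import re

def contains(value: str, needle: str) -> bool:
    # Literal substring test (assumed to mimic QUERY's `contains`).
    return needle in value

def matches(value: str, pattern: str) -> bool:
    # Whole-string regex test (assumed to mimic QUERY's `matches`).
    return re.fullmatch(pattern, value) is not None

# Hypothetical rows, standing in for the scraped column.
rows = [
    "FT Arsenal 2 - 1 Chelsea",
    "Postp. Leeds ? - ? Everton",
    "Canc. Lyon ? - ? Nice",
]

# `contains` searches for the literal text "Postp|Canc", so nothing is removed.
kept_contains = [r for r in rows if not contains(r, "Postp|Canc")]

# `matches` treats | as alternation; .* lets each word appear anywhere in the cell.
kept_matches = [r for r in rows if not matches(r, ".*Postp.*|.*Canc.*")]

# Without .* the pattern must equal the whole cell, so again nothing is removed.
kept_bare = [r for r in rows if not matches(r, "Postp|Canc")]

print(kept_contains)
print(kept_matches)
print(kept_bare)
```

Only the second filter drops the postponed and cancelled rows, which is the behaviour the matches formula above relies on.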