What is a clean regex pattern for matching URL strings that stops at the first comma? Trying to extract values from an array of arrays in Google Sheets.
Cell A1
{https://www.myshop.com/shop/the_first_shop,Marcus. White's. Shop.,ACTIVE,US};{https://www.myshop.com/shop/a-second-shop,The first! Shop,CLOSED,UK};{EMPTY,ClosedShop,CLOSED,IN}
Desired Output (Cell B1)
https://www.myshop.com/shop/the_first_shop,https://www.myshop.com/shop/a-second-shop
I have figured out how to get a clean array of matching values in my desired output cell using:
=trim(regexreplace(regexreplace(regexreplace(REGEXREPLACE(A2,"/(https?:\/\/[^ ]*)/"," "),";"," "),"}"," "),"{"," "))
But I can't find a regex pattern that stops at a comma. For example, this soution:
"/(https?:\/\/[^ ]*)/"
matches the first URL, but gives me back:
https://www.myshop.com/shop/the_first_shop,Marcus. White's. Shop.,ACTIVE,US https://www.myshop.com/shop/a-second-shop,The first! Shop,CLOSED,UK EMPTY,ClosedShop,CLOSED,IN
I'd go with REGEXREPLACE
and use:
=REGEXREPLACE(A1,".*?(?:(https.*?)|$)","$1")
Just a trailing comma to deal with...
=REGEXREPLACE(REGEXREPLACE(A1,".*?(?:(https.*?(,))|$)","$1"),",$","")
A much longer alternative to REGEXREPLACE
could be:
=TEXTJOIN(",",,QUERY(TRANSPOSE(SPLIT(SUBSTITUTE(SUBSTITUTE(A1,"{","}"),"}",","),",")),"Select Col1 where Col1 like 'http%'"))