Search code examples
sqlregeximpala

REGEXP_EXTRACT in Impala


I am trying to figure out how to extract customer ID from string that looks loke this:

{"param":"success","value":"10","level":"0","error_code":"101","customer_id":"5b0e9b23e423b0d33c9f7ddfd", "purchases": "13", "last_activity_ts": "123523465"}

I am trying to extract customer ID from strings that contain error code 101 with following code:

select regexp_extract(field, '\"customer_id":"(.*)', 0) from table_name
where field rlike '"error_code":"101"'

But this gives me a following result:

"customer_id":"5b0e9b23e423b0d33c9f7ddfd", "purchases": "13", "last_activity_ts": "123523465"}

Expected result:

5b0e9b23e423b0d33c9f7ddfd

Could you please help me with this?


Solution

  • You can use below regex:

    "customer_id":"(\w+)"
    

    Demo : https://regex101.com/r/MEOGw8/39

    Test:

    {"param":"success","value":"10","level":"0","error_code":"101","customer_id":"5b0e9b23e423b0d33c9f7ddfd", "purchases": "13", "last_activity_ts": "123523465"}
    

    Match:

    Match 1
    Full match  63-104  `"customer_id":"5b0e9b23e423b0d33c9f7ddfd"`
    Group 1.    78-103  `5b0e9b23e423b0d33c9f7ddfd`
    

    SQL Statement:

    select regexp_extract(field, '"customer_id":"(\w+)"',1, 1) from table_name
    where field rlike '"error_code":"101"'