Search code examples
sqlhadoophiveregexp-replace

regular expression in regexp_replace hive


I have a few lines in my table. This lines looks like:

Atribute       |
---------------|
B=10;MB=12;A=33|
---------------|
MB=16;B=12;A=23|
---------------|
A=10;MB=23;B=58|

and etc.

I need to get numbers only after 'B='. For that example i should get:

10
12
58

What the select query should i write for get this result? (Query should not confuse with 'MB=' and 'B=')


Solution

  • hive> select regexp_extract('B=10\;AB=12\;B=33', '(\;|^)B=([0-9]*)', 2);
        OK
        10
        Time taken: 0.157 seconds, Fetched: 1 row(s)
        hive> select regexp_extract('MB=16\;B=12\;A=23', '(\;|^)B=([0-9]*)', 2);
        OK
        12
        Time taken: 0.11 seconds, Fetched: 1 row(s)
        hive> select regexp_extract('A=10\;MB=23\;B=58', '(\;|^)B=([0-9]*)', 2);
        OK
        58
        Time taken: 0.134 seconds, Fetched: 1 row(s)
        hive>
    

    First group will try to match Start of the string with value 'B' or semicolon followwed by string with value 'B'.