Search code examples
prestotrino

Find all occurrences from a string - Presto


I have the following as rows in HIVE (HDFS) and using Presto as the Query Engine.

1,@markbutcher72 @charlottegloyn Not what Belinda Carlisle thought. And yes, she was singing about Edgbaston.
2,@tomkingham @markbutcher72 @charlottegloyn It's true the garden of Eden is currently very green...
3,@MrRhysBenjamin @gasuperspark1 @markbutcher72 Actually it's Springfield Park, the (occasional) home of the might

The requirement is to do get the following through Presto Query. How can we get this please

1,markbutcher72
1,charlottegloyn
2,tomkingham
2,markbutcher72
2,charlottegloyn
3,MrRhysBenjamin
3,gasuperspark1 
3,markbutcher72 

Solution

  • select  t.id
           ,u.token
    
    from    mytable as t
            cross join unnest (regexp_extract_all(text,'(?<=@)\S+')) as u(token) 
    ;
    

    +----+----------------+
    | id |     token      |
    +----+----------------+
    |  1 | markbutcher72  |
    |  1 | charlottegloyn |
    |  2 | tomkingham     |
    |  2 | markbutcher72  |
    |  2 | charlottegloyn |
    |  3 | MrRhysBenjamin |
    |  3 | gasuperspark1  |
    |  3 | markbutcher72  |
    +----+----------------+