Search code examples
sqlpostgresqlany

pattern-matching with ANY()


I have a table of 3 columns among which one is a text array of non-uniform lengths (modules). I want to create a derived table depending on the presence of an element in modules array matching a definite text-pattern/phrase. The element name can also vary though it will have a common phrase. A non-working example is given below:

    select machine_id, jobid, 
    case
    when '%charmm%' LIKE ANY(modules) then 'CHARMM'
    when '%gaussian%' LIKE ANY(modules) then 'GAUSSIAN'
    else 'OTHERS'
    end as package, modules
    from jobapps limit 50;

In this case, there could be several module names of Gausiian: gaussian/16b01, gaussian/09e01. But I want to focus on the phrase 'gaussian' and create a new column where all the entries matching with the text-pattern 'gaussian' will be tagged as 'gaussian'.

This is why I wanted to use '%gaussian%' LIKE ANY(modules) and this is not working.

I could use: 'gaussian/09e01' LIKE ANY(modules), but in that case I have to write all possible values, and if a user creates a new module of Gaussian, then that data will be missed.

Is there any way to rectify the above problem? Or is there any better way to achieve the goal?

    select machine_id, jobid, 
    case
    when '%charmm%' LIKE ANY(modules) then 'CHARMM'
    when '%gaussian%' LIKE ANY(modules) then 'GAUSSIAN'
    else 'OTHERS'
    end as package, modules
    from jobapps limit 50;

I want to get something like:

    machine_id | jobid  | package  |         modules          
    ------------+--------+----------+--------------------------
    6 | 1884 | CHARMM   | {charmm}
    2 | 2305 | CHARMM   | {charmm}
    6 | 786  | GAUSSIAN | {gaussian/09e1}
    7 | 1956 | CHARMM   | {charmm}
    3 | 72037| NAMD     | {namd,intel/2018}

Solution

  • The LIKE operator requires that the pattern be on the right, while the ANY() expression will always use the array members as the right operand, so I don't think there's an easy way out.

    You could create your own "reverse like" operator with the operands reversed:

    create function reverse_like(text,text) returns boolean as
    'select $2 like $1'
    language sql immutable;
    
    create operator <~~ (
      function = reverse_like,
      leftarg = text,
      rightarg = text
    );
    

    ...and then query your array with:

    case
    when '%charmm%' <~~ ANY(modules) then 'CHARMM'
    when '%gaussian%' <~~ ANY(modules) then 'GAUSSIAN'
    

    If you don't want to create a new operator just for this query, I think you'll have to do it the long way, by unnesting the array and checking the contents in a subquery:

    case
    when exists (select 1 from unnest(modules) u(v) where v like '%charmm%') then 'CHARMM'
    when exists (select 1 from unnest(modules) u(v) where v like '%gaussian%') then 'GAUSSIAN'
    

    ... or alternatively:

    case
    when (select bool_or(v like '%charmm%') from unnest(modules) u(v)) then 'CHARMM'
    when (select bool_or(v like '%gaussian%') from unnest(modules) u(v)) then 'GAUSSIAN'