Search code examples
sqlgoogle-bigquerygoogle-query-language

Splitting row in 2 columns


Trying to get data split by language where '/' delimiter is present used this query it works but when there is no '/' it the row goes to the french column i want the french column to be null if there is no '/' and the data should go in English column. It is sorted by site id so only id 412 has the french in it.

SELECT
s.siteid,
s.notes, --This is the column that CSI uses for the description.
split(s.notes,'/') [safe_OFFSET(0)] French,
split(s.notes,'/') [safe_OFFSET(1)] english
FROM AloomaTestBeta.SCSERVICES s

siteid  notes                           French                  english
412     Le cardio-/ Cardio Tennis .     Le cardio-tennis        Cardio Tennis 
412     Le cardio-/Cardio Tennis        Le cardio-tennis        Cardio Tennis 
412     La ligue de / Drop-In Tennis    La ligue de tennis      Drop-In Tennis 
411     An extended duration            An extended duration    null                    
411     Increase flexibility            Increase flexibility    Null    

Also tried using the case statement but it starts giving me null when there is no '/' delimiter.

SELECT
s.siteid,
s.notes, --This is the column that CSI uses for the description.
case when  s.siteid = 412 then split(s.notes,'/') [safe_OFFSET(0)] else null end as French,
split(s.notes,'/') [safe_OFFSET(1)] english
FROM AloomaTestBeta.SCSERVICES s

siteid  notes                           French                  english
412     Le cardio-/ Cardio Tennis .     Le cardio-tennis        Cardio Tennis 
412     Le cardio-/Cardio Tennis        Le cardio-tennis        Cardio Tennis 
412     La ligue de / Drop-In Tennis    La ligue de tennis      Drop-In Tennis 
411     An extended duration            null                     null                   
411     Increase flexibility            null                      Null                  

this what I am looking for result to be

siteid  notes                           French                  english
412     Le cardio-/ Cardio Tennis .     Le cardio-tennis        Cardio Tennis 
412     Le cardio-/Cardio Tennis        Le cardio-tennis        Cardio Tennis 
412     La ligue de / Drop-In Tennis    La ligue de tennis      Drop-In Tennis 
411     An extended duration            null                    An extended duration 
411     Increase flexibility            Null                    Increase flexibility

Solution

  • Below is for BigQuery Standard SQL

    #standardSQL
    SELECT
      s.siteid,
      s.notes, --This is the column that CSI uses for the description.
      IF(v, SPLIT(s.notes,'/')[OFFSET(0)], NULL) French,
      IF(v, SPLIT(s.notes,'/')[SAFE_OFFSET(1)], SPLIT(s.notes,'/')[OFFSET(0)]) English
    FROM `AloomaTestBeta.SCSERVICES` s, UNNEST([s.notes LIKE '%/%']) v
    

    You can test, play with above using sample data from your question as in example below

    #standardSQL
    WITH `AloomaTestBeta.SCSERVICES` AS (
      SELECT 412 siteid, 'Le cardio-/ Cardio Tennis' notes UNION ALL
      SELECT 412, 'Le cardio-/Cardio Tennis' UNION ALL
      SELECT 412, 'La ligue de / Drop-In Tennis' UNION ALL
      SELECT 411, 'An extended duration' UNION ALL
      SELECT 411, 'Increase flexibility' 
    )
    SELECT
      s.siteid,
      s.notes, --This is the column that CSI uses for the description.
      IF(v, SPLIT(s.notes,'/')[OFFSET(0)], NULL) French,
      IF(v, SPLIT(s.notes,'/')[SAFE_OFFSET(1)], SPLIT(s.notes,'/')[OFFSET(0)]) English
    FROM `AloomaTestBeta.SCSERVICES` s, UNNEST([s.notes LIKE '%/%']) v  
    

    with result

    Row siteid  notes                           French          English  
    1   412     Le cardio-/ Cardio Tennis       Le cardio-      Cardio Tennis    
    2   412     Le cardio-/Cardio Tennis        Le cardio-      Cardio Tennis    
    3   412     La ligue de / Drop-In Tennis    La ligue de     Drop-In Tennis   
    4   411     An extended duration            null            An extended duration     
    5   411     Increase flexibility            null            Increase flexibility       
    

    If you got how above works - you are ready for more elegant solution

    #standardSQL
    SELECT
      s.siteid,
      s.notes, --This is the column that CSI uses for the description.
      SPLIT(s.notes,'/')[SAFE_OFFSET(v)] French,
      SPLIT(s.notes,'/')[SAFE_OFFSET(1 - v)] English
    FROM `AloomaTestBeta.SCSERVICES` s, UNNEST([IF(s.notes LIKE '%/%', 0, 1)]) v