Search code examples
pythonpandasstringdataframeimpala

Python Change part of a string based on a condition from a list


Hi im trying to make a python script which will update a line each time the script is run.

im having a bit of a headscratcher to how best to tackle the part where i have to update each line (a select statement) based on the value of a dataframe.

simplified i got a string 'select null r_cnt, null t_cnt, null r_dur, null t_dur from my_table'

and i got a list containing the fields for this line [t_cnt, r_dur]

I then want the new string output to be the first string where we have removed the null in front of the the values which was present in my list but kept null in front of those not in the list.

'select null r_cnt, t_cnt, r_dur, null t_dur from my_table'

my whole code looks something like this now where im stuck at the point i mentioned above

str_to_execute = f"select * from {db}.table_desc where grp_id in (400,500,300,1200) and id not in(127,140,125)"
    cursor.execute(str_to_execute)
    df = as_pandas(cursor)

    for index, row in df.iterrows():
        # print(row['name'])
        str_to_execute = f"SHOW COLUMN STATS {db}.ctrl_{row['id']}"
        cursor.execute(str_to_execute)
        loop = as_pandas(cursor)
        for index, row in loop.iterrows():
            print(row['Column'])
            
            str_to_execute = f"select concat(cast(ctrl_id as string),cast(ctrl_date as string)) primarykey, ctrl_id, ctrl_date,null r_cnt, null t_cnt, null r_dur, null t_dur,null r_amt, null t_amt,null p_cnt, null p_dur,null p_amt, null ro_vol, null t_vol, null r_vol, null p_vol, null ro_amt, null ro_cnt,    from {db}.ctrl_{row['id']}"
            if #This is where im stuck

Solution

  • Try:

    s = 'select null r_cnt, null t_cnt, null r_dur, null t_dur from my_table'
    lst = ['t_cnt', 'r_dur']
    
    checklist = ['null r_cnt', 'null t_cnt', 'null r_dur']
    checkliststr = ','.join(checklist)
    
    for itm in lst:
        if itm in checkliststr:
            print('null ' + itm)
            s=s.replace('null ' + itm, itm)
    print(s)