python pandas dictionary python-re

python: Dictionary key as row index, values as column headers. How do I refer back and select specific values in a df using a dictionary?


I have a dataframe that looks like this:

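import pandas as pd

a = ['a', 'b', 'c', 'd']
b = ['the', 'fox', 'the', 'then']
c = ['quick', 'jumps', 'lazy', 'barks']
d = ['brown', 'over', 'dog', 'loudly']
# Each list is one column; zip() turns them into rows.
df = pd.DataFrame(zip(a, b, c, d), columns=['indexcol', 'col1', 'col2', 'col3'])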

and a dictionary that looks like this:

keys=['a','b','c','d']
vals=[]
vals.append(['col1','col3'])
vals.append(['col1','col2'])
vals.append(['col1','col2','col3'])
vals.append(['col2','col3'])
newdict = {k: v for k, v in zip(keys, vals)}

I'm trying to create a new column in df that holds a statement built for each row. Taking the first row as an example, the statement should look like this:

"col1 is 'the' | col3 is 'brown'"

Another example, using the 3rd row, to make the task clear: "col1 is 'the' | col2 is 'lazy' | col3 is 'dog'"

Essentially, each dictionary key matches a value in indexcol and so identifies a row in df, and the dictionary values for that key name the columns whose values I want to look up in that row.

Thanks in advance.


Solution

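  • I guess this is what you're looking for: for each row, look up its list of columns in newdict using indexcol, then join the formatted pieces with ' | '.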

    def func(df_row):
        return ' | '.join(
            f'"{col}" is "{df_row[col]}"'
            for col in newdict[df_row['indexcol']]
        )
    
    df['new col'] = df.apply(func, axis=1)
    
    indexcol  col1  col2   col3    new col
    a         the   quick  brown   "col1" is "the" | "col3" is "brown"
    b         fox   jumps  over    "col1" is "fox" | "col2" is "jumps"
    c         the   lazy   dog     "col1" is "the" | "col2" is "lazy" | "col3" is "dog"
    d         then  barks  loudly  "col2" is "barks" | "col3" is "loudly"
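
    For reference, here is a self-contained sketch of the same approach that can be run as-is. It rebuilds df and newdict from the question. Two things are my own assumptions rather than part of the answer above: the helper name describe_row is hypothetical, and the output uses the question's quoting style (col1 is 'the') instead of double quotes. It also uses dict.get so that a row whose indexcol has no entry in the dictionary produces an empty string instead of raising a KeyError; whether that is the behaviour you want is a guess.

```python
import pandas as pd

# Data from the question, written as a column-oriented dict.
df = pd.DataFrame({
    'indexcol': ['a', 'b', 'c', 'd'],
    'col1': ['the', 'fox', 'the', 'then'],
    'col2': ['quick', 'jumps', 'lazy', 'barks'],
    'col3': ['brown', 'over', 'dog', 'loudly'],
})

# Maps each indexcol value to the columns to report for that row.
newdict = {
    'a': ['col1', 'col3'],
    'b': ['col1', 'col2'],
    'c': ['col1', 'col2', 'col3'],
    'd': ['col2', 'col3'],
}


def describe_row(row, mapping):
    """Build "colX is 'value' | ..." for the columns mapped to this row.

    Hypothetical helper: rows whose indexcol is missing from the
    mapping yield an empty string (an assumption, not a requirement).
    """
    cols = mapping.get(row['indexcol'], [])
    return ' | '.join(f"{col} is '{row[col]}'" for col in cols)


# axis=1 passes each row as a Series; args forwards the mapping.
df['new col'] = df.apply(describe_row, axis=1, args=(newdict,))
print(df['new col'].tolist())
```

    Passing the dictionary through args keeps the function free of globals, which makes it easier to reuse with a different mapping. For a large DataFrame, a row-wise apply can be slow, but for a table of this size it is the most readable option.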