I have a dataframe that looks like this:
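import pandas as pd
a = ['a', 'b', 'c', 'd']
b = ['the', 'fox', 'the', 'then']
c = ['quick', 'jumps', 'lazy', 'barks']
d = ['brown', 'over', 'dog', 'loudly']
# zip pairs the lists element-wise, so each tuple becomes one row
df = pd.DataFrame(zip(a, b, c, d), columns=['indexcol', 'col1', 'col2', 'col3'])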
and a dictionary that looks like this:
keys=['a','b','c','d']
vals=[]
vals.append(['col1','col3'])
vals.append(['col1','col2'])
vals.append(['col1','col2','col3'])
vals.append(['col2','col3'])
newdict = {k: v for k, v in zip(keys, vals)}
I'm trying to create a new column in df that holds a statement built for each row. Taking the first row as an example, the statement should look like this:
"col1 is 'the' | col3 is 'brown'"
Another example, using the 3rd row, to make the task crystal clear: "col1 is 'the' | col2 is 'lazy' | col3 is 'dog'"
Essentially, for each row I want to use its indexcol value as the dictionary key, and use the dictionary's values to decide which columns of df to look up for that row.
Thanks in advance.
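I think this is what you're looking for: apply a function to each row that looks up the row's indexcol in newdict and joins the columns listed there.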
def func(df_row):
    return ' | '.join(
        f'"{col}" is "{df_row[col]}"'
        for col in newdict[df_row['indexcol']]
    )
df['new col'] = df.apply(func, axis=1)
indexcol | col1 | col2 | col3 | new col |
---|---|---|---|---|
a | the | quick | brown | "col1" is "the" \| "col3" is "brown" |
b | fox | jumps | over | "col1" is "fox" \| "col2" is "jumps" |
c | the | lazy | dog | "col1" is "the" \| "col2" is "lazy" \| "col3" is "dog" |
d | then | barks | loudly | "col2" is "barks" \| "col3" is "loudly" |
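If you want the single-quoted format from the question instead, here is a minimal self-contained sketch of the same row-wise lookup. The helper name build_statement, its lookup parameter, and the column name statement are my own choices for illustration, and returning an empty string for an indexcol value that is missing from the dictionary is an assumption about what you would want in that case.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "indexcol": ["a", "b", "c", "d"],
        "col1": ["the", "fox", "the", "then"],
        "col2": ["quick", "jumps", "lazy", "barks"],
        "col3": ["brown", "over", "dog", "loudly"],
    }
)
newdict = {
    "a": ["col1", "col3"],
    "b": ["col1", "col2"],
    "c": ["col1", "col2", "col3"],
    "d": ["col2", "col3"],
}


def build_statement(row, lookup):
    # Hypothetical helper: find the columns listed for this row's key.
    # Assumption: a key missing from `lookup` yields an empty statement.
    cols = lookup.get(row["indexcol"], [])
    # Column name unquoted, value in single quotes, parts joined by " | ".
    return " | ".join(f"{col} is '{row[col]}'" for col in cols)


# Extra keyword arguments to DataFrame.apply are forwarded to the function.
df["statement"] = df.apply(build_statement, axis=1, lookup=newdict)
print(df["statement"].tolist())
```

Passing the dictionary in as an argument rather than reading a global keeps the helper reusable with a different mapping.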