I have a pandas DataFrame, and I would like to create one or more new columns based on the values of a dictionary.
Here is my df and dictionary:
import pandas as pd

data = ['One', 'Two', 'Three', 'Four']
df = pd.DataFrame(data, columns=['Count'])
dictionary = {'One': 'Red', 'Two': ['Red', 'Blue'], 'Three': 'Green', 'Four': ['Green', 'Red', 'Blue']}
This is the result I would like to achieve, preferably with blank fields instead of None values. Does anybody know a way?
I tried the below:
df = pd.DataFrame([(k, *v) for k, v in dictionary.items()])
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]
df
However, for keys whose value is a single string rather than a list, the string is split letter by letter across the columns.
A solution that maps the values into a single column, separated by a delimiter (,), would also be helpful.
The dictionary values are a mix of strings and lists, so an if-else expression is needed to prevent the strings from being unpacked character by character by *:
df = pd.DataFrame([(k, *v)
                   if isinstance(v, list)
                   else (k, v) for k, v in dictionary.items()])
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]
print(df)
    name     n1    n2    n3
0    One    Red  None  None
1    Two    Red  Blue  None
2  Three  Green  None  None
3   Four  Green   Red  Blue
Details (unpacking a string yields its characters, unpacking a list yields its items):
print((*'Red',))
('R', 'e', 'd')
print((*['Red', 'Blue'],))
('Red', 'Blue')
Preferably with blank fields instead of None values, does anybody know a way?
Add DataFrame.fillna('') to replace the missing values with empty strings:
df = pd.DataFrame([(k, *v)
                   if isinstance(v, list)
                   else (k, v)
                   for k, v in dictionary.items()]).fillna('')
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]
print(df)
    name     n1    n2    n3
0    One    Red
1    Two    Red  Blue
2  Three  Green
3   Four  Green   Red  Blue
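The question also asks for a variant that puts all values into one column separated by a delimiter. A minimal sketch of that, assuming the keys of the dictionary match the values in the Count column; the column name Colors and the helper dict joined are made up for this example:

```python
import pandas as pd

df = pd.DataFrame(['One', 'Two', 'Three', 'Four'], columns=['Count'])
dictionary = {'One': 'Red', 'Two': ['Red', 'Blue'],
              'Three': 'Green', 'Four': ['Green', 'Red', 'Blue']}

# Join list values with ', ' and leave plain strings untouched.
joined = {k: ', '.join(v) if isinstance(v, list) else v
          for k, v in dictionary.items()}

# 'Colors' is a hypothetical column name chosen for this sketch.
df['Colors'] = df['Count'].map(joined)
print(df)
```

Keys in Count that are missing from the dictionary would end up as NaN here, so a trailing .fillna('') may still be wanted.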
If the new DataFrame has the same index and the same number of rows as the original, use DataFrame.join. If the original df_orig has an index other than the default RangeIndex, pass index=df_orig.index to the DataFrame constructor:
df = pd.DataFrame([(k, *v)
                   if isinstance(v, list)
                   else (k, v)
                   for k, v in dictionary.items()], index=df_orig.index).fillna('')
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]
df = df_orig.join(df)
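For completeness, the join variant as a self-contained sketch. The index labels a to d are invented to stand in for a non-default index, and wide and result are hypothetical variable names. Note that join aligns on the index only, so this assumes the dictionary items are in the same order as the rows of df_orig:

```python
import pandas as pd

# Original frame with a non-default index (labels are made up for the example).
df_orig = pd.DataFrame(['One', 'Two', 'Three', 'Four'],
                       columns=['Count'], index=list('abcd'))
dictionary = {'One': 'Red', 'Two': ['Red', 'Blue'],
              'Three': 'Green', 'Four': ['Green', 'Red', 'Blue']}

# Build the wide frame on the same index so that join can align the rows.
wide = pd.DataFrame([(k, *v) if isinstance(v, list) else (k, v)
                     for k, v in dictionary.items()],
                    index=df_orig.index).fillna('')
wide.columns = ['name'] + [f'n{x}' for x in wide.columns[1:]]

result = df_orig.join(wide)
print(result)
```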
To match rows by the name column instead, use a left join with DataFrame.merge:
df = df_orig.merge(df, on='name', how='left')
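The merge above assumes df_orig already has a name column. In the question the key column is called Count, so a sketch for that case could use left_on and right_on instead. The row order of df_orig is shuffled here on purpose to show that merge matches by value, not by position; wide and result are hypothetical names:

```python
import pandas as pd

# Rows deliberately in a different order than the dictionary keys.
df_orig = pd.DataFrame(['Four', 'One', 'Three', 'Two'], columns=['Count'])
dictionary = {'One': 'Red', 'Two': ['Red', 'Blue'],
              'Three': 'Green', 'Four': ['Green', 'Red', 'Blue']}

wide = pd.DataFrame([(k, *v) if isinstance(v, list) else (k, v)
                     for k, v in dictionary.items()]).fillna('')
wide.columns = ['name'] + [f'n{x}' for x in wide.columns[1:]]

# Left join: keep every row of df_orig and match Count against name.
result = df_orig.merge(wide, left_on='Count', right_on='name', how='left')
print(result)
```

Rows of df_orig without a matching key would get NaN in the new columns, so calling .fillna('') after the merge may be needed in that case.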