I have a dataframe with a column val_string that is sometimes populated with a base64-encoded string and is NaN otherwise.
df
base_type val_int val_string
0 integer 34 NaN
1 string NaN c3RyaW5nMQ==
2 integer 108 NaN
3 integer 3586 NaN
4 string NaN c3RyaW5nMg==
How do I apply base64.b64decode to only the rows that have a val_string that is not NaN?
I tried this, only to get a strange OSError: could not get source code:
df['val_string'] = df['val_string'].apply(lambda x: df['val_string'] if pd.isna(df['val_string']) else base64.b64decode(x))
Any help would be much appreciated!
Use boolean indexing:
from base64 import b64decode
m = df['val_string'].notna()
df.loc[m, 'val_string'] = df.loc[m, 'val_string'].apply(b64decode)
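For anyone who wants to run the boolean-indexing version end to end, here is a minimal sketch. It rebuilds the sample frame from the question by hand; constructing val_string with dtype=object is my assumption, made so the column can hold bytes after decoding regardless of pandas version.

```python
import base64

import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
# Assumption: val_string is kept as object dtype so it can hold bytes later.
df = pd.DataFrame({
    "base_type": ["integer", "string", "integer", "integer", "string"],
    "val_int": [34, np.nan, 108, 3586, np.nan],
    "val_string": pd.Series(
        [np.nan, "c3RyaW5nMQ==", np.nan, np.nan, "c3RyaW5nMg=="],
        dtype=object,
    ),
})

# True only where val_string holds an actual value.
mask = df["val_string"].notna()

# Decode just those rows; the NaN rows are never passed to b64decode.
df.loc[mask, "val_string"] = df.loc[mask, "val_string"].apply(base64.b64decode)
```

Because the mask is applied on both sides of the assignment, the rows with NaN are left untouched.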
Or with your approach, testing each element x inside the lambda rather than the whole column:
from base64 import b64decode
import pandas as pd
# x is a single cell value: keep NaN as-is, decode everything else
df['val_string'] = df['val_string'].apply(lambda x: x if pd.isna(x)
                                          else b64decode(x))
Output:
base_type val_int val_string
0 integer 34.0 NaN
1 string NaN b'string1'
2 integer 108.0 NaN
3 integer 3586.0 NaN
4 string NaN b'string2'
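Note that the decoded values are bytes (b'string1'), not str. If text is what you need, one possible variant is sketched below. It is not part of the approaches above: it uses Series.map with na_action='ignore' to skip the NaN entries, and decode_to_text is a hypothetical helper that assumes the payloads are UTF-8 text.

```python
import base64

import numpy as np
import pandas as pd

# Same val_string values as in the question, kept as object dtype.
encoded = pd.Series(
    [np.nan, "c3RyaW5nMQ==", np.nan, np.nan, "c3RyaW5nMg=="],
    dtype=object,
)


def decode_to_text(value):
    # Hypothetical helper. Assumption: the decoded bytes are UTF-8 text.
    return base64.b64decode(value).decode("utf-8")


# na_action="ignore" propagates NaN without calling the function on it.
decoded = encoded.map(decode_to_text, na_action="ignore")
```

If the payloads are not guaranteed to be UTF-8, keep them as bytes or pick the encoding that matches your data.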