In a pandas DataFrame, I am trying to normalize every occurrence of the term 'Heb' to "HEB"
Using sidetable, I tried this and it worked:
(df['Description'].str.contains('H-E-B', case=False, regex=False), 'HEB'),
But then I tried this and it didn't work:
(df['Description'].str.contains(['HEB', 'H-E-B', 'Heb online' ], case=False, regex=False), 'HEB'),
The error message reads: "AttributeError: 'list' object has no attribute 'upper'"
Given this input:
| Description | Amount |
|---|---|
| HEB | 100.00 |
| HEB#123 | 100.00 |
| HEB online | 100.00 |
| H-E-B Gas | 100.00 |
| Heb Carwash | 100.00 |
I would expect this output:
| Description | Amount |
|---|---|
| HEB | 100.00 |
| HEB | 100.00 |
| HEB | 100.00 |
| HEB | 100.00 |
| HEB | 100.00 |
The str.contains() method expects a single string pattern, but you are passing it a list of strings. With case=False and regex=False, pandas calls .upper() on the pattern, which is why you get the AttributeError.
To replace every description that contains some variation of 'Heb' with 'HEB', you can create a list of the expected variations, then loop over the rows of your DataFrame and check each description against them:
import pandas as pd

# Sample DataFrame
data = {'Description': ['HEB', 'HEB#123', 'HEB online', 'H-E-B Gas', 'Heb Carwash'],
        'Amount': [100.00, 100.00, 100.00, 100.00, 100.00]}
df = pd.DataFrame(data)

# Variations to look for (the `in` check below is case-sensitive)
patterns = ['HEB', 'H-E-B', 'Heb']

for index, row in df.iterrows():
    for pattern in patterns:
        if pattern in row['Description']:
            df.at[index, 'Description'] = 'HEB'
            break  # one match is enough for this row

print(df)
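
If the DataFrame is large, a vectorized version may be preferable to looping with iterrows(). The sketch below is one possible way to do it, not the only one: it joins the variations into a single regular expression (escaping each one with re.escape so characters like '-' are treated literally) and passes that one string to str.contains(). The extra 'Walmart' row is a hypothetical addition, included only to show that non-matching rows are left alone.

```python
import re

import pandas as pd

# Sample DataFrame; the 'Walmart' row is a hypothetical non-matching example
df = pd.DataFrame({
    'Description': ['HEB', 'HEB#123', 'HEB online', 'H-E-B Gas',
                    'Heb Carwash', 'Walmart'],
    'Amount': [100.00] * 6,
})

patterns = ['HEB', 'H-E-B', 'Heb online']

# Build one alternation pattern; str.contains() takes a single string
regex = '|'.join(re.escape(p) for p in patterns)

# Boolean mask of rows whose description matches any variation, ignoring case
mask = df['Description'].str.contains(regex, case=False, regex=True)

# Overwrite only the matching rows
df.loc[mask, 'Description'] = 'HEB'

print(df)
```

Because case=False is used here, 'Heb Carwash' is matched by the 'HEB' alternative, so the list does not need a separate entry for each capitalization.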