I have this dataframe:
dfsupport = pd.DataFrame({'Date': ['8/12/2020','8/12/2020','13/1/2020','24/5/2020','31/10/2020','11/7/2020','11/7/2020'],
'Category': ['Table','Chair','Cushion','Table','Chair','Mats','Mats'],
'Sales': ['1 table','3chairs','8 cushions','3Tables','12 Chairs','12Mats','4Mats'],
'Paid': ['Yes','Yes','Yes','Yes','No','Yes','Yes',],
'Amount': ['93.78','$51.99','44.99','38.24','£29.99','29 21 only','18']
})
Which looks like this in table form:
Date Category Sales Paid Amount
0 8/12/2020 Table 1 table Yes 93.78
1 8/12/2020 Chair 3chairs Yes $51.99
2 13/1/2020 Cushion 8 cushions Yes 44.99
3 24/5/2020 Table 3Tables Yes 38.24
4 31/10/2020 Chair 12 Chairs No £29.99
5 11/7/2020 Mats 12Mats Yes 29 21 only
6 11/7/2020 Mats 4Mats Yes 18
I want to remove both the string elements in the above. I've learnt to successfully replace the $ and £ with:
patternv='|'.join(re.escape(x) for x in ['$', '£'])
dfsupport['Amount'] = dfsupport['Amount'].str.replace(patternv,regex=True)
I now want to replace the entry that has "29 21 only" in the Amount column. My attempt has been:
patterns="{r'(\d{1,})\s(\d{1,2})\D+' : r'\1 \2'}"
dfsupport['Amount']=dfsupport['Amount'].str.replace(patterns,regex=True)
However my attempt leads to the error:
Traceback (most recent call last):
File "/home/cloud/code/learning/howmany.py", line 160, in <module>
dfsupport['Amount'] = dfsupport['Amount'].str.replace(patternv,regex=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/cloud/.venv/lib/python3.12/site-packages/pandas/core/strings/accessor.py", line 136, in wrapper
return func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: StringMethods.replace() missing 1 required positional argument: 'repl'
How do I fix this?
I should add that I'm seeking to have the output as "29.21"
I followed the question here
You are missing the second (mandatory) parameter of str.replace
:
dfsupport['Amount'] = dfsupport['Amount'].str.replace(r'(\d{1,})\s(\d{1,2})\D+',
r'\1.\2', regex=True)
It also seems that you're trying to use a dictionary in patterns
, this is however not working the way you did it, you have to pass a single regex.
If you want to pass a dictionary, use replace
(without str.
):
patterns = {r'(\d{1,})\s(\d{1,2})\D+' : r'\1.\2'}
dfsupport['Amount'] = dfsupport['Amount'].replace(patterns,regex=True)
Output:
Date Category Sales Paid Amount
0 8/12/2020 Table 1 table Yes 93.78
1 8/12/2020 Chair 3chairs Yes $51.99
2 13/1/2020 Cushion 8 cushions Yes 44.99
3 24/5/2020 Table 3Tables Yes 38.24
4 31/10/2020 Chair 12 Chairs No £29.99
5 11/7/2020 Mats 12Mats Yes 29.21
6 11/7/2020 Mats 4Mats Yes 18