I have a list of strings, given below, from which I want to extract only the numbers, and then I want to create a column based on the output.
['CGST- INPUT 9% MAHARASHTRA',
'SGST-INPUT 9% MAHARASHTRA',
'CGST INPUT @6% MAHARASHTRA',
'SGST INPUT @6% MAHARASHTRA',
'CGST- INPUT 2.50% MAHARASHTRA',
'SGST-INPUT 2.50% MAHARASHTRA',
'TDS ON OFFICE RENT',
'TDS ON CONTRACTOR',
'TDS ON CONSULTANTS',
'TDS ON OFFICE RENT (COMPANY)',
'TDS ON CONSULTANY FEE']
The output should look like this:
Rate CGST SGST TDS
9 XX XX XX
6 XX XX XX
2.50 XX XX XX
I have a few columns in a DataFrame, which I have converted to the list above. Each column holds values that I want to sum and show separately, according to the rate mentioned in each list item.
A regular expression that will identify numbers in a string (including those with decimal fractions) is:
r'[-+]?[0-9]*\.?[0-9]+'
For example:
import re
mystring = 'abc50def6.75ghi'
pattern = r'[-+]?[0-9]*\.?[0-9]+'
print(list(map(float, re.findall(pattern, mystring))))
Output:
[50.0, 6.75]
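Applied to the labels from the question, the same pattern can pull out one rate per string. This is only a sketch: the helper name extract_rate is my own, and it assumes the first number in a label is the rate.

```python
import re

# A few of the labels from the question.
labels = [
    'CGST- INPUT 9% MAHARASHTRA',
    'SGST-INPUT 9% MAHARASHTRA',
    'CGST INPUT @6% MAHARASHTRA',
    'SGST INPUT @6% MAHARASHTRA',
    'CGST- INPUT 2.50% MAHARASHTRA',
    'SGST-INPUT 2.50% MAHARASHTRA',
    'TDS ON OFFICE RENT',
]

pattern = r'[-+]?[0-9]*\.?[0-9]+'

def extract_rate(label):
    """Return the first number in the label as a float, or None if there is none.

    Hypothetical helper: assumes the first number found is the rate.
    """
    match = re.search(pattern, label)
    return float(match.group()) if match else None

rates = [extract_rate(label) for label in labels]
print(rates)
```

Labels without a number, such as the TDS ones, come back as None, so they need to be handled separately.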
Having extracted your numbers, you can then use these values to build your DataFrame.
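One possible way to do that is sketched below. It assumes the labels are the column names of your DataFrame and that each column holds numeric amounts; the amounts here are made up for illustration. It also assumes the tax type is the leading run of capital letters in each label. The TDS labels carry no rate, and the question does not say how they should be split by rate, so this sketch totals them separately.

```python
import re
import pandas as pd

# Hypothetical data: column names are the labels, values are made-up amounts.
df = pd.DataFrame({
    'CGST- INPUT 9% MAHARASHTRA': [100.0, 50.0],
    'SGST-INPUT 9% MAHARASHTRA': [100.0, 50.0],
    'CGST INPUT @6% MAHARASHTRA': [30.0, 0.0],
    'SGST INPUT @6% MAHARASHTRA': [30.0, 0.0],
    'TDS ON OFFICE RENT': [10.0, 5.0],
})

pattern = r'[-+]?[0-9]*\.?[0-9]+'

# One record per column: its rate, its tax type, and the column total.
records = []
for label in df.columns:
    match = re.search(pattern, label)
    records.append({
        'Rate': float(match.group()) if match else float('nan'),
        'Tax': re.match(r'[A-Z]+', label).group(),  # assumed: leading capitals
        'Total': df[label].sum(),
    })
long = pd.DataFrame(records)

# Columns that carry a rate: one row per rate, one column per tax type.
rated = long.dropna(subset=['Rate'])
summary = rated.groupby(['Rate', 'Tax'])['Total'].sum().unstack('Tax')

# Columns without a rate (TDS here) are totalled on their own.
tds_total = long.loc[long['Rate'].isna(), 'Total'].sum()

print(summary)
print('TDS total:', tds_total)
```

The summary has the rates as its index and CGST and SGST as columns; sort or reorder the index if you need the rows in a particular order.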