Search code examples
python-3.xpandasnumpypandas-datareader

How to extract only the numerical part of the columns in this table? (Python)


I am editing a CSV file in Python, I have deleted some columns, created an index and filtered. But I have not been able to extract the numerical part of the columas. How can I extract only the numerical information from the columns?

Extract only the numeric part of the column data. example:

MarketTime = 11: 18: 26.549

The whole column should be like this:

11: 18: 26,549

import pandas as pd

df = pd.read_csv('C:/Users/TECNOLOGIA/datos.csv',names=['LocalTime', 'Message', 'MarketTime', 'Symbol', 'Type', 'Price', 'Size', 'Source','Condition','Tick','Mmid','SubMarketId','Date'], usecols=['Type','MarketTime','Price'],index_col='Type') df=(df.loc['Type=0']) print (df)


Solution

  • Adapting the Regex/Pandas StringMethods answer given at pandas applying regex to replace values, you will have something like this:

    import pandas as pd
    
    df = pd.DataFrame(['MarketTime=11:18:28.792','MarketTime=11:18:28.792'], columns=['MarketTime'])
    
    df['MarketTime'] = df['MarketTime'].str.extract(r'([\d:,.]+)')
    print(df)