Search code examples
pythonpandasdataframedata-extraction

How to extract particular values from a dataframe col and put them in another column?


I have a dataframe consisting of a column of strings. I want to extract the place, date, and race number from one column.

Dataframe:

- S.no.            FileName
- 0.     Albany17NOV19sectionalRace1.csv
- 1.     Albany22NOV19sectionalRace4.csv
- 2.     New York1NOV19sectionalRace7.csv
- 3.     Aquis Park Gold Coast27NOV19sectionalRace2.csv 

Desired dataframe:

- S.no.   Place                     Date     Racenumber
- 0.     Albany                    17NOV19    Race1
- 1.     Albany                    22NOV19    Race4
- 2.     New York                  1NOV19     Race7
- 3.     Aquis park Gold Coast     27NOV19    Race2

Solution

  • Split by;

    1. digit followed by Nondigit and digit that is say 17NOV19

    or

    1. sectional

    or

    3 special character .

    After split drop all rows that have None as values and any others not wanted. Can rename columns if you needed to

        df=df.FileName.str.split('(\d+\D+\d+)|(sectional)|(\.)', expand=True).dropna(1).drop(columns=[4,6,11,12])
    print(df)
            
    
              
    
                        0        1      8
    0                 Albany  17NOV19  Race1
    1                 Albany  22NOV19  Race4
    2               New York   1NOV19  Race7
    3  Aquis Park Gold Coast  27NOV19  Race2