How to extract first 8 characters from a string in pandas


I have a column in a DataFrame and I am trying to extract the first 8 digits from each string. How can I do this?

Input

Shipment ID
20180504-S-20000
20180514-S-20537
20180514-S-20541
20180514-S-20644
20180514-S-20644
20180516-S-20009
20180516-S-20009
20180516-S-20009
20180516-S-20009

Expected Output

Order_Date
20180504
20180514
20180514
20180514
20180514
20180516
20180516
20180516
20180516

I tried the code below, but it didn't work.

data['Order_Date'] = data['Shipment ID'][:8]

Solution

  • You are close. You need to index through the str accessor, which applies the slice to each value of the Series:

    data['Order_Date'] = data['Shipment ID'].str[:8]
    

    If the column contains no NaN values, a list comprehension usually performs better:

    data['Order_Date'] = [x[:8] for x in data['Shipment ID']]
    

    print (data)
            Shipment ID Order_Date
    0  20180504-S-20000   20180504
    1  20180514-S-20537   20180514
    2  20180514-S-20541   20180514
    3  20180514-S-20644   20180514
    4  20180514-S-20644   20180514
    5  20180516-S-20009   20180516
    6  20180516-S-20009   20180516
    7  20180516-S-20009   20180516
    8  20180516-S-20009   20180516
    

    If you omit str, the slice filters the column by position and returns its first N values, for example:

    print (data['Shipment ID'][:2])
    0    20180504-S-20000
    1    20180514-S-20537
    Name: Shipment ID, dtype: object
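
The NaN caveat on the list comprehension can be illustrated with a minimal sketch. The sample frame below is hypothetical (two IDs from the question plus one missing value), and the isinstance guard is just one possible way to make the comprehension tolerate missing values, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two IDs from the question and one missing value.
data = pd.DataFrame(
    {"Shipment ID": ["20180504-S-20000", np.nan, "20180516-S-20009"]}
)

# .str[:8] slices each string and leaves the missing value as NaN.
with_str = data["Shipment ID"].str[:8]

# A plain [x[:8] for x in ...] would raise TypeError on the float NaN,
# so this variant slices only real strings and passes other values through.
with_comprehension = [
    x[:8] if isinstance(x, str) else x for x in data["Shipment ID"]
]

data["Order_Date"] = with_str
print(data)
```

Whether the guarded comprehension still beats .str[:8] depends on the data, so it is worth timing both on the real column.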