python, python-3.x, pandas, cdf

How to replace numerical values from a CSV with categorical values using Python


My question is about replacing numerical values with strings in a CSV file using Python. The purpose is to calculate the CDF (cumulative distribution function).

The data set is named 'hsb'. Its class label column, 'status', has 304 rows of numerical data (1s and 2s). I want to replace 1 with 'positive' and 2 with 'negative'.


Solution

  • Consider the example below, which uses .map() to map the values through a dict.

    import pandas as pd

    # Small example column holding the numeric codes 1 and 2
    df = pd.DataFrame({
        'col': [1, 1, 2, 2, 2, 1]
    })
    

    Output:

       col
    0   1
    1   1
    2   2
    3   2
    4   2
    5   1
    

    Now map the codes to labels:

    # Values that are not keys in the dict become NaN
    df['col'] = df['col'].map({1: 'positive', 2: 'negative'})
    

    Output:

           col
    0   positive
    1   positive
    2   negative
    3   negative
    4   negative
    5   positive
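
Applied to the question's data, the same idea might look like the sketch below. It is only an illustration: the CSV content is a small in-memory stand-in for the real 'hsb' file (the name `csv_text` and the five sample rows are made up here), and it reads "CDF" as the cumulative relative frequency of the labels in alphabetical order, which may not be what the asker ultimately needs.

```python
import io

import pandas as pd

# Hypothetical stand-in for the 'hsb' CSV. With the real file you would
# pass its path to pd.read_csv instead (the file name is not given in the question).
csv_text = "status\n1\n2\n2\n1\n2\n"
hsb = pd.read_csv(io.StringIO(csv_text))

# Map the numeric codes to labels; a code missing from the dict becomes NaN.
labels = {1: "positive", 2: "negative"}
hsb["status"] = hsb["status"].map(labels)

# Fail early if any value was not covered by the mapping.
if hsb["status"].isna().any():
    raise ValueError("status contains codes other than 1 and 2")

# Relative frequency of each label, sorted by label, then the running total.
freq = hsb["status"].value_counts(normalize=True).sort_index()
cumulative = freq.cumsum()
print(cumulative)
```

The explicit NaN check is there because .map() silently turns unmapped codes into NaN, which would otherwise be dropped from the frequency counts without warning.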