python dataframe data-cleaning

How to correct misspelled values in a DataFrame with Python


I have a CSV file that contains misspelled values, and I want to correct them. For example, in the column named carCompany I want to replace Toyouta with toyota and maxda with mazda.

My task is to predict the car price using these independent variables.


Solution

  • DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')

    e.g.
    >>> import pandas as pd
    >>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
    ...                    'B': [5, 6, 7, 8, 9],
    ...                    'C': ['a', 'b', 'c', 'd', 'e']})
    >>> df.replace(0, 5)
       A  B  C
    0  5  5  a
    1  1  6  b
    2  2  7  c
    3  3  8  d
    4  4  9  e
    
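    For your data, note that replace returns a new object by default, so assign the result back. Passing a dict lets you fix several misspellings at once, and calling it on the column limits the change to carCompany:

    df['carCompany'] = df['carCompany'].replace({'Toyouta': 'toyota', 'maxda': 'mazda'})

    This should fix the misspelled values.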
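
A fuller sketch of the same idea, restricted to the carCompany column. The sample rows and the price column are made up for illustration and stand in for the CSV, which is not shown in the question. Only the column name carCompany and the two misspellings come from the question. The name `corrections` is likewise just an illustrative choice.

```python
import pandas as pd

# Hypothetical sample standing in for the CSV from the question;
# the rows and the price values are invented for illustration.
df = pd.DataFrame({
    "carCompany": ["Toyouta", "maxda", "toyota", "mazda", "honda"],
    "price": [13495, 10945, 15998, 11245, 7295],
})

# Map each misspelling to its corrected form.
corrections = {"Toyouta": "toyota", "maxda": "mazda"}

# replace returns a new Series, so assign it back to the column.
df["carCompany"] = df["carCompany"].replace(corrections)

# List the remaining distinct values to spot any misspellings left over.
print(sorted(df["carCompany"].unique()))
```

Printing the distinct values after the replacement is a quick way to check whether other misspellings remain, so they can be added to the mapping.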