python mean-square-error

Comparing CSV file values with mean squared error


I have 2 CSV files, each consisting of exactly 1 column and 27 rows (numbers only). I want to compare the two files row by row using the mean squared error and print the result of every comparison, so that I can calculate the average mean squared error at the end. I am using pandas and sklearn. Any help is really appreciated. Thank you in advance.

import pandas as pd
from sklearn.metrics import mean_squared_error
cars = pd.read_csv('koula.csv')
moto = pd.read_csv('katerina.csv')
print(cars)
print(moto)
for i in range(cars):
    for j in range(moto):
       print(mean_squared_error(cars,moto))

Solution

  • If you want to calculate the error for one row at a time, you can do:

    for i in range(len(cars)):
        # .iloc[[i]] selects row i by position; cars[i] would look up a column named i
        print(mean_squared_error(cars.iloc[[i]], moto.iloc[[i]]))
    

    This works if your datasets have the same length.

    If, however, you want the error over all the rows in your datasets, just use:

    print(mean_squared_error(cars.values, moto.values))
    

    This gives the average of the per-row errors from the loop above in a single call, which is usually more useful.

    Lastly, if they are instances of the pd.Series class, you do not have to use .values.
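
To make the relationship between the per-row errors and the overall value concrete, here is a minimal sketch. It assumes the files have no header row, so it passes header=None to read_csv (otherwise pandas treats the first number as a column name and one row is lost). The in-memory strings and their values are hypothetical stand-ins for koula.csv and katerina.csv.

```python
import io

import pandas as pd
from sklearn.metrics import mean_squared_error

# Hypothetical stand-ins for koula.csv and katerina.csv:
# one numeric column each, no header row (an assumption).
cars_csv = io.StringIO("1.0\n2.0\n3.0\n")
moto_csv = io.StringIO("1.5\n2.0\n5.0\n")

# header=None keeps the first line as data; the single column is labelled 0.
cars = pd.read_csv(cars_csv, header=None)[0]
moto = pd.read_csv(moto_csv, header=None)[0]

# Squared error for every row, i.e. the MSE of each single pair of values.
row_errors = (cars - moto) ** 2
print(row_errors.tolist())

# Averaging the per-row errors matches one call on the whole columns.
average_error = row_errors.mean()
overall_mse = mean_squared_error(cars, moto)
print(average_error, overall_mse)
```

With real files, the two StringIO objects would be replaced by the file paths. Since both columns are Series here, no .values is needed.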