I have a large CSV file, and I need to take one row of data at a time and score it against a model. I tried the code below but get the error "X has 120839 features per sample; expecting 30". I can run the model against the entire dataset, and it makes predictions on each row, but I need to do it one line at a time. Thank you.
loaded_model = joblib.load('LR_model.sav')
with open(r'fordTestA.csv', "r") as f:
    for line in f:
        line = f.readlines()[1:]  ## minus headers
        result = loaded_model.predict(line)
In this scenario, it doesn't seem to split the lines, as there is a \n after each row. I tried adding
line = line.rstrip('\n')
but this gives the error "'list' object has no attribute 'rstrip'". Thanks in advance for any feedback.
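For reference, here is a small sketch of what the loop in the question does. It uses a hypothetical in-memory file (`io.StringIO`) in place of fordTestA.csv, and the helper name `loop_once` is made up for illustration. The `for` statement consumes the header, `readlines()` then returns every remaining line as one list, and `[1:]` drops the first data row. So `line` ends up as a list of strings, which is why `rstrip` fails. It is presumably also why the model sees one sample with about 120839 "features".

```python
import io

def loop_once(f):
    """Mimic the loop from the question and return what `line` ends up holding."""
    line = None
    for line in f:                # the first iteration consumes the header line
        line = f.readlines()[1:]  # reads every remaining line, then drops the first one
    return line                   # the file is now exhausted, so the loop ran only once

# Hypothetical stand-in for fordTestA.csv: a header plus three data rows.
demo = io.StringIO("a,b\n1,2\n3,4\n5,6\n")
leftover = loop_once(demo)
print(type(leftover).__name__, leftover)
```

Here `leftover` is a list holding the last two data rows as raw strings, newlines included, rather than a single row.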
I'm not familiar with joblib or predict(), but:
import csv
# other code
with open(r'fordTestA.csv', 'r', newline='') as f:
    rows = csv.reader(f, delimiter=',')
    _ = next(rows)  # skip headers
    for row in rows:
        line = list(map(float, row))  # convert row of str to row of float
        # assumption: a scikit-learn style predict() wants 2-D input (a list of samples),
        # so the single row is wrapped in an outer list
        results = loaded_model.predict([line])
        # or if you need a ',' delimited string
        line = ','.join(row)
        results = loaded_model.predict(line)
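A self-contained sketch of the row-at-a-time approach, in case it helps to try it without the real model. The error text in the question looks like scikit-learn's, so this assumes `predict()` expects a 2-D input with one inner list per sample. `StubModel`, `score_rows`, and the sample data are hypothetical stand-ins, not part of joblib or scikit-learn.

```python
import csv
import io

class StubModel:
    """Hypothetical stand-in for the loaded model; checks the feature count per sample."""
    n_features = 3

    def predict(self, X):
        for sample in X:
            if len(sample) != self.n_features:
                raise ValueError(
                    f"X has {len(sample)} features per sample; expecting {self.n_features}"
                )
        # Toy rule: predict 1 when the features sum to a positive number, else 0.
        return [int(sum(sample) > 0) for sample in X]

def score_rows(f, model):
    """Yield one prediction per CSV data row, reading the file lazily."""
    rows = csv.reader(f)
    next(rows)                                  # skip the header row
    for row in rows:
        if not row:                             # ignore blank lines
            continue
        features = [float(value) for value in row]
        yield model.predict([features])[0]      # wrap the row so the input is 2-D

# Hypothetical data standing in for fordTestA.csv.
sample_file = io.StringIO("x1,x2,x3\n1,2,3\n-1,-2,-3\n")
predictions = list(score_rows(sample_file, StubModel()))
print(predictions)
```

With the real file, you would pass `open(r'fordTestA.csv', 'r', newline='')` and `loaded_model` instead. If the CSV has columns the model was not trained on (an ID or label column, say), those would need to be dropped from `features` first so that each row has the 30 values the model expects.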