I was able to output my predicted numerical values to an Excel file, but I was wondering whether it is possible to export the actual string instead of the numerical value.
Currently it looks like this:
Column 1 | Answer Key | Predicted |
---|---|---|
Something Something | Cars | 3 |
Instead of returning 3, I would like the 3 to be replaced with the actual string it is associated with (for example, Truck).
Here is my code so far. I know I have to change the exporting part of the code, but I cannot figure it out.
texts = df['without_Tags'].astype('str')
vector = TfidfVectorizer(ngram_range=(1, 2), min_df = 2, max_df = .95)
X = vector.fit_transform(texts) #features
LE = LabelEncoder()
df['tower_values'] = LE.fit_transform(df['Tower'])
y = df['tower_values'].values
print(X.shape)
print(y.shape)
lsa = TruncatedSVD (n_components=100, n_iter=10, random_state=3)
X = lsa.fit_transform(X)
print(X.shape)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = .3, shuffle = True, stratify = y, random_state = 3)
SG = SGDClassifier(random_state=3, loss='log')
SG.fit(X_train, y_train)
y_pred = SG.predict(X_test)
print("SG model accuracy:", accuracy_score(y_test, y_pred))
print("SG model Recall:", recall_score(y_test, y_pred, average="macro"))
print("SG model Precision:", precision_score(y_test, y_pred, average="macro"))
print("SG model F1 Score:", f1_score(y_test, y_pred, average="macro"))
y_pred = pd.DataFrame(y_pred, columns=['predictions']).to_csv('prediction.csv')
final = pd.read_csv('prediction.csv')
final['pre'] = y_pred
df.to_csv('prediction.csv')
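Try using the inverse_transform method of the fitted LabelEncoder (LE in your code). It maps the encoded integers back to the original strings. Call it after the metrics are computed and before the export, because y_pred is later overwritten with the return value of to_csv, which is None when a file path is given: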
y_pred = LE.inverse_transform(y_pred)
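To illustrate the round trip, here is a minimal, self-contained sketch. The label values, the hard-coded predictions, and the output column name are made up for the example; in your code the labels would come from df['Tower'] and the predictions from SG.predict(X_test).

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels standing in for df['Tower']
labels = pd.Series(["Cars", "Truck", "Bike", "Truck", "Cars"])

LE = LabelEncoder()
encoded = LE.fit_transform(labels)  # classes are sorted: Bike=0, Cars=1, Truck=2

# Hypothetical predictions standing in for SG.predict(X_test)
y_pred = np.array([2, 1, 0])

# Map the integer codes back to the original strings
decoded = LE.inverse_transform(y_pred)

# Build the export table from the decoded strings.
# Without a path, to_csv returns the CSV text; with a path
# (e.g. 'prediction.csv') it writes the file and returns None.
out = pd.DataFrame({"Predicted": decoded})
csv_text = out.to_csv(index=False)
print(csv_text)
```

Keeping the numeric y_pred for the metric calls and decoding only for the export avoids comparing strings against the integer y_test.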