I am trying to save a one-hot encoder from Keras so that I can reuse it on different texts while keeping the same encoding.
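Here is my code:
import pandas as pd
from keras.preprocessing.text import one_hot
df = pd.read_csv('dataset.csv')
vocab_size = 200000
# Hash each word of every document to an integer in [1, vocab_size)
encoded_docs = [one_hot(d, vocab_size) for d in df.text]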
How can I save this encoder and use it again later?
I found this in my research, but one_hot() seems to be a function and not an object (sorry if this is plain wrong; I am fairly new to Python).
Posting the answer here (although it already appears in the comments) for the benefit of the community.
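To save the encoder, you can use the code below:
import pickle
from keras.preprocessing.text import one_hot
# Serialize the encoder function to a file named "encoder"
with open("encoder", "wb") as f:
    pickle.dump(one_hot, f)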
Then, to load the saved encoder, use the code below:
# Reopen the file in binary read mode before unpickling
with open("encoder", "rb") as f:
    encoder = pickle.load(f)
encoded_docs = [encoder(d, vocab_size) for d in df.text]
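One caveat worth knowing, sketched here with a standard-library function standing in for one_hot (that substitution is an assumption, made only so the snippet runs without Keras): pickle stores a plain module-level function by reference, that is, by its module and name, not by its code or any internal state. The saved file therefore only lets you look the same function up again later.

```python
import json
import pickle

# json.dumps is a stand-in for one_hot (an assumption, so no Keras is needed).
payload = pickle.dumps(json.dumps)

# The payload is small: it holds the module name and the function name only.
print(len(payload), payload)

# Unpickling looks the function up by name and returns the very same object.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

This suggests that the pickle file alone does not pin down the encoding; how the function hashes words at run time still matters.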
Since the function one_hot (from keras.preprocessing.text import one_hot) uses Python's built-in hash() to generate quasi-unique encodings, and string hashes are randomized per interpreter process by default, we need to fix the hash seed to reproduce our results (getting the same encoding even after multiple executions).
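The effect described above can be checked with a small sketch. It assumes the seed in question is Python's PYTHONHASHSEED environment variable, and it uses a hypothetical helper, string_hash, that starts a fresh interpreter with a given seed and reports the hash of a fixed string. It is an illustration only, not part of Keras.

```python
import os
import subprocess
import sys


def string_hash(seed):
    """Return hash('some text') as computed by a fresh interpreter.

    `seed` is passed to the child process as PYTHONHASHSEED.
    (Hypothetical helper, for illustration only.)
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('some text'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Two separate processes started with the same seed agree on the hash,
# so a hash-based encoding is reproducible across runs.
same_seed_a = string_hash(0)
same_seed_b = string_hash(0)
print(same_seed_a == same_seed_b)

# A different seed generally yields a different hash for the same string.
other_seed = string_hash(1)
print(same_seed_a == other_seed)
```

Note that PYTHONHASHSEED has to be set before the interpreter starts; assigning os.environ["PYTHONHASHSEED"] inside an already running script does not change the hashes of that process. If your Keras version provides it, hashing_trick with hash_function='md5' may be an alternative that does not depend on the seed at all.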
Run the code below in the terminal to set the hash seed: