The code works absolutely fine for the data set containing 500,000+ instances, but whenever I reduce the data set to 5,000/10,000/15,000 it throws a KeyError: word "***" not in vocabulary. It doesn't throw the error for every data point, but it does for most of them. The data set is in Excel format.

[1]: https://i.sstatic.net/YCBiQ.png

I don't know how to fix this problem since I have very little knowledge about it; I am still learning. Please help me fix this problem!
purchases_train = []
for i in tqdm(customers_train):
    temp = train_df[train_df["CustomerID"] == i]["StockCode"].tolist()
    purchases_train.append(temp)

purchases_val = []
for i in tqdm(validation_df['CustomerID'].unique()):
    temp = validation_df[validation_df["CustomerID"] == i]["StockCode"].tolist()
    purchases_val.append(temp)
model = Word2Vec(window=10, sg=1, hs=0,
                 negative=10,  # for negative sampling
                 alpha=0.03, min_alpha=0.0007,
                 seed=14)

model.build_vocab(purchases_train, progress_per=200)
model.train(purchases_train, total_examples=model.corpus_count,
            epochs=10, report_delay=1)
model.save("word2vec_2.model")
model.init_sims(replace=True)

# extract all vectors
X = model[model.wv.vocab]
X.shape
products = train_df[["StockCode", "Description"]]
products.drop_duplicates(inplace=True, subset='StockCode', keep="last")
products_dict = products.groupby('StockCode')['Description'].apply(list).to_dict()
def similar_products(v, n=6):
    ms = model.similar_by_vector(v, topn=n+1)[1:]
    new_ms = []
    for j in ms:
        pair = (products_dict[j[0]][0], j[1])
        new_ms.append(pair)
    return new_ms
similar_products(model['21883'])
If you get a KeyError saying a word is not in the vocabulary, that's a reliable indicator that the word you're looking up was not in the training data fed to Word2Vec, or did not appear enough times (the default is min_count=5).

So, your error indicates the word-token '21883' did not appear at least 5 times in the texts (purchases_train) supplied to Word2Vec. You should do either or both of:
- Ensure all words you're going to look up appear enough times, either with more training data or a lower min_count (see the first sketch after this list). However, words with only one or a few occurrences tend not to get good vectors and instead just drag the quality of surrounding words' vectors down, so keeping this value above 1, or even raising it above the default of 5 to discard more rare words, is the better path whenever you have sufficient data.
- If your later code will be looking up words that might not be present, either check for their presence first (word in model.wv.vocab) or set up a try: ... except: ... block to catch and handle the case where they're not present (see the second sketch after this list).
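
For the first option, here is a minimal sketch of the same model setup as in the question, with min_count passed explicitly. min_count=1 keeps every StockCode in the vocabulary (so the lookup won't raise a KeyError), at the cost of lower-quality vectors for rare codes; it assumes the purchases_train list built above:

from gensim.models import Word2Vec

# Same hyperparameters as in the question, plus an explicit min_count.
# min_count=1 retains every token; raise it (back to 5 or higher) once you
# have enough data, to discard rare StockCodes instead.
model = Word2Vec(window=10, sg=1, hs=0,
                 negative=10,
                 alpha=0.03, min_alpha=0.0007,
                 min_count=1,          # default is 5
                 seed=14)
model.build_vocab(purchases_train, progress_per=200)
model.train(purchases_train, total_examples=model.corpus_count,
            epochs=10, report_delay=1)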
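
For the second option, a minimal sketch assuming the model and the similar_products() function defined in the question, and the gensim 3.x model.wv.vocab attribute that the question's code uses (in gensim 4.x the membership test would be against model.wv.key_to_index instead):

# Hypothetical guarded wrapper: only look up StockCodes present in the vocabulary.
def similar_products_safe(stock_code, n=6):
    if stock_code not in model.wv.vocab:   # gensim >= 4.0: model.wv.key_to_index
        print(f"StockCode {stock_code!r} is not in the vocabulary "
              "(fewer than min_count occurrences in the training data)")
        return []
    return similar_products(model.wv[stock_code], n=n)

# Equivalent using try/except instead of a membership test:
try:
    recommendations = similar_products(model.wv['21883'])
except KeyError:
    recommendations = []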