I understand the retrieval task and have gone through the code; I have also looked into alternative approaches such as ScaNN, which offers very fast approximate nearest-neighbor search.
However, I still have a hard time understanding the mechanism of the following code:
# Create a model that takes in raw query features, and
index = tfrs.layers.factorized_top_k.BruteForce(model.user_model)
# recommends movies out of the entire movies dataset.
index.index_from_dataset(
    tf.data.Dataset.zip((movies.batch(100), movies.batch(100).map(model.movie_model)))
)
# Get recommendations.
_, titles = index(tf.constant(["42"]))
print(f"Recommendations for user 42: {titles[0, :3]}")
model.user_model is trained and by now should return an embedding for a given user_id. The input to the BruteForce layer is model.user_model, and then it should be indexed?
I guess that, given user_id 42, the output is 3 titles out of movies.batch(100), but I can't understand what BruteForce and the indexing actually do!
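The BruteForce layer is an exhaustive search: it scores the query embedding produced by model.user_model against every candidate embedding that was indexed (here, the outputs of model.movie_model for all movies), with no approximation.
According to the TensorFlow Recommenders documentation for the layer, calling it returns the top k candidates (k is 10 by default) for each query, as a tuple of their scores and their identifiers. In the code above the identifiers are the movie titles passed to index_from_dataset, which is why titles[0, :3] gives the three best-scoring titles for user 42.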
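
To make the mechanism concrete, here is a minimal pure-Python sketch of exhaustive top-k retrieval. It is not the TFRS implementation: ToyBruteForce, index_from_pairs, and the toy embeddings are hypothetical names and made-up values, and I am assuming the dot product as the similarity score. It only mirrors the overall shape of the code in the question (wrap a query model, index identifier/embedding pairs, call with a raw query).

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, assumed here as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


class ToyBruteForce:
    """Hypothetical stand-in for a brute-force top-k retrieval layer."""

    def __init__(self, query_model: Callable[[str], Vector], k: int = 10):
        # query_model maps a raw query (e.g. a user id) to an embedding.
        self.query_model = query_model
        self.k = k
        self.identifiers: List[str] = []
        self.embeddings: List[Vector] = []

    def index_from_pairs(self, candidates: Sequence[Tuple[str, Vector]]) -> None:
        """Store one (identifier, embedding) pair per candidate."""
        self.identifiers = [identifier for identifier, _ in candidates]
        self.embeddings = [embedding for _, embedding in candidates]

    def __call__(self, query: str) -> Tuple[List[float], List[str]]:
        """Score the query against every candidate and keep the k best."""
        q = self.query_model(query)
        scored = [
            (dot(q, embedding), identifier)
            for embedding, identifier in zip(self.embeddings, self.identifiers)
        ]
        # Highest score first; this is the "brute force" part: no shortcuts.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        top = scored[: self.k]
        return [score for score, _ in top], [identifier for _, identifier in top]


# Made-up 2-D embeddings standing in for trained user and movie models.
user_embeddings = {"42": [1.0, 0.0]}
movie_embeddings = [
    ("Movie A", [0.9, 0.1]),
    ("Movie B", [0.0, 1.0]),
    ("Movie C", [0.5, 0.5]),
    ("Movie D", [-1.0, 0.0]),
]

toy_index = ToyBruteForce(query_model=user_embeddings.__getitem__, k=3)
toy_index.index_from_pairs(movie_embeddings)

scores, titles = toy_index("42")
print(titles)  # ['Movie A', 'Movie C', 'Movie B']
```

The indexing step does no learning; it just stores each movie's identifier next to its embedding so that the top-scoring rows can be mapped back to titles. Every query is compared with every stored candidate, so the cost grows linearly with the number of movies, which is the reason approximate methods such as ScaNN exist for large catalogs.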