LightFM: Weights and Sample Weights


I'd like to understand how weights work in the LightFM implementation, specifically in the following two areas:

Sample Weights

  • What does the sample_weight argument of the fit method do? I read that it can be used to simulate time decay, but how does that work exactly? Examples or an article explaining this would be really helpful.

Interactions Matrix

  • What if I have user interactions with different content types (e.g. text and video) and don't want to differentiate between them when making a recommendation?
    • Do I have to build a separate model for each media type? If I create one model, does it matter that the interactions for text are boolean (1.0/0.0 for clicks) while the interactions for video are a completion percentage? For example, if a user watches 10 seconds of a 15-second video, can I assign a weight of 0.667?

Solution

  • Sample weights

    You can use sample_weight to set the importance of any one observation, in the same way you can pass sample_weight to an sklearn classifier.

    Weights larger than 1 will lend additional weight to that observation; weights smaller than 1 will make it less important to the model.

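    This is accomplished by scaling the learning rate for that observation by its weight. To simulate time decay, you would therefore give older interactions smaller weights than recent ones.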
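As a rough illustration of the time-decay idea, here is a minimal sketch that turns the age of each interaction into an exponentially decaying weight and passes it to fit. The helper names (time_decay_weights, fit_with_time_decay), the 30-day half-life, and the choice of the WARP loss are assumptions made for this example, not something prescribed by LightFM. The weights go in a sparse matrix whose row and column arrays match those of the interactions matrix.

```python
def time_decay_weights(ages_in_days, half_life_days=30.0):
    """Hypothetical helper: an interaction loses half its weight
    every `half_life_days` (exponential decay)."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


def fit_with_time_decay(user_ids, item_ids, ages_in_days, n_users, n_items):
    """Hypothetical helper: fit one LightFM model with decayed sample weights."""
    # Imported here so the weight helper above can be used on its own.
    import numpy as np
    from scipy.sparse import coo_matrix
    from lightfm import LightFM

    rows = np.asarray(user_ids, dtype=np.int32)
    cols = np.asarray(item_ids, dtype=np.int32)
    shape = (n_users, n_items)

    # Every observed interaction is recorded as a positive (1.0).
    ones = np.ones(len(rows), dtype=np.float32)
    interactions = coo_matrix((ones, (rows, cols)), shape=shape)

    # The weight matrix reuses the same row/col arrays as the interactions.
    decay = np.asarray(time_decay_weights(ages_in_days), dtype=np.float32)
    weights = coo_matrix((decay, (rows, cols)), shape=shape)

    model = LightFM(loss="warp")  # assumed loss; pick what suits your data
    model.fit(interactions, sample_weight=weights, epochs=10)
    return model


# Interactions that are 0, 30 and 60 days old.
example_weights = time_decay_weights([0, 30, 60])
```

With a 30-day half-life, an interaction from today keeps its full weight while one from 60 days ago contributes a quarter as much. The half-life is a tuning choice; a shorter one makes the model follow recent behaviour more closely.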

  • Interactions

    You don't have to create separate models: the two types of interactions can happily be embedded in the same model.

    In LightFM models, the data in the interactions matrix is binary. You should use the sample weights to express the confidence you have that a given interaction is positive. For video, this can be the fraction watched. Be aware, however, that if the fraction watched is routinely under 1.0, your model will put more weight on the text interactions.
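To make this concrete, the sketch below builds one set of weights from mixed text and video events: a text click gets weight 1.0 and a video view gets the fraction watched. The event tuple layout and the function names are assumptions for illustration. The optional rescaling step, which divides video weights by their mean so the two content types carry comparable average weight, is one possible way to address the imbalance mentioned above, not a LightFM feature. Only events that actually happened are included; a text item that was not clicked is simply left out.

```python
def build_weights(events):
    """events: list of (user_id, item_id, content_type, value) tuples.
    For "text" the value is ignored (a click counts fully);
    for "video" the value is the fraction watched, between 0 and 1."""
    rows, cols, types, weights = [], [], [], []
    for user_id, item_id, content_type, value in events:
        rows.append(user_id)
        cols.append(item_id)
        types.append(content_type)
        weights.append(1.0 if content_type == "text" else float(value))
    return rows, cols, types, weights


def rescale_video_weights(weights, types):
    """Divide video weights by their mean so they average 1.0,
    matching the weight of a text click."""
    video = [w for w, t in zip(weights, types) if t == "video"]
    if not video or sum(video) == 0:
        return list(weights)
    mean_video = sum(video) / len(video)
    return [w / mean_video if t == "video" else w
            for w, t in zip(weights, types)]


events = [
    (0, 0, "text", 1.0),       # user 0 clicked text item 0
    (0, 1, "video", 10 / 15),  # user 0 watched 10 s of a 15 s video
    (1, 1, "video", 0.5),      # user 1 watched half of the same video
]
rows, cols, types, weights = build_weights(events)
balanced = rescale_video_weights(weights, types)
```

The rows, cols and weights lists can then be placed in a scipy coo_matrix and passed as sample_weight, alongside an interactions matrix that has the same rows and columns and 1.0 for every entry. Whether to rescale depends on whether a partly watched video should really count for less than a text click in your application.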