amazon-web-services, amazon-personalize

AWS Personalize recommending mostly based on recent activity


I am using an AWS Personalize custom solution for user personalization. The recommendations I receive are heavily based on recent activity. Activity from a day earlier, even at the same frequency, is barely considered, and anything older is ignored completely. What can I do to make the recommendations draw on a broader history?

My solution configuration is the following:

Recipe: aws-user-personalization
Interactions: around 75 million
Items: around 2 million
Users: around 10 million

{
  "performExploration": true,
  "algorithmHyperParameters": {
    "bptt": "16",
    "hidden_dimension": "70",
    "recency_mask": "false"
  },
  "featureTransformationParameters": {
    "max_user_history_length_percentile": "0.95",
    "min_user_history_length_percentile": "0.05"
  }
}

My solution initially had recency_mask enabled, but the problem persisted even after I disabled it.


Solution

  • Try increasing bptt and hidden_dimension. Both values in your configuration are below the defaults for this recipe, and raising them lets the model draw on more of a user's interaction history when ranking items. Because both hyperparameters are HPO-tunable, another option is to create a solution with HPO enabled (performHPO=true) and let Personalize determine the best values for your data. HPO increases training time (and therefore cost), but you can copy the tuned values from that solution into a separate solution with HPO disabled and the values supplied explicitly. Retraining with the tuned values keeps subsequent training time and cost lower.

    You indicated that performExploration is set to true, but you didn't say which values you're using for explorationWeight and explorationItemAgeCutOff. With a high explorationWeight, cold items are emphasized in recommendations, which may be contributing to the results you are seeing.

    Lastly, your dataset has 2 million items, but Personalize currently limits the number of items included in the model to 750K (see the limits page). The service selects which items to include based on a balance of popularity and recency in the interactions, so this could also be a contributing factor.
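
The suggestions above could be wired up roughly as follows. This is a minimal sketch, not a tested recipe: the function names, the HPO search ranges, the training-job counts, and the exploration values are illustrative assumptions, and the field names follow my understanding of the Personalize CreateSolution and CreateCampaign request shapes, so check them against the current API reference. The functions only build request dictionaries and call nothing on AWS.

```python
from typing import Any, Dict

RECIPE_ARN = "arn:aws:personalize:::recipe/aws-user-personalization"

# Taken from the question's configuration; kept unchanged here.
FEATURE_PARAMS = {
    "max_user_history_length_percentile": "0.95",
    "min_user_history_length_percentile": "0.05",
}


def hpo_solution_request(name: str, dataset_group_arn: str) -> Dict[str, Any]:
    """Request for a solution where HPO searches bptt and hidden_dimension."""
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": RECIPE_ARN,
        "performHPO": True,
        "solutionConfig": {
            "hpoConfig": {
                # Assumed budget; more jobs means longer, costlier training.
                "hpoResourceConfig": {
                    "maxNumberOfTrainingJobs": "10",
                    "maxParallelTrainingJobs": "2",
                },
                "algorithmHyperParameterRanges": {
                    # Illustrative ranges starting at the question's values.
                    "integerHyperParameterRanges": [
                        {"name": "bptt", "minValue": 16, "maxValue": 32},
                        {"name": "hidden_dimension", "minValue": 70, "maxValue": 256},
                    ]
                },
            },
            "featureTransformationParameters": dict(FEATURE_PARAMS),
        },
    }


def fixed_solution_request(
    name: str, dataset_group_arn: str, tuned_params: Dict[str, str]
) -> Dict[str, Any]:
    """Request for a second solution that reuses tuned values with HPO off."""
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": RECIPE_ARN,
        "performHPO": False,
        "solutionConfig": {
            "algorithmHyperParameters": dict(tuned_params),
            "featureTransformationParameters": dict(FEATURE_PARAMS),
        },
    }


def campaign_request(
    name: str, solution_version_arn: str, exploration_weight: float, item_age_cutoff: int
) -> Dict[str, Any]:
    """Request for a campaign with explicit exploration settings."""
    if not 0.0 <= exploration_weight <= 1.0:
        raise ValueError("exploration_weight must be between 0 and 1")
    return {
        "name": name,
        "solutionVersionArn": solution_version_arn,
        "minProvisionedTPS": 1,
        "campaignConfig": {
            "itemExplorationConfig": {
                # A lower weight puts less emphasis on cold (new) items.
                "explorationWeight": str(exploration_weight),
                "explorationItemAgeCutOff": str(item_age_cutoff),
            }
        },
    }
```

With boto3, these dictionaries would be passed as keyword arguments, for example `boto3.client("personalize").create_solution(**hpo_solution_request(...))`. After the HPO solution version finishes training, the tuned values should be available from `describe_solution_version` (under `tunedHPOParams`, as far as I know) and can be fed into `fixed_solution_request`.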