python steam steam-web-api

Program ends up using the same values regardless of different API calls


I want to fetch 500 game reviews with the Steam API. A single API call returns 20 reviews, so I loop 25 times to obtain 500. I also count the words in these reviews and write each word's frequency to a file.

The problem is that the frequencies in my file are all multiples of 25. That means the same 20 reviews are being fetched on every one of the 25 iterations, so each of them is counted 25 times.

Either something is wrong with the way I call the API, or my functions somehow keep the first API response for the whole run and reuse it even though new responses are fetched.
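One way to tell these two cases apart is to compare the review IDs of successive responses. Below is a minimal sketch, assuming each review object carries a "recommendationid" field and that `payloads` is a list of decoded `response.json()` bodies. The helper names `batch_ids` and `repeated_batches` are made up for illustration.

```python
def batch_ids(payload):
    """Return the set of review IDs in one decoded appreviews response."""
    # "recommendationid" is an assumed field name for the per-review ID.
    return {item["recommendationid"] for item in payload.get("reviews", [])}


def repeated_batches(payloads):
    """Return the indices of responses whose review IDs repeat an earlier one."""
    seen = []
    repeats = []
    for index, payload in enumerate(payloads):
        ids = batch_ids(payload)
        if ids in seen:
            repeats.append(index)
        else:
            seen.append(ids)
    return repeats
```

If every index after the first is reported, the API itself is returning the same batch, and the functions are not caching anything.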

The URLs I construct seem to be correct and give me the expected JSON when I paste them into my browser. The data I want is inside the "reviews" array of the response JSON. Each element has a "review" key whose value is the actual user review text.
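For reference, here is a minimal sketch of that shape and of pulling the review texts out of it. The sample review strings are made up, and a real response contains more fields than shown here.

```python
# Trimmed-down stand-in for a decoded response; the review texts are invented.
sample_payload = {
    "reviews": [
        {"review": "Great game"},
        {"review": "Too many bugs"},
    ]
}


def extract_reviews(payload):
    """Return the review texts from one decoded response."""
    return [item["review"] for item in payload["reviews"]]


texts = extract_reviews(sample_payload)
```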

import requests
import string

REVIEWS_COUNT = 500
REVIEWS_PER_REQUEST = 20

game_IDs = {
    "AAA": {
        "KINGDOM_COME_DELIVERANCE": "379430",
        "ASSASSINS_CREED_ODYSSEY": "812140",
    },

    "indie": {
        "RIMWORLD": "294100",
        "DUSK": "519860",
    }
}

def review_to_words(review):
    """
    input : a string "review"
    filters out anything else than pure words
    output: a string list of words in that review
    """
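    # str methods return new strings, so each result must be assigned back
    review = review.translate(str.maketrans('', '', string.punctuation))
    review = review.replace('☐', '', 1)
    review = review.lower()
    # split() with no argument splits on any whitespace (including '\n')
    # and drops empty strings
    return review.split()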

def response_to_words(response):
    """
    input : the Steam API response
    output: a string list of words in user reviews of the Steam API response
    """
    all_words = []
    jres = response.json()
    jarr = jres['reviews']
    for jelem in jarr:
        rev = jelem['review']
        for word in review_to_words(rev):
            all_words.append(word)
    return all_words

def main():
    for game_type in game_IDs:
        for game_name in game_IDs[game_type]:
            game_id = game_IDs[game_type][game_name]
            words_count = {}  # dictionary: word -> count
            for revs_idx in range(0, REVIEWS_COUNT, REVIEWS_PER_REQUEST):
                api_url = "https://store.steampowered.com/appreviews/" + game_id + "?json=1&start_offset=" + str(revs_idx)
                response = requests.get(api_url)
                words = response_to_words(response)
                for word in words:
                    if word not in words_count:
                        words_count[word] = 0
                    words_count[word] += 1
            with open('03_words_count.txt', 'w', encoding='utf-8') as f:
                f.write(str(words_count))

if __name__ == '__main__':
    main()

Solution

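  • Steam no longer uses start_offset for this endpoint, which would explain why every request in the loop returns the same 20 reviews. Paging is now done with cursor, which works differently: pass cursor=* on the first request, then pass the cursor value returned in each response (URL-encoded) to get the next batch.
    See also: Getting all reviews from a steam game using Steamworks?
    https://partner.steamgames.com/doc/store/getreviews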
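
The cursor-based paging could be sketched roughly as follows. This is not a definitive implementation: `fetch_page`, `collect_reviews` and the `get_page` parameter are names invented for this example, and it assumes each response carries a "cursor" key next to the "reviews" array.

```python
def fetch_page(app_id, cursor):
    """Fetch one batch of reviews and return the decoded JSON body."""
    import requests  # imported here so the paging loop has no hard dependency

    url = "https://store.steampowered.com/appreviews/" + app_id
    # Passing the cursor through params lets requests URL-encode it.
    params = {"json": 1, "cursor": cursor}
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()
    return response.json()


def collect_reviews(app_id, total, get_page=fetch_page):
    """Follow the cursor until `total` reviews are collected or none are left."""
    reviews = []
    cursor = "*"          # "*" asks for the first batch
    seen_cursors = set()  # guards against a cursor that never advances
    while len(reviews) < total and cursor not in seen_cursors:
        seen_cursors.add(cursor)
        page = get_page(app_id, cursor)
        batch = page.get("reviews", [])
        if not batch:
            break  # no more reviews available
        reviews.extend(item["review"] for item in batch)
        # Assumed: the response tells us where the next batch starts.
        cursor = page.get("cursor", cursor)
    return reviews[:total]
```

In the question's main(), the inner range() loop would then be replaced by something like `collect_reviews(game_id, REVIEWS_COUNT)`, with the returned texts fed to review_to_words. The loop stops early if a game has fewer reviews than requested.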