Proper use of comments


For Python code, PEP 257 provides a convention for using docstrings to document structural entities: packages, modules, functions, classes, and methods.

This covers pretty much everything. Stack Overflow questions about how to comment Python code invariably elicit answers saying to use docstrings.

Where does this leave comments? Is the Pythonic approach to use docstrings exclusively and never use comments? Or do they have some place?

What is the proper use of comments in Python code?


Solution

  • Docstrings = Information for those using your function
    Comments = Explanation of why the code is written the way it is

    See the requests library for a great example of a project that uses comments appropriately.

    When to use Docstrings

    Good docstrings provide the same type of information you typically see when reading through the Python documentation. They explain what a function does, describe its parameters, and state what, if anything, is returned. Docstrings should also mention any side effects of calling the function.
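
    As a rough sketch of a docstring that covers those four points (purpose, parameters, return value, side effects), consider the function below. The name pop_expired and its cache layout are hypothetical, invented only for illustration:

```python
def pop_expired(cache, now):
    """Remove and return the cache entries that have expired.

    Args:
        cache: A dict mapping keys to (value, expiry_time) tuples.
        now: The current time, in the same units as expiry_time.

    Returns:
        A dict mapping each removed key to its value.

    Side effects:
        Expired entries are deleted from cache in place.
    """
    expired = {key: value for key, (value, expiry) in cache.items() if expiry <= now}
    for key in expired:
        del cache[key]
    return expired


# Hypothetical usage: entry "a" expired at time 10, so it is removed at time 15.
cache = {"a": (1, 10), "b": (2, 20)}
removed = pop_expired(cache, now=15)
print(removed)  # {'a': 1}
print(cache)    # {'b': (2, 20)}
```

    A caller can use the function correctly from the docstring alone, including the fact that the dictionary they pass in will be modified.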

    One way to think about docstrings is to ask what information you would want to appear if the function were shown in online documentation. Tools such as Sphinx can generate documentation automatically from docstrings.
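
    Documentation tools can do this because a docstring is stored on the object itself, whereas comments are discarded when the code is parsed. A minimal sketch using a hypothetical area function:

```python
import inspect


def area(width, height):
    """Return the area of a rectangle with the given width and height."""
    # Plain multiplication; no unit conversion is attempted.
    return width * height


# The docstring is available at run time; the comment above is not part of it.
print(area.__doc__)
print(inspect.getdoc(area))  # same text, with indentation cleaned up
```

    Calling help(area) in an interactive session shows the same docstring text.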

    When to use Comments

    Comments, on the other hand, explain confusing pieces of code. Their purpose is to help someone who is fixing bugs or otherwise changing your code understand what it is doing. Use them to explain lines of code that are not self-explanatory on their own.
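
    For instance, a comment can record why a line is written in a way that might otherwise look like a mistake to the next maintainer. This is only a sketch; same_price and its half-cent tolerance are hypothetical:

```python
import math


def same_price(a, b):
    """Return True if two prices differ by no more than about half a cent."""
    # Compare with a tolerance instead of ==: prices arrive as floats, and
    # 0.1 + 0.2 == 0.3 is False because of binary floating-point rounding.
    return math.isclose(a, b, abs_tol=0.005)


print(same_price(0.1 + 0.2, 0.3))  # True
print(same_price(1.00, 1.02))      # False
```

    Without the comment, someone fixing a bug might be tempted to "simplify" the comparison back to a == b.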

    Example

    Take the shuffling algorithm below as an example. Notice that the comments focus on explaining how the algorithm works and why, not on what each line of code does. We already know how to read code, so comments that restate it add nothing; the reasoning behind it is what helps anyone looking at the code later. The docstring, on the other hand, provides everything that someone who needs to use the shuffle function would want to know. They don't care about the internals of the algorithm; they care about the inputs and outputs of the shuffle function.

    import random

    def shuffle(artist_song_dict):
        """Shuffle songs semi-randomly while keeping songs by the same artist spread out.

        The approach is described in
        https://labs.spotify.com/2014/02/28/how-to-shuffle-songs/

        artist_song_dict must be a dictionary whose keys are artist names and
        whose values are iterables of each artist's songs.

        Returns a list of the shuffled songs.
        """
        # Maps each song to a value of roughly 0 to 1 representing its position in the lineup
        lineup = {}
        variation = .3
        for artist in artist_song_dict:
            # Copy into a list so that any iterable is accepted and the caller's data is not reordered
            songs = list(artist_song_dict[artist])
            if not songs:
                continue  # An artist with no songs would cause a division by zero below
            random.shuffle(songs)

            # Distance between songs in the lineup, if we were to space the songs out evenly
            spread = 1 / len(songs)

            # This is referred to as the offset in the article, but I found it has a different purpose than what the article says.
            # Without this random variation, the number of songs an artist has in the lineup affects the probability that their songs
            # will appear first (or sooner/later) in the lineup versus other artists
            artist_variation = random.uniform(0, spread - variation)

            for i, song in enumerate(songs):
                # We want to add some randomization to the lineup, but not too much;
                # otherwise, songs by the same artist could be played back to back.
                # The article recommends adding a 30% variation to each song
                song_variation = random.uniform(0, spread * variation)

                # Assign this song the next evenly spaced spot in the lineup plus our variations
                lineup[song] = i * spread + artist_variation + song_variation

        return sorted(lineup, key=lineup.get)
    



    Inline Comments vs Block Comments

    Inline comments look like this:

    x = x + 1                 # Compensate for border
    

    Block comments look like this:

    # Compensate for border.  These comments
    # often cover multiple lines.
    x = x + 1
    

    Both are valid forms of commenting; I mention them only to point out that there are two. PEP 8 specifically says to use inline comments sparingly. I believe this is aimed at the improper practice of using comments to explain how every single line of code works. You see this often in tutorials and on SO, but in practice you shouldn't comment code that is self-explanatory.
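
    To illustrate the difference, here is a sketch of the same calculation written twice, once with a comment on every line and once with a single comment that says something the code does not. Both functions and the invoice rule mentioned in the second one are hypothetical:

```python
def total_with_tax_overcommented(prices, rate):
    total = 0                  # Set total to zero
    for price in prices:       # Loop over the prices
        total += price         # Add the price to the total
    return total * (1 + rate)  # Return the total with tax


def total_with_tax(prices, rate):
    # Tax is applied once to the sum rather than per item, to match how the
    # (hypothetical) invoice is computed.
    return sum(prices) * (1 + rate)


print(total_with_tax([10, 20], 0.5))  # 45.0
```

    The inline comments in the first version only restate the code, so they add reading effort without adding information.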