python matplotlib scatter-plot bubble-chart

Scatter or bubble plot with point size based on repeated values


I have a set of sequences of numbers stored in a 2D list. Each element of the list is a sublist of varying length containing numbers in the range 1-10, like this:

Lst = [[1,3,4,4,4,5],[2,7,2,3],[6,5,4,2,4],[2,4,5,7,5,4,2],[4,9,4,1,4,5,4]…]

Is there a way to draw this data as a scatter or bubble plot with matplotlib so that repeated values stand out? Each sublist occupies one position on the X-axis, and all of its values are plotted at the corresponding Y-axis positions. The more often a value is repeated within a sublist, the larger or darker the drawn point should be.

I already know how to draw a scatter plot with matplotlib, but I don't know how to plot each sublist of a 2D list along its own vertical line, one by one.

Thank you.



Solution

  • You can just plot each sublist in a for loop:

    import matplotlib.pyplot as plt
    from collections import Counter
    import numpy as np

    Lst = [[1,3,4,4,4,5],[2,7,2,3],[6,5,4,2,4],[2,4,5,7,5,4,2],[4,9,4,1,4,5,4]]
    plt.figure()
    for i, j in enumerate(Lst):
        # Counter gives (value, count) pairs; zip(*...) splits them into two tuples
        values, counts = zip(*Counter(j).items())
        # x is the 1-based sublist position; marker area grows with the count
        plt.scatter(i*np.ones(len(values))+1, values, s=np.array(counts)*50)
    plt.show()
    

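The loop above relies on `Counter` plus a `zip(*...)` transpose, which can be hard to read at first. As a small illustration (using only the first sublist from the question), this is what that step produces on its own:

```python
from collections import Counter

sublist = [1, 3, 4, 4, 4, 5]

# Counter maps each distinct value to the number of times it appears.
pairs = Counter(sublist)

# zip(*items) transposes the (value, count) pairs into two parallel tuples.
values, repeats = zip(*pairs.items())

print(values)   # (1, 3, 4, 5)
print(repeats)  # (1, 1, 3, 1)
```

The first tuple supplies the y coordinates and the second one drives the marker sizes.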

    Edit: To fulfil the request that points also become darker, this version uses the answer from "Darken or lighten a color in matplotlib":

    import matplotlib.pyplot as plt
    from collections import Counter
    import numpy as np
    
    def lighten_color(color, amount=0.5):
        """
        Lightens the given color by multiplying (1-luminosity) by the given amount.
        An amount greater than 1 darkens the color instead.
        Input can be matplotlib color string, hex string, or RGB tuple.
    
        Examples:
        >> lighten_color('g', 0.3)
        >> lighten_color('#F034A3', 0.6)
        >> lighten_color((.3,.55,.1), 0.5)
        """
        import matplotlib.colors as mc
        import colorsys
        try:
            c = mc.cnames[color]
        except (KeyError, TypeError):
            # not a named CSS color (or unhashable, e.g. a list): use the input as-is
            c = color
        c = colorsys.rgb_to_hls(*mc.to_rgb(c))
        return colorsys.hls_to_rgb(c[0], 1 - amount * (1 - c[1]), c[2])
    
    Lst = [[1,3,4,4,4,5],[2,7,2,3],[6,5,4,2,4],[2,4,5,7,5,4,2],[4,9,4,1,4,5,4]]
    # for each sublist: a tuple of distinct values and a tuple of their counts
    values, counts = zip(*[zip(*Counter(j).items()) for j in Lst])
    maximum = max(max(i) for i in counts)  # largest count over all sublists

    plt.figure()
    for i, (j, k) in enumerate(zip(values, counts)):
        # amount > 1 darkens: the most frequent value gets amount 2, which is black for 'b'
        plt.scatter(i*np.ones(len(j))+1, j, s=np.array(k)*50, color=[lighten_color('b', 2*m/maximum) for m in k])
    plt.show()
    

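As a possible alternative to shading colors by hand, the counts can be passed to a colormap so that a single `scatter` call handles both size and darkness. This is only a sketch, not part of the original answer: the helper names `bubble_points` and `plot_bubbles`, the `scale` factor, and the choice of the `"Blues"` colormap are all assumptions made for illustration.

```python
from collections import Counter


def bubble_points(lst):
    """Flatten a list of sublists into parallel x, y, count lists (hypothetical helper)."""
    xs, ys, counts = [], [], []
    for position, sublist in enumerate(lst, start=1):
        for value, count in Counter(sublist).items():
            xs.append(position)   # sublist position on the x-axis, starting at 1
            ys.append(value)      # distinct value on the y-axis
            counts.append(count)  # number of repeats, used for size and color
    return xs, ys, counts


def plot_bubbles(lst, scale=50):
    """Draw all points with one scatter call; `scale` is an arbitrary size factor."""
    # Imported lazily so bubble_points only needs the standard library.
    import matplotlib.pyplot as plt

    xs, ys, counts = bubble_points(lst)
    fig, ax = plt.subplots()
    # c=counts with a sequential colormap makes frequent values darker;
    # vmin=0 keeps points that occur once from being almost white.
    points = ax.scatter(xs, ys, s=[c * scale for c in counts],
                        c=counts, cmap="Blues", vmin=0, edgecolors="gray")
    fig.colorbar(points, ax=ax, label="occurrences")
    return fig, ax
```

Calling `plot_bubbles(Lst)` and then `matplotlib.pyplot.show()` should give a figure similar to the loop version, with a colorbar that maps darkness back to the number of occurrences.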