python pandas matplotlib jupyter-notebook python-itertools

Creating a scatter plot where markers and colors cycle so that the color changes after one full cycle of markers is complete


I am trying to make a scatter plot that shows a different marker and a different color for each index entry (company) in a data frame. There are so many entries, many of them close together, that they are hard to tell apart. In my current code the color and the marker both advance on every loop iteration, so the same marker/color combination repeats from PSPPROJECT onward (example in the image: GODREJPROP and IL&FSENGG have the same marker and color).

Simply put, I want every marker to be drawn in one color for the first cycle of markers, and then in a different color for each subsequent cycle, so that points in the plot are easy to identify. Please suggest any fixes or alternatives for this problem, or any ways to improve this code.

I would also like to ask for suggestions on how to arrange my legend entries into an adequate number of columns so that the legend is not too long.

I have uploaded a plot image here: [1]

I have come up with the following code so far. Here "i" is a data frame, "j" is a string, and "EQW" is a list of tuples, each containing a data frame and a string.

import itertools
import matplotlib.pyplot as plt

# EQW is a list of (data frame, name) tuples
for i, j in EQW:
    # One row per company: mean return and standard deviation (risk)
    k = i.agg(["mean", "std"]).T
    k.columns = ["Return", "Risk"]
    plt.figure(figsize=(12, 8))
    mark = itertools.cycle(("o", "v", "^", "<", ">", "1", "2", "3", "4", "8", "s", "p", "P", "*", "h", "H", "+", "x", "X", "d"))
    # One scatter call per company so each gets its own legend entry and marker
    for l in k.index:
        plt.scatter(x=k.loc[l, "Risk"], y=k.loc[l, "Return"], s=75, label=l, marker=next(mark))
    # Choose the number of legend columns from the number of entries
    if len(k.index) < 20:
        plt.legend(bbox_to_anchor=(1.0, 1.0))
    elif len(k.index) > 30 and len(k.index) < 50:
        plt.legend(bbox_to_anchor=(1.0, 1.0), ncol=2)
    else:
        plt.legend(bbox_to_anchor=(1.0, 1.0), ncol=3)
    plt.xlabel("Risk(std)", fontsize=15)
    plt.ylabel("Return", fontsize=15)
    plt.title("Risk/Return for {} with Equally Weighted Portfolio".format(j), fontsize=20)
    plt.show()

Thank You


Solution

  • You could loop over both colors and markers, using // to pick the color and % to pick the marker. This keeps one color for all markers, then uses the second color for all markers, and so on:

    len_markers = 3
    len_colors = 2
    for i in range(len_markers*len_colors):
        print(i, i // len_markers, i % len_markers)
    
    # 0 0 0
    # 1 0 1
    # 2 0 2
    # 3 1 0
    # 4 1 1
    # 5 1 2
    

    A simple example:

    import matplotlib.pyplot as plt
    import numpy as np
    marker_list = ['v', '^', '<', '>']
    color_list = ['r', 'b', 'g', 'y', 'm']
    
    x = np.random.random((len(marker_list) * len(color_list), 2))
    
    plt.figure()
    for i, xx in enumerate(x):
        plt.plot(*xx, color=color_list[i // len(marker_list)], ls='',
                 marker=marker_list[i % len(marker_list)], label=str(i))
    
    plt.legend(ncol=2)
    

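Applied to the data frame from the question, the same idea might look like the sketch below. It is only an illustration. The helper names (style_pairs, legend_columns, scatter_risk_return), the marker and color lists, and the limit of 20 legend entries per column are assumptions made for this example and appear in neither the question nor the answer above. It uses itertools.product, which varies its last argument fastest and so gives the same ordering as the // and % indexing, and it derives the number of legend columns from the number of entries instead of using fixed thresholds.

```python
import itertools
import math

# Illustrative lists (an assumption); any valid matplotlib markers/colors work.
MARKERS = ["o", "v", "^", "<", ">", "s", "p", "*", "h", "+", "x", "d"]
COLORS = ["tab:blue", "tab:orange", "tab:green", "tab:red", "tab:purple"]


def style_pairs(markers=MARKERS, colors=COLORS):
    """Return an endless iterator of (color, marker) pairs.

    The color changes only after every marker has been used once;
    the whole sequence repeats after len(colors) * len(markers) pairs.
    """
    # product() varies its last argument fastest: markers cycle inside colors.
    return itertools.cycle(itertools.product(colors, markers))


def legend_columns(n_entries, max_rows=20):
    """Number of legend columns so that no column exceeds max_rows entries."""
    return max(1, math.ceil(n_entries / max_rows))


def scatter_risk_return(ax, k):
    """Plot one point per row of k (columns "Risk" and "Return") on Axes ax."""
    # zip() stops when k.index is exhausted, so the endless iterator is safe.
    for label, (color, marker) in zip(k.index, style_pairs()):
        ax.scatter(k.loc[label, "Risk"], k.loc[label, "Return"],
                   s=75, label=label, color=color, marker=marker)
    ax.legend(bbox_to_anchor=(1.0, 1.0), loc="upper left",
              ncol=legend_columns(len(k.index)))
```

Inside the question's loop, this could be used by creating the figure with fig, ax = plt.subplots(figsize=(12, 8)) and then calling scatter_risk_return(ax, k). With the lists above, 60 companies get distinct marker/color combinations before any pair repeats.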