python, arrays, matplotlib, sequence, valueerror

ValueError: setting an array element with a sequence when trying to plot with matplotlib


When I was trying to run the following code:

from matplotlib import pyplot as plt

Xpos = df[df['bound']==1]
Xneg = df[df['bound']==0]
y1, X1 = Xpos['Id'].values, Xpos['seq'].values
y2, X2 = Xneg['Id'].values, Xneg['seq'].values

fig, ax = plt.subplots()
ax.plot(y1, X1 , label='bounded sequences', color='blue')
ax.plot(y2, X2 , label='unbounded sequences', color='red')
plt.show()

I got this error: ValueError: setting an array element with a sequence.
The sample output of df is like the one you find here.
Can anyone help?
Thanks.


Solution

  • The issue is that you are trying to plot a list of lists: each entry of the seq column is itself a list, so X1 and X2 are arrays of lists rather than arrays of numbers.

    To illustrate the issue, I created a sample dataset similar to yours (the only difference is that the sequences are shorter). Here is the code that creates it:

    import pandas as pd

    df_dict = {
        "Id": [0, 1, 2, 3, 4],
        "seq": [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]],
        "bound": [1, 0, 1, 0, 1]
    }
    df = pd.DataFrame(df_dict)
    

    If we now execute the first part of your code and print the X1 variable:

    Xpos = df[df['bound']==1]
    Xneg = df[df['bound']==0]
    y1, X1 = Xpos['Id'].values, Xpos['seq'].values
    y2, X2 = Xneg['Id'].values, Xneg['seq'].values
    print(X1)
    

    The output will be:

    [list([0, 0, 1, 0]) list([0, 0, 1, 0]) list([0, 0, 1, 0])]
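
To see where the message comes from, here is a minimal sketch that reproduces it without matplotlib. It assumes that the plotting call ends up converting the data to a float array, which is roughly what astype(float) does below; the names seqs, error_message and stacked are illustrative and not part of the original code.

```python
import numpy as np

# Build a 1-D object array whose elements are Python lists,
# mimicking what Xpos['seq'].values returns for a column of lists.
rows = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
seqs = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    seqs[i] = row  # each slot holds a whole list

# Converting that array to floats fails, because every element
# is a sequence rather than a single number.
try:
    seqs.astype(float)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Going through nested lists instead gives an ordinary 2-D numeric array.
stacked = np.array(seqs.tolist(), dtype=float)
print(stacked.shape)
```

The array has one slot per row of the dataframe, and each slot contains a list, so it has to be flattened or stacked before it can be treated as numeric data.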
    

    If what you would like to plot for X1 is the concatenation of each list, e.g. [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0], this might solve your problem:

    import pandas as pd
    from matplotlib import pyplot as plt
    
    df_dict = {
        "Id": [0, 1, 2, 3, 4],
        "seq": [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]],
        "bound": [1, 0, 1, 0, 1]
    }
    df = pd.DataFrame(df_dict)
    
    Xpos = df[df['bound']==1]
    Xneg = df[df['bound']==0]
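    print(Xpos)
    # Flatten each column of lists into one flat list of numbers.
    y1, X1 = Xpos['Id'].values, [elem for sub_list in Xpos['seq'].values for elem in sub_list]
    y2, X2 = Xneg['Id'].values, [elem for sub_list in Xneg['seq'].values for elem in sub_list]
    print(y1)
    print(X1)
    fig, ax = plt.subplots()
    # Only the flattened values are plotted; the x axis is the element index.
    ax.plot(X1, label='bounded sequences', color='blue')
    ax.plot(X2, label='unbounded sequences', color='red')
    ax.legend()  # without this call the labels are not displayed
    plt.show()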
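
The flattening step in the code above can also be written with the standard library. This is only an equivalent sketch of the nested list comprehension, using a hypothetical helper named flatten and a plain list of lists in place of the seq column:

```python
from itertools import chain


def flatten(sequences):
    """Concatenate an iterable of sequences into one flat list."""
    return list(chain.from_iterable(sequences))


# Stand-in for Xpos['seq'].values from the sample dataset.
seq_column = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
flat = flatten(seq_column)
print(flat)  # [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0]
```

Because chain.from_iterable accepts any iterable, the same helper should also work when it is passed the array returned by .values directly.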
    

    If you want a scatter plot instead of a line plot, you just need to replace the two ax.plot calls with the following:

    ax.scatter(range(len(X1)), X1, label='bounded sequences', color='blue')
    ax.scatter(range(len(X2)), X2, label='unbounded sequences', color='red')
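
Note that after flattening, the Id arrays (y1, y2) are shorter than the value lists, so they can no longer be passed together to ax.plot or ax.scatter. If you want the Id on one axis, as in the original question, one possible approach is to repeat each Id once per element of its sequence. The sketch below assumes that this pairing is what is wanted; the names ids_per_element and values are illustrative.

```python
import numpy as np

# Stand-ins for Xpos['Id'].values and Xpos['seq'].values from the sample dataset.
ids = [0, 2, 4]
seqs = [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]

# Repeat every Id as many times as its sequence has elements,
# so both arrays end up with the same length.
lengths = [len(s) for s in seqs]
ids_per_element = np.repeat(ids, lengths)
values = np.concatenate(seqs)

print(ids_per_element)
print(values)
```

With equal lengths, a call such as ax.scatter(ids_per_element, values) should no longer raise a shape error.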