python dataframe numpy csv scatter-plot

TypeError: unhashable type: 'numpy.ndarray' and plt.scatter()


I am having issues with the plt.scatter() function. The error message says "TypeError: unhashable type: 'numpy.ndarray'". I want this code to create a scatter plot of the x and y DataFrames. Both DataFrames have the same shape, (88, 2), when I enter a sample unit into the code.

fig, ax = plt.subplots(figsize=(10,10))
plt.scatter(x,y, color='black') #this is where I am having an issue.   
plt.xlim([0,10])   
plt.ylim([0,10])   
plt.title(unit)

Here is a sample of the information in the CSV file. (The row numbers are the first column, material is the second, quantity is the third, and so on.)

     Material: Quantity: Unit: Date:
0    B         1         A     43455
1    B         1         A     43455
2    C         1         A     43455
3    C         1         A     43456
4    D         1         A     43455
5    D         1         A     43455
6    B         1         A     43455 
7    B         2         A     43455
8    B         8         A     43459
9    B         5         A     43467
10   B         3         A     43452
11   D         7         A     43451

Solution

  • Based on the Matplotlib documentation, the inputs for plt.scatter() are:

    x, y : float or array-like, shape (n, )
        The data positions.

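    But in your code, what you're passing to the scatter function are two pd.DataFrame objects, each of shape (88, 2), rather than 1-D arrays. In each DataFrame the first column holds the names and the second column holds the values, so select the second column before plotting: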

    fig, ax = plt.subplots(figsize=(10,10))
    plt.scatter(x.values[:, 1], y.values[:, 1], color='black')  # pass only the second (values) column of each DataFrame
    plt.xlim([0,10])   
    plt.ylim([0,10])   
    plt.title(unit)
    plt.xlabel('X')
    plt.ylabel('Y')
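
    For anyone who wants to try the fix without the original CSV, here is a minimal, self-contained sketch. The two DataFrames, the column labels "name" and "value", and the title "A" are made-up stand-ins for the asker's data, not taken from the question. It selects the numeric column by label instead of by position, which also keeps the float dtype (x.values on a mixed-type DataFrame returns an object array).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the asker's x and y: two (88, 2) DataFrames
# whose first column holds names and whose second column holds the values.
rng = np.random.default_rng(0)
names = [f"item{i}" for i in range(88)]
x = pd.DataFrame({"name": names, "value": rng.uniform(0, 10, 88)})
y = pd.DataFrame({"name": names, "value": rng.uniform(0, 10, 88)})

# Select the numeric column by label; each result is 1-D with shape (88,),
# which matches the "shape (n, )" requirement of plt.scatter().
x_vals = x["value"].to_numpy()
y_vals = y["value"].to_numpy()

fig, ax = plt.subplots(figsize=(10, 10))
points = ax.scatter(x_vals, y_vals, color="black")
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
ax.set_title("A")  # placeholder for the asker's `unit` variable
ax.set_xlabel("X")
ax.set_ylabel("Y")
```

    Checking x.shape and x_vals.shape before plotting is a quick way to confirm that a 2-D DataFrame has been reduced to the 1-D input that scatter expects. In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() at the end.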
    
