python-3.x matplotlib python-idle

PyPlot with IDLE


I am attempting to plot some data using matplotlib.pyplot in IDLE. This is my first time trying to use matplotlib outside of a classroom.

My code is pretty straightforward: load some data from a CSV into a dataframe and try to put it on a scatterplot.

import matplotlib.pyplot as plt
import pandas as pd

filename = 'data.csv'
data = pd.read_csv(filename, delimiter=',')

plot_me = data.plot(x=data['Density'], y=data['Cost per Pupil'], kind='scatter')

However, this returns a long error which I am having trouble following:

Traceback (most recent call last):
  File "C:/Users/dmccarville/Desktop/DJM Figures/plot.py", line 10, in <module>
    plot_me = data.plot(x=data['Density'], y=data['Cost per Pupil'], kind='scatter')
  File "C:\Python34\lib\site-packages\pandas\plotting\_core.py", line 2617, in __call__
    sort_columns=sort_columns, **kwds)
  File "C:\Python34\lib\site-packages\pandas\plotting\_core.py", line 1859, in plot_frame
    **kwds)
  File "C:\Python34\lib\site-packages\pandas\plotting\_core.py", line 1684, in _plot
    plot_obj.generate()
  File "C:\Python34\lib\site-packages\pandas\plotting\_core.py", line 240, in generate
    self._make_plot()
  File "C:\Python34\lib\site-packages\pandas\plotting\_core.py", line 833, in _make_plot
    scatter = ax.scatter(data[x].values, data[y].values, c=c_values,
  File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 1958, in __getitem__
    return self._getitem_array(key)
  File "C:\Python34\lib\site-packages\pandas\core\frame.py", line 2002, in _getitem_array
    indexer = self.loc._convert_to_indexer(key, axis=1)
  File "C:\Python34\lib\site-packages\pandas\core\indexing.py", line 1168, in _convert_to_indexer
    return labels.get_loc(obj)
  File "C:\Python34\lib\site-packages\pandas\core\indexes\base.py", line 2442, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5280)
  File "pandas\_libs\index.pyx", line 134, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:4819)
  File "C:\Python34\lib\site-packages\pandas\core\base.py", line 51, in __str__
    return self.__unicode__()
  File "C:\Python34\lib\site-packages\pandas\core\series.py", line 982, in __unicode__
    width, height = get_terminal_size()
  File "C:\Python34\lib\site-packages\pandas\io\formats\terminal.py", line 33, in get_terminal_size
    return shutil.get_terminal_size()
  File "C:\Python34\lib\shutil.py", line 1071, in get_terminal_size
    size = os.get_terminal_size(sys.__stdout__.fileno())
AttributeError: 'NoneType' object has no attribute 'fileno'

I was able to locate this bug report, which suggests that this can't be accomplished with IDLE. As something of a novice, that sounds incredible to me.

Is this correct? Can I really not make a scatterplot using IDLE? If I can, what do I need to do?


Solution

  • Here is one way to debug the problem:

    First, simplify the program so that it no longer depends on external data and anyone can reproduce the error (see Minimal, Complete, and Verifiable example). It could look like this:

    import matplotlib.pyplot as plt
    import pandas as pd
    
    data = pd.DataFrame({"x" : [1,2,3],"y":[3,2,4]})
    
    plot_me = data.plot(x=data['x'], y=data['y'], kind='scatter')
    
    plt.show()
    

    Running this still produces an error:

    Traceback (most recent call last):
      File "D:\Data\Computer\Entwicklung\python\SO_plot_dataframe.py", line 13, in <module>
        plot_me = data.plot(x=data['x'], y=data['y'], kind='scatter')
    
    # and so on. 
    

    The traceback points at the data.plot call rather than at plt.show(), so the problem lies in creating the plot, not in displaying it.

    To find out what went wrong, (1) look at other examples and/or (2) read the documentation. The documentation says that the arguments x and y should each be a "label or position", not a DataFrame column itself.

    So changing

    plot_me = data.plot(x=data['x'], y=data['y'], kind='scatter')
    

    to

    plot_me = data.plot(x='x', y='y', kind='scatter')
    

    where x and y are column labels from the DataFrame, will give you the desired plot.

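    In the end, this has nothing to do with IDLE; the data.plot call was simply given the wrong kind of arguments. The AttributeError at the bottom of your traceback appears to be a side effect: during the failed column lookup pandas tried to convert the Series to a string, which queries the terminal size, and your traceback shows that sys.__stdout__ was None when the script ran under IDLE.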
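
To make the label-versus-column distinction concrete, here is a minimal sketch that applies the fix to the column names from the question. The numeric values are made-up placeholders standing in for data.csv, which is not shown in the question; only the labels 'Density' and 'Cost per Pupil' come from the original post.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values (an assumption for illustration); only the column
# names are taken from the question.
data = pd.DataFrame({
    "Density": [120.0, 450.0, 980.0],
    "Cost per Pupil": [9100.0, 10400.0, 12800.0],
})

# data["Density"] is a Series holding the column's values, whereas the
# string "Density" is the column's label. DataFrame.plot wants the label.
print(type(data["Density"]).__name__)   # Series
print("Density" in data.columns)        # True

# Pass the labels for x and y, not the columns themselves.
ax = data.plot(x="Density", y="Cost per Pupil", kind="scatter")

# Open the plot window only when this file is run as a script.
if __name__ == "__main__":
    plt.show()
```

With a real CSV, the DataFrame construction would be replaced by the pd.read_csv call from the question; the data.plot line stays the same.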