Plotting parallel coordinates in pandas/python


I am trying to use pandas in python to plot the following higher-dimensional data: https://i.sstatic.net/34nbR.jpg

Here is my code:

import pandas
from pandas.tools.plotting import parallel_coordinates

data = pandas.read_csv('ParaCoords.csv')
parallel_coordinates(data,'Name')

The code fails to plot the data, and the traceback ends with:

KeyError: 'Name'

What is the second argument in parallel_coordinates supposed to say/do? How can I successfully plot the data?


Solution

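  • The second argument is the name of the column that holds each row's class label. Think of a column containing ['dog', 'dog', 'cat', 'bird', 'cat', 'dog']. Rows that share a label are drawn in the same color.

    The online example passes 'Name' as the second argument because the iris dataset it uses has a column called 'Name' holding the species of each flower. The KeyError means your DataFrame has no column called 'Name', so pass the name of a column that actually exists in your CSV and contains the classes.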

    Doc

    Signature: parallel_coordinates(*args, **kwargs)
    Docstring:
    Parallel coordinates plotting.
    
    Parameters
    ----------
    frame: DataFrame
    class_column: str
        Column name containing class names
    cols: list, optional
        A list of column names to use
    ax: matplotlib.axis, optional
        matplotlib axis object
    color: list or tuple, optional
        Colors to use for the different classes
    use_columns: bool, optional
        If true, columns will be used as xticks
    xticks: list or tuple, optional
        A list of values to use for xticks
    colormap: str or matplotlib colormap, default None
        Colormap to use for line colors.
    axvlines: bool, optional
        If true, vertical lines will be added at each xtick
    axvlines_kwds: keywords, optional
        Options to be passed to axvline method for vertical lines
    kwds: keywords
        Options to pass to matplotlib plotting method
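
To make the role of class_column concrete, here is a minimal sketch. It assumes a recent pandas release, where the function is imported from pandas.plotting rather than pandas.tools.plotting. The DataFrame and its column names ('Animal', 'weight', 'height', 'age') are made up for illustration and stand in for the contents of ParaCoords.csv, which are not shown in the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd
from pandas.plotting import parallel_coordinates

# Hypothetical data standing in for ParaCoords.csv:
# one class column plus several numeric columns.
data = pd.DataFrame({
    "Animal": ["dog", "dog", "cat", "bird", "cat", "dog"],
    "weight": [30.0, 25.0, 4.0, 0.5, 5.0, 28.0],
    "height": [60.0, 55.0, 25.0, 10.0, 23.0, 58.0],
    "age": [5.0, 3.0, 7.0, 2.0, 4.0, 6.0],
})

# Check which column names actually exist before plotting;
# passing a name that is not listed here raises KeyError.
print(data.columns.tolist())

class_column = "Animal"  # assumed name of the column holding the classes
if class_column not in data.columns:
    raise KeyError(f"{class_column!r} is not a column of the DataFrame")

# One line is drawn per row, colored by the value in class_column.
ax = parallel_coordinates(data, class_column, colormap="viridis")
```

With real data, replace the DataFrame with pandas.read_csv('ParaCoords.csv') and set class_column to whichever column of that file holds the classes. To view the result, switch to an interactive backend and call matplotlib.pyplot.show(), or save it with ax.figure.savefig(...).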