Tags: python, csv, numpy, genfromtxt

How to extract data from a .csv file and create a plot?


I have a .csv file with 24 columns x 514 rows of data. Each column represents a different parameter, and I wish to study the trends between different parameters.

I am using genfromtxt to import the data as a numpy array so that I can plot the values of two particular columns (e.g. column 9 against column 11). Here is what I have so far:

import matplotlib.pyplot as plt
import numpy as np


data = np.genfromtxt('output_burnin.csv', delimiter=',')

impactparameter=data[:,11]
planetradius=data[:,9]

plt.plot(planetradius,impactparameter,'bo')

plt.title('Impact Parameter vs. Planet Radius')
plt.xlabel('R$_P$/R$_{Jup}$')
plt.ylabel('b/R$_{star}$')

plt.show()

With this code I encounter an error at line 12:

    impactparameter=data[:,11]
IndexError: too many indices

What could the problem be here?

Also, I have been trying to figure out how to give each column a header in the .csv file. So instead of counting the column number, I can just call the name of that particular column when I do the plotting. Is there a way to do this?

I am a complete newbie in Python; any help would be much appreciated. Thanks!


Solution

  • Also, I have been trying to figure out how to give each column a header in the .csv file. So instead of counting the column number, I can just call the name of that particular column when I do the plotting. Is there a way to do this?

    To give columns in your array names, you need to make it a structured array.

    Here's a simple example:

    import numpy as np

    a = np.zeros(5, dtype='f4, f4, f4')
    a.dtype.names = ('col1', 'col2', 'col3')
    print(a[0])       # the first row (record): a tuple of three zeros
    print(a['col1'])  # the first column: an array of five zeros
    

    If the first line of your CSV file contains the column names and you pass names=True to np.genfromtxt, NumPy will automatically create a structured array with those names.
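
As a rough sketch of the names=True approach, the snippet below reads a small in-memory CSV instead of the real file. The column names (planet_radius, impact_parameter, period) and the values are made up for illustration and are not taken from output_burnin.csv.

```python
import io

import numpy as np

# Hypothetical three-column CSV standing in for the real file; the column
# names and values are invented for this example.
csv_text = """planet_radius,impact_parameter,period
1.10,0.25,3.5
1.32,0.40,4.1
0.95,0.10,2.8
"""

# names=True tells genfromtxt to take the field names from the first line.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True)

print(data.dtype.names)  # the column names read from the header
print(data.shape)        # one-dimensional: one record per data row

# Columns are selected by name rather than by position.
planet_radius = data['planet_radius']
impact_parameter = data['impact_parameter']
```

Note that a structured array like this is one-dimensional, so positional indexing such as data[:, 11] raises an IndexError on it; use data['column_name'] instead. The two extracted arrays can be passed to plt.plot exactly as in the question.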