I am new to Python and have a piece of coursework to generate a heatmap from data in a CSV file. Across the top are performance indicators (KPIs) of cars, and down the side are the names of the cars. I have sliced the row and column headings into arrays, but I am unsure how to load these values into the axes ticks.
I am using ax.set_xticks() (and the same for y) but am unsure what to put in the brackets. ax.get_xticks() doesn't work either. I have tried several variations and all of them throw an error, chiefly one that names the slice; see the code below.
if __name__ == "__main__":
# Load the data.
fileObj = open("CARS.csv").readlines()
lines = [line.strip().split(",") for line in fileObj]
# Reads the data into a list,
# then slicing to extract a list of cars, kpis and scoring.
cars = [line[0] for line in lines[1:]] #array storing car names
kpis = lines[0][1:] #array storing kpis
scoring = np.array([line[1:-1] for line in lines[1:]], dtype=float)
fig=plt.figure(figsize=(6,3))
ax=fig.add_subplot(111)
ax.set_xlim([0,9])
ax.set_xticks(kpis[lines[0][1:]])
ax.set_yticks(cars)
ax.set_title('Sportscar KPI Data')
ax.set_xlabel('KPI's')
ax.set_ylabel('Sportscars')
im = ax.imshow(scores, interpolation='nearest', aspect='auto')
plt.show()
I am hoping to have the sportscar names from the slice put into the ticks, but instead an error lists the names that could not be carried across. The same happens for the KPIs.
Short answer:
Use pandas and seaborn:
import pandas as pd
import seaborn as sns
df = pd.read_csv('CARS.csv', index_col=0)
sns.heatmap(df)
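To see why the short version needs no manual tick handling, here is a small sketch of what index_col=0 does. The file contents are invented (the real layout of CARS.csv is not shown), and io.StringIO only stands in for the file on disk.

```python
import io
import pandas as pd

# Made-up stand-in for CARS.csv: the header row holds KPI names,
# the first column holds car names.
csv_text = """car,speed,handling,comfort
Alpha,7.5,8.0,6.0
Beta,9.0,6.5,5.5
"""

# index_col=0 turns the first column into the row labels instead of data.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(df.index))    # row labels: the car names
print(list(df.columns))  # column labels: the KPI names
print(df.shape)          # only the numeric scores are left as data
```

With the labels stored in df.index and df.columns, sns.heatmap(df) can pick them up for the tick labels by itself, as far as I know from its default settings.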
Long answer:
You can get your code to run by correcting some mistakes...
import numpy as np
import matplotlib.pyplot as plt
fileObj = open("CARS.csv").readlines()
lines = [line.strip().split(",") for line in fileObj]
cars = [line[0] for line in lines[1:]]
kpis = lines[0][1:]
scoring = np.array([line[1:-1] for line in lines[1:]], dtype=float)
fig=plt.figure(figsize=(6,3))
ax=fig.add_subplot(111)
ax.set_title('Sportscar KPI Data')
ax.set_xlabel("KPI")
ax.set_ylabel('Sports car')
ax.set_xticks(range(len(kpis)))
ax.set_xticklabels(kpis)
ax.set_yticks(range(len(cars)))
ax.set_yticklabels(cars)
im = ax.imshow(scoring, interpolation='nearest', aspect='auto')
plt.show()
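The key point in the corrected code is that tick positions and tick labels are two separate things: set_xticks/set_yticks expect numbers, and the strings go into set_xticklabels/set_yticklabels. Here is a minimal self-contained sketch of just that step; the car names, KPI names and scores are made up so that it runs without CARS.csv.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up labels and scores standing in for the contents of CARS.csv.
cars = ["Alpha", "Beta"]
kpis = ["speed", "handling", "comfort"]
scoring = np.array([[7.5, 8.0, 6.0],
                    [9.0, 6.5, 5.5]])

fig, ax = plt.subplots(figsize=(6, 3))
im = ax.imshow(scoring, interpolation="nearest", aspect="auto")

# Tick positions must be numbers: imshow centres the cell in
# row i, column j at x = j, y = i.
ax.set_xticks(range(len(kpis)))
ax.set_yticks(range(len(cars)))

# The strings are attached separately, one label per tick position.
ax.set_xticklabels(kpis)
ax.set_yticklabels(cars)
```

Calling plt.show() afterwards displays the figure. Passing the list of names directly to set_xticks, as in the question, fails because the names are not positions on the axis.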
However, there are a lot of things that wouldn't be done like this in Python (besides all the indentation errors and the fact that you try to plot scores although your variable was introduced as scoring):
First of all, when opening a file, you would do this on its own line and only then begin to access it:
fileObj = open("CARS.csv")
lines = [line.strip().split(",") for line in fileObj.readlines()]
Then you can close the file properly after reading it with
fileObj.close()
This works because fileObj now really is a file object, whereas the result of fileObj.readlines() is not: it is a list of strings.
Anyway, it would be even better to use a with block for that file handling task, which closes the file for you automatically:
with open('CARS.csv') as fileObj:
    lines = [line.strip().split(",") for line in fileObj.readlines()]
(Note that indentation is needed here for everything that should be handled by the with block.)
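A short sketch of the with block in action, assuming a tiny made-up file (demo_cars.csv is a hypothetical name, written to a temporary directory so that the example runs without CARS.csv). The helper read_rows is only for illustration; it also reports whether the file was closed once the indented block ended.

```python
import os
import tempfile


def read_rows(path):
    """Split each line of a comma-separated file into a list of strings."""
    with open(path) as file_obj:
        # Everything indented here runs while the file is open.
        rows = [line.strip().split(",") for line in file_obj]
    # Leaving the indented block has already closed the file.
    return rows, file_obj.closed


# Write a tiny made-up file so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_cars.csv")
with open(demo_path, "w") as out:
    out.write("car,speed,handling\nAlpha,7.5,8.0\n")

rows, was_closed = read_rows(demo_path)
print(rows)
print(was_closed)
```

No explicit close() call is needed, and the file is also closed if an exception is raised inside the block.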
But still, even if this is already better, no one would do it step by step this way: reading the file line by line, extracting the labels by hand, and converting a subsection to a numpy array.
Even if, for whatever reason, you cannot or do not want to use pandas and seaborn as suggested above, numpy itself has several well-written importers, e.g. np.genfromtxt.
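As a rough sketch of the numpy route, one possible approach is to read everything as strings with np.genfromtxt and then slice off the labels. The data below is invented and io.StringIO stands in for the real file; the actual CARS.csv may need different slicing, for example if it has extra columns.

```python
import io
import numpy as np

# Made-up stand-in for CARS.csv.
csv_text = """car,speed,handling,comfort
Alpha,7.5,8.0,6.0
Beta,9.0,6.5,5.5
"""

# Read every cell as a string first, because labels and numbers are mixed.
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=",", dtype=str)

kpis = raw[0, 1:].tolist()           # header row without the corner cell
cars = raw[1:, 0].tolist()           # first column without the header
scoring = raw[1:, 1:].astype(float)  # the remaining cells as numbers
```

With a real file you would pass the filename instead of the StringIO object. The resulting cars, kpis and scoring can then be used with imshow and the tick label calls exactly as in the long answer.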