I'm doing a project for school and want to show this data in a 3D scatter plot, but I keep getting "ValueError: could not convert string to float: 'Location'" when I run this code:
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
df = pd.read_csv('baseball2.csv')
fig = plt.figure()
ax = fig.add_subplot(111, projection = '3d')
x = ['Location']
y = ['Landing']
z = ['Speed']
ax.scatter(x, y, z)
ax.set_xlabel("Location")
ax.set_ylabel("Landing")
ax.set_zlabel("Speed")
plt.show()
Here is the CSV file; it's not long:
Location, Landing, Speed,
1B, 3BF, 90,
1A, FLF, 93,
3B, 2B, 91,
2C, SRF, 92,
1C, P, 83,
2C, C, 85,
3A, FLF, 93,
2C, SRF, 84,
3A, SS, 93,
1C, CF, 92,
2B, FRF, 91,
3A, FLF, 90,
3A, FLF, 91,
1C, C, 91,
3A, C, 91,
2B, HR, 91,
2A, DRF, 92,
3B, SRF, 82,
1B, SCF, 82
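The error occurs because x = ['Location'] is a list holding the literal string 'Location', not the column from df. The Location and Landing columns also hold text such as 1B rather than numbers. You can still draw a 3D scatter plot by using pseudo-numerical values for your categorical variables, but the resulting figure will be very difficult to read. I recommend a 3D bar plot like this: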
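import matplotlib.pyplot as plt
import pandas as pd
fig = plt.figure(figsize=(10, 10))
# fig.gca() no longer accepts keyword arguments in recent Matplotlib versions,
# so the 3D axes are created with add_subplot() instead
ax = fig.add_subplot(111, projection='3d', facecolor='white')
df = pd.read_csv('data_files/original_file.csv')
# One integer placeholder per row, used for both the x and y positions
xy_data = [i for i in range(len(df[' Landing']))]
# Each bar starts at z = Speed and extends down to z = 0
ax.bar3d(xy_data, xy_data, df[' Speed'], 1, 1, -df[' Speed'])
# Set the tick positions first so that each label lines up with its own row
ax.set_xticks(xy_data)
ax.set_yticks(xy_data)
ax.set_xticklabels(df["Location"])
ax.set_yticklabels(df[" Landing"])
ax.set_xlabel("Location")
ax.set_ylabel("Landing")
ax.set_zlabel("Speed")
plt.show()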
One last detail: in your CSV file, there is a leading space before Landing and Speed in the header line, so pandas names those columns ' Landing' and ' Speed'. Keep that space in mind when you select a DataFrame column, as shown in my code.
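If keeping track of the extra space gets tedious, one option is to let pandas strip it while reading. This is only a sketch: it relies on the skipinitialspace option of pd.read_csv and reads an in-memory copy of the first few rows of your file (sample_csv is a name made up for this example) instead of the real path.

```python
import io
import pandas as pd

# First rows of the CSV from the question, including the trailing commas.
# sample_csv is a hypothetical stand-in for the real file.
sample_csv = io.StringIO(
    "Location, Landing, Speed,\n"
    "1B, 3BF, 90,\n"
    "1A, FLF, 93,\n"
    "3B, 2B, 91,\n"
)

# skipinitialspace=True drops the space that follows each comma,
# so the columns are named 'Landing' and 'Speed' without the leading space
df = pd.read_csv(sample_csv, skipinitialspace=True)

# The trailing comma on each line produces an empty extra column; drop it
df = df.dropna(axis=1, how="all")

print(list(df.columns))
```

With the names cleaned up this way, the columns would be selected as df['Landing'] and df['Speed'] rather than with the leading space used in my code.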
ADDENDUM: In response to your comment, here is the code for a 3D scatter plot. The xy_data list comprehension in both plots creates the placeholder numerical values that these types of graphs require, since they usually operate on continuous variables. Your categorical variables are then attached to these numerical placeholders by set_xticklabels() and set_yticklabels().
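import matplotlib.pyplot as plt
import pandas as pd
fig = plt.figure(figsize=(10, 10))
# fig.gca() no longer accepts keyword arguments in recent Matplotlib versions,
# so the 3D axes are created with add_subplot() instead
ax = fig.add_subplot(111, projection='3d', facecolor='white')
df = pd.read_csv('data_files/original_file.csv')
# One integer placeholder per row, used for both the x and y positions
xy_data = [i for i in range(len(df[' Landing']))]
ax.scatter3D(xy_data, xy_data, df[' Speed'], color='green')
# Set the tick positions first so that each label lines up with its own row
ax.set_xticks(xy_data)
ax.set_yticks(xy_data)
ax.set_xticklabels(df["Location"])
ax.set_yticklabels(df[" Landing"])
ax.set_xlabel("Location")
ax.set_ylabel("Landing")
ax.set_zlabel("Speed")
plt.show()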
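A possible variation, if you would rather give each distinct Location and Landing value its own position instead of one position per row: pandas can build the pseudo-numerical values for you. The sketch below uses pd.factorize on a few rows typed in by hand, and it assumes column names without the leading space; treat it as an illustration, not a replacement for the plots above.

```python
import pandas as pd

# A few rows from the question's data, typed in by hand for illustration.
# Clean column names (no leading space) are an assumption of this sketch.
df = pd.DataFrame({
    "Location": ["1B", "1A", "3B", "2C", "1C", "2C"],
    "Landing": ["3BF", "FLF", "2B", "SRF", "P", "C"],
    "Speed": [90, 93, 91, 92, 83, 85],
})

# factorize() returns one integer code per row plus the distinct labels;
# sort=True orders the labels, so equal categories share the same code
x_codes, x_labels = pd.factorize(df["Location"], sort=True)
y_codes, y_labels = pd.factorize(df["Landing"], sort=True)

print(list(x_codes))
print(list(x_labels))
```

The codes can then be passed to ax.scatter(x_codes, y_codes, df["Speed"]), with ax.set_xticks(range(len(x_labels))) followed by ax.set_xticklabels(x_labels) to label the axis, and the same for y.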