Tags: python, pandas, matplotlib, plot, scatter-plot

Creating a Scatter Plot with different colors for a certain column


I'm currently trying to create a scatter plot with certain traits. My original dataframe 'eu' contains countries with city statistics like longitude, latitude, region, etc. I created a dataframe where each country is grouped with the cities it contains, showing their longitude and latitude at the same time.

cityScatter = eu[['country', 'city', 'longitude', 'latitude', 'temperature']] # Create a new dataframe to hold specific items.
newCityScatter = cityScatter.groupby(['country', 'city', 'longitude', 'latitude']).count()['temperature'].to_frame() # Group the countries with their respective cities.
newCityScatter = newCityScatter.drop(['temperature'], axis=1) # Remove temperature now that we have it grouped.
newCityScatter # Result

[Image: the resulting grouped table, indexed by country, city, longitude, and latitude]

This is the table I get. It's what I want since all the countries are grouped together with the cities. Now my problem comes with creating a scatter plot. I want the graph to have x = longitude and y = latitude for each city. However, the country the city resides in must have its own color on the graph.

For example, Graz, Innsbruck, and Linz could be colored blue since they reside in Austria. Meanwhile, Edinburgh, Exeter, Glasgow, Inverness, and Swansea could be red since they reside in the United Kingdom.

newCityScatter.plot(kind='scatter',x='longitude',y='latitude')

When I run the above statement, I get "TypeError: no numeric data to plot". I want to figure out how to get a scatter plot working with these constraints.

DISCLAIMER: I CANNOT use seaborn. I want to figure out how to do this with only matplotlib.


Solution

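  • I don't think you need .groupby() or any other preprocessing. After the groupby and the drop, longitude and latitude are index levels and the frame has no columns left, which is why pandas reports "no numeric data to plot".

    I think you could just plot the ungrouped frame. Passing c='country' directly will likely fail when country holds plain strings, since they are not valid colors, so map each country to an integer code and let a colormap pick the colors:

    cityScatter.plot(kind='scatter', x='longitude', y='latitude', c=cityScatter['country'].astype('category').cat.codes, cmap='tab10')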
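A matplotlib-only alternative, sketched here under some assumptions, is to call ax.scatter once per country so that each country gets the next color from matplotlib's default color cycle and its own legend entry. The small `eu` frame below is a hypothetical stand-in with illustrative values, not the asker's data; only the column names come from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the asker's `eu` dataframe (illustrative values only).
eu = pd.DataFrame({
    "country": ["Austria", "Austria", "Austria", "United Kingdom", "United Kingdom"],
    "city": ["Graz", "Innsbruck", "Linz", "Edinburgh", "Glasgow"],
    "longitude": [15.4, 11.4, 14.3, -3.2, -4.3],
    "latitude": [47.1, 47.3, 48.3, 55.9, 55.9],
    "temperature": [6.9, 4.5, 6.8, 8.6, 8.6],
})

# Keep longitude/latitude as ordinary columns; no groupby().count() needed.
cityScatter = eu[["country", "city", "longitude", "latitude"]]

fig, ax = plt.subplots()

# Iterate over the groups only to split the rows by country.
# Each ax.scatter call without an explicit color takes the next color in the cycle.
for country, cities in cityScatter.groupby("country"):
    ax.scatter(cities["longitude"], cities["latitude"], label=country)

ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
ax.legend(title="country")
```

Call plt.show() or fig.savefig(...) afterwards to see the result. The default color cycle has ten colors, so with more than ten countries the colors start repeating; in that case passing an explicit color per country (for example from a larger colormap) would be needed. If only the grouped frame is available, newCityScatter.reset_index() turns the index levels back into columns that can be plotted.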