I have a pandas DataFrame with 'area' and 'ele' (elevation) columns. I want to plot how area is distributed over elevation. I thought it would be simple, but I could not find a way to do it.
My DataFrame looks like this:
df:
area ele
0 97395.147254 6017.877745
1 69704.264405 5974.124316
2 71490.518833 5838.256686
3 23235.692475 5793.837788
4 65254.056661 5787.050911
5 17407.853780 5734.049234
6 17556.149643 6106.984128
7 33557.481232 5716.453589
8 37932.703188 5870.938016
9 19417.303768 5987.567275
10 26290.210275 6232.612380
11 45211.104174 5777.812375
12 35457.722243 5707.920921
13 83353.269135 5778.740416
14 68906.869455 5951.361295
15 66991.699542 6146.242249
16 43415.962994 6041.594263
17 74985.484055 5835.818736
18 53145.779672 5952.993800
19 36893.436921 6008.634508
20 59647.991246 5883.823537
21 53032.278932 5771.375295
Obviously
df.hist()
did not work: it draws a separate histogram for each column. I am looking for a single histogram of elevation with the summed area on the y-axis.
import matplotlib.pyplot as plt
# Bin elevation in 100 m steps; weights adds each row's area to its bin
plt.hist(df.ele, bins=[100*x + 5500 for x in range(0, 10)], weights=df.area)
plt.show()
Explanation:
[100*x+5500 for x in range(0,10)]
builds the bin edges for the histogram at every 100 m of elevation: 5500, 5600, ..., 6400. This is not essential, but it makes the histogram look nicer; otherwise the bin edges would not fall on round numbers.
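If the elevation range is not known in advance, round edges can also be derived from the data. The following is only a sketch: round_edges is a hypothetical helper name, and the 100 m step is carried over from the answer above. It assumes the values span more than a single point.

```python
import numpy as np

def round_edges(values, step=100):
    """Return bin edges on multiples of `step` that cover all values."""
    lo = np.floor(np.min(values) / step) * step  # round the minimum down
    hi = np.ceil(np.max(values) / step) * step   # round the maximum up
    # Add one step to the stop so that `hi` itself is included as an edge
    return np.arange(lo, hi + step, step)

# Lowest and highest elevation from the question's data
edges = round_edges([5707.920921, 6232.612380])
```

The result can be passed as bins in place of the hard-coded list.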
The 'summing up' is done with the weights
keyword: each row adds its area, rather than a count of 1, to the bin its elevation falls in. This was already mentioned in the comments on the question above. See the pyplot.hist documentation for further details.
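To check what the weights do without drawing anything, the same per-bin sums can be computed with numpy.histogram, which accepts the same bins and weights arguments. The sketch below uses only the first six rows from the question and cross-checks the result against an explicit group-and-sum. The names edges, area_per_bin and by_bin are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# First six rows of the DataFrame from the question
df = pd.DataFrame({
    "area": [97395.147254, 69704.264405, 71490.518833,
             23235.692475, 65254.056661, 17407.853780],
    "ele": [6017.877745, 5974.124316, 5838.256686,
            5793.837788, 5787.050911, 5734.049234],
})

# Same edges as the list comprehension: 5500, 5600, ..., 6400
edges = np.arange(5500, 6401, 100)

# With weights, each bin holds the summed area instead of a row count
area_per_bin, _ = np.histogram(
    df["ele"].to_numpy(), bins=edges, weights=df["area"].to_numpy()
)

# Cross-check: cut elevation into the same left-closed bins and sum area
bins = pd.cut(df["ele"], edges, right=False)
by_bin = df.groupby(bins, observed=False)["area"].sum()
```

Both area_per_bin and by_bin hold one value per 100 m bin, with zeros for bins that contain no rows.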