I read in the data as following
polygons = polygons.geojson
polygons = gpd.read_file(polygons)
income = income.csv
provi_income = pd.read_csv(income)
income
Primary_IncomeNet identificatie
1 14734 PV20
2 16502 PV21
3 12917 PV22
4 31189 PV23
5 12060 PV24
6 59168 PV25
7 45021 PV26
8 94199 PV27
9 111357 PV28
10 10405 PV29
11 76537 PV30
12 29362 PV31
polygons
identificatie geometry
0 PV20 MULTIPOLYGON (((265275.541 549247.459, 265285....
1 PV22 MULTIPOLYGON (((231437.815 516445.643, 231430....
2 PV26 MULTIPOLYGON (((131894.470 429932.357, 131917....
3 PV24 MULTIPOLYGON (((157694.139 473920.680, 159406....
4 PV23 MULTIPOLYGON (((248291.900 459808.449, 248302....
5 PV27 MULTIPOLYGON (((131700.944 464257.265, 131702....
6 PV25 MULTIPOLYGON (((181361.527 418255.386, 181384....
7 PV28 MULTIPOLYGON (((88397.000 413853.999, 89142.01...
8 PV21 MULTIPOLYGON (((189491.268 535832.617, 189494....
9 PV30 MULTIPOLYGON (((167891.450 359190.720, 168085....
10 PV31 MULTIPOLYGON (((199549.696 308385.049, 199558....
11 PV29 MULTIPOLYGON (((50235.786 357928.267, 50243.18...
And then merge based on common id. This merges the two data frames but the Primary_IncomeNet column shows NaN. Why does the Primary_IncomeNet column not merge properly and how do perform the merge correctly?
merge = polygons.merge(income, on="identificatie", how= 'left')
identificatie geometry Primary_IncomeNet
0 PV20 MULTIPOLYGON (((265275.541 549247.459, 265285.... NaN
1 PV22 MULTIPOLYGON (((231437.815 516445.643, 231430.... NaN
2 PV26 MULTIPOLYGON (((131894.470 429932.357, 131917.... NaN
3 PV24 MULTIPOLYGON (((157694.139 473920.680, 159406.... NaN
4 PV23 MULTIPOLYGON (((248291.900 459808.449, 248302.... NaN
5 PV27 MULTIPOLYGON (((131700.944 464257.265, 131702.... NaN
6 PV25 MULTIPOLYGON (((181361.527 418255.386, 181384.... NaN
7 PV28 MULTIPOLYGON (((88397.000 413853.999, 89142.01... NaN
8 PV21 MULTIPOLYGON (((189491.268 535832.617, 189494.... NaN
9 PV30 MULTIPOLYGON (((167891.450 359190.720, 168085.... NaN
10 PV31 MULTIPOLYGON (((199549.696 308385.049, 199558.... NaN
11 PV29 MULTIPOLYGON (((50235.786 357928.267, 50243.18... NaN
Based on the alignment of the column (identificatie
) of the dataframe income
, I suspect that she holds trailing whitespaces since their values are aligned to the left. So, try to strip
them before the merge.
income["identificatie"] = income["identificatie"].str.strip()
merge = polygons.merge(income, on="identificatie", how= "left")
Output:
print(merge)
identificatie geometry Primary_IncomeNet
0 PV20 MULTIPOLYGON (((265275.541 549247.459, 265285.... 14734
1 PV22 MULTIPOLYGON (((231437.815 516445.643, 231430.... 12917
2 PV26 MULTIPOLYGON (((131894.470 429932.357, 131917.... 45021
3 PV24 MULTIPOLYGON (((157694.139 473920.680, 159406.... 12060
4 PV23 MULTIPOLYGON (((248291.900 459808.449, 248302.... 31189
5 PV27 MULTIPOLYGON (((131700.944 464257.265, 131702.... 94199
6 PV25 MULTIPOLYGON (((181361.527 418255.386, 181384.... 59168
7 PV28 MULTIPOLYGON (((88397.000 413853.999, 89142.01... 111357
8 PV21 MULTIPOLYGON (((189491.268 535832.617, 189494.... 16502
9 PV30 MULTIPOLYGON (((167891.450 359190.720, 168085.... 76537
10 PV31 MULTIPOLYGON (((199549.696 308385.049, 199558.... 29362
11 PV29 MULTIPOLYGON (((50235.786 357928.267, 50243.18... 10405