I'm a student at Moscow State University doing a small research project on suburban railroads. I crawled information about all stations in the Moscow region from Wikipedia, and now I need to select the ones that are on Moscow Central Diameter 1 (a railway line). I have a list of Diameter 1 stations (d1_names), and I'm trying to subset the whole dataframe (suburban_rail) with the pandas isin method. The problem is that it returns only 2 stations (the first and the last), although I'm fairly sure there are more: using str.contains with the missing stations returns the rows I was looking for, so they are in the dataframe. I've already checked the spelling and tried applying strip() to each element of both the dataframe and the list of stations. I've attached several screenshots of my code.
stations' list I use to subset
checking manually for Bakovka station
checking manually for Nemchinovka station
Thanks in advance!
Next time provide a minimal reproducible example, such as the one below:
import pandas as pd
suburban_rail = pd.DataFrame({'station_name': ['a','b','c','d'], 'latitude': [1,2,3,4], 'longitude': [10,20,30,40]})
d1_names = pd.Series(['a','c','d'])
suburban_rail
station_name latitude longitude
0 a 1 10
1 b 2 20
2 c 3 30
3 d 4 40
Now, to answer your question: pass the boolean mask returned by isin to .loc to select the matching rows:
suburban_rail.loc[suburban_rail.station_name.isin(d1_names)]
station_name latitude longitude
0 a 1 10
2 c 3 30
3 d 4 40
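If isin still returns only the first and last stations on the real data while str.contains finds the others, one possible cause is invisible characters in the crawled names that strip() does not remove. This is an assumption, since the actual data is only shown in screenshots. The sketch below uses made-up station strings and a hypothetical helper, normalize_name, to show how such characters can be revealed with repr and removed before calling isin:

```python
import unicodedata
import pandas as pd

def normalize_name(name: str) -> str:
    """Hypothetical helper: NFKC-normalize, drop invisible format characters, trim."""
    name = unicodedata.normalize("NFKC", name)
    # Category "Cf" covers zero-width spaces, byte order marks, soft hyphens, etc.
    visible = "".join(ch for ch in name if unicodedata.category(ch) != "Cf")
    return visible.strip()

# Made-up data: two names carry characters that str.strip() leaves in place.
suburban_rail = pd.DataFrame({
    "station_name": ["Odintsovo", "Bakovka\u200b", "\ufeffNemchinovka", "Lobnya"],
    "latitude": [1, 2, 3, 4],
})
d1_names = ["Odintsovo", "Bakovka", "Nemchinovka", "Lobnya"]

# Exact matching misses the two polluted names, although str.contains finds them.
before = suburban_rail[suburban_rail["station_name"].isin(d1_names)]
found_by_contains = suburban_rail["station_name"].str.contains("Bakovka").any()

# repr() makes the hidden characters visible for inspection.
print(suburban_rail["station_name"].map(repr).tolist())

# Normalize both sides, then match on the cleaned values.
clean_names = suburban_rail["station_name"].map(normalize_name)
clean_targets = [normalize_name(n) for n in d1_names]
after = suburban_rail[clean_names.isin(clean_targets)]
print(len(before), len(after))
```

If the repr output shows nothing unusual, the mismatch has a different cause, for example look-alike Latin and Cyrillic letters, and the names need to be compared character by character.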