Sorry for the title, I couldn't come up with one that described this issue succinctly & accurately.
Say you have a dataframe such as:
                  Time       Temp      RH Sensor  Unit
0  2015-12-07 00:06:00  14.912000  42.324      A     1
1  2015-12-07 00:12:00  14.768000  42.371      A     2
2  2015-12-07 00:18:00  14.601000  42.415      A     1
3  2015-12-07 00:24:00  14.457000  42.462      A     4
...
And you want to subset these data by the Unit column. If you have the Unit value you want to use to create the subset, you could do:
subset = df[df['Unit'] == 4]
...and if you wanted to subset with multiple Unit values you could do:
subset = df[(df['Unit'] == 4) | (df['Unit'] == 1)]
The problem I have is that I am using a for loop to do these operations, and the number of Units included changes (the length of the value list varies from 1 to 3). In other words, imagine Unit is a list of lists that I am looping through:
for i in Unit:
    subset = df[df['Unit'] == i]
...
Of course, the above will work when i is a single value, but not when it is a list of multiple values. Is there a way to do this without an if statement?
If I understand correctly, you're trying to use boolean indexing against a list of values? For example, take the DataFrame below:
df
a
0 12
1 65346
2 1243
3 63
4 568
5 243
and you'd like to index on this list of conditions:
conditions = [12, 568]
You can use the Series method isin():
df[df['a'].isin(conditions)]
a
0 12
4 568
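Applied to the loop in the question, this might look something like the sketch below. The DataFrame is rebuilt from the four sample rows shown in the question, and unit_groups is a hypothetical name and a made-up list of lists standing in for the one being looped over; the real values will differ.

```python
import pandas as pd

# Sample data rebuilt from the rows shown in the question.
df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2015-12-07 00:06:00",
        "2015-12-07 00:12:00",
        "2015-12-07 00:18:00",
        "2015-12-07 00:24:00",
    ]),
    "Temp": [14.912, 14.768, 14.601, 14.457],
    "RH": [42.324, 42.371, 42.415, 42.462],
    "Sensor": ["A", "A", "A", "A"],
    "Unit": [1, 2, 1, 4],
})

# Hypothetical list of lists; each inner list holds 1 to 3 Unit values.
unit_groups = [[4], [1, 4], [1, 2, 4]]

subsets = []
for units in unit_groups:
    # isin() builds one boolean mask regardless of how many values are listed,
    # so no if statement is needed to handle the different lengths.
    subset = df[df["Unit"].isin(units)]
    subsets.append(subset)
```

Note that isin() expects a list-like argument, so this relies on every element of the outer list being a list, even when it holds a single value. If a bare scalar can appear, wrap it in a list first (for example [4] rather than 4).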