I'm working with this dataframe:
Keyword URL Type Position_Group
A A Ad 1
A B Ad 2
A C Organic 1
A D Organic 2
A E Organic 3
A F Featured_Snippet 1
..
A P Organic 20
A Q Organic 21
A R Ad 6
I want to be able to filter up to Type
= Organic
& Position_Group
<= 20
while also being inclusive of the other Type
(Ad
& Featured_Snippet
) since its results fall within top 20 organic positions. The goal of this use case is to be able to filter up to top 20 Organic
positions while also capturing other Type
in between.
The expected output should be like this:
Keyword URL Type Position_Group
A A Ad 1
A B Ad 2
A C Organic 1
A D Organic 2
A E Organic 3
A F Featured_Snippet 1
..
A P Organic 20
Thanks in advance!
Assuming the dataframe is df
:
df.iloc[: list(df[(df["Type"] == "Organic") & (df["Position_Group"] == 20)].index)[0]+1]
This filters the dataframe by Type == "Organic"
and Position_Group == 20
, then returns the index to a list and takes the first item (there should only be one item, but I have found the list is necessary to get the actual value, not Int64Index([6], dtype='int64')
). 1 is added to this so that it is included, and the rows up to there are taken from the dataframe.
For the addition of for each x
in Keyword
, it can be completed with a for loop:
for i, j in df.groupby("Keyword"):
j.reset_index(drop=True, inplace=True)
print(j.iloc[: list(j[(j["Type"] == "Organic") & (j["Position_Group"] == 20)].index)[0]+1])
I have used print
here, but if you wanted to create a new dataframe with only these rows:
df2 = pd.DataFrame(columns=df.columns)
for i, j in df.groupby("Keyword"):
j.reset_index(drop=True, inplace=True)
df2 = pd.concat([df2, j.iloc[: list(j[(j["Type"] == "Organic") & (j["Position_Group"] == 20)].index)[0]+1]])