How to use top_n for conditional extraction

So I have a column (category) that contains either "Yes" or "No" in my df and in order to create a more balanced sample I want to select the rows with the first 500 cases of "Yes" and the first 500 cases of "No" from my dataset.

I've tried this code:

top_n(df,500, category=="Yes")

But this selects ALL cases of "Yes" instead of only the first 500. I also tried the following, which gave me an error, though I'm sure it makes no sense:

df %>% filter(top_n(500, category == "Yes") & top_n(500, category=="No"))

I need a bit of help getting pointed in the right direction.

Solution

I'd probably just use head for this and filter directly on the data frame:

df1 <- head(df[df$category == "Yes",], 500)
df2 <- head(df[df$category == "No",], 500)

# to combine
out <- rbind(df1, df2)

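As far as I know, top_n ranks rows by the variable you give it and keeps ties, which would explain why it returned every "Yes" row rather than just the first 500. I expect there is a nicer way with dplyr, but this should work :)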
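For a dplyr version of the same idea, one option is to group by category and take the first rows of each group with slice_head (available in dplyr 1.0.0 and later). This is only a sketch: the toy df below, including its id column, is made up for illustration and stands in for the real data.

```r
library(dplyr)

# Hypothetical stand-in for the asker's data: 3000 rows, "Yes"/"No" at random
set.seed(1)
df <- data.frame(
  id = 1:3000,
  category = sample(c("Yes", "No"), 3000, replace = TRUE)
)

# Keep the first 500 rows of each category; row order within a group is preserved
out <- df %>%
  group_by(category) %>%
  slice_head(n = 500) %>%
  ungroup()

# Check how many rows were kept per category
table(out$category)
```

If a category has fewer than 500 rows, slice_head simply returns all of them, so the result can end up unbalanced in that case.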
