Is it possible to select a random (or pseudo-random) subset from a database using a function like dplyr::sample_n() but in dbplyr or another R package that runs SQL queries?
The purpose is to test queries on small batches before running a time-intensive data pull from the database.
This seems to be working for our MySQL server:
dbGetQuery(con,
"SELECT *
FROM data_table_name
ORDER BY RAND()
LIMIT 500;")