I am hoping to use SQLite3's generate_series Table-Valued Function from within R's sqldf
library, which in turn uses the RSQLite
library. I expect once installed/configured, I will be able to call it as:
sqldf('SELECT value FROM generate_series(5,100,5)')
From reading Run-Time Loadable Extensions, I understand I will need to first call load_extension(X), specifically as
sqldf('select load_extension("C:\\my\\path\\to\\sqlite\\pathext\\misc\\series.c")')
However, I also read that Extension loading is off by default, and I can not figure out how I will have to turn it on from within a sqldf
call. Is this possible at all? Will I have to compile my own sqlite3 with this default toggled?
Further, I would like to confirm that there is no shortcut to Compiling A Loadable Extension myself with MinGW, because apparently without compiling ext/misc/series.c, I will find that SQLite generate_series is missing, and none of the Downloads provide it compiled.
If I understand correctly, I will NOT have to compile all of SQLite, I can use the downloaded version in conjunction with the extension compiled by me.
Finally, how do I assure that the version of sqlite3 that sqldf
/ RSQLite
picks up is the one I've configured? Will this force me to compile my own RSQLite
rather than simply install.packages(c('RSQLite'))
?
All this seems like a lot of work just for the sake of this one extension. Is there a better way to go?
Thanks for your helps!
Suggest you submit an issue on the RSQLite github site for it to be added but in the meantime here are some workarounds that don't involve messing with rebuilding SQLite and RSQLite.
1) Recursive CTE This can be done without the extension using a CTE as shown in https://www.sqlite.org/series.html .
To be specific try this sqldf code:
library(sqlite)
start <- 1
end <- 3
step <- 1
fn$sqldf("
WITH RECURSIVE generate_series(value) AS (
SELECT $start
UNION ALL
SELECT value+$step FROM generate_series
WHERE value+$step<=$end
) SELECT value FROM generate_series
")
## value
## 1 1
## 2 2
## 3 3
2) Non-recursive CTE Suppose we have a data frame known to have at least the required number of rows. It's content does not otherwise matter. For example, the built in BOD
data frame has 6 rows and suppose we want 3 rows which is no more than 6 so it will work.
fn$sqldf("
WITH generate_series(value) AS (
SELECT $start + $step * (rowid - 1) FROM BOD LIMIT ($end - $start) / $step + 1
) SELECT value FROM generate_series
")
## value
## 1 1
## 2 2
## 3 3
If start
and step
are 1 this can be simplified to:
fn$sqldf("
WITH generate_series(value) AS (
SELECT rowid FROM BOD LIMIT $end
) SELECT value FROM generate_series
")
3) Create in R Another possibility is to create the series in R and then just use it in sqldf
:
generate_series <- data.frame(value = seq(start, end, step))
sqldf("SELECT value FROM generate_series")
## value
## 1 1
## 2 2
## 3 3