I struggle to call a plr function in postgreSQL from a R-script and use it in ggplot2 - geom_function. The following examples are extremly simplified but hopefully show the issue.
Let's assume I have the following plr function:
CREATE OR REPLACE FUNCTION public.mypgfunc(
x numeric,
a numeric)
RETURNS numeric
LANGUAGE 'plr'
COST 100
VOLATILE PARALLEL UNSAFE
AS $BODY$
return (x*a)
$BODY$;
To call this from ggplot - geom_function I thought, I could write the following simple wrapper function to do the query (I use the rpostgres package):
myWrapper <- function(x , a) {
con <- dbConnect(drv = RPostgres::Postgres() , dbname='mydb')
q <- dbSendQuery(con , "select mypgfunc( $1 , $2 )")
dbBind(q , c(x,a))
y <- dbFetch(q)
dbClearResult(q)
dbDisconnect(con)
return(y)
}
But if I now call this function from ggplot, I get the following warning message and an empty plot:
Computation failed in
stat_function()
: Query requires 2 params; 102 supplied.
The ggplot code looks as follows:
ggplot() +
geom_function(fun = myWrapper , args = list(a = 5))
If I write the plr function in R instead and call this from geom_function, everything works fine. If I call the myWrapper directly (outside ggplot) with just one value for x and a, respectively, everything works fine as well.
So, what do I have to change?
I don't now much about using SQL databases in R. But I think I know the reason of your error.
If you look at the help page for stat_function
(which is what geom_function
uses under the hood), you will see that, by default, it creates 101 values of x over an arbitrary range (I think it is [0,1]).
This is why the error message mentions that "Query requires 2 params; 102 supplied". You are concatenating x
(which is a vector of length 101) with a
(length 1), when you call dbBind(q , c(x,a))
.
The size of x
is defined by the parameter n
. So if you run:
ggplot() + geom_function(fun = myWrapper , args = list(a = 5), n=1)
You are passing a single value of x
to myWrapper
(I think x=0
) and you should get the same result as the situation you described in the last sentence (I got a warning message because ggplot could not draw a line with only 1 point).
So, basically, you need to do separate queries for each value of the vector x
. A straight forward way of doing this is looping over the values of x
:
y <- NULL
for (xx in x) {
q <- dbSendQuery(con , "select mypgfunc( $1 , $2 )")
dbBind(q , c(xx, a))
if (is.null(y)) {
y <- dbFetch(q)
} else {
y <- rbind(y, dbFetch(q))
}
dbClearResult(q)
}
Like I said, I have not worked with SQL in R. So I am sure there are smarter ways of doing this where you don't need to call an ugly for
loop (not sure also if you need the dbSendQuery
inside the loop).