Really enjoyed using Rmarkdown/Knitr execution of SQL chunks recently, however it seems there is a limitation for more complex queries. I've been using a Redshift data base with an RJDBC powered connection.
Specifically using Common Table expressions with INNER JOIN
:
```{sql, connection=redshift, output.var="Field_count"}
WITH
cte AS (
SELECT DISTINCT field
FROM
table
WHERE date >= '2017-01-01'
)SELECT count(DISTINCT field)
FROM cte
INNER JOIN table_2 t2 ON t2.join_here = cte.join_here;
```
I've successfully used CTE and INNER JOIN in individual queries, but combining them leads to an error:
Error in .verify.JDBC.result(r, "Unable to retrieve JDBC results set for ", : Unable to retrieve JDBC result set for WITH
cte AS (
SELECT DISTINCT field
FROM
table
WHERE date >= '2017-01-01'
)SELECT count(DISTINCT field)
FROM cte
INNER JOIN table_2 t2 ON t2.join_here = cte.join_here; ([Amazon] (500335) One query is expected: WITH
cte AS (
SELECT DISTINCT field
FROM
table
WHERE date >= '2017-01-01'
)SELECT count(DISTINCT field)
FROM cte
INNER JOIN table_2 t2 ON t2.join_here = cte.join_here;) Calls: <Anonymous> ... dbSendQuery -> dbSendQuery -> .local -> .verify.JDBC.reslt Execution Halted
It appears that this is being translated wrongly/duplicated on the querying out of the RStudio session.
Is this a known limitation? If so is it documented anywhere?
Is this a known bug? If so is it in an existing issue?
EDIT: This is only visibly happening within the knit process, individually running chunks interactively returns expected results.
I'd suggest trying another version on Redshift JDBC driver.
I got the same error, however, it was while using Pentaho so mileage may vary. I didn't find any documentation or other info related to this. My problem got solved by switching away from the newest driver version. I was using RedshiftJDBC4-1.2.1.1001.jar (JDBC 4.0 compatible) and switched to RedshiftJDBC4-1.1.10.1010.jar.