I've run into a strange problem that occurs on my CentOS 6.5 box hosted on AWS that does not occur on my Windows machine.
When sending a large dbSendQuery (string with 15,000+ bytes/characters), I get error messages. With smaller queries, it runs fine.
Initially, I thought it was an issue with my Postgres installation, so I updated to 9.4 but still got the same error with RPostgreSQL. My next thought was that it was an SSL issue, so I tried with RJDBC with the sslmode=disable flag, but that did not work either.
This is the error message with RPostgreSQL:
> library(RPostgreSQL)
Loading required package: DBI
> rsdrv <- dbDriver("PostgreSQL")
> rscon <- dbConnect(rsdrv, dbname="dbname",host="redshift.connection.string",port=5439,user="username",password="password")
> dbSendQuery(rscon,query)
Error in postgresqlExecStatement(conn, statement, ...) : RS-DBI driver: (SSL SYSCALL error: Connection timed out)
This is the error with RJDBC:
> library(RJDBC)
> pgsql <- JDBC("org.postgresql.Driver", "/postgresql-9.3-1102.jdbc41.jar", "`")
> con <- dbConnect(pgsql, "jdbc:postgresql://redshift.connection.string:5439/dev?sslmode=disable", "username", "password")
> dbSendQuery(con,query)
Error in .jcall(s, "Ljava/sql/ResultSet;", "executeQuery", as.character(statement)[1], : ignoring SIGPIPE signal
Here is the sessioninfo() on my CentOS 6.5
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RPostgreSQL_0.4 DBI_0.3.1
Here is the sessionInfo() on the Windows 7:
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RPostgreSQL_0.4 DBI_0.3.1
loaded via a namespace (and not attached):
[1] tools_3.1.2
I would attach a sample dbSendQuery() query that causes these errors, but since they are 15,000 characters I'm not sure this is a good idea. The use case for this is to auto-generate a CREATE TABLE statement, in my case 500++ columns. On my Windows machine, it sends the query in half a second, and on the Redshift side, creates the table almost instantly, so there is no issue with the query taking too much time (and timing out..?)
Thanks!
This sounds like it may be related to the MTU of your network interface. Can you please try and adjust the MTU to 1500: Example Setting MTU on CentOS: ip link set dev eth0 mtu 1500
add MTU=1500 to the end of that /etc/sysconfig/network-scripts/ifcfg-eth0