Search code examples
rrpostgresqlrjdbcr-dbi

Error on dbSendQuery with RPostgreSQL and RJDBC on CentOS 6.5 and not Windows 7


I've run into a strange problem that occurs on my CentOS 6.5 box hosted on AWS that does not occur on my Windows machine.

When sending a large dbSendQuery (string with 15,000+ bytes/characters), I get error messages. With smaller queries, it runs fine.

Initially, I thought it was an issue with my Postgres installation, so I updated to 9.4 but still got the same error with RPostgreSQL. My next thought was that it was an SSL issue, so I tried with RJDBC with the sslmode=disable flag, but that did not work either.

This is the error message with RPostgreSQL:

> library(RPostgreSQL)
Loading required package: DBI
> rsdrv <- dbDriver("PostgreSQL")
> rscon <- dbConnect(rsdrv, dbname="dbname",host="redshift.connection.string",port=5439,user="username",password="password")
> dbSendQuery(rscon,query)
Error in postgresqlExecStatement(conn, statement, ...) : RS-DBI driver: (SSL SYSCALL error: Connection timed out)

This is the error with RJDBC:

> library(RJDBC)
> pgsql <- JDBC("org.postgresql.Driver", "/postgresql-9.3-1102.jdbc41.jar", "`")
> con <- dbConnect(pgsql, "jdbc:postgresql://redshift.connection.string:5439/dev?sslmode=disable", "username", "password")
> dbSendQuery(con,query)
Error in .jcall(s, "Ljava/sql/ResultSet;", "executeQuery", as.character(statement)[1],  :  ignoring SIGPIPE signal

Here is the sessioninfo() on my CentOS 6.5

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RPostgreSQL_0.4 DBI_0.3.1

Here is the sessionInfo() on the Windows 7:

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RPostgreSQL_0.4 DBI_0.3.1      

loaded via a namespace (and not attached):
[1] tools_3.1.2

I would attach a sample dbSendQuery() query that causes these errors, but since they are 15,000 characters I'm not sure this is a good idea. The use case for this is to auto-generate a CREATE TABLE statement, in my case 500++ columns. On my Windows machine, it sends the query in half a second, and on the Redshift side, creates the table almost instantly, so there is no issue with the query taking too much time (and timing out..?)

Thanks!


Solution

  • This sounds like it may be related to the MTU of your network interface. Can you please try and adjust the MTU to 1500: Example Setting MTU on CentOS: ip link set dev eth0 mtu 1500

    add MTU=1500 to the end of that /etc/sysconfig/network-scripts/ifcfg-eth0