Tags: sql-server-2008, jdbc, large-data

Unable to insert 250MB byte array onto remote SQL Server


I am generating a byte array of size about 250MB. This is then being inserted into a table in a remote Microsoft SQL Server using standard JDBC code in a Java application.

My first setup - application running on one Amazon EC2 instance, Microsoft SQL Server running on a separate Amazon EC2 instance. In this case the INSERT statement works - pass.

My second setup - application running locally on my laptop, database hosted on the Amazon EC2 instance. Here the connection is made correctly, but on the line
pstmt.executeUpdate();
(with the byte array already bound via pstmt.setBytes(1, byteArray)) the program keeps running without terminating, even after half an hour - fail.
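For reference, a hedged sketch of how such an insert is commonly written against SQL Server. The table name `images`, the column type `VARBINARY(MAX)`, and the helper method are illustrative assumptions, not the poster's actual schema; streaming the bytes lets the driver send the value in pieces instead of buffering the full 250 MB at once:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;

public class BlobInsert {
    // Hypothetical table and column; adjust to your schema.
    static final String INSERT_SQL = "INSERT INTO images (data) VALUES (?)";

    // Binds the array as a stream so the driver can transmit it in chunks,
    // then runs the INSERT. Note executeUpdate(), not executeQuery():
    // an INSERT returns no result set.
    static void insertBlob(Connection conn, byte[] data) throws Exception {
        try (PreparedStatement pstmt = conn.prepareStatement(INSERT_SQL)) {
            InputStream in = new ByteArrayInputStream(data);
            pstmt.setBinaryStream(1, in, data.length);
            pstmt.executeUpdate();
        }
    }
}
```

Even with streaming, the bytes still cross the same slow link, so this alone will not fix a bandwidth problem; it mainly avoids holding two copies of the payload in memory on the client.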

To check that I am actually able to access and insert data into the remote DB from my local machine, I wrote a trivial application to insert one integer into the table - pass.

I suspect that since I do not have a particularly fast internet connection, my data upload fails. Is there any way to speed up the data transfer from JDBC to the remote SQL Server database?

I have thought about increasing the packet size on the SQL Server, but I am not sure that is the right approach, since the same application works when run from another Amazon EC2 instance - it only fails when run from my local machine. Running an Amazon EC2 instance for every developer on the team is an expensive proposition.


Solution

  • Your problem is the network (latency and limited upload bandwidth), not the database itself.

    Think about how you can optimize the transfer of the file to the EC2 instance. Sending a 250 MB file synchronously in a single request is never a good idea and will regularly cause problems. Uploading the file and inserting it into the database are two different things - do not mix them.

    Things you should do:

    1. Compress the file as much as you can before sending it, AND
    2. Use FTP or HTTP (similar performance) to upload the file to the server, then do whatever you want with it on the server side (like inserting it into the database).

    OR

    1. The other, super-awesome way of transmitting a large file over the network: split the file into chunks (after compressing it), send the chunks asynchronously (e.g. Twisted in Python or NIO in Java), and finally merge them on the server side (use a checksum to validate the integrity of the reassembled file).
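    The compress, split, merge, and verify steps above can be sketched locally in Java; the actual async transport (HTTP, FTP, or NIO) is omitted, and the chunk size and helper names are illustrative choices, not a fixed protocol:

    ```java
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.security.MessageDigest;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.GZIPOutputStream;

    public class ChunkedTransfer {
        static final int CHUNK_SIZE = 4 * 1024 * 1024; // 4 MB per chunk (tunable)

        // Step 1: compress before sending.
        static byte[] gzip(byte[] data) throws Exception {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(data);
            }
            return bos.toByteArray();
        }

        // Inverse of gzip(), used on the server after reassembly.
        static byte[] gunzip(byte[] data) throws Exception {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = gz.read(buf)) > 0) bos.write(buf, 0, n);
            }
            return bos.toByteArray();
        }

        // Split into fixed-size chunks; the last chunk may be shorter.
        static List<byte[]> split(byte[] data) {
            List<byte[]> chunks = new ArrayList<>();
            for (int off = 0; off < data.length; off += CHUNK_SIZE) {
                chunks.add(Arrays.copyOfRange(data, off,
                        Math.min(off + CHUNK_SIZE, data.length)));
            }
            return chunks;
        }

        // Server side: merge the chunks back in order.
        static byte[] merge(List<byte[]> chunks) throws Exception {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            for (byte[] c : chunks) bos.write(c);
            return bos.toByteArray();
        }

        // Checksum to validate integrity after the merge.
        static byte[] sha256(byte[] data) throws Exception {
            return MessageDigest.getInstance("SHA-256").digest(data);
        }
    }
    ```

    Comparing the checksum of the merged bytes against the checksum computed before sending tells the server whether any chunk was lost or corrupted; only after that check passes should the file be decompressed and inserted into the database.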