sql-server, r, microsoft-r

rxImport potential issue in RevoScaleR


I have a connection to a table on my SQL Server, which I have defined as a data source with the following line:

master_table <- RxSqlServerData(etc...)

Then, my goal is to import this table using rxImport and save it to an .xdf file, whose path I have stored as readTest <- 'read_test.xdf'

The table is quite large, so I have set this in my rxImport:

rxImport(master_table, outFile = readTest, rowsPerRead = 100000, reportProgress = 1)

However, it has been running for 10 minutes now, and no progress on rows being read or imported is printed to the screen. Did I do this correctly? I wanted output similar to the "progress" that is printed when an ML algorithm such as RxForest is run.

Thanks.


Solution

  • It's possible that the connection to your SQL Server database is relatively slow. reportProgress only shows progress when a batch of rows is complete, so if the rows are relatively large, you could see nothing printed on the terminal for quite some time.

    For best performance with rxImport(), set rowsPerRead to the largest size that your local machine's memory can handle. This makes progress reports less frequent, but it gives you a faster import time. The only case where this isn't true is when importing an XDF file.
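
    One way to check whether the import is merely slow rather than stuck is to time a small sample first and then run the full import with a more verbose progress level. The following is only a sketch under assumptions: the connection string, the table name master_table, the TOP 100000 sample size, the sample_test.xdf file name, and the rowsPerRead value of 500000 are all placeholders that do not come from the question, so adjust them to your environment and available memory.

```r
library(RevoScaleR)

# Hypothetical connection string; replace server, database, and auth settings.
connStr <- "Driver=SQL Server;Server=myServer;Database=myDb;Trusted_Connection=True"

# 1. Time a small sample to get a rough rows-per-second estimate.
#    The table name and the TOP 100000 limit are assumptions.
sample_src <- RxSqlServerData(
  sqlQuery = "SELECT TOP 100000 * FROM master_table",
  connectionString = connStr
)
timing <- system.time(
  rxImport(sample_src, outFile = "sample_test.xdf", overwrite = TRUE)
)
print(timing)  # elapsed seconds for the sample import

# 2. Run the full import with a larger batch size.
#    500000 is a placeholder; pick the largest value your memory allows.
master_table <- RxSqlServerData(
  table = "master_table",
  connectionString = connStr,
  rowsPerRead = 500000
)
readTest <- "read_test.xdf"

# reportProgress = 2 reports rows processed together with timings.
rxImport(master_table, outFile = readTest, reportProgress = 2, overwrite = TRUE)

# 3. Confirm the row count and variables in the resulting XDF file.
rxGetInfo(readTest, getVarInfo = TRUE)
```

    If the sample alone takes a long time, the bottleneck is more likely the connection or the row width than the rxImport call itself.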