Tags: sql, sql-server, r, sql-server-2016

R in SQL Server: Output data frame into a table


This probably has a simple answer, but I cannot figure it out as I'm still getting the hang of working with R in SQL Server. I have a piece of code that reads data from a SQL Server table, processes it in R, and returns a data frame.

execute sp_execute_external_script
    @language=N'R',
    @script=N'inp_dat=InputDataSet
    inp_dat$NewCol=max(inp_dat$col1,inp_dat$col2)
    new_dat=inp_dat
    OutputDataSet=new_dat',
    @input_data_1=N'select * from IM_COMP_TEST_SQL2016.dbo.temp_table';

I want to insert new_dat into a SQL Server table (select * into new_table from new_dat). How do I go about this?


Solution

  • As shown in this tutorial, you can use INSERT INTO ... EXEC to load a previously created table whose columns align with the data frame returned by the script:

    INSERT INTO Table1
    execute sp_execute_external_script
        @language=N'R',
        @script=N'inp_dat <- InputDataSet
                  inp_dat$NewCol <- max(inp_dat$col1,inp_dat$col2)
                  new_dat <- inp_dat',
        @input_data_1=N'SELECT * FROM IM_COMP_TEST_SQL2016.dbo.temp_table',
        @output_data_1_name=N'new_dat';
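
    Because INSERT INTO ... EXEC needs the target table to exist first, a minimal sketch of creating it might look like the following. The column names follow the question's data frame (col1, col2, NewCol), but the data types are assumptions, as is the premise that temp_table contains only col1 and col2. INSERT ... EXEC matches columns by position, so the order here should mirror the data frame's column order.

    ```sql
    -- Hypothetical target table for the data frame returned by the R script.
    -- Assumption: temp_table has exactly two numeric columns, col1 and col2;
    -- adjust the names, types, and order to match the real data.
    CREATE TABLE dbo.Table1 (
        col1   float,
        col2   float,
        NewCol float   -- column added inside the R script
    );
    ```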
    

    However, a make-table query (SELECT ... INTO) cannot select directly from EXEC, so it may require OPENQUERY() or OPENROWSET() with an ad-hoc distributed query, as described in this SO post, to return the output of a stored procedure:

    Stored Procedure

    CREATE PROCEDURE dbo.R_DataFrame
    
    AS
    
    BEGIN
        execute sp_execute_external_script
            @language=N'R',
            @script=N'inp_dat <- InputDataSet
                      inp_dat$NewCol <- max(inp_dat$col1,inp_dat$col2)
                      new_dat <- inp_dat',
            @input_data_1=N'SELECT * FROM IM_COMP_TEST_SQL2016.dbo.temp_table',
            @output_data_1_name=N'new_dat'
            -- ADD ALL COLUMN TYPES
            WITH RESULT SETS (([col1] varchar(20), [col2] float, [col3] int ...));
    END
    GO
    

    Action Query

    SELECT * INTO Table1 
    FROM OPENROWSET('SQLNCLI', 'Server=(local);Trusted_Connection=yes;',
                    'EXEC dbo.R_DataFrame')
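
    OPENROWSET with an ad-hoc connection string only works when the server option "Ad Hoc Distributed Queries" is enabled, and it is off by default. As a sketch, assuming you have permission to change server settings (typically sysadmin), it can be turned on like this:

    ```sql
    -- Expose advanced options so the setting below becomes visible.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;

    -- Allow ad-hoc OPENROWSET / OPENDATASOURCE calls on this instance.
    EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
    RECONFIGURE;
    ```

    This is a server-wide setting, so check with whoever administers the instance before changing it.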