
Encountered error "Cannot convert type System.Nullable`1[System.Int64][] to an R vector"


I'm trying to run a job on Data Lake Store, but I get an error.

I embedded an R script in my U-SQL script.

In the R script I use the dataset to calculate percentiles of my variable, and as output I create a data frame that contains the results.

This is part of my script:

REFERENCE ASSEMBLY [ExtR]; 
DECLARE @data string = @"/output/model/...";
DECLARE @Model_traffic_percentile_outputfile string = "/output/model/...";
DECLARE @myRScript string = @"
prob <- c(0.9999995,0.9999996,0.9999997,0.9999998,0.9999999,1)
values <- quantile(inputFromUSQL$total_bytes, probs = prob, type = 6)
outputToUSQL <- data.frame(values, prob)";

@input = 
EXTRACT [Period] string,
        [H_IMSI_BK] long,
        [H_BTSCarrierExternalCode_BK] long,
        [sum_session_duration] long,
        [sum_session_bytes_in] long,
        [sum_session_bytes_out] long,
        [sum_session_count] long,
        [row_count] long
FROM @data
USING Extractors.Csv(skipFirstNRows:1);

@imsi_traffic_data =
SELECT [H_IMSI_BK],
       SUM(([sum_session_bytes_in] + [sum_session_bytes_out]) * [row_count]) AS [total_bytes]
FROM @input
GROUP BY [H_IMSI_BK];

@ExtendedData =
SELECT [total_bytes] AS Par,
   *
FROM @imsi_traffic_data;

@RScriptOutput = REDUCE @ExtendedData ON Par
  PRODUCE Par, values long, prob float
  READONLY Par
  USING new Extension.R.Reducer(
    command:@myRScript,
    rReturnType:"dataframe",
    stringsAsFactors:false);

OUTPUT @RScriptOutput TO @Model_traffic_percentile_outputfile
  USING Outputters.Csv(outputHeader : true, quoting : false);

But I get this error:

Description

Vertex failure triggered quick job abort. Vertex failed: SV2_Aggregate[0] 
with error: Vertex user code error.

Details

Vertex SV2_Aggregate[0].v1 {669A5438-5EFD-437D-906C-F069CCD2C5B4} failed 

Error:
Vertex user code error

exitcode=CsExitCode_StillActive Errorsnippet=

INNERERROR

Description

Unhandled exception from user code: "Cannot convert type 
System.Nullable`1[System.Int64][] to an R vector"
The details includes more information including any inner exceptions and the stack trace where the exception was raised.

Does anyone know how to solve this?

Thanks


Solution

  • The problem is that the R script cannot handle 64-bit integer data types, which is why the long? column (System.Nullable`1[System.Int64]) could not be converted to an R vector.

    I had created the input dataset with the script generated by default by the Create EXTRACT script command, which in this case assigned the long data type (a 64-bit integer) to every field.

    So I changed the data types in the EXTRACT script as follows:

    @InputData = 
        EXTRACT [Period] string,
                [H_IMSI_BK] string,
                [H_BTSCarrierExternalCode_BK] string,
                [sum_session_duration] int,
                [sum_session_bytes_in] double,
                [sum_session_bytes_out] double,
                [sum_session_count] int,
                [row_count] int
        FROM @data
        USING Extractors.Csv(skipFirstNRows:1);
    

    To handle the nullable types, I changed the aggregation as follows:

    @imsi_traffic_data =
    SELECT [H_IMSI_BK],
           SUM(([sum_session_bytes_in] + [sum_session_bytes_out]) * [row_count]) ?? 0 AS [total_bytes]
    FROM @InputData
    GROUP BY [H_IMSI_BK];
    

    With these changes the script works correctly.
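
    A possible alternative, sketched here under assumptions rather than taken from the tested fix: keep the generated EXTRACT with its long columns and convert only the values that are passed to the reducer. The rowset and column names follow the question; the casts and the ToString() call are assumptions and have not been run against this dataset.

    ```usql
    // Sketch only: assumes @input still comes from the generated EXTRACT (all long).
    @imsi_traffic_data =
        SELECT [H_IMSI_BK],
               SUM(([sum_session_bytes_in] + [sum_session_bytes_out]) * [row_count]) AS [total_bytes]
        FROM @input
        GROUP BY [H_IMSI_BK];

    // The error message indicates the aggregated column is long?.
    // Replace null with 0, then cast to double so that no 64-bit
    // integer column reaches the R reducer.
    @ExtendedData =
        SELECT (double) ([total_bytes] ?? 0) AS Par,
               [H_IMSI_BK].ToString() AS [H_IMSI_BK],
               (double) ([total_bytes] ?? 0) AS [total_bytes]
        FROM @imsi_traffic_data;
    ```

    The PRODUCE clause may need the same treatment (values declared as double rather than long), since quantile() returns doubles.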