Search code examples
sasuploadsnowflake-cloud-data-platformroundingrounding-error

How many rows will SAS check to determine if the input data contains decimal?


Recently I found a problem when uploading excel file to snow flake using SAS.

My excel include 2 columns 1400 rows, in the first column only 11 rows have 2 decimal place, in the second column, most of the data have 2 decimal place.

When I upload the file to snowflake, in the first column all 11 rows with 2 decimal place have been rounded to the nearest whole number. but in the second column all decimal numbers remains.

I heard that the Power Query will check the first 200 rows, if all the first 200 rows are whole number it will round following decimal to whole number. What about SAS? Will SAS also check certain amount of rows to decide the data format for the rest of the file, and how many rows will SAS check?

The code I used to upload the excel

FILENAME REFFILE '/sample.xlsx';

PROC IMPORT DATAFILE=REFFILE
    DBMS=XLSX
    OUT=WS_CRA.output;
    GETNAMES=YES;
RUN;

Solution

  • The data is being loaded with SAS.

    Their documentation says:

    For some input data sources, such as a Microsoft Excel workbook, the first eight rows of data are scanned. The most prevalent data type (numeric or character) is used for a column. This is the default. If most of the data in the first eight rows is missing, SAS defaults to the CHAR data type and any subsequent numeric data for that column is set to missing. (You can change the default from 8 to 0 in the Windows registry; 0 causes all the rows in the column to be scanned to determine the type.

    https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/acpcref/p0jf3o1i67m044n1j0kz51ifhpvs.htm

    So if you are able to change the default from 8 to 0 in the Windows registry, all rows will be scanned. Otherwise 8.