sas, sas-iml

How do I keep the length of existing character variables when creating a new dataset in PROC IML?


I have a dataset that I manipulate in PROC IML, and I then create a new dataset from some of the manipulated values. When I read the character variables in, the length of one of them changes from 7 to 9.

This is not a real problem, but it is a minor annoyance: when I later merge the new dataset, I get a warning that the variable has different lengths in the two datasets.

Is there a way to keep the length of the original variable?

Sample code

data data1;
infile datalines delimiter=',';

input classif :$9. time :$7.;
datalines;
05, 2021_11
051, 2021_11
;
run;

proc iml;
    use work.data1;
    /* INTO stores both variables as columns of one character matrix */
    read all var {classif time} into _temp_1;
    classif = _temp_1[,1];
    time    = _temp_1[,2];
    close work.data1;
    create work.data2 var {classif time};
    append;
    close work.data2;
quit;

Observe how the length of time is 7 in data1, but 9 in data2.
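
One way to confirm the difference without opening each dataset is to query the DICTIONARY.COLUMNS table in PROC SQL. This is only a sketch, and it assumes data1 and data2 were created in the WORK library by the code above.

```sas
/* List the stored length of TIME in both datasets.              */
/* Assumes WORK.DATA1 and WORK.DATA2 exist from the steps above. */
proc sql;
    select memname, name, type, length
        from dictionary.columns
        where libname = 'WORK'
          and memname in ('DATA1', 'DATA2')
          and upcase(name) = 'TIME';
quit;
```

With the sample code above, this should report a length of 7 for data1 and 9 for data2.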


Solution

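  • As @Richard explained, this happens when you read two character variables that have different lengths into the columns of a single matrix. Every element of a SAS/IML character matrix has the same length, so the matrix takes the longer length (9) and the values of time are padded to match. I can think of at least three workarounds. Depending on your application, one of them might be more convenient than the others.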

    proc iml;
    /* Option 1: Read variables into vectors, not a matrix */
    use work.data1;
    read all var {classif time };
    close;
    print (nleng(time))[L="nleng(time)"];
    
    /* Option 2: Allocate time to have LENGTH=7 and copy the data in */
    use work.data1;
    read all var {classif time } into _temp_1;
    close;
    time = j(nrow(_temp_1), 1, BlankStr(7));  /* allocate char vector */
    time[,]   = _temp_1[,2];                  /* copy the data */
    print (nleng(time))[L="nleng(time)"];
    
    /* Option 3: Read into a table instead of a matrix. */
    tbl = TableCreateFromDataSet("work", "data1");
    classif = TableGetVarData(tbl, {"classif"});
    time = TableGetVarData(tbl, {"time"});
    print (nleng(time))[L="nleng(time)"];
    quit;
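
The snippets above only print the length inside PROC IML. As a minimal sketch of the full round trip, here is Option 1 followed by the write to work.data2 from the question, with a PROC CONTENTS step to check the result. It assumes work.data1 exists as defined in the question.

```sas
proc iml;
    /* Option 1: without INTO, each variable is read into its own vector */
    use work.data1;
    read all var {classif time};
    close work.data1;

    /* Each vector keeps its own element length when written out */
    create work.data2 var {classif time};
    append;
    close work.data2;
quit;

/* Check the stored lengths in the new dataset */
proc contents data=work.data2;
run;
```

If this behaves as described in the answer, time should have length 7 in data2, and the later merge should no longer warn about differing lengths for that variable.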