Search code examples
sasenterprise-guide

Size of a SAS dataset is increasing on using a where clause


I have a SAS dataset sized close to 20 GB, I am trying to use a where clause but am unable to do so as the size of the dataset is swelling up and I am getting as error stating: SAS file is full. When I last looked at the dataset being created in the library the size was close to 45GB. My code is:

data fin2;
set fin;
where segment = "a";
run;

segment has 2 possible values a and b (a is close to 50% of the rows)

can anyone tell me what is happening and why is the size of dataset increasing. I tried doing this with an if command also but to no avail.


Solution

  • The code you provide creates new dataset fin2 keeping existing dataset fin untouched. So I assume you're not happy that fin has 20GB and fin2 has 45GB.

    I expect fin was created earlier, is not stored in WORK library and has some compression option enabled. To verify, check output from:

    proc contents data=fin;run;
    

    Contrary, I expect the compression for fin2 is not enabled.

    To decrease size of fin2 table try doing this with char or binary compression option:

    data fin_segment_a (compress=char);
    set fin;
    where segment = "a";
    drop segment;
    run;