Search code examples
sas

How to compress all datasets in a SAS library?


I have a library (folder) with several large secondary datasets, which were created through time-consuming and complex joins and queries of different large datasets. It took days to process and write some datasets and it would probably take a month to re-do the role thing.

The problem is that these files are not compressed, and now we are out of space in our server. So is there a way to compress all datasets in a library?

I know I could write several datasteps, for example:

DATA folder.graduates2010 (compress=yes);
SET folder.graduates2010;
RUN;

But since there are several datasets and their names do not follow a single pattern (to write a macro to do everything), I thought that maybe there is an easier way of doing it, for example, with proc datasets or something alike.


Solution

  • To do this there are a couple of options:

    Normally you will find a 'sashelp' library in each session, in here you'll find a view 'vtable' that can be filtered to your library in question to access the table names.

    Alternatively, proc datasets can also be used with an output to generate a list of table names.

    Either of these two table lists can then be passed into a macro statement as you've described.

    Do a ctrl+f of vtable here: https://www.lexjansen.com/pharmasug/2015/SS/PharmaSUG-2015-SS01.pdf

    There is some great boiler plate you can take (it has the first option).

    proc sort data=sashelp.vtable out=dsn;
        by memname;
        where libname=‘<your_lib>’;
    run;
    
    %macro zip_dsn;
    
        data _null_;
            set dsn end=last;
                call symput('dsn'||left(_n_),trim(memname));
                if last then call symput('count',_n_);
        run;
    
        %do i = 1 %to &count;
    
            data <your_lib>.&&dsn&i.. (compress=yes);
                set <your_lib>.&&dsn&i..;
            run;
    
        %end;
    %mend zip_dsn;
    %zip_dsn;