Search code examples
sassas-macro

SAS: How to find the dataset in a library that contains the most cases?


As there are many datasets contained in one library. How can I use SAS code to find out which dataset has the largest number of cases? Suppose the library name is "SASHELP".

Thank you!


Solution

  • The SQL dictionary.* family of tables gives access to all sorts of metadata. Be careful, some dictionary requests can cause a lot of activity as it collects the information requested.

    From Docs:

    How to View DICTIONARY Tables

    DICTIONARY Tables and Performance

    When you query a DICTIONARY table, SAS gathers information that is pertinent to that table. Depending on the DICTIONARY table that is being queried, this process can include searching libraries, opening tables, and executing SAS views. Unlike other SAS procedures and the DATA step, PROC SQL can improve this process by optimizing the query before the select process is launched. Therefore, although it is possible to access DICTIONARY table information with SAS procedures or the DATA step by using the SASHELP views, it is often more efficient to use PROC SQL instead.

    Note: SAS does not maintain DICTIONARY table information between queries. Each query of a DICTIONARY table launches a new discovery process.

    Example:

    * Use dictionary.tables to get the names of the tables with the 10 most rowcount;
    
    proc sql;
      reset outobs=10;
    
      create table top_10_datasets_by_rowcount as
      select libname, memname, nobs 
      from dictionary.tables
      where libname = 'SASHELP'
        and memtype = 'DATA'
      order by nobs descending
      ;
    
      reset outobs=max;