Good programmers write clean code and remove temporary datasets that won’t be used.
In most cases, datasets with zero observations are no longer useful. However, you won’t check all your datasets manually and remove them if it has zero observations.
So, how to delete only an empty dataset in SAS? First, you need to check the number of observations. Then, if the number of observations is zero, you remove the dataset. Otherwise, you do nothing.
In this article, we discuss a macro function that checks if a dataset is empty and removes it accordingly. Also, we demonstrate how to remove all empty datasets from a specific library.
Contents
Sample Data
For the example in the next sections, we create an empty dataset based on the famous Iris Flower dataset from Robert Fisher. We only read the structure from the dataset and use the stop keyword to prevent further processing.
/* CREATE EMPTY DATASET */ DATA WORK.MY_EMPTY_DS; SET SASHELP.IRIS; STOP; RUN;
Remove an Empty Dataset with a Macro Function
Create an Macro Function
The first step in creating a macro function that removes a dataset conditionally if it’s empty is to create the structure of the macro. We will call our macro function delete_if_empty.
With the %macro delete_if_empty() and %mend we mark the beginning and end of the macro function. We use %delete_if_empty() to call and run our macro.
%MACRO DELETE_IF_EMPTY(DS); /* 1) COUNT OBSERVATIONS */ /* 2) DELETE DATASET IF NUMBER OF OBSERVATIONS = 0 */ %MEND; %DELETE_IF_EMPTY(DS = WORK.MY_EMPTY_DS);
You could run this function, but, since it has no executable code, it won’t do anything.
Count the Number of Observations
The first step in the maco function is to count the number of observations. SAS provides many ways to count the number of observations in a dataset. One of the fastest ways is by using the descriptor portion of the dataset. This is the method we will use.
DATA _NULL_; IF 0 THEN SET &DS. NOBS=N; CALL SYMPUTX('NOBS', N); STOP; RUN;
The descriptor portion of a dataset contains metadata about this specific dataset. For example, the number of rows is stored in the nobs variable of the metadata. We use the SYMPUTX function to create a macro variable that contains the number of observations.
Since we only want to access the metadata, we use the stop keyword to prevent SAS from further processing the data.
%MACRO DELETE_IF_EMPTY(DS); /* 1) COUNT OBSERVATIONS */ DATA _NULL_; IF 0 THEN SET &DS. NOBS=N; CALL SYMPUTX('NOBS', N); STOP; RUN; %PUT NUMBER OF OBSERVATIONS IN &DS.: &NOBS.; /* 2) DELETE DATASET IF NUMBER OF OBSERVATIONS = 0 */ %MEND; %DELETE_IF_EMPTY(DS = WORK.MY_EMPTY_DS);
If you run the code above, it will show a message in the log with the number of rows of your input dataset ds.
Delete SAS Dataset if it’s Empty
The final step of the macro function is to remove the input dataset if the number of observations is 0.
We use the IF statement to check if the value of the macro variable nobs is zero. If the value of nobs is zero, then we use the PROC SQL procedure to delete the input dataset.
So, the final macro function looks like this:
%MACRO DELETE_IF_EMPTY(DS); /* 1) COUNT OBSERVATIONS */ DATA _NULL_; IF 0 THEN SET &DS. NOBS=N; CALL SYMPUTX('NOBS', N); STOP; RUN; %PUT NUMBER OF OBSERVATIONS IN &DS.: &NOBS.; /* 2) DELETE DATASET IF NUMBER OF OBSERVATIONS = 0 */ %IF &NOBS. = 0 %THEN %DO; PROC SQL; DROP TABLE &DS.; QUIT; %END; %MEND; %DELETE_IF_EMPTY(DS = WORK.MY_EMPTY_DS);
If you execute the code above, then the SAS log will show you that the empty dataset has been removed.
Delete All Empty SAS Datasets from a Library
Above we have shown how to check if one SAS dataset is empty and delete it accordingly. However, how can you remove all empty datasets from the same library at once? There is a very elegant and efficient way to do this.
With the dictionary tables, we can filter all datasets from a library that has 0 observations. To do so, we use the LIBNAME and NLOBS columns from the dictionary tables. In the example below, we check the WORK library for empty datasets.
Then, we use the CATX function to concatenate the library name and the dataset name (with a period in between).
With the SELECT INTO clause, we create a macro variable that stores all the empty datasets separated by a comma. In the example below, the name of the macro variable is droplist.
Finally, we use the DROP TABLE statement and the macro variable to delete all empty datasets.
PROC SQL; SELECT DISTINCT CATX('.',LIBNAME,MEMNAME) INTO :DROPLIST SEPARATED BY ',' FROM DICTIONARY.TABLES WHERE LIBNAME IN ('WORK') AND NLOBS=0; DROP TABLE &DROPLIST; QUIT;
If you want to remove all blank datasets from multiple libraries, you can modify the LIBNAME IN (‘WORK’) statement.