Have you ever had the need to check whether a SAS dataset was empty? In this article, we provide 4 ways how to do this.
You check if a SAS dataset is empty by simply counting the number of observations. If the dataset exists but has zero observations, then the dataset is empty. You can check the number of observations with the COUNT function, the Dictionary Tables, the Descriptor Portion of a dataset, or with a macro function.
For the examples in this article, we create an empty dataset based on the famous Iris Flower dataset from Robert Fisher (1936).
data work.ds_iris_empty; set sashelp.iris; stop; run;
Check if a Dataset is Empty with the COUNT Function
The first and easiest way to check if a dataset is empty is with the COUNT function in a PROC SQL procedure. With this function, you can count the number of observations in a dataset.
This method is perfect for beginners because the code is very readable and easy to remember. However, this method needs to processes the complete dataset to count the number of observations. So, if the dataset isn’t empty and has many observations, this method can be slow.
proc sql; select count(*) as nobs from work.ds_iris_empty; select count(*) into :nobs_1 from work.ds_iris_empty; %put &=nobs_1.; quit;
The macro variable nobs_1 is zero, so the dataset in empty.
In the example above, we counted the number of observations with the COUNT function and stored the result in a macro variable using the SELECT INTO clause. Assigning a macro variable with the number of observations makes it easier to check later on if your table is empty.
Check if a Dataset is Empty with the Dictionary Tables
The second method to check if a SAS dataset is empty uses the Dictionary Tables. You might not be familiar with the Dictionary Tables (like many are), and that’s unfortunate because they are very powerful!
Dictionary Tables contain Metadata, that is data about data, and are available from the moment you start a SAS session. An example of metadata is the number of observations in a dataset. If you modify a dataset during your SAS session, the dictionary tables are updated automatically.
The biggest advantage of this method to check if a SAS dataset is empty is that the number of observations in a dataset is directly available in the Dictionary Tables. It isn’t necessary to process the dataset. This makes this method very fast.
To filter your table’s metadata from the Dictionary Tables, you need to provide the name of your table (memname) and the library (libname). After filtering your table, the number of observations is stored in the nobs column.
proc sql; select nobs from dictionary.tables where libname = 'WORK' and memname = 'DS_IRIS_EMPTY'; select nobs into :nobs_2 from dictionary.tables where libname = 'WORK' and memname = 'DS_IRIS_EMPTY'; %put &=nobs_2.; quit;
The value of the macro variable nobs_2 is 0. Hence, our SAS dataset is empty.
Note that you must provide the libname and memname values in uppercase.
Check if a Dataset is Empty with the Descriptor Portion of the Dataset
The third method to find out if a SAS dataset is empty uses the descriptor portion of a dataset.
A SAS dataset consists of a data portion (the actual data) and a descriptor portion (the metadata). Contrary to the Dictionary Tables, the Descriptor Portion contains only metadata about one specific table. So, with the descriptor portion, you can directly access the number of observations in your dataset without filtering. This makes this method extremely fast.
The number of observations of a dataset is stored in the nobs variable of the descriptor portion. We can access this variable by opening the dataset (if 0 then set work.ds_iris_empty), reading the value of the nobs variable, and closing the dataset directly without further processing (stop).
data _null_; if 0 then set work.ds_iris_empty nobs=n; call symputx('nobs_3', n); stop; %put &=nobs_3.; run;
The dataset is empty because nobs_3 is zero.
Unlike the previous examples, we use the SYMPUTX function to create a macro variable of the number of observations. You can’t access the descriptor portion of the data from the PRC SQL procedure.
Check if a Dataset is Empty with a Macro Function
For the fourth and final method to check if a dataset is empty, we create a macro function. The macro function makes this method reliable and reusable.
The macro function consists of 3 steps and uses metadata to obtain the number of observations. First, we use the OPEN function to open the dataset that we what to check for emptiness. Then, we use the ATTRN function with the obs argument to retrieve the number of observations from the metadata. We store the number of observations in the nobs macro variable. Finally, we close the dataset with the CLOSE function and return the macro variable nobs.
%macro count_obs(ds); %let DSID=%sysfunc(OPEN(&DS.,IN)); %let NOBS=%sysfunc(ATTRN(&DSID.,NOBS)); %let RC=%sysfunc(CLOSE(&DSID.)); &nobs; %mend; %let nobs_4 = %count_obs(work.ds_iris_empty); %put &=nobs_4.;
Because nobs_4 = 0, our dataset is empty.