If you work with data, sooner or later, you need to know the number of observations (rows) in your data set. For example, to check whether a data set is empty or to verify that your join was carried out successfully. In SAS, you can count the number of rows in a table in various ways. In this article, we will show four of them and discuss their advantages and disadvantages.
We explain the four methods with examples. For the examples, we use the famous Iris Flower data set that was introduced by Robert Fisher in 1936. The data set consists of 50 samples from each of the three species of Iris. So, 150 observations in total.
4 Methods to Count the Number of Rows
Method 1: PROC SQL & Count
Probably the easiest way to count the number of rows in a SAS table is with the count-function within a PROC SQL procedure. This method is easy to understand and to remember. However, this has a disadvantage: it could be slow. Especially for large data set because this method reads every row one by one.
/* Method 1 */ proc sql; select count(*) as N from sashelp.iris; quit; proc sql noprint; select count(*) into :nobs_1 separated by ' ' from sashelp.iris; quit; %put &=nobs_1.;
You can create a macro variable with the select into-method for further use.
Method 2: PROC SQL & Dictionary Tables
The second method also uses the PROC SQL procedure, but this time in combination with the Dictionary Tables. Dictionary Tables are metadata (i.e. data about data) and are automatically available when a SAS session starts. They are updated automatically whenever there is a change in a data set.
Amongst others, the Dictionary Tables contain the number of rows of a data set. Considering that the number of rows is directly available in these special tables, it’s unnecessary to read all the rows of the table you are interested in.
Regarding the syntax of this method, the libname= is the name of the library where you have stored your table. The memname= refers to the name of your table. Like the first method, you can create a macro variable with the number of rows using the select into-statement.
/* Method 2 */ proc sql; select nobs from dictionary.tables where libname = 'SASHELP' and memname = 'IRIS'; quit; proc sql noprint; select nobs into :nobs_2 separated by ' ' from dictionary.tables where libname = 'SASHELP' and memname = 'IRIS'; quit; %put &=nobs_2.;
Method 3: Data _NULL_ & Descriptor Portion
The third method to count the number of rows in SAS uses a data step and the descriptor portion. Although this method uses a data step, it doesn’t create a table. This option is one of the most efficient ways to count the number of rows because it doesn’t go over each row of the data set.
A SAS data set consists of two portions. Namely, the data portion which contains the actual data of the table and the descriptor portion. The descriptor portion contains information on the physical data set and on the individual variables. It’s like dictionary table exclusively for one sas data set.
SAS stores the number of observations in a data set in an automatic variable called nobs. The nobs variable is part of the descriptor portion. We use the if 0 statement to ‘open’ the descriptor portion and the stop statement to prevent an endless loop.
/* Method 3 */ data _null_; if 0 then set sashelp.iris nobs=n; put n; stop; run; data _null_; if 0 then set sashelp.iris nobs=n; call symputx('nobs_3', n); stop; run; %put &=nobs_3.;
You can use the symputx-call to create a macro variable of the number of observations. We use the symputx-call because it removes, in contrast to the original symput-call, the trailing blanks.
Method 4: A Macro Function
The fourth method requires creating a macro function. However, because the macro function can be reused, we prefer this method if you need to count the number of rows frequently for different data sets.
The macro function consists of 3 parts and uses the metadata of the data set. First, we open the data set of interest with the open-function. Then, we read the number of rows with the attrn-function and the NOBS option and store it in a macro variable. Finally, we close the data set and return the macro variable.
/* Method 4 */ %macro count_obs(ds); %let DSID=%sysfunc(OPEN(&DS.,IN)); %let NOBS=%sysfunc(ATTRN(&DSID.,NOBS)); %let RC=%sysfunc(CLOSE(&DSID.)); &nobs %mend; %let nobs_4 = %count_obs(sashelp.iris); %put &=nobs_4.;