If you work with surveys, you need to know how to deal with missing data. Although it’s crucial to know how to replace missing data, it’s even more important to know the number of missing data.

So, how do you count the number of missing values in SAS? **You can use the PROC FREQ procedure to count the number of missing values per column. If you want to know the number of missing values per row, you need to NMISS function or the CMISS function.**

In the remainder article, we discuss in more detail how to count the number of missing values per column and per row. We give examples of how to use the PROC FREQ procedure, the NMISS function, and the CMISS functions. We support all examples by images and SAS code.

## Sample Data Missing Values

For the examples in this article, we have created the dataset below. The dataset contains three character columns and three numeric columns. The columns are conveniently named *char_var1 – char_var3* and *num_var1 – num_var3*.

data my_ds; infile datalines dlm=","; input char_var1 $ char_var2 $ char_var3 $ num_var1 num_var2 num_var3; datalines; A, B, C, 1, 2, 3 A, B, ,1, 2, . , , ,., ., 3 A, , C, ., ., . , B, ,1, 2, . A, , ,., 2, . ; run;

## Count the Missing Values by Row

We start by counting the number of missing values per row in a SAS dataset. We separate this section according to the variable type (character, numeric, and both).

### Character Variables

If you want to count the number of **missing character values** per row, you can use the CMISS function. The CMISS function takes as arguments the columns you want to analyze.

So, in our example we could use the following code:

CMISS(char_var1, char_var2, char_var3)

However, since all character variables start with *char_var* we can use the colon (:) modifier to make our code more efficient.

data work.cnt_missing_row_char1; set work.my_ds (keep= _character_); cnt_missing_character = cmiss(of char_var:); run;

Above, we used the actual names of the columns as arguments for the CMISS function. However, if you want to know the number of missing values in a row for **all** **character** **columns**, you can use the special keyword **_character_**. With this keyword, SAS takes into account automatically all character columns and makes your code more efficient.

data work.cnt_missing_row_char2; set work.my_ds (keep= _character_); cnt_missing_character = cmiss(of _character_); run;

### Numeric Variables

You can use the NMISS function to **count the missing numeric values** in SAS. Like the CMISS function, the NMISS function takes as arguments the columns you want to analyze.

Again, we could use the following syntax to count the number of missing numerical values:

NMISS(num_var1, num_var2, num_var3)

However, the we will use the colon modifier to make our code more efficient.

data work.cnt_missing_row_num1; set work.my_ds (keep= _numeric_); cnt_missing_numeric = nmiss(of num_var:); run;

The second method to count all the missing values in numeric columns is using the **_numeric_** keyword. However, if you use this method you must subtract 1 to get the correct answer. Let me explain this.

The numeric columns in our original dataset (*work.my_ds*) are *num_var1*, *num_var2*, and *num_var3*. In the code below, we define a new, extra numeric column, namely *cnt_missing_numeric*. This column will contain the number of missing values per row. So, *cnt_missing_numeric* is also a numeric value. Since this is a new variable, it has no value at first. However, we don’t want to take this column into account while counting the number of missing numeric value. Hence, we need to subtract 1.

data work.cnt_missing_row_num2; set work.my_ds (keep= _numeric_); cnt_missing_numeric = nmiss(of _numeric_)-1; run;

### Character and Numeric Variables

If you want to count all the missing values by row in a SAS dataset, you have two options.

First, you can explicitly name the range of variables you want to take into account as arguments of the CMISS function. So, in this example, all variables between *char_var_1* and *num_var_3*.

data work.cnt_missing_row_all1; set work.my_ds; cnt_missing_all = cmiss(of char_var1 -- num_var3); run;

Second, we can use the special keyword **_all_**. Using this keyword, SAS takes all variables into account while counting the number of missing values. However, like in the previous section, we need to subtract 1 (see above), because the new column *cnt_missing_all* should be ignored.

data work.cnt_missing_row_all2; set work.my_ds; cnt_missing_all = cmiss(of _all_)-1; run;

## Count the Missing Values by Column

Instead of counting the missing values per row, you can also count the number of **missing values per column**.

You can use the PROC FREQ procedure to count the number of missing values per column. You use the statement “**table** *_all_ / missing*” to do this for all variables. If you want the count the number of a specific (type) of variable, you can change the *_all_* keyword for *_character_*, *_numeric_*, or a column name.

proc freq data=work.my_ds; table _all_ / missing; run;

## 2 thoughts on “How to Count the Number of Missing Values in SAS”

Comments are closed.