3 Simple Ways to Find the Most Frequent Value (Mode) in SAS

You are probably here because you want to find the most frequent value (i.e., the mode) of a variable in your dataset. You might also have discovered that SAS does not have a built-in function to calculate the mode (comparable to sum() or mean()). However, there are several ways to find the mode in SAS.

The easiest way to find the most frequent value in SAS is with PROC UNIVARIATE. You only need to provide the name of your dataset and the relevant variable, and SAS automatically calculates the mode. Optionally, you can add other statements to create an output dataset or to calculate the mode per group.

Besides PROC UNIVARIATE, there are other methods to find the most frequent value, for example PROC MEANS and PROC SQL. In this article, we discuss all 3 of them and provide code examples that you can use in your own SAS programs.

Calculate the Mode in SAS with PROC UNIVARIATE

The easiest way to find the most frequent value in SAS is with PROC UNIVARIATE. This Base SAS procedure assesses the distribution of a variable and returns, among other statistics, the mode.

These are the steps to find the mode with PROC UNIVARIATE:

  1. Start the UNIVARIATE procedure

    You start the UNIVARIATE procedure with the PROC UNIVARIATE statement.

  2. Specify the input dataset

    You specify the input dataset with the DATA=-option followed by the name of your dataset. The dataset can reside in the WORK library or in a permanent library.

  3. Add the MODE option

    By default, PROC UNIVARIATE calculates the mode of a variable and, if several values tie, reports the lowest one. However, it doesn’t show the frequency of the most frequent value. To solve this problem, you can add the MODE option (an alias of MODES) to the PROC UNIVARIATE statement. As a consequence, the SAS report contains a separate section that lists the mode (or all modes in the case of a tie) together with its frequency.

  4. Define the relevant variable

    You define the variable you want to examine with the VAR statement. This statement starts with the VAR keyword followed by one or more numeric variables.

  5. Run the UNIVARIATE procedure

    To submit and execute the UNIVARIATE procedure, you finish your code with the RUN statement.

In the example below we calculate the mode of the variable horsepower from the CARS dataset in the SASHELP library.

proc univariate data=sashelp.cars mode;
    var horsepower;
run;

Note that you can’t find the most frequent value of a character variable with PROC UNIVARIATE (or with PROC MEANS), because both procedures only analyze numeric variables. However, you can use PROC SQL to find the mode of a character variable.

Calculate the Mode per Group with PROC UNIVARIATE

Above we explained how to find the overall mode of a variable. However, you can also use PROC UNIVARIATE to calculate the mode within groups.

To find the mode per group in SAS, you need to add the CLASS statement to PROC UNIVARIATE. The statement starts with the CLASS keyword followed by the variable that defines the groups. As a result, PROC UNIVARIATE generates a report per group that contains, among other statistics, the mode.

In the example below, we calculate the mode of the horsepower variable per drivetrain.

proc univariate data=sashelp.cars mode;
    class drivetrain;
    var horsepower;
run;

The drivetrain variable has 3 distinct values, namely All, Front, and Rear. Hence, the UNIVARIATE procedure generates 3 reports, each of which contains the mode of the horsepower variable.

Create an Output Dataset with the Most Frequent Value

By default, PROC UNIVARIATE only generates a report. It doesn’t create a dataset with the mode that you can use in other parts of your program. However, you can add an extra statement to PROC UNIVARIATE to create such an output dataset.

To create an output dataset with the mode in PROC UNIVARIATE, you need to add the OUTPUT statement to your code. The statement starts with the OUTPUT keyword, followed by the OUT=-option and the MODE=-option. With the OUT=-option you define the name of the output dataset. With the MODE=-option you define the name of the variable that contains the mode.

With the SAS code below, we calculate the most frequent value of the horsepower variable and save the result in an output dataset (work.mode_horsepower). The output dataset has one column, called mode, which contains the mode.

proc univariate data=sashelp.cars mode;
    var horsepower;
    output out=work.mode_horsepower mode=mode;
run;

Similarly, you can also create an output dataset with the mode per group. For example, here we calculate the mode per drivetrain and store the result in work.mode_horsepower.

proc univariate data=sashelp.cars mode;
    class drivetrain;
    var horsepower;
    output out=work.mode_horsepower mode=mode;
run;
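
If you only need the output dataset and not the printed report, one option is to suppress the report. The sketch below assumes the NOPRINT option on the PROC UNIVARIATE statement and uses PROC PRINT to inspect the result. The dataset name work.mode_by_drivetrain is only an illustration and does not come from the examples above.

```sas
/* Store the mode per drivetrain without printing the full report */
proc univariate data=sashelp.cars noprint;
    class drivetrain;
    var horsepower;
    output out=work.mode_by_drivetrain mode=mode;
run;

/* Show the output dataset: one row per drivetrain */
proc print data=work.mode_by_drivetrain noobs;
run;
```

Because the report is suppressed, the MODE option on the PROC UNIVARIATE statement is not needed here. The OUTPUT statement alone requests the statistic.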

Use the ROUND=-option to Find the Mode

In the examples above, we’ve calculated the mode of a variable that only contains integers (whole numbers). However, it’s also possible to calculate the mode of a variable that contains numbers with decimals.

For example, the variable engineSize has values with one decimal.

PROC UNIVARIATE has a useful feature to round numbers before it calculates the mode: the ROUND=-option, which specifies the rounding unit. For example, in the code snippets below we calculate the mode without rounding and with rounding to the nearest whole number (ROUND=1). The results differ.

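/* Mode of the original values (one decimal) */
proc univariate data=sashelp.cars mode;
    var engineSize;
run;

/* Mode after rounding the values to the nearest whole number */
proc univariate data=sashelp.cars mode round=1;
    var engineSize;
run;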

Find the Mode in SAS with PROC MEANS

The second method to find the most frequent value of a variable in SAS is with PROC MEANS. By default, this Base SAS procedure helps you analyze your data by showing the number of observations, the mean, the standard deviation, the minimum, and the maximum of a variable.

Although not by default, PROC MEANS can also calculate the mode of a variable. You only need to add the MODE option to the PROC MEANS statement and SAS will create a report with the most frequent value of the relevant variable.

In the example below, we use the MODE option to find the most frequent value of the variable horsepower.

proc means data=sashelp.cars mode;
    var horsepower;
run;

As with PROC UNIVARIATE, you can’t use PROC MEANS to find the mode of a character variable. Nevertheless, you can use PROC SQL to do so (see below).

Find the Mode per Group with PROC MEANS

You can also use PROC MEANS to find the mode of a variable per group. To do so, you add the CLASS statement to your code. This statement starts with the CLASS keyword followed by the variable that defines the groups.

With the SAS code below we calculate the mode of the variable horsepower per drivetrain.

proc means data=sashelp.cars mode;
    class drivetrain;
    var horsepower;
run;

By running this code, SAS creates a report with a table of 3 columns:

  1. The value of the group.
  2. The number of observations in the group (not the frequency of the mode).
  3. The mode of the relevant variable within the group.

Create an Output Dataset with the Mode

Similar to PROC UNIVARIATE, PROC MEANS by default only generates a report with the mode. However, it is possible to create an output dataset with the most frequent value.

To create an output dataset with the mode in PROC MEANS, you need to add the OUTPUT statement to your code. The statement starts with the OUTPUT keyword, followed by the OUT=-option, and the MODE=-option. You use the OUT=-option to define the name of the output dataset. Likewise, you use the MODE=-option to define the name of the column that holds the mode.

In the example below, we use the OUTPUT statement to save the most frequent value in an output dataset.

proc means data=sashelp.cars mode;
    var horsepower;
    output out=work.mode_horsepower mode=mode;
run;

Besides the mode, the output dataset also contains the _TYPE_ and _FREQ_ columns. You can remove these columns by adding the DROP=-option to your code.
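
As a minimal sketch of the DROP=-option mentioned above, the code below attaches it as a dataset option to the OUT= dataset. The NOPRINT option is an addition that is not part of the original example. It suppresses the report, which is optional.

```sas
/* Save only the mode: drop the automatic _TYPE_ and _FREQ_ columns */
proc means data=sashelp.cars mode noprint;
    var horsepower;
    output out=work.mode_horsepower(drop=_type_ _freq_) mode=mode;
run;
```

The resulting dataset work.mode_horsepower then contains a single column called mode.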

Likewise, you can add the OUTPUT statement to PROC MEANS when you calculate the mode of a variable per group.

By default, the output dataset contains the mode of each group and the overall mode. Therefore, we recommend adding the NWAY option to the PROC MEANS statement. As a result, the output dataset only contains the mode per group (and not the overall mode).

proc means data=sashelp.cars mode nway;
    class drivetrain;
    var horsepower;
    output out=work.mode_horsepower mode=mode;
run;

Compute the Mode in SAS with PROC SQL

A third method to find the most frequent value in SAS is with PROC SQL. In contrast to the previous methods, you can use PROC SQL to calculate the mode of a character variable.

Finding the mode of a variable in SAS with PROC SQL is a two-step process. First, you count the frequency of each unique value of the variable. Then, you keep only the value with the highest frequency.

You can count the frequency of each unique value with the COUNT function and the GROUP BY clause. With the GROUP BY clause, you tell SAS to count the frequency of each value instead of the overall number of observations.

Next, you use the HAVING clause and the MAX function to find and keep the unique value with the highest frequency. If several values share the highest frequency, the query returns all of them.

Below we provide an example.

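/* Step 1: count how often each value occurs */
proc sql;
    create table work.mode_step_1 as
    select horsepower,
           count(*) as freq
    from sashelp.cars
    group by horsepower;
quit;

/* Step 2: keep the value(s) with the highest frequency */
proc sql;
    select horsepower as Mode
    from work.mode_step_1
    having freq = max(freq);
quit;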

The SAS code above creates a report with the mode. However, if you want to save the mode in a dataset, you need to add the CREATE TABLE statement to the second step of your code.
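
A minimal sketch of that variation is shown below. It assumes that work.mode_step_1 from the first step already exists. The output name work.mode_horsepower and the extra freq column are illustrative choices.

```sas
/* Assumes work.mode_step_1 (the value counts from step 1) already exists */
proc sql;
    create table work.mode_horsepower as
    select horsepower as mode,
           freq
    from work.mode_step_1
    having freq = max(freq);
quit;
```

If several values share the highest frequency, the table contains one row for each of them.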

Calculate the Mode per Group with PROC SQL

You can also use PROC SQL to find the most frequent value per group. To do so, you add the variable that defines the groups to the SELECT and GROUP BY clauses of the first step, and you group by that variable in the second step.

For example, with the code below we find the mode of the variable horsepower per drivetrain.

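/* Step 1: count how often each value occurs within each drivetrain */
proc sql;
    create table work.mode_step_1 as
    select drivetrain,
           horsepower,
           count(*) as freq
    from sashelp.cars
    group by drivetrain, horsepower;
quit;

/* Step 2: per drivetrain, keep the value(s) with the highest frequency */
proc sql;
    select drivetrain,
           horsepower as Mode
    from work.mode_step_1
    group by drivetrain
    having freq = max(freq);
quit;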
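
As noted earlier, the same two-step approach also works for character variables. The sketch below applies it to the character variable origin from SASHELP.CARS. The table name work.origin_counts is hypothetical and only serves this illustration.

```sas
proc sql;
    /* Step 1: count how often each value of the character variable occurs */
    create table work.origin_counts as
    select origin,
           count(*) as freq
    from sashelp.cars
    group by origin;

    /* Step 2: keep the value(s) with the highest frequency */
    select origin as Mode,
           freq
    from work.origin_counts
    having freq = max(freq);
quit;
```

Both statements run inside a single PROC SQL step here, which is equivalent to the two separate steps used above.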