If you work with data, you can calculate the minimum value per column or row. In a previous article, we discussed how to find the lowest value per column. Now, we will explain how to find in SAS the minimum value per row across multiple columns.
In SAS, you can find the minimum value per row across multiple columns with the MIN function. You use the MIN function in a SAS DATA Step. The MIN function takes as arguments the columns across which you want to calculate the lowest value.
In this article, we demonstrate how to use the MIN function and show some examples. We also use arrays to select multiple columns and make your code more robust and efficient.
Contents
How to Find the Minimum Value of Two Columns
In our first example, we show how to find the minimum value of two columns.
First, we create a SAS dataset that has two columns (first_value and second_value) and 6 rows. For each row we want to find the minimum value of the two columns. Note that this dataset has missing values.
data work.my_ds1; input first_value second_value; datalines; 1 3 6 4 4 -2 5 9 7 . . -6 ; run; proc print data=work.my_ds1 noobs; run;
To find the lowest value of two columns in SAS you can use the MIN function in a DATA Step. The MIN function takes as arguments the names of the columns you want to compare and returns the lowest values. Missing values are ignored by this function.
In the example below, we create a new column lowest_value that will be assigned the minimum value of the columns first_value and second_value.
data work.lowest_value; set work.my_ds1; lowest_value = min(first_value, second_value); run; proc print data=work.lowest_value noobs; run;
As you can see in the image above, the MIN function returns the lowest value of the two columns and ignores missing values.
How to Find the Minimum Value Across Multiple Columns
In the example above, we found the minimum value of two columns. However, you can also use the MIN function to find the lowest value across multiple columns. How to do that will be shown here.
To explain how to find the minimum value across multiple columns we create a new sample dataset. This dataset has four columns (first_value, second_value, third_value, and fourth_value) and 6 rows.
data work.my_ds2; input first_value second_value third_value fourth_value; datalines; 1 3 4 2 6 4 1 . 4 -2 3 0 5 9 -3 -1 7 . 5 -2 . -6 8 3 ; run; proc print data=work.my_ds2 noobs; run;
We can apply the same strategy as in the previous section to find the lowest value. That is, explicitly specifying the names of all the columns you want to compare. With the SAS code below, we show how to do that.
data work.lowest_value; set work.my_ds2; lowest_value = min(first_value, second_value, third_value, fourth_value); run; proc print data=work.lowest_value noobs; run;
In the example above, we found the minimum value of four columns. We did this by explicitly specifying the column names in the MIN function. This works fine if you want to compare few columns, but it can be a tedious task if you have many columns. Fortunately, there is a way to do this more efficiently.
You can use a SAS array to store all column names and use it as an argument of the MIN function. The MIN function will compare all the columns in the array and return the lowest value.
An extra advantage of the array is that you don’t have to specify all columns explicitly that will be part of it. If you want to specify a range of columns (based on their position), you only need to specify the first and last column name separated by a double dash (i.e., –).
In the example below, we first define an array called columns. This array contains all columns between first_value and fourth_value (both included). After that, we use the MIN function to find the lowest value across all columns in our array columns.
data work.lowest_value; set work.my_ds2; array columns first_value -- fourth_value; lowest_value = min(of columns[*]); run; proc print data=work.lowest_value noobs; run;
As you can see, we found the lowest value across four columns without the need to explicitly specifying all column names. This makes our code more efficient.
Do you know? How to Write Efficient Code with an Array
Find the Minimum Value Across Multiple Columns with the Same Name
In the previous section, we demonstrated how to use an array to find the lowest value across multiple columns. We used the first and last column to specify the range of the columns. However, how do you calculate the lowest value of columns with a similar name?
To demonstrate this we create a new dataset with 6 columns and 6 rows. The columns are called value1, value2, value3, value4, value5, and value6. We want to find the lowest value of these columns for each row.
data work.my_ds3; input value1 - value6; datalines; 1 3 4 2 1 3 6 4 1 . 0 2 4 -2 3 0 -1 7 5 9 -3 -1 2 6 7 . 5 -2 -4 5 . -6 8 3 8 4 ; run; proc print data=work.my_ds3 noobs; run;
Again, we make our code more efficient by using an array. This array will also contain a range of columns. However, this array contains columns based on their names. We use a single dash to create an array that contains all columns that are “alphabetically between” the columns value1 and value6. So, in this case, the position of these columns in the dataset doesn’t matter.
In the example below we use this method.
data work.lowest_value; set work.my_ds3; array columns value1 - value6; lowest_value = min(of columns[*]); run; proc print data=work.lowest_value noobs; run;
This method is very useful when you want to find the lowest value across multiple columns that have similar names. For example, all columns that start with the same prefix.
How to Find the Minimum Value of All Numeric Columns
Until now, we have explicitly specified the column names or the range of columns of which we want to find the minimum. But, how do you find the minimum value across all numeric columns?
You can find the lowest value of all numeric columns combining the MIN function and the special keyword _numeric_.
We create a sample dataset to demonstrate how to do this. This dataset has 5 columns and 6 rows. As you can see, the column names are completely different.
data work.my_ds4; input ColA VarB Age Salary X; datalines; 1 3 4 2 1 6 4 1 . 0 4 -2 3 0 -1 5 9 -3 -1 2 7 . 5 -2 -4 . -6 8 3 8 ; run; proc print data=work.my_ds4 noobs; run;
Instead of creating an array with a range of columns (with the double dash), we can use the special keyword _numeric_. In the SAS code below, we use this keyword as the argument of the MIN function to find the minimum value across all numeric columns.
data work.lowest_value; set work.my_ds4; lowest_value = min(of _numeric_); run; proc print data=work.lowest_value noobs; run;
As you can see, the column lowest_value contains the lowest value of all (numeric) columns.
How to Find the Column Name of the Column with the Lowest Value
To finish this article, we demonstrate not only how to find the lowest values, but also how to retrieve the name of the column with the lowest value. This might be useful if you want to save the name of the column as a macro variable.
There are 3 steps to find the name of the column with the lowest value:
- Find the lowest value.
- Find the position of the column with the lowest value.
- Convert the position of the column in the name of the column.
As discussed in the sections above, you can find the lowest value with the MIN function and an array.
To find the position of the column with the lowest value we use the WHICHN function. This function searches for a numeric value that is equal to the first argument and returns its position in the array (the second argument).
Finally, you use the VNAME function to convert the position of the column to the name of the column.
data work.lowest_value; set work.my_ds4; array columns _numeric_; lowest_value = min(of columns[*]); index_lowest_value = whichn(lowest_value, of columns[*]); column_lowest_value = vname(columns[index_lowest_value]); run; proc print data=work.lowest_value noobs; run;
Note that when there are two columns with the (same) lowest value, this method returns the name of the first column with the lowest value.
2 thoughts on “How to Find the Minimum Value Across Columns in SAS”
Comments are closed.