On some occasions, you have a SAS dataset of which you want to change some variable name(s). For example, after importing an Excel file. So, how do you rename a variable in SAS?
You rename a SAS variable with the RENAME option. The RENAME option is a data set option which enables you to change the name of a variable. First of all, you define the old variable name, then an equal sign, and finally the new variable name. You can rename one or more variables with one RENAME option.
The RENAME Option
Before we show how to use the RENAME option, we start with the syntax of the RENAME option.
RENAME=(old-name-1=new-name-1 <old-name-2=new-name-2 ...>)
As you can see, you can change the name of one or multiple variables with one RENAME option.
The RENAME option is a dataset option. So, you can use this option for both the input dataset (SET statement) as well as the output dataset (DATA statement).
If you use the RENAME option together with the KEEP, DROP, or WHERE statement, keep the following in mind:
- The RENAME option is applied after the KEEP and DROP options. Therefore, the KEEP and DROP options use the old variable name.
- The RENAME option is applied before the WHERE option. So, you should use the new variable name in the WHERE option.
Rename One Variable
In this example, we show how to rename one variable in a SAS dataset.
Our small sample dataset has three columns, namely variableA, variableB, and columnC. We want to change the name of variableA to FirstColumn.
We use the RENAME option as a dataset option in the SET statement. Between parenthesis, we write the name of the column we want to change, an equal sign, and the new column name.
As you can see in the image above, we have successfully changed the name variableA to FirstColumn.
We have also created an extra column (extraColumn) by concatenating the character “A” to the value of FirstColumn. Note that we had to use the new column name (FirstColumn) instead of the old column name (variableA) to create this extra column. We had to because we used the RENAME option as a dataset option in the SET statement. So, SAS no longer recognizes the old variable name(s) while processing the data.
Rename Multiple Variables
Above we showed how to change the name of one variable. But, how to do you rename multiple variables in a SAS dataset?
You can use the RENAME option also to change the name of multiple variables (at once). The RENAME option is a dataset option that you can use in the DATA statement and SET statement. You write the old variable names followed by an equal sign and the new variable names, all between parenthesis.
In this example, we use the same dataset as above. However, now we want to rename multiple column names. More specifically, we want to change the variables variableA, variableB, and columnC to FirstColumn, SecondColumn, and ThirdColumns, respectively.
We use the RENAME option as a dataset option in the DATA statement.
Note that you don’t need a comma to separate the variables you want to rename.
Besides renaming variables, we also created an extra column (extraColumn). In this extra column, we added the character “A” to the value of columnA. As you can see in the code above, we had to use the old variable name to create the extra column. Since we used the RENAME option as a dataset option in the DATA statement, the change of variable names has only an effect on the output dataset. Therefore, we use the original variable names while processing the data.
Rename All Variables
Up till now, we have explicitly specified in the RENAME option which variable name(s) we want to change. However, if you have a dataset with many columns and you want to rename all of them, explicitly specifying all names can be a bothering task. So, how can you rename all variables in a SAS dataset?
You can use the SAS dictionary tables and a SAS macro variable to change all variable names at once. In the examples below, we show how to use this method to add a suffix/prefix to all variables, and how to change variable names with the same pattern.
Rename All Variables by Adding a Suffix
In this section, we demonstrate how you can rename all variables and add a suffix. But first, we need to discuss the SAS dictionary tables.
A dictionary table contains metadata about all datasets in your current SAS session. There exist different dictionary tables. We use the columns dictionary table to rename our variables.
Below we show a snippet from the dictionary table columns. It contains information about the columns of the famous Iris Flower dataset and our sample dataset “MY_DS”.
We use the following steps to rename all variables and add a suffix:
- Obtain all column names of our dataset
We use the libanme and memname columns from the dictionary.columns table to filter the metadata of our dataset.
- Create a string that will serve as an argument for the RENAME option
With the CATS function, we create a string with the old_name = new_name format.
For more information about the CATS functions and other useful functions to concatenate strings, see this article.
- Save the syntax for the RENAME option in a SAS macro variable
With the SELECT INTO clause, we create a SAS macro variable that contains all arguments for the RENAME option.
For more, detailed information about the powerful SELECT INTO clause, see this article.
- Rename the variables with the RENAME option
Finally, we rename the variables of our dataset by combining the RENAME option and the SAS macro variable
Rename All Variables by Adding a Prefix
Above we demonstrated how to add a suffix to all your variable names. We use the same steps to add a prefix to all columns. The only difference is in the arguments of the CATS function.
Rename All Variables Starting with the Same Prefix
Finally, we demonstrate how to rename all variables that start with the same pattern in a SAS dataset. For example, we want to change all variable names that start with variable and replace it with my_column.
We use a similar approach as in the previous examples. Again, we use the dictionary tables to retrieve the column names we want to change and save the argument of the RENAME option as a macro variable. However, there are two differences:
- We use an extra filter to select only those columns that start with a given pattern. To filter only the columns that start with variable, we use the filter name like “variable%”.
- Within the CATS function, we use the TRANWRD function to replace variable by my_column. For more information about how to replace a word in a string with another string, see this article.