SAS How To's

Dynamically Rename Multiple Column Names in SAS

While working with data, you may need to change the variables of your data set or table. In this article, we explain how to dynamically rename multiple column names in SAS. First, we will describe the RENAME option of the data step. This option is useful when you want to rename some variables. However, if you need to rename a lot of variables, then the MODIFY statement is more practical. This will be explained towards the end of the article.

The RENAME Syntax

The syntax of the RENAME option is shown below

RENAME = (var_name_1_old = var_name_1_new ….. var_name_n_old = var_name_n_new)

With the RENAME options you can rename one or more variables at the same time. First you specify the old variable name. After the equal-sign follows the new variable name.

The effect of the RENAME options depends on where you place it:

  • In the SET statement: If you write the RENAME option in the SET statement, then the new variable name(s) will take effect when the Program Data Vector (PDV) is created. Hence, all programming statements in the data step must refer to the new variable name(s).
  • In the DATA statement: If you write the RENAME option in the DATA statement, then the new variable name(s) will take effect after all programming statements.

Examples: Rename Column Names in SAS

In the two examples below, we will rename variables using the RENAME options. For this purpose, we create the data set below. This data set contains two columns with student names and their test scores.

/* CREATE A SIMPLE DATA SET */
DATA WORK.GRADES;
	INPUT NAME $ GRADE;
	DATALINES;
MARIA 80
JOHN 70
PETER 85
EMMA 75
MIA 90
WILLIAM 65
;
RUN;
A Simple SAS Data Set

Rename ONE Variable

In this example, we rename the column GRADE to SCORE. We use the RENAME option in the SET statement. See the example code below on how to do this.

/* RENAME THE COLUMN 'GRADE' TO 'SCORE' */
DATA WORK.SCORES;
	SET WORK.GRADES (RENAME=(GRADE=SCORE));
RUN;
SAS Renamed 1 column

Rename TWO Variables

As mentioned in the syntax section, you can use the RENAME option to rename more than one variable at the same time. This example shows how to rename the NAME column to FIRST_NAME, and the GRADE column to SCORE. This time we place the RENAME option in the DATA statement.

/* RENAME THE COLUMNS 'NAME' AND 'GRADE' */
DATA WORK.FIRST_NAME_SCORES (RENAME=(NAME=FIRST_NAME GRADE=SCORE));
	SET WORK.GRADES;
RUN;
SAS Renamed 2 columns

Rename MULTIPLE Variables Dynamically

Now suppose you have a SAS data set with many columns and you want to rename (some of) them. You can use the RENAME option and write out all the changes you need to carry out. However, this may be infeasible because the number of column names to change if very large and/or you don’t know all the column names. In the example below, we show a way to rename multiple column names dynamically at the same time.

For this example, we create a data set with student names and their grades on four exams. The goal is to change the column names from exam_a, exma_b, exam_c and exam_d to exam_1, exam_2, exam_3 and exam_4.

/* CREATE A DATA SET WITH MULTIPLE COLUMNS */
DATA WORK.EXAM_GRADES;
	INPUT NAME $ EXAM_A EXAM_B EXAM_C EXAM_D;
	DATALINES;
MARIA 80 70 90 80
JOHN 70 65 75 75
PETER 85 80 80 90
EMMA 75 80 75 70
MIA 90 85 80 80
WILLIAM 65 70 75 75
;
RUN;
A SAS Data Set with multiple columns

To dynamically rename all aforementioned column names, we create a mapping table. This mapping table contains a column with the old column names and a column with the new column names.

Use the Power of the Dictionary Tables

First, we obtain the current column names using the Dictionary Tables. Dictionary Tables are metadata (i.e., data about data) and are automatically available when a SAS session starts. They are updated automatically whenever there is a change in a data set.

We use the Dictionary Table COLUMNS which contains information about all the columns of the tables in the current session. We need three WHERE statements to select all the numeric columns of our example data set.

/* OBTAIN COLUMN NAMES */
PROC SQL;
	CREATE TABLE WORK.COLUMN_NAMES AS
		SELECT NAME AS COL_NAME
		FROM DICTIONARY.COLUMNS
		WHERE LIBNAME = "WORK"
		AND MEMNAME = "EXAM_GRADES"
		AND TYPE = "num";
QUIT;
The Column Names to be Renamed

Create a Mapping Table

In the previous step, we obtained the current column names of our example data set. In this step, we create the second column of our mapping table. This table contains the new variable names. For each column name, we need to replace the last character (a, b, c, etc.) with an increasing number. We do this with the use of the CATS function, the SUBSTR function, and the FIND function.

/* CREATE A MAPPING TABLE WITH THE NEW COLUMN NAMES */
DATA WORK.MAPPING_COL_NAMES;
	SET WORK.COLUMN_NAMES;
 
	COL_NAME_NEW = CATS(SUBSTR(COL_NAME, 1, FIND(COL_NAME,"_")),_N_);
RUN;
Mapping Table Column Names to be Renamed Dynamically

Create a Macro Variable with all the Renames

The last step to use the MODIFY statement to dynamically rename multiple variable names is to create a macro variable with the syntax. We want a list of all the renames. The code below shows how to create a macro variable with the SELECT INTO statement.

/* SAVE THE MAPPING AS A MACRO VARIABLE */
PROC SQL;
	SELECT CATX("=", COL_NAME, COL_NAME_NEW)
		INTO :LIST_RENAME SEPARATED BY " "
		FROM WORK.MAPPING_COL_NAMES;
QUIT;
 
%PUT &=LIST_RENAME.;
SAS Macro Variable to Rename Column Names Dynamically

Apply the MODIFY Statement

Finally, we can use the macro variable list_rename in the MODIFY statement of the datasets procedure to rename the variables.

PROC DATASETS LIB=WORK;
	MODIFY EXAM_GRADES;
	RENAME &LIST_RENAME.;
RUN;
SAS Data Set with Column Names Dynamically Renamed

Although the total number of lines of code above might seem substantial, this code is very convenient to dynamically rename multiple column names in SAS. Especially when you need to change many (tens, hundreds, etc.) of column names. This code works also if you don’t exactly know all the current column names. For example, if you don’t know how many exams the students made. Furthermore, you can create a macro function of the code above and reuse it.