As a SAS Programmer, you often need to combine multiple strings into a new variable. But, how do you concatenate strings in SAS? In this article, you find the answer.
In SAS, you can concatenate multiple strings with the concatenation operator (||) or with one of the CAT* functions (CAT, CATT, CATS, and CATX). All these methods combine strings but differ in how they treat blanks.
In this article, we discuss and compare these 5 methods. Additionally, we show how to concatenate variables with PROC SQL.
5 Ways to Concatenate Strings in SAS
In this section, we discuss 5 methods to combine two or more strings. We use these methods in a SAS Data Step. If you want to know how to concatenate strings with PROC SQL, then scroll down for more information.
In some of the examples in this article, we use the sample dataset below. You can copy the SAS code to create the my_data dataset and try the different methods for yourself.
/* Sample Data */ data work.my_data; length Type Name Color $10; Type = "Fruit"; Name = "Apple"; Color = "Green"; Price = 2.50; output; Type = " Fruit"; Name = "Cherry "; Color = " Red "; Price = 1.50; output; Type = " Veggie "; Name = " Onion "; Color = " White"; Price = 0.50; output; Type = "Veggie"; Name = " Carrot"; Color = " Orange "; Price = 1.00; output; run;
As you can see, most values contain (on purpose) leading and/or trailing blanks. These blanks are necessary to demonstrate some of the differences between the concatenation operator and the different CAT* functions.
Method 1: The Concatenation Operator (||)
The first and oldest method to concatenate strings in SAS is the concatenation operator. In other words, the double vertical bar: ||.
You can use the concatenation operator to combine variables, constants, and expressions. Each time you want to combine two strings, you place the concatenation operator between them. So, to combine three strings, you need the following syntax.
string-1 || string-2 || string-3
In the example below, we use the concatenation operator to combine the variables name and price from the sample dataset my_data.
/* Method 1: Concatenation Operator */ data work.method_1; set work.my_data; new_var = "The" || Name || "costs: $" || Price; run;
As the image above shows, we have created a new column, new_var, with the concatenation operator. It combines two columns (name and price) and two constants (The and costs: $).
Notice that the concatenation operator doesn’t remove leading and/or trailing blanks. If you would like to use the concatenation operator but get rid of whitespace, you should add the STRIP function.
Also notice that you can use numeric columns (such as price) when you concatenate variables. In this process, SAS converts the numeric values into a character before it concatenates the strings. Because of this conversion, there are blanks between the dollar sign and the price value in the example.
Method 2: The CAT Function
The second method to concatenate multiple strings in SAS is with the CAT function.
The CAT function is the simplest function of the family of CAT* functions. This function returns a concatenated string of the input items without removing leading nor trailing blanks. However, if the input item is numeric, then its value is converted to a character string, and leading and trailing blanks are removed.
The input items can be variables, constants, or expressions, and are separated by a comma. So, the correct syntax of the CAT function is:
CAT(item-1, item-2, item-3, etc.)
In the example below, we use the CAT function to combine the variables name and price from the sample dataset my_data.
/* CAT Function */ data work.method_2; set work.my_data; new_var = CAT("The", Name, "costs: $", Price); run;
As you can see, the result of the CAT function and the concatenation operator (||) is the same for character input. However, the result differs for numeric input. The CAT function removes blanks from (converted) numeric input, while the concatenation operator doesn’t.
Method 3: The CATT Function
The third method to combine strings in SAS is the CATT function.
The CATT function is part of the CAT* family. You can use this function to concatenate multiple strings while removing trailing blanks from the input items.
The input items of the CAT function can be variables, constants, and expressions. All items are separated by a comma.
CATT(item-1, item-2, item-3, etc.)
In the example below, we use the CATT function to combine the variables name and price from the sample dataset my_data.
/* CATT Function */ data work.method_3; set work.my_data; new_var = CATT("The", Name, "costs: $", Price); run;
As you can see in the table above, the CATT function combined the input items and removed trailing blanks.
The second “T” in “CATT” stands for Trim. The TRIM function in SAS trims the trailing blanks from variables. If you want to know more about the TRIM function and other functions that remove blanks, I recommend this article.
Method 4: The CATS Function
The fourth method to combine multiple strings in SAS is the CATS function.
The CATS function is also part of the CAT* family. This function concatenates strings by first removing leading and trailing blanks. It can combine variables, constants, and expressions. The input items are separated by a comma.
CATS(item-1, item-2, item-3, etc.)
Below, we use the CATS function to combine the variables name and price, as well as two constants.
/* CATS Function */ data work.method_4; set work.my_data; new_var = CATS("The", Name, "costs: $", Price); run;
As you can see in the table above, the CATS function combines the power of the normal CAT function and the STRIP function. The STRIP function is a very useful function that removes both leading and trailing blanks. You can read more about this function, here
Method 5: The CATX Function
The last method to combine multiple strings in SAS is the CATX function.
The CATX function creates a character string by combining multiple variables and separates them with a delimiter. Like the CATS function, the CATX function removing leading and trailing blanks before the concatenation.
The first argument of the CATX function is the delimiter (of one or more characters). After the delimiter follow the items you want to concatenate.
CATX(delimiter, item-1, item-2, item-3, etc.)
Note that the delimiter (like the input items) can be a variable, a constant, or an expression. For example, frequently used delimiters are the forward-slash (“/”) or whitespace (” “).
In the example below, we concatenate all the columns of the my_data dataset, separated by a forward-slash (/).
/* CATX Function */ data work.method_5; set work.my_data; new_var = CATX("/", Type, Name, Color, Price); run;
The CATX function is very useful if you want to concatenate many values and separate them by the same character.
To conclude this section, the table below shows a comparison of the features of all the methods to concatenate multiple values in SAS.
|Method||Removes blanks (Character)||Removes blanks (Numeric)||Separator|
|Concatenation Operator (||)||No||No||No|
|CATT Function||Trailing Only||Yes||No|
|CATS Function||Leading & Trailing||Yes||No|
|CATX Function||Leading & Trailing||Yes||Yes|
I highly recommend using always the CATS or CATX function. Especially if you use the concatenated strings in a join or merge. By using these functions, you prevent issues with blanks.
Concatenate a Range of Variables in SAS
Now, take a look at the table below.
Suppose you want to concatenate the variables Model, Type, Origin, and DriveTrain (separated by a forward-slash). You could use the CATX function and write-out all the variable names. For example:
data work.concatenate_range1; set work.my_cars; new_var = catx("/", Model, Type, Origin, DriveTrain); run;
However, if you have many variables, it can be a time-consuming task to write-out all variable names in the CATX function. Fortunately, all CAT* functions support the OF syntax to specify a list of variables. Hence, you could write the code above also as:
data work.concatenate_range2; set work.my_cars; new_var = catx("/", OF Model--DriveTrain); run;
As you can see, the result is the same. So, you use the OF keyword and the double hyphen to select a range of variables to concatenate.
Additionally, you can use the numeric or character keyword between the double hyphens to select a range of variables of a specific type. For example, the code below concatenates only the character variables between the columns col1 and col5.
data work.concat_range_char; set work.my_range_data; new_var = CATX("/", OF col1-character-col5); run;
You can also concatenate only the numeric values.
data work.concat_range_num; set work.my_range_data; new_var = CATX("/", OF col1-numeric-col5); run;
Concatenate all Variables of the Same Type in SAS
You can also you the CAT* functions to concatenate all variables of the same type into one character string (without specifying their names). You need the _character_ keyword or _numeric_ keyword to do this.
For example, you can use this code to concatenate all character variables.
data work.concat_all_char; set work.my_range_data; new_var = CATX("/", of _character_); run;
For numeric columns, you can use this code:
data work.concat_all_num; set work.my_range_data; length new_var $200; new_var = CATX("/", of _numeric_); run;
Note: If you want to concatenate all the numeric values with the CATX function, it is necessary to define the length of the new variable (for example new_var). For the other CAT* functions, this isn’t necessary.
Concatenate Strings in SAS with PROC SQL
If you prefer working with PROC SQL, then you can use all CAT* functions to concatenate strings. You can also use the concatenation operator (||) only if all the values you want to concatenate are character values. If one of the values is numeric, then SAS will return an error:
ERROR: Concatenation (||) requires character operands.
/* Concatenate with PROC SQL */ proc sql; create table work.concat_sql as select *, "The" || Name || "costs: $" || Price as concat_operator, /* error */ CAT("The", Name, "costs: $", Price) as cat_func, CATT("The", Name, "costs: $", Price) as catt_func, CATS("The", Name, "costs: $", Price) as cats_func, CATX("/", Type, Name, Color, Price) as catx_func from work.my_data; quit;
Unlike the concatenation operator (||) in a SAS Data Step, the concatenation operator in PROC SQL doesn’t convert numeric values automatically into a character value. So, to make the concatenation operator work with numeric values in PROC SQL, you first need to convert the numeric values explicitly into a character value with the PUT function.