In this article, we discuss how to create and use an array in SAS. We also show how to combine arrays and DO loops to make your SAS code more efficient. We support all topics with examples.
What is an Array
A SAS Array is a set of variables of the same type. The variables in an array are called elements and can be accessed based on their position, i.e., their index. You use the name of the array to reference the set of variables.
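Arrays are useful for creating new variables and carrying out repetitive tasks, for example in combination with a DO loop.
You declare and use an array within a SAS DATA step. An array exists only for the duration of that DATA step; it is a temporary grouping of variables, not something that is stored with the dataset.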
How to Create an Array in SAS
The ARRAY statement creates an array in SAS. This statement starts with the ARRAY keyword, followed by the array name, the array length, and some optional parameters.
array array-name {n} <$> <length> <array-elements> <initial-values>;
For example,
array temperature_array {12} tempCelciusMonth1 - tempCelciusMonth12;
Below we describe the syntax of the ARRAY statement in more detail.
The ARRAY Keyword
The ARRAY statement starts with the ARRAY keyword. This keyword tells SAS to define a new array.
The Array Name
The ARRAY keyword is followed by the name of the array.
The array name must be a valid SAS name. That is to say, it has a maximum length of 32 characters and starts with a letter or an underscore. Subsequent characters must be letters, digits, or underscores. An array name can’t contain blanks or other special characters.
In addition, the array can’t have the same name as any variable in the dataset. Also, although it is possible, you should not give your array the same name as an existing SAS function.
The Array Length
The last mandatory part of the ARRAY statement is the length of the array.
The array length specifies the number of elements in the array and must be enclosed in braces, brackets, or parentheses. You can use the asterisk (*) to let SAS determine the number of elements. In this case, you must list the array elements so that SAS can count them.
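As a minimal sketch of the notations for the array length, the DATA step below declares the same kind of array with braces, brackets, and parentheses, and once with the asterisk. The array and variable names (a_braces, x1-x3, and so on) are made up for this illustration.

```sas
data _null_;
    /* Braces, brackets, and parentheses are interchangeable around the length */
    array a_braces {3} x1-x3;
    array a_brackets [3] y1-y3;
    array a_parens (3) z1-z3;

    /* With the asterisk, SAS counts the listed elements */
    array a_counted {*} w1-w5;

    n_braces  = dim(a_braces);
    n_counted = dim(a_counted);
    put n_braces= n_counted=;
run;
```

The PUT statement writes n_braces=3 n_counted=5 to the log.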
The Array Type
A common question about arrays in SAS is how to create a numeric or a character array.
By default, the elements of an array are numeric. However, you can create a character array in SAS by placing the dollar sign ($) after the length of the array.
For example, here we define a character array with three elements.
array names_array {3} $ first_name middle_name last_name;
Note: If the elements already exist as character variables (e.g., they come from the dataset in the SET statement), it isn’t necessary to use the dollar sign.
The Elements Length
All elements in a SAS array have a default length of 8 bytes. For an array with numeric elements, this is enough. However, a character element of 8 bytes can store at most 8 characters.
In the array below, the first name, middle name, and last name can contain at most 8 characters each. Names of more than 8 characters will be truncated. For example, Rodriguez would be stored as Rodrigue (without the last z).
array names_array {3} $ first_name middle_name last_name;
The standard length of the elements can be changed by specifying the desired length after the dollar sign ($) (for character arrays) or after the array length (for numeric arrays).
For example, here we define a character array whose elements have a length of 20.
array names_array {3} $ 20 first_name middle_name last_name;
Note 1: If the elements already exist as character variables (e.g., they come from the dataset in the SET statement), it isn’t necessary to use the dollar sign, and the elements keep the length of the existing variables.
Note 2: The length of elements in a numeric array must be between 3 and 8.
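To make the effect of the element length visible, here is a small sketch that stores the same name in an element with the default length and in an element with a length of 20. The array and variable names are invented for this illustration.

```sas
data _null_;
    /* Default length: a new character element gets 8 bytes */
    array short_names {1} $ short_name;
    /* Explicit length: a new character element gets 20 bytes */
    array long_names {1} $ 20 long_name;

    short_names{1} = "Rodriguez";
    long_names{1}  = "Rodriguez";

    put short_name= long_name=;
run;
```

The log shows short_name=Rodrigue long_name=Rodriguez, so only the element with the default length loses its last character.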
The Array Elements
The array elements can be existing variables or new variables. All elements must have the same type, either numeric or character.
If the elements of the array are existing variables, then the array automatically inherits the type and length of the variables. If the elements of the array are new variables, SAS by default creates a numeric array with elements of length 8, unless you specify the dollar sign ($) or a different length.
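The sketch below illustrates how an array inherits the type and length of existing variables. It assumes the SASHELP.CLASS sample dataset is available and uses the VNAME, VTYPE, and VLENGTH functions to report the attributes of each element; the array name class_chars and the helper variables are made up for this example.

```sas
data _null_;
    set sashelp.class (obs=1);

    /* Name and Sex already exist as character variables, */
    /* so neither the dollar sign nor a length is needed  */
    array class_chars {*} name sex;

    do i = 1 to dim(class_chars);
        element_name   = vname(class_chars{i});
        element_type   = vtype(class_chars{i});   /* C = character, N = numeric */
        element_length = vlength(class_chars{i});
        put element_name= element_type= element_length=;
    end;
run;
```

The log lists one line per element with its name, type, and length as defined in the source dataset.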
There are 5 ways to specify the elements of an array.
1. Explicitly define the names of the elements
You can explicitly define the elements of an array. This method works for elements based on existing variables and new variables. For example:
array names_array {3} $ first_name middle_name last_name;
Note that in this case you can omit the length of the array or use the asterisk.
2. Select a range of elements
With a single hyphen (-) you select (for existing variables) or create (for new variables) a range of elements. For example, below we create a numeric array of 6 elements called height1, height2, …, height6.
array height_array {6} height1-height6;
If you use the single hyphen to define the elements of an array, it isn’t necessary to explicitly specify the length of the array. You could either use the asterisk or omit the length definition.
array height_array {*} height1-height6;
array height_array height1-height6;
3. Select all variables between a start and end variable
With a double hyphen (--) you can select a range of existing variables to form the elements of an array. If you use this method, SAS uses the first variable, the last variable, and all variables positioned between them in the dataset as the elements.
For example, here we use a double hyphen to create a numeric array of all variables between the EngineSize and Weight variables of the CARS dataset in the SASHELP library.
data work.test;
    set sashelp.cars;
    array height_array {*} EngineSize--Weight;
run;
If you use a double hyphen to select the elements of the array, it isn’t necessary to explicitly define the length of the array.
Note: The start variable, the end variable, and all variables in between must be all numeric or all character.
4. Select all variables of the same type
There is an efficient way to create an array of all existing numeric or character variables.
With the _NUMERIC_ keyword, you can declare an array where all the numeric variables are used as elements. Likewise, you can use the _CHARACTER_ keyword to select all character variables as the array elements. If you use one of these keywords, it isn’t necessary to define the type and length of the array.
For example, here we create a numeric array of all the numeric variables in the CARS dataset from the SASHELP library.
data work.test;
    set sashelp.cars;
    array my_array {*} _numeric_;
run;
5. Let SAS create the elements
If you don’t specify the elements of the array with one of the methods mentioned above, SAS creates elements based on the array name. In this case, it is necessary to define the length of the array.
For example, with the ARRAY statement below, we create an array of length 6 where the elements are called height1, height2, …, height6.
data work.test;
    array height {6};
run;
This method only works for creating new variables (numeric and character).
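The same approach can be sketched for a character array. In the example below, which reuses the city names that appear later in this article, SAS derives the element names city1, city2, and city3 from the array name city; the dataset name work.cities is an assumption for this illustration.

```sas
data work.cities;
    /* No element list: SAS names the elements city1, city2, and city3 */
    array city {3} $ 20 ("Paris", "Rome", "Amsterdam");
run;

proc print data=work.cities noobs;
run;
```

The output dataset has one row with the columns city1, city2, and city3.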
The Initial Values
You can assign initial values to the elements of an array in the form of a list.
The initial values can be either numeric or character. Character strings must be enclosed in quotation marks. The list of initial values must be written between parentheses, and the values can be separated by a blank or a comma.
In the example below, we define a numeric array of 4 weights and assign initial values.
data work.test;
    array weight_array {4} weight1-weight4 (80, 75, 92, 87);
run;
If the list of initial values has fewer values than the array has elements, the remaining elements won’t be assigned an initial value. In that case, SAS also writes a warning to the log.
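As a quick sketch of a partial list of initial values, the DATA step below provides only two values for four elements and separates them with a blank instead of a comma. It reuses the weight_array from the example above.

```sas
data work.test;
    /* Only two values for four elements: weight3 and weight4 stay missing. */
    /* The values are separated by a blank instead of a comma.              */
    array weight_array {4} weight1-weight4 (80 75);

    put weight1= weight2= weight3= weight4=;
run;
```

The PUT statement writes weight1=80 weight2=75 weight3=. weight4=. to the log, next to the warning about the incomplete list of initial values.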
Examples
Here we provide some examples of how to create numeric and character arrays.
How to Create a Numeric Array
Here we create a numeric array of 4 elements. The elements represent the temperature in Fahrenheit on 4 different days. We assign the elements an initial value.
You create a numeric array in SAS with the ARRAY statement. This statement consists of:
- The ARRAY keyword.
- The array name.
- The length of the array.
- The elements (optional).
- The initial values (optional).
data work.temperature;
    array temperature_array {4} temp1 temp2 temp3 temp4 (58, 69, 80, 72);
run;

proc print data=work.temperature noobs;
run;
You can write the code above more concisely. Here are 2 ways to create the same array.
data work.temperature;
    array temperature_array {*} temp1-temp4 (58, 69, 80, 72);
run;

data work.temperature;
    array temp {4} (58, 69, 80, 72);
run;
It isn’t mandatory to assign the elements an initial value. If you omit the list of initial values, the variables temp1 through temp4 will have missing values.
How to Create a Character Array
Here we create a character array of 3 elements. The elements represent 3 cities.
You need an ARRAY statement to define a character array in SAS. This statement has the following parts:
- The ARRAY keyword.
- The array name.
- The array length.
- The dollar sign ($).
- The length of the elements (e.g., 20).
- The elements (e.g., city1, city2, and city3) (optional).
- The initial values (e.g., Paris, Rome, and Amsterdam) (optional).
data work.city;
    array city_array {*} $ 20 city1-city3 ("Paris", "Rome", "Amsterdam");
run;

proc print data=work.city noobs;
run;
The key to creating a character array in SAS is the dollar sign ($). If you omit the dollar sign, SAS will assume that the elements of the array are numeric.
How to Use an Array in SAS
As mentioned before, SAS arrays are extremely useful for creating new variables and carrying out repetitive operations. Here we provide some examples.
Referencing Elements of an Array
If you work with arrays in SAS, you need to know how to reference their elements. You access an element in an array based on its position (or index) in the array.
By default, the first element of an array in SAS has position 1, the second element has position 2, and so on. This differs from many other programming languages, where the first element has position 0.
To reference an element of an array, you need the array name followed by the element’s position between braces.
For example, here we refer to the second element of an array.
data work.city;
    array city_array {*} $ 20 city1-city3 ("Paris", "Rome", "Amsterdam");
    put "The 2nd city is: " city_array{2};
run;
Array Operators
You can use the elements of an array to carry out arithmetic operations and as arguments of numeric and character functions. Hence, you could sum, multiply, concatenate, etc. the elements of an array.
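The following sketch shows array elements in arithmetic expressions and as arguments of a character function (CATX). The names and values in the arrays are made up for this illustration.

```sas
data _null_;
    array weight_array {3} weight1-weight3 (80, 75, 92);
    array names_array {3} $ 20 first_name middle_name last_name
        ("Ana", "Maria", "Lopez");

    /* Arithmetic on individual elements */
    total_first_two = weight_array{1} + weight_array{2};
    doubled_third   = weight_array{3} * 2;

    /* Elements as arguments of a character function */
    full_name = catx(" ", names_array{1}, names_array{2}, names_array{3});

    put total_first_two= doubled_third= full_name=;
run;
```

The log shows total_first_two=155 doubled_third=184 full_name=Ana Maria Lopez.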
However, there are two operators that are especially powerful when you work with arrays, namely the IN operator and the OF operator. Here we discuss what they do and how to use them.
The IN Operator
The IN operator checks if a given value is equal to the value of one of the elements of an array. You can use this operator for numeric and character values.
For example, here we check if the city_array contains the value Madrid. We create a new column check_Madrid with the answer.
data work.city;
    array city_array {*} $ 20 city1-city3 ("Paris", "Rome", "Amsterdam");
    if "Madrid" IN city_array then check_Madrid = "Yes";
    else check_Madrid = "No";
run;

proc print data=work.city noobs;
run;
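The IN operator works the same way for numeric values. As a small sketch, here we test whether two numbers occur in a numeric array; the variable names check_92 and check_60 are invented for this example.

```sas
data _null_;
    array weight_array {4} weight1-weight4 (80, 75, 92, 87);
    length check_92 check_60 $ 3;

    /* 92 is one of the elements, 60 is not */
    if 92 in weight_array then check_92 = "Yes";
    else check_92 = "No";

    if 60 in weight_array then check_60 = "Yes";
    else check_60 = "No";

    put check_92= check_60=;
run;
```

The log shows check_92=Yes check_60=No.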
The OF Operator
You can use the OF operator to pass all elements of an array to a function, such as MEAN or MAX. With the OF operator, SAS takes all elements of the array into account while performing the calculation. You place the OF operator inside the function call, followed by the array name and an asterisk between parentheses.
For example, here we create a SAS dataset with the test scores of three students. We want to efficiently calculate the average score and highest score.
data work.students;
    infile datalines dlm=",";
    input student_name $ student_id physics biology geography;
    datalines;
Mike, 1, 70, 65, 82
Maria, 2, 88, 75, 79
Alex, 3, 64, 72, 80
;
run;

proc print data=work.students noobs;
run;

data work.scores;
    set work.students;
    array scores_array {*} physics--geography;
    avg_score = mean(of scores_array(*));
    max_score = max(of scores_array(*));
run;

proc print data=work.scores noobs;
run;
Use an Array in a DO Loop
Have you ever performed the same operation on multiple variables by writing many lines of SAS code? Did you wonder if you could obtain the same results with less code?
If you have to perform a repetitive task, you can combine the power of the SAS array and a DO loop to make your code more efficient.
Instead of writing one line of code for each variable, you can create an array of the variables you want to modify. Then, you use the DO loop to efficiently iterate over each element of the array and carry out the desired operation.
Example 1: How to Convert all Character Variables into Uppercase
In the example below, we use an array and a DO loop to convert all character variables of the CARS dataset into uppercase.
proc print data=sashelp.cars (obs=10) noobs;
run;

data work.cars_upcase;
    set sashelp.cars;
    array char_array {*} _character_;
    do i = 1 to dim(char_array);
        char_array(i) = upcase(char_array(i));
    end;
run;

proc print data=work.cars_upcase (obs=10) noobs;
run;
These are the steps to convert all character variables in SAS into uppercase:
- Define an array of all character variables in the dataset with the _CHARACTER_ keyword.
- Create a DO loop to iterate over all the array elements.
- Use the UPCASE function to convert all values into uppercase.
To loop over all elements of an array, you need to know its length. You can determine the length of an array in SAS with the DIM function.
Example 2: How to Multiply all Variables by a Constant
In this example, we demonstrate how to efficiently apply the same formula, including a multiplication by a constant, to several variables.
We have created a sample dataset of 5 columns and 3 rows. This dataset contains the temperature in Fahrenheit in 3 cities at 4 moments in time. The goal is to create 4 new columns that contain the temperature in Celsius.
Celsius = (Fahrenheit - 32) * (5/9)
data work.temperature_F;
    infile datalines dlm=",";
    input city $ temp_F1-temp_F4;
    datalines;
Berlin, 48, 72, 69, 40
Madrid, 59, 85, 80, 52
Paris, 54, 73, 70, 47
;
run;

proc print data=work.temperature_F noobs;
run;

data work.temperature_C;
    set work.temperature_f;
    array temp_F_array {4} temp_F1-temp_F4;
    array temp_C_array {4} temp_C1-temp_C4;
    do i=1 to 4;
        temp_C_array(i) = (temp_F_array(i)-32) * (5/9);
    end;
    drop i;
run;

proc print data=work.temperature_C noobs;
run;
These are the steps to apply the same formula to several variables in a SAS dataset:
- Define an array of the existing variables.
- Define an array of the new variables that will contain the converted values.
- Create a DO loop to iterate over all the elements in both arrays.
- Apply the desired formula for each element.
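The steps above can also be sketched with the DIM function instead of a hard-coded upper bound for the loop. This variant assumes the dataset work.temperature_F from the example already exists. Rounding the result to one decimal with the ROUND function is an optional addition of this sketch, not part of the original example.

```sas
data work.temperature_C;
    set work.temperature_F;
    array temp_F_array {*} temp_F1-temp_F4;
    array temp_C_array {4} temp_C1-temp_C4;

    /* DIM returns the number of elements, so the bound is not hard-coded */
    do i = 1 to dim(temp_F_array);
        /* Optional: round the converted value to one decimal */
        temp_C_array{i} = round((temp_F_array{i} - 32) * (5/9), 0.1);
    end;

    drop i;
run;

proc print data=work.temperature_C noobs;
run;
```

Both arrays must have the same number of elements, because the loop uses the same index for the Fahrenheit and the Celsius array.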
Special Arrays in SAS
Besides the normal arrays we have discussed so far, there exist two special arrays in SAS, namely the temporary array and the multi-dimensional array.
Create a Temporary Array
A temporary SAS array is an array that only exists while you execute a DATA Step. Moreover, the elements in a temporary array aren’t associated with variables. Therefore, the elements of a temporary array don’t appear in the output dataset. Temporary arrays are normally used to store constants.
You create a temporary SAS array with the ARRAY statement. The statement starts with the ARRAY keyword, followed by the array name, the length of the array, and the _TEMPORARY_ keyword.
When you define a temporary array, you must explicitly define the number of elements. Hence, you can’t use the asterisk.
Like all other arrays, the elements in a temporary array must be all numeric or all character. By default, the elements are numeric. You use the dollar sign ($) to define a character array.
Because the elements of a temporary array aren’t based on variables, they don’t have names. Therefore, you can only refer to an element by the array name and the position of the element.
Finally, providing a list of initial values in the ARRAY statement isn’t mandatory. You can set the values in another way while executing the DATA Step.
Example: A Temporary Array
array constants_array {3} _temporary_ (2, 10, 50);
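As a minimal sketch of how a temporary array might be used, the DATA step below multiplies a value by three constants that are stored in a temporary array. The dataset name work.scaled and the variables base_value and scaled1-scaled3 are made up for this illustration.

```sas
data work.scaled;
    /* Temporary array: its three constants are not written to the output */
    array constants_array {3} _temporary_ (2, 10, 50);
    array scaled_array {3} scaled1-scaled3;

    base_value = 4;
    do i = 1 to dim(constants_array);
        scaled_array{i} = base_value * constants_array{i};
    end;

    drop i;
run;

proc print data=work.scaled noobs;
run;
```

The output dataset contains base_value and scaled1-scaled3 (with the values 8, 40, and 200), but no columns for the elements of constants_array.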
Create a Multi-Dimensional Array
Another special type of array is the multi-dimensional array.
As the name suggests, a multi-dimensional array has more than 1 dimension. In most cases, a multi-dimensional array has two dimensions and is used as a lookup table.
You create a multi-dimensional array with the ARRAY statement. The statement starts with the ARRAY keyword, followed by the array name, and the array dimension. You define the dimension between braces; first the number of elements in the first dimension, then the number of elements in the second dimension, etc.
A multi-dimensional array can be numeric or character. It can also be temporary.
Example: Two-Dimensional Array
array two_dim_array {2,3} (1, 2, 3, 9, 8, 7);
To find the size of each dimension of a multi-dimensional SAS array, you can use the DIM functions. The DIM1 function returns the number of elements in the first dimension, the DIM2 function returns the number of elements in the second dimension, and so on.
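The sketch below shows how the DIM1 and DIM2 functions could be combined with nested DO loops to walk through the two-dimensional array from the example above. The helper variables n_rows, n_cols, row, col, and cell_value are assumptions for this illustration.

```sas
data _null_;
    array two_dim_array {2,3} (1, 2, 3, 9, 8, 7);

    n_rows = dim1(two_dim_array);   /* size of the first dimension  */
    n_cols = dim2(two_dim_array);   /* size of the second dimension */
    put n_rows= n_cols=;

    /* The initial values fill the array row by row */
    do row = 1 to n_rows;
        do col = 1 to n_cols;
            cell_value = two_dim_array{row, col};
            put row= col= cell_value=;
        end;
    end;
run;
```

The log first shows n_rows=2 n_cols=3. The first row then holds the values 1, 2, and 3, and the second row holds 9, 8, and 7.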
Common Errors
If you use arrays, especially in combination with DO loops, errors can occur. In this section, we discuss the most common errors and how to solve them.
Too Many Variables Defined
The first common error associated with arrays is “ERROR: Too many variables defined for the dimension(s) specified for the array”.
This error occurs when you list more variables than the number of elements you declared for the array. For example:
data work.test;
    array weight_array {4} weight1-weight6;
run;
In the example above, we try to declare an array based on 6 variables (weight1, weight2, …, weight6) while we defined the number of elements as 4. You can fix this error by using the asterisk to let SAS determine the number of elements in the array.
data work.test;
    array weight_array {*} weight1-weight6;
run;
Array Subscript Out Of Range
A second common error regarding arrays and DO loops is “ERROR: Array subscript out of range”. This error occurs in the following situation.
data work.test;
    array weight_array {6} weight1-weight6;
    do i=1 to 7;
        weight_array(i) = i * 10;
    end;
run;
In the code above, we declare an array of 6 elements and try to use a DO loop to set its values. An error occurs because the DO loop performs more iterations (7) than elements in the array (6).
The best way to fix this problem is by making use of the DIM function. The DIM function returns the number of elements in an array. Using the DIM function in the DO loop prevents the DO loop from performing more iterations than elements.
data work.test;
    array weight_array {6} weight1-weight6;
    do i=1 to dim(weight_array);
        weight_array(i) = i * 10;
    end;
run;
Mismatch Between Array Type and Element Type
The last common error we discuss is “ERROR: Attempt to initialize variable ABC in numeric array with character constant”.
This error can occur in the following situation.
data work.test;
    array my_array {*} value1-value4 ("A", "B", "C", "D");
run;
In this case, we try to declare an array of 4 elements with initial values. By default, all arrays in SAS are numeric. Nevertheless, the initial values are character. Hence, we can fix this error by changing the array type from numeric to character. You can do this with the dollar sign ($).
data work.test;
    array my_array {*} $ value1-value4 ("A", "B", "C", "D");
run;
For more information about arrays in SAS, check the official documentation.