You can use the SUBSTR() function in combination with the LENGTH() or REVERSE() function to extract the last character from a string in SAS. We demonstrate how to apply these two methods, as well as how to extract the last alphabetic or numeric character from a string.
Two Methods to Extract the Last Character from a String
In this section, we demonstrate how to extract the last character from a string in SAS in two ways.
Method 1: Using the LENGTH Function
The first method to get the last character from a string combines the power of the SUBSTR() function and the LENGTH() function.
With the SUBSTR() function you can extract a substring from a longer string. The SUBSTR() function has three arguments, namely:
- String (required): A character string from which you want to extract a substring.
- Position (required): A number that specifies the position of the start of the substring.
- Length (optional): A number that specifies the length of the substring. If you omit this argument, the substring consists of all characters from the starting position to the end of the string.
So, for example:
data _null_;
substr_with_length = substr("Hello World!", 3, 7);
put "SUBSTR() with Length Argument: " substr_with_length;
substr_without_length = substr("Hello World", 3);
put "SUBSTR() without Length Argument: " substr_without_length;
run;
We can use the SUBSTR() function to extract the last character of a string by setting the position argument equal to the position of the last character. By definition, the position of the last character in a string is always equal to the length of the string.
If your strings always have the same length, you can set the position argument by hand. However, if the length varies, you need the LENGTH() function.
The LENGTH() function takes one argument, a string, and returns its length, that is, the position of the last non-blank character. Trailing blanks are not counted. So, for example:
data _null_;
length_string = length("Hello World!");
put "Length of string: " length_string;
run;
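Dataset variables have a fixed length and are padded with blanks, so it helps to see how LENGTH() treats that padding. The following is a minimal sketch; the variable names padded and len are only illustrative.

```sas
data _null_;
    length padded $ 10;
    padded = "abc";          /* stored as "abc" followed by 7 trailing blanks */
    len = length(padded);    /* trailing blanks are not counted, so len is 3 */
    put "LENGTH of padded value: " len;
run;
```

Because the trailing blanks are ignored, the value returned by LENGTH() points at the last non-blank character, which is the position Method 1 relies on.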
Now, we combine the SUBSTR() function and the LENGTH() function to extract the last character of the strings in the my_string column of the work.my_ds dataset.
data work.last_char_1;
set work.my_ds;
my_string_length = length(my_string);
last_char = substr(my_string, my_string_length, 1);
run;
Note that you could write the code above more concisely. First, you could nest the LENGTH() function inside the SUBSTR() function. Second, you could omit the length argument, because the position argument already points to the last character of the string.
So, you could use this code:
data work.last_char_1;
set work.my_ds;
last_char = substr(my_string, length(my_string));
run;
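To try Method 1 end to end, you need an input dataset. The sketch below builds a small stand-in for work.my_ds; the three values are hypothetical and chosen only for illustration, not taken from the original data.

```sas
/* Hypothetical sample data: these values are illustrative only. */
data work.my_ds;
    length my_string $ 20;
    input my_string $char20.;
    datalines;
Hello World!
SAS
abc123
;
run;

data work.last_char_1;
    set work.my_ds;
    /* LENGTH() ignores trailing blanks, so this picks the last non-blank character. */
    last_char = substr(my_string, length(my_string), 1);
run;

proc print data=work.last_char_1;
run;
```

With these sample values, last_char should contain "!", "S", and "3" respectively.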
Method 2: Using the REVERSE Function
The second method to extract the last character from a string in SAS combines the SUBSTR() function and the REVERSE() function.
As the name suggests, the SAS REVERSE() function returns its argument (a string) in reversed order. So, for example:
data _null_;
reversed_string = reverse("Hello World!");
put "Reversed string: " reversed_string;
run;
So, extracting the last character of a string is the same as extracting the first character of the string in reversed order. Therefore, we can use the REVERSE() function as a nested function in the SUBSTR() function and set both the position and length arguments to one.
However, the reversed string might start with one or more blanks, because any trailing blanks in the original value become leading blanks after reversal. So, before extracting the first character, we use the STRIP() function to remove them. (In this post, we discuss the STRIP() function in more detail, as well as other functions to remove blanks.)
So, for example:
data work.last_char_2;
set work.my_ds;
my_string_reverse = reverse(my_string);
last_char = substr(strip(my_string_reverse), 1, 1);
run;
The code for this method can also be written more concisely, at the cost of some readability.
data work.last_char_2;
set work.my_ds;
last_char = substr(strip(reverse(my_string)), 1, 1);
run;
Extract the Last N Characters from a String
Above, we discussed how to extract the last character from a string. With some slight modifications, you can also use these methods to extract the last N characters from a string. Below we show how to get the last 4 characters.
Method 1: SUBSTR() & LENGTH() functions
To extract the last 4 characters from a string, you need to set the position argument of the SUBSTR() function to the fourth to last position of your string (you can omit the length argument). By definition, the fourth to last position of a string is its length minus 3.
So, for example:
data work.last_n_chars_1;
set work.my_ds;
last_4_chars = substr(my_string, length(my_string)-3);
run;
Method 2: SUBSTR() & REVERSE() functions
You can get the last 4 characters from a string using the SUBSTR() function and the REVERSE() function by setting the length argument to 4. Keep in mind that you need an extra REVERSE() function to get the last 4 characters in original order. See the code below.
data work.last_n_chars_2;
set work.my_ds;
last_4_chars = reverse(substr(strip(reverse(my_string)), 1, 4));
run;
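The SUBSTR() and LENGTH() approach assumes that every string has at least 4 characters. For a shorter string, the length minus 3 is smaller than 1, which is not a valid start position for SUBSTR(). One possible guard, shown here as a sketch, is to clamp the start position with the MAX() function. The dataset name work.last_n_chars_safe and the variable start_pos are hypothetical.

```sas
data work.last_n_chars_safe;
    set work.my_ds;
    /* MAX() keeps the start position at 1 or higher for strings shorter than 4 characters. */
    start_pos = max(1, length(my_string) - 3);
    last_4_chars = substr(my_string, start_pos);
run;
```

With this guard, a string shorter than 4 characters is returned in full instead of triggering an invalid-argument note in the log.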
Extract the Last Character of a Specific Type from a String
Sometimes your string in SAS consists of different types of characters, like digits, alphabetic characters, or special characters. In this section, we demonstrate how to extract the last character of a specific type from a string.
Extract the Last Alphabetic Character from a String
There are three steps to get the last alphabetic character from a string.
- Create a new string that only contains the alphabetic characters of the original string.
- Determine the number of characters of the new string.
- Extract the last character from the new string.
For step 1, we use the COMPRESS() function. You can use the COMPRESS() function for many things, one of which is removing unwanted characters. To create a new string with only alphabetic characters, you can use the following code.
data _null_;
alphabetic_char_only = compress("a1!b2$c3%", ,"ka");
put "Only alphabetic characters: " alphabetic_char_only;
run;
We used “ka” as the third argument of the COMPRESS() function. These letters stand for keep alphabetic.
(Although the COMPRESS() function is mostly used to remove blanks, it has many more functionalities. Read this post to learn more about the COMPRESS() function.)
Since we now know how to create a new string that keeps only the alphabetic characters, we can use the SUBSTR() and LENGTH() functions to extract the last alphabetic character from a string.
So, for example:
data work.last_alphabetic_char;
set work.my_ds;
alphabetic_chars = compress(my_string, ,"ka");
alphabetic_chars_length = length(alphabetic_chars);
last_alphabetic_char = substr(alphabetic_chars, alphabetic_chars_length);
run;
Of course, you could write this code more concisely, or use the SUBSTR() function in combination with the REVERSE() function.
last_alphabetic_char = substr(compress(my_string, ,"ka"), length(compress(my_string, ,"ka")));
last_alphabetic_char = substr(strip(reverse(compress(my_string, ,"ka"))), 1, 1);
To extract the last 4 alphabetic characters from a string, you can use this code.
last_4_alphabetic_chars = substr(compress(my_string, ,"ka"), length(compress(my_string, ,"ka"))-3);
Extract the Last Digit from a String
Finally, we demonstrate how to extract the last digit(s) from a string in SAS. Again, there are three steps:
- Use the COMPRESS() function to create a new string that only contains the digits from the original string.
- Use the LENGTH() function to determine the number of digits in the new string.
- Use the SUBSTR() function to extract the last digit from the new string.
With the COMPRESS() function and the “kd” argument, we can create a new string that contains only the digits from the original string. For example:
data _null_;
digit_char_only = compress("a1!b2$c3%", ,"kd");
put "Only digit characters: " digit_char_only;
run;
The “kd” argument in the code above stands for keep digits.
(Many SAS Programmers use the COMPRESS() function only for removing whitespace. However, the function can help you with many other tasks. Read this post to learn more.)
Now that we have a string with only digits, we can use the SUBSTR() function and the LENGTH() function to extract the last digit. So, for example:
data work.last_digit_char;
set work.my_ds;
digit_chars = compress(my_string, ,"kd");
digit_chars_length = length(digit_chars);
last_digit_char = substr(digit_chars, digit_chars_length);
run;
You could write the code above more concisely and extract the last digit in a single statement. Alternatively, you could use the REVERSE() function.
last_digit_char = substr(compress(my_string, ,"kd"), length(compress(my_string, ,"kd")));
last_digit_char = substr(strip(reverse(compress(my_string, ,"kd"))), 1, 1);
To extract the last 2 digits from a string, you can use this code.
last_2_digits = substr(compress(my_string, ,"kd"), length(compress(my_string, ,"kd"))-1);