
How to Count the Number of Specific Characters in a SAS String

In this article, we discuss how to count the number of occurrences of specific characters in a SAS string.

You can count the number of occurrences of a specific character in a SAS string with the COUNTC function. This function takes as arguments the string and the character you want to count. The COUNTC function can also count all alphabetic characters, all digits, blanks, etc.

In the remainder of this article you will learn how to use the COUNTC function and get the most out of it.

How to Count All Characters in a SAS String

Although the goal of this article is to demonstrate how to count the number of occurrences of a specific character in a string, you might need to know the total number of characters, too.

In SAS, you can find the number of characters in a string (i.e., its length) with the LENGTH function. The LENGTH function takes a string as its argument and returns the position of the last non-blank character, so trailing blanks are not counted. For a completely blank string, LENGTH returns 1.

Here is an example.

data work.ds;
	length string $20;
 
	string = 'abcde';
	count_characters = length(string);
	output;
 
	string = 'abc123';
	count_characters = length(string);
	output;
 
	string = 'A!b %C-9';
	count_characters = length(string);
	output;
run;
 
proc print data=work.ds noobs;
run;
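The behavior of LENGTH with trailing blanks and blank strings can be checked with a minimal sketch like the one below. The data set name work.len_demo and the variable names are made up for illustration. The sketch compares LENGTH with the related LENGTHN and LENGTHC functions.

```sas
data work.len_demo;
	length string $20;

	/* Trailing blanks from the $20 padding are not counted by LENGTH. */
	string = 'abc';
	len_length  = length(string);   /* position of the last non-blank character */
	len_lengthn = lengthn(string);  /* same as LENGTH, but 0 for a blank string */
	len_lengthc = lengthc(string);  /* storage length, including the padding */
	output;

	/* A completely blank string. */
	string = ' ';
	len_length  = length(string);
	len_lengthn = lengthn(string);
	len_lengthc = lengthc(string);
	output;
run;

proc print data=work.len_demo noobs;
run;
```

For 'abc', the three functions should return 3, 3, and 20. For the blank string, they should return 1, 0, and 20.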

How to Count Specific Characters in a SAS String

You can use the COUNTC function to find the number of times a specified character appears in a SAS string.

The COUNTC function takes three arguments, the last of which is optional:

  • String: Your SAS string.
  • Character(s): The character(s) you want to count. You can leave this argument blank when a modifier supplies the characters.
  • Modifier(s) (optional): Modifies the behavior of the COUNTC function.

COUNTC(string, character(s) <,modifier(s)>)

Count One Specific Character

A common task in SAS is to count the number of times one specific character appears in a string.

You can use the COUNTC function to count the number of occurrences of one specific character in a SAS string. To do so, you need to provide two arguments. The first argument is your string. The second argument is the character you want to count (enclosed between quotes).

The SAS code below counts the number of times the character a occurs in a string.

data work.ds;
	length string $20;
 
	string = 'abcde';
	count_a = countc(string, 'a');
	output;
 
	string = 'ababab';
	count_a = countc(string, 'a');
	output;
 
	string = '123456';
	count_a = countc(string, 'a');
	output;
 
	string = 'AAAA';
	count_a = countc(string, 'a');
	output;
run;
 
proc print data=work.ds noobs;
run;

As the output shows, the COUNTC function is case-sensitive. In other words, it counts the occurrences of the lowercase letter a but ignores the uppercase letter A, so the string 'AAAA' yields a count of 0.


If you want the COUNTC function to count both lowercase and uppercase characters, you need a modifier.

Count Multiple Characters

You can also use the COUNTC function to count the occurrences of multiple characters. To do so, provide all the characters you want to count as the second argument. The function returns the total number of characters in the string that match any character in this list; it does not search for the list as a substring.

For example, below we count the number of a's and b's.

data work.ds;
	length string $20;
 
	string = 'abcde';
	count_ab = countc(string, 'ab');
	output;
 
	string = 'ababab';
	count_ab = countc(string, 'ab');
	output;
 
	string = '123456';
	count_ab = countc(string, 'ab');
	output;
 
	string = 'AAAABBBB';
	count_ab = countc(string, 'ab');
	output;
run;
 
proc print data=work.ds noobs;
run;

Note that, by default, the COUNTC function is case-sensitive. Hence, in this example, the COUNTC function ignores capital A's and capital B's.


Count All Alphabetic Characters

If you want to count all alphabetic characters in a SAS string, you could provide the COUNTC function with all letters as the second argument (aAbBcCdD …). However, this would result in a very long second argument.

Instead, you can use the modifier argument of the COUNTC function to count all alphabetic characters in a SAS string (both lowercase and uppercase). More specifically, you need the A-modifier as the third argument. You can leave the second argument blank.

In the example below, we show how to use a modifier to count all alphabetic characters. Note that we leave the second argument blank, i.e., we use two consecutive commas.

data work.ds;
	length string $20;
 
	string = 'abcde';
	count_alphabetic_chars = countc(string,,'a');
	output;
 
	string = 'ABCDE';
	count_alphabetic_chars = countc(string,,'a');
	output;
 
	string = '123456';
	count_alphabetic_chars = countc(string,,'a');
	output;
 
	string = 'A$b&C';
	count_alphabetic_chars = countc(string,,'a');
	output;
run;
 
proc print data=work.ds noobs;
run;

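It's also worth noting that the A-modifier adds both lowercase and uppercase letters to the list of characters to count, so the result does not depend on the case of the letters in the string.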


Count All Digits

Like counting all alphabetic characters, you can also use a modifier to count the number of digits in a SAS string. In this case, you need the D-modifier. Again, you can leave the second argument blank.

Below we provide an example of how to use the D-modifier to count the number of digits.

data work.ds;
	length string $20;
 
	string = '12345';
	count_digits_chars = countc(string,,'d');
	output;
 
	string = '20.5';
	count_digits_chars = countc(string,,'d');
	output;
 
	string = 'ABC-123';
	count_digits_chars = countc(string,,'d');
	output;
 
	string = 'Two';
	count_digits_chars = countc(string,,'d');
	output;
run;
 
proc print data=work.ds noobs;
run;

Count All Alphabetic Characters and Digits

You can also combine modifiers in a single call, which keeps your code concise.

For example, to count all alphabetic characters (lowercase and uppercase) and all digits, you can use the COUNTC function and combine the A- and D-modifiers. In the example below, we show how to do this.

data work.ds;
	length string $20;
 
	string = '12345';
	count_alphabetic_digits_chars = countc(string,,'ad');
	output;
 
	string = '20.5';
	count_alphabetic_digits_chars = countc(string,,'ad');
	output;
 
	string = 'ABC-123';
	count_alphabetic_digits_chars = countc(string,,'ad');
	output;
 
	string = 'Two';
	count_alphabetic_digits_chars = countc(string,,'ad');
	output;
run;
 
proc print data=work.ds noobs;
run;

Count All Spaces

Another task could be to count the number of spaces (blanks) in a SAS string.

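For this task, you can use the COUNTC function and the S-modifier. If your string has trailing blanks that you want to ignore, you need to add the T-modifier, too. This modifier trims trailing blanks before counting. It matters here because a character variable is padded with blanks up to its declared length.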

Here we show an example.

data work.ds;
	length string $20;
 
	string = '12345';
	count_spaces = countc(string,,'st');
	output;
 
	string = 'A B';
	count_spaces = countc(string,,'st');
	output;
 
	string = '1 2 3';
	count_spaces = countc(string,,'st');	
	output;
 
	string = 'A/B';
	count_spaces = countc(string,,'st');
	output;
run;
 
proc print data=work.ds noobs;
run;
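To see what the T-modifier changes, the minimal sketch below counts the spaces in the same string with and without it. The data set name work.spaces_demo and the variable names are made up for illustration.

```sas
data work.spaces_demo;
	length string $20;

	string = 'A B';
	count_s  = countc(string,,'s');   /* counts the embedded blank plus the padding blanks */
	count_st = countc(string,,'st');  /* trailing blanks are trimmed first */
	output;
run;

proc print data=work.spaces_demo noobs;
run;
```

Because the variable has a length of 20, the value 'A B' is stored with 17 trailing blanks. Therefore, count_s should be 18, whereas count_st should be 1.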

Ignore Lowercase and Uppercase

As mentioned before, the COUNTC function is case-sensitive by default.

If you want to count the number of occurrences of a character in both lowercase and uppercase, you could explicitly mention them both as the second argument. For example, to count the number of a's and A's, you could use 'aA' as the second argument.

Alternatively, you can use the I-modifier to make the COUNTC function case-insensitive and count both lowercase and uppercase characters in a SAS string.

See the example below.

data work.ds;
	length string $20;
 
	string = 'abccba';
	count_a_ignore_case = countc(string,'a','i');
	output;
 
	string = 'ABCCBA';
	count_a_ignore_case = countc(string,'a','i');
	output;
 
	string = 'AAAaaa';
	count_a_ignore_case = countc(string,'a','i');	
	output;
 
	string = '1234';
	count_a_ignore_case = countc(string,'a','i');
	output;
run;
 
proc print data=work.ds noobs;
run;
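The two approaches described above, listing both cases explicitly and using the I-modifier, can be compared with a minimal sketch. The data set name work.case_demo and the variable names are made up for illustration.

```sas
data work.case_demo;
	length string $20;

	string = 'AAAaaa';
	count_a_list     = countc(string,'aA');     /* both cases listed explicitly */
	count_a_modifier = countc(string,'a','i');  /* I-modifier ignores case */
	output;
run;

proc print data=work.case_demo noobs;
run;
```

Both variables should contain the value 6.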

Count All Not-Specified Characters

Now, suppose you want to count all characters that you have not specified.

To count all characters in a SAS string that you don't explicitly specify, you can use the V-modifier. Note that the COUNTC function remains case-sensitive when you use this modifier.

In the example below, we use the V-modifier to count all characters that are neither a nor b. We also add the T-modifier to ignore trailing blanks.

data work.ds;
	length string $20;
 
	string = 'abcde';
	count_not_ab = countc(string,'ab','vt');
	output;
 
	string = 'ab-AAABBB';
	count_not_ab = countc(string,'ab','vt');
	output;
 
	string = '123';
	count_not_ab = countc(string,'ab','vt');	
	output;
 
	string = 'aaa';
	count_not_ab = countc(string,'ab','vt');
	output;
run;
 
proc print data=work.ds noobs;
run;

More Examples

Here we provide some extra examples of how to use the COUNTC function.

Count All Non-Alphabetic, Non-Digit Characters

You can use the A-, D-, T-, and V-modifiers to count all the characters that are neither alphabetic characters nor digits. Each modifier has the following function:

  • A-modifier: Adds all alphabetic characters to the list of characters.
  • D-modifier: Adds all digits to the list of characters.
  • T-modifier: Trims trailing blanks.
  • V-modifier: Counts the characters that are not in the list, i.e., all characters that are neither alphabetic characters nor digits.

See the SAS code below for an example.

data work.ds;
	length string $20;
 
	string = 'abcde';
	count_not_alphabetic_not_numeric = countc(string,,'adtv');
	output;
 
	string = '12345';
	count_not_alphabetic_not_numeric = countc(string,,'adtv');
	output;
 
	string = '20.5%';
	count_not_alphabetic_not_numeric = countc(string,,'adtv');	
	output;
 
	string = 'mail@example.com';
	count_not_alphabetic_not_numeric = countc(string,,'adtv');
	output;
run;
 
proc print data=work.ds noobs;
run;

Count All Uppercase Characters

Another useful modifier is the U-modifier. This modifier counts the number of uppercase characters. See the example below.

data work.ds;
	length string $20;
 
	string = 'abcde';
	count_uppercase = countc(string,,'u');
	output;
 
	string = 'ABCDE';
	count_uppercase = countc(string,,'u');
	output;
 
	string = 'AA-80%';
	count_uppercase = countc(string,,'u');	
	output;
 
	string = 'heLLO!';
	count_uppercase = countc(string,,'u');
	output;
run;
 
proc print data=work.ds noobs;
run;

SAS also provides a modifier to count all lowercase characters, namely the L-modifier. It works the same way as the U-modifier.
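As a minimal sketch, the code below applies the L-modifier and the U-modifier to the same string. The data set name work.lower_demo and the variable names are made up for illustration.

```sas
data work.lower_demo;
	length string $20;

	string = 'heLLO!';
	count_lowercase = countc(string,,'l');  /* counts h and e */
	count_uppercase = countc(string,,'u');  /* counts L, L, and O */
	output;
run;

proc print data=work.lower_demo noobs;
run;
```

For this string, count_lowercase should be 2 and count_uppercase should be 3. The exclamation mark and the trailing blanks are counted by neither modifier.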

Count All Consonants

As a final example, we show how to count the number of consonants in a SAS string (both lowercase and uppercase), that is, all alphabetic characters except a, e, i, o, and u.

To do so, you need the COUNTC function and the COMPRESS function. There are two steps:

  1. Keep only the alphabetic characters from a string with the COMPRESS function.
  2. Count the number of characters that are not a, e, i, o, or u. You can do this by providing a, e, i, o, and u as the second argument and using the V-modifier. Add the I-modifier so that uppercase vowels are excluded as well.

data work.ds;
	length string $20;

	string = 'abcde';
	count_consonants = countc(compress(string,,'ak'),'aeiou','vi');
	output;

	string = '12345';
	count_consonants = countc(compress(string,,'ak'),'aeiou','vi');
	output;

	string = 'aEiOu.&Np';
	count_consonants = countc(compress(string,,'ak'),'aeiou','vi');
	output;

	string = 'mail@example.com';
	count_consonants = countc(compress(string,,'ak'),'aeiou','vi');
	output;
run;
 
proc print data=work.ds noobs;
run;
