In this article, we discuss 3 ways to format a variable in SAS and, as a result, improve the readability of your data.
You can use SAS formats to change the appearance of numeric, character, and date variables in a SAS Data Step, a PROC step, or a SAS Macro. In this article, we focus on formatting variables in datasets with the SAS Data Step, the PROC SQL procedure, and the PROC DATASETS procedure.
However, we’ve also written articles about changing the appearance of variables in different PROC steps (e.g., PROC MEANS, PROC SUMMARY, and PROC TABULATE) as well as in the SAS Macro Language.
Contents
1. How to Format a Variable in a SAS Data Step
To change the appearance of one or more variables within a SAS Data Step, you can use the SAS FORMAT statement. This statement starts with the FORMAT keyword, the variable(s) you want to modify, and the desired format.
Syntax of the FORMAT statement:
FORMAT variable-name format-name;
Example:
FORMAT income DOLLAR10.2;
A SAS Data Step can contain many FORMAT statements to format all variables one by one. However, the FORMAT statement can associate the same format with various variables. Moreover, it can assign different formats to different variables. See, the syntax below.
FORMAT variable-name(s) format-name variable-name(s) format-name variable-name(s) format-name;
For example, here we assign the same format to the variables income and expenses. We also change the appearance of the variable transaction_date.
FORMAT income expenses DOLLAR10.2 transaction_date MMDDYY10.;
Below we show in more detail how to use the FORMAT statement to change the look of numeric, character, and date variables.
Numeric Variables
The FORMAT statement is mostly used to format numeric variables. For example, by adding the dollar sign or dots and commas.
The following SAS code generates a data set with 4 numeric variables which we will use in our examples.
/* Create a Dataset */ data work.my_data; infile datalines; input units_sold price_per_unit_USD total_USD; total_EUR = total_USD * 0.9; datalines; 2 4.5 9 4 1.35 5.4 20 3 60 1500 0.55 825 2100 9.95 20895 ; run; /* Print Dataset */ proc print data=work.my_data noobs; run;
In the example below, we will use the FORMAT statement to change the appearance of one variable, namely total_USD. We use the DOLLAR12.2-format to print a $-sign followed by the value, which will have 2 decimals and commas to separate the thousands.
We place the FORMAT statement after the SET statement to maintain the current structure of the dataset
data work.my_data_fmt; set work.my_data; format total_USD dollar12.2; run; proc print data=work.my_data_fmt noobs; run;
You can also use a single FORMAT statement to change the look of several variables. For example:
data work.my_data_fmt; set work.my_data; format units_sold comma12.0 price_per_unit_USD total_USD dollar12.2 total_EUR euro12.2; run; proc print data=work.my_data_fmt noobs; run;
Instead of 3, SAS Base provides many numeric formats. See the table below.
Value | Formatted Value | Format Name | Description |
---|---|---|---|
10000 | 10,000 | COMMA12.0 | Separates every 3 digits of a numeric value with a comma and a period for decimals |
10000 | 10,000.00 | COMMA12.2 | Separates every 3 digits of a numeric value with a comma and a period for decimals |
10000 | 10.000 | COMMAX12.0 | Separates every 3 digits of a numeric value with a period and a comma for decimals |
10000 | 10.000,00 | COMMAX12.2 | Separates every 3 digits of a numeric value with a period and a comma for decimals |
10000 | $10,000 | DOLLAR12.0 | Same as COMMA format preceded by a $ sign |
10000 | $10,000.00 | DOLLAR12.2 | Same as COMMA format preceded by a $ sign |
10000 | $10.000 | DOLLARX12.0 | Same as COMMAX format preceded by a $ sign |
10000 | $10.000,00 | DOLLARX12.2 | Same as COMMAX format preceded by a $ sign |
10000 | 10.000 € | NLMNLEUR12.0 | Euro currency (no decimals) |
10000 | 10.000,00 € | NLMNLEUR12.2 | Euro currency (two decimals) |
10000 | 10,000 £ | NLMNLGBP12.0 | GBP currency (no decimals) |
10000 | 10,000.00 £ | NLMNLGBP12.2 | GBP currency (two decimals) |
10000 | 10000% | PERCENT10.0 | Percentage with no decimals |
10000 | 1.000E+04 | E10. | Scientific notation with no decimals |
Note, if SAS doesn’t display the format correctly, you might change the length of the format. For example, DOLLAR18.2 instead of DOLLAR12.2.
For more numeric formats, see these pages (1, 2).
Character Variables
Besides numeric variables, you can also use the FORMAT statement to change the look of character variables.
In contrast to numeric formats, there exists just one (frequently used) character format, namely the $CHAR. Therefore, character variables are normally associated with a user-written format. In such cases, a user-written format can be considered as a look-up table.
Character formats are easy to recognize as the name always starts with a $-sign.
For example, the user-written format below associated a grade with a PASS or a FAIL.
data work.my_data; input grade $; datalines; B C A D B . A ; run; proc print data=work.my_data noobs; run; proc format; value $grade_fmt "A", "B", "C" = "PASS" "D", other = "FAIL"; run; data work.my_data_fmt; set work.my_data; format grade $grade_fmt.; run; proc print data=work.my_data_fmt noobs; run;
Date Variables
Apart from numeric and character variables, you can also use the FORMAT statement to change the way how SAS displays date variables.
A SAS date format is a special type of a numeric format because date variables are stored as numbers. Therefore, you can use the FORMAT statement to either change the existing format of a date variable or associate an unformatted numeric variable with a date format.
For instance, this SAS code uses the FORMAT statement to change the format from DATE9. to MMDDYY10.
data work.my_data; input my_date : date9.; format my_date date9.; datalines; 01JAN2021 14JUL2018 30OCT2022 ; run; proc print data=work.my_data noobs; run; data work.my_data_fmt; set work.my_data; format my_date mmddyy10.; run; proc print data=work.my_data_fmt noobs; run;
Like numeric formats, SAS Base offers many date formats. See here a list of the most common ways to display dates.
SAS dates might be confusing. Sometimes a value looks like a date but isn’t a real SAS date. This can happen on two occasions:
- A number looks like a date, e.g. 01012021 (January 1st, 2021), or
- A character looks like a date, e.g. 01/01/2010.
Please, read this article to change a number that looks like a date into a real SAS date or this article to convert a text string into a SAS date.
2. How to Format a Variable with PROC SQL
Instead of associating a variable with a format in a SAS Data Step, you can also format variables with the PROC SQL procedure. In fact, there exist two ways to do so, namely with the SELECT statement or the ALTER TABLE statement.
Format Variables with the SELECT Statement
The most common way to format variables in the PROC SQL procedure is with the SELECT statement. This statement selects one or more variables and directly associates them with a format. To modify how SAS displays a variable, you use the FORMAT=-option followed by the desired format.
Syntax of the FORMAT=-option in the SELECT statement:
SELECT variable-name FORMAT=format-name, variable-name FORMAT=format-name, variable-name FORMAT=format-name, etc.;
Example:
SELECT income FORMAT=DOLLAR10.2, age FORMAT=comma12.0;
You can use the FORMAT=-option also in combination with the LENGTH=-option and the LABEL=-option. If you do so, the order of the options is irrelevant.
In contrast to the SAS Data Step, you can’t use a single FORMAT=-option to format multiple variables at once. In PROC SQL, you need to format each variable separately.
Although formatting improves the readability of your tables, it isn’t necessary to associate a variable with a format. If you omit the FORMAT=-option, SAS will either use:
- the original format of the variable from the input table, or
- no format at all, in case of a new variable.
The following SAS code shows how to use the FORMAT=-option in the PROC SQL procedure.
data work.my_data; infile datalines; input units_sold price_per_unit_USD total_USD; total_EUR = total_USD * 0.9; datalines; 2 4.5 9 4 1.35 5.4 20 3 60 1500 0.55 825 2100 9.95 20895 ; run; proc print data=work.my_data noobs; run; proc sql; create table work.my_data_fmt as select units_sold format = comma12.0, price_per_unit_USD format = dollar12.2, total_USD format = dollar12.2, total_EUR format = nlmnleur18.2 from work.my_data; quit; proc print data=work.my_data_fmt noobs; run;
Format Variables with the ALTER TABLE Statement
A drawback of formatting variables with the FORMAT=-option in the SELECT statement is that it could require many lines of code.
For example, if you want to change the appearance of some variables, you still need to explicitly select all variables. As a result, the SELECT statement can become very long.
Instead of the SELECT statement, you can use the ALTER TABLE statement and the FORMAT=-option to associate a variable with a format without creating a new table.
The statement starts with the ALTER TABLE keyword and the table name. Then after the MODIFY keyword, you specify the variable of interest followed by the FORMAT=-option and the new format.
Syntax of the FORMAT=-option in the ALTER TABLE statement:
ALTER TABLE table-name MODIFY variable-name FORMAT=format-name, variable-name FORMAT=format-name, etc.;
Example:
ALTER TABLE my_data MODIFY income FORMAT=DOLLAR10.2, age FORMAT=comma12.0;
Although the ALTER TABLE statement and the MODIFY clause allow you to only associate a format with the variables of interest, you still can’t assign the same format to multiple variables at once. In other words, you need one FORMAT=-option per variable if you use the ALTER TABLE statement.
Here we use the ALTER TABLE statement to change the look of two variables.
data work.my_data; infile datalines; input units_sold price_per_unit_USD total_USD; total_EUR = total_USD * 0.9; datalines; 2 4.5 9 4 1.35 5.4 20 3 60 1500 0.55 825 2100 9.95 20895 ; run; proc print data=work.my_data noobs; run; proc sql; alter table work.my_data modify units_sold format=comma12.0, total_USD format=dollar12.2; quit; proc print data=work.my_data noobs; run;
3. How to Format a Variable with PROC DATASETS
A less known, but still powerful way to format variables in SAS is with the PROC DATASETS procedure.
The PROC DATASETS procedure, in combination with the MODIFY statement, can change SAS datasets in various aspects. For example, changing the look of a variable with the FORMAT statement.
A FORMAT statement within a PROC DATASETS procedure and the SAS Data Step work identically.
Syntax of the FORMAT statement in the PROC DATASETS procedure:
proc datasets library=library-name; modify dataset-name; format variable-name(s) format-name; run;
Like the same statement in a SAS Data Step, you can use the FORMAT statement in PROC DATASETS to associate several variables with the same appearance, as well as assign different looks to different variables.
For example:
data work.my_data; infile datalines; input units_sold price_per_unit_USD total_USD; total_EUR = total_USD * 0.9; datalines; 2 4.5 9 4 1.35 5.4 20 3 60 1500 0.55 825 2100 9.95 20895 ; run; proc print data=work.my_data noobs; run; proc datasets library=work nolist; modify my_data; format units_sold comma12.0 price_per_unit_USD total_USD dollar12.2; run; proc print data=work.my_data noobs; run;
How to Remove a Format from a Variable
So far, we’ve discussed how to associate a variable with a SAS format. However, how do you remove a format?
You can remove a format from a variable in different ways, such as:
- A blank FORMAT statement in a SAS Data Step.
- A blank FORMAT statement in the PROC DATASETS procedure.
- An empty FORMAT=-option in a SELECT statement of the PROC SQL procedure.
- An empty FORMAT=-option in an ALTER TABLE statement of the PROC SQL procedure.
Below we show the syntax of these options.
Remove a Format with a SAS Data Step
data work.my_data; set work.my_data; format my_variable; run,
Remove a Format with the PROC DATASETS procedure
proc datasets lib=work nolist; modify my_data; format my_variable; run;
Remove a Format with the SELECT statement in PROC SQL
proc sql; select my_variable format= from work.my_data; quit;
Remove a Format with the ALTER TABLE statement in PROC SQL
proc sql; alter table work.my_data modify my_variable format=; quit;
One thought on “How to Format Variables in a SAS Dataset”
Comments are closed.