SAS Formats SAS How To's

3 Easy Ways to Format Variables in a SAS Dataset [Examples]

In this article, we explain how to format a variable in SAS and, as a result, improve the readability of your data.

You can use SAS formats to change the appearance of numeric, character, and date variables in a SAS Data Step, a PROC step, or a SAS Macro. In this article, we focus on formatting variables in datasets with the SAS Data Step and the PROC SQL procedure.

However, we’ve also written articles about changing the appearance of variables in different PROC steps (e.g., PROC MEANS, PROC SUMMARY, and PROC TABULATE) as well as in the SAS Macro Language.

1. How to Format a Variable in a SAS Data Step

As mentioned above, there exist several ways to format variables. One of which is the FORMAT statement.

The SAS FORMAT statement changes the format of a variable and is part of a SAS Data Step. The statement consists of the FORMAT keyword, the variable you want to modify, and its new format. The new format can either be a standard SAS Base format or a user-written format.

Syntax of the FORMAT statement:

FORMAT variable-name format-name;

Example:

FORMAT income DOLLAR10.2;

A SAS Data Step can contain many FORMAT statements to format all variables one by one. However, the FORMAT statement can associate the same format with various variables. Moreover, it can assign different formats to different variables. See, the syntax below.

FORMAT variable-name(s) format-name variable-name(s) format-name variable-name(s) format-name;

Example:

FORMAT income expenses DOLLAR10.2 transaction_date MMDDYY10.; 

Below we show in more detail how to use the FORMAT statement to format numeric, character, and date variables.

How to Format a Numeric Variable

The FORMAT statement is mostly used to format numeric variables. For example, by adding the dollar sign or dots and commas.

The following SAS code generates a data set with 4 numeric variables which we will use to show how the FORMAT statement can be used.

/* Create a Dataset */
data work.my_data;
    infile datalines;
    input units_sold price_per_unit_USD total_USD;
    total_EUR = total_USD * 0.9;
    datalines;
2 4.5 9
4 1.35 5.4
20 3 60
1500 0.55 825
2100 9.95 20895
;
run;
 
/* Print Dataset */
proc print data=work.my_data noobs;
run;

To format a numeric variable, and maintain the current structure of the dataset, we use the FORMAT statement after the SET statement. The FORMAT statement starts with the FORMAT keyword followed by the name of a numeric variable and ends with the new format.

In the example below, we will use the FORMAT statement to format one variable. We associate the DOLLAR12.2 format with the variable total_USD. This format displays numeric variables by printing a $-sign followed by a value containing 2 decimals. Moreover, the separates thousands with a comma.

data work.my_data_fmt;
    set work.my_data;
 
    format total_USD dollar12.2;
run;
 
proc print data=work.my_data_fmt noobs;
run;

You can also use a single FORMAT statement to associate different formats with several variables. For example:

data work.my_data_fmt;
    set work.my_data;
 
    format units_sold comma12.0 
	price_per_unit_USD total_USD dollar12.2
	total_EUR euro12.2;
run;
 
proc print data=work.my_data_fmt noobs;
run;

Above, we have used just 3 of the many numeric formats SAS Base offers. Next, we provide a list of the most common numeric formats.

Most used numeric SAS Base formats

ValueFormatted ValueFormat NameDescription
1000010,000COMMA12.0Separates every 3 digits of a numeric value with a comma and a period for decimals
1000010,000.00COMMA12.2Separates every 3 digits of a numeric value with a comma and a period for decimals
1000010.000COMMAX12.0Separates every 3 digits of a numeric value with a period and a comma for decimals
1000010.000,00COMMAX12.2Separates every 3 digits of a numeric value with a period and a comma for decimals
10000$10,000DOLLAR12.0Same as COMMA format preceded by a $ sign
10000$10,000.00DOLLAR12.2Same as COMMA format preceded by a $ sign
10000$10.000DOLLARX12.0Same as COMMAX format preceded by a $ sign
10000$10.000,00DOLLARX12.2Same as COMMAX format preceded by a $ sign
1000010.000 €NLMNLEUR12.0Euro format
1000010.000,00 €NLMNLEUR12.2Euro format
1000010,000 £NLMNLGBP12.0GBP format
1000010,000.00 £NLMNLGBP12.2GBP format
1000010000%PERCENT10.0Percentage format
100001.000E+04E10.Scientific format

Note, if SAS doesn’t display the format correctly, you might change the length of the format. For example, DOLLAR18.2 instead of DOLLAR12.2.

For more numeric formats, see these pages (1, 2).

How to Format a Character Variable

Besides numeric variables, you can also use the FORMAT statement to format character variables.

In contrast to numeric formats, there exists just one (frequently used) character format, namely the $CHAR format. Therefore, character variables are normally associated with a user-written format. In such cases, the user-written format can be considered as a look-up table.

You can recognize character formats by the $-sign. The name of a character format always starts with a $-sign.

For example, the user-written format below associated a grade with a PASS or a FAIL.

data work.my_data;
    input grade $;
    datalines;
B
C
A
D
B
.
A
;
run;
 
proc print data=work.my_data noobs;
run;
 
proc format;
    value $grade_fmt
	"A", "B", "C" = "PASS"
	"D", other = "FAIL";
run;
 
data work.my_data_fmt;
    set work.my_data;
 
    format grade $grade_fmt.;
run;
 
proc print data=work.my_data_fmt noobs;
run;

How to Format a Date Variable

Apart from numeric and character variables, you can use the FORMAT statement also to format date variables.

A SAS date format is a special type of numeric format because date variables are stored as numbers. Therefore, you can use the FORMAT statement to either change the existing format of a date variable or associate an unformatted numeric variable with a date format.

For instance, this SAS code uses the FORMAT statement to change the format from DATE9. to MMDDYY10.

data work.my_data;
    input my_date : date9.;
    format my_date date9.;
    datalines;
01JAN2021
14JUL2018
30OCT2022
;
run;
 
proc print data=work.my_data noobs;
run;
 
data work.my_data_fmt;
    set work.my_data;
 
    format my_date mmddyy10.;
run;
 
proc print data=work.my_data_fmt noobs;
run;

Like numeric formats, SAS Base offers many date formats. See here a list of the most common data formats.

SAS dates might be confusing. Sometimes a value looks like a date but isn’t a real SAS date. This can happen on two occasions:

  • A number looks like a date, e.g. 01012021 (January 1st, 2021), or
  • A character looks like a date, e.g. 01/01/2010.

Please, read this article to change a number that looks like a date into a real SAS date or this article to convert a text string into a SAS date.

2. How to Format a Variable with PROC SQL

Instead of associating a variable with a format in a SAS Data Step, you can also format variables with the PROC SQL procedure. In fact, there exist two ways to do so, namely with the SELECT statement or the ALTER TABLE statement.

Format Variables with the SELECT Statement

The most common way to format variables in the PROC SQL procedure is with the SELECT statement. This statement selects one or more variables and directly associates them with a format. To format a variable, you use the FORMAT=-option followed by the desired format.

Syntax of the FORMAT=-option in the SELECT statement:

SELECT variable-name FORMAT=format-name,
       variable-name FORMAT=format-name,
       variable-name FORMAT=format-name,
       etc.;

Example:

SELECT income FORMAT=DOLLAR10.2,
       age FORMAT=comma12.0;

You can use the FORMAT=-option also in combination with the LENGTH=-option and the LABEL=-option. If you do so, the order of the options is irrelevant.

In contrast to the SAS Data Step, you can’t use a single FORMAT=-option to format multiple variables at once. In PROC SQL, you need to format each variable separately.

Although format improves the readability of your tables, it isn’t necessary to associate a variable with a format. If you omit the FORMAT=-option, SAS will either use:

  • the original format of the variable from the input table, or
  • no format at all, in case of a new variable.

The following SAS code shows how to use the FORMAT=-option in the PROC SQL procedure.

data work.my_data;
    infile datalines;
    input units_sold price_per_unit_USD total_USD;
    total_EUR = total_USD * 0.9;
    datalines;
2 4.5 9
4 1.35 5.4
20 3 60
1500 0.55 825
2100 9.95 20895
;
run;
 
proc print data=work.my_data noobs;
run;
 
proc sql;
    create table work.my_data_fmt as
	select units_sold format = comma12.0,
        	price_per_unit_USD format = dollar12.2,
		total_USD format = dollar12.2,
		total_EUR format = nlmnleur18.2
	from work.my_data;
quit;
 
proc print data=work.my_data_fmt noobs;
run;

Format Variables with the ALTER TABLE Statement

A drawback of formatting variables with the FORMAT=-option in the SELECT statement is that it could require many lines of code. For example, if you just want to format a subset of all variables, you still need to explicitly select all variables. As a result, the SELECT statement can become very long.

Instead of the SELECT statement, you can use the ALTER TABLE statement and the FORMAT=-option to associate a variable with a format without creating a new table. The statement starts with the ALTER TABLE keyword and the table name. Then after the MODIFY keyword, you specify the variable of interest followed by the FORMAT=-option and the new format.

Syntax of the FORMAT=-option in the ALTER TABLE statement:

ALTER TABLE table-name 
    MODIFY variable-name FORMAT=format-name,
           variable-name FORMAT=format-name,
           etc.;

Example:

ALTER TABLE my_data 
    MODIFY income FORMAT=DOLLAR10.2,
       age FORMAT=comma12.0;

Although the ALTER TABLE statement and the MODIFY clause allow you to only associate a format with the variables of interest, you still can’t assign the same format to multiple variables at once. In other words, you still need a FORMAT=-option per variable.

Here we use the ALTER TABLE statement to format two variables.

data work.my_data;
    infile datalines;
    input units_sold price_per_unit_USD total_USD;
    total_EUR = total_USD * 0.9;
    datalines;
2 4.5 9
4 1.35 5.4
20 3 60
1500 0.55 825
2100 9.95 20895
;
run;
 
proc print data=work.my_data noobs;
run;
 
proc sql;
    alter table work.my_data
	modify units_sold format=comma12.0,
		total_USD format=dollar12.2;
quit;
 
proc print data=work.my_data noobs;
run;

3. How to Format a Variable with PROC DATASETS

A less known, but still powerful way to format variables in SAS is with the PROC DATASETS procedure.

The PROC DATASETS procedure, in combination with the MODIFY statement, can change SAS datasets in various aspects. For example, it can associate a variable with a format using the FORMAT statement. A FORMAT statement within PROC DATASETS works identically as the FORMAT statement in a SAS Data Step.

Syntax of the FORMAT statement in the PROC DATASETS procedure:

proc datasets library=library-name;
	modify dataset-name;
	format variable-name(s) format-name;
run;

Like a FORMAT statement in a SAS Data Step, you can use the FORMAT statement in PROC DATASETS to associate several variables with the same format, as well as assign different formats to different variables.

For example:

data work.my_data;
    infile datalines;
    input units_sold price_per_unit_USD total_USD;
    total_EUR = total_USD * 0.9;
    datalines;
2 4.5 9
4 1.35 5.4
20 3 60
1500 0.55 825
2100 9.95 20895
;
run;
 
proc print data=work.my_data noobs;
run;
 
proc datasets library=work nolist;
    modify my_data;
    format units_sold comma12.0
	price_per_unit_USD total_USD dollar12.2;
run;
 
proc print data=work.my_data noobs;
run;

How to Remove a Format from a Variable

So far, we’ve discussed how to associate a variable with a SAS format. However, how do you remove a format?

You can remove a format from a variable in different ways, such as:

  • A blank FORMAT statement in a SAS Data Step.
  • A blank FORMAT statement in the PROC DATASETS procedure.
  • An empty FORMAT=-option in a SELECT statement of the PROC SQL procedure.
  • An empty FORMAT=-option in an ALTER TABLE statement of the PROC SQL procedure.

Below we show the syntax of these options.

Remove a Format with a SAS Data Step

data work.my_data;
   set work.my_data;
 
   format my_variable;
run,

Remove a Format with the PROC DATASETS procedure

proc datasets lib=work nolist;
    modify my_data;
    format my_variable;
run;

Remove a Format with the SELECT statement in PROC SQL

proc sql;
   select my_variable format=
   from work.my_data;
quit;

Remove a Format with the ALTER TABLE statement in PROC SQL

proc sql;
    alter table work.my_data
    modify my_variable format=;
quit;

One thought on “3 Easy Ways to Format Variables in a SAS Dataset [Examples]

Comments are closed.