If you work with data, it’s sometimes necessary to combine different tables. There are two types of combining your data sets, namely, horizontally and vertically. In a previous article, we explained how you can merge your datasets horizontally. In this article, we discuss how to append your data sets in SAS vertically.
In SAS, you can append datasets with the SET statement or the APPEND procedure. It depends on the structure of your datasets, as well as on your purpose which method is the best option.
We discuss how to append data sets with:
- A similar structure, i.e., with exactly the same number of columns, column names, and column types.
- Different column names.
- Different column types.
Contents
Append Data Sets with a Similar Structure
We start our discussion about how to append data sets in SAS with the simplest case, namely combining data sets with the same structure. That is to say, two (or more) tables with the same number of columns, the same column names, and the same column types, for example, the two tables below.
data work.ds1; input columnA $ columnB; datalines; A 1 B 2 C 3 ; run; data work.ds2; input columnA $ columnB; datalines; A 4 B 5 C 6 ; run;
SET Statement
The first method to combine two tables with the same structure is with the SET statement. First, you use the DATA statement to define the name of the new table. Then, you use the SET statement followed by the names of the tables that you want to append (separate by a whitespace).
data work.ds3; set work.ds1 work.ds2; run;
In this example, we combine two tables. However, with the SET statement you can append any number of tables you want.
PROC Append
The second method to append tables is the PROC APPEND procedure.
The PROC APPEND procedure has two arguments:
- base: The name of the data set to which you want to append observations.
- data: The name of the data set you want to append to the base.
Comparing the SET statement and the PROC APPEND procedure, there are two main differences:
- With the PROC APPEND procedure, you can combine just two tables. While, with the SET statement, you can append two or more tables at once.
- With the PROC APPEND procedure, you can’t create a new data set. Observations are directly added to the base data set. This difference makes the PROC APPEND procedure more efficient than the SET statement because the base data set isn’t stored in memory (necessarily).
In the example below, we show how to use the PROC APPEND procedure to combine two tables with the same structure.
Note that the PROC APPEND procedure doesn’t create a new data set.
proc append base=work.ds1 data=work.ds2; run;
Append Data Sets with Different Column Names
In this section, we discuss how to append data sets with different column names in SAS. For example, the two tables below have columnA in common, but columnB and columnC are different.
data work.ds1; input columnA $ columnB; datalines; A 1 B 2 C 3 ; run; data work.ds2; input columnA $ columnB columnC $; datalines; A X B Y C Z ; run;
SET Statement
You can use the SET statement to append data sets with different column names. With the DATA statement, you define the name of the new table and with the SET statement, you specify which tables you want to combine. The new data set contains all the columns of the data sets mentioned in the SET statement. If one of the tables in the SET statement doesn’t contain all the columns of the other table(s), then these columns are left empty. Check the example below.
Note that the first 3 rows of columnC and the last 3 rows of columnB are empty because they are missing in the original data sets.
data work.ds3; set work.ds1 work.ds2; run;
PROC Append
You can also you the PROC APPEND procedure to combine tables with different structures. However, if you use the PROC APPEND code mentioned in one of the sections above, SAS will return an error:
ERROR: No appending done because of anomalies listed above. Use FORCE option to append these files.
SAS returns this error if the base and data tables don’t have the same structure.
proc append base=work.ds1 data=work.ds2; run;
Fortunately, you can still use the PROC APPEND procedure to combine tables with a different structure. To do so you need to FORCE option. With the FORCE option, SAS uses the structure of the base table and appends the rows from the data table that match the columns of the base table. So, in the example above, SAS appends columnA from work.ds2 to work.ds1, but columnC is ignored.
Again, SAS overwrites the base table with the result of the append procedure. Notice also that SAS still returns a warning if you try to append tables with different structures even though you use the FORCE option.
proc append base=work.ds1 data=work.ds2 force; run;
Append Data Sets with Different Column Types
Finally, we discuss how to append data sets in SAS with the same column names, but with different column types. For example, the data sets below have the same column names, but columnB of work.ds1 is numeric, while columnB of work.ds2 is character.
data work.ds1; input columnA $ columnB; datalines; A 1 B 2 C 3 ; run; data work.ds2; input columnA $ columnB $; datalines; A X B Y C Z ; run;
SET Statement
If you want to append tables with the same column names, but with different columns types, then the SET statement won’t work. Instead, it will return an error:
ERROR: Variable XYZ has been defined as both character and numeric
There is no option to make this work.
data work.ds3; set work.ds1 work.ds2; run;
PROC Append
Like the SET statement, the PROC APPEND procedure also returns an error if you try to combine tables with the same column names, but with different column types:
ERROR: No appending done because of anomalies listed above. Use FORCE option to append these files.
proc append base=work.ds1 data=work.ds2; run;
Again, you can use the FORCE option to make this append operation work. If you use the FORCE option, SAS uses the structure of the base table and appends all rows with matching column names and column types from the data table. If the data table doesn’t contain a matching column name and type, SAS appends a blank value.
proc append base=work.ds1 data=work.ds2 force; run;
2 thoughts on “How to Append Data Sets in SAS”
Comments are closed.