Uncategorized

How to Easily Calculate a Moving Average in SAS

In this article, we discuss how to calculate a moving average (or rolling average) in SAS.

The easiest way to calculate a moving average in SAS is by using the PROC EXPAND procedure. This procedure can calculate simple moving averages, weighted moving averages, and exponential weighted moving averages, amongst others. A drawback of this procedure is that you need to have the SAS/ETL license installed. Therefore, an alternative for PROC EXPAND is a simple SAS Data Step to find a moving average.

In this article, we discuss 3 different types of moving averages you can calculate with the PROC EXPAND procedure. Also, we show briefly how you can use a SAS Data Step to find a moving average. However, since this method is (much) more complex, we only show two examples.

Calculate the Moving Average with PROC EXPAND

In this section, we will discuss how to calculate three different types of moving averages. For the examples, we use the TOURISM dataset from the SASHELP library. This dataset contains many variables of which the exchange rate of the UK pound against the US dollar is one.

The plot below shows a scatterplot and line plot of the evolution of the exchange rate.

SAS Sample Data

The easiest way to find a moving average in SAS is by using the PROC EXPAND procedure. This procedure is one of SAS time series procedures and is very useful in terms of creating new variables (e.g., a moving average) and plotting trends.

However, this procedure is part of the SAS/ETL license and, therefore, might not be installed on your machine. Alternatively, you could use a normal SAS Data Step to calculate a moving average. Unfortunately, this method requires more coding and isn’t very intuitive.

Before moving on and using the PROC EXPAND procedure, it is necessary to order your dataset. In general, you order your dataset by a variable that indicates some form of time. For example, days, months, or years.

In the example below, we use the PROC SORT procedure to sort our data ascendingly by the variable year. Also, we use the KEEP=-option to maintain only the variables year and exuk (i.e., the exchange rate).

proc sort data=sashelp.tourism
    out=work.tourism_srt (keep=year exuk);
    by year;
run;

Calculate a Moving Average in SAS

After you have ordered your dataset, you can use the PROC EXPAND procedure to find the moving averages. The most common moving average is the Simple Moving Average which calculates the average of the last N observations.

This is how you calculate a simple moving average in SAS:

  1. Start the PROC EXPAND procedure with the PROC EXPAND statement.
  2. Specify your input dataset with the DATA=-option. If you omit this option, SAS will use the most recently created dataset.
  3. Specify the output dataset with the OUT=-option. This dataset will have all the columns from the input dataset plus the new column(s) containing the moving average(s).
  4. Define the conversion method with the METHOD=-option. You use this option to let SAS know how to handle missing values. If your data has no missing values, you can use METHOD=NONE. Otherwise, you could use one of the conversion methods.
  5. Specify the variable that identifies observations in your dataset with the ID statement. The ID variable must be a numeric variable and is tipically a data or date time variable. This is also the variable that you have used to order your dataset.
  6. (Optionally) Specify the groups in your dataset with the BY statement. The PROC EXPAND procedure can also calculate moving averages per group. If you want to do this, you need the BY statement followed by one or more variable names to define your groups.
  7. Specify the variable of which you want to calculate the moving average with the CONVERT statement. The statement starts with the CONVERT keyword followed by the variable of interest. Then, after the =-sign, you define the name of the new variable that will contain the moving average.
  8. Specify what type of moving average you want to calculate. You can calculate a simple moving average with the TRANSOUT=-option follwed by the MOVAVE keyword and the number of observations you want to consider. For example, transout=(movave 3) to calculate a moving average of the third order.
  9. Run the PROC EXPAND procedure with the RUN statement.

In the example below we combine all these steps. We calculate a simple moving average of the variable exuk considering the last 3 observations.

proc expand data=work.tourism_srt out=work.tourism_moving_averge_3 method=none;
    id year;
    convert exuk = exuk_ma_3   / transout=(movave 3);
run;

We use the following SAS code to compare the original data and the moving average.

proc sgplot data=work.tourism_moving_averge_3 noautolegend;
    scatter x = year y = exuk;
    series x = year y = exuk;
    series x = year y = exuk_ma_3 / name="exuk_ma_3" legendlabel="3 Year Moving Averge" lineattrs=(color=red pattern=dash);
    keylegend "exuk_ma_3";
run;
Calculate a Moving Average in SAS

If you want to calculate different moving averages, you can easily expand your code with more CONVERT statements. For example, below we use three CONVERT statements to calculate the moving average of the orders 3, 5, and 7.

proc expand data=work.tourism_srt out=work.tourism_moving_averge_3_5_7 method=none;
    id year;
    convert exuk = exuk_ma_3   / transout=(movave 3);
    convert exuk = exuk_ma_5   / transout=(movave 5);
    convert exuk = exuk_ma_7   / transout=(movave 7);
run;
 
proc sgplot data=work.tourism_moving_averge_3_5_7 noautolegend;
    scatter x = year y = exuk;
    series x = year y = exuk;
    series x = year y = exuk_ma_3 / name="exuk_ma_3" legendlabel="3 Year Moving Averge" lineattrs=(color=red pattern=dash);
    series x = year y = exuk_ma_5 / name="exuk_ma_5" legendlabel="5 Year Moving Averge" lineattrs=(color=green pattern=dash);
    series x = year y = exuk_ma_7 / name="exuk_ma_7" legendlabel="7 Year Moving Averge" lineattrs=(color=blue pattern=dash);
    keylegend "exuk_ma_3" "exuk_ma_5" "exuk_ma_7";
run;
Plot with multiple moving averages

Calculate a Weighted Moving Average

So far, we have discussed how to calculate a simple moving average. However, another frequently used type of moving average is the Weighted Moving Average.

In a weighted moving average, the observations have different weights. Typically, more recent observations are considered to be more important. As a result, a weighted moving average reacts stronger to an increase or drop in your data.

The SAS code to calculate a weighted moving average and a simple moving average is very similar. The only difference can be found in the TRANSOUT=-option. Instead of using movave N, you use movave (1, 2, …, N) to calculate a weighted moving average. The N specified how many observations are taken into account.

For example, if you want to calculate the weighted moving average of the last 3 observations, you use movave (1 2 3). Notice that you must place the numbers between parenthesis. See the example below.

proc expand data=work.tourism_srt out=work.tourism_wgth_moving_averge_3 method=none;
    id year;
    convert exuk = exuk_wgth_ma_3   / transout=(movave (1 2 3));
run;

To compare the simple moving average and the weighted moving average, we created a SAS plot.

proc expand data=work.tourism_srt out=work.tourism_wgth_moving_averge_3 method=none;
    id year;
    convert exuk = exuk_ma_3   / transout=(movave 3);
    convert exuk = exuk_wgth_ma_3   / transout=(movave (1 2 3));
run;
 
 
proc sgplot data=work.tourism_wgth_moving_averge_3 noautolegend;
    scatter x = year y = exuk;
    series x = year y = exuk;
    series x = year y = exuk_ma_3 / name="exuk_ma_3" legendlabel="3 Year Moving Averge" lineattrs=(color=red pattern=dash);
    series x = year y = exuk_wgth_ma_3 / name="exuk_wgth_ma_3" legendlabel="3 Year Weighted Moving Averge" lineattrs=(color=green pattern=dot);
    keylegend "exuk_ma_3" "exuk_wgth_ma_3";
run;
Calculate Weighted Moving Average in SAS

Calculate an Exponentially Weighted Moving Average

The last type of moving average we discuss is the Exponentially Weighted Moving Average (or exponential moving average).

Like a weighted moving average, this type gives more importance to the most recent observations. However, the significance of observations in an exponentially weighted moving average decreases exponentially instead of gradually.

You calculate an exponentially weighted moving average in SAS with the PROC EXPAND procedure. First, you use the CONVERT statement to specify your variable of interest. Then, to create an exponentially weighted moving average, you use the TRANSOUT=-option followed by the ewma α keyword. The α indicates how quickly the importance of observations decreases.

As an example, we calculate an exponentially weighted moving average with an α of 0.3.

proc expand data=work.tourism_srt out=work.tourism_exp_moving_averge_3 method=none;
    id year;
    convert exuk = exuk_exp_ma_3   / transout=(ewma 0.3);
run;

Below we compare the 3 types of moving average we’ve discussed.

proc expand data=work.tourism_srt out=work.tourism_exp_moving_averge_3 method=none;
    id year;
    convert exuk = exuk_ma_3   / transout=(movave 3);
    convert exuk = exuk_wgth_ma_3   / transout=(movave (1 2 3));
    convert exuk = exuk_exp_ma_3   / transout=(ewma 0.3);
run;
 
 
proc sgplot data=work.tourism_exp_moving_averge_3 noautolegend;
    scatter x = year y = exuk;
    series x = year y = exuk;
    series x = year y = exuk_ma_3 / name="exuk_ma_3" legendlabel="3 Year Moving Averge" lineattrs=(color=red pattern=dash);
    series x = year y = exuk_wgth_ma_3 / name="exuk_wgth_ma_3" legendlabel="3 Year Weighted Moving Averge" lineattrs=(color=green pattern=dot);
    series x = year y = exuk_exp_ma_3 / name="exuk_exp_ma_3" legendlabel="3 Year Exponential Weighted Moving Averge" lineattrs=(color=blue pattern=dashdotdot);
    keylegend "exuk_ma_3" "exuk_wgth_ma_3" "exuk_exp_ma_3";
run;
Calculate Exponentially Weighted Moving Average

Calculate the Moving Average with a SAS Data Step

As mentioned before, the PROC EXTEND procedure is only available if you’ve installed the SAS/ETS license. However, if you don’t have this license, you can still calculate the moving average.

The alternative method uses a simple SAS Data Step. Unfortunately, this method requires more code and gets quite complex.

Calculate the Moving Average of All Previous Observations

The easiest type of moving average you can calculate with a SAS Data Step is the cumulative average. This kind of moving average takes all previous observations into account.

You calculate the cumulative average in two steps. First, you need the cumulative sum of your variable. You can do so with the RETAIN statement and the SUM() function. (For other methods, see this article.)

Once you have the cumulative sum, you only need to divide it by the row number. For this, SAS has a convenient special variable, namely _N_. This variable stores the row number that SAS is currently processing.

The SAS code below shows how to combine these steps and find a cumulative average.

data work.moving_average;
    set sashelp.tourism (keep=year exuk);
 
    retain exuk_sum;  
    exuk_sum = sum(exuk_sum,exuk);
    exuk_ma = exuk_sum / _n_;
run;
 
proc print data=work.moving_average noobs;
run;

Calculate the Moving Average of the Previous N Observations

Calculating the moving average of the previous N observations with a SAS Data Step is complex. Amongst others, you need to know about DO loops, arrays, and macro variables.

With the SAS Data Step below we calculate the moving average of the last 3 observations.

%let n = 3;
 
data work.moving_average_3;
    set sashelp.tourism (keep=year exuk);
 
    array obs_(&n);
 
    do i = &n to 2 by -1;
    	obs_(i)=obs_(i-1);
    end;
 
    obs_(1)=exuk;
 
    if _n_ < &n then exuk_ma_3 = sum(of obs_(*)) / _n_;
    else exuk_ma_3 = sum(of obs_(*)) / &n;
 
    retain obs_:;
    drop obs_: i;
run;
 
proc print data=work.moving_average_3 noobs;
run;