Summing two regressors: some notes

July 28, 2020

I was recently asking myself a straightforward question in econometrics. It was actually a bit more involved than what I’m about to share, but I’ll keep it simple. Suppose you have two random variables $x_t$ and $y_t$. We’ll make our life easier by assuming $x_t$ and $y_t$ are orthogonal. Let’s say you want to look at their sum $z_t=x_t+y_t$. Say there is another random variable, $w_t$, generated by the following model: $w_t=\alpha+\beta x_t+ \gamma y_t + \epsilon_t$, where $\epsilon_t$ is white noise that is orthogonal to the explanatory variables. Suppose you want to estimate the following model: \begin{align} w_t=a+bz_t+\delta_t. \label{eq:one} \end{align} What can we say about $b$? \begin{align} b=\frac{cov(z_t,w_t)}{var(z_t)}=\frac{cov(x_t,w_t)}{var(z_t)}+\frac{cov(y_t,w_t)}{var(z_t)}, \label{eq:two} \end{align} where the first equality is the population OLS slope for Equation \ref{eq:one} and the second follows from substituting $z_t=x_t+y_t$. Developing the expression further and using the orthogonality of $x_t$ and $y_t$, we have that \begin{align} b=\beta \frac{var(x_t)}{var(x_t)+var(y_t)}+\gamma \frac{var(y_t)}{var(x_t)+var(y_t)}. \label{eq:five} \end{align} In words, $b$ is a weighted average of $\beta$ and $\gamma$, with weights determined by the relative variances.
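The weighted-average result is easy to verify with a quick simulation. Here is a minimal sketch (the coefficient values, variances, and sample size are arbitrary choices of mine, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, illustrative only
n = 1_000_000

# Assumed "true" coefficients and orthogonal (here: independent) regressors
alpha, beta, gamma = 1.0, 2.0, 0.5
x = rng.normal(0, 1.0, n)
y = rng.normal(0, 1.5, n)
eps = rng.normal(0, 1.0, n)

w = alpha + beta * x + gamma * y + eps
z = x + y

# OLS slope of w on z: b = cov(z, w) / var(z)
b_hat = np.cov(z, w, ddof=0)[0, 1] / np.var(z)

# Weighted-average prediction from the derivation above
vx, vy = np.var(x), np.var(y)
b_pred = beta * vx / (vx + vy) + gamma * vy / (vx + vy)

print(b_hat, b_pred)  # should agree up to sampling noise
```

With a sample this large, the regression slope and the weighted-average formula match to roughly two decimal places; the small gap is sampling noise in the covariances.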

In many applications, we are mostly interested in the effect of a one-standard-deviation shock to our explanatory variables. Let’s first examine the response of $w_t$ to a one-standard-deviation shock to $z_t$: \begin{align} b\cdot \sigma(z_t)=b\cdot\sqrt{var(x_t)+var(y_t)}, \label{eq:six} \end{align} where $\sigma(\cdot)$ denotes the standard deviation. What can we say about the magnitude of the effect of $z_t$ on $w_t$? As an example, if we assume $x_t$ and $y_t$ have equal variance, and that $\gamma=0$, we have that $b=\frac{1}{2} \beta$ (by Equation \ref{eq:five}), so that the effect of a one-standard-deviation shock to $z_t$ on $w_t$ is $b\cdot \sigma(z_t)=\frac{1}{2}\beta \sqrt{2}\sigma(x_t)\approx 0.7 \beta \sigma(x_t)$. That is, the total effect is smaller than the effect of a one-standard-deviation shock to $x_t$ (by about 30%). If, however, $\gamma=\beta c$ with $c>0$, then $b=\frac{1}{2}(1+c) \beta$, so $b$ gets closer to $\beta$ as the coefficient on $y_t$ grows. $b$ also gets closer to $\beta$ when the variance of $x_t$ is much larger than the variance of $y_t$. For instance, if $x_t$ has twice the variance of $y_t$ (while still assuming $\gamma=0$), we have, by Equations \ref{eq:five} and \ref{eq:six}, that \begin{align} b\cdot\sigma(z_t)=\sqrt{1.5}\cdot\frac{2}{3}\beta \sigma(x_t) \approx 0.82\, \beta \sigma(x_t). \label{eq:seven} \end{align} That is, the effect of $x_t$ is still higher than the effect of $z_t$, but by less than in the equal-variance case. However, it is not hard to generate cases where this ordering flips. That is, even if a priori we know that the effect of $x_t$ is higher than the effect of $y_t$, the effect of $z_t$, their sum, can still be higher than that of $x_t$. For instance, assume $y_t$ has half the effect of $x_t$, and that they have equal variances. This simply means that $\gamma=\frac{1}{2} \beta$.
In this case, Equation \ref{eq:five} tells us that $b=\frac{1}{2} \beta+\frac{1}{2} \gamma = \frac{3}{4} \beta$. Plugging this into Equation \ref{eq:six}, the total effect of $z_t$ is $b \cdot \sigma(z_t)=\frac{3}{4}\beta \cdot \sqrt{2}\cdot \sigma(x_t) \approx 1.06 \cdot \beta \cdot \sigma(x_t)$. That is, we’d find that the total effect of $z_t$ is a tad higher than the effect of a one-standard-deviation shock to $x_t$ on $w_t$.
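The three back-of-the-envelope cases above are easy to check numerically from the closed-form expressions. A small sketch (the function name and its inputs are mine, just for this check):

```python
import math

def total_effect_ratio(var_x, var_y, beta, gamma):
    """Ratio of the one-s.d. effect of z_t to the one-s.d. effect of x_t,
    i.e. b * sigma(z) / (beta * sigma(x)), using the weighted-average
    formula for b derived above."""
    b = (beta * var_x + gamma * var_y) / (var_x + var_y)
    sigma_z = math.sqrt(var_x + var_y)
    return b * sigma_z / (beta * math.sqrt(var_x))

# Equal variances, gamma = 0: z's effect is ~70% of x's
print(total_effect_ratio(1, 1, 1, 0))    # ≈ 0.707
# var(x) = 2 var(y), gamma = 0: the ratio rises to ≈ 0.82
print(total_effect_ratio(2, 1, 1, 0))    # ≈ 0.816
# Equal variances, gamma = beta / 2: the sum now beats x alone
print(total_effect_ratio(1, 1, 1, 0.5))  # ≈ 1.061
```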

To conclude, aggregating the two measures can produce a one-standard-deviation effect that is higher than that of either measure on its own. The intuition is this: when we aggregate two orthogonal measures, their variances add up, so a one-standard-deviation shock to the sum is larger than a shock to either component, while the coefficient on the sum is a weighted average of the two individual coefficients.