Quoted tweet: https://twitter.com/ii_sambliss/status/1335296778845949954
Also: https://twitter.com/g_kallis/status/1335615126431535106
(linked by /u/serialk )
RI
The authors of the linked tweets believe that this graph implies some sort of causal link between material usage and GDP. This is incorrect for two reasons.
(1)
Firstly, aggregating materials by weight is dumb. Materials usage in the graph apparently comes from this report (link provided on twitter). Their approach gives equal weight to 1 kg of gasoline and 1 kg of gold. Obviously, we should not expect these two very different 'materials' to have the same effect on GDP. OP does not believe this to be a problem. Specifically, he thinks there's some sort of consistent relationship between GDP and materials regardless of the nonsensical aggregation method.
(2)
This brings up the second problem with the graph: there is not actually a consistent statistical relationship between the two. Basically, the graph may appear to show a strong correlation between the two, but the real reason may be autocorrelation in both processes (lots of examples here). To statistically check if there's a link between the two, I'll try two approaches: tests for Granger causality and impulse response functions. This analysis will look weird because I'm going to show how hard it is to find a statistically significant link, which is different from doing "standard research." Anyways, I first need to get the graph's data.
Data: I grab the data from the picture of the graph, because trying to get the author's data was annoying me. I used this website to pull data from the picture of the graph. The problem with this approach of getting the data is that the website doesn't naturally pick up observations at integer years. So I sampled the data quite frequently and then created two "raw" series for the data which you can access here. I plotted these two raw series and they look quite similar to the actual data -- the raw gdp and materials series are plotted here compared to the original graph here. The problem with the raw data is that there are different x-values for the two series. So, I cleaned these series by matching each observation to the closest integer year. It's not the perfect method and introduces some noise into the data but the added noise seems pretty small. I compare the clean and raw series here. The clean data frame head looks like this.
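As a rough illustration of the cleaning step, here is a minimal sketch of snapping digitized points to integer years. It assumes that "matching to the closest integer year" means keeping, for each year, the sample whose x-value lies nearest to that year; the function name `snap_to_integer_years` and the sample points are made up for illustration and are not taken from the notebook.

```python
import math

def snap_to_integer_years(points):
    """Keep, for each integer year, the digitized sample closest to it.

    points: iterable of (x, y) pairs where x is a fractional year.
    Returns a dict {year: y} sorted by year.
    """
    best = {}  # year -> (distance to that year, y value)
    for x, y in points:
        year = int(math.floor(x + 0.5))  # nearest integer, halves round up
        dist = abs(x - year)
        if year not in best or dist < best[year][0]:
            best[year] = (dist, y)
    return {year: best[year][1] for year in sorted(best)}

# Hypothetical digitized samples (fractional year, value)
raw = [(1990.1, 10.0), (1990.4, 11.0), (1991.02, 12.0), (1991.6, 13.0)]
print(snap_to_integer_years(raw))
```

Because each series is snapped separately onto the same integer grid, the two cleaned series end up with matching x-values, at the cost of a little noise from the rounding.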
To start, I check for Granger causality. Does materials usage Granger-cause GDP movement? Does GDP movement Granger-cause materials usage? Also, should we use levels, logs, level diffs, or log diffs? The correct approach is probably to use logs, check for unit roots, and then use log diffs if there are unit roots. But, I'm just going to show you the results for every single approach. And, I'll consider both AIC and BIC, and fit the VARs with 10 max lags. The reason I considered both info criteria, both causality directions, and every single transformation of the data is to emphasize that it's really hard to get a statistically significant link between these two variables.
In the below tables, I report the p-values for the Granger causality tests. The code for each test looks something like this. The NA values correspond to cases where the VAR lag selection procedure picked 0 lags, making it impossible to test for Granger causality.
Table 1: Does materials usage Granger-cause GDP movement? (p-values)
Lag selection | Log | Level | Log Diff | Level Diff |
---|---|---|---|---|
AIC | 0.574 | 0.442 | 0.618 | 0.851 |
BIC | 0.280 | 0.660 | NA | 0.107 |
Table 2: Does GDP movement Granger-cause materials usage? (p-values)
Lag selection | Log | Level | Log Diff | Level Diff |
---|---|---|---|---|
AIC | 0.012 | 0.096 | 0.109 | 0.079 |
BIC | 0.397 | 0.080 | NA | 0.546 |
There's only one rejection of the null at 5% (Table 2, AIC, Log). Keep in mind that we're likely to get a rejection from testing so many hypotheses anyways. So, overall, it seems hard to get results that say materials usage and GDP movement granger cause one another in some way.
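To put a rough number on the multiple-testing point: the tables contain 14 usable p-values (16 cells minus 2 NA). The back-of-the-envelope calculation below assumes the tests are independent, which they certainly are not (they reuse the same data), so treat it as an illustration rather than a proper correction.

```python
alpha = 0.05
n_tests = 14          # 16 table cells minus the 2 NA entries
smallest_p = 0.012    # Table 2, AIC, Log

# Chance of at least one false rejection if all nulls are true
# and the tests were independent (they are not; this is a rough guide).
prob_any_rejection = 1 - (1 - alpha) ** n_tests

# Bonferroni-adjusted per-test threshold
bonferroni_alpha = alpha / n_tests

print(f"P(at least one rejection) = {prob_any_rejection:.3f}")
print(f"Bonferroni threshold = {bonferroni_alpha:.4f}")
print("smallest p-value survives Bonferroni:", smallest_p < bonferroni_alpha)
```

Under that assumption there is roughly a coin-flip chance of seeing at least one rejection even if nothing is going on, and the lone p-value of 0.012 does not clear the Bonferroni threshold.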
However, one might argue that most of these specifications are inappropriate - that's probably right! If I had to pick a reasonable specification, I would have chosen "does log diff materials Granger-cause log diff gdp?" with AIC (BIC penalizes extra lags more heavily and picked 0 lags for this transformation, so AIC is less likely to underfit). The question seems reasonable because an increase in materials usage may be correlated with physical investment. Since it takes time to build, an increase in the growth rate of investment today might cause some increase in GDP growth in the next period. But, for this specification, the p-value was 0.618 🙄.
The only specification that got significant results was log(GDP) => log(Materials), which is hard to make sense of because the materials usage series is so weirdly constructed. I should also say that this specification feels a bit ridiculous without including a trend term (by default, the Python VARs run with only a constant). If we run the Granger causality tests for the log specifications while including a trend term, we get no significant results. Similarly, we shouldn't actually expect the specifications using levels to give results anyways, since these series are probably growing exponentially (GDP obviously is) so the underlying regressions would be weird. But, to emphasize again, the point here is that it's really hard to make the data show any significant link between the two series.
The second approach is to check the IRFs of the VARs. An impulse response function shows how a series changes over time after a unit shock to one of the variables. To generate the IRFs, I fit a VAR with AIC lag selection on logged variables with both a constant and a trend term. AIC picks the lag length by penalizing extra lags (less harshly than BIC), while logs and a trend term are used to handle exponential growth.
Note that the x-axis is years because the data is yearly, and the dashed lines are 95% confidence intervals. The off-diagonal subplots are the important ones. For example, the lower left subplot (log_gdp -> log_mat) shows that, after a unit shock to log gdp, log materials does not (statistically) significantly move away from zero compared to its counterfactual. Next, for the upper right subplot (log_mat -> log_gdp), we also get a null result. So, there doesn't appear to be a significant correlation between increases in one variable and increases in the other (at any point in time!). By the way, if we use BIC instead, we basically get weird/null results. Dropping the trend term in the VAR also has little effect.
Next, I considered a VAR with AIC lag selection on log-differenced variables. In this case, I use just a constant term in the VAR instead of including a time trend.
These are cumulative IRFs since the variables are log differenced. Note that an IRF shows the value of a variable at each point in time after a shock. If we add up a bunch of log differences, log(x_1/x_0) + log(x_2/x_1) + ... + log(x_k/x_{k-1}), the sum telescopes to log(x_k/x_0), which is approximately the percent change from x_0 to x_k. So, the graphs show the cumulative (approximate) percent change after a growth rate shock to one of the variables. Both the off-diagonal subplots basically show no results. Additionally, using BIC actually gives no results here, because, like before, BIC selects 0 lags for the VAR. Including a trend doesn't really change the results either; this is because the estimated trend terms for gdp and mat are basically zero anyways, as expected (no super exponential growth).
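The log-difference identity is easy to check numerically. The numbers below are arbitrary illustrative levels, not values from the data.

```python
import math

# Arbitrary illustrative levels x_0 .. x_4 (not from the data)
levels = [100.0, 103.0, 101.5, 108.0, 112.0]

# Period-by-period log differences log(x_t / x_{t-1})
log_diffs = [math.log(b / a) for a, b in zip(levels, levels[1:])]

cumulative = sum(log_diffs)                   # what a cumulative IRF adds up
direct = math.log(levels[-1] / levels[0])     # log(x_k / x_0)
exact_pct = levels[-1] / levels[0] - 1        # exact percent change

print(f"sum of log diffs  = {cumulative:.4f}")
print(f"log(x_k / x_0)    = {direct:.4f}")
print(f"exact pct change  = {exact_pct:.4f}")
```

The first two numbers agree up to floating-point error, and both sit close to the exact percent change as long as the total change is small.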
All in all, it's hard to get the IRFs to say that changes in materials/gdp cause changes in gdp/materials. The simplest explanation for all these null results is that the data is garbage anyways. Based on the underlying report, there are four categories of materials: non-metallic minerals, metal ores, fossil fuels, and biomass. And, they have subcategories (eg: biomass). But, they're all weighted the same, which means using up 1 kg of timber would contribute to the materials usage series by the same amount as using 1 kg of gold. So, duh, we get null results.
Btw here's the code; download and rename as *.ipynb to find my mistakes 😤