We need the following packages for working on this example.
library(tidyverse) # for almost all data handling tasks
library(readxl) # to import Excel data
library(ggplot2) # to produce nice graphiscs
library(stargazer) # to produce nice results tables
Here we use an example which, at the time, made a lot of waves. Carmen Reinhart and Kenneth Rogoff (2010) wrote work that attempted to examine the relationship between Debt to GDP levels and GDP growth. Their general conclusion was that, as long as the debt to GDP ration does not exceed 90% there seems to be no clear relationship between the two. However, countries running higher debt to GDP ratios are paying a sizeable growth penalty. Reinhart and Rogoff were influential economists and had the ear of many governments around the world, including that of the UK. It was often characterised as one of the pieces of work offering the academic credibility to austerity policies.
The empirical work of Reinhard and Rogoff was critised on three aspects by Thomas Herndon, Michael Ash and Robert Pollin. This work was published in the Cambridge Journal of Economics in 2014. and previous to that as a working paper. An earlier working paper had the relevant data and code published alongside and the work presented here is based on this.
The publication of this critique made waves around the globe not least because the critique was published by academics from a lower ranked institution, one of the authors was a graduate student and because one of the shortcomings of the original work highlighted was a mistake in an EXCEL spreadsheet. This article in The New Yorker provides a nice summary of the “affair”.
The criticism of the empirical work in Reinhard and Rogoff, importantly, went beyond a deficient spreadsheet and this piece will review Thomas Herndon’s work and present it such that you can replicate it.
The following statistical concepts are used in this project
The following R programming concepts were used in this document. The links lead to pages that provide help with these issues.
This is a simplified spreadsheet (based on the work of Herndon et al., 2014): RRdata.xlsx. Download the file and save it in your working folder.
RRData <- read_excel("../data/RRdata.xlsx") # make sure you point to the right file in your working directory
RRData <- as.data.frame(RRData)
str(RRData) # prints some basic info on variables
## 'data.frame': 1171 obs. of 4 variables:
## $ Year : num 1946 1947 1948 1949 1950 ...
## $ Country: chr "Australia" "Australia" "Australia" "Australia" ...
## $ debtgdp: num 190 177 149 126 110 ...
## $ dRGDP : num -3.56 2.46 6.44 6.61 6.92 ...
We can see that three variables are numeric (num
)
variables which is as we expect. The fourth variable
Country
is labeled as a character (chr
)
variable, basically text. It will pay off for later to indicate to R
that this is a categorical (nominal) variable. This is done as
follows:
RRData$Country <- as.factor(RRData$Country)
str(RRData)
## 'data.frame': 1171 obs. of 4 variables:
## $ Year : num 1946 1947 1948 1949 1950 ...
## $ Country: Factor w/ 20 levels "Australia","Austria",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ debtgdp: num 190 177 149 126 110 ...
## $ dRGDP : num -3.56 2.46 6.44 6.61 6.92 ...
There are 1171 country-year observations. In the dataset are data from 20 countries:
# unique(RRData$Country)
levels(RRData$Country)
## [1] "Australia" "Austria" "Belgium" "Canada" "Denmark"
## [6] "Finland" "France" "Germany" "Greece" "Ireland"
## [11] "Italy" "Japan" "Netherlands" "New Zealand" "Norway"
## [16] "Portugal" "Spain" "Sweden" "UK" "US"
And data from 1946 to 2009.
summary(RRData$Year)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1946 1964 1980 1979 1995 2009
debtgdp
is the debt to GDP ratio
summary(RRData$debtgdp)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.279 22.148 40.401 46.283 61.474 247.482
dRGDP
is GDP growth rate
summary(RRData$dRGDP)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10.942 1.911 3.283 3.430 5.100 27.329
The typical summary
output does not include two pieces
of info which we would often like to see, the number of observations and
the standard deviation. To the rescue comes the stargazer
package.
stargazer(RRData,type="text") # other types available: "latex", "html"
##
## ==================================================
## Statistic N Mean St. Dev. Min Max
## --------------------------------------------------
## Year 1,171 1,979.155 17.944 1,946 2,009
## debtgdp 1,171 46.283 32.395 3.279 247.482
## dRGDP 1,171 3.430 2.981 -10.942 27.329
## --------------------------------------------------
Let’s have a look at some of the data. We will first look at a
scatterplot (using the amazing ggplot
library).
ggplot(RRData,aes(x=debtgdp,y=dRGDP)) +
geom_point() + # this produces the scatter plot
geom_smooth(method = "lm", se = FALSE) + # this adds the linear line of best fit
ggtitle("GDP growth v Debt to GDP ratios, Scatterplot")
Note a few points about how this graph is being called.
ggplot(RRData,aes(debtgdp,dRGDP))
Sets up the graph. The
first input (RRData
) tells R which data to use. In the
aes()
section (aes
for aesthetics) we
determine which variable should appear on the x-axis
(x=debtgdp
) and which is to go onto the y-axis
(y=dRGDP
). At this stage we havn’t actually produced a
graph yet, we just did the ground work. Then we add the graph we want,
here a scatterplot, in R geom_point()
. As you can see, the
first line and the second line are linked with a +
. This
has to come at the end of the first line so that R knows that it should
expect more information. The last line (again attached to the previous
line via a +
)
geom_smooth(method = "lm", se = FALSE)
adds the regression
line (lm
= linear model). Try what happens if you take out
, se = FALSE
.
We could actually run this regression:
mod1 <- lm(dRGDP~debtgdp, data= RRData)
summary(mod1) # this is the standard R way for results
##
## Call:
## lm(formula = dRGDP ~ debtgdp, data = RRData)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.9958 -1.5200 -0.0774 1.5707 23.6960
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.279290 0.148970 28.73 < 2e-16 ***
## debtgdp -0.018355 0.002637 -6.96 5.67e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.922 on 1169 degrees of freedom
## Multiple R-squared: 0.03979, Adjusted R-squared: 0.03897
## F-statistic: 48.44 on 1 and 1169 DF, p-value: 5.666e-12
stargazer(mod1,type="text") # a nicer way
##
## ===============================================
## Dependent variable:
## ---------------------------
## dRGDP
## -----------------------------------------------
## debtgdp -0.018***
## (0.003)
##
## Constant 4.279***
## (0.149)
##
## -----------------------------------------------
## Observations 1,171
## R2 0.040
## Adjusted R2 0.039
## Residual Std. Error 2.922 (df = 1169)
## F Statistic 48.439*** (df = 1; 1169)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
There are issues with running a regression like this. Perhaps most importantly that the observations cannot be argued to be independent. The growth rate and debt to GDP ratios of, say, the UK in 1981 is clearly not independent of the data in 1980. In particular debt to GDP ratios are slow moving data.
To illustrate this point we should look at the data as time series.
tempdata <- RRData %>% filter(Country %in% c("Germany","Greece","UK","US"))
ggplot(tempdata,aes(Year,debtgdp,color=Country)) +
geom_line() + # this produces the line plot
ggtitle("Debt to GDP ratios, seleted countries")
tempdata <- RRData %>% filter(Country %in% c("Germany","Greece","UK","US"))
ggplot(tempdata,aes(Year,dRGDP,color=Country)) +
geom_line(size=1) + # this produces the line plot
ggtitle("GDP growth, seleted countries")
You can clearly see that the debtgdp
data, from year to
year, are dependent. Also, the dRGDP
plot reveals that
there is a fair bit of correlation between the growth rates in
economies.
Running regressions with such highly correlated data is a tricky exercise for a number of reasons that are not to focus of this discussion. Perhaps the most important issue being that the two variables are obviously bi-directional related. The narrative of Reinhard and Rogoff’s advice is that there is dependence from the Debt to GDP ratio to GDP growth. It is equally apparent that GDP growth will have an impact on the debt to GDP ratio.
Dealing with these issues is not straightforward at all and here we will not pursue this any further. All that we require for the remainder is the calculation of averages.
Let’s look at the data for a few countries
tempdata <- RRData %>% filter(Country %in% c("Germany","Greece","UK","US"))
ggplot(tempdata,aes(debtgdp,dRGDP,color=Country)) +
geom_point() # this produces the scatter plot
From here you can see that different countries appear to have quite different patterns. For instance, throughout the sample period, the debt to GDP ration in Germany was rather constant, whereas Greece’s and the UK’s debt to GDP ratio varied substantially. No apparent correlation between GDP growth and the Debt to GDP ratio arises from this scatterplot.
Let’s calculate average and median growth rates and debt to gdp
ratios for these countries. To achieve this we will first group the data
by Country group_by(Country)
and then summarise the
variables dRGDP
and debtgdp
(summarise_at(c("dRGDP", "debtgdp"), ...)
) in the resulting
groups by applying the mean and median function
(funs(mean, median, sd)
).
tempdata %>% group_by(Country) %>%
summarise_at(c("dRGDP", "debtgdp"), funs(mean, median,sd)) %>%
print()
## # A tibble: 4 × 7
## Country dRGDP_mean debtgdp_mean dRGDP_median debtgdp_median dRGDP_sd
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Germany 3.31 17.8 2.88 14.9 2.91
## 2 Greece 2.93 67.7 3.39 77.5 3.11
## 3 UK 2.41 78.7 2.68 47.6 1.90
## 4 US 3.00 56.0 3.38 56.3 3.04
## # ℹ 1 more variable: debtgdp_sd <dbl>
In order to replicate some of the analysis in Reinhard and Rogoff and the subsequent critique in Herndon et al. we will want to create categorical variables for the Debt to GDP ratio. We place each observation in one of four bins or groups, depending on how large the Debt to GDP ratio in a particular country in a particular year is.
We are using the same cut-off points as those used in the empirical work.
RRData <- RRData %>% mutate(dgcat = cut(RRData$debtgdp, breaks=c(0,30,60,90,Inf)))
RRData %>% group_by(dgcat) %>%
summarise_at("dRGDP", funs(mean, median)) %>%
print()
## # A tibble: 4 × 3
## dgcat mean median
## <fct> <dbl> <dbl>
## 1 (0,30] 4.17 4.15
## 2 (30,60] 3.12 3.11
## 3 (60,90] 3.22 2.9
## 4 (90,Inf] 2.17 2.34
You can see that the mean and median growth rates differ between the different groups. And by and large the differences are such that in country-years where there were lower Debt to GDP ratios the growth rate was larger.
As a aspiring empirical economist or econometrician you understand that just comparing means is very different to claiming any causal relationship. In fact, at now stage here we will be able ot make a convincing statistical case for a causal relationship. We are working in the realm of establishing correlations only. But even there you know that the differences we see in the above table may well be the result of some random sampling variation. To establish a statistically significant relationship we typically resort to the tool of hypothesis testing.
Hypothesis testing is at the core of much empirical analysis. Here we
will show how to perform hyothesis tests. We are using the
t.test
function. Often hypothesis tests will be testing
hypotheses about the mean of a random variable. We also learn how to
perform hypothesis on regression coefficients (often these are nothing
else but ways to estimate means!).
Let’s start by testing a hypothesis on the sample mean of the GDP
growth rate (dRGDP
) Let’s say we want to test the
hypothesis that the mean growth rate is 3.3% (note that 3.3% in the data
is represented as 3.3). Let’s get one calculated and then discuss what
we really did:
t.test(RRData$dRGDP, mu=3.3)
##
## One Sample t-test
##
## data: RRData$dRGDP
## t = 1.4896, df = 1170, p-value = 0.1366
## alternative hypothesis: true mean is not equal to 3.3
## 95 percent confidence interval:
## 3.258847 3.600675
## sample estimates:
## mean of x
## 3.429761
You may remember that, in order to perform a hypothesis test you, the
applied economist, has to set the null (\(H_0\)) and alternative (\(H_A\)) hypothesis. Here you set the null
hypothesis that the mean (mu
) is equal to 3 (\(H_0\):mu=3.3
). You also need
to specify an alternative hypothesis. We didn’t specify any so R (or
better t.test
) used it’s default option, two sided
alternative (\(H_A: mu \neq 3.3\)).
The judgemment here is that it is possible that the true, yet unknown population mean is equal to 3.3. The test statistic is 1.4896 and the p-value is 0.1366. Only if the p-value is smaller than our chosen level of significance (often 0.01, 0.05 or 0.1) would we reject the null hypothesis \(H_0\).
Make sure you use the help function (type ?t.test
into
the command window) or search for help (type “R t.test” into your
favourite search engine) to understand more details of the function.
Let’s calculate a one sided test with the alternative that (\(H_A: mu > 3.3\)):
t.test(RRData$dRGDP, mu=3.3, alternative="greater")
##
## One Sample t-test
##
## data: RRData$dRGDP
## t = 1.4896, df = 1170, p-value = 0.0683
## alternative hypothesis: true mean is greater than 3.3
## 95 percent confidence interval:
## 3.28636 Inf
## sample estimates:
## mean of x
## 3.429761
Changing the alternative does not change the t-test but change the p-value. Now the p-value is 0.0683 and whether we reject or do not reject \(H_0\) depends on our chosen significance level.
At this stage a we need to mention a huge caveat to the above tests. Our standard tests are based on the assumption that the data are identically and independently distributed. In some sense we have already seen that the this assumption cannot be defended with the data at hand. We saw the time-series plot above which showed that observations are correlated between years and ountries, so they are not independent observations. But let’s, for the sake of this introduction, ignore this complication, but note that we need to interpret results with caution.
Above we sliced the data into different subsets. Let’s use the debt level categories and compare means of growth between the subsets.
temp_high <- RRData %>% filter(dgcat == "(90,Inf]")
temp_middle <- RRData %>% filter(dgcat == "(60,90]")
t.test(temp_high$dRGDP,temp_middle$dRGDP)
##
## Welch Two Sample t-test
##
## data: temp_high$dRGDP and temp_middle$dRGDP
## t = -2.7221, df = 184.28, p-value = 0.007109
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.817628 -0.290037
## sample estimates:
## mean of x mean of y
## 2.167972 3.221804
At the bottom of the output you can see the respective sample means (which replicates the sample means we printed earlier in a table). The hyothesis tested is \(H_0: mu_{high}=mu_{middle}\) with the alternative that the respective population means are different (\(H_a:mu_{high}\neq mu_{middle}\)). As above we could have elected to test an alternative of \(>\) or \(<\) as well.
So yes there seems to be some negative correlation between the level of debt and the economic growth. So what was all the fuss about?
The reason why the work by Reinhard and Rogoff was so controversial was that it wasn’t the numbers as reported above they published. They reported numbers after changing the dataset as follows:
## Selective treatment of early years
RRselective <- RRData %>% filter(!((Year<1950 & Country=="New Zealand") | (Year<1951 & Country=="Australia") | (Year<1951 & Country=="Canada") ))
RRselective <- RRselective %>% filter(!( Country %in% c("Australia","Austria","Belgium","Canada","Denmark") ))
And now we repeat the mean and median calculations we performed above on the complete dataset.
RRselective %>% group_by(dgcat) %>%
summarise_at("dRGDP", funs(mean, median)) %>%
print()
## # A tibble: 4 × 3
## dgcat mean median
## <fct> <dbl> <dbl>
## 1 (0,30] 4.24 4.4
## 2 (30,60] 2.98 3.06
## 3 (60,90] 3.16 2.85
## 4 (90,Inf] 1.69 2.33
This by itself changes the results somewhat (the mean growth rate of the highest indebted group drops from 2.17% to 1.69%) but the differences aren’t all that dramatic. It is the combination of the above and the following third change that result in a dramatic change.
RRselective2 <- RRselective %>%
group_by(dgcat,Country) %>% # create category and country groups
summarize( m1 = mean(dRGDP, na.rm = TRUE)) %>% # calculate average growth in each category
group_by(dgcat) %>% # group again by category
summarize( n = n(), mean = mean(m1, na.rm = TRUE), median = median(m1, na.rm = TRUE)) %>% # calculate mean in each category
print()
## # A tibble: 4 × 4
## dgcat n mean median
## <fct> <int> <dbl> <dbl>
## 1 (0,30] 13 4.09 4.00
## 2 (30,60] 15 2.87 2.89
## 3 (60,90] 14 3.40 2.86
## 4 (90,Inf] 7 -0.0242 1.03
This scheme delivers clearly lower average and median growth for the highest debt category.
The main difference in this new scheme is that not each country-year
receives the same weight. The authors calculate groups of data by
dgcat
and Country
and then calculate average
growth rates. That means that for instance France receives one average
growth rate for each of the three lowest debt categories (not in any
year did it exceed the 90% threshold) and Germany receives an average
growth for the two lowest debt categories (as it never exceeded the 60%
threshold in the sample period). Then the country averages in each
category are averaged again.
This may seem totally arbitrary and indeed the justification provided by Reinhard and Rogoff seems to be, at best, deficient. Having said that, it is not unreasonable to think carefully about the weighting. Previously we saw that there is a lot of persistence in the data. This may be a reason for perhaps not weighting every year as its own, equally weighted, observation.
You may also think about weighting country observations differently according to say population. Why should Finland obtain the same weight as the United States, the latter having a population more than 60 times larger than Finland. Economists and statisticians can have a great time discussing the merits of different weighting schemes, but if your results are crucially dependent on choosing a particular scheme then this should be clearly discussed.
In this piece you were guided through some of the controversy surrounding Carmen Reinhard’s and Ken Rogoff’s influential work on GDP growth and debt levels that put an important stamp on the discussion of austerity policies following the Great Finacial crisis.
The message you should draw from this is two-fold