Archive for the Category Methodology


Define “cause”

Tyler Cowen links to a couple of studies looking at the contribution of sectoral shocks to the business cycle.  Here’s one example, from a paper by Enghin Atalay:

Next, I examine whether the choice of elasticities has implications for individual historical episodes. Figure 4 presents historical decompositions for two choices of εM. In both panels, εD = εQ = 1. In panel A, I set εM = 1; and, in panel B, εM = 0.1. With relatively high elasticities of substitution across inputs, each and every recession between 1960 and the present day is explained almost exclusively by the common shocks. The sole partial exception is the relatively mild 2001 recession. In 2001 and 2002, Non-Electrical Machinery, Instruments, F.I.R.E. (Finance, Insurance, and Real Estate), and Electric/Gas Utilities together accounted for GDP growth rates that were 2.0 percentage points below trend.

Table 3, along with panel B of Figure 4, presents historical decompositions, now allowing for complementarities across intermediate inputs. Here, industry-specific shocks are a primary driver, accounting for a larger fraction of most, but certainly not all of, recent recessions and booms. According to the model-inferred productivity shocks, the 1974–1975 and, especially, the early 1980s recessions were driven to a large extent by common shocks. At the same time, the late 1990s expansion and the 2008–2009 recession are each more closely linked with industry-specific events. Instruments (essentially computer and electronic products) and F.I.R.E. had an outsize role in the 1996–2000 expansion, while wholesale/retail, construction, motor vehicles, and F.I.R.E. appear to have had a large role in the most recent recession.

Let’s think about this using an analogy.  Suppose you study the causes of cycles in house collapses.  Assume a community where 90% of houses have solid foundations, and 10% have rotten wood foundations.  Also assume that during floods the rate of house collapses rises from 7 per week to 450 per week.  A cross sectional study shows that 425 of the 450 collapsed houses during a flood had rotten foundations, while 25 had solid foundations.  This despite the fact that only 10% of overall homes had rotten foundations.

How much of the “cycle” in house collapses is “caused” by floods, and how much is caused by rotten foundations?  Show work.
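Here's a back-of-the-envelope sketch of why "show work" has no unique answer. The flood-week numbers come straight from the analogy above; the assumption that repaired foundations would have survived the flood is mine, purely for illustration:

```python
# Collapse-attribution sketch. Numbers are from the analogy above; the
# counterfactual assumptions are illustrative, not part of the original.

normal = 7           # weekly collapses, no flood
flood = 450          # weekly collapses, during a flood
flood_solid = 25     # flood-week collapses among solid-foundation houses

excess = flood - normal          # the "cycle": 443 extra collapses per week

# Counterfactual A: no floods ever happen. The entire excess disappears.
share_floods = excess / excess   # 443/443 = 100%

# Counterfactual B: every foundation is repaired. Assume (illustrative!)
# that the 425 rotten-foundation collapses would not have happened, so
# flood-week collapses fall to roughly flood_solid.
share_rot = (flood - flood_solid) / excess   # 425/443, roughly 96%

print(f"Eliminating floods removes {share_floods:.0%} of the cycle")
print(f"Fixing foundations removes {share_rot:.0%} of the cycle")
# The two shares sum to nearly 200%. Both decompositions are internally
# consistent; "how much is caused by X" depends on which counterfactual
# you pick, which is exactly the problem with sectoral-shock decompositions.
```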

The important question is: “How big would the business cycle be in a counterfactual where the Fed successfully stabilized NGDP growth?” I say “fairly small”.

Another question that is actually much less important, but seems more important to most people is: “How much of the instability in NGDP is due to monetary policy mistakes triggered by sectoral shocks, such as a decline in the natural rate of interest that the Fed overlooked, which was itself caused by a housing slump?”

When you read impressive looking empirical studies in top journals, do not assume that the authors are asking the right question.

Fiscal multiplier studies—it’s far worse than I thought

I was stunned to see a recent paper on fiscal multipliers use a 90% confidence interval, which seemed far too lenient.  After all, economics and many other sciences suffer from problems such as data mining, publication bias, and inability to replicate findings.  I’d like to see the standard statistical significance cut-off point raised from 95% to something stronger, maybe 98%.  When I did this recent post I wondered if I was making some elementary error, as econometrics is not my strong suit.

It turns out the problem is even worse than I assumed.  Indeed, Ryan Murphy recently published a study of fiscal multiplier research (in Econ Journal Watch) and found that many studies use 68% confidence bands!!

In recent decades, vector autoregression, especially structural vector autoregression, has been used to study the size of the government spending multiplier (Blanchard and Perotti 2002; Fatás and Mihov 2001; Mountford and Uhlig 2009). Such methods are used in a significant proportion of empirical research designed to estimate the multiplier (see Ramey 2011a). Despite being published in respected journals and cited by prominent members of the profession, much of this literature does not use the conventional standard of statistical significance that economists are accustomed to in empirical research.

Results in the literature on the fiscal multiplier are typically communicated using a graph of the estimated impulse-response functions. For instance, the effect of government spending on output may be reported by reproducing a graph of an impulse-response function of a one-unit (generally, one percentage point or one standard error) change in government spending. The graph would show the percent change in output over time following the change in government spending. To report statistical significance, authors of these studies may then draw confidence bands around the impulse response function. Ostensibly, if zero lies outside the confidence band, it is statistically distinguishable from zero. But very frequently in this literature the confidence bands correspond to only one standard error. In other words, instead of representing what corresponds to rejecting the null hypothesis at a 90% level or 95% level, the confidence bands correspond to rejecting the null hypothesis at a 68% level. By conventional standards, this confidence band is insufficient for hypothesis testing. Not every useful empirical study must achieve significance at the 95% level to be considered meaningful, of course, but a pattern of studies which do not use and reach the conventional benchmark is a cause for attention and perhaps concern. Statistical significance is not the only standard by which we should judge empirical research (Ziliak and McCloskey 2008). It is, however, a useful standard, and still an important one. Here I examine papers in the fiscal multiplier literature which apply vector autoregression methods. Sixteen of the thirty-one papers identified use narrow, one-standard-error confidence bands to the exclusion of confidence bands corresponding to the conventional standard of 90% or 95% confidence. This practice will often not be clear to the reader of a paper unless its text is read rather carefully.
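To make the passage concrete, here's a quick check of the two-sided coverage implied by bands of various widths (a sketch assuming an approximately Gaussian sampling distribution for the impulse response), including the stricter 98% cutoff suggested above:

```python
# Two-sided coverage of +/- k standard-error confidence bands, assuming
# an approximately normal sampling distribution for the estimate.
from scipy.stats import norm

for k, label in [(1.0, "one standard error"),
                 (1.645, "conventional 90% band"),
                 (1.96, "conventional 95% band"),
                 (2.326, "the stricter 98% cutoff suggested above")]:
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"+/- {k:5.3f} SE ({label}): {coverage:.1%} coverage")

# +/- 1.000 SE -> 68.3% coverage: a band drawn at one standard error
# implicitly tests the null at the 68% level, far below the conventional
# 90%/95% standards.
```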

I can’t even fathom what people are thinking when they use 68%.  It seems like something you’d see in The Onion, and yet apparently this stuff gets published.  Can someone help me here?  What am I missing?

Lateral thinking

Over at Econlog a few weeks ago I did a post entitled “How I Think.”  I was reminded of that post when reading commenters’ theories on the poor health outcomes of American whites aged 45-54.  In the post, I said that when evaluating Alan Greenspan’s performance, you don’t want to focus on Greenspan; you want to focus on how other central bankers did during this period (about as well).  When thinking about China’s growth prospects you don’t want to focus on everything you know about China (too complicated), but rather look at other East Asian countries.  And when deciding whether the gold inflows to Spain (1500-1650) were a resource curse that hurt long-term growth, you want to look at other Mediterranean regions that did not receive big gold inflows.

Here’s the graph:

[Graph: death rates among 45–54-year-olds, comparing US whites and US Hispanics with other rich countries]

People were providing answers that seemed to miss the big picture. The graph shows two surprising results: the poor performance of middle-aged white health after 2000, and the excellent performance of Hispanic health after 2000.  In America, Hispanics tend to be disproportionately low income/working class, the group that has been hit hardest by recent economic trends. They were also especially likely (before ObamaCare) to lack health insurance.  And yet their health is significantly better than the health of French and German citizens in the same age group.

I have no theories at all—I don’t even know if the data are accurate.  But if I was going to come up with a theory, I sure as hell would make sure it explained the sudden and massive divergence in White/Hispanic health outcomes.  If it didn’t, I’d have zero confidence that my theory was correct.

Paul Krugman has promised us an explanation in a future post.  Let’s see what he comes up with.

PS.  I don’t know about you, but to me that graph undercuts some of the recent anti-immigration hysteria.  If Hispanics are actually so inferior, so likely to degrade our precious “Anglo” civilization, how come they have such superior health outcomes?  In rich countries like America, don’t poor health outcomes often reflect poor lifestyle choices?  Just asking.

Update:  Commenter Mike Scally linked to an Andrew Gelman post that says the US data is biased about 5% upward due to compositional effects (the 45-54 age group is getting slightly older, due to boomers passing through).  So instead of rising slightly, US white (middle-aged) death rates at a given age have been essentially flat. Of course other death rates fell about 30%, so there’s still a pretty big mystery.
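Gelman's compositional point is easy to reproduce with made-up numbers (a toy sketch; every rate below is invented): hold all age-specific death rates fixed and just shift the age mix within the 45-54 bracket toward the older end:

```python
# Toy illustration of the compositional bias (all numbers invented).
# Age-specific death rates are held constant over time; only the age
# mix within the 45-54 bracket changes.

ages = range(45, 55)
rate = {a: 250 + 15 * (a - 45) for a in ages}   # per 100,000; rises with age

def crude_rate(weights):
    return sum(rate[a] * w for a, w in weights.items())

uniform = {a: 0.10 for a in ages}                    # earlier, flat age mix
older = {a: 0.055 + 0.01 * (a - 45) for a in ages}   # boomer bulge at the top

before, after = crude_rate(uniform), crude_rate(older)
print(f"crude rate before: {before:.1f}, after: {after:.1f}, "
      f"apparent rise: {after / before - 1:.1%}")
# Roughly a 4% "increase" appears even though no age-specific rate moved,
# in the same ballpark as the ~5% bias Gelman estimates.
```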

More evidence that public opinion polls don’t measure policy preferences

I frequently argue that public opinion polls on complex policy issues are almost meaningless.  (Although polls can be useful for predicting election outcomes.)  It all depends on the framing.  Here’s another study that reached the same conclusion:

We presented respondents with two different education plans, the details of which are unimportant in this context. What is important is that half the sample was told A was the Democratic plan and B was the Republican plan, while the other half of our national sample was told A was the Republican plan and B was the Democrats’ approach.

The questions dealt with substantive policy on a subject quite important to most Americans—education—and issues that people are familiar with—class size, teacher pay and the like.

Nonetheless, when the specifics in Plan A were presented as the Democratic plan and B as the Republican plan, Democrats preferred A by 75 percent to 17 percent, and Republicans favored B by 13 percent to 78 percent. When the exact same elements of A were presented in the exact same words, but as the Republicans’ plan, and with B as the Democrats’ plan, Democrats preferred B by 80 percent to 12 percent, while Republicans preferred “their party’s plan” by 70 percent to 10 percent. Independents split fairly evenly both times. In short, support for an identical education plan shifted by more than 60 points among partisans, depending on which party was said to back it.

Most polls on policy questions report little more than mood affiliation.

Update:  Here’s how Yahoo describes the charges against Dennis Hastert:

Hastert pleads not guilty in hush money case

The former House Speaker is accused of agreeing to pay $3.5M to hide past misconduct claims

Interesting that the American press is so ashamed of our country that they refuse to come right out and say that it can be illegal to withdraw cash from your own bank account, and instead feel a need to make up lies about Hastert being charged with paying hush money.

Update #2:  Et tu, Vox?

The quasi-monetarists are winning . . .

Check out this interview with Chicago Fed president Charles Evans (sent to me by JimP):

Where is the common ground on the committee right now?

Evans: The statement is fairly clear on that. We see the economy is recovering. We see inflationary pressures lower and we see the unemployment rate high and it is going to be slower to come down. With the funds rate already at zero, there is a pretty valid question as to how accommodative is monetary policy. Some people would point to the size of our balance sheet and say there is an enormous amount of accommodation. Just look at the amount of excess reserves in the system. Milton Friedman looked at the U.S. economy in the 1930s and he saw low interest rates as inadequate accommodation, that there should have been more money creation at that time to support the economy. That wasn’t based upon the narrowest measure of money, like the monetary base or our balance sheet. It was based on broader measures like M1 and M2 and how weak those measures were. I’ve come to the conclusion that conditions continue to be restrictive even though we have a lot of so called accommodation in place. An improvement would be a dramatic increase in bank lending. That would be associated with broader monetary aggregate increases. Then we would begin to see more growth and more inflationary pressures and then that would be a time to be responding.

It’s so gratifying to read this.  As you may know, a small band of us “quasi-monetarists” have been making some of these points for several years.  Most people unthinkingly assumed Fed policy was ultra-loose in 2008-09, merely because interest rates were low and the base had grown enormously.  We pointed out that the same thing had occurred during the early 1930s, and Friedman and Schwartz showed that policy was actually tight in the only sense that really matters—relative to what was needed for on-target inflation and/or NGDP.

Now we have a top Fed official saying things are actually “restrictive,” and using some of the same examples from the 1930s that we often cite.  In my view quasi-monetarism is the best way to diagnose the stance of monetary policy, as we understand that low interest rates often merely reflect a weak economy and severe disinflation.

I suppose I should try to define ‘quasi-monetarism.’

1.  Like the monetarists, we tend to analyze AD shocks through the lens of shifts in the supply and demand for money, rather than the components of expenditure (C+I+G+NX); the sketch after point 2 makes this concrete.  And we view nominal rates as an unreliable indicator of the stance of monetary policy.  We are also skeptical of the view that monetary policy becomes ineffective at near-zero rates.

2.  Unlike monetarists, we don’t tend to assume the demand for money is stable, and are skeptical of money supply targeting rules.
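As a rough way to formalize the two points (my framing, using the textbook equation of exchange rather than anything specific to these bloggers):

```latex
% Equation of exchange: M = money supply, V = velocity (the inverse of
% money demand), P = price level, Y = real output, so PY = NGDP.
\[
  M V \equiv P Y
\]
% Point 1: AD shocks show up as movements in M or V, not in C+I+G+NX per se.
% Point 2: if V is unstable, a fixed money-supply rule cannot stabilize PY,
% which is consistent with the focus on stabilizing NGDP (= PY) directly.
```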

Unfortunately there are almost as many nuances to quasi-monetarism as there are quasi-monetarists. I’ll list a few names in the blogging community, with apologies to those who don’t want to be included, and those I leave out accidentally.  I think of monetary bloggers like Nick Rowe, David Beckworth, Bill Woolsey, Josh Hendrickson, myself, and I’m sure there are others.  Some of my frequent commenters have their own blogs, but if I try to list everyone the omissions will just become more noticeable.  If you have a blog and consider yourself a quasi-monetarist, leave your name in the comment section and I’ll add it here:

Update:  Marcus Nunes (who has a Portuguese-language blog).  Also commenter “123,” who has a blog entitled “TheMoneyDemand.”

I think in the long run quasi-monetarism will merge with monetarism, and become one big school of thought with different perspectives.  If Stephen Williamson hadn’t already taken “new monetarism,” that might be the right term.  But his perspective and methods are quite different.

HT:  Liberal Roman