Putting history back into economic history

One of the many things that made me become disenchanted with economics was the way that economists relied more and more on purely technical approaches to complex real world problems.  It’s like studying biology with a physicist’s theoretical toolkit.  Yes, physics underlies biological processes, but there is so much more.

I was particularly disturbed when I saw economists just grab a data set from the Great Depression, and think that they could study the Depression without actually knowing what was going on.  “It’s all in the data.”  Actually it isn’t.  You have to know which data are appropriate, which are accurate, and which variables are exogenous.  Numerical data alone can never answer those questions.

I thought of these issues recently when I came across the following post by Bill Mitchell.  The entire last part of the post is devoted to demolishing a paper I did back in 1995 with Steve Silver.  Unfortunately the critique falls flat, because Professor Mitchell doesn’t seem to know the historical background of the data he uses.

There are three major problems with his analysis.  He uses annual data where monthly data is needed, he uses employment levels where output is needed, and when he does use monthly output data, he uses a measure of industrial production that is so flawed as to be worthless.

The easiest way to explain this issue is by focusing on the largest of FDR’s 5 wage shocks, the President’s Re-employment Agreement of late July 1933.  Here’s what happened in 1933:

Output was severely depressed in early 1933, as a gold panic occurred during the Presidential interregnum.  Between March and July real wages plummeted by about 14% as the WPI rose by a roughly equal amount.  (This was caused by dollar depreciation against gold.)  Nominal wages were flat.  Industrial production soared by 57%, regaining half the ground lost in the previous 3 1/2 years.  We should have been out of the Depression by late 1934 or 1935.  Yet FDR couldn’t leave well enough alone.  He ordered industries to cartelize and firms were ordered (actually strongly pressured) to raise wages by 20% and cut hours by 20%.  Output immediately started to fall rapidly and didn’t reach July 1933 levels again for another two years.  This is the big wage shock that Mitchell claims helped the recovery.

Then in late 1933 the gold-buying program gradually started to push prices higher, and real wages fell a bit, allowing the recovery to resume.  The same mistake was made in mid-1934, with the same results.  In late 1936 and 1937 wages also rose rapidly because of union drives associated with the Wagner Act and FDR’s big win.  This time prices were also rising fast, so the collapse in output was delayed until later in 1937, when real wages soared.  The minimum wage increases of late 1938 and late 1939 also halted promising recoveries.

Mitchell somehow thinks the way to test the effect of the NIRA was to use employment, not output, even though the policy was designed to reduce hours worked by 20%, which would obviously affect output, not the number of jobs.  (Ironically, FDR’s program was designed to reduce hours and output.  So if Mitchell claims it did not, then he is claiming FDR’s program failed to achieve its announced goals.)

If you look at my description of 1933 (or any of the wage shocks) you will immediately see why annual data is almost worthless.  Often you would see sharp rises and falls in both real wages and output occur within a single calendar year.  You must use monthly data to see what is going on.  And you must use output data, not employment.
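
To see how annual averaging can hide this kind of within-year swing, here is a minimal sketch in Python.  The numbers are made up for illustration and are not historical data, and the names `year1`, `year2`, `annual_mean` and `pct_change` are hypothetical.  It only shows the arithmetic of the problem, not anything about the actual 1933 series.

```python
# Illustrative only: synthetic monthly output indexes, NOT historical data.
# Year 1 is flat; year 2 booms through July and then falls back,
# yet both years have the same annual average.
year1 = [80] * 12
year2 = [60, 62, 64, 70, 82, 94, 100, 92, 88, 84, 82, 82]


def annual_mean(monthly):
    """Average of the 12 monthly readings, i.e. what annual data records."""
    return sum(monthly) / len(monthly)


def pct_change(start, end):
    """Percent change from start to end."""
    return 100.0 * (end - start) / start


# Annual view: year-over-year change in the annual average.
annual_view = pct_change(annual_mean(year1), annual_mean(year2))

# Monthly view: March-to-July rise and July-to-December fall within year 2.
march, july, december = year2[2], year2[6], year2[11]
boom = pct_change(march, july)
bust = pct_change(july, december)

print(f"annual change: {annual_view:.1f}%")
print(f"March to July: {boom:.1f}%")
print(f"July to December: {bust:.1f}%")
```

With these made-up numbers the annual figure shows no change at all, while the monthly figures show a rise of more than 50% followed by a fall of nearly a fifth.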

Unfortunately we don’t have monthly GDP data, but fortunately we do have monthly data for industrial production, which is far more cyclical than agriculture or services.  Although industrial production rises and falls more sharply than overall RGDP, the pattern is similar and industrial production is closely correlated with many other cyclical indicators.

To his credit, Mitchell did attempt to replicate my results with industrial production data.  Unfortunately he relied on the Miron/Romer data.  It pains me to say this, as I respect both economists greatly, but their IP series can only be described as extremely bizarre.  If you don’t know much about the Great Depression you will have to take this partly on faith, but I’ll try to sketch out a few reasons:

The Fed’s IP series shows a big boom in the first 7 months of 1929, and then a sharp depression late in the year.  And this correlates with everything we know about 1929, from industry data, from news reports about the economy in the NYT and WSJ, from other market indicators like commodity and stock prices, with everything.  Almost everyone believes there was one of the great booms in American history in the first 7 months of 1929.  Except Miron/Romer.  They show a near depression, with IP falling almost 20%.  Even worse, they show IP rising after August 1929 and into early 1930, when almost all the data we have from autos, steel, rail shipments, you name it, shows the economy plunging into a deep depression.

And it’s not just 1929.  Unfortunately this bizarre behavior continues all through the Depression.  They have IP flat in 1931, when both the Fed and all the contemporary accounts show a disastrous fall.  (Indeed on the graph in Mitchell’s post December 1931, a dreadful time, seems about the same as the cyclical peak of August 1929.)  They have IP nearly doubling in about 2 months during late 1934.  They have industrial production no higher in late 1936 and early 1937 than in some months in 1933 and 1934.  Much worse, they show industrial production flat in the last part of 1937, repeating the embarrassing mistake of late 1929.  And I could go on and on.  The series is full of sharp fluctuations that don’t seem to correlate with anything going on in the real world.  I have to assume that they had a grad student do the data collection and manipulation.  No serious student of the Depression could take this IP series seriously.

Why does this matter?  In our paper the (negative) correlation between real wages and IP was so obvious when you look at the graph that no amount of tweaking could change the result.  When Mitchell reversed the result with Miron/Romer data it should have been a red flag.

I’ll just answer a few more charges.  Mitchell accuses us of making some arbitrary decisions like using IP and using a 4 month window for the effects of wage shocks.  He also asks why I looked at only those 5 fluctuations in IP in my post, as there are many others.

1.  Even Mitchell agrees that there are lots of high frequency IP shocks.  So we needed monthly data, and IP data was the best cyclical indicator we could find.

2.  Given that many of the high frequency shocks occur within a calendar year (even in the graph he shows) it seemed reasonable to use a four month window for my blog post.  One month is obviously too short to avoid the ‘noise’ problem and 12 months would cover several cycles.  The findings in my post are not that sensitive to the choice of 4 months; windows of 2, 3, 4, 5, 6, 7, or 8 months would all show a similar pattern, but the absolute changes would vary.

3.  I looked at those 5 IP shocks because I was examining the impact of nominal wage shocks in that post, and there were only 5 big nominal wage shocks in the New Deal.  Somehow he is thinking about the issue backward.  He seems to think I was investigating IP shocks, and claiming they were all caused by wage shocks.  I’m not an RBC economist.  I have a whole book manuscript showing that most of the variation of IP during the Depression was caused by monetary shocks, not wage shocks.
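
The window-sensitivity check described in point 2 could be sketched roughly as follows.  This is a minimal illustration in Python under stated assumptions: the `output` series is synthetic and not historical data, index 0 stands for the month of a hypothetical wage shock, and `change_after` is a made-up helper, not the procedure used in the 1995 paper.

```python
# Illustrative only: a synthetic monthly output index, NOT historical data.
# Index 0 is the month of a hypothetical wage shock.
output = [100, 96, 91, 88, 86, 85, 85, 86, 88]


def change_after(series, shock_idx, window):
    """Percent change in the series from the shock month to `window` months later."""
    start = series[shock_idx]
    end = series[shock_idx + window]
    return 100.0 * (end - start) / start


# Try every window from 2 to 8 months and compare the results.
results = {w: change_after(output, 0, w) for w in range(2, 9)}

for w, change in results.items():
    print(f"{w}-month window: {change:+.1f}%")

# The sign agrees across windows if all changes fall on the same side of zero.
same_sign = all(c < 0 for c in results.values()) or all(c > 0 for c in results.values())
print("same sign for every window:", same_sign)
```

In this made-up series the size of the decline differs from window to window but the direction does not, which is the kind of robustness the four month choice is meant to have.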

I don’t mind people criticizing my work on the Depression; I don’t doubt that there are lots of flaws in it.  But you first need to familiarize yourself with the stylized facts, and the nature of the public policy shocks you are investigating.  You can’t just grab some data and run a bunch of regressions.  Unfortunately I think most of the profession disagrees with me on this issue.  In my view the entire field of macroeconomics should be thought of as a branch of economic history, and should be taught that way.  Try getting tenure in an elite program with that attitude.

HT:  Nicholas Blanchard

PS.   The last sentence doesn’t mean that there aren’t good macro historians at elite programs.