Any thoughts on Zipf’s Law?

I don’t have time to do posts on what I’d really like to talk about, the new Krugman and Wells article or Russ Roberts’ piece on banking.  So I’ll defer those until after my trip to Oxford, and instead do a short fun piece on Zipf’s Law.  Well, fun for nerds like me, who find descriptive statistics to be endlessly fascinating.  (Not the other kind of statistics.)

Mankiw recently linked to an Edward Glaeser article on Zipf’s Law, which reminded me of a table of skyscraper statistics.  It may be that everything I say is already widely known, and fully explained.  If so I can count on my very smart commenters to point that out.  For those who don’t know, Zipf’s Law says that in a ranking of entities by size, the second on the list will be about 1/2 the size of the first, the third will be 1/3 the size of the first, the 10th largest will be one tenth the size of the first, etc.  A good example is the population of US cities:

Rank  City         State       Population
1     New York     New York    8,363,710
2     Los Angeles  California  3,833,995
3     Chicago      Illinois    2,853,114
4     Houston      Texas       2,242,193
5     Phoenix      Arizona     1,567,924
7     San Antonio  Texas       1,351,305
8     Dallas       Texas       1,279,910
9     San Diego    California  1,279,329
10    San Jose     California    948,279
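For anyone who wants to check the arithmetic, here is a minimal Python sketch that compares each city in the table with the simple Zipf prediction (largest city's population divided by rank). The function names `zipf_prediction` and `compare_to_zipf` are just illustrative choices of mine, and rank 6 is skipped because it does not appear in the table above.

```python
# Populations copied from the table above (rank 6 is absent there, so it is skipped here too).
cities = [
    (1, "New York", 8_363_710),
    (2, "Los Angeles", 3_833_995),
    (3, "Chicago", 2_853_114),
    (4, "Houston", 2_242_193),
    (5, "Phoenix", 1_567_924),
    (7, "San Antonio", 1_351_305),
    (8, "Dallas", 1_279_910),
    (9, "San Diego", 1_279_329),
    (10, "San Jose", 948_279),
]


def zipf_prediction(top_size, rank):
    """Size predicted by the simple form of Zipf's Law: top_size / rank."""
    return top_size / rank


def compare_to_zipf(rows):
    """Return (rank, name, actual, predicted, actual / predicted) for each row.

    Assumes the first row is the rank-1 entry.
    """
    top_size = rows[0][2]
    results = []
    for rank, name, actual in rows:
        predicted = zipf_prediction(top_size, rank)
        results.append((rank, name, actual, predicted, actual / predicted))
    return results


if __name__ == "__main__":
    for rank, name, actual, predicted, ratio in compare_to_zipf(cities):
        print(f"{rank:>2}  {name:<12} {actual:>10,}  {predicted:>12,.0f}  {ratio:.2f}")
```

A ratio of 1.00 means the city is exactly the size Zipf's Law predicts. For this table the ratios all fall between roughly 0.9 and 1.4.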

I find this kind of spooky, as these cities grew spontaneously.  Note that if you look at this Wikipedia list you will find that the big cities are actually a bit too small when compared to cities ranked 20, 30, etc.  The Glaeser article shows that metro populations have the same problem–big cities are a bit too small.

Does this work for other countries?  My hunch is that it works for smaller cities in most countries, but the bigger cities are often much less perfectly representative of Zipf’s Law.  Germany, for instance, has no dominant NYC-type city, but rather a half dozen cities with about 2 million people.  This may reflect the fact that Germany was once a collection of independent states.  To make this point another way, I don’t think that Zipf’s Law even comes close to working at the world level.  I suppose Tokyo is the largest metro area (30-35 million?), but there must be at least a dozen metro areas of at least 16 million.  If not, there soon will be.

I got to thinking about Zipf’s Law when I ran across this data for cities with the most skyscrapers:

1   84922  Hong Kong
2   35811  New York (inc Jersey City, Fort Lee, Guttenberg)
3   19670  Tokyo
4   18129  Shanghai
5   16426  Chicago
6   15262  Dubai
7   13375  Bangkok
8   10368  Guangzhou
9    8849  Chongqing
10   7923  Shenzhen
11   7697  Singapore
12   7674  Kuala Lumpur
13   7195  Seoul
14   6598  Manila
15   6053  Toronto (inc. Mississauga)
16   5590  Jakarta
17   5473  Osaka
18   5371  Beijing
19   4909  Miami (inc Miami Beach)
20   4903  Nanjing
22   4693  Sydney (inc. N. Sydney, Chatswood, Bondi, St. Leonards)
23   4411  Moscow
24   3978  São Paulo
25   3842  Los Angeles (inc Burbank, El Segundo, W Hollywood)
26   3626  Melbourne
27   3564  Atlanta (inc Vinings, North Atlanta)
28   3435  San Francisco
29   3274  Panama City
30   2959  Wuhan

50   1993  Taipei

90    823  Ankara

This is actually far better than the US city population example, as it holds pretty well throughout the entire 1 to 90 range.  Indeed the top of the list looks almost spookily like the US population data.
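One rough way to check how well the list holds up over the whole range is to regress log(score) on log(rank). Under the simple form of Zipf's Law the slope should be close to -1. The sketch below is only a back-of-the-envelope check, not a proper power-law test: it uses ordinary least squares (a convenience I am assuming here, and one that is known to be imperfect for rank-size data), and the name `loglog_slope` is mine. The figures are copied from the list above, which has no entry for rank 21.

```python
import math

# (rank, score) pairs copied from the list above; rank 21 is absent there.
skyscrapers = [
    (1, 84922), (2, 35811), (3, 19670), (4, 18129), (5, 16426),
    (6, 15262), (7, 13375), (8, 10368), (9, 8849), (10, 7923),
    (11, 7697), (12, 7674), (13, 7195), (14, 6598), (15, 6053),
    (16, 5590), (17, 5473), (18, 5371), (19, 4909), (20, 4903),
    (22, 4693), (23, 4411), (24, 3978), (25, 3842), (26, 3626),
    (27, 3564), (28, 3435), (29, 3274), (30, 2959),
    (50, 1993), (90, 823),
]


def loglog_slope(pairs):
    """Ordinary least-squares slope of log(size) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(size) for _, size in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # A slope near -1 is what the simple form of Zipf's Law predicts.
    print(f"log-log slope: {loglog_slope(skyscrapers):.2f}")
```

Running the same function on just the first ten entries, `loglog_slope(skyscrapers[:10])`, gives a quick sense of whether the top of the list behaves differently from the rest.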

At this point you might be thinking: so what?  After all, skyscraper intensity is presumably correlated with population.  But that’s exactly the problem.  At the world level, metro area populations don’t even come close to following Zipf’s law, at least for the largest metro areas.  Also notice that Hong Kong is not among the top 30 world metro areas, and Dubai is not in the top 100.  On the other hand the list includes only three cities from Europe and Latin America, which have lots of big metro areas.  Nor does it seem related to level of development.  Poor Asian cities often have many more skyscrapers than Japanese cities of equal size.  (Compare Osaka to the big cities in China and SE Asia.)

And finally, is this just a fluke, or is there some underlying reason for the pattern?  Skyscrapers are being built at a furious rate in Asia, especially China.  I’m pretty sure that the nice Zipf’s Law pattern for the top 10 will break down in a few decades.  Enjoy it while it lasts.  Any thoughts would be appreciated.  Apologies if I have merely reproduced what is already widely known.

BTW, I tried to model the failure of Zipf’s Law for world metro populations by considering a model of the world as a limitless plain, where a combination of pre-modern transport constraints, language regions, and nationalism, created lots of similar size countries, each having major capital cities of roughly equal size.  As the number of countries approached infinity, I’d expect Zipf’s law to do very poorly.  Unfortunately, although Zipf’s law doesn’t work for world metro pops, it does work for world country populations.  And in a hundred years it will probably work even better for country pops, as the largest countries are expected to have:

India:  1.5 billion

China:  750 million

USA:  500 million

Explain that!

Part 2.  More on NGDP targeting.

I’ve argued that once you start thinking in terms of NGDP, it’s hard to avoid evaluating monetary policy in terms of changes in NGDP, or M*V.  And once you start using NGDP as an indicator of policy, it’s hard to avoid the next logical step, which is that the Fed should target NGDP.  Matt Yglesias has been recently using NGDP as evidence of the need for more stimulus.  Now he has taken the plunge, and gone for NGDP targeting:

It’s probably worth observing that the dual mandate is arguably conceptually incoherent. In normal times, the Fed only uses one policy instrument so it can’t really be targeting two things. The main practical upshot of the dual mandate is that it’s impossible to say for sure whether or not the Fed is meeting it. If you gave the Fed a single clear mandate, keep M*V growing at a steady rate of approximately such-and-such, then Congress and the President could specifically say whether or not the Fed was executing its mission.

It’s good to have left-of-center pundits on board.  As I have said, this idea should appeal to liberals who worry that inflation targeting gives too little attention to unemployment, but are also knowledgeable enough to realize that the Fed can only hit one target at a time.

Part 3.  Immigration and housing

Ryan Avent recently cited my earlier post linking the 2007 crackdown on immigration and the housing bust, and then added the following comments:

I think you want to be careful about assigning too much causation to this factor, but demand is demand, and a negative shock to expected growth in housing demand from immigrants certainly wouldn’t have helped matters in 2007. Here in the Washington area Prince William County, in the Virginia suburbs, adopted particularly draconian immigration-status check rules during the late stages of the bubble, and it subsequently experienced some of the largest declines in real estate values.

Along these lines, here’s Richard Green:

“My colleague Dowell Myers points out that for the housing market in the US to remain healthy, we must “cultivate new immigrant residents.” Arizona’s new law, which would require immigrants (legal or otherwise) to “carry papers” creates what I would consider to be an atmosphere of hostility to immigrants–all immigrants. I am also awaiting the spectacle of a police officer demanding the “papers” of a native-born Latino.

In any event, people have a propensity to go where they feel welcome, and avoid places where they are not. Hostility to immigrants in general and Latinos in particular seems to be a political loser in California, so Arizona’s policies may lead to higher demand for houses in California.”

I buy what he’s selling. And consider that Phoenix home values have declined 52% from their peak, are still off on a year-over-year basis, and declined in both January and February of this year. As Mr Sumner put it, now might not be the optimal moment to send out a signal to property markets that Hispanic immigration is about to slow sharply.

I agree, and would add that I didn’t mean to suggest that the immigration slowdown caused the entire housing bust, I think the bubble/bust was mostly due to previous errors by private lenders, F&F, moral hazard, and then later in 2008 by tight money.  But immigration probably played a non-trivial role in southwestern markets.

17 Responses to “Any thoughts on Zipf’s Law?”

1. E. Diaz
30. April 2010 at 08:00

When I visited Hong Kong in 2002 one of my cab drivers said that most of the high rises were places of residence. Could it be that residents in Asian nations are less averse to living in high rises?

2. Bill Woolsey
30. April 2010 at 08:02

Growth path, not rate.

We need to work on Yglesias a bit more. But it is good news.

Good luck at Oxford

3. Mike Sandifer
30. April 2010 at 08:08

Scott,

I’d already been thinking in this direction with regard to neural networks. I believe the answer lies in small world networks (scale free), which follow power laws, such as Zipf’s. We’re finding these clustered networks in very many systems, including the internet, the brain, river systems, etc.

http://en.wikipedia.org/wiki/Small-world_network#Examples_of_small-world_networks

http://en.wikipedia.org/wiki/Scale-free_network

http://en.wikipedia.org/wiki/Power_law#Examples_of_power_law_functions

4. Mike Sandifer
30. April 2010 at 08:09

5. Dan Carroll
30. April 2010 at 08:57

When looking at US city population, I suggest looking at CSA’s, as city limits are just lines on a map. CSA’s vary depending on how they are defined.

1. New York City – 22.0 m
2. Los Angeles – 17.8 m (81%)
3. Chicago – 9.8 m (45%)
4. DC – 8.1 m (37%)
5. San Francisco – 7.4 m (33%)

6. Doc Merlin
30. April 2010 at 10:43

Zipf’s isn’t surprising to me at all.

Any time you have a statistical process (not dependent on size) with the same process generating it, you will get a power law when you look at the characteristic vs frequency. It is not freaky, it is really, more or less, the null result. It’s only interesting if, when you plot it on a log(the characteristic) vs the frequency, you get kinks in the line. The kinks mean that something interesting is happening at different characteristic sizes/frequencies.

I haven’t done it, but I suspect that if you look at log size of a company’s employment rolls vs number of companies, you will see kinks at 100 employees, 50, ~5, and then again near 1. Because the generation process and laws for companies change at those sizes.

7. Marc Cooperman
30. April 2010 at 10:49

Zipf’s law is a ‘power law’. It’s the same thing at work in the ‘fat tails’ of financial returns. The fallacy of the ‘normal curve’ (bell curve) in financial returns played a big part in the market disruptions since 2007. The emergence of the power law (i.e. the observed behavior of the returns in the tails of the distribution) is probably an artifact of the population of capital pools (traders). In English, there are relatively fewer pools of larger size, and when they trade, they move the market. I suspect that what you are seeing in skyscrapers is related to the amount of credit available to build them.

8. Felix
30. April 2010 at 13:31

“Germany, for instance, has no dominant NYC-type city, but rather a half dozen cities with about 2 million people.”

I hope the things you tell us about the depression have a stronger basis in reality. Zipf’s law seems to fit German cities quite well.

http://en.wikipedia.org/wiki/List_of_cities_in_Germany_with_more_than_100,000_inhabitants

9. scott sumner
1. May 2010 at 07:01

E. Diaz. Yes.

Bill, I agree. In fairness to Yglesias, I recall in a previous post he had a graph showing NGDP far below trend, and talked about the need to catch-up.

Mike, Yes, it applies to a wide range of cases.

Dan, I agree, and as you point out the pattern for big cities is not quite as nice as when using city limits. Glaeser’s article shows it works for smaller cities even using metro areas.

Doc, Why does it apply to metro areas in the US, but not world metro areas? You don’t get the result “any time”.

Marc, There is far more credit to build skyscrapers in Japan than HK. I find it interesting that the law works well despite very different building codes in different parts of the world. I still wonder if it is just coincidence, and will be gone after 30 years.

In my view the non-normality of investment returns is related to the non-normality of underlying fundamental shocks to the economy.

Felix, Thanks for correcting me. Oddly it doesn’t work well for cities 6 through 15 on that list, which all have about the same population. I was probably thinking about German metro areas, but as my list for the US was cities, that’s a pretty lame excuse on my part. For what it’s worth, here are the metro area pops:

http://en.wikipedia.org/wiki/Demographics_of_Germany

Note that if you treated Koln and Dusseldorf as separate cities (which seems reasonable), then there would be six big metro areas in Germany, each with 4 to 6 million people.

I was relying on memory, and forgot that the city populations in Germany are much different from the metro areas. Traveling in Germany one does not get the sense of there being a dominant city a la Paris or London. I should not have relied on memory.

10. Doc Merlin
3. May 2010 at 10:43

‘Doc, Why does it apply to metro areas in the US, but not world metro areas? You don’t get the result “any time”.’

US metro formation is very organic, and with a few exceptions the processes for forming a city are very similar across the states.

In the rest of the world, I imagine the generating processes for city growth are very different across countries, so one country’s Zipf’s law wouldn’t necessarily apply to another country.

11. rob
3. May 2010 at 12:00

i dont see how slowing poor immigrants would help a housing bust. it seems like it would be the other way around. in the SW at least, illegal immigrants are the ones building all the new homes. if people expected that cheap labor to dry up, house prices for middle to high range houses should go up since labor costs would go up. no?

12. rob
3. May 2010 at 18:14

I’ll add, i would think recent immigrants would affect rental rates more than home prices, yet rental rates did not increase like home prices. in fact, one of the key measures used by those claiming home prices had risen too far was the high ratio of owning vs. renting. cooks, construction workers and day laborers were not buying a lot of houses. at least it’s hard to imagine they were.

13. rob
3. May 2010 at 18:36

a final point: i think you have the cause and effect reversed. as the housing market slowed, construction jobs dried up and many migrants, with no work, went back home.

14. scott sumner
4. May 2010 at 06:50

Doc, I think it is much simpler. Suppose there are diseconomies of scale beyond some point. Then as you have more and more countries there will be a logjam of big cities (which are often capitals of each country) but no super-big cities. There is no city in China, India or the US that has anywhere near the share of our population that the London and Paris metro areas have in their countries. So very big countries appear like more than one country, with multiple big cities instead of just one.

If the world was 10 times bigger, with 10 times as many countries as now, the biggest metro area would still be about 35 million, but there would be 10 times more cities in the 10 to 20 million range. I think that’s why Zipf’s Law doesn’t work for world metro areas.

rob, There was no housing price boom, rather it was land prices that boomed. Housing construction is a near constant cost industry, so immigration affects the demand-side much more than the supply-side. The housing boom in the southwestern US was most intense in areas where many hispanics were buying homes.

Also recall that it doesn’t matter if immigrants buy new or used homes, just that they add to demand. If immigrants buy crummy 2 bedroom ranch houses built in the 1950s, that pushes the (native born) residents of those houses out into newer homes. What matters is the total demand for housing.

I agree with your last point—causality went both ways.

15. Michael F. Martin
6. May 2010 at 12:03

Scott,

You should see Sornette’s work on “dragon kings”

http://technologyreview.com/blog/arxiv/23935/

16. ssumner
7. May 2010 at 06:46

Michael, Thanks. That was an interesting article.

17. Brendan Darcy
10. April 2012 at 03:30

Scott
Zipf’s law might explain why we are alone in the universe. The most common score in cricket is a duck (that’s zero for you North Americans). The second most common score is 1, the third is two and so on. Given that we exist our universe cannot be a duck (the number of civilisations is at least 1) but perhaps we can postulate the expected number using Zipf’s law??
B