CHAPTER 5

The Size-Density Hypothesis

We shall not cease from exploration
And the end of all our exploring
Will be to arrive where we started
And know the place for the first time.
— T. S. Eliot, Four Quartets - Little Gidding

The next phase of the research was less pre-planned than the work I had done so far. After arriving at Western Washington University, I was browsing through a world atlas^[1] when I began to wonder about the administrative divisions of some of the nations shown there. Would the model developed to account for size variation in U.S. counties have anything to say, for example, about variation in county size in the British Isles? What about the départements of France? The provincias of Spain?

I couldn't approach such a problem directly because I didn't have the kinds of historical maps used in the case of the United States, Figs. 4-1 and 4-2. Still, some aspects of a map like Fig. 5-1 were suggestive. Ireland was originally divided into counties in 1167 when Henry II established English courts at Dublin. There is a rough size-distance relation, county size increasing with distance from Dublin. Northern Ireland, which only separated from the remainder in 1922, became more heavily populated with modern industrialization, but the presence of a size-distance relation suggests that Ireland may once have fit the historical U.S. pattern.

FIG. 5-1. BRITISH ISLES

A similar observation can be made for England and Wales. The English counties emerged gradually from ancient kingdoms. Those of Wales were created shortly after 1284 when Edward I placed Wales under the English crown. Again, it is apparent that there is a size-distance relation for the region as a whole, with London as its focus, larger counties to the north and southwest. In Scotland, where "sheriffdoms" were established in the 12th century, the determining factor is clearly geography; the lightly settled highlands contain much larger counties than the more intensely settled lowland region.

The difficulty is already apparent. Testing the model on other nations requires information which may be lost in prehistory. Even in the cases just mentioned, there is not sufficient demographic and cartographic information to answer whether county division in the British Isles was in any sense similar to county division in the United States. I had to develop an alternative approach, something other than the historical-geographic, if I wanted to study administrative divisions of other nations.

I asked myself what data available today would enable me to test the model developed in Chapter 3. If a long-term process of subdivision had produced administrative boundaries, what relationships would I expect to observe at present? The size-distance relation would be one such finding, but it need not show up elsewhere. It had, after all, disappeared in areas of the United States fully settled prior to the introduction of the automobile.

If one or more seats of power had once been the focus of size-distance relations, and these had all been brought together in a larger territorial unit, what would one expect to see at present? In general, it seemed to me, regions which had already been intensely settled should continue to add to their settlement intensity. Regions more lightly settled in the past should still be relatively lightly settled. This was what I later came to call the "size-density hypothesis" — smaller units should show up in regions of higher population density, with larger units in regions of lower density.

The relation between the new hypothesis and the old is shown in Fig. 5-2. Fig. 5-2-A is the same as Fig. 3-8-I, representing expansion of the settlement area from the original settlement area (lower left-hand corner). There appears to be a size-distance relation since the partially settled units are not yet fully subdivided. Assuming that older areas continue to grow, their densities would increase over time, whether or not subdivision had gone to completion. If this were the case, the result would be as shown in Fig. 5-2-B, with yellow-orange shades indicating higher density. Fig. 5-2-C is what we would expect if we computed densities directly from data obtained from modern territories (density = population/area).

FIG. 5-2. SIZE-DIVISION AND SIZE-DENSITY

The situation shown in 5-2-A does not require the situation shown in 5-2-C. But 5-2-C would be consistent with 5-2-A, assuming continued growth of the original settlement area (the lower left of 5-2-A). A size-density relation (assuming there is any variation in size to begin with) is consistent with the notion of an earlier process of subdivision in response to changes in population distribution.

The data needed to test the size-density hypothesis were available in the Britannica World Atlas, the maps of which had originally led me to wonder about international application of the division model. There is a section of the Atlas, "Geographical Summaries", which gives a variety of statistical information for nations from Aden to Zambia. Part of a typical entry is illustrated in Fig. 5-3, the territorial divisions of Algeria.

FIG. 5-3. ATLAS ENTRY FOR ALGERIA ^[2]

This shows the size and density for each territorial unit. By inspection, the first département, Alger, has the highest density and the smallest size of any département. Among the regular départements, Saïda has the lowest density and the largest size. Including the Saharan départements, we find still larger sizes together with still lower densities.

The charts that follow were generated with StatView, though there are many equivalent applications now. For a look at how I did much of the work originally, you might want to look at Data Processing in the '70s and Later.

If we attempt to graph the relation between size and density overall, however, as in Fig. 5-4, we encounter a difficulty. The largest of the units, Oasis, is so large that nearly all the others are pushed down along the horizontal axis. It is difficult to observe variation in size, relative to that of the largest unit. By the same token the highest density, that for Alger, is so high that it pushes most of the other dots together near the vertical axis.

FIG. 5-4. SIZE AND DENSITY IN ALGERIA

What produces such extremes? The theory says that increasing density should be associated with further subdivision. Yet there is no doubt a lower limit to size beyond which subdivision ceases. If a large region of a country remains unsettled while all the rest is more-or-less settled, we obtain extreme values for size. A related argument can be developed for density. Most regions would show whatever density is "normal" for their country. If one area becomes a focus of migration, attracting more and more population to itself, it would "out-distance" the normal units, resulting in extreme density. This appears to have been the case with the Algerian département containing Algiers, the national capital.

One way of reducing the effects of these extreme values is to toss them out of the analysis. While this might make some sense in the case of Algeria's Saharan départements,^[3] it is hard to justify throwing out Alger solely because it is highly dense. That, after all, is what the theory is about: variation in size being related to variation in density. We can retain extreme cases without compacting the rest of the distribution by logarithmic transformation, as in Fig. 5-5.

FIG. 5-5. LOGARITHMIC TRANSFORMATION OF FIG. 5-4
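
The effect of the transformation can be sketched in a few lines of Python. This is an illustration only: the four area and density values are invented to span several orders of magnitude and are not the Atlas figures for Algeria, and the use of base-10 logarithms is an assumption. The helper names are hypothetical.

```python
import math

# Hypothetical (area, density) pairs; NOT the Atlas values for Algeria.
units = [(1_000, 400.0), (5_000, 60.0), (20_000, 15.0), (1_000_000, 0.3)]

def log_transform(pairs):
    """Return (log10 area, log10 density) for each (area, density) pair."""
    return [(math.log10(area), math.log10(density)) for area, density in pairs]

def spread_ratio(values):
    """Largest value divided by smallest: how far the extreme dominates a linear axis."""
    return max(values) / min(values)

raw_areas = [area for area, _ in units]
log_areas = [log_area for log_area, _ in log_transform(units)]

# On the raw scale the largest unit is many times the smallest,
# so most dots are squeezed against the axis.
print(spread_ratio(raw_areas))

# On the log scale the same units differ by only a few orders of magnitude,
# so every unit remains visible on the graph.
print(max(log_areas) - min(log_areas))
```

The extreme units stay in the data; only the spacing between them changes.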

Logarithmic transformation produced an observation which I had not originally anticipated^[4]: the scatter of data points now formed an approximately straight line. My argument thus far had been only that higher densities ought to be associated with smaller territorial size. The form of the relation had not been specified. It now appeared that the "negative-ness" of the relation could be specified quantitatively, as the slope of a line fitting the scatter of data points.

FIG. 5-6. SIZE-DENSITY REGRESSION LINE FOR ALGERIA

Such a line is shown in Fig. 5-6; its formula was obtained through ordinary least-squares regression analysis. Incidentally, dropping out the two desert départements results in only slightly different values for the intercept and slope (a = 5.217, b = -0.62).
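
For readers who want to reproduce this kind of fit, here is a minimal sketch of a least-squares line through the logarithms of size and density. It is not the original procedure. The function name `fit_log_log` is hypothetical, base-10 logarithms are assumed, and the data are synthetic values constructed to lie exactly on a -2/3 line, not the Algerian figures.

```python
import math

def fit_log_log(areas, densities):
    """Least-squares fit of log10(area) = a + b * log10(density); returns (a, b)."""
    xs = [math.log10(d) for d in densities]
    ys = [math.log10(s) for s in areas]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Synthetic check: units that follow area = 10**5 * density**(-2/3) exactly
densities = [0.5, 2.0, 10.0, 50.0, 300.0]
areas = [10 ** 5 * d ** (-2 / 3) for d in densities]
a, b = fit_log_log(areas, densities)
print(round(a, 3), round(b, 3))
```

Because the synthetic units were built from a known line, the fit should recover an intercept of about 5 and a slope of about -2/3. Real data scatter around the line instead.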

There were 98 nations in the Atlas for which suitable data were given. Several were too small to show territorial divisions; in several others data were seriously incomplete. At the time I was conducting this analysis I wanted to see if nations showed the expected negative slope relating the logarithms of size and density. I needed a criterion to decide whether the observed slopes departed significantly from zero in the negative direction. This implied a one-tailed test of significance against the null hypothesis: does the slope depart in a negative direction?
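
The test described here can be sketched as follows, assuming the standard t statistic for a regression slope (the slope's departure from the hypothesized value, divided by its standard error, with N - 2 degrees of freedom). The function names and the toy data are hypothetical, and the critical value has to be looked up in a t table for the degrees of freedom at hand.

```python
import math

def slope_t_test(xs, ys, beta0=0.0):
    """Return (b, se_b, t, df) for the least-squares slope of ys on xs, testing slope == beta0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    df = n - 2
    se_b = math.sqrt(sse / df / sxx)
    t = (b - beta0) / se_b
    return b, se_b, t, df

def significantly_negative(t, critical_value):
    """One-tailed decision: reject the null only if t falls below -critical_value."""
    return t < -critical_value

# Toy log-density (xs) and log-size (ys) values, for illustration only
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, -1.0, -1.0, -2.0]
b, se_b, t, df = slope_t_test(xs, ys)

# 2.920 is the tabled one-tailed .05 critical value of t for 2 degrees of freedom
print(b, t, df, significantly_negative(t, 2.920))
```

Only departures in the negative direction count against the null hypothesis here; a large positive t would not be called significant under this one-tailed rule. The `beta0` argument lets the same function test a hypothesized slope other than zero.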

The results of these tests are shown in Table 5-1. This shows the number of divisions in each country (N), the computed slope relating log-size to log-density (b), and the probability of obtaining a slope that negative, assuming the null hypothesis (β = 0). The tests of significance are summarized in Fig. 5-7. In all but 4 of the 98 nations the slopes were negative, as expected. Among the 94 negative slopes, 78 show departures from the null hypothesis which are statistically significant (p<.05). Fifty slopes, over half the total number, were significant at the .0005 level. None of the four positive slopes (Haiti, Italy, Malawi, and Yugoslavia) was significant.

Table 5-1. Size-Density Slopes for 98 Nations

              Nation   N    b    p{β=0}
             Albania  27  -0.55  <.0005
             Algeria  15  -0.75  <.0005
              Angola  13  -0.71  <.025
           Argentina  23  -0.58  <.0005
           Australia   8  -0.74  <.025
             Austria   9  -0.73  <.005
             Belgium   9  -0.13    n.s.
             Bolivia   9  -0.27    n.s.
              Brazil  28  -0.64  <.0005
    British Honduras   6  -0.36    n.s.
            Bulgaria  28  -0.52  <.0005
               Burma   7  -0.47  <.05
            Cambodia  21  -0.89  <.0005
              Canada  15  -0.30  <.025
 Central African Rep  13  -0.36  <.005
              Ceylon   9  -0.33  <.0005
                Chad  11  -0.51  <.0005
               Chile  26  -0.45  <.0005
               China  27  -0.64  <.0005
            Colombia  29  -0.44  <.0005
               Congo  13  -1.19  <.0005
          Costa Rica   7  -0.48  <.05
                Cuba   6  -0.47    n.s.
      Czechoslovakia  11  -1.00  <.0005
             Denmark  24  -0.79  <.0005
  Dominican Republic  25  -0.21    n.s.
             Ecuador  16  -0.10    n.s.
         El Salvador  14  -0.39  <.025
             Finland  12  -0.58  <.05
             Formosa  24  -0.95  <.0005
              France  90  -0.29  <.0005
               Gabon   9  -0.55  <.05
           Germany E  15  -0.78  <.0005
           Germany W  10  -1.53  <.0005
               Ghana   9  -0.95  <.0005
              Greece   9  -0.17    n.s.
           Guatemala  22  -0.60  <.0005
               Haiti   5   0.76    n.s.
            Honduras  18  -0.58  <.005
             Hungary  24  -1.07  <.0005
             Iceland   5  -0.33    n.s.
               India  20  -0.13    n.s.
           Indonesia  27  -0.60  <.0005
                Iran  19  -0.47  <.025
                Iraq  14  -0.82  <.0005
             Ireland  26  -0.36  <.025
              Israel   6  -0.78  <.0005
               Italy  19   0.32    n.s.
         Ivory Coast   4  -0.81  <.01

              Nation   N    b    p{β=0}
             Japan  46  -0.50  <.0005 
            Jordan   8  -0.28    n.s. 
             Kenya  13  -1.09  <.0005 
           Korea N  11  -0.81  <.005 
           Korea S  11  -0.96  <.005 
           Lebanon   5  -1.05  <.0005 
        Luxembourg   4  -0.84  <.05 
 Malagasy Republic   5  -0.76  <.005 
            Malawi   3   0.16    n.s. 
          Malaysia  13  -0.99  <.0005 
            Mexico  32  -0.59  <.0005 
          Mongolia  18  -0.89  <.0005 
           Morocco  18  -0.71  <.0005 
       Netherlands  11  -0.09    n.s. 
       New Zealand  13  -0.19    n.s. 
         Nicaragua  16  -0.77  <.0005 
           Nigeria   5  -1.55  <.005 
            Norway  20  -0.89  <.0005 
          Pakistan  18  -0.32  <.025 
            Panama   9  -0.32    n.s. 
          Paraguay  17  -0.59  <.0005 
              Peru  24  -0.69  <.0005 
       Philippines  57  -0.64  <.0005 
            Poland  22  -1.18  <.0005 
          Portugal  29  -0.86  <.0005 
          Rhodesia   6  -0.48  <.05 
           Romania  18  -1.06  <.0005 
      Saudi Arabia   4  -1.13  <.005 
      Sierra Leone   4  -1.58  <.005 
      South Africa   4  -0.93    n.s. 
             Spain  50  -0.42  <.0005 
             Sudan   9  -1.05  <.0005 
            Sweden  25  -0.78  <.0005 
       Switzerland  25  -0.61  <.005 
             Syria  12  -0.76  <.0005 
          Tanzania  17  -0.80  <.005 
          Thailand   4  -0.39    n.s. 
              Togo   4  -0.72    n.s. 
           Tunisia  13  -0.81  <.0005 
            Turkey  67  -0.38  <.0005 
            Uganda   4  -0.68  <.005 
              USSR  43  -0.71  <.0005 
   United Arab Rep  38  -0.69  <.0005 
    United Kingdom 101  -0.09    n.s. 
     United States  48  -0.53  <.0005 
           Uruguay  19  -0.48  <.0005 
         Venezuela  24  -0.56  <.0005 
        Yugoslavia   9   0.74    n.s. 
            Zambia   8  -0.69  <.005

FIG. 5-7. TESTS OF THE NULL HYPOTHESIS

Having established that there was a negative relation between size and density, my interest shifted to "variation in negativity" among the nations. Fig. 5-8 shows that the negative slopes (ignoring the four positive ones) range from Sierra Leone's -1.58 to the practically non-negative -0.09 shared by the Netherlands and the United Kingdom. Was this variation of any interest? What did it mean to be very negative, or moderately negative, or hardly negative at all?

FIG. 5-8. DISTRIBUTION OF NEGATIVE SIZE-DENSITY SLOPES

I wanted to describe some central value in this range, to distinguish nations which were more negative from those which were less negative. The data for the earlier analysis had been punched on IBM cards, one for each territorial division, the set grouped behind a lead card identifying the nation. I removed the national identifier cards, in effect creating one world with 1,764 territorial divisions. (See, One Worlders? It's just that easy!) The analysis of this set yielded the "world regression line":

log Area = 5.0656 - 0.65 log Density

which, with t = -36.5 (completely off the chart for 1762 degrees of freedom), clearly rejected the null hypothesis for the world as a whole.

I next intended testing the degree to which individual nations fit this world pattern, but before doing that I modified the world slope somewhat. Though I had computed a value of b = -0.6514, I found myself referring to it as the "-2/3 slope". This was part convenience, part simplicity, part esthetics. I think I also sensed (maybe I'm a closet Pythagorean) that simple ratios (1/2, 2/3, 3/4) tend to be theoretically interesting, though that certainly wasn't my conscious focus at the time. At any rate, I checked the degree to which the world regression line conformed to the simpler -2/3 value: t = 0.855 which, with N = 1764, is practically zero. So everything from here on is with reference to -2/3.^[5]
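
The two t values quoted for the world line can be checked against each other by simple arithmetic. This sketch assumes that the reported t = -36.5 is the slope divided by its standard error, which is the usual definition although the text does not spell it out. The variable names are illustrative.

```python
b = -0.6514          # computed world slope, from the text
t_vs_zero = -36.5    # reported t against the null hypothesis (beta = 0)
n = 1764             # pooled territorial divisions

df = n - 2                        # intercept and slope are both estimated
se_b = b / t_vs_zero              # standard error implied by the reported t
t_vs_two_thirds = (b - (-2 / 3)) / se_b

print(df, round(se_b, 4), round(t_vs_two_thirds, 3))
```

Under that assumption the implied standard error is about 0.018, and the departure of -0.6514 from -2/3 works out to a t of roughly 0.855 on 1762 degrees of freedom, in line with the figures given above.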

Table 5-2 shows the result of two-tailed t-tests of the hypothesis β = -2/3; the tests are two-tailed because nations could be more or less negative than -2/3. Nations are arranged in order, from most positive to most negative slope. As can be seen, many more nations are in agreement with the -2/3 size-density hypothesis than were in agreement with the null hypothesis in Table 5-1.

Table 5-2. Size-Density Hypothesis: β = -2/3


              Nation   N    b    p{β=-2/3}
               Haiti   5   0.76    n.s.
          Yugoslavia   8   0.45    n.s.
               Italy  19   0.32  <.001
              Malawi   3   0.16    n.s.
         Netherlands  11  -0.09  <.01
      United Kingdom 101  -0.09  <.001
             Ecuador  16  -0.10  <.01
               India  20  -0.13    n.s.
             Belgium   9  -0.13  <.001
              Greece   9  -0.17    n.s.
         New Zealand  13  -0.19  <.05
  Dominican Republic  25  -0.21  <.02
              France  89  -0.26  <.001
             Bolivia   9  -0.27    n.s.
              Jordan   8  -0.28  <.05
            Cambodia  17  -0.28  <.001
              Panama   9  -0.32    n.s.
              Ceylon   9  -0.33  <.001
             Iceland   5  -0.33    n.s.
                Peru  23  -0.34  <.05
            Pakistan  16  -0.34  <.001
             Ireland  26  -0.36    n.s.
    British Honduras   6  -0.36    n.s.
              Turkey  67  -0.38  <.01
            Thailand   4  -0.39    n.s.
         El Salvador  14  -0.39    n.s.
 Central African Rep  13  -0.39  <.01
               Spain  50  -0.42  <.01
            Colombia  29  -0.44  <.02
            Portugal  18  -0.44  <.01
              Canada  12  -0.45    n.s.
         Philippines  57  -0.46  <.01
                Cuba   6  -0.47    n.s.
                Iran  19  -0.47    n.s.
               Burma   7  -0.47    n.s.
             Uruguay  19  -0.48  <.01
            Rhodesia   6  -0.48    n.s.
          Costa Rica   7  -0.48    n.s.
            Mongolia  17  -0.49    n.s.
               Japan  46  -0.50  <.05
                Chad  11  -0.51    n.s.
         Ivory Coast   4  -0.51    n.s.
            Bulgaria  28  -0.52    n.s.
       United States  48  -0.53    n.s.
               Gabon   9  -0.55    n.s.
             Albania  27  -0.55    n.s.
             Finland  12  -0.58    n.s.
            Honduras  18  -0.58    n.s.
           Argentina  23  -0.58    n.s.


           Nation   N    b    p{β=-2/3}
            Chile  25  -0.59    n.s. 
          Korea N  11  -0.59    n.s. 
         Paraguay  17  -0.59    n.s. 
        Guatemala  22  -0.60    n.s. 
        Indonesia  27  -0.60    n.s. 
      Switzerland  25  -0.61    n.s. 
        Venezuela  22  -0.62    n.s. 
            China  27  -0.64    n.s. 
           Brazil  28  -0.64    n.s. 
           Uganda   4  -0.68    n.s. 
           Zambia   8  -0.69    n.s. 
             USSR  43  -0.71    n.s. 
           Angola  13  -0.71    n.s. 
          Morocco  18  -0.71    n.s. 
             Togo   4  -0.72    n.s. 
          Austria   9  -0.73    n.s. 
        Australia   8  -0.74    n.s. 
           Mexico  30  -0.75    n.s. 
          Algeria  15  -0.75    n.s. 
            Syria  12  -0.76    n.s. 
Malagasy Republic   5  -0.76    n.s. 
        Nicaragua  16  -0.77    n.s. 
           Israel   6  -0.78    n.s. 
           Sweden  25  -0.78    n.s. 
        Germany E  15  -0.78    n.s. 
          Denmark  24  -0.79    n.s. 
         Tanzania  17  -0.80    n.s. 
  United Arab Rep  16  -0.80    n.s. 
          Tunisia  13  -0.81    n.s. 
       Luxembourg   4  -0.84    n.s. 
           Norway  20  -0.89  <.001 
     South Africa   4  -0.93    n.s. 
            Ghana   9  -0.95  <.05 
          Korea S  11  -0.96    n.s. 
            Kenya   8  -0.97    n.s. 
         Malaysia  13  -0.99    n.s. 
          Formosa  18  -1.00    n.s. 
   Czechoslovakia  11  -1.00  <.01 
          Lebanon   5  -1.01  <.02 
            Sudan   9  -1.05    n.s. 
          Romania  17  -1.06    n.s. 
          Hungary  24  -1.07  <.01 
             Iraq  14  -1.10  <.02 
     Saudi Arabia   4  -1.13  <.05 
            Congo  13  -1.17  <.001 
           Poland  21  -1.19  <.001 
        Germany W  10  -1.53  <.02 
          Nigeria   5  -1.55  <.05 
     Sierra Leone   4  -1.58  <.02

Variation around the -2/3 value is shown in Fig. 5-9. There were 44 nations with slopes more negative than -2/3 (i.e., toward the -1.58 value for Sierra Leone), of which 31 were not significantly different. There were 54 nations with slopes less negative (more toward zero), and of these 34 were not significantly different. Thus, 65 nations show agreement with the -2/3 slope at the .05 level of significance.

FIG. 5-9. TESTS OF THE -2/3 SIZE-DENSITY HYPOTHESIS

Historical Tests of Size-Density

The earlier study of county size in the United States had been conducted entirely with maps; no statistics were computed. The international study was conducted entirely from a statistical approach. In an effort to relate the two, to get a sense of the connection, I conducted a statistical analysis of the growth of Oregon counties.

FIG. 5-10. OREGON COUNTIES

I wanted to do this for the entire United States, but there were no area figures for U.S. counties prior to 1930. Since I had prepared maps for Oregon earlier, by tracing these on fine graph paper I was able to estimate historical county areas by counting millimeter graph squares in each county. (It would only be some time later that I discovered the possibility of measuring areas with a planimeter, rather than counting these tiny squares ... my eyes still ache.) The results of the analysis are shown in Fig. 5-10. The slope is initially near the -2/3 value shown for the world, but it drifts toward zero generally throughout the period, and steadily from 1900.

A parallel study of the states of the United States (shown in Fig. 5-11) showed a tendency to decline below the -2/3 value, but the slope has, in general, been steady since 1800, with a slight tendency to move toward -2/3 since about 1870.

FIG. 5-11. STATES OF THE U.S.

In addition to these, I was examining a map of the British Isles like the one shown in Fig. 5-1, speculating about the possibility of size-density relations based on possible size-distance ones.

I combined the material in this chapter into a paper titled "Historical and International Tests of the Size-Density Hypothesis", and sent it to the American Sociological Review in July of 1971. The reviewers had two criticisms. First, they said there was too much material in it (odd criticism); they wanted me to throw out everything but the international study.

The second criticism (which would pop up often later on) was that the size-density hypothesis was only a tautology, an artifact, since area occurs as both the dependent variable and in the denominator of the independent variable (density = population / area). Large areas have low density by definition.
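
The reviewers' concern can be written out as an identity. The symbols P, A, and D (a unit's population, area, and density) are introduced here only for illustration.

```latex
\log D = \log\frac{P}{A} = \log P - \log A
\quad\Longrightarrow\quad
\log A = \log P - \log D
```

If every unit had the same population, log P would be a constant and the slope of log A on log D would be exactly -1 by definition alone. Where populations differ from unit to unit, the identity by itself does not fix the slope.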

I agreed to limit the scope of the paper, and I submitted a quote which seemed to relieve the editors of the concern over tautology. The quote was from Snedecor's seminal statistics text^[6], referring to correlations between variables of the form Y and X/Y:

Having observed some unwarranted interpretations of such correlations, Karl Pearson dubbed them 'spurious', and this rather derogatory title has led people to distrust them. Of course, it is the interpretation that may be spurious. The correlations are on the same footing as any others.

This hardly constituted an argument. It was an appeal to authority. But it satisfied the reviewers and the article did get published.^[7] I suggested at the end of the article, partly in response to this concern, that future work might use a different independent variable, "population potential". I take up this suggestion myself in Chapter 7.


NOTES:

[1] Britannica World Atlas. London: Encyclopedia Britannica, Inc. 1967.

[2] Britannica World Atlas. 199.

[3] They are distinctly different from the rest of the nation. All the other units cluster along the coastline; these two are largely empty desert regions, stretching off to the south. Their size is the result of being empty space out to the national boundary.

[4] My original purpose in the logarithmic transformation was simply to make graphing possible, to get the dots away from the axes. The mathematical-theoretical significance of the resulting logarithmic equation came later.

[5] I ignored the intercept in this and most later studies. The intercept's value is a function of the units in which area is measured (square kilometers, square miles, acres, hectares, etc.), so the actual number is more conventional than theoretical.

[6] George Snedecor, Statistical Methods (4th edition), 162. Ames, Iowa: Iowa State College Press, 1946.

[7] G. Edward Stephan, "International Tests of the Size-Density Hypothesis," American Sociological Review, 37:365-8. 1972.