CHAPTER 12
## Deviation from Size-Density
Ernest Rutherford
This table shows the nations from Table 5-2 which deviated from the world (and by now theoretical) slope of -2/3. We omitted the United Kingdom since it was the subject of studies reported in Chapter 7. As in Table 5-2, nations are ordered by their size-density slope, from the most positive (Italy) to the most negative (Sierra Leone).
Beginning with the most negative case, it probably would have been
wise originally to have ignored nations with so few divisions (here
N = 4). It's hard to say what deviance (or conformity) means
when so few units are involved. But since it was part of the set of
nominally deviant nations, we analyzed it again anyway. Here is the
It is obvious that one district is different from the others. The "Western Area" is much smaller. It appears to be little more than a capital district (like Washington, D.C.), the location of the capital (Freetown, population 127,917). Its area of 327 square miles is comparable to that of such cities as Dallas, Indianapolis, Kansas City, New York, Phoenix or San Diego. It probably should have been excluded in the original study.
One problem in the original study was the difficulty, in the early 1970s, of obtaining useful graphic output from computers. I had a program which would generate crude scatter diagrams on a printer, but the display was very limited (it wouldn't generate regression lines, for example). Had I generated scatter diagrams for each of the 98 nations studied, I might have discovered the problem with Sierra Leone (and with a whole set of nations, as we'll see below). In any case, my interest back then was in the entire data set, not individual nations.
The following figure shows the scatter diagram for Sierra Leone, together with its regression line (red) and the world regression line (light blue). Note that even though Sierra Leone "rejects" β = -2/3, the dots fall close to the world line. (This is reminiscent of some of the police car patrol districts in Chapter 7.)
Clearly, as an outlier the Western Area (red dot) generates a very high coefficient of determination. Removal of this point makes the slope still more negative (green line, for the three remaining points, with very little variation in log d). So, in a way, the resulting set conforms even less to the hypothesized value of -2/3. Still, the reduction of r-squared results in a probability value higher than the .05 rejection level.
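One way to see why a lower r-squared raises the probability value: assuming the test is the usual t test of a least-squares slope b against the hypothesized -2/3 (the text does not name the test, so this is an assumption), the standard error of b can be written in terms of r-squared and the number of units N:

$$
t = \frac{b - \left(-\tfrac{2}{3}\right)}{SE_b},
\qquad
SE_b = |b|\,\sqrt{\frac{1 - r^2}{r^2\,(N - 2)}}
$$

For a given slope, a smaller r-squared or a smaller N enlarges the standard error, shrinks |t|, and so raises p, even when b itself moves further from -2/3.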
A similar problem occurred with the next most negative slope, West
Germany. The
The scatter diagram for West Germany shows a similar pattern: As with Sierra Leone, removal of the outliers reduced the coefficient of determination and brought the probability well above the .05 level. In this case the new slope (green line) is actually closer to -2/3 after removal of the two city-states. I will not include further examples
of
The cities ignored and the resulting probabilities are shown in the next table (the original probabilities have been computed exactly; the first table showed only critical values). In every case ignoring cities brought the data sets into conformity with β = -2/3.
This technique did not work in the case of Iraq because Baghdad
Cities clearly should not be included in a test of a theory addressed to territorial divisions, since the process through which they are created differs from that for territories. Cities grow outward from a point and need not be contiguous. This is usually not a problem when we are familiar with the data set being examined. We know, for example, that Washington, D.C., is not really a state, even when we find it included in a statistical table reporting state-level statistics. When you are unfamiliar with a country, however, it may be hard to tell which units are not territorial divisions. Often the only clue may be their extremely small size and usually very high density. This may seem to provide further support for the size-density law, but the problem is that statistical outliers like these tend to create very high correlation coefficients, and these tend toward rejection of the hypothesized -2/3 slope.
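The effect of a single city on such a test can be sketched numerically. The following is a minimal illustration, not the procedure used in the original study: it assumes log area is regressed on log density and that the slope is compared with -2/3 by an ordinary t statistic. The function name `slope_test` and all data values are invented for the illustration; none of them comes from the tables in this chapter.

```python
import math


def slope_test(log_density, log_area, beta0=-2.0 / 3.0):
    """Fit log area on log density by least squares and return the slope,
    r-squared, the slope's standard error, and the t statistic for the
    hypothesis that the true slope equals beta0 (assumed test, see text)."""
    n = len(log_density)
    mean_x = sum(log_density) / n
    mean_y = sum(log_area) / n
    sxx = sum((x - mean_x) ** 2 for x in log_density)
    syy = sum((y - mean_y) ** 2 for y in log_area)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_density, log_area))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(log_density, log_area))
    r_squared = 1.0 - sse / syy
    se = math.sqrt(sse / (n - 2) / sxx)
    t = (slope - beta0) / se
    return {"n": n, "slope": slope, "r_squared": r_squared, "se": se, "t": t}


# Hypothetical data: five territorial units with little spread in density...
territory_log_d = [1.0, 1.2, 1.4, 1.6, 1.8]
territory_log_a = [3.5, 3.0, 3.4, 2.9, 3.1]
# ...plus one made-up "city": very small and very dense.
city_log_d, city_log_a = 4.0, 0.5

without_city = slope_test(territory_log_d, territory_log_a)
with_city = slope_test(territory_log_d + [city_log_d],
                       territory_log_a + [city_log_a])

for label, res in (("without city", without_city), ("with city", with_city)):
    print(f"{label}: b = {res['slope']:.2f}, r2 = {res['r_squared']:.2f}, "
          f"SE = {res['se']:.3f}, t = {res['t']:.2f}")
```

With these invented numbers the city raises r-squared from roughly .30 to roughly .95 and cuts the standard error of the slope to less than a third of its former value, so |t| grows from about 0.5 to about 2.9. A probability value would then come from the t distribution with N - 2 degrees of freedom; that step is left out to keep the sketch free of external libraries.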
We continue with the nations
Portugal presents a different problem. Its most dense
Furthermore, unlike Montevideo, Porto doesn't
Is this simply an exercise in curve-fitting? Clearly, we could go on
removing non-fitting data-points until results meet theoretical
expectation. But "cooking the data", if that is all we are doing, hardly
constitutes a
Eight units have been removed from the original Colombia data set. They
differ from the remaining twenty-one
Exclusion of Spain's most dense
The Central African Republic presented a new problem. Here the apparent
outliers are
Theoretical conformity was achieved for sixty-five of Turkey's sixty-
seven
West Pakistan did not conform by the criterion employed here (b = -0.42; p = .02140). There were several outliers, however: the two mountainous units, Kalat and another, labeled "Quelta" in the table, "Quetta" on the map. Dropping these two didn't change the slope much but did reduce the coefficient of determination: r-squared went from .68 to .26. Whether this is "conformity" or simply "failure to reject" is debatable.
I could see no way of excluding units from Ceylon (now Sri Lanka) to achieve conformity with theoretical expectations. The slope is far from -2/3; the coefficient of determination is extremely high; there are no outliers responsible for either.
Removing the three least dense provinces from Cambodia produced conformity (b = -0.47; p = .11078). Taking out one more, which appears to be grouped with the others, resulted in much greater conformity for the remaining provinces even though the coefficient of determination increased as well (r-squared went from .58 to .71).
France proved intractable. There was only one obvious outlier
Importantly for the purpose of theory, if territorial divisions are created equal, in the territorial sense, there can be no size-density relation. If the dependent "variable" were indeed constant, there could of course be no relation between it and any independent variable.
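The point can be stated in least-squares terms. Taking log area as the dependent variable and log density as the independent variable (the reading assumed here), the slope over N divisions with areas A_i and densities D_i is

$$
b = \frac{\sum_{i=1}^{N} \left(\log D_i - \overline{\log D}\right)\left(\log A_i - \overline{\log A}\right)}
         {\sum_{i=1}^{N} \left(\log D_i - \overline{\log D}\right)^{2}}
$$

If every division has the same area, each deviation of log A_i from its mean is zero, the numerator vanishes, and b = 0 whatever the densities happen to be. The correlation coefficient is then undefined, since the variance of log A is zero as well.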
As was the case with Turkey, the Dominican Republic conformed to the
theoretical expectation when both the least dense and most dense
As is evident, there is nothing that can be done to produce conformity in Belgium's nine provinces. You can remove the three least dense to obtain p = .04915, just below the criterion for conformity (b = -0.28; r-squared = .50). But that takes away a third of all the units, and does so for no reason other than to fit the curve (the reduced N makes the difference).
The Netherlands, like Belgium, consists of very few units to begin with. Dropping the two most dense provinces achieves theoretical conformity, but it does so with a significantly further reduced N and an extremely low coefficient of determination.
Italy appears to have only one outlier, the least dense of its
The results in this section are summarized in the following table. As in the preceding section, removal of very few units often results in theoretical conformity. Accounting for this will be the topic of the next chapter.
Since I raised some question (in the Appendix to the preceding chapter) about using the "p < .05" criterion, it is reasonable to ask why I use it here to decide about "conformity" to the hypothesis. I continue quoting the source from the previous Appendix:
"The choice of a level of significance ... will usually be somewhat arbitrary since in most situations there is no precise limit to the probability of an error of the first kind that can be tolerated. It has become customary to choose ... one of a number of standard values such as .005, .01 or .05. There is some convenience in such standardization since it permits a reduction in certain tables needed for carrying out various tests. Otherwise there appears to be no particular reason for selecting these values...."
In this research I am not simply testing a null hypothesis. The size-density hypothesis is, first of all, not null: it asserts a definite, explicit, precise -2/3 relationship between variables. It emerged empirically, as the world regression line in the international study (Chapter 5), and is theoretically derivable (Chapters 8 and 10). It is supported with research involving other, cross-cultural territorial divisions (Chapter 6):
- tribal territories in Oregon
- tribal territories in California
- tribal territories in Africa
- Siwai village polygons in Bougainville
- police sectors and patrol districts in Seattle
- police service beats in Honolulu
- Roman Catholic and Episcopal dioceses
- counties in pre-industrial England
- new administrative counties in England
- U.S. counties and population potential
- size-population for cities and urbanized areas
- population distribution within cities
- gravity model of migration-interaction
- rank-size rule for cities
- square-cube law of formal organization
much
lower level for testing the hypothesis with any particular data set.
Reporting the results with the .05 criterion is a conservative posture:
it sets aside all previous theory and research, giving the size-density
hypothesis no more credence than we normally give the null.
NOTES:
[1] Douglas R. McMullin, "Urbanization and Territorial
Subdivision: An Analysis of Nations which Deviate from the Size-Density
Hypothesis", M.A. thesis, Department of Sociology, Western Washington
University, 1981. The results are reported in G. Edward Stephan, Douglas
R. McMullin and Karen Stephan, "Statistical and Historical Analyses
of Nations which Deviate from the Size-Density Law",
[2]
[3]
[4] Charles Breunig,
[5] E. L. Lehmann,