Approx 1 SD decline in general intelligence since 1889 confirmed in a near-perfectly-matched sample from 1989


Michael A. Woodley, Jan te Nijenhuis & Raegan Murphy have followed up their long-term analysis of reaction times as evidence of a rapid and significant decline in average intelligence since Victorian times by publishing a rapid and robust refutation of the criticism that this result could be explained away by differential sampling.


They present a modern population sample from 1989 which is (as-near-as-dammit in this imperfect world) perfectly matched with Francis Galton's sample from the late 19th century.

The 1989 sample had an average reaction time of 245 milliseconds, much slower than the average of 194 ms in Galton's 1889 sample - confirming Woodley et al.'s identification of a decline in general intelligence of approximately one standard deviation, or 15 IQ points.


This unusual compression of the time-scale of scientific debate presents a litmus test of the honesty and competence of the commentators who rejected the original paper on the micro-methodological grounds of serious concerns about sampling issues; grounds which I argued were inappropriate, incompetent and - in their effect - anti-scientific.


Thus we are now in a position to observe whether such critics understand and acknowledge that they have in fact been refuted; or else whether they reveal the existence of some hidden agenda by maintaining their rubbishing and rejection of the Woodley et al paper by ignoring this refutation, or shifting the grounds for criticism. 


ORIGINAL PAPER: A high-quality replication of Galton’s study one century later: Wilkinson & Allison (1989)

Michael A. Woodley, Jan te Nijenhuis & Raegan Murphy

In Woodley, te Nijenhuis, and Murphy (2013, in press) we argue that intelligence has declined substantially since Victorian times, based on a meta-analysis of simple reaction time. An exchange of ideas started at several blogs. We hereby reply to the blogposts of Scott Alexander and HBD Chick, reacting to an earlier post made by us.

A paper has come to our attention that provides strong evidence against the supposed representativeness problem across cohorts (e.g. Alexander, 2013). The study in question is that of Wilkinson and Allison (1989) using a sample of 5,324 visitors to the London Science Museum, which is situated at the exact site of Galton’s 19th century Anthropometric Laboratory in South Kensington.  All visitors undertook psychophysical testing on a simple reaction time-measuring apparatus, just as the people in Galton’s study did. Of these mixed-sex participants 1,189 were aged between 20 and 29, and are thus highly similar to the age range employed in our own study. Their simple RT mean was substantially slower than the weighted 1889 RT mean (245 ms vs. 194.06 ms), and furthermore the mean of this sample falls very close to the meta-regression-estimated mean across studies for the late 1980s (approximately 250 ms, see: Figure 1 in Woodley, te Nijenhuis & Murphy, 2013). The remarkable features of this study are the ways in which it replicates virtually every significant demographic aspect of Galton’s study.
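As a quick check on the size of the gap quoted above, the following Python sketch computes the absolute and relative difference between the two means. Only the two figures (245 ms and 194.06 ms) come from the text; the variable names are illustrative, and the sketch makes no attempt to convert the difference into IQ points.

```python
# Means quoted in the text, in milliseconds.
rt_1889_ms = 194.06  # weighted 1889 mean (Galton's sample)
rt_1989_ms = 245.0   # Wilkinson and Allison (1989), ages 20 to 29

# Absolute slowing, and slowing relative to the 1889 mean.
diff_ms = rt_1989_ms - rt_1889_ms
relative_slowing = diff_ms / rt_1889_ms

print(f"Difference: {diff_ms:.2f} ms")
print(f"Relative slowing: {relative_slowing:.1%}")
```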

There is the issue of a participation fee. Galton is known to have requested a participation fee of 3 pennies (approximately £5 in modern UK currency). The London Science Museum required the payment of an admissions fee right up until December 2001. Furthermore it still requires the payment of fees of £6 to £10 for access to some special exhibitions (London Science Museum, 2013a). The Wilkinson and Allison (1989) study was in fact conducted as part of a special exhibition entitled Medicines for Man, which was hosted by the Museum from the early 1980s (Medicines for Man Organizing Committee, 1980). Therefore participation fees were employed in the case of both studies.

There is strong evidence for the demographic convergence between the two studies. Johnson et al. (1985) indicate that whilst Galton’s sample included persons from all occupational and socioeconomic groups in Victorian London, it was nonetheless skewed towards students and professionals, and both groups could fairly be described as solidly White and middle class. In the last decades of the 20th century, museum attendance in the UK exhibited precisely the same skew in terms of sociodemography. Eckstein and Feist (1992) for example noted that most UK museum visitors are drawn from White and upper-middle-class populations. Furthermore Hooper-Greenhill (1994) observed that the largest minority ethnic groups in the UK (i.e. Asians and Afro-Caribbeans) are underrepresented amongst museum visitors. In acknowledging this issue, a House of Commons report in 2002 stated that free admission to museums would unlikely ‘… be effective in attracting significant numbers of new visitors from the widest range of socio-economic and ethic groups’ (House of Commons report, 2002, p. 23).

The presence of this self-selection amongst visitors strongly harmonizes the studies of Galton and Wilkinson and Allison. Add to this the fact that participation fees were employed in both cases, the fact that the geographical locations were exactly the same and finally the fact that the age demographic of interest (i.e. twenty-somethings) was intensively sampled in both cases (i.e. 3,410 in the case of Silverman’s subset of Galton’s sample and 1,189 in the case of Wilkinson and Allison). The net result is that the studies become even more strongly convergent in terms of comparing like with like. Thus the argument that more heterogeneous samples visited museums in the 1980s than in the 1880s is critically weakened. The principal objections that can be leveled against this are as follows.

Firstly there is the issue of tourism. Most tourists to the UK are from the US and Europe (Tourism 3B), meaning that they are likely to be both ethnically and socioeconomically matched to the majority of the participants in this study (i.e. UK citizens). In fact, international arrivals in the United Kingdom in 1990 show that of the 439 million inbound tourists, 60% were European in origin and 21% emanated from the Americas. Hence, 81% of the tourist population came from groups which are highly ethnically similar to the British. Only 12% came from Asia and the Pacific with a meager 3% coming from the Middle East and 2% from Africa (Tourism 3B). In sum, it is unlikely that tourists being tested in the 1989 study were substantially ethnically different from the typical UK museum visitor. Based on current statistics from the Science Museum, the preponderance of visitors hail from the UK (69%) and the preponderance of those are from Greater London (44%; London Science Museum, 2013b). Historically, especially prior to the 1990s, this figure would have been much higher, owing to far lower levels of tourism to the UK (in 1990 international tourism levels were less than half the current level of more than 940 million per year; BBC, 2013). This means that in all likelihood well over 70% of the participants in Wilkinson and Allison’s study would have been British, and the overwhelming majority of these would have been White, upper middle-class and from London. The overwhelming majority of the international visitors would have been ethnically and broadly socioeconomically matched to the British visitors.

Secondly there is the issue of instrumentation. Galton utilized a pendulum chronoscope with a temporal resolution of around a centisecond (i.e. 1/100th of a second, or 0.01 seconds). The electronic apparatus employed by Wilkinson and Allison in all likelihood had a higher resolution (post-1908 chronoscopy at least had the potential to be accurate to a single millisecond; Haupt, 2001); however, a resolution of only a centisecond in Galton’s apparatus cannot account for the substantial discrepancy between these two studies, since 10 ms is small relative to the difference of roughly 51 ms between the two means.

Thirdly, Galton’s sample involved a single trial per person, whereas Wilkinson and Allison’s study employed two practice trials followed by 10 trials per person for the purposes of averaging. This protocol would almost certainly have enhanced the reliability of Wilkinson and Allison’s data relative to Galton’s (Jensen, 1980); however in both cases we are dealing with aggregates. Strong biases (i.e. jumping the gun vs. being slow to start) have the potential to cancel each other out when employing these sorts of very large datasets, as these sources of error are distributed in a Gaussian fashion. This means that aggregate-level comparisons of means are appropriate between datasets that differ in reliability, provided the Ns are very large.
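The cancellation argument can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: the sample sizes (3,410 and 1,189) and trial counts (1 and 10) come from the text, whereas the common true mean, the between-person spread, the trial-to-trial noise and the function name are hypothetical values chosen for demonstration. It also assumes that trial errors are symmetric around each person's true speed, so it says nothing about systematic bias in either apparatus.

```python
import random
import statistics

def simulate_sample(n_people, trials_per_person, true_mean_ms,
                    between_sd_ms, trial_sd_ms, seed):
    """Return (mean, SD) of per-person scores for one simulated study."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_people):
        # Each person's underlying reaction time (hypothetical spread).
        person_rt = rng.gauss(true_mean_ms, between_sd_ms)
        # Zero-mean Gaussian trial error: too fast and too slow are equally likely.
        trials = [rng.gauss(person_rt, trial_sd_ms)
                  for _ in range(trials_per_person)]
        scores.append(statistics.fmean(trials))
    return statistics.fmean(scores), statistics.stdev(scores)

# Hypothetical parameters, identical for both protocols.
TRUE_MEAN_MS = 200.0
BETWEEN_SD_MS = 30.0
TRIAL_SD_MS = 40.0

# One trial per person (Galton-style) vs. ten averaged trials per person.
mean_single, sd_single = simulate_sample(
    3410, 1, TRUE_MEAN_MS, BETWEEN_SD_MS, TRIAL_SD_MS, seed=1)
mean_multi, sd_multi = simulate_sample(
    1189, 10, TRUE_MEAN_MS, BETWEEN_SD_MS, TRIAL_SD_MS, seed=2)

print(f"single-trial protocol: mean={mean_single:.1f} ms, SD={sd_single:.1f} ms")
print(f"ten-trial protocol:    mean={mean_multi:.1f} ms, SD={sd_multi:.1f} ms")
```

Under these assumptions the single-trial scores are noisier at the level of the individual (a larger SD), yet both sample means land close to the same underlying value, which is the sense in which the means of two very large samples can be compared despite their different reliabilities.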

On this basis Wilkinson and Allison’s (1989) study must be considered an excellent replication of Galton’s study. Its mean reaction time for the relevant age cohort is almost precisely where our meta-regression predicts it should be. This is clearly strong supporting evidence for the robustness of the increase in simple RT latency produced to date and so puts even more nails in the coffin of those who argue that the trend can be accounted for by lack of representativeness across cohorts.

Very interesting findings (hello again! I am still reading backward from the end.) I don't know if you read comments this far back, but anyway, what I find intriguing here is the observation of an apparent change within one particular demographic group. (Assuming, of course, that the data is not explained by something silly, like a miscalibrated instrument in the 1800s or people having different reaction times at different times of day or after watching a peaceful movie at the museum theatre.) One could easily imagine large-scale demographic changes which could alter the intelligence of an overall society, but we have instead here significant change within a single group, implying a major change to that group--immigration, lower class reproduction rates, etc., are all irrelevant. Of course, the most commonly cited culprit is TV, but I was thinking rising age of childbirth. Alternatively, rising standards of living could make the upper/middle class less intellectually exclusive, or the whole society could be less intelligent. Or all or none of the above.