What Arthur Jensen Overlooked
Geraldine McDonald
Abstract
Thirty years ago Arthur Jensen made strong claims about the relative abilities of Black and White high school students in the United States. He argued that there was stability in the average differences in their scores. Jensen’s data, collected originally by Audrey Shuey and spanning 50 years, did not take into account two pieces of worthwhile knowledge. First, as de Lemos demonstrated in the 1980s, IQ tests standardised on school children measure level of schooling rather than age. However, the scoring system of IQ tests designed for school children assumes maturational increments, and hence the raw scores must be adjusted for chronological age in order to obtain an IQ. The second piece of worthwhile knowledge concerns age distributions at levels of education systems. Demographic data typically show shifts in age distributions over time, and differences by school, region and gender. The effect of the interaction of age norms and population distributions on the average scores of samples of school children will be demonstrated. It will be argued that stability in the IQ scores of any one population group is highly unlikely, not because of shifts in intelligence but because of the factors that control cohort progress through school.
Introduction
In his well-known article "How much can we boost IQ and scholastic achievement?", published 30 years ago, Jensen (1969; 1972) maintained that the IQ measured something with a strong genetic component. This view had implications for his explanation of racial differences in intelligence between black and white students in the United States and the so-called 15-point gap. Jensen was not the first to link racial differences with a standard deviation based on the Gaussian curve. In 1869 Francis Galton (1892) beat Jensen to this conclusion by exactly 100 years. Galton had no data beyond the distributions of physical characteristics, such as the chest measurements of soldiers, collected and published by Quetelet, the Belgian Astronomer-Royal. Galton’s argument was by analogy. If physical measurements were distributed in a Gaussian curve then intelligence must be similarly distributed. As for the gap between races, his evidence was anecdotal.
Just when it appeared that the debate over racial differences in intelligence had petered out it was revived by the publication of The Bell Curve in 1994 (Herrnstein and Murray, 1996). The chapter on race and IQ elicited the strongest protest (Fischer et al., 1996). However, the protest consisted of a repetition of earlier criticisms. No fresh ideas emerged.
In New Zealand the Maori-European difference has, as Shuker (1988) noted, been popularised as the 10-point gap (Reid and Gilmore, 1988). The gaps are based on performance on intelligence tests, and both the 10- and the 15-point gaps became educational folklore. A general practitioner, Dr Smith, who worked in the Hokianga in the 1940s and 1950s, believed in the genetic basis of the Maori-European intelligence gap. In a talk to the Wellington Branch of the Royal Society in 1951 he drew the following conclusion (Kemble Welch, 1965, p. 142), which is the same as that which appeared later in Jensen’s 1969 paper: education cannot overcome biology.
From my personal observation the Maori’s inability to reason in the abstract shows that our present methods of educating the Maori are futile (my italics).
The Problem
While critics of the Jensen thesis have pointed to a number of problems with his arguments, I want to look at only three: the scientific rigour, the constancy of a 15-point (and a 10-point) gap, and the sufficiency of this as evidence for genetic determination.
The academic discourse relating to differences in the test performance of different racial groups is presented by those supporting a genetic determination as scientifically rigorous. Jensen has specified the scientific task as getting at the facts and properly verifiable explanations (Jensen, 1973, p. 17). Henry Garrett (introduction to Shuey, 1966) wrote that the honest psychologist "is interested simply in uncovering differences in performance when such exist and in inferring the origin of these differences. And this is certainly a legitimate scientific enterprise". Nevertheless, Jensen’s critics have pointed to a lack of scientific rigour in the presentation of his claims. For example, Dobzhansky (1973, p. 88) comments that the plethora of papers comparing IQ test results of whites and blacks which were used as the basis of Jensen’s arguments had been uncritically compiled by Shuey (1966). My contribution to the issue of scientific rigour is to point to some facts about the composition of school populations which have a bearing on the interpretation of Jensen’s evidence.
The second claim by the supporters of a genetic explanation is that the IQ gap is constant or stable, the mean of white IQs being 100 and that of the black school population being one standard deviation below. Shuey (1966, p. 502) concluded that the IQs of Negroes (sic) enrolled in the American public schools have proved to be relatively stable.
The third claim is that the regularity of the results of IQ testing is evidence for a genetic basis for the difference. Dobzhansky (1973) notes that Jensen takes stability as evidence of a strong genetic component. Shuey (1966, p. 521) concludes that there are native differences between Negroes and whites as determined by intelligence tests and Garrett, in his introduction to Shuey (1966), writes that "we are forced to conclude that the regularity and consistency of these results strongly suggest a genetic basis for the differences".
My aim is to examine the claim represented by the specification of an IQ gap of a particular size. Is it really as stable as claimed for the United States? Is the 10-point gap really stable in New Zealand? If a score gap is not constant over time then this considerably weakens any argument that it is largely genetic. What I will show is that at least a portion of the IQ score is likely to be unstable and, further, that it is the age portion of the score, the part which is meant to measure genetically controlled maturation, which results in this instability.
What Jensen overlooked is the relationship between the way in which IQ scores are calculated and the age distributions of children across grades. This is the so-called age-grade problem. Yet during the thirty years of intense debate on the causes of racial differences in measured intelligence none of Jensen’s critics has taken these two factors into account either.
How to Create a Score Gap
If you look at the conversion tables in a test manual you will see that age norms move the item scores downwards as age increases. Depending on the test, the lowering may take place at 1-, 3- or 6-month intervals. The score, once age has been taken into account, is matched to a point on a scale.
We can start by looking at the distribution of scores by age for a general population of school children. To isolate the effects of age on scores we need to assume that every child gets the identical item (raw) score. Then any difference between the item scores and final scores will be solely the consequence of age differences. In Figure 1 I have used the general pattern of age by grade found in the New Zealand education system to show the proportions of each three-month age group at a grade level. The figure uses a spread of only fifteen months of age, that is, only 3 months more than the 12 months expected at each grade level. In fact pupils may range up to two and a half years across a grade level. Since age distributions vary over time the figure should be taken as indicative only.
Figure 1. Scores and proportions of pupils by three-month age group at one grade level

| Ages | Score | Proportion of pupils |
|---|---|---|
| 9;6-9;8 | 54 | 2% |
| 9;9-9;11 | 63 | 33% |
| 10;0-10;2 | 45 | 45% |
| 10;3-10;5 | 42 | 16% |
| 10;6-10;8 | 34 | 4% |

I accept that "races are revealed as fluid, elusive and paradoxical, if not illusory" (Kohn, 1996) but to present my arguments I will adopt the categories used by others. My first task is to demonstrate how to create a score gap between two groups which I shall call Race 1 and Race 2. They correspond approximately to Maori and non-Maori as interpreted in the Education Statistics of New Zealand from which typical age differences were obtained. The statistics provide information on distributions across grade levels but do not provide details on three months differences. These were modelled on the patterns shown in the TOSCA (Reid et al., 1981) unpublished database. The demonstration remains conservative in that the age range is only 15 months as in Figure 1. It is also conservative in that it uses the categories "Maori" and "non-Maori". "Non-Maori" includes all ethnic groups other than "Maori". It is not synonymous with "European". Again the demonstration is indicative only.
Let us imagine the usual testing situation of a sample of children at one grade level. Children of all ages at that level sit an intelligence test. Now assume, as was done in the illustration for the general population, that every child gets the same raw score (24), which happens to be the average of the scores at that grade level. But the average ages differ, Race 1 being on average older than Race 2. There are different proportions of pupils within each 3-month interval according to race. Next, consult a test manual (Reid et al., 1981) to convert the average item scores for each three-month group to a score derived from the percentile rank scale provided. The result is shown in Table I.
Table I. Effect of age distributions on scores
| Age | Raw score | Final score | Race 1 proportions | Race 2 proportions |
|---|---|---|---|---|
| 9;6-9;8 | 24 | 54 | 1% | 3% |
| 9;9-9;11 | 24 | 53 | 30% | 36% |
| 10;0-10;2 | 24 | 45 | 44% | 46% |
| 10;3-10;5 | 24 | 42 | 20% | 12% |
| 10;6-10;8 | 24 | 34 | 5% | 3% |

Sources: Education Statistics of New Zealand; TOSCA Teacher’s Manual
Although it can be accepted that every pupil is in the same grade and we have further assumed that all have obtained the same item score, because the age distributions differ a gap appears. Race 1 is now scoring below Race 2 as can be seen in Figure 2.
Figure 2. Proportions of Race 1 and Race 2 pupils at each final score

| Final score | Race 1 | Race 2 |
|---|---|---|
| 54 | 1% | 3% |
| 43 | 30% | 36% |
| 45 | 44% | 46% |
| 42 | 20% | 12% |
| 34 | 5% | 3% |

The gap in average intelligence test scores shown in Figure 2 is caused not by performance difference but solely by age differences at grade level. The age-grade difference by race interacts with the scoring system of IQ tests. The gap shown in Figure 2 is an artifact of the age norms first introduced in Alfred Binet’s concept of Mental Age and incorporated in all intelligence tests and many other standardised tests ever since. The age norms can create a gap or augment gaps which already exist between the item scores of population groups categorised by race, gender, social class, or urban or rural location. The point is that if age by grade distributions are not the same for any two samples, scores will differ.
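The arithmetic behind this gap can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the original analysis: it takes the final scores and proportions exactly as printed in Table I and computes a proportion-weighted average for each group. The variable and function names are hypothetical.

```python
# Final scores and age-group proportions copied from Table I.
# Every pupil is assumed to have the same raw score (24), so any
# difference in the averages comes from the age distributions alone.
AGE_GROUPS = ["9;6-9;8", "9;9-9;11", "10;0-10;2", "10;3-10;5", "10;6-10;8"]
FINAL_SCORE = [54, 53, 45, 42, 34]
RACE_1_PCT = [1, 30, 44, 20, 5]
RACE_2_PCT = [3, 36, 46, 12, 3]


def weighted_mean(scores, percentages):
    """Average score weighted by the percentage of pupils in each age group."""
    total = sum(percentages)
    return sum(s * p for s, p in zip(scores, percentages)) / total


race_1_mean = weighted_mean(FINAL_SCORE, RACE_1_PCT)
race_2_mean = weighted_mean(FINAL_SCORE, RACE_2_PCT)
gap = race_2_mean - race_1_mean

print(f"Race 1 mean final score: {race_1_mean:.2f}")
print(f"Race 2 mean final score: {race_2_mean:.2f}")
print(f"Gap due to age alone:    {gap:.2f}")
```

On the Table I figures the older group (Race 1) averages about 46.3 and the younger group (Race 2) about 47.5, a gap of roughly one point even though every raw score is identical.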
Age or Grade?
In all school systems there is an overall mismatch of age to grade and within this mismatch it is common for economically disadvantaged groups irrespective of race to differ in their age patterns (McDonald, 1993). This is because of selective promotion on entry or in the early years of schooling. Selective promotion means that pupils may be accelerated, placed at the expected level, or held back. Factors influencing differences in distributions include teachers’ beliefs about the value of delay for population groups such as boys or particular racial groups. However, the most important factor is a child’s birth date in relation to the school year. Contrary to popular belief children in a grade lower than their age peers are not necessarily slow. They may simply be the youngest in the year group.
If pupils are spread across up to three grade levels, which factor has the greater influence on children’s ability to answer items on an IQ test: their age or their grade? In 1989 three studies were reported which looked specifically at this issue. There were two large-scale studies (de Lemos, 1989; Cahan and Cohen, 1989). These were set in different countries, Australia and Israel, and were carried out independently and by different methods. Also in 1989, at the NZARE conference (McDonald, 1989), I reported a study based on the analysis of approximately 3000 scores, ages and grade levels drawn from the database developed from the norm sample for the New Zealand TOSCA (Reid et al., 1981). All three studies controlled for age in grade. All three investigations concluded that intelligence tests or test items measure grade level effects rather than age effects. De Lemos (1994) found this to be true even for Raven’s Progressive Matrices.
Figure 3 shows the results from my own study. The scores were collected at the beginning of May, which is 2 months before mid-year. The analysis is restricted to children whose ages match the expected grade as these were recorded in the database. Plotted at three-month intervals at 3 levels on TOSCA Primary Form A, the scores show no rise from youngest to oldest in the year group. The rise in a cross-sectional sample of this kind is simply from grade to grade. This is a result which has far-reaching implications.
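Figure 3. TOSCA Primary Form A scores by three-month age interval within the year group, at three levels

| Age (years) and level | 0;0-0;2 | 0;3-0;5 | 0;6-0;8 | 0;9-0;11 |
|---|---|---|---|---|
| 11 at level 5 | 42.97 | 42.05 | 42.94 | 40.17 |
| 10 at level 4 | 35.00 | 31.62 | 33.2 | 31.18 |
| 9 at level 3 | 24.57 | 23.93 | 25.03 | 24.19 |
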
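The contrast between change within a year group and change between levels can be checked directly from the Figure 3 values. The sketch below is illustrative only. It assumes that the columns run from the youngest to the oldest three-month interval, and the names used are hypothetical.

```python
# Scores copied from Figure 3. Columns are assumed to run from the
# youngest (0;0-0;2) to the oldest (0;9-0;11) three-month interval.
SCORES = {
    "level 3 (age 9)": [24.57, 23.93, 25.03, 24.19],
    "level 4 (age 10)": [35.00, 31.62, 33.20, 31.18],
    "level 5 (age 11)": [42.97, 42.05, 42.94, 40.17],
}

level_means = {}
within_change = {}
for level, values in SCORES.items():
    # Average score across the four age intervals at this level.
    level_means[level] = sum(values) / len(values)
    # Oldest interval minus youngest interval within the same level.
    within_change[level] = values[-1] - values[0]

# Difference in average score between each level and the one above it.
levels = list(SCORES)
between_steps = [
    level_means[upper] - level_means[lower]
    for lower, upper in zip(levels, levels[1:])
]

for level in levels:
    print(f"{level}: mean {level_means[level]:.2f}, "
          f"oldest minus youngest {within_change[level]:+.2f}")
print("Rise between successive levels:",
      ", ".join(f"{step:.2f}" for step in between_steps))
```

On these values the oldest pupils at each level score slightly below the youngest, while each step up a level adds between 8 and 10 points.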
Are these three studies the only evidence for the influence of grade level rather than age on test performance? No, there is plenty of other evidence. Studies of the so-called birth date effect show that if pupils have been at school for about four to five years and the average scores within the age group at the intended grade level are plotted, the score line is flat rather than sloping. In the UK, however, the youngest group on average performs marginally worst. In other systems of education this pattern is reversed and the youngest in the year group produce the highest average scores (McDonald, 1999).
Another source of confirmation of the influence of grade level comes from studies looking at comparisons of the performance of pupils who had been delayed in exit from the beginning classes or who had been required at some stage to repeat a grade and those who had progressed according to the expectation of age (Kenny, 1988; Holmes, 1989). In general, grade level emerges as the most important factor from this line of investigation.
Yet another source can be found in the so-called "Flynn effect", the finding that IQ scores rise over time. However, over time children have also become younger at grade levels. Grade level stays the same but, since IQ scores give credit for younger age, the overall IQ will rise (McDonald, 1998). Changes in age composition at grade level can either raise or lower scores, but the trend over many years has been a lowering of average age and hence a rise in average scores.
A further source can be observed in the rise in average scores by grade level as presented in test manuals. In conclusion there is no evidence that measured intelligence rises with the age of individuals independently of grade level and the belief that it does is a misreading of the so-called age norms.
The age portion of a collective IQ may differ between racial groups. If this age difference changes in any way then the average scores will remain stable only if the item portion of the double score, the part which relates to grade level, is able to compensate. For example, if for reasons of promotion policy, black students in the United States become on average older at levels of schooling relative to white students and so lose 1 point of average IQ, then to maintain a gap of constant size the black average item score would have to rise by 1 point. It is unlikely that this would ever happen because holding students back would bring their scores into line with the standard of a lower grade level as illustrated in Figure 3 (see also, de Lemos 1989; Cahan & Cohen, 1989). The most likely result would be a 2 point increase in the gap.
It could be argued, but not very convincingly, that it is genetic influences which result in black students repeating grades. However, neither Jensen nor his supporters claim that the genes account for any rise or fall in average age at grade level. They argue that it is scores on IQ tests which reflect genetic influence.
Variations in Age by Grade
Before returning to Jensen I want to show that, in general, patterns of age by grade are not constant as they would be if they were contingent upon the genetically determined ability levels of individual children. Age distributions are unstable in a variety of ways. First, national age distributions vary. Figure 4 shows age by grade distributions at three grade levels of 9-year-olds in four countries which took part in the IEA TIMSS (Martin & Kelly, 1997). For the purpose of weighting the samples of 9- and 13-year-olds who were the focus populations of TIMSS, population estimates were developed by Pierre Foy, Statistics Canada. Figure 4, based on unpublished tables, shows that the patterns are similar but also that there is variation across the four systems. This variation cannot be attributed to differences in national intelligence. The variations reflect differences in the way each system controls its pupil flows.
[Figure 4]
Age distributions also change over time. Terman and Merrill (1961) found changes in age by grade distributions over the twenty years 1937 to 1957 in a sample from the United States. See Figure 5.
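Figure 5. Age by grade distributions (%) in a United States sample, 1937 and 1957 (Terman and Merrill, 1961)

| Year | V | VI | VII | VIII | IX | X | XI | XII |
|---|---|---|---|---|---|---|---|---|
| 1937 | 0 | 2.8 | 5.6 | 23.4 | 37.4 | 26.2 | 4.6 | 0 |
| 1957 | 1 | 3 | 7 | 15 | 30 | 37 | 6 | 1 |
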
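One way to summarise the shift in Figure 5 is to compare the average and the most common grade in each year. The sketch below rests on an assumption that the figure itself does not confirm: that each row gives the percentage of pupils of a single age group found at each of grades V to XII. The names are hypothetical.

```python
# Percentages copied from Figure 5. Each row is ASSUMED to be the
# distribution of one age group across grades V (5) to XII (12).
GRADES = [5, 6, 7, 8, 9, 10, 11, 12]
PCT_1937 = [0, 2.8, 5.6, 23.4, 37.4, 26.2, 4.6, 0]
PCT_1957 = [1, 3, 7, 15, 30, 37, 6, 1]


def mean_grade(percentages):
    """Percentage-weighted average grade."""
    weighted = sum(g * p for g, p in zip(GRADES, percentages))
    return weighted / sum(percentages)


def modal_grade(percentages):
    """Grade holding the largest share of pupils."""
    return GRADES[percentages.index(max(percentages))]


for year, pcts in (("1937", PCT_1937), ("1957", PCT_1957)):
    print(f"{year}: mean grade {mean_grade(pcts):.2f}, "
          f"modal grade {modal_grade(pcts)}")
```

Under that assumption the most common grade moves from IX in 1937 to X in 1957 and the average grade rises by a little under a fifth of a grade, which is the pattern expected if pupils of a given age had moved up the grades over the twenty years.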
Median ages vary by school type. The position in New Zealand in 1963, a year in which these variations were supplied in detail in the Education Statistics of New Zealand (Department of Education, 1964), is shown in Table II. Note particularly the variations in age between Maori and non-Maori. There is a difference of 9 months between the median ages of European girls in Maori schools and Maori boys in state schools.
Table II. Median ages at Year 4 by race and type of school (NZ 1963)
8;6 European girls in Maori schools
8;8 Girls in state schools
8;9 Boys in private schools
8;10 European boys in Maori schools
8;11 Maori girls in Maori schools
9;0 Maori girls in private schools
9;0 Maori boys in private schools
9;1 Maori girls in state schools
9;2 Maori boys in Maori schools
9;3 Maori boys in state schools
Source: Education Statistics of New Zealand 1964
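The 9-month spread in Table II can be verified by converting each median age from the years;months notation into months. The following is a minimal sketch. The helper name `to_months` and the dictionary layout are hypothetical, and the ages are those listed in Table II.

```python
# Median ages at Year 4 copied from Table II, in "years;months" notation.
MEDIAN_AGES = {
    "European girls in Maori schools": "8;6",
    "Girls in state schools": "8;8",
    "Boys in private schools": "8;9",
    "European boys in Maori schools": "8;10",
    "Maori girls in Maori schools": "8;11",
    "Maori girls in private schools": "9;0",
    "Maori boys in private schools": "9;0",
    "Maori girls in state schools": "9;1",
    "Maori boys in Maori schools": "9;2",
    "Maori boys in state schools": "9;3",
}


def to_months(age):
    """Convert an age written as 'years;months' into a number of months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


ages_in_months = {group: to_months(age) for group, age in MEDIAN_AGES.items()}
youngest = min(ages_in_months, key=ages_in_months.get)
oldest = max(ages_in_months, key=ages_in_months.get)
spread = ages_in_months[oldest] - ages_in_months[youngest]

print(f"Youngest median age: {youngest} ({MEDIAN_AGES[youngest]})")
print(f"Oldest median age:   {oldest} ({MEDIAN_AGES[oldest]})")
print(f"Spread: {spread} months")
```

The spread of 9 months is three-quarters of the 12-month age range expected within a single grade level.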
To sum up: the ages of school children at grade levels are not stable across education systems, they have altered over time, they vary by school type and they vary by race. They also vary by gender and by socio-economic status, but these are not the focus of the present account.
Age in Jensen’s Data
The compilation by Audrey Shuey on which Jensen drew his conclusions was based on approximately 382 studies carried out over a period of 50 years of black-white average score differences. Was there evidence of age by grade variation in any of these studies?
Although Shuey did not have any systematic information on age by grade for pupils at the time they were tested, and obviously did not think it relevant, she does record what some of the original investigators had reported. These reports provide glimpses of differences in age distributions and suggest variation by race over time and from study to study. For example, a study in 1928 noted a large number of over-age pupils (Shuey, 1966, p. 179) and that in coloured (sic) schools yearly promotion was 76% while in white schools it was 84%. A 1947 study reported that the spread of ages in grades 1-8 for white pupils ranged from 6 to 17 years while black pupils ranged from 6 to 19 (Shuey, op. cit., p. 170). This means that some black pupils were up to 2 years behind white pupils. There is mention (Shuey, op. cit., p. 248) of differentials in early leaving, another sign of delay in progress. Another report (Shuey, op. cit., p. 147) notes that coloured (sic) pupils were 6.6% of the total enrolment in grades 1-9 but 11.7% of the non-promotions.
The next issue is whether there is any evidence of the magnitude of changes in age distributions by race over time. There is evidence particularly from the 1980s when the setting of standards and the application of cut scores became popular in the United States in attempts to raise standards. A widely publicised example of a "commonsense approach" to school improvement was put into practice in one small school district. Students who failed to reach a predetermined achievement level were required to repeat a year. The demographic consequences are shown in Figure 6.
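Figure 6. Demographic consequences of the repeat-year policy, 1973 and 1980

| Year | White girls | White boys | Black girls | Black boys |
|---|---|---|---|---|
| 1973 | 94.1% | 90.7% | 97.1% | 97.6% |
| 1980 | 91.8% | 90.0% | 67.0% | 59.0% |
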
If, following the implementation of the policy, all students had by some strange coincidence attempted TOSCA Primary Form A, the difference in the gap due to age would have increased from no difference before the policy to just over 2 percentile points following the policy.
This is only one example of a school district in which the introduction of tougher standards resulted in an alteration of age by grade by race. I submit, therefore, that over the 50 years from which Shuey drew her figures, and Jensen his conclusions, the age relationship between black and white could not possibly have been stable either nationally or from study to study. If the double score of the IQ (age and items passed) had indeed been stable over fifty years then this must have been because any falls in the grade levels of the black students repeating must have been compensated for by their increased average item scores. But this result is extremely unlikely because, as already shown, average scores relate to level of schooling.
The 10-point Gap
Reid and Gilmore (1988, p.9) say that
On average, a 10-point gap in the mean score between European and Maori attainment was demonstrated at all three levels of the TOSCA.
When, during the trial of the TOSCA test items, the comparison was between non-European and European matched on socio-economic status, the authors found that "European children as a group typically performed only marginally better on the trial tests overall by approximately 2-3 points" (Reid and Gilmore, 1988, p. 10). Controlling for social class had substantially reduced this particular gap.
Throughout this paper I have demonstrated the influence of age in grade on the average scores of groups of children. Changes in ages result in changes in scores. This is so whether the calculations are carried out on item scores or final scores. If the calculations are made at one grade level in a horizontal comparison, scores are adjusted downwards for increments of age. If the calculations are made by age using a vertical comparison, population groups with more pupils at a lower level for their age will have a greater number of item scores relating to lower grade level averages. The demographic factor, as shown for the school district illustrated above in Figure 6, could account for up to 2 score points of difference. Somehow a genetically determined gap on the basis of race seems to disappear bit by bit, rather like the smile on the Cheshire cat.
Summary and Conclusions
The genetic claim for IQ differences by race is based on an argument about individuals. It is individuals who bear the genes which are supposed to link to IQ results (see Burt, 1961). But the evidence presented by Jensen and others uses group data based on samples of children. The age samples are not restricted to one grade level. They are spread across grade levels, and different proportions of children of any one birth date are represented at any one grade level. The demographic pattern of ages across grades does not exhibit constancy across time or from site to site because it is the consequence of the way in which education systems control pupil flows. These flows are not determined by individual ability, either by nation or by race, but by system factors controlling group, not individual, progress (McDonald, 1993). A recent publication by the American Psychological Association (Neisser, 1998) reports that the extremes of the bell curve have come closer together, but the authors fail to notice that pupils today are spread across a narrower band of grades. The age distribution has changed.
My argument in this presentation has been that Jensen and a whole heap of others, both his supporters and his adversaries, have overlooked the problem of the IQ scoring system applied to school populations. I also challenge you to find any discussion (as distinct from brief mention) of age distributions and their effects in any of the numerous texts on the IQ debate. Neither Jensen nor Shuey nor any of the New Zealanders reporting a 10-point gap between Maori and European have shown that the age differences between the samples were the same from site to site or sample to sample.
The weight of evidence is that age relationships do and have changed, often to the disadvantage of one race. If pupils end up at lower grade levels an age sample will not be able to produce group item scores to compensate for the depressing effect of these lower grades on score levels. Working to one’s level of schooling is a far more plausible reason for what Jensen interpreted as the massive washout of gains from early intervention, an interpretation which led to his claims that attempting to raise IQ by intervention programmes was not possible because the differences were genetic.
Ages are ages. Children’s ages are not altered by social class position or by item bias or by nutrition. Ironically it is this part of the scoring system and not the item score which is intended to take account of the process of maturation expressed as the gradual unfolding of a biological template. Yet another bit of educational folklore.
The idea of a stable age difference in IQ scores between one race and another is extremely unlikely given the way in which IQ scoring systems work and the way in which education systems spread children of one age across levels. In view of the evidence of instability in distributions of age by grade presented in this paper and the failure by those presenting the genetic viewpoint to check age distributions, I do not think the arguments for genetically determined race differences in IQ have any claim to scientific rigour. Stability in age differences on the basis of race over time and from site to site seems most unlikely. Furthermore, none of the twin studies that I am aware of has sampled monozygotic twins at different grade levels. My conclusion is that stable gaps indicating genetic difference are little more than educational folklore.
Acknowledgements
I would like to thank Robert Garden for assistance with the location of TIMSS data and Pierre Foy, Statistics Canada, for supplying me with unpublished age by grade estimates prepared for the IEA TIMSS. Neither is responsible for my use of them. Thanks are also due to Alison Gilmore for access to the TOSCA database and to Barb Bishop for assistance with analysis.
References
BURT, C. (1961) Is intelligence distributed normally? British Journal of Statistical Psychology, 16, 175-190.
CAHAN, S. & COHEN, N. (1989) Age versus schooling effects on intelligence development, Child Development 60, 1239-49.
CATES, J. & ASH, P. (1983) The end of a commonsense approach to basics, Phi Delta Kappan, 65(2), 137-38.
DE LEMOS, M. (1989) Effects of relative age within grade: Implications for the use of age-norms for group tests of general ability, Bulletin of the International Test Commission, 28/29, 21-44.
DE LEMOS, M. (1994) Not so straightforward: Interpreting the scores, Evaluation and Research in Education, 8; 1&2, 69-83.
DOBZHANSKY, T. (1973) Genetic Diversity and Human Equality, New York: Basic Books.
DEPARTMENT OF EDUCATION (1964) Education Statistics of New Zealand, Wellington: Department of Education. The publisher is now the Ministry of Education.
JENSEN, A. (1969) How much can we boost IQ and scholastic achievement? Harvard Educational Review, 39(1), 1-123.
JENSEN, A. (1972) Genetics and Education. London: Methuen.
JENSEN, A. (1973) Educability and Group Differences, London: Methuen.
FISCHER, C.S., HOUT, M., JANKOWSKI, M.S., LUCAS, S.R., SWIDLER, A. & VOSS, K. (1996) Inequality by Design: Cracking the Bell Curve Myth. Princeton NJ: Princeton University Press.
GALTON, F. (1892) Hereditary Genius: An Inquiry into its Laws and Consequences. London: Macmillan & Company. First edition 1869.
HERRNSTEIN, R. & MURRAY, C. (1996) The Bell Curve: Intelligence and Class Structure in American Life, New York, Simon and Schuster. First published in 1994.
HOLMES, C. (1989) Grade level retention effects: A meta-analysis of research studies, in L. Shepard & M. Smith (Eds.) Flunking Grades: Research and Policies on Retention, Lewes, Falmer.
KEMBLE WELCH, G. (1965) Doctor Smith: Hokianga’s King of the North, Blackwood & Janet Paul.
KENNY, D. (1988) The effect of grade repetition on the performance of infants and primary students. Paper presented at the 24th International Congress of Psychology, Sydney.
KOHN, M. (1996) The Race Gallery: The Return of Racial Science, London, Vintage. First published 1995.
MCDONALD, G. (1989) The normal curve of intelligence: Is this a representation of promotion patterns? Paper presented at the Annual Conference of the New Zealand Association for Research in Education, Heretaunga.
MCDONALD, G. (1993) Ages, stages and evaluation: The demography of the classroom, Evaluation and Research in Education, 7:3, 143-154.
MCDONALD, G. (1998) "Working its magic"? IQ rise and the demography of the classroom, Oxford Review of Education, 24:2, 225-234.
MCDONALD, G. (1999) Comparing school systems to explain enduring birth date effects, (re-submitted to Compare).
MARTIN, M. & KELLY, D. (Eds) (1997) TIMSS Technical Report, Volume II: Implementation and Analysis (Chestnut Hill, MA, TIMSS International Study Center, Boston College).
NEISSER, U. (Ed.) (1998) The Rising Curve. Washington, D.C.: American Psychological Association.
REID, N. & GILMORE, A. (1988) Test of Scholastic Abilities: Technical Supplement. Wellington: NZCER.
REID, N., JACKSON, P., GILMORE, A. & CROFT, C. (1981) Test of Scholastic Abilities: Teacher’s Manual, Wellington: NZCER.
SHUEY, A. (1966) The Testing of Negro Intelligence, New York, Social Science Press (2nd edition). First published 1958.
SHUKER, R. (1988) The OTIS test: its development and use, in M. Olssen (Ed.) Mental Testing in New Zealand, Dunedin: University of Otago Press.
TERMAN, L. & MERRILL, M. (1961) Stanford-Binet Intelligence Scale: Manual for the third revision Form L-M. London, George G. Harrap.
Address for correspondence: Geraldine McDonald, Honorary Fellow, School of Education, Victoria University of Wellington, PO Box 600, Wellington.
E.mail geraldine.mcdonald@clear.net.nz