Race, Test Bias, and 'IQ Measurement'

1800 words

Standardized testing, including IQ testing, has a contentious history. What causes differences in score distributions between groups of people? I have stated at least four proposed reasons for the test-score gap:

(1) Differences in genes cause differences in IQ scores;

(2) Differences in environment cause differences in IQ scores;

(3) A combination of genes and environment causes differences in IQ scores; and

(4) Differences in IQ scores are built into the test based on the test constructors’ prior biases.

I hold to (4) since, as I have noted, the hereditarian-environmentalist debate is frivolous. There is, as I have been saying for years now, no agreed-upon definition of ‘intelligence’, since there are such disparate answers from the ‘experts’ (Lanz, 2000; Richardson, 2002).

For the lack of such a definition only reflects the fact that there is no worked-out theory of intelligence. Having a successful definition of intelligence without a corresponding theory would be like having a building without foundations. This lack of theory is also responsible for the lack of some principled regimentation of the very many uses the word ‘intelligence’ and its cognates are put to. Too many questions concerning intelligence are still open, too many answers controversial. Consider a few examples of rather basic questions: Does ‘intelligence’ name some entity which underlies and explains certain classes of performances, or is the word ‘intelligence’ only sort of a shorthand-description for ‘being good at a couple of tasks or tests’ (typically those used in IQ tests)? In other words: Is ‘intelligence’ primarily a descriptive or also an explanatorily useful term? Is there really something like intelligence or are there only different individual abilities (compare Deese 1993)? Or should we turn our backs on the noun ‘intelligence’ and focus on the adverb ‘intelligently’, used to characterize certain classes of behaviors? (Lanz, 2000: 20)

Nash (1990: 133-4) writes:

Always since there are just a series of tasks of one sort or another on which performance can be ranked and correlated with other performances. Some performances are defined as ‘cognitive performances’ and other performances as ‘attainment performances’ on essentially arbitrary, common sense grounds. Then, since ‘cognitive performances’ require ‘ability’ they are said to measure that ‘ability’. And, obviously, the more ‘cognitive ability’ an individual possesses the more that individual can achieve. These procedures can provide no evidence that IQ is or can be measured, and it is rather beside the point to look for any, since that IQ is a metric property is a fundamental assumption of IQ theory. It is impossible that any ‘evidence’ could be produced by such procedures. A standardised test score (whether on tests designated as IQ or attainment tests) obtained by an individual indicates the relative standing of that individual. A score lies within the top ten percent or bottom half, or whatever, of those gained by the standardisation group. None of this demonstrates measurement of any property. People may be rank ordered by their telephone numbers but that would not indicate measurement of anything. IQ theory must demonstrate not that it has ranked people according to some performance (that requires no demonstration) but that they are ranked according to some real property revealed by that performance. If the test is an IQ test the property is IQ — by definition — and there can in consequence be no evidence dependent on measurement procedures for hypothesising its existence. The question is one of theory and meaning rather than one of technique. It is impossible to provide a satisfactory, that is non-circular, definition of the supposed ‘general cognitive ability’ IQ tests attempt to measure and without that definition IQ theory fails to meet the minimal conditions of measurement.

This is similar to Mary Midgley’s critique of ‘intelligence’ in What Is Philosophy For? (Midgley, 2018), her last book before her death. The ‘definitions’ of ‘intelligence’ and, along with them, its ‘measurement’ have never been satisfactory. Haier (2016: 24) refers to Gottfredson’s ‘definition’ of ‘intelligence’, stating that ‘intelligence’ is a ‘general mental ability.’ But if that is the case, and ‘intelligence’ is a ‘general mental ability’ (g), then ‘intelligence’ does not exist, because ‘g’ does not exist as a property in the brain. Lanz’s (2000) critique is also like Howe’s (1988; 1997): that ‘intelligence’ is a descriptive, not explanatory, term.

Now that the concept of ‘intelligence’ has been covered, let’s turn to race and test bias.

Test items are biased when they have different psychological meanings across cultures (He and van de Vijver 2012: 7). If they have different meanings across cultures, then the tests will not reflect the same ‘ability’ between cultures. Exposure to the knowledge on a test, and to its correct usage, is imperative for performance. If one has not been exposed to the content on the test, how can one be expected to do well? Indeed, there is much evidence that minority groups are not acculturated to the items on the test (Manly et al, 1997; Ryan et al, 2005; Boone et al, 2007). This is what IQ tests measure: acculturation to the culture of the test constructors, the school curriculum, and school teachers, all aspects of white, middle-class culture (Richardson, 1998). Ryan et al (2005) found that reading level and educational level, not race or ethnicity, were related to worse performance on psychological tests.

Serpell et al (2006) took 149 white and black fourth-graders and randomly assigned them to ethnically homogeneous groups of three, working on a motion task on a computer. Both blacks and whites learned equally well, but the transfer outcomes were better for blacks than for whites.

Helms (1992) claims that standardized tests are “Eurocentric”, which is “a perceptual set in which European and/or European American values, customs, traditions and characteristics are used as exclusive standards against which people and events in the world are evaluated and perceived.” In her conclusion, she stated that “Acculturation and assimilation to White Euro-American culture should enhance one’s performance on currently existing cognitive ability tests” (Helms, 1992: 1098). There just so happens to be evidence for this (along with the studies referenced above).

Fagan and Holland (2002) showed that when exposure to different kinds of information was required, whites did better than blacks, but when performance was based on generally available knowledge, there was no difference between the groups. Fagan and Holland (2007) asked whites and blacks to solve problems found on usual IQ-type tests (e.g., standardized tests). Half of the items were solvable on the basis of available information, but the other items were solvable only on the basis of having acquired previous knowledge, which indicated test bias (Fagan and Holland, 2007). They, again, showed that when knowledge is equalized, so are IQ scores. Thus, cultural differences in information acquisition explain differences in IQ scores. “There is no distinction between crassly biased IQ test items and those that appear to be non-biased” (Mensh and Mensh, 1991). This is because each item is chosen because it agrees with the distribution that the test constructors presuppose (Simon, 1997).

How do the neuropsychological studies referenced above, along with Fagan and Holland’s studies, show that bias is built into the tests through their construction, and that this causes the observed distribution of scores? Simple: Since the test constructors come from a higher social class, and the items chosen for inclusion on the test are more likely to be found in certain cultural groups than others, it follows that lower-scoring groups score lower because they were not exposed to the culturally-specific knowledge used on the test (Richardson, 2002; Hilliard, 2012).

The [IQ] tests do what their construction dictates; they correlate a group’s mental worth with its place in the social hierarchy. (Mensh and Mensh, 1991)

This is very easily seen in how such tests are constructed. The biases go back to the early days of standardized testing, the SAT being one early example. The tests’ constructors had an idea of who was or was not ‘intelligent’ and so constructed the tests to show what they already ‘knew.’

…as one delves further … into test construction, one finds a maze of arbitrary steps taken to ensure that the items selected — the surrogates of intelligence — will rank children of different classes in conformity with a mental hierarchy that is presupposed to exist. (Mensh and Mensh, 1991)

Garrison (2009: 5) states that standardized tests “exist to assess social function” and that “Standardized testing—or the theory and practice known as ‘psychometrics’ … is not a form of measurement.” Tests are constructed today the same way they were constructed in the 1900s: with arbitrary items and a presupposed mental hierarchy, which then becomes baked into the tests by virtue of how they are constructed.

IQ-ists like to say that certain genes are associated with high intelligence (using their GWASes), but what could the argument possibly be that would show that variation in SNPs would cause variation in ‘intelligence’? What would a theory of that look like? How is the hereditarian hypothesis not a just-so story? Such tests were created to justify the hierarchies in society; they were constructed to give the results that they get. So, I don’t see how genetic ‘explanations’ are not just-so stories.

(1) Blacks and whites are different cultural groups.

(2) If (1), then they will have different experiences by virtue of being different cultural groups.

(3) So blacks and whites, being different cultural groups, will score differently on tests of ability, since they are exposed to different knowledge structures due to their different cultures, and so all tests of ability are culture-bound. (Knowledge, Culture, Logic, and IQ)

Rushton and Jensen (2005) claim that the 30 years of IQ research they review points to a ‘genetic component’ to the black-white IQ gap, relying on the flawed “Minnesota Study of Twins Reared Apart” (Joseph, 2018), among other methods, to generate heritability estimates and state that “The new evidence reviewed here points to some genetic component in Black–White differences in mean IQ.” The concept of heritability, however, is a flawed metric (Bailey, 1997; Schonemann, 1997; Guo, 2000; Moore, 2002; Rose, 2006; Schneider, 2007; Charney, 2012, 2013; Burt and Simons, 2014; Panofsky, 2014; Joseph et al, 2015; Moore and Shenk, 2016; Panofsky, 2016; Richardson, 2017). That genes (G) and environment (E) interact means that we cannot tease out “percentages” of nature and nurture’s “contribution” to a “trait.” So, one cannot point to heritability estimates as if they point to a “genetic cause” of the score gap between blacks and whites. Further note that the gap has narrowed in recent years (Dickens and Flynn, 2006; Smith, 2018).

And now, here is another argument, based on the differing experiences of cultural groups, which then explain IQ score differences (e.g., Mensh and Mensh, 1991; Manly et al, 1997; Kwate, 2001; Fagan and Holland, 2002, 2007; Cole, 2004; Ryan et al, 2005; Boone et al, 2007; Au, 2008; Hilliard, 2012; Au, 2013).

(1) If children of different class levels have experiences of different kinds with different material; and
(2) if IQ tests draw a disproportionate number of test items from the experiences of the higher classes; then
(c) higher class children should have higher scores than lower-class children.


