2100 words
Introduction
One of the first critics of IQ tests after they were brought to America and used by the US Army was the journalist Walter Lippmann. Lippmann was prescient in his arguments against IQ, making anti-measurement arguments similar to those that critics of IQ make today. Although he was arguing against the “army intelligence tests” (the Alpha and Beta along with the Stanford-Binet), his criticisms still hold for any so-called IQ test, since such tests are “validated” on their agreement with other tests (that weren’t themselves validated). He rightly noted that the test items are chosen arbitrarily (and that the questions chosen reflected the test constructor’s biases), and that the test isn’t a measure at all but a sorter of sorts, which in effect classifies people. This is similar to what Garrison (2009) argued in his book A Measure of Failure. He also argued that “IQ” isn’t like length or weight, which is what Midgley (2018) argued and also what Haier (2014, 2018) stated about IQ test scores: they are not like inches, liters, or grams.
Lippmann’s critique of IQ
Lippmann got it right in one major way: he stated that IQ test results give the illusion of measurement because the results “are expressed in numbers.” However, measurement is much more complex than that. There needs to be a specified measured object, object of measurement, and measurement unit for X to be a measure, and if there isn’t, then X isn’t a measure. Lippmann stated that Terman couldn’t demonstrate that he was “measuring intelligence” (McNutt, 2013: 10).
Because the results are expressed in numbers, it is easy to make the mistake of thinking that the intelligence test is a measure like a foot rule or a pair of scales. It is, of course, a quite different sort of measure. For length and weight are qualities which men have learned how to isolate no matter whether they are found in an army of soldiers, a heap of bricks, or a collection of chlorine molecules. Provided the footrule and the scales agree with the arbitrarily accepted standard foot and standard pound in the Bureau of Standards at Washington they can be used with confidence. But “intelligence” is not an abstraction like length and weight; it is an exceedingly complicated notion which nobody has as yet succeeded in defining.
He then invents puzzles which can be employed quickly and with little apparatus, that will according to his best guess test memory, ingenuity, definition and the rest. He gives these puzzles to a mixed group of children and sees how children of different ages answer them. Whenever he finds a puzzle that, say, sixty percent of the twelve year old children can do, and twenty percent of the eleven year olds, he adopts that test for the twelve year olds. By a great deal of fitting he gradually works out a series of problems for each age group which sixty percent of his children can pass, twenty percent cannot pass and, say, twenty percent of the children one year younger can also pass. By this method he has arrived under the Stanford-Binet system at a conclusion of this sort: Sixty percent of children twelve years old should be able to define three out of the five words: pity, revenge, charity, envy, justice. According to Professor Terman’s instructions, a child passes this test if he says that “pity” is “to be sorry for some one”; the child fails if he says “to help” or “mercy.” A correct definition of “justice” is as follows: “It’s what you get when you go to court”; an incorrect definition is “to be honest.”
A mental test, then is established in this way: The tester himself guesses at a large number of tests which he hopes and believes are tests of intelligence. Among these tests those finally are adopted by him which sixty percent of the children under his observation can pass. The children whom the tester is studying select his tests.
…
What then do the tests accomplish? I think we can answer this question best by starting with an illustration. Suppose you wished to judge all the pebbles in a large pile of gravel for the purpose of separating them into three piles, the first to contain the extraordinary pebbles, the second normal pebbles, and the third the insignificant pebbles. You have no scales. You first separate from the pile a much smaller pile and pick out one pebble which you guess is the average. You hold it in your left hand and pick up another pebble in your right hand. The right pebble feels heavier. You pick up another pebble. It feels lighter. You pick up a third. It feels still lighter. A fourth feels heavier than the first. By this method you can arrange all the pebbles from the smaller pile in a series running from the lightest to the heaviest. You thereupon call the middle pebble the standard pebble, and with it as a measure you determine whether any pebble in the larger pile is sub-normal, a normal or a supernormal pebble.
This is just about what the intelligence test does. It does not weigh or measure intelligence by any objective standard. It simply arranges a group of people in a series from best to worst by balancing their capacity to do certain arbitrarily selected puzzles, against the capacity of all the others. The intelligence test, in other words, is fundamentally an instrument for classifying a group of people. It may also be an instrument for measuring their intelligence, but of that we cannot be at all sure unless we believe that M. Binet and Mr. Terman and a few other psychologists have guessed correctly but, as we shall see later, the proof is not yet at hand.
The intelligence test, then, is an instrument for classifying a group of people, rather than “a measure of intelligence.” People are classified within a group according to their success in solving problems which may or may not be tests of intelligence.
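The item-selection rule Lippmann describes in the passage above can be written out as a short sketch. This is only an illustration under stated assumptions: the pass rates are invented (they are not Terman’s data), the function name select_items is hypothetical, and the tolerance around the sixty and twenty percent targets is my own addition, since Lippmann gives only the round figures.

```python
# Hypothetical pass rates: fraction of children at each age who pass an item.
# These numbers are invented for illustration; they are not Terman's data.
CANDIDATE_ITEMS = {
    "define 'pity'": {11: 0.22, 12: 0.61},
    "repeat five digits": {11: 0.55, 12: 0.80},
    "explain a fable": {11: 0.05, 12: 0.30},
}


def select_items(items, age, target=0.60, younger_target=0.20, tol=0.05):
    """Keep items that about `target` of children at `age` pass and about
    `younger_target` of children one year younger pass.

    `tol` is an assumed tolerance; the quoted passage only gives the
    round figures of sixty and twenty percent.
    """
    chosen = []
    for name, rates in items.items():
        at_age = rates.get(age)
        younger = rates.get(age - 1)
        if at_age is None or younger is None:
            continue  # no data for one of the two age groups
        if abs(at_age - target) <= tol and abs(younger - younger_target) <= tol:
            chosen.append(name)
    return chosen


print(select_items(CANDIDATE_ITEMS, age=12))
```

Note that nothing in the rule refers to what an item is supposed to measure. An item is kept or dropped only on how the sampled children happen to perform on it, which is Lippmann’s point that “the children whom the tester is studying select his tests.”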
Even though Lippmann was writing over 100 years ago, in 1922, his critiques have stood the test of time. As one of the first critics of hereditarian dogma, he took on Terman in the pages of The New Republic, and I don’t think Lippmann’s main arguments were ever answered. A century later, it looks to be more of the same. Still, as shown above, even some psychologists admit that certain things that are true of physical measures aren’t true of IQ.
Jansen (2010: 134) noted that “Lippmann vehemently opposed introducing I.Q. tests into the schools on democratic grounds, contending that it would lead to an intellectual caste system.” Lippmann and other environmentalists in the 1920s sought to understand variation in IQ as due to environment, and held that all individuals had the same “biological capacity” for “intelligence” (Mancuso and Dreisinger, 1969). (They also state that a physiological intelligence was the “logical outcome” of the scientific materialism of the 19th century, a point I have made myself.) Not only is the concept of “IQ/intelligence” arbitrary, it is indeed used as an ideological tool (Gonzelez, 1979; Richardson, 2017).
The fact of the matter is, IQ-ists back then—and I would say now as well—were guilty of the naming fallacy (Conley, 1986):
Walter Lippmann had exposed most of its critical weaknesses in a series of articles in the New Republic in 1922. He emphasized the fundamental point that “intelligence is not an abstraction like length and weight; it is an exceedingly complicated notion which nobody has yet succeeded in defining.” Then, in 1930, C.C. Brigham, one of the most influential scientific proponents of eugenics, recanted. Accepting Lippmann’s point, he accused himself and his colleagues of a “naming fallacy” which allowed them “to slide mysteriously from the score in the test to the hypothetical faculty suggested by the name given to the test.” He repudiated the whole concept of national and racial comparisons based on intelligence test scores, and then, in what is surely the most remarkable statement I have ever read in a scientific publication, concluded, “One of the most pretentious of these comparative racial studies—the writer’s own—was without foundation.”
Finally, returning to the test items being arbitrary and reflecting the test constructor’s biases, this was outright admitted by Terman. He created a “norm intelligent” group: in “developing an exclusion-inclusion criteria that favored the [US born white men of north European descent], test developers created a norm ‘intelligent’ (Gersh, 1987, p.166) population ‘to differentiate subjects of known superiority from subjects of known inferiority’ (Terman, 1922, p. 656)” (Bazemore-James, Shinaprayoon, and Martin, 2017). This, of course, proves Lippmann’s point about these tests: the test’s constructors assumed ahead of time who is or is not “intelligent” and then devised tests with specific item content to get their desired distribution, as Terman outright admitted. Lippmann was 100 percent right about that issue.
Conclusion
Lippmann’s critiques of IQ, although they were written 100 years ago, can still be made today. Lippmann got much right in his critiques of Terman and other hereditarian psychologists, and his claim that if you can’t define something then you can’t measure it is valid. Although Lippmann’s warnings on the abuse of IQ testing were not heeded, the fact of the matter is that Lippmann was in the right in this debate 100 years ago. One of Lippmann’s main points, that test items are chosen arbitrarily and chosen to agree with a priori biases, still holds true today: the Stanford-Binet is now on its 5th edition, and since it is “validated” on its “agreement” with older versions of the Stanford-Binet, this assumption is carried into the modern day. (Do note that Terman assumed that men and women should have similar IQ scores and so devised his test to reflect this, which shows, again, that the prior biases the test constructors held were built into the test. So this shows that what can be built in can be built out.)
Lippmann’s observations on the limitations of IQ tests and their use in reinforcing stereotypes have stood the test of time, and by analyzing his arguments we can draw parallels to critiques of IQ today. Although his critiques are 100 years old, there is still much value in them: the arguments he made then can still be made now, since there has been little to no progress in the 100 years since he made them.
It’s interesting that a journalist of all people would be able to mount the kind of critique against IQ that he did, while arguing with one of the men who first used IQ tests on a large scale in America (Terman). This just goes to show that the hereditarian side of the IQ debate has hardly made any progress in the past 100 years, since hereditarians still rely on twin studies and other similar studies to argue for a genetic or hereditary hypothesis of mental abilities. But mental abilities and psychological traits are molded by what one experiences in life. Indeed, one’s IQ is an outcome of one’s life experiences, due to the kind of cultural and psychological tools that people acquire as they age.
Lippmann correctly argued that nothing measurable underlies the concept of “intelligence,” an argument that is very familiar to readers of this blog. It is a powerful argument, and it is obviously quite old, as evidenced by Lippmann’s prescient critique of “IQ.” Lippmann knew a great deal about how tests were constructed and about the methodological and theoretical pitfalls that still plague psychometrics today. While psychometricians have yet to address the serious pitfalls that invalidate their field, the use of their methods continues to perpetuate biased outcomes and reinforce social inequalities, since it is claimed that where a group falls on its “intelligence” score is where it falls on the social hierarchy. This then goes back to Terman’s assumptions when he created the Stanford-Binet. Have any ideas about society and how it’s structured? Have any ideas about the so-called distribution of “intelligence”? Just build it into the test and then claim that your test is showing biologically natural intelligence levels between people. This is what Lippmann argued against, and he was right to argue against it; he knew that the test constructors used items that showed what they wanted.
So, anyone who wants to argue against the concept of IQ would do well to read Lippmann’s six articles arguing against the concept of IQ/intelligence.
you know lippman was a jew right?
rr! comrade! have you actually ever seen anti-black racism?
out here in the PNW…the black kids called me lazy!
almost all white schools k-12 + a few blacks + a few “asians”…
NOTHING!
they were just kids.
LITERALLY NOTHING!
IT LITERALLY NEVER CAME UP!