NotPoliticallyCorrect

IQ Test Construction

1550 words

No one really discusses how IQ tests are constructed; people just accept the numbers that are spit out and think that they show one’s intelligence level relative to others who took the test. However, there are huge methodological flaws in IQ tests. One of the largest, in my opinion, is that they are constructed to fit a normal curve, based on ‘prior knowledge’ of who is or is not intelligent.

What people don’t understand about test construction is that the behavior genetic (BG) method must assume a normal distribution. IQ tests have been constructed to display this normal distribution, so we cannot say whether or not it exists in nature, though few human traits fall on the normal distribution. The fact of the matter is this: the normal curve is achieved by keeping relatively more items that about half of the test-takers get right, and relatively few items that either most people or only a few people get right. This forces the normal curve, along with all of the assumptions that come with the so-called IQ bell curve.

Even then, the fact that the normal distribution is forced matters less than the assumptions and conclusions drawn from the forced curve. It is assumed that individual differences in test scores are ‘biological’ in nature. However, since test items are manipulated to get the results that the test constructors want, we cannot know whether these distributions are ‘biological’ in nature at all.

The fact of the matter is, the tests are constructed based on prior knowledge of who is or is not intelligent. This means that we can ‘build the test’ to fit these preconceived notions. The problem of item selection was discussed by Richardson (1998), who discussed boys scoring a few points higher than girls and the question of whether or not these differences should be ‘allowed to persist’. Richardson (1998: 114) writes (12/26/17 Edit: I’ll also provide the quote that precedes this one):

“One who would construct a test for intellectual capacity has two possible methods of handling the problem of sex differences.

1  He may assume that all the sex differences yielded by his test items are about equally indicative of sex differences in native ability.

2  He may proceed on the hypothesis that large sex differences on items of the Binet type are likely to be factitious in the sense that they reflect sex differences in experience or training. To the extent that this assumption is valid, he will be justified in eliminating from his battery test items which yield large sex differences.

The authors of the New Revision have chosen the second of these alternatives and sought to avoid using test items showing large differences in percents passing.”  (McNemar 1942:56)

This is, of course, a clear admission of the subjectivity of such assumptions: while ‘preferring’ to see sex differences as undesirable artefacts of test composition, other differences between groups or individuals, such as different social classes or, at various times, different ‘races’, are seen as ones ‘truly’ existing in nature. Yet these, too, could be eliminated or exaggerated by exactly the same process of assumption and manipulation of test composition.
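The logic of that passage can be put in a few lines of code. What follows is only a sketch under stated assumptions: the item pool and its pass rates are made up for illustration, and the function names (`expected_gap`, `drop_large_gaps`, `keep_favouring_b`) are hypothetical, not anything McNemar or Richardson used. It relies on one fact only: the expected difference in total scores between two groups is the sum of the per-item differences in the proportion passing.

```python
# Each item is (pass rate in group A, pass rate in group B).
# These numbers are invented for illustration; they are not real test data.
ITEM_POOL = [
    (0.60, 0.50),
    (0.55, 0.55),
    (0.40, 0.45),
    (0.70, 0.52),
    (0.50, 0.62),
    (0.65, 0.60),
]


def expected_gap(items):
    """Expected total-score difference (A minus B): the sum of per-item gaps."""
    return sum(a - b for a, b in items)


def drop_large_gaps(items, max_gap):
    """McNemar's second option: discard items whose pass rates differ a lot."""
    return [(a, b) for a, b in items if abs(a - b) <= max_gap]


def keep_favouring_b(items):
    """The opposite choice: keep only items on which group B does at least as well."""
    return [(a, b) for a, b in items if a <= b]


if __name__ == "__main__":
    print("full pool:        ", round(expected_gap(ITEM_POOL), 2))
    print("large gaps dropped:", round(expected_gap(drop_large_gaps(ITEM_POOL, 0.06)), 2))
    print("items favouring B: ", round(expected_gap(keep_favouring_b(ITEM_POOL)), 2))
```

With the same hypothetical pool, the three selection rules give a gap favouring A, no gap, and a gap favouring B. Nothing about the test-takers changes between the three cases, only which items are kept.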

And further writes on page 121:

Suffice it to say that investigators have simply made certain assumptions about ‘what to expect’ in the patterns of scores, and adjusted their analytical equations accordingly: not surprisingly, that pattern emerges!

The only ‘assumptions’ that the test constructors have are the biases they already hold about who is or is not ‘intelligent’; they then construct the test through item selection, excising items that don’t fit their desired distribution. Is that supposed to be scientific? You can ask a group of children a bunch of questions and then construct a test to get the conclusion you want based on item selection.

The BG method needs to assume that IQ is a quantitative trait whose test scores lie on a normal curve, though Micceri (1989) showed that normal distributions are the exception, rather than the rule, for numerous measurable traits. Richardson (1998: 113) further writes:

The same applies to many other ‘characteristics’ of IQ. For example, the ‘normal distribution’, or bell-shaped curve, reflects (misleadingly as I have suggested in Chapters 1 to 3) key biological assumptions about the nature of cognitive abilities. It is also an assumption crucial to many statistical analyses done on test scores. But it is a property built into a test by the simple device of using relatively more items on which about half the testees pass, and relatively few items on which either many or only a few of them pass. Dangers arise, of course, when we try to pass this property off as something happening in nature instead of contrived by test constructors.
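To see how much item choice alone can do to the shape of a score distribution, here is a minimal sketch. It assumes, purely for simplicity, that items are answered independently of each other (real test items are correlated, so this is a toy model, not a description of any actual test). Under that assumption the skewness of the total score has a closed form in the item pass rates. The pool of pass rates and the names `score_skewness` and `select_items` are hypothetical.

```python
def score_skewness(pass_rates):
    """Skewness of the total score when items are independent pass/fail.

    For independent items the variance is sum(p * (1 - p)) and the third
    central moment is sum(p * (1 - p) * (1 - 2 * p)).
    """
    variance = sum(p * (1 - p) for p in pass_rates)
    third_moment = sum(p * (1 - p) * (1 - 2 * p) for p in pass_rates)
    return third_moment / variance ** 1.5


def select_items(pass_rates, low=0.35, high=0.65):
    """Keep only items that roughly half of the try-out sample passes."""
    return [p for p in pass_rates if low <= p <= high]


# Hypothetical try-out pool: mostly easy items plus a few mid-difficulty ones.
POOL = [0.95, 0.92, 0.90, 0.88, 0.85, 0.80, 0.60, 0.55, 0.50, 0.45, 0.40]

if __name__ == "__main__":
    print("skewness, full pool:     ", score_skewness(POOL))
    print("skewness, selected items:", score_skewness(select_items(POOL)))
```

In this toy pool the full set of items gives a left-skewed score distribution, while the items kept by `select_items` give a symmetric one. The symmetry comes from which items were retained, which is the point Richardson makes about the bell curve being built in.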

So with knowledge of how tests are constructed, something becomes very obvious: we can construct IQ tests that, say, show blacks scoring higher than whites and women scoring higher than men. We can then make the assumption that there are genes responsible for this distribution and then ‘find genes’ that supposedly cause these differences in test scores (which are constructed to show the differences!). What then? Let’s say that someone did do that; would the logical conclusion be that there are genes ‘driving’ the differences in IQ test scores?

Richardson (2017: 3) writes:

In summary, either directly or indirectly, IQ and related tests are calibrated against social class background, and score differences are inevitably consequences of that social stratification to some extent. Through that calibration, they will also correlate with any genetic cline within the social strata. Whether or not, and to what degree, the tests also measure “intelligence” remains debateable because test validity has been indirect and circular. … Such circularity is also reflected in correlations between IQ and adult occupational levels, income, wealth, and so on. As education largely determines the entry level to the job market, correlations between IQ and occupation are, again, at least partly, self-fullfilling. … CA [cognitive ability], as measured by IQ-type tests, is intrinsically inter-twined with social stratification, and its associated genetic background, by the very nature of the tests.

This, again, comes back to the non-existent construct validity of IQ tests. Construct validity “defines how well a test or experiment measures up to its claims.” No such construct validity exists for IQ tests. If breathalyzers didn’t test someone’s fitness to drive, would they still be a good measure? If they had no construct validity, if there was no biological model to calibrate the breathalyzer against, would we still accept it as a realistic model to test people against and judge their fitness to drive? Yet another definition of construct validity comes from Strauss and Smith (2009), who write that psychological constructs are “validated by testing whether they relate to measures of other constructs as specified by theory.” No such biological model exists for IQ; why expect some type of biological model like this when there are other perfectly well-reasoned responses to how and why individuals differ in IQ test scores (Richardson, 2002)?

The normal distribution is forced, which IQ-ists claim to know. Richardson (1998) notes that Jensen “noted how ‘every item is carefully edited and selected on the basis of technical procedures known as “item analysis”, based on tryouts of the items on large samples and the test’s target population’ (1980:145).” These ‘tryouts’ are what force the normal curve, and no matter how ‘technical’ the procedures are, there are still huge biases, which then make people draw huge assumptions, again, based on who is or is not intelligent.

In sum, IQ tests are constructed to fit a normal curve on the basis of an assumption of a normal distribution, and on the presupposed basis of who is or is not ‘intelligent’ (whatever that means). The BG method needs to assume that IQ is a quantitative trait which exhibits a normal distribution. IQ is assumed to be like height, or weight, but which physiological process in the body does it mimic? I have argued that there is no physiological basis to ‘IQ’ or to what the tests measure, and that score differences can be explained not by biology but by test construction. I wonder what the distributions of IQ test scores would look like without forced normal distributions? Since it is assumed that IQ tests measure something directly measurable (like height and weight, the usual comparisons), the scores must fall on a normal distribution, which other measurable psychological traits generally do not show (Micceri, 1989; Buzsáki and Mizuseki, 2014).

Some may argue that ‘they know this’ (they being psychometricians). However, ‘they’ must know that most of their assumptions and conclusions about ‘good and bad genes’ rest on the huge assumption of the normal distribution. IQ test scores do not show a normal distribution; the tests were designed to create it. The defence that most psychological traits show a strong skew to one side, and that this is why a normal distribution is forced, is meaningless. The fact of the matter is that how the tests are constructed means that we should be cautious about what these tests actually test, given the assumptions that we currently have about them.

15 Comments

  1. RaceRealist says:

    Melo,

    This forced distribution makes data less accurate: Blatantly false, and only someone who doesn’t understand what Normal distributions are would ever say something like that.

    Of course there is no way to ‘know’ whether or not it ‘makes data less accurate’, but with the knowledge that other psychological traits’ distributions are not normal, it’s a great guess. Of course there is a chance that the data is less accurate and that one should be cautious about the conclusions they draw from the tests. (Like saying X is smarter than Y because he scored higher on the test and the reason is ‘genetic’.)

    So basically your argument is redundant. It assumes a lot of things, and tries to paint a “conspiracy”.

    It’s not ‘redundant’ nor does it ‘try to paint a “conspiracy”‘. The basic argument brings up the flaws in test construction and cautions to be extremely careful with the conclusions drawn from the forced normal distribution. It is then assumed that people towards the right end have a surfeit of ‘good genes’ while those towards the middle have ‘average genes’ and those towards the left end have ‘bad genes’. The assumption is that genes are ‘additive’ and have ‘independent genetic effects’. This implies that genes work ‘independent’ of the environment, which is very, very wrong:

    … these conclusions are erroneous due to large violations of the additivity assumption underlying behavioral genetics methods – that sources of genetic and shared and nonshared environmental variance are independent and non-interactive.

    Daw, J., Guo, G., & Harris, K. M. (2015). Nurture net of nature: Re-evaluating the role of shared environments in academic achievement and verbal intelligence. Social Science Research, 52, 422-439. doi:10.1016/j.ssresearch.2015.02.011

    That’s one reason why the assumption of the normal distribution is flawed; it then assumes that people have ‘good and bad’ genes that ’cause’ their IQ scores. Genes don’t work like that.

    Psychology is one of the softest ‘sciences’ out there. So they’ll do anything to protect their ‘golden egg’ called ‘IQ’ since it’s the ‘best they have’, and even then, as I’m showing, it’s not good enough but they have fooled themselves and others that their construct called ‘IQ’ predicts life success due to ‘testing for something biological’ (whatever it is) and that if you score higher you have a surfeit of ‘better genes’ than one who scores lower. Genes don’t work that way, and that’s what psychologists want you to believe with their forced normal distribution through item selection.

    • meLo says:

      The validity of IQ does not rest upon its genetic correlations. If that’s your only argument then I’m not sure what you want from me.

    • RaceRealist says:

      It rests on there being no agreed-upon model for its validity. If there were no validity to, say, scale weight, would it be a useful measure? Breathalyzers? White blood cell count? Its validity (or lack thereof) is important to discuss.

    • meLo says:

      The physiology argument will have to wait RR, plus aren’t you writing something about that?

  2. IQ tests are not perfect at measuring intelligence, but they do a good enough job. High scores correlate with success at life, both in long-term success such as career and income, but also the acquisition of abstract, cognitively demanding skills such as coding, math, and writing. Very seldom will someone online post about a high SAT and/or IQ score and be as dumb as a rock; usually, such people are quite articulate and well-read. Sub-tests such as digit recall admit a normal distribution without any need for forcing.

    • RaceRealist says:

      They don’t even do a ‘good enough job’ because there is a preconceived notion of who is or is not intelligent which is built into the test. The ‘high scores correlate with success in life’ because of how they’re constructed. They ‘correlate with academic achievement’ because they’re different versions of the same test. What people post online about their own test scores is irrelevant, but of course there is a relationship like that between IQ and achievement tests because they’re, again, different versions of the same test. Are people’s IQs just digit recall though? Digit recall tests working memory (whatever that is), so the fact that a subtest has an unforced normal distribution (barely) says nothing to the overall critique of the normal curve being forced through item selection.

  3. Steve Sailer says:

    “we can construct IQ tests that, say, show blacks scoring higher than whites ”

    If you could create such a test while maintaining its usefulness, you would become very rich. Why don’t you construct such a test and make lots and lots of money?

    • RaceRealist says:

      What usefulness? The ‘usefulness’ is built into the test. I’m not trained in ‘item analysis’ (Jensen, 1980) so I cannot construct such a test. However, what I stated is the logical conclusion of test construction. The quotation from McNemar shows how arbitrary the process of ‘item analysis’ is and proves my point on even group ‘differences’.

      That’s not to say that I don’t believe that races are ‘equal’ in their mental faculties, even ‘intelligence’ (whatever that is). However, IQ doesn’t test ability for complex cognition (Richardson and Norgate, 2014); the ‘usefulness’ of the test is built into it. It only has as much ‘predictive power’ as the test constructors allow.

    • Bald says:

      Okay so, are you saying that all studies around IQ are false?

      Such as the consumption of fish and IQ for example? Such as those about lead and IQ?

    • Bald says:

      Because I think that your opinion isn’t conclusive enough compared to the empirical evidences around IQ (Flynn effect, regression to the mean…etc)

      Also, what’s the point of making a test where black women score higher than white men if they still have lower SES?

      To finish, you seems to depend too much on Richardson.

    • rw95 says:

      Hey Steve, why don’t you come back when you actually have some kind of credentials in this sort of thing? Because I know you’re so qualified to talk about matters of biology and psychology with your Master’s Degree in… Finance and Marketing.

      By the way, how’s that whole Trump thing going?

    • RaceRealist says:

      Okay so, are you saying that all studies around IQ are false?

      False? No. Does it test what psychometricians et al claim it does? Not by a long shot.

      Such as the consumption of fish and IQ for example? Such as those about lead and IQ?

      I touched briefly on breastfeeding and intelligence in my reply to Jared Taylor. The fact that it’s associated with IQ in RCTs means… What? That fatty acids are good for brain development? Who knew? In regard to lead and IQ, lead changes how the brain functions, which also can be passed down epigenetically, from mother to child, then child to the grandchild. The fact that lead depresses IQ is meaningless because it disrupts normal brain functioning.

      Because I think that your opinion isn’t conclusive enough compared to the empirical evidences around IQ (Flynn effect, regression to the mean…etc)

      The Flynn Effect can be explained by the rise in the middle class. Regression to the mean proves that IQ tests biological processes?

      Also, what’s the point of making a test where black women score higher than white men if they still have lower SES?

      What’s the point of this comment?

      To finish, you seems to depend too much on Richardson.

      Irrelevant.

      rw95,

      Hey Steve, why don’t you come back when you actually have some kind of credentials in this sort of thing?

      Appeals to authority aren’t cool.

    • rw95 says:

      RR,

      The urge to stick it to Steve is extremely tempting. But your blog, your rules.

    • Bald says:

      RaceRealist, your response about ‘regression to the mean” is too vague.

      And my criticism about Richardson is not incorrect, I’ve heard that his studies are based on low samples, check E. Kirkegaard article on you, it also talk about Richardson.

      I’m not saying that you’re wrong, just that you lack of solid evidences for your main arguments. Richardson seems to be a huge contrarian.

      but I will concede for the rest of your points such as lead, SES…etc

    • RaceRealist says:

      RaceRealist, your response about ‘regression to the mean” is too vague.

      It’s only asking if that proves that there is a biological substrate to IQ.

      And my criticism about Richardson is not incorrect

      In this instance it is because the subject of test construction and validity has not been addressed.

      I’ve heard that his studies are based on low samples, check E. Kirkegaard article on you, it also talk about Richardson.

      I’m aware of his replies to a few of my articles. I’ll respond to him in due time.

      I’m not saying that you’re wrong, just that you lack of solid evidences for your main arguments. Richardson seems to be a huge contrarian.

      The evidence for test construction and validity is sound and doesn’t fully rely on Richardson (as if that matters).

      but I will concede for the rest of your points such as lead, SES…etc

      It affects normal functioning and therefore is irrelevant to normal variation, which is what the discussion rests on.
