Wednesday, December 16, 2009

How likely is each score in an IQ SEM confidence band to represent the person's true IQ? AP101 Brief #3 (supplement to IAP AP101 Report #5)

I've been pleased with the response to IAP AP101 Report #5:  Standard error of measurement (SEM):  An explanation and facts for "fact finders" in MR/ID death penalty proceedings.  As a number of astute readers have noted, I stayed away from the various statistical and psychometric nuances of the SEM.  This was deliberate, but it was not without a cost:  some readers have asked me to answer more specific questions.  Thus, in this Applied Psychometrics Brief (see others on blog sidebar) I briefly address one question that was asked.  If you have not read the original report, I suggest you read it now and then return to this brief.  I will not be revisiting the basic concepts and material presented in IAP AP101 Report #5.

I've learned that lawyers will often ask psychologists (who have administered intelligence tests reported in Atkins proceedings) whether the highest IQ score bounded by the 95% SEM confidence band is just as likely to represent the person's "true IQ" score as is the actual obtained score (or, conversely, whether scores at the lower end of the SEM confidence band are just as likely).  The answer is NO!

As described in IAP AP101 Report #5, if a person were tested repeatedly on the same IQ test, the scores would form a normal bell curve distribution.  In report #5, I included the following figure to demonstrate this phenomenon with real-world data for one individual.  As you can see, the shape approximates the normal curve.

A nicer version, which represents the theoretical distribution of IQ scores a person would obtain if tested hundreds and hundreds of times, is below.

Inspection of these figures should lead to the obvious conclusion that when a person's obtained or measured IQ score is bounded by a 95% confidence interval band, the most probable estimates of the person's true IQ are the value at the center of the figure and the values immediately adjacent to it.  This is the reason for the high peak.  The largest area under the curve indicates that a person's true IQ score most likely occurs in this region (near the observed or measured IQ score).  The farther a score is from the middle of the distribution (the person's obtained or measured score), the less likely it is to represent the person's true IQ score.  This is represented by less area under the normal curve as one moves out in either direction.

To help professionals answer specific questions about the probability of score X representing the person's true IQ, I've harnessed the technical power and tools of the normal curve (as discussed in report #5) to generate a table of probabilities.  The table is next, followed by an explanation of its use and interpretation.

This table numerically captures what is clear from the two figures.  Let me explain.  Suppose a person obtains a score of 70 on an accepted standardized IQ test.  The above table lists the chances (probabilities) that scores from 1 to 10 points higher (or lower) than the obtained IQ score (70) represent the person's true IQ.  We use Column A to add or subtract standard score (SS) points from the measured IQ, and then scan across the table to examine the probability that a score represents the person's true IQ score.

  • A person's measured IQ score is the best estimate of the person's true IQ score.  Reading across the table, there is a 50% probability that, in the current example, the IQ score of 70 represents the person's true IQ (acknowledging measurement error---see report #5).  Note that any IQ score can be used with this table.  If the person obtained a score of 65, there is a 50% probability that the 65 is the best estimate of the person's true score.
  • The next best estimate (for an obtained IQ of 70) would be a score of 71 or 69, each having a 42.1% probability (chance) of representing the person's true IQ.  As one can see, the best estimates of a person's true IQ score are those scores that are in the closest proximity to the obtained IQ score (which is the most probable estimate).
  • As scores in the 95% SEM confidence band become farther away from the obtained or measured IQ score, they have a lesser probability of representing the person's true IQ score.  For example, given an IQ of 70, the upper limit of +2 SEM would be a 75 (70 plus the +5 in column A).  Reading across from this value (+2 SEM = +5), the probability of the 75 representing the person's true IQ score is much lower--15.9%, or approximately 16%.  The same probability exists for a score of 65 (-5 in column A).
  • If an attorney were to ask a testifying psychologist "given the defendant's IQ score of 70 and the standard error of measurement (SEM) of + 5 points, isn't it true that a score of 75 is just as likely to represent the person's true IQ?"  The correct response would be:  "No.  A score of 75 has an approximately 16% chance of representing the person's true IQ, while scores closer to the score of 70, for example, 68, 69, 71, 72, and 70 in particular, are more likely to represent the person's true IQ by a factor of 2 to 3 times.  Scores of 68 and 72 have a 34% chance of representing the person's true IQ, twice as likely as the score of 75.  Scores of 69 and 71 are even more likely to represent the person's true IQ--approximately 42% probability or slightly over 2 1/2 times more likely than a 75.  The best estimate of the person's true IQ is the observed IQ score, which has a 50% probability of representing the person's true IQ--approximately 3 times more likely than the score of 75."
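
The percentages quoted in the bullets above (50%, 42.1%, 34%, and 15.9%) can be reproduced as one-tailed areas under a normal curve centered on the obtained score, provided one assumes a spread (standard deviation) of 5 points.  That spread is inferred here from the quoted numbers; it is an assumption, not a description of how the table was actually built.  A minimal Python sketch under that assumption (the names SPREAD_POINTS, tail_probability, and probability_table are illustrative only):

```python
import math

# ASSUMPTION: a 5-point spread, inferred from the percentages quoted in the
# text. It is not a documented property of any particular IQ test.
SPREAD_POINTS = 5.0


def tail_probability(offset, spread=SPREAD_POINTS):
    """Area under a normal curve at or beyond `offset` points from its center."""
    z = abs(offset) / spread
    # Upper-tail area of the standard normal, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))


def probability_table(max_offset=10, spread=SPREAD_POINTS):
    """Map each whole-point offset 0..max_offset to its tail probability."""
    return {k: tail_probability(k, spread) for k in range(max_offset + 1)}


obtained_iq = 70  # the worked example used in the text
table = probability_table()
at_band_edge = table[5]  # probability attached to 75 (or 65)

for offset, p in table.items():
    low, high = obtained_iq - offset, obtained_iq + offset
    ratio = p / at_band_edge
    print(f"+/-{offset:2d}  ({low} or {high}):  {100 * p:5.1f}%   "
          f"{ratio:4.2f} x the value at +/-5")
```

Under this assumption, offsets of 0, 1, 2, and 5 points come out at roughly 50%, 42.1%, 34.5%, and 15.9%, and the last column gives ratios of about 3.2, 2.7, and 2.2 for offsets of 0, 1, and 2 points, in line with the "3 times", "slightly over 2 1/2 times", and "twice" comparisons in the sample testimony.
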
I hope the above IQ/SEM probability cheat sheet will help clarify the nature of SEM in IQ testing and will result in its proper interpretation in Atkins proceedings.  I hope that it, in particular, helps eliminate "SEM shell games" that try to convince the court that the highest possible score in a 95% SEM confidence band is just as likely to be the best estimate of the person's true IQ as lower scores.  Arguments of this type are a distortion of the science of psychometrics for other purposes and are wrong.  Period.

Finally, for those knowledgeable in psychometrics, please read the original report #5.  In that report I acknowledge many of the statistical nuances of the SEM literature (e.g., symmetrical vs. asymmetrical confidence bands; bands based on first estimating a true score; conditional or ability-centered SEMs; etc.).  This brief report, as well as the original, is intended to address how SEM is most commonly used in practice (in the courts in particular)--not how it is used and discussed among measurement quantoids at professional conferences and in statistical journals.
