Guest commentary on the Retrospective Assessment of MR: Effectively Addressing Atkins Questions: Dr. Timothy Derning

Blogmaster introduction and comments:  Below is a guest blog post by Dr. Timothy Derning in response to the recent court ruling regarding Johnston v Fl (click here for prior post that includes links to all prior posts and documents).  This is a longer than usual blog post, but I believe it is worth the space.  Also, I would LOVE to see other professionals (who practice in the area of Atkins cases) offer similar post-hoc analysis of Atkins court decisions.  They can be very educational and instructive.  Such commentaries can serve a valuable function of encouraging discussion, the exchange of ideas, and professional debate.  Thanks Dr. Derning for the post.


A brief review of the state of Florida decision regarding David Eugene Johnston dated April 5, 2010. Mr. Johnston is on Florida's death row. The issue before the court was whether Mr. Johnston has mental retardation. The court's decision was that Mr. Johnston does not have mental retardation. The defense presented four experts. The State presented two experts.

Mr. Johnston is about 50 years old in 2010. He has been given a number and variety of intelligence (IQ) tests throughout his life, beginning in 1967, at age 7, when he was given the Stanford Binet, form LM, and received a 57 IQ. He was administered the WISC twice, once in 1972, FSIQ = 65 (at age 12), and again two years later in 1974, FSIQ = 80 (age 14). It's important to note that the examiner in 1967 made a comment in the report that the IQ score of 57 most likely represented a depressed estimate of intellectual functioning due to an unhealthy home environment, moderate to severe perceptual problems and/or brain damage, and severe emotional disturbance. This examiner stated Johnston’s intellectual ability and potential was "possibly within the lower dull normal range, normal level." Likewise, the 1972 examiner acknowledged that the WISC FSIQ of 65 was in the retarded range, but stated the results were suspect due to possible emotional problems. The 1972 examiner estimated that Johnston’s ability "would be more in keeping with the slow learner or low average range rather than the mentally retarded." In 1974 the evaluator who administered a WISC and reported a FSIQ of 80 commented that young Mr. Johnston (then age 14) was cooperative and engaged. These comments played a significant role in this court's opinion.

Before proceeding it is worthwhile to pause and recall that the definition of mental retardation (aka intellectual disability) has three parts or prongs: subaverage intellectual functioning (a valid IQ score of approximately 70 ± 5 points; 1 SEM); demonstrated deficits in daily adaptive living ability; and onset that begins during development (before age 18). Neither etiology, nor genetics, nor congenital deficits are mentioned or considered in the definition.
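The band of roughly 5 points around a cutoff of 70 in the first prong can be made concrete with a small sketch. The cutoff and margin come from the definition above, and the three scores are the childhood results already described. The function names and the idea of flagging any score whose band reaches the cutoff are illustrative assumptions, not a clinical or legal rule.

```python
def score_band(observed_iq, margin=5):
    """Return the (low, high) band around an observed IQ score."""
    return observed_iq - margin, observed_iq + margin


def band_reaches_cutoff(observed_iq, cutoff=70, margin=5):
    """True if the low end of the band falls at or below the cutoff."""
    low, _high = score_band(observed_iq, margin)
    return low <= cutoff


# Childhood scores reported in the opinion: (year, test, score)
childhood_scores = [
    (1967, "Stanford Binet", 57),
    (1972, "WISC", 65),
    (1974, "WISC", 80),
]

for year, test, score in childhood_scores:
    print(year, test, score, score_band(score), band_reaches_cutoff(score))
```

Under these assumptions the 1967 and 1972 scores fall within reach of the cutoff and the 1974 score does not, which is the tension the experts on both sides had to explain.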

In 1988 (when Mr. Johnston was 28 years old) he was tested again with the adult Wechsler. On the WAIS-R there was a large split between Johnston’s Verbal IQ of 75 and Performance IQ of 101. Such a wide difference is statistically rare and unexpected, so much so that the full-scale IQ is regarded as uninterpretable (meaningless). Nonetheless, one of the state's experts in 2009/10 calculated a full-scale IQ of 83 for the WAIS-R. Mr. Johnston was next given (at age 40) a newer Wechsler (WAIS-III) in 2000. His WAIS-III FSIQ score was 76. Another WAIS-III was administered in 2005, when Johnston was 45 years old. On this WAIS-III his FSIQ score was 82 (or 84; both scores are reported in this opinion). Finally, in July 2009 Mr. Johnston was administered the latest edition of the adult Wechsler IQ test (WAIS-IV, 2009), on which he received a FSIQ score of 61. In short, the pattern of Mr. Johnston's IQ scores is highly variable, ranging from a low of 57 to a high of 84 and then back down to 61. The differences among the IQ scores presented a thorny problem: which scores to accept, which to reject? How to rest comfortably with an opinion about Johnston’s level of general intellectual ability?

The defense experts presented a variety of arguments supporting the opinion that Mr. Johnston is a person with mental retardation; most centered on the IQ scores. One theme among these arguments was the interfering effects of the "Flynn Effect," a statistical phenomenon in which IQ scores artificially increase over time on tests that have not been renormed for a number of years. The defense experts believed that the Flynn Effect could account for the variable IQ scores. The other theme argued for the influence of "practice effects," which refers to the fact that the more often an individual takes the same test, the more familiar they become with that test, and the more likely IQ scores will increase artificially due to practice (familiarity). While these reasons were offered, the court’s opinion does not report the experts’ explanations as to how or why these factors should influence an MR/ID diagnosis (or not) in Mr. Johnston's case, only that it is known that the Flynn Effect and practice effects can be variables that must be considered when evaluating a history of IQ scores. Hopefully, a more complete and relevant explanation was offered during testimony.
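As a rough illustration of how a Flynn Effect adjustment is usually computed, the sketch below subtracts a fixed number of points for each year between a test's norming and its administration. The rate of 0.3 points per year is a commonly cited estimate, not a figure from the court's opinion. The observed score, norming year, and testing year in the example are hypothetical and do not describe any of Mr. Johnston's actual tests.

```python
def flynn_adjusted(observed_iq, norm_year, test_year, points_per_year=0.3):
    """Subtract an assumed per-year norm drift from an observed IQ score.

    points_per_year is an assumption (a commonly cited estimate),
    not a value taken from the opinion.
    """
    years_elapsed = max(0, test_year - norm_year)
    return round(observed_iq - points_per_year * years_elapsed, 1)


# Hypothetical example: a test normed in 1990 and administered in 2005,
# with an observed score of 78.
print(flynn_adjusted(78, norm_year=1990, test_year=2005))
```

Practice effects push scores in the same direction but depend on how recently the same test was taken, so they are not modeled here.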

Much of the argument from the experts (on both sides) centered on the Wechsler IQ test itself, its validity, its psychometrics, and a comparison of scores between the WAIS-III and the WAIS-IV. The defense experts argued that the WAIS-IV (2009) is a superior test to the others, uses a four-factor model to derive IQ scores, and is a superior measure of intelligence compared to the WAIS-III and its two-factor model of interpretation. The defense experts argued that the WAIS-IV, and consequently the 61 IQ obtained from Mr. Johnston in 2009, represented the "gold standard" of intelligence testing, and provides the best indicator of his true intellectual functioning, thus meeting the legal and clinical standard for mental retardation (actually only the first prong of the definition).

With respect to the variability in IQ scores over Mr. Johnston's lifetime, the defense experts argued that such variability is to be "expected" as there is much variability among very low IQ scores. The defense experts then discounted (or gave little weight to) the higher 1974 and 1988 IQ scores "because those tests do not reflect the most current testing data." One defense expert said he could not find the actual 1974 report, did not know where the test was administered or who administered it, and therefore didn't trust the validity of the IQ scores.

The defense expert who administered the WAIS-IV in 2009 also administered the TOMM test as a check against malingering in order to demonstrate the validity of the WAIS-IV FSIQ. Adaptive ability was addressed by several defense experts.  One expert interviewed the mother and brother of the defendant, another administered the Adaptive Behavior Assessment System, Second Edition (ABAS-II) and reported that the defendant scored very low, 4 or less, in all 10 scales of the ABAS-II.

On the other hand the state's experts testified that they assessed Mr. Johnston in 2005 (they did not examine him in 2009, but reviewed the reports of the defense experts). One state expert administered the WAIS-III in 2005 (FSIQ = 82 or 84) and both experts concluded that Mr. Johnston was not a person with mental retardation. Both experts gave greater weight to the 1974 WISC IQ FSIQ score of 80, as they noted the examiner's positive remarks describing the defendant as alert, cooperative, friendly, verbally expressive, and exhibiting self-confidence during the testing.

The defense experts had a more awkward argument to maintain, having to weigh Mr. Johnston’s lower IQ scores more heavily, while giving less weight to higher IQ scores for various reasons. They also had to walk the gauntlet that the WAIS-III, an established and comprehensive measure of intelligence, was not a "piece of junk," while trying to give greater weight and emphasis to the 61 IQ from the WAIS-IV. In spite of various defense experts’ arguments, one of the most important and influential pieces of information came from a state's expert who testified that the correlation between the WAIS-III and the WAIS-IV is .94 "or almost perfect, which signified that the WAIS-III was measuring the same constructs as the WAIS-IV and there was a great deal of overlap between the two instruments, making them almost 'identical.'" One defense expert who argued for the superiority of the WAIS-IV over the WAIS-III was unable to cite the correlation between the two tests as provided in the WAIS-IV test manual, saying that the correlation was probably "mid-.8", which is about .10 lower in magnitude than is actually the case. Knowing the precise correlation (.94) allowed the state’s expert to testify convincingly that the technical concerns raised by the defense about the "two factor model" versus the "four factor model" were relatively insignificant.
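One way to see why the gap between .94 and a guess of "mid-.8" matters is to square each correlation, which gives the proportion of variance the two sets of scores share. The .94 figure is from the testimony described above. Using .85 to stand in for "mid-.8" is an assumption made only for this illustration.

```python
def shared_variance(r):
    """Proportion of variance shared by two measures correlated at r."""
    return r ** 2


manual_r = 0.94   # correlation cited by the state's expert
guessed_r = 0.85  # assumed stand-in for the "mid-.8" estimate

for label, r in [("cited", manual_r), ("guessed", guessed_r)]:
    print(f"{label}: r = {r:.2f}, shared variance = {shared_variance(r):.4f}")
```

On these figures the cited correlation implies roughly 88 percent shared variance, against about 72 percent for the guessed value, so the imprecise estimate understated the overlap between the two instruments considerably.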

The state’s experts also made a salient point that the TOMM test, administered to establish the validity of the WAIS-IV IQ test performance, was given at a much different time than the WAIS-IV, and the court noted, "... the TOMM was not administered properly in that administering the TOMM and the WAIS-IV some two or three months apart, (so that) the ability to make an extrapolation from one test to the other was lost." True enough.

Also of significance, when addressing the drop in IQ from 2005 (FSIQ 82 or 84) to 2009 (FSIQ 61), the state’s experts examined individual responses to IQ test questions and found unexpected inconsistencies, such as when the defendant answered 4 + 5 = 9 in 2005, but when asked to solve a similar problem in 2009 said, “I can’t add.” Also, Johnston could identify Martin Luther King, Jr. in 2005 but said he had never heard of him in 2009.

While the defense experts attempted to assess adaptive ability, no defense expert interviewed anyone at the prison. The court took note of this omission and expressed concern that the defense "... did not interview any personnel at the Department Of Corrections who would have been familiar with Defendant on a day-to-day basis to further assess this issue." The court found the absence of current first-hand information to be a significant weakness in the assessment data, in spite of claims that adaptive test data (from the ABAS) indicated adaptive deficits. In addition, the court found that the defendant's mother and brother provided "... far too little information and were too distant in time to have any probative value."

Overall, it would seem that this Atkins opinion regarding the presence of mental retardation turned on several factors. It was important to the court that IQ scores from the WAIS-III and the WAIS-IV are "virtually identical" [blogmaster comment--click here for CHC analysis of each instrument's FS IQ composition] so that all technical arguments about the superiority of one score over another, and arguments advancing the psychometric superiority of the WAIS-IV, became irrelevant. This opinion noted that the “Flynn effect” arguments were made by defense experts, but the court document unfortunately provides no additional information about how the Flynn effect or practice effects were relevant to Mr. Johnston’s mental retardation claim.

This case is instructive on a number of points. For example, it can be problematic for a contemporary evaluator when childhood IQ scores are accompanied by comments from the earlier evaluator that dismiss a low IQ as not being "representative" of the true functioning of the youngster. It's often the case that less skilled examiners don't trust their own test data and tend to superimpose their own "clinical impressions" that a youngster is not mentally retarded for one reason or another. Usually bias, lack of training or experience, misinformation, or not having the benefit of 40+ years of additional research, play a large part in this clinical interpretation of IQ scores. However, comments about a youngster coming from a dysfunctional home, an unhealthy home environment, and/or having emotional problems, must be considered and given appropriate weight in the retrospective evaluation process. Additionally, when previous evaluators note that a youngster was fully cooperative and engaged during testing, that, too, must be weighed accordingly, especially when there is a noticeable increase in the IQ score. This last point was clear in this case.

In Mr. Johnston’s history of IQ scores the 1967 Stanford Binet IQ of 57 can be seen as an "outlier," an extremely low IQ score that is inconsistent with all other reported IQ scores. Nor is there data to support significant adaptive deficits throughout Mr. Johnston's life (i.e., very low functioning consistent with someone having an IQ of 57, a very low score). Therefore, subsequent evaluators should consider the 1967 examiner’s comment that the 57 IQ score most likely represented a depressed estimate of intellectual ability as (likely) an accurate caveat. Likewise, the 1972 examiner's similar observation that emotional problems depressed the IQ score must be considered accordingly in the retrospective analysis of IQ scores. The judge in Mr. Johnston’s case found the state’s experts’ reasoning compelling. (Importantly, this judge also found the state’s experts’ explanations more detailed and credible with respect to secondary factors that could depress IQ scores in the past and in the present day, namely, anxiety about his impending execution.) The defense experts failed to overcome the “common sense” questions about the low IQ scores from Mr. Johnston’s childhood: an emotionally distraught youngster living in an unhealthy family environment cannot be expected to perform at optimal levels when solving intelligence problems.

Evaluating the adaptive abilities of a defendant who has been living on death row for a number of years presents significant challenges to a contemporary evaluator, not the least of which is collecting valid and reliable information from collateral sources who know how the individual functions. As noted in this opinion, family members may be too unfamiliar and removed from current functioning to provide useful information. Additionally, they may be biased in favor of the defendant. Likewise prison personnel may not be able to provide the kind of information needed in such a limited and structured environment; they, too, may present a different bias toward normalcy. Additionally, prison personnel may not be made available to the evaluator. The court’s opinion in Mr. Johnston's case suggests, however, that it is important to make an honest effort to collect information from contemporary collateral sources, weighing and evaluating the validity of the information after it is collected, or at least after an honest attempt is made.

From a distance (and without benefit of copies of the oral testimonies) it would appear that the defense experts became “blinded” by the bright lights of the IQ test arguments. There is considerable intelligence testing research and expertise to draw upon from the extant literature. Another potential “blinding” of the defense experts is the fact that when one can establish the validity of a higher IQ score, well above the IQ range established for subaverage intellectual functioning (IQ approximately 70), the IQ score alone may have sufficient power to “conclusively refute the mental retardation diagnosis both legally and clinically” as conceded by defense experts. For this reason, a valid higher IQ score can be a “deal breaker” for the first prong of the MR definition, and thus the whole MR claim. Further assessment is not required. It is for this reason, however, that IQ scores are sometimes given more weight and emphasis than they deserve in Atkins arguments. Not infrequently, as in Mr. Johnston’s case, it is not easy to examine a retrospective history of IQ test scores and definitively establish or refute mental retardation, especially in a retrospective evaluation that spans decades, using various tests, and has been conducted by multiple examiners. In such cases retrospective evaluators may need to look elsewhere for data or information to form an opinion, namely, adaptive functioning, the “middle child” of the MR criteria.

The Johnston opinion is a good illustration of the difficulty of evaluating the subaverage intellectual functioning prong of the MR definition in the presence of multiple inconsistent IQ scores. These are typical cases that show up at the doorsteps of psychological experts. Atkins defendants with consistent IQ score histories in the 60’s or 80’s are easier to assess one way or the other. However, someone with Mr. Johnston’s IQ history confounds efforts to reach a firm conclusion regarding subaverage intellectual functioning. Technical expertise regarding psychometric issues may or may not help to untie the knot. In Mr. Johnston’s case most of the tests used were from Wechsler batteries (WISC, WAIS-R, -III, -IV), which is somewhat unusual; often a variety of brief, nonverbal-only, group-administered, or discontinued IQ tests are present in the defendant's records. In this case it would seem the (over-) focus on the Wechsler IQ score validity took precedence among the defense experts’ opinions. Yet, when all is said and done, adaptive functioning (the second prong of the MR definition) may provide the clarity and more accurate insight necessary to evaluate a defendant’s overall functioning with respect to a finding of mental retardation.

This case also highlights the difficulties often encountered by the retrospective Atkins evaluator in both IQ and adaptive deficit domains. Perhaps Mr. Johnston’s argument for a finding of mental retardation would have been more compelling if the presence of adaptive deficits had been more thoroughly documented and presented (assuming such deficits in fact exist). As this Atkins opinion demonstrates, arguments supporting a finding of mental retardation must balance expertise and technical knowledge about intelligence testing against practical and common sense ‘everyday’ considerations. Common sense sometimes leads and sometimes misleads, but it is always a useful foundation and context for an expert’s curiosity, evaluation focus, and final opinions. It certainly was the thread that ran through this carefully reasoned opinion.

As a result of their significant intellectual deficits, people with mental retardation have difficult lives of a particular kind. The difficulties they may experience “getting by” and “fitting in” can make them more vulnerable to criminal influences. This was the concern originally expressed by the U.S. Supreme Court in the Atkins 2002 decision when it ruled against the execution of individuals with mental retardation. The defense experts in this case had an uphill climb. They chose a thorny path. Nonetheless, the larger lesson from this case is not about psychometric technicalities, but about presenting the (in)adequacy of Mr. Johnston’s life, such as it is, or is not. That information was available to both sides. Sometimes the IQ measurement question cannot be answered to the desired level of certainty. The state’s experts in this case were direct and parsimonious. They did not lose sight of the practical issues of the case and the judge was persuaded.

In the end it is always the impaired life and deficient (dependent and limited) daily functioning that is the hallmark of mental retardation. The burden of proof for mental retardation was on Mr. Johnston; the default position was the absence of mental retardation. When the IQ score waters are muddied, as in this Atkins claim, experts must broaden their focus to include other data that may allow one to better see the forest, not just the trees. The state’s experts provided a relatively direct and persuasive context for their opinions. In this case the defense’s psychometric arguments did not carry the day and other data and explanations were not compelling. As noted, Mr. Johnston’s mental retardation claim was not an easy one to establish, and in the end the judge was not persuaded.

[Thank you to Drs. Kevin McGrew and Greg Olley for the generosity of their time making comments and editing suggestions]
