REVIEW OF FORENSIC ASSESSMENT INSTRUMENTS

This section reviews 8 instruments that have been developed for assessing individuals' functional abilities to participate in decisions about their treatment. All of these instruments have been developed within the past 15 years, so none of them were reviewed in the first edition of this book.

In contrast, the 3 instruments reviewed in the first edition are not reviewed here, because there has been no new research on those instruments in the interim.

The first 4 instruments in this section were developed for use as clinical tools to assess patients' capacities for competent consent: the Capacity to Consent to Treatment Instrument, the Hopemont Capacity Assessment Interview, the Hopkins Competency Assessment Test, and the MacArthur Competence Assessment Tool for Treatment. These are followed by the MacArthur Competence Assessment Tool for Clinical Research, intended for use in evaluating competence of patients to consent to participate in clinical research associated with their need for treatment. Finally, reviews are provided for three instruments (Understanding Treatment Disclosures; Perceptions of Disorder; Thinking Rationally About Treatment) that were developed as research tools to examine hypotheses regarding the capacities of persons with mental illness to make treatment decisions.

An additional instrument, the Standardized Consent Capacity Interview (SCCI), deserves mention but is not reviewed here because there was not yet information on its psychometric properties at the time of this review. The SCCI was derived from Marson's Capacity to Consent to Treatment Instrument (CCTI), which is reviewed in this section. It was designed specifically for routine clinical use and therefore attempts some of the same objectives as the CCTI but in more economical fashion. Descriptions of the SCCI have been published in studies in which it formed the structure for interviews of patients to serve as stimuli in competence to consent research (Marson, McInturff, et al., 1997; Marson, Hawkins, et al., 1997) that examined clinicians' competency judgments.

But the clinicians were not asked to score the SCCI itself, so the existing reports do not speak to the instrument's reliability or validity. We anticipate that reports of the psychometric properties and clinical utility of the instrument are forthcoming.

Capacity to Consent to Treatment Instrument (CCTI)

Author

Marson, D.

Author Affiliation

Department of Neurology, University of Alabama at Birmingham

Primary Reference

Marson, D., Ingram, K., Cody, H., & Harrell, L. (1995). Assessing the competency of patients with Alzheimer's Disease under different legal standards. Archives of Neurology, 52, 949-954.

Description

The Capacity to Consent to Treatment Instrument (CCTI) was originally developed in the early 1990s to assess patients' capacities related to competence to consent to treatment, especially for persons with dementias. The manual is available from the CCTI's primary author. The instrument was introduced as a "prototype" in the primary 1995 reference noted above, although it did not have a name at that time. It was identified as the CCTI in later publications, and the instrument was not altered from its "prototype" form. Some later publications, however, employ only parts of the CCTI and therefore do not present a few of the original items. The present description is based on the Marson, Ingram, Cody and Harrell (1995) article, as well as materials (including scoring criteria) provided by the instrument's author in a January 1999 CCTI revision.

The CCTI consists of two clinical vignettes, each of which presents a hypothetical medical problem and symptoms, as well as two treatment alternatives with associated risks and benefits. The two vignettes are labeled "neoplasm" (a brain tumor and its possible treatments) and "cardiac" (heart blockage problem). The wording of the vignettes is highly standardized, and they were written at a fifth- to sixth-grade reading level. After listening to a vignette (aided by a printed copy in hand during the oral presentation), the patient is asked 14 questions that elicit information with which to evaluate the patient's capacities relevant for competence to consent to treatment.

Administering both vignettes and their questions requires about 20 to 25 minutes (f).

The patient's answers contribute to two types of scores, called the "Quantitative Scoring System" and the "Qualitative Scoring System."

The Quantitative Scoring System provides scores on the following subscales, which are referred to as "legal standards" (or "LS") (referring to standards for legal competence):

• LS1: The capacity to evidence a treatment choice. This is assessed with a single item asking patients what treatment in the vignette they would choose.

• LS2: The capacity to make a "reasonable" choice. This is also assessed with the item that asks for the patient's treatment choice, and is defined as the choice that most reasonable people would make.

• LS3: The capacity to appreciate the emotional and cognitive consequences of a treatment choice. This is assessed with 3 questions in the original CCTI, and with 2 questions in the January 1999 revision, that ask patients what plans they need to make for the future and what they believe their life will be like (with or without the proposed treatment).

• LS4: The capacity to provide rational reasons for a treatment choice, or to use logical processes to compare benefits and risks of various treatments. This is assessed with a request for patients to provide all of the reasons why they chose or rejected the proposed treatment.

• LS5: The capacity to understand the treatment situation and choices. This is assessed with 9 questions that require patients' recall and comprehension of the various pieces of information provided in the vignette regarding symptoms, treatment, risks, benefits, and likelihood of various outcomes.

Responses are recorded by the examiner and scored according to detailed criteria for each item, providing for scores of 2, 1, and 0 points per answer. Some items allow the patient to get credit for several answers (e.g., when asked about symptoms, a score of 2 for each of 4 symptoms in the vignette).

The highest possible score for each legal standard is 2 for LS1, 1 for LS2, 4 for LS3, 26 or 6 for LS4 (depending on whether the patient chose the treatment, which leads to more opportunities to obtain points, or rejected the treatment), and 64 for LS5. Cut-off scores are not provided for adequate or inadequate (competent or incompetent) performance, although the CCTI author has used two standard deviations below the group mean as a convenient way to define a relatively "low" score on an LS. There is no "total CCTI score;" each LS score stands on its own.

The Qualitative Scoring System, also called the "Error Code Scoring System," provides for the identification of 16 types of errors conceptually organized into 4 domains: language dysfunction, executive dysfunction, affective dysfunction, and compensatory responses. This part of the CCTI does not provide actual scores; it is intended instead to allow the examiner to identify the simple presence or absence of various types of errors.

Conceptual Basis

concept definition. The five "legal standards" that guided the CCTI's development were derived from previous legal analyses of competence to consent to treatment (g). Conceptualization of the Rational Reasons and Understanding standards was derived from Appelbaum and Grisso (see reviews later in this chapter for Understanding Treatment Disclosures and Thinking Rationally about Treatment).

The 16 error codes were developed in part on the basis of the authors' experience with verbal responses observed in persons with Alzheimer's Disease, and in part on the Exner special scoring system for the Rorschach.

operational definition. In determining how to operationally represent these constructs, the authors endeavored to develop a format that "approximates 'real life' medical treatment decision making by requiring a subject to elect and explain a treatment decision in a verbal dialogue format" (a). The choice of the two medical conditions (neoplasm and cardiac) is not explained, but the content of the vignettes was reviewed by independent physicians for accurate representation of the conditions and treatments.

critique. One of the CCTI's strengths is its use of constructs that are based on legal analysis of competence. The method was devised for use in the study of competence among patients with Alzheimer's Disease, but the format and interview items are such that there is no reason why they could not be used with other persons whose competence is questioned.

The vignettes are standardized (using hypothetical neoplasm and heart disorder cases), which has its advantages and disadvantages. This maximizes the opportunity to develop meaningful norms for use in clinical cases and to make group comparisons in research. But in clinical cases the method would leave open the possibility that the patient might do better (or worse) in comprehending their own disorder, in that the method does not assess patients' capacities in the context of a disorder or treatment that they are currently experiencing.

Two cautions should be noted regarding the legal constructs used in the CCTI. First, not all of the legal constructs that structure the CCTI will be relevant in all states for legal determinations of competence to consent. Virtually all states' laws recognize the importance of the patient's understanding of the disclosed information, but only some states specifically refer to appreciation and reasoning.

Second, the standard called LS2, the patient's capacity to make a "reasonable choice," was originally one entry in a well-known list of legal standards for competence to consent to treatment compiled by Roth, Meisel, and Lidz (g). However, within the past two decades, this concept has disappeared from virtually all legal and conceptual analyses of competence to consent. This standard, found in some states in the first half of the 20th century, allowed courts to find people incompetent if they chose a treatment that others (or the court itself) would consider odd or ill advised. In contrast, it is fundamental to the modern doctrine of informed consent that patients are allowed the autonomy to make any choice they wish, unpopular as it might be, as long as they are doing so with abilities (to understand, appreciate, and reason) that are sufficiently intact.

Based on this analysis, clinicians should be aware that using LS2 in the CCTI when reasoning about a patient's competence, or when offering information to a court in a competence proceeding, is inconsistent with current legal standards for competence in almost all states.

The conceptualization of the error codes is novel. One can imagine their use particularly in research that relates specific types of clinical dysfunctions to deficits in performance on the formal quantitative scales of the CCTI. In clinical cases, the error codes could offer the potential for providing causal explanations for a patient's deficits in understanding, appreciating consequences, and rational reasons in the CCTI vignette.

Psychometric Development

standardization. Administration procedures, interview questions, and scoring criteria for the quantitative (LS) scales are quite specific. Every patient receives the same disclosure about the same disorders, allowing for the development of norms for comparative purposes. The criteria for assigning error codes are a good deal more complex than for the quantitative scales, but they are succinctly defined and are accompanied by examples of responses that suggest each type of error code.

reliability. Three trained raters (number of protocols scored was not specified) achieved interrater reliability of r = .83 on the scales that use interval scoring (LS3, LS4, and LS5) and 96% agreement on the categorical scales (LS1 and LS2) (f). Three raters trained in the error codes achieved 81% agreement for 644 text observations within 23 protocols, with all three raters agreeing on code assignments in 65% of the observations.

norms. No formal set of norms has been provided for the CCTI. However, normative performance for groups of persons with Alzheimer's Disease and for normal comparison groups can be found in various publications of research with the CCTI. Some of these norms are stated as means and standard deviations for the various LS scales (f, with 29 persons with Alzheimer's Disease and 15 normal comparison subjects; also b, for 72 persons with Alzheimer's Disease and 21 normal comparison subjects), while others are provided as percent of subjects who were defined as "Incompetent" based on their performance below the -2 standard deviation of normal comparison samples (for example, c, with 29 persons with Alzheimer's disease and 15 normal comparison subjects).
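As a rough illustration of the -2 standard deviation criterion described above, the following Python sketch derives a cutoff from a normal comparison sample and flags scores that fall below it. The scores, the function names, and the choice of the sample (n - 1) standard deviation are assumptions made for illustration only; none of them are taken from the CCTI studies.

```python
from statistics import mean, stdev

def low_score_cutoff(comparison_scores, n_sd=2.0):
    """Return the score lying n_sd standard deviations below the comparison-group mean."""
    return mean(comparison_scores) - n_sd * stdev(comparison_scores)

def below_cutoff(score, comparison_scores, n_sd=2.0):
    """True when a score falls below the cutoff derived from the comparison group."""
    return score < low_score_cutoff(comparison_scores, n_sd)

# Hypothetical LS5 (understanding) scores for a small normal comparison group.
comparison_ls5 = [58, 60, 55, 62, 57, 59, 61, 56]
cutoff = low_score_cutoff(comparison_ls5)

# Hypothetical patient scores checked against that cutoff.
flags = {score: below_cutoff(score, comparison_ls5) for score in (40, 57)}
```

A flag of this kind only locates a score relative to a comparison sample; it is a descriptive convention for group research rather than a competence determination.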

critique. Interrater reliability for the LS scales appears to be acceptable, and the rate of agreement for the error coding is relatively good in light of the complexity of the judgments that this aspect of the CCTI scoring requires. Users should note that considerably less reliability may be expected for the error coding method, especially by clinicians who may not be highly trained in the method.

The CCTI would benefit from the publication of a set of norms that clinicians can use for comparing the level of performance of their own patient to the performance of various patient groups or normal comparison subjects. Current norms are based on relatively small samples, and they are expressed in means rather than the percent of subjects making various scores on the LS scales. Caution is required in using the "Incompetence" criterion employed in the CCTI research studies. This method of expressing a group's performance is helpful for research purposes, but should not be extended to decisions about individual patients' competence or incompetence to consent to treatment.

Construct Validation

The authors performed several factor analyses of the 14 items that make up the LS3-5 scales (appreciating consequences, rational reasons, and understanding, respectively) (a). Two factors best accounted for the variance, with all items from LS3 and LS4 (appreciating consequences and rational reasons) and about half of the items from LS5 (understanding, primarily items about risks and benefits of treatment) loading highest on Factor 1, and the other half of the LS5 items (primarily about symptoms and details involving memory of numerical probabilities) loading highest on Factor 2. An additional factor analysis included a number of neuropsychological measures that had been given to this group of Alzheimer's Disease patients (n = 82) along with factor scores created with the previous factor analysis. The content of the two emerging factors, interpreted based on the nature of the neuropsychological measures that loaded on them, suggested that Factor 1 pertained to verbal conceptualization and reasoning, while Factor 2 was related primarily to verbal memory.

Three reports using the same two groups of subjects (29 patients with mild or moderate symptoms of Alzheimer's Disease and 15 normal older comparison subjects) found significantly lower mean scores for the Alzheimer's Disease patients on LS3 (appreciating consequences), LS4 (rational reasons), and LS5 (understanding), but not for LS1 and LS2 (c, d, f). The study's method for classifying patients as "incompetent" (below -2 SD for the normal comparison sample) classified the patients with moderate symptoms of Alzheimer's Disease as incompetent in 50% of the cases on LS3 (appreciating consequences), 71% of the cases on LS4 (rational reasons), and 100% of the cases on LS5 (understanding) (c, f).

The CCTI authors have examined the relation of the instrument's LS scores to a number of neuropsychological measures of cognitive functioning. Virtually all such measures were correlated significantly with performance on LS1 and LS3-5 (LS2, "reasonable choice," was not examined in these analyses). Using stepwise multiple regression analyses, LS1 scores (evidencing a choice) were best identified by an auditory comprehension test, LS3 scores (appreciating consequences) by executive function measures such as Controlled Oral Word Fluency and Trail Making A, and LS5 scores (understanding) by Dementia Rating Scale Conceptualization and the Boston Naming Test (c). LS4 scores (rational reasons) were best identified by Dementia Rating Scale Initiation/Perseveration scores (d).

Error code incidence rates were significantly greater for Alzheimer's Disease patients (n = 72) than for normal older comparison subjects (n = 21) on 9 of the 19 error types (b). Types of error codes correlated with the LS scores were different for the various LSs, with correlations in the range of .14 to .36.

critique. There is substantial support here for the CCTI's construct validity and its conceptualization of abilities related to competence to consent to treatment. The studies provide good evidence for expected differences between Alzheimer's Disease patients and normal older comparison subjects. It is reasonable to believe that the CCTI would identify similar differences between normal comparison subjects and persons with serious mental disorders other than Alzheimer's Disease (e.g., schizophrenia, depression). Research to examine the application of the CCTI to other patient populations could be quite helpful in expanding the range of the CCTI's use in clinical evaluations related to competence to consent to treatment.

The authors' explanations for the correlations that they found between neuropsychological test findings and various LS scores are too complex to summarize here, but they provided sound logic for explaining deficits in the various LS areas. The only caution to raise on this point is that the explanations for the relationships that were found were post hoc and that many of the neuropsychological measures were substantially related to many of the LS scores. Future research that examines a priori hypotheses about these relationships would be helpful. Research should also include a priori hypotheses about neuropsychological indices with which specific LS constructs would not be expected to relate.

Predictive or Classificatory Utility

No studies have examined the relation of CCTI scores to clinical or judicial judgments about competence or incompetence to consent to treatment, nor to patients' performance in actual consent circumstances.

critique. Although not directly related to classificatory utility of the CCTI, it is worth noting that the CCTI authors examined the ability of clinicians to use the LS format as a structure for organizing their judgments about patients' competence to consent to treatment (e). Five competency-experienced clinicians observed competence interviews of Alzheimer's Disease patients and normal older controls. The interviews used the CCTI vignettes and interview questions. Clinicians were asked to make competence/incompetence judgments for each of the LSs (without benefit of the CCTI scoring system) and an overall competence judgment. The relation of the overall competence judgments to actual LS scores for these patients on the CCTI was not reported. The report focuses on agreement between clinicians, which was highest for LS1 (evidencing a choice) and lowest for LS3 (appreciating consequences).

Potential for Expressing Person-Situation Congruency

The treatment situations used in the CCTI vignettes were purposely standardized, not tailored to the conditions of the individual patient. Thus they provide no opportunity to examine patients' performance in relation to varying demands in terms of complexity of the treatment options.

References

(a) Dymek, M., Marson, D., & Harrell, L. (1999). Factor structure of capacity to consent to medical treatment in patients with Alzheimer's Disease: An exploratory study. Journal of Forensic Neuropsychology, 1, 27-48.

(b) Marson, D., Annis, S., McInturff, H., Bartolucci, A., & Harrell, L. (1999). Error behaviors associated with loss of competency in Alzheimer's Disease. Neurology, 53, 1983-1992.

(c) Marson, D., Chatterjee, A., Ingram, K., & Harrell, L. (1996). Toward a neurologic model of competency: Cognitive predictors of capacity to consent in Alzheimer's Disease using three different legal standards. Neurology, 46, 666-672.

(d) Marson, D., Cody, H., Ingram, K., & Harrell, L. (1995). Neuropsychologic predictors of competency in Alzheimer's Disease using a rational reasons legal standard. Archives of Neurology, 52, 955-959.

(e) Marson, D., Earnst, K., Jamil, F., Bartolucci, A., & Harrell, L. (in press). Consistency of physicians' legal standard and personal judgments of competency in patients with Alzheimer's Disease. Journal of the American Geriatrics Society.

(f) Marson, D., Ingram, K., Cody, H., & Harrell, L. (1995). Assessing the competency of patients with Alzheimer's Disease under different legal standards. Archives of Neurology, 52, 949-954.

(g) Roth, L., Meisel, A., & Lidz, C. (1977). Tests of competency to consent to treatment. American Journal of Psychiatry, 134, 279-284.

Hopemont Capacity Assessment Interview (HCAI)

Author

Barry Edelstein

Author Affiliation

West Virginia University

Primary Reference

Edelstein, B. (1999). Hopemont Capacity Assessment Interview manual and scoring guide. West Virginia University: Author. (Available from Barry Edelstein, Department of Psychology, P.O. Box 6040, West Virginia University, Morgantown, WV 26506.)

Description

The Hopemont Capacity Assessment Interview (HCAI) (c) is a semi-structured interview in two sections. The first section, discussed here, is for assessing capacity to make medical decisions. The second section, discussed in Chapter 8, assesses capacity to make financial decisions.

The interview begins with a brief explanation that when doctors want to propose a course of treatment, they explain the benefits and risks to the individual who is given a choice of that treatment or other courses of action. The first 3 items ask the examinee to offer definitions of "benefit," "risk," and "choice."

Then a brief (six-sentence) "scenario" of a disorder (eye infection) and treatment choices is provided (described as the examinee's "friend" whom the examinee will be asked to advise). This is followed by nine items (questions) designed to identify whether the examinee can describe the medical problem, the proposed treatment, the reason for it, the possible benefits and risks, and the choices, then can offer a choice and explain the reasons for choosing it. This process is repeated with another scenario (advance directive to allow or refuse CPR) and a similar set of nine questions. The manual provides an answer sheet to record examinees' responses.

Answers to the 21 items are scored according to 2, 1, and 0 or 1 and 0 scoring criteria provided in the manual. No instructions for combining the scores are provided, but one reported study has added the scores on all items to produce a total HCAI score (with a possible range of 0-33) (d).
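The item counts and the reported total range can be checked with simple arithmetic. The Python sketch below counts the items described above and sums item ratings into a total in the manner of the cited study. The split into 12 items rated 0-2 and 9 items rated 0-1 is only an inference from the 21 items and the 0-33 range, and it assumes that every item uses one of those two formats; it is not taken from the manual, and which items fall in which format is left unspecified. All names are hypothetical.

```python
# Item counts described in the text: 3 definition items plus 9 questions per scenario.
N_DEFINITION_ITEMS = 3
N_SCENARIOS = 2
N_QUESTIONS_PER_SCENARIO = 9
N_ITEMS = N_DEFINITION_ITEMS + N_SCENARIOS * N_QUESTIONS_PER_SCENARIO

def infer_item_split(n_items, max_total):
    """Solve x + y = n_items and 2x + y = max_total for x two-point and y one-point items."""
    two_point = max_total - n_items
    one_point = n_items - two_point
    if two_point < 0 or one_point < 0:
        raise ValueError("no split of 0-2 and 0-1 items matches these totals")
    return two_point, one_point

def total_score(ratings, max_per_item):
    """Sum item ratings after checking each rating against its item's maximum."""
    if len(ratings) != len(max_per_item):
        raise ValueError("one rating is required per item")
    for rating, maximum in zip(ratings, max_per_item):
        if not 0 <= rating <= maximum:
            raise ValueError(f"rating {rating} is outside 0-{maximum}")
    return sum(ratings)

two_point, one_point = infer_item_split(N_ITEMS, 33)
# Hypothetical ordering: the inferred two-point items are listed first.
item_maxima = [2] * two_point + [1] * one_point
highest_total = total_score(item_maxima, item_maxima)
```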

Conceptual Basis

concept definition. The HCAI was developed in reference to four legal standards described by Appelbaum and Grisso (a) relevant for assessing civil competencies (b). Questions in response to disclosed information are intended to assess the capacities to understand relevant information, demonstrate appreciation of the significance of the information for the circumstance, rationally consider the risks and benefits of different choices, and express a choice.

operational definition. The interview was structured and made simple to be suitable for cognitively impaired elderly adults, including those residing in nursing homes. Two scenarios were chosen with different levels of risk to increase generalizability. Items were developed based on a standard nursing reference at a sixth-grade reading level.

critique. The structure of the HCAI is based on modern notions of the abilities associated with competence to consent to treatment. No effort was made, however, to create scales associated with those abilities (e.g., Understanding, Appreciation or Reasoning scales). The author's intentions were to provide a structured interview with questions that could be evaluated with ratings, but that was not intended to provide norms to which individual examinees could be compared (b). This offers the benefit of flexibility and avoidance of a rigid focus on additive scores. The disadvantage of this approach is that reporting of performance must proceed item by item, given that no scale or summary scores are offered. In addition, it requires that researchers determine for themselves how they will combine the item ratings, and there is no assurance that all researchers will do it the same way. Thus across time, results of studies using the HCAI may become hard to compare.

Psychometric Development

standardization. The HCAI uses a structured interview format in that specific questions are asked, but it is semi-structured in that it allows flexibility for probing. The criteria provided for assigning 2, 1, and 0 ratings to each item are explicit and clear.

reliability. Inter-rater reliability has been reported as .93 (90% agreement) (e). However, repeat testing after about 2 weeks indicated a .29 correlation between first and second administrations (e). These data came from a study of 50 psychogeriatric nursing home residents.

norms. Normative data are not available.

critique. Standardization of the procedure and scoring appears to be good, as does the initial indication of inter-rater reliability. The reported test-retest correlation is troubling, but may be related to the specific sample that was used. It is hoped that use in research with elderly persons, as well as other populations, will result in a better indication of the instrument's psychometric properties, as well as normative samples to which individual cases can be compared. At present, one study has examined the utility of the HCAI in assessing treatment decision making capacities among elderly persons in a long term care facility (d). But the study did not describe the samples' scores sufficiently to offer comparative potential for clinicians.

Construct Validation

Internal consistency as measured by coefficient alpha was .94 in a study of elderly persons in a long term care facility (d). In the same study, the summed score of the HCAI ratings for treatment decision making capacity correlated .66 with the Mini Mental Status Exam (MMSE) and .50 with the Element Disclosure version of Understanding Treatment Disclosures (see review later in this chapter).
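For readers unfamiliar with coefficient alpha, the Python sketch below shows the standard computation from a table of item ratings: alpha = k / (k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The ratings are hypothetical and the function name is an assumption; nothing here reproduces data from the cited study.

```python
from statistics import variance

def coefficient_alpha(item_scores):
    """Coefficient (Cronbach's) alpha.

    item_scores is a list of respondents, each a list of k item ratings.
    """
    n_respondents = len(item_scores)
    k = len(item_scores[0])
    if n_respondents < 2 or k < 2:
        raise ValueError("need at least two respondents and two items")
    # Variance of each item across respondents (sample variance, n - 1).
    item_variance_sum = sum(variance(column) for column in zip(*item_scores))
    # Variance of the summed score across respondents.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Hypothetical ratings for four respondents on three items.
ratings = [
    [2, 2, 1],
    [1, 1, 0],
    [0, 0, 0],
    [2, 1, 1],
]
alpha = coefficient_alpha(ratings)
```

Alpha rises when items vary together across respondents, so a high value indicates that the items behave consistently as a set; it does not by itself show what the set measures.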

critique. Results of construct validity are encouraging but await further support before one can judge the value of the instrument.

Predictive or Classificatory Utility

In the study noted above, HCAI summed ratings for treatment decision making capacity correlated .61 with competence or incompetence judgments provided by two "clinical interns" who "together [had] integrated information from [a] battery" including a number of psychometric tests of general cognitive, intellectual, and psychological characteristics (d, p. 626).

critique. The results tell us little about the utility of the HCAI, because there is no reason to believe that two "clinical interns," even supplied with very large amounts of data regarding individuals' cognitive capacities, represent a reliable criterion for competence. The authors of the study (d) suggested that this "gold standard" was appropriate because "clinicians traditionally have been called upon" to make these decisions. By this logic, it is up to the researcher to demonstrate that the two interns made decisions with which most clinicians would have agreed. The mere fact that they were "clinicians" does not necessarily mean that they represent standard clinical practice.

Potential for Expressing Person-Situation Congruency

The HCAI is highly standardized and does not include items to assess the fit of financial capacities as assessed on the instrument with environmental or situational circumstances (e.g., the extent of the individual's estate needing management).

References

(a) Appelbaum, P., & Grisso, T. (1988). Assessing patients' capacities to consent to treatment. New England Journal of Medicine, 319, 1635-1638.

(b) Edelstein, B. (2000). Challenges in the assessment of decision making capacity. Journal of Aging Studies, 14, 423-437.

(c) Edelstein, B. (1999). Hopemont Capacity Assessment Interview manual and scoring guide. West Virginia University: Author. (Available from Barry Edelstein, Department of Psychology, P.O. Box 6040, West Virginia University, Morgantown, WV 26506.)

(d) Pruchno, R., Smyer, M., Rose, M., Hartman-Stein, P., & Henderson-Laribee, D. (1995). Competence of long-term care residents to participate in decisions about their medical care: A brief, objective assessment. The Gerontologist, 35, 622-629.

(e) Staats, N., & Edelstein, B. (1995). Cognitive predictors of medical decision-making capacity. Paper presented at the meeting of the Gerontological Society of America, Los Angeles.

Hopkins Competency Assessment Test (HCAT)

Author

Janofsky, J.

Primary Author Affiliation

Johns Hopkins School of Medicine, Baltimore MD

Primary Reference

Janofsky, J., McCarthy, R., & Folstein, M. (1992). The Hopkins Competency Assessment Test: A brief method for evaluating patients' capacity to give informed consent. Hospital and Community Psychiatry, 43, 132-136.

Description

The Hopkins Competency Assessment Test (HCAT) (c) was developed "to screen patients for competency to make treatment decisions and to write advance directives" (c, p. 132). It was intended as a screening tool that could be used by nonclinicians to determine whether the issue of patients' capacities should be raised for further evaluation. The authors further describe it as an "instrument for quantitative assessment of clinical competency" and state that it "does not determine legal competency but rather is an aid to the clinician in forming an opinion about clinical competency" (c, p. 132).

The HCAT consists of a four-paragraph "essay" that was written at three reading comprehension levels (6th grade, 8th grade, and 13th grade). The content of the essay informs patients about:

1. the nature of informed consent (e.g., that the patient must be informed and understand in order to provide consent to treatment),

2. that "chronic disease" can decrease a patient's ability to make decisions,

3. that patients can state their wishes in advance of disease in instructions that are called durable power of attorney, and

4. the effect this will have on future treatment.

Thus the focus of the content is partly on understanding what is required in order to consent to treatment and partly on understanding the nature and value of advance directives.

Presentation of the essay is followed by six questions, four of them open-ended, one true-false, and one a sentence completion. Two questions focus on assessing #1 above, two on #2 above, one on #3 and one on #4. Clinicians are provided an example of an adequate answer for each of these questions and are asked to assign a score of 1 for each correct answer. Scoring range is 0-10, because four points (four correct answers) are possible for one of the questions. The authors suggest a score of 3 or below as indicative of incompetence (based on research results explained later).

In research procedures (c), patients were given the 13th grade disclosure version first and, after scoring, were given the 8th grade version if they achieved a score of 7 or lower, and then received the 6th grade version if a similarly low score was made on the 8th grade version.
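The stepped procedure just described can be sketched in Python as follows. The step-down threshold of 7 is stated in the text for the 13th grade version; applying the same threshold after the 8th grade version is an assumption, since the text says only "similarly low." The function and variable names are hypothetical, and the scoring function stands in for an examiner administering and scoring one version.

```python
READING_LEVELS = ["13th grade", "8th grade", "6th grade"]

# Stated for the 13th grade version; assumed here for the 8th grade version as well.
STEP_DOWN_AT_OR_BELOW = 7

def stepped_administration(score_version):
    """Administer versions from hardest to easiest, stopping once a score exceeds the threshold.

    score_version takes a reading-level label and returns that version's 0-10 score.
    Returns the list of (level, score) pairs for the versions actually administered.
    """
    administered = []
    for level in READING_LEVELS:
        score = score_version(level)
        administered.append((level, score))
        if score > STEP_DOWN_AT_OR_BELOW:
            break  # an easier version is given only after a low score
    return administered

# Hypothetical patient who scores low on the 13th grade version only.
example_scores = {"13th grade": 5, "8th grade": 9, "6th grade": 10}
example_result = stepped_administration(example_scores.get)
```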

Conceptual Basis

concept definition. Primary publications for the HCAT do not define the concepts that the instrument is intended to measure. In their preface, however, the authors indicate the intention to assess "competency to consent to treatment" (c, p. 132), and they define this as the ability to "understand the discussion of the proposed treatment and its risk, benefits, and alternatives" and to "understand... the right to informed consent" (c, p. 132).

operational definition. The authors did not describe their process for interpreting and operationalizing the above concepts. The four paragraphs of the essay and the six questions appear to focus on the latter of the two concepts above—that is, "understand the right to informed consent"—in that the content focuses on the patient's understanding of what informed consent is, what can impair it, and how one can provide informed consent in advance of its need.

critique. The authors say that the instrument is designed to "aid clinicians in forming an opinion about clinical competency." The HCAT's content and conceptual definition, however, provide no reason to believe that the HCAT measures the range of abilities associated conceptually with competence to consent to treatment. As others have observed (e, f, g), clinicians who administer the HCAT will have acquired no information about the essentials for competence to consent to treatment—for example, how well the patient understands a disorder, a potential treatment, its risks and benefits, and alternative treatments; the patient's capacity to reason about the information; and the patient's appreciation of the significance of the information for his or her own situation.

The importance of this cannot be overstated. For example, later researchers (a) selected the HCAT, "based on these previous findings [referring to reference c] as a measure of subjects decision-making capacities" (a, p. 957). Yet the conceptualization, content, and format of the HCAT require no decisions and no decision-like processes on the part of patients to whom it is administered. The instrument only scores patients' understanding of what they are told, and what they are told does not include some of the information that patients are expected to understand (e.g., the nature of their disorder) for purposes of informed consent to treatment.

In sum, it is best to consider the HCAT as a possible means to assess patients' abilities to understand what informed consent is and what advance directives are, but not an assessment of their "capacity to give informed consent" (as claimed in Janofsky, c, p. 132).

Psychometric Development

standardization. Administration procedures and questions are highly standardized and clearly described (a). Scoring instructions are relatively specific and simple to use.

reliability. Interscorer reliability has been examined twice. Two of the HCAT authors administered and scored 16 HCAT protocols (sample characteristics not described) (c). They reported a Pearson correlation coefficient of .95 for total HCAT scores. In another study (a), three researchers administered and scored 15 subjects (sample characteristics not described). Spearman's rank-order correlation coefficients for all possible pairs of scorers were .96, .97, and .99.
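
For readers unfamiliar with the two coefficients, the following sketch shows how a Pearson correlation and a Spearman rank-order correlation could be computed for every pair of scorers. The scores are invented for illustration and are not data from either study; the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's coefficient is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented total scores (0-10) given by three scorers to the same six protocols.
scores: Dict[str, List[int]] = {
    "scorer_1": [2, 5, 7, 9, 10, 4],
    "scorer_2": [3, 5, 6, 9, 10, 4],
    "scorer_3": [2, 6, 7, 8, 10, 3],
}
for first, second in combinations(scores, 2):
    print(first, second,
          round(pearson(scores[first], scores[second]), 2),
          round(spearman(scores[first], scores[second]), 2))
```

High coefficients of either kind show that scorers ordered the protocols similarly. They do not show that scorers assigned the same absolute scores, which matters when a fixed cut-off is applied.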

norms. HCAT results for various groups of patients are provided in some of the articles reporting research with the HCAT (b, c, d). In some cases these are means, while in others they are percent of subjects below the HCAT's competence cut-off score (that is, 3 or below = incompetent).

critique. The tests of interscorer reliability are difficult to interpret. The types of individuals (e.g., patients, non-patients, etc.) who produced the protocols are poorly identified in those reports. Similarly, the manner in which the patient samples are described in the reports makes it difficult to use them to make meaningful comparisons of patients' HCAT scores to past research results.

Construct Validation

In the original HCAT study (c) involving 41 medical and psychiatric patients, the mean HCAT score for psychiatric patients was lower than for medical patients, but not significantly. Jones et al. (d) found that HCAT scores in a psychiatric sample (n = 43) were related to age (-.46) and education (.50). Another study (b) reported that HCAT scores were significantly lower for 16 patients (medical and neuropsychiatric) whom hospital staff judged as "incompetent" compared to 15 patients judged "competent" for purposes of providing informed consent.

The HCAT and the Mini-Mental Status Exam (MMSE) were employed together in the original study (c). The report offered no direct comparison between them, but it demonstrated that the HCAT compared more favorably with clinicians' competence judgments than did the MMSE (see review of this result in the following subsection). Another study (b), however, found that the HCAT correlated .75 with MMSE results in a psychiatric sample (and .68 with a daily living skills test).

critique. It is not possible to draw any inferences about construct validity of the HCAT from these results. Part of the problem is that the construct on which the HCAT is based is ambiguous, as noted earlier. The authors' intention to measure "clinical competence," "competence to consent," and "decision making about treatment" is never conceptually defined, so that it is not clear what construct to employ in making inferences about the above results.

In general, one might think that the fact that a psychiatric sample scored no differently than a medical sample (c) would challenge the construct validity of the HCAT, but this cannot be inferred since the nature of the psychiatric sample was never described. One would expect a measure of capacity to consent to be related to MMSE scores; one study reports that HCAT and MMSE scores are highly correlated, while Janofsky et al. (c) remark primarily on their differences.

Jones et al. (d) found that only 7 of 43 patients scored below the cut-off on the HCAT. Their observations of those seven patients are worth noting: "[They were] cooperative and appeared generally competent to discuss health problems, and if consent had been needed... there would have been no obvious reason to suspect their ability to consent" (d, pp. 53-54).

Predictive or Classificatory Utility

Several studies have examined the ability of the cut-off score on the HCAT (incompetent = 3 or lower) to identify patients found incompetent by clinicians performing their own independent evaluations of competence. Janofsky et al. (c) found that this cut-off score identified all of the patients found incompetent, and misclassified none of the patients found competent, by a forensic psychiatrist who performed "clinical competency evaluations" and made competence/incompetence decisions on each patient. Another study (a), using clinical judgments of nursing home professionals regarding their patients' competence, found that staff considered 65% of the HCAT-incompetent patients to be incompetent, and 90% of the HCAT-competent patients to be competent.

Holzer et al. (b) compared HCAT results to consulting or admitting clinicians' formal evaluations of patients' capacities to give informed consent. The HCAT identified as competent 73% of the patients whom clinicians considered competent, and identified as incompetent 81% of the patients whom clinicians considered incompetent. In this study (b), four other measures of mental capacity and neuropsychological functioning were employed as well. All of these measures did better than the HCAT at identifying the patients judged competent by the clinicians, while the HCAT performed as well as two of the other measures in identifying patients judged incompetent by clinicians.
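
The studies above report agreement in two different directions. Barton et al. (a) start from the HCAT classification and ask how often staff agreed, which corresponds to predictive values. Holzer et al. (b) start from the clinicians' judgment and ask how often the HCAT agreed, which corresponds to sensitivity and specificity. The sketch below computes all four indices from one 2 x 2 table. The cell counts are hypothetical: they were chosen only so that the first two indices round to the 81% and 73% reported in (b), and they are not taken from that report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of an HCAT cut-off against clinician judgment."""
    both_incompetent: int            # HCAT at or below cut-off, clinician: incompetent
    hcat_only_incompetent: int       # HCAT at or below cut-off, clinician: competent
    clinician_only_incompetent: int  # HCAT above cut-off, clinician: incompetent
    both_competent: int              # HCAT above cut-off, clinician: competent

    def sensitivity(self) -> float:
        """Share of clinician-judged incompetent patients flagged by the HCAT."""
        return self.both_incompetent / (
            self.both_incompetent + self.clinician_only_incompetent)

    def specificity(self) -> float:
        """Share of clinician-judged competent patients passed by the HCAT."""
        return self.both_competent / (
            self.both_competent + self.hcat_only_incompetent)

    def positive_predictive_value(self) -> float:
        """Share of HCAT-incompetent patients whom clinicians judged incompetent."""
        return self.both_incompetent / (
            self.both_incompetent + self.hcat_only_incompetent)

    def negative_predictive_value(self) -> float:
        """Share of HCAT-competent patients whom clinicians judged competent."""
        return self.both_competent / (
            self.both_competent + self.clinician_only_incompetent)


# Hypothetical counts (16 judged incompetent, 15 judged competent).
table = TwoByTwo(both_incompetent=13, hcat_only_incompetent=4,
                 clinician_only_incompetent=3, both_competent=11)
print(f"sensitivity {table.sensitivity():.2f}, "
      f"specificity {table.specificity():.2f}")
print(f"PPV {table.positive_predictive_value():.2f}, "
      f"NPV {table.negative_predictive_value():.2f}")
```

Because the two pairs of indices answer different questions and depend differently on the proportion of patients judged incompetent, percentages from (a) and (b) are not directly comparable.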

critique. These results do not provide information of value regarding the HCAT's clinical utility. In general the proportions of "correct" classifications are neither very good nor very bad, and an instrument called a screening test should not be held to a very high classificatory standard. But in this case one cannot judge the value of the results because the standard itself is unknown. In the original study (c), for example, there was no description of what the forensic clinician was evaluating. The term used to describe the focus of their evaluations was "clinical competency," but it is not clear which clinical competency they were intending (or asked) to evaluate (e.g., competence to consent to treatment, competence to understand the meaning of informed consent, competence to execute advance directives). The other two studies also provided no way of knowing what was meant by "competency" as judged by consulting psychiatrists or nursing home staff.

Potential for Expressing Person-Situation Congruency

The HCAT "essays" are highly standardized and are not intended to assess patients' own circumstances. Thus they provide no opportunity to examine patients' performance in relation to varying demands in terms of complexity of various treatment options.

References

(a) Barton, D., Mallik, H., Orr, W., & Janofsky, J. (1996). Clinicians' judgment of capacity of nursing home patients to give informed consent. Psychiatric Services, 47, 956-960.

(b) Holzer, J., Gansler, D., Moczynski, N., & Folstein, J. (1997). Cognitive functions in the informed consent evaluation process: A pilot study. Journal of the American Academy of Psychiatry and the Law, 25, 531-540.

(c) Janofsky, J., McCarthy, R., & Folstein, M. (1992). The Hopkins Competency Assessment Test: A brief method for evaluating patients' capacity to give informed consent. Hospital and Community Psychiatry, 43, 132-136.

(d) Jones, B., Jayaram, G., Samuels, J., & Robinson, H. (1998). Relating competency status to functional status at discharge in patients with chronic mental illness. Journal of the American Academy of Psychiatry and the Law, 26, 49-55.

(e) Kaye, N. (1992). Assessing competency: Comment. Hospital and Community Psychiatry, 43, 648.

(f) Lavin, M. (1992). Assessing competency: Comment. Hospital and Community Psychiatry, 43, 646-647.

(g) Sales, G. (1992). Assessing competency: Comment. Hospital and Community Psychiatry, 43, 646.

MacArthur Competence Assessment Tool for Treatment (MacCAT-T)

Authors

Grisso, T., & Appelbaum, P. S.

Primary Author Affiliation

Department of Psychiatry, University of Massachusetts Medical School, Worcester MA

Primary Reference

Grisso, T., & Appelbaum, P. S. (1998). MacArthur Competence Assessment Tool for Treatment (MacCAT-T). Sarasota, FL: Professional Resource Press

Description

The MacArthur Competence Assessment Tool for Treatment (MacCAT-T) was designed to offer practical guidance to health professionals in their assessments of patients' decisionmaking capacities in the context of informed consent to treatment (c). As an interview guide, it was intended to assist clinicians in obtaining from patients information that is especially relevant for judgments about patients' competence to consent to treatment. It also provides a procedure for rating the quality of patients' responses to the interview questions. The MacCAT-T was published as an appendix to a book by the authors (b) and in commercial test form with a manual and materials (c).

The MacCAT-T is related to three research instruments reviewed later in this chapter: the UTD, POD, and TRAT. All four instruments were developed by a research initiative of the MacArthur Research Network for Mental Health and Law between 1989 and 1998. The three research instruments were developed first, followed by the MacCAT-T. Its purpose was to assess the same decisionmaking abilities measured by these more lengthy research instruments, but in a format that was easier to use in the course of clinical work. Whereas the research instruments used standardized disclosures and hypothetical vignettes that do not refer to patients' own specific conditions, the content of the MacCAT-T focuses specifically on the patient's own disorder, symptoms, and treatment options. Whereas the research instruments employed highly detailed and specific scoring criteria, the MacCAT-T offers more general criteria to assist clinicians in rating patients' responses. This method runs the risk of reduced precision in return for greater feasibility for clinical use in a wide range of settings and diagnostic conditions.

The MacCAT-T interview combines the process of preparing patients to make informed treatment decisions with the assessment of their capacities to decide. The parts of the interview and the sequence in which they occur are as follows:

• Understanding—Disorder. A structured format is provided for disclosing to the patient the nature of his or her own disorder. To prepare for this, the clinician lists several symptoms observed in clinical evaluation of the patient as a guide for the disclosure. This is followed by structured inquiry (open-ended) and probing questions to assess the degree to which the patient recalls and understands the various elements of the disorder that were disclosed.

• Appreciation—Disorder. The patient is then asked if he or she has any reason to doubt that he or she has the disorder that was disclosed, including an exploration of the patient's belief.

• Understanding—Treatment. A structured format is provided for disclosing to the patient the nature of the clinician's proposed treatment (prepared and presented similarly to the procedure in "Understanding-Disorder"), followed by structured inquiry to assess the degree to which the patient has understood the various elements of the treatment that were disclosed.

• Understanding Benefits and Risks: A structured format is provided for disclosing to the patient two or more main benefits and two or more main risks or discomforts of the proposed treatment (prepared and presented in a manner similar to that described in "Understanding-Disorder"), followed by structured inquiry to assess the degree to which the patient has understood the various benefits and risks that were disclosed.

• Appreciation—Treatment: The patient is asked whether it seems possible that this treatment might be of some benefit to himself or herself, including an exploration of the reasons for the patient's belief.

• Alternative Treatments: Any other possible treatments are disclosed. (Assessment of understanding of the alternative treatments is optional.)

• First Choice and Reasoning: The patient is asked which of the treatments seems best (or that the patient is most likely to want), followed by a structured process for exploring the patient's reasoning for that choice.

• Generating Consequences: The patient is asked to describe some ways that the benefits and risks discussed earlier "might influence your everyday activities at home or at work."

• Final Choice: The patient is asked to make a final choice.

Administration requires about 20-25 minutes. A record form is provided for the clinician to use for organizing the disorder and treatment information to be presented to the patient and for recording the patient's responses to the assessment inquiries. The manual describes a method that clinicians can use to rate each of the patient's responses (2, 1, 0), using definitions and examples in the manual as a guide, and to use the individual response ratings to calculate ratings for each component of the process. The final page of the MacCAT-T record form provides spaces for adding up the ratings to produce four Summary Ratings: Understanding, Appreciation, Reasoning, and Expressing a Choice.

The Understanding Summary Rating (range = 0-6) is based on ratings for the patient's responses to the three Understanding sections of the interview (for Disorder, Treatment, and Benefits-Risks). The Appreciation Summary Rating (range = 0-4) is based on ratings for the patient's responses on the two Appreciation sections (Disorder, Treatment). The Reasoning Summary Rating (range = 0-8) is based on evidence of four types of functions in the patient's explanation for his or her choice: Consequential Thinking, Comparative Thinking, Generating Consequences, and Logical Consistency. These Reasoning functions were derived from a research instrument reviewed later in this chapter, Thinking Rationally About Treatment (TRAT). Finally, the patient's ability to express a choice is signified by an Expressing a Choice Summary Rating of 2 if they are able and 0 if they are not.

The MacCAT-T manual does not provide "cut-off" scores on these Summary Rating scales for adequate or inadequate performance. The authors explain that in theory there is no absolute level of ability that indicates competence or incompetence, because this will vary with the demands of a patient's particular situation (a). Moreover, there is no "total MacCAT-T" score. The Summary Ratings simply allow clinicians to document their impressions of the degree of adequacy of patients' Understanding, Appreciation, and Reasoning, provide a means of expressing that opinion when offering explanations to others (that is, other clinicians or courts), and offer the possibility of comparison of the clinician's ratings to those of other clinicians involved in the same case.

Conceptual Basis

concept definition. The concepts of Understanding, Appreciation, and Reasoning used in the MacCAT-T are identical to the concepts underlying the three research instruments reviewed later in this chapter (see descriptions in reviews of the UTD, POD, and TRAT, respectively). (Expressing a Choice is also measured with a single item in the TRAT during the process of assessing the patient's reasoning about a choice.) The authors' decision to use these concepts to represent abilities related to competence to consent to treatment was based on legal research that had identified the legal relevance of these categories of abilities for conceptualizing competence (see later reviews of the three research instruments).

operational definition. The decision to use an interview format involving the patient's own symptoms and disorder, rather than a psychological testing format with standardized stimuli, was based on the desire to develop an instrument that could be used primarily in clinical work. Most medical circumstances in which competence to consent to treatment is evaluated do not lend themselves to protracted testing sessions. Moreover, it is often patients' ability to understand and appreciate specifically their own disorder and treatment options, not those of hypothetical cases, that is at issue in making legal, ethical and medical decisions about patients' consent.

The structure of the interview itself was drawn from legal and ethical guidelines regarding the essentials for disclosure of information in an informed consent process, including disclosure of the disorder, disclosure of the treatment, description of the benefits and risks of the treatment, and any alternative treatments and their risks and benefits. To this was added a process of decisionmaking such as would normally occur in a doctor-patient relationship (but with exploration of the patient's choice that reveals the patient's reasons for making the decision).

The construction of Understanding ratings followed closely the format used in the parallel research instrument to measure Understanding Treatment Disclosures (UTD), but differs in that the MacCAT-T provides one general definition for rating each Understanding response as 2, 1, or 0, not separate criteria for each Understanding item as in the UTD. The Appreciation ratings are very different from the format and criteria used in the Appreciation instrument called Perceptions of Disorder (POD), involving no scoring of patients' responses to hypotheticals that negate their faulty beliefs. Instead it employs much simpler criteria that focus on the presence or absence of delusional content in cases in which the patient fails to believe that he or she is ill. The Reasoning section employs several of the same operations and scoring criteria found in the parallel research measure, the TRAT (e.g., "Consequential Thinking"), but uses an abbreviated number of them.

critique. A strength of the MacCAT-T is its use of constructs that are based on legal analysis of competence and that have proved useful in studies with the parallel research instruments that influenced the development of the MacCAT-T. Its format is especially suitable for clinical practice, in that it guides clinicians through the process of informing patients about their disorders and treatments while simultaneously assessing their capacities to make treatment decisions. Moreover, its format allows it to be tailored to the individual disorders and treatment situations of patients.

The absence of a total MacCAT-T score may seem unfortunate to some users, but it makes sense conceptually because "competence" is not unidimensional. For example, if there were a MacCAT-T total score, some people might obtain a high score yet still be considered incompetent. This could happen if they performed well on two subtests (for example, perfect Understanding and Reasoning) but poorly on one other subtest (e.g., poor Appreciation due to total denial that the disorder applies to oneself). In addition, a total MacCAT-T score would not be meaningful from a legal perspective. Most states do not recognize all four of the constructs for purposes of legal definition of competence. Understanding is used in most states, but Appreciation and Reasoning are used more variably. Thus a total MacCAT-T score would misrepresent the applicable standards in some states.
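
The point about a total score can be made concrete with two invented rating profiles. The profiles, the jurisdiction names, and the sets of recognized constructs below are all hypothetical and serve only to illustrate the argument.

```python
from typing import Dict, Iterable

SCALE_MAX = {"Understanding": 6, "Appreciation": 4,
             "Reasoning": 8, "Expressing a Choice": 2}


def hypothetical_total(profile: Dict[str, float]) -> float:
    """A total of the kind the MacCAT-T deliberately does not provide."""
    return sum(profile.values())


def weakest_scale(profile: Dict[str, float]) -> str:
    """Scale with the lowest rating relative to its maximum."""
    return min(profile, key=lambda scale: profile[scale] / SCALE_MAX[scale])


def ratings_for_standard(profile: Dict[str, float],
                         recognized: Iterable[str]) -> Dict[str, float]:
    """Keep only the constructs a given legal standard recognizes."""
    return {scale: profile[scale] for scale in recognized}


# Two invented patients with the same hypothetical total.
patient_a = {"Understanding": 6, "Appreciation": 0,
             "Reasoning": 8, "Expressing a Choice": 2}
patient_b = {"Understanding": 5, "Appreciation": 4,
             "Reasoning": 5, "Expressing a Choice": 2}

# Invented standards: one recognizes Understanding only, one adds Appreciation.
standard_x = ("Understanding",)
standard_y = ("Understanding", "Appreciation")

for name, profile in (("A", patient_a), ("B", patient_b)):
    print(name, hypothetical_total(profile), weakest_scale(profile),
          ratings_for_standard(profile, standard_x),
          ratings_for_standard(profile, standard_y))
```

Both invented patients would receive the same total, yet patient A shows a complete failure of Appreciation that would matter under the second hypothetical standard and would be invisible under the first.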

Psychometric Development

standardization. Administration procedures, interview questions, probing, and rating criteria are clearly described in the manual. While the procedure is standardized in its sequence and the types of information provided and questions asked, it is not standardized with regard to content. For example, each patient will receive somewhat different information about symptoms of disorder (because the symptoms described are those of the individual patient). For this reason, the rating criteria are general rather than specific. For example, the rating criteria for Understanding cannot spell out exactly the response that is "adequate" for every possible symptom, but rely instead on the clinician's impression that the patient's symptom description was generally accurate or inaccurate (in the clinician's opinion).

reliability. Interrater reliability (a, c) was determined for 3 trained raters who independently rated 40 protocols (20 patients, 20 community comparison subjects). Intraclass correlations were .99 for Understanding, .87 for Appreciation, and .91 for Reasoning. Pearson correlations between pairs of raters were in the high .90s for Understanding but ranged from .59 to .83 for Appreciation and Reasoning. Test-retest reliability with the MacCAT-T has not been examined.
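
An intraclass correlation summarizes agreement among all raters at once, whereas Pearson coefficients describe one pair at a time. The review does not say which form of intraclass correlation was computed. The sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), and uses invented ratings; it should not be read as a reconstruction of the reported analysis.

```python
from typing import List, Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for a complete protocols-by-raters table.

    `ratings[i][j]` is the rating given to protocol i by rater j.
    """
    n = len(ratings)     # number of protocols
    k = len(ratings[0])  # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way analysis of variance without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)


# Invented Reasoning ratings (0-8) from three raters on five protocols.
example: List[List[int]] = [
    [8, 7, 8],
    [3, 4, 3],
    [6, 6, 5],
    [2, 2, 3],
    [7, 8, 7],
]
print(round(icc_2_1(example), 2))
```

Under this form a rater who is consistently more lenient than the others lowers the coefficient, which is not true of a Pearson correlation. That difference is one reason the two kinds of coefficient can diverge for the same set of ratings.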

norms. The manual provides comparative data based on two samples: 40 patients hospitalized with schizophrenia or schizoaffective disorder, and 40 individuals without mental illnesses and matched with the patient group on age, gender, race and socioeconomic status.

critique. Interrater reliability was surprisingly good, given that the nature of the instrument required less standardization and broader rating criteria than for the research instruments that preceded it. It should be noted, however, that the raters used in the test of interrater reliability were highly trained in the method. Clinicians who attempt to use the instrument only occasionally and only after casual reading of the manual might not do as well. Nevertheless, the results indicate that the MacCAT-T rating procedure has enough structure to offer potential not only for clinical work but also as an instrument in applied clinical research, provided that the clinician participants receive adequate training.

The norms provided in the manual may be of some use in judging whether the scores of a particular patient are generally lower in relation to a psychiatric patient sample and a non-patient sample. The samples on which these norms are based, however, were relatively small, so the norms must be used cautiously when interpreting individual cases.

Construct Validation

MacCAT-T scores were examined in a sample of 40 patients hospitalized with schizophrenia or schizoaffective disorder and compared to those of a matched sample (age, gender, race, SES) of persons in the community without mental disorder (a). Understanding and Reasoning means for patients were significantly lower than for the community comparison group. Scores were 4 or below on Understanding (range = 0-6) for 32.5% of patients and only 5% of community subjects, and were 3 or below on Reasoning (range = 0-8) for 20% of patients and 5% of community subjects.

The Appreciation components could not be administered to community subjects because this requires references to one's belief about one's own disorder. Clear deficiencies in Appreciation (0 credit) were found for 12% of patients on Appreciation of Disorder and 7.5% of patients on Appreciation of Treatment. In the same study, distributions of scores for patients and community comparison subjects were very similar to those found for the more comprehensive and standardized research measures of Understanding, Appreciation, and Reasoning from which the MacCAT-T was derived.

Total Brief Psychiatric Rating Scale (BPRS) scores were not significantly related to any of the MacCAT-T summary ratings, but strong negative correlations were found between Understanding and BPRS Conceptual Disorganization and BPRS Hallucinations (a). These findings replicated earlier results with the parallel research measure of Understanding (UTD) (d). The correlation between MacCAT-T Understanding and Factor III of the BPRS (Thought Disorganization: items 4, 8, 12, 15) was -.21 (not statistically significant), considerably lower than the -.44 in an earlier study using the parallel research measure of Understanding (UTD) (d).

None of the MacCAT-T scales were correlated at a statistically significant level with patients' age, gender, race, number of prior hospitalizations, age at first hospitalization, highest occupation, or education.

critique. The results are consistent with the general notion that persons with schizophrenia, expected to have problems in processing information due to conceptual disorganization, would have greater difficulty in abilities related to treatment decision making. Other than good face validity, however, evidence to date is not available to assure that the MacCAT-T is measuring the specific abilities that it claims to measure. It has produced distributions of ratings on the various summary scales that were similar to those produced by the more sophisticated research measures of the same constructs. But no direct comparison has been made to determine the correlation between scores on the research measures and the summary rating scores of the MacCAT-T. Moreover, apart from poorer scores for persons with schizophrenia, no studies have examined whether the summary rating scores are related to behaviors (within or outside the consent context) to which one would expect understanding, appreciation, or reasoning to be related.

Predictive or Classificatory Utility

No studies have examined the relation of MacCAT-T ratings to clinician or judicial judgments about competence or incompetence to consent to treatment, nor to patients' performance in actual consent circumstances.

critique. Even if the MacCAT-T were known to measure accurately the concepts that it claims to measure, it is not clear that MacCAT-T ratings would necessarily be related substantially to competence decisions by judges or even by clinicians. This is because incompetence does not necessarily relate to poor capacities overall, but may also be related to deficits in a specific capacity when other capacities are intact. For example, theoretically one might have excellent Understanding and Reasoning yet poor Appreciation, with incompetence decided on the latter basis. Across cases, such circumstances probably would create only marginal correlations between competence judgments and any one type of ability assessed by the MacCAT-T.

Potential for Expressing Person-Situation Congruency

The MacCAT-T uses the patient's own disorder and treatment options as the stimuli for the inquiry about the patient's functioning on decisionmaking abilities. The ratings that it provides, therefore, are influenced not only by the patient's capacities, but also by the complexity of the patient's disorder and the particulars of the patient's treatment options. In this sense, the MacCAT-T is an index of person-situation congruency in the context of consent to treatment. Indeed, some patients who receive low ratings when being seen for a particular disorder might receive relatively higher ratings if they are seen again for a much different ("simpler") disorder and treatment circumstance.

References

(a) Grisso, T., Appelbaum, P., & Hill-Fotouhi, C. (1997). The MacCAT-T: A clinical tool to assess patients' capacities to make treatment decisions. Psychiatric Services, 48, 1415-1419.

(b) Grisso, T., & Appelbaum, P. (1998a). Assessing competence to consent to treatment: A guide for physicians and other health professionals. New York: Oxford University Press.

(c) Grisso, T., & Appelbaum, P. (1998b). MacArthur Competence Assessment Tool for Treatment (MacCAT-T). Sarasota, FL: Professional Resource Press.

(d) Grisso, T., & Appelbaum, P.S. (1995). The MacArthur Treatment Competence Study, III: Abilities of patients to consent to psychiatric and medical treatment. Law and Human Behavior, 19, 149-174.

MacArthur Competence Assessment Tool for Clinical Research (MacCAT-CR)

Authors

Appelbaum, P. S., & Grisso, T.

Primary Author Affiliation

Department of Psychiatry, University of Massachusetts Medical School, Worcester MA

Primary Reference

Appelbaum, P. S., & Grisso, T. (2001). MacArthur Competence Assessment Tool for Clinical Research (MacCAT-CR). Sarasota, FL: Professional Resource Press

Description

The MacArthur Competence Assessment Tool for Clinical Research (MacCAT-CR) was designed as a "structured interview schedule for assessing decision-making abilities relevant for judgments about subjects' competence to consent to participation in research" (c, p. 1). It was derived from the MacArthur Competence Assessment Tool for Treatment (see previous review). It provides an interview procedure and a method for rating the quality of subjects' responses to the interview questions. Its objective is to provide ratings and summary scores for four constructs, called "Understanding," "Appreciation," "Reasoning," and "Expressing a Choice."

The MacCAT-CR content is drawn from the specific research study for which a person is being asked to provide consent. The researcher selects information about the study taken from the study's basic informed consent disclosure, which is then used to formulate various parts of the MacCAT-CR interview process. Those contents also become the basis for the ratings of subjects' responses concerning their adequacy.

The interview begins with a section called "Understanding," involving 5 brief disclosures regarding the nature of the study, with each disclosure paragraph being followed by 1 or more questions (13 questions in all). These are outlined as follows:

• Disclosure of Nature of the Project, followed by 4 questions, 1 assessing understanding of the nature of the project and 3 assessing understanding of any 3 primary procedural elements of the project (e.g., duration, daily doses of medication).

• Disclosure that the primary purpose of the project is research, followed by 1 question assessing understanding.

• Disclosure concerning how this differs from individualized health care, and 3 questions assessing understanding of that element.

• Disclosure of the study's potential benefits and risks/discomforts, and 4 questions assessing understanding of them.

• Disclosure of subject's right to refuse or withdraw after consent, and 1 question assessing its understanding.

Subjects' 13 answers contribute to an Understanding score, based on ratings of their answers to each question on a 0-2 basis (total Understanding scores ranging from 0-26).

The second major section of the procedure ("Appreciation") focuses on subjects' abilities to acknowledge how they themselves will be affected by a decision to participate. The 3 questions focus on recognition that the study is not being done for their personal benefit, that there is a possibility of reduced benefit compared to other clinical options, and recognition that they can withdraw from the study. Their 3 answers contribute to an Appreciation score based on ratings of 0-2 for each answer (total Appreciation scores ranging from 0-6).

Next the subject is asked to make a choice about participation, contributing to a 0-2 rating indicating their ability to "Express a Choice."

Finally, subjects are asked to explain the reasons for their choice ("Reasoning"). In this context, they are also asked to describe some of the ways that participating in the project will "affect your everyday activities," then asked to provide a final decision about participation. Answers are scored for 4 matters (patterned after elements in Thinking Rationally about Treatment, reviewed later in this chapter, as well as the MacCAT-T):

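• Consequential Thinking: A person's consideration of the consequences of a treatment alternative when deciding whether to reject or accept that alternative (or others).

• Comparative Thinking: A person's comparison of two or more alternatives (or their consequences) when explaining a choice.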

• Generating Consequences: A person's capacity to generate potential real-life consequences of the liabilities described in an informed consent disclosure of a treatment alternative.

• Logical Consistency: Degree to which one's choice follows logically from one's explanation for the choice.

Ratings of 0-2 for each of these concepts contribute to a Reasoning score that ranges from 0-8.

Administration requires about 20-25 minutes. The manual (c) provides an example of the use of the MacCAT-CR in a study of a new medication for schizophrenia, demonstrating for researchers how to transport research protocol information into the MacCAT-CR interview. A record form is provided for the clinician to use for organizing the project information to be presented to the subject and for recording the subject's responses to the assessment inquiries. A procedure and criteria for rating subjects' responses are also provided, as well as a summary rating form.

The MacCAT-CR manual does not provide cut-off scores on these Summary Rating scales for adequate or inadequate performance. The authors explain that in theory there is no absolute level of ability that indicates competence or incompetence, because this will vary with the demands of a subject's particular situation (c). Moreover, there is no "total MacCAT-CR" score; each of the four elements is reported as a separate scale.
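
The scoring structure described above can be illustrated with a minimal sketch. The item counts and 0-2 rating range come from the description; the function name, the data layout, and the example subject are hypothetical and are not taken from the MacCAT-CR manual.

```python
# Number of 0-2 rated items per MacCAT-CR scale, as described above.
ITEMS_PER_SCALE = {
    "Understanding": 13,       # total 0-26
    "Appreciation": 3,         # total 0-6
    "Reasoning": 4,            # total 0-8
    "Expressing a Choice": 1,  # total 0-2
}


def summarize_maccat_cr(ratings):
    """Return one total per scale; deliberately no overall total.

    `ratings` maps each scale name to its list of item ratings (0, 1, or 2).
    """
    summary = {}
    for scale, n_items in ITEMS_PER_SCALE.items():
        items = ratings[scale]
        if len(items) != n_items:
            raise ValueError(f"{scale}: expected {n_items} ratings, got {len(items)}")
        if any(r not in (0, 1, 2) for r in items):
            raise ValueError(f"{scale}: each rating must be 0, 1, or 2")
        summary[scale] = sum(items)
    return summary


# Hypothetical subject: strong Understanding and Reasoning, poor Appreciation.
example = {
    "Understanding": [2] * 13,
    "Appreciation": [0, 1, 0],
    "Reasoning": [2, 2, 2, 2],
    "Expressing a Choice": [2],
}
summary = summarize_maccat_cr(example)
print(summary)
```

Keeping the four totals separate means that a weak scale cannot be masked by strong performance on the others, which a summed score would allow.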

Conceptual Basis

concept definition. The concepts of Understanding, Appreciation, Reasoning, and Expressing a Choice used in the MacCAT-CR are identical to the concepts underlying the MacCAT-T (reviewed earlier) and the three research instruments reviewed later in this chapter (see descriptions of the UTD, POD, and TRAT, respectively). The authors' decision to use these concepts to represent abilities related to competence to consent to research participation was primarily in deference to theoretical consistency with the area of competence to consent to treatment, which has received much more attention with regard to legal conceptualization of relevant functional abilities.

operational definition. The structure of the interview was drawn from legal and ethical guidelines regarding the essentials for disclosure of information in an informed consent process. To this was added a process of decisionmaking such as would normally occur in a subject's response to disclosure and invitation to participate in research. As noted in the description above, the logic and construction of the items and ratings for the MacCAT-CR followed closely the format used by its predecessor, the MacCAT-T.

critique. The MacCAT-CR uses constructs that are based on legal analysis of competence and that have proved useful in other research on civil competencies (a, b). Like the MacCAT-T, its format is especially adaptable to individual circumstances associated with the range of disorders and treatments that may be the subject of clinical research. This same format, however, makes the MacCAT-CR a somewhat different "test" from one study to another. Thus its design sacrifices some degree of cross-study comparison (e.g., generalized psychometric reliability, the ability to develop generalized norms) for the sake of versatility in covering the range of content and circumstances represented in clinical research studies.

The absence of a total MacCAT-CR score may seem unfortunate to some users, but it makes sense conceptually because "competence" is not unidimensional. For example, if there were a MacCAT-CR total score, some people might obtain a high score yet still be considered incompetent. This could happen if they performed well on two subtests (for example, perfect Understanding and Reasoning) but poorly on one other subtest (e.g., poor Appreciation due to total denial that the disorder applies to oneself).

Psychometric Development

standardization. Administration procedures, interview questions, probing, and rating criteria are described in the manual. While the procedure is standardized in its sequence and the types of information provided and questions asked, it is not standardized with regard to content. For example, patients in different studies will receive somewhat different information about symptoms of disorder (because the symptoms described are those relevant for a specific study). For this reason, the rating criteria are general rather than specific. For example, the rating criteria for Understanding cannot spell out exactly the response that is "adequate" for every possible research condition, but rely instead on the clinician's impression that the subject's description was generally accurate or inaccurate (in the clinician's opinion).

reliability. For two studies reporting interrater reliability (e, f), intraclass correlations were .98 and .94 for Understanding, .84 and .90 for Appreciation, and .84 and .80 for Reasoning. In one of these studies (f), interexaminer reliability (two separate interviews of the same subject by different interviewers) was .77 for Understanding, .68 for Appreciation, and .82 for Reasoning.

An examination of test-retest performance with a sample of persons with major depression found no significant increase in mean scores on any of the MacCAT-CR scales (d). However, correlations between scores on first and second administration were relatively low for Understanding (.26) and Appreciation (.36), and there was no significant relation between first and second Reasoning scores (-.15). The latter result may have been produced by the tendency of some subjects to truncate their Reasoning explanations on the second administration.

norms. No norms are provided. Three studies provide data for samples of prospective research participants with schizophrenia (e), major depression (d), and Alzheimer's Disease (f).

critique. Interrater reliability as reported was quite good. One is cautioned, however, that such findings might not generalize to other studies using the MacCAT-CR, because each study employs somewhat different content based on the specific projects for which subjects are being recruited. Nevertheless, the results suggest that MacCAT-CR rating procedures have enough structure to offer potential for reliable scoring. It is unlikely that meaningful norms can be produced for an instrument like the MacCAT-CR, because the mean and distribution of scores for any given project are likely to be dependent on the specific content of the study in question.

Construct Validation

Three studies have examined MacCAT-CR scores in psychiatric samples (major depression, d; schizophrenia, e; Alzheimer's Disease, f). The latter two studies employed comparison groups with only minor medical or psychiatric problems; their mean performance on the main MacCAT-CR scales was very high. The study with a sample of patients with major depression (d) reported similarly high scores. Their Hamilton Depression Rating scores, however, suggested only "moderate" depression.

In contrast, significantly lower mean scores on Understanding, Appreciation and Reasoning were obtained by persons with schizophrenia compared to individuals recruited from a medical clinic (without mental disorder) (e). Similarly, significantly lower mean scores on these scales were found for Alzheimer's Disease patients compared to non-Alzheimer individuals of similar age and education (f).

Correlations between MacCAT-CR scores and Brief Psychiatric Rating Scale scores for patients with schizophrenia were moderate (r = -.34 for Understanding, -.27 for Appreciation, and -.47 for Reasoning) (e). Substantial r-squares were found in this sample for the relation of MacCAT-CR scores to scores on the Repeatable Battery for the Assessment of Neuropsychological Status (R² = .67 with Understanding, .66 with Appreciation, and .58 with Reasoning). In this study, using an educational intervention following first administration of the MacCAT-CR significantly increased schizophrenic patients' Understanding scores to a level that was not significantly different from that of the non-schizophrenic comparison group.
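
Because the symptom-severity results are reported as r and the neuropsychological results as R², the two sets of figures are easier to compare once the correlations are squared. The short sketch below does only that arithmetic. It assumes the reported R² values can be read as proportions of shared variance comparable to a squared bivariate correlation, which the source does not confirm.

```python
# Correlations reported above between BPRS symptom severity and MacCAT-CR scales.
bprs_r = {"Understanding": -0.34, "Appreciation": -0.27, "Reasoning": -0.47}

# Squaring r gives the proportion of variance shared between two measures.
bprs_r_squared = {scale: round(r ** 2, 2) for scale, r in bprs_r.items()}

# R-squared values reported above for the RBANS.
rbans_r_squared = {"Understanding": 0.67, "Appreciation": 0.66, "Reasoning": 0.58}

for scale in bprs_r:
    print(scale, bprs_r_squared[scale], rbans_r_squared[scale])
```

On this reading, symptom severity shares roughly 7% to 22% of variance with the MacCAT-CR scales, compared with 58% to 67% for the neuropsychological battery.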

critique. The results are generally consistent with expectations based on the nature of symptoms associated with the psychiatric groups in these studies. MacCAT-CR tasks are substantially cognitive in nature, and the patient groups for whom symptoms that impair cognitive functioning are most typical performed poorest. Moreover, MacCAT-CR scores were related to severity of symptoms and indexes of the seriousness of cognitive impairment in the expected direction. The resulting improvement in MacCAT-CR scores with educational intervention is intriguing, suggesting not only the instrument's responsiveness to expected changes but also the potential to improve patients' abilities to provide competent consent despite initial appearance of incapacity.

Predictive or Classificatory Utility

The study noted above involving Alzheimer's Disease patients (f) used a consensus of clinicians (without access to the MacCAT-CR scores) to categorize the patients as "capable" or "incapable" of providing competent consent to research participation. In Receiver Operating Characteristics analyses, "areas under the curve" were .90 for Understanding, .86 for Appreciation, and .88 for Reasoning. Using optimal cut-offs created by these analyses, about 62% of the subjects were "incapable" on at least one of the three ability measures.
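
For readers unfamiliar with the "area under the curve" statistic, the following sketch shows one standard way to compute it: as the proportion of capable-incapable pairs in which the capable subject has the higher score. The scores are hypothetical and are not data from the study; the function name is likewise an illustration, not the procedure used in (f).

```python
def roc_auc(scores_capable, scores_incapable):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen capable subject outscores
    a randomly chosen incapable subject (ties count one half).
    """
    wins = 0.0
    for c in scores_capable:
        for i in scores_incapable:
            if c > i:
                wins += 1.0
            elif c == i:
                wins += 0.5
    return wins / (len(scores_capable) * len(scores_incapable))


# Hypothetical Understanding scores (0-26), grouped by clinician consensus.
capable = [24, 22, 26, 19, 21]
incapable = [12, 18, 20, 9, 15]
auc = roc_auc(capable, incapable)
print(auc)
```

An area of .50 would mean the scale separates the two clinician-defined groups no better than chance, and 1.0 would mean perfect separation.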

critique. These are promising results, despite the fact that one cannot be certain about the reliability of the clinician judgments as the criterion for evaluating the MacCAT-CR.

Potential for Expressing Person-Situation Congruency

The MacCAT-CR uses the research project's own protocol and circumstances as the stimuli for the inquiry about the patient's functioning on decisionmaking abilities. The ratings that it provides, therefore, are influenced not only by the patient's capacities, but also by the complexity of the design of the research project in which they are being asked to participate. Thus the method automatically evaluates the person's abilities in relation to the demands of the specific project for which they are deciding about participation.

References

(a) Appelbaum, P., & Grisso, T. (1988). Assessing patients' capacities to consent to treatment. New England Journal of Medicine, 319, 1635-1638.

(b) Appelbaum, P., & Grisso, T. (1995). The MacArthur Treatment Competence Study, I: Mental illness and competence to consent to treatment. Law and Human Behavior, 19, 105-126.

(c) Appelbaum, P. S., & Grisso, T. (2001). MacArthur Competence Assessment Tool for Clinical Research (MacCAT-CR). Sarasota, FL: Professional Resource Press.

(d) Appelbaum, P. S., Grisso, T., Frank, L., O'Donnell, S., & Kupfer, D. (1999). Competence of depressed patients for consent to research. American Journal of Psychiatry, 156, 1380-1384.

(e) Carpenter, W., Gold, J., Lahti, A., Queern, C., Conley, R., Bartko, J., Kovnick, J., & Appelbaum, P. S. (2000). Decisional capacity for informed consent in schizophrenia research. Archives of General Psychiatry, 57, 533-538.

(f) Kim, S., Caine, E., Currier, G., Leibovici, A., & Ryan, J. (2001). Assessing the competence of persons with Alzheimer's Disease in providing informed consent for participation in research. American Journal of Psychiatry, 158, 712-717.

Understanding Treatment Disclosures (UTD)

Authors

Grisso, T., & Appelbaum, P. S.

Primary Author Affiliation

Department of Psychiatry, University of Massachusetts Medical School, Worcester MA

Primary Reference

Grisso, T., & Appelbaum, P. S. (1992). Manual for Understanding Treatment Disclosures. Worcester MA: University of Massachusetts Medical School. (Available from the authors.)

Description

Understanding Treatment Disclosures (UTD) (d) was developed as a research instrument to measure patients' understanding of information similar to disclosures in informed consent for treatment. The UTD was developed for use in a project, the MacArthur Treatment Competence Study (b), focused on identifying patients' capacities to make decisions about consent to or refusal of treatment when they are hospitalized for mental disorders.

The instrument consists of three subtests that involve disclosing a standardized treatment situation to the respondent, followed by questions to assess the respondent's understanding of the information. This may be done with any of three forms that pertain to three different disorders: Schizophrenia, Major Depression, and a non-psychiatric disorder, Ischemic Heart Disease. All three forms have the same format and length. In the research study for which the instrument was developed, patients received the form that corresponded to their own disorder, although the instrument provides standardized disclosures to all patients with that disorder, not disclosures that are "tailored" precisely to their own symptoms.

The respondent is first given (in oral and written form) five brief paragraphs (two to three short sentences each) of information about the disorder and its treatment:

• General description of the disorder and its symptoms.

• Brief description of a medication that is frequently prescribed for the disorder.

• Symptoms that the medication is expected to relieve, and the likelihood of relief.

• Possible side-effects and their likelihood.

• Alternative treatment (psychotherapy for the psychiatric disorders, surgery for the medical disorder) and a comment about a benefit and risk for this alternative.

The information is provided to the examinee in a procedure that is called "uninterrupted disclosure," meaning that the information is provided from beginning to end before any questioning occurs. Then 10 standardized questions are asked to assess understanding of the information, with respondents answering by offering their own paraphrase of the information presented. This is called "paraphrased recall."

The respondent then receives a procedure called "element disclosure," in which each of the same five paragraphs is presented again, but one paragraph at a time, with questioning occurring immediately after each paragraph disclosure. The questioning at this point is of two kinds: "paraphrased recall," and then "recognition." The latter questioning involves the presentation of four standardized statements about information of a type in the disclosure. Two of these statements say the same thing as the information in the disclosed paragraph but in different words, while the other two say something different. The respondent must indicate whether each statement is "the same" or "different" from the disclosed information. Together these procedures create the instrument's three subtests:

• Uninterrupted Disclosure, Paraphrased Recall

• Element Disclosure, Paraphrased Recall

• Element Disclosure, Recognition

The UTD manual provides objective criteria for scoring responses. Answers to each of the five questions per subtest may receive from 0 to 2 points. There is no single "UTD Score"; that is, the scores on the three subscales are not summed to provide a summary score. The instrument yields three subscale scores (UD-PR, ED-PR, and ED-RC), each ranging from 0 to 10 points.

Conceptual Basis

concept definition. The decision to measure understanding of information that is typically provided in the informed consent process was based on the authors' review of relevant law that identified four constructs that are used in legal definitions of competence to consent to treatment: Understanding, Appreciation, Reasoning, and Ability to Express a Choice (a, b). (See the next two reviews for instruments related to the other legal constructs.) Understanding was defined as the ability to "comprehend the meaning and intent of that which they have been told in the informed consent process" (d, p. 2).

operational definition. The authors developed disclosures for three different disorders in preparation for including these three patient groups in a major study of patients' treatment decision-making abilities. The five types of information provided in the disclosures were selected specifically because legal cases have established these as the necessary elements of adequate disclosure for informed consent.

Information included in each element of the disclosure was not intended to be comprehensive, but rather to be representative of the type of information the element represents. Moreover, the information disclosed to examinees was standardized, not adjusted to the individual circumstances of each patient because the UTD was developed specifically for use in a research study in which the design required similar stimulus conditions across participants. The disclosure elements were written rather simply, creating a level of "reading ease" that was calculated at about the average for 7th to 9th grade. Pilot studies indicated that the various forms of the UTD did not differ in their level of difficulty in understandability or readability.

The various stimulus and response subtests were selected in order to be able to control for various sources of error in the assessment of individuals' understanding. Examinees' responses to Uninterrupted Disclosure, for example, might be impaired because of poor comprehension or because of difficulty in recalling pieces of the disclosure embedded in a large amount of information. The Element Disclosure procedure, in contrast, provides an index of comprehension in a process that minimizes demands on memory. The use of a Recognition response mode in addition to Paraphrased Recall response mode recognizes that some examinees might understand information they are provided but might not have the capacity to express it.

critique. The logic for the selection of content and the various response formats seems sound. The standardization of the content is a significant advantage in research situations, although it necessarily reduces the value of the instrument in individual forensic cases. Used for clinical purposes, the instrument could provide an index of the person's capacities for understanding the type of information associated with a particular disorder, but not necessarily the information that is specific to his or her case.

Psychometric Development

standardization. Administration procedures, interview questions, probing, and scoring criteria are very specific. The level of scoring judgment required by the criteria is no greater than is employed in standardized intelligence tests like the Wechsler.

reliability. Interscorer reliability (f) was determined for 10 recently trained research assistants, based on independent scoring of 20 protocols. Kappa coefficients for individual UTD items were .60 or above for 90% of the comparisons, and .70 or above for 74% of the comparisons. Intraclass correlations for subtest scores (for Uninterrupted-Paraphrase and Element-Paraphrase) were all above .84. (Reliability was not checked for Element-Recognition scores, which are entirely objective.) Test-retest reliability with a two-week interval ranged from .50 to .80, depending on the subtest and the diagnostic identification of the participant samples.
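
As an illustration of the item-level agreement statistic mentioned above, the sketch below computes Cohen's kappa for two scorers rating the same responses on the 0-2 scale. It assumes the pairwise, two-rater form of kappa; the source does not specify which variant was used, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same responses.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    # Proportion of responses on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical 0-2 scores from two scorers on the same ten responses.
scorer_1 = [2, 2, 1, 0, 2, 1, 1, 0, 2, 2]
scorer_2 = [2, 2, 1, 0, 1, 1, 1, 0, 2, 1]
kappa = cohens_kappa(scorer_1, scorer_2)
print(round(kappa, 2))
```

In this example the scorers agree on 8 of 10 responses, but kappa is lower than .80 because some of that agreement would be expected by chance alone.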

norms. The UTD manual does not provide norms, but normative data are available in the report of the study for which the instrument was developed (e). Mean scores and distributions for the three subscales are provided for hospitalized patients, 75 with schizophrenia, 92 with major depression, and 82 with angina, as well as three comparison groups of similar size comprising persons in the community without mental disorders.

critique. The UTD's psychometric properties are generally acceptable.

Construct Validation

Internal consistency, as indexed by alpha coefficients and corrected item-total correlations, was highest on all three subscales for patients with schizophrenia (alpha = .75-.85; average item-total r's = .52-.66), and lowest for persons in the community with no mental disorder (alpha = .55-.70; average item-total r's = .31-.46) (f). In a factor analysis that included the three disclosure procedures of the UTD together with 10 other subtests from two measures of Appreciation and Reasoning abilities (two other abilities related to legal competence: see later reviews of POD and TRAT), the highest loading subtests on the first factor were the three disclosure procedures (e). This is consistent with the assumption that the instrument measures a construct that is distinct from other legal constructs employed in competence determinations.
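
The alpha coefficient reported above can be sketched as follows for a five-item subscale scored 0-2 per item. This is the standard Cronbach's alpha formula, not code from the UTD authors, and the respondent data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, one list of item scores per respondent.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals).
    Undefined when all respondents have the same total score.
    """
    k = len(rows[0])
    items = list(zip(*rows))  # regroup scores item by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 0-2 scores for six respondents on a five-item subscale.
responses = [
    [2, 2, 1, 2, 2],
    [1, 1, 1, 0, 1],
    [2, 1, 2, 2, 2],
    [0, 1, 0, 0, 1],
    [1, 2, 1, 1, 2],
    [2, 2, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha rises when respondents who score high on one item also tend to score high on the others, which is why uniformly high performance with little variability (as in a non-ill comparison group) can yield a lower coefficient.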

For long-term care residents, Pruchno et al. (g) reported a higher relation between the UTD Recognition task and MMSE scores (.68) than between UTD recall (paraphrase) tasks and MMSE scores (.50). UTD Recognition tasks were also more highly correlated to clinicians' independent judgments about competence (.60) than were UTD recall (paraphrase) tasks (.45).

Grisso and Appelbaum (e) reported significantly lower UTD scores on all subscales for persons hospitalized for schizophrenia than for their community (non-mentally ill) comparison group, and lower UTD scores on Element-Paraphrase and Element-Recognition for persons hospitalized for major depression than for their community comparison group. Chronicity of disorder (age at first hospitalization, number of prior admissions) was not related to UTD scores. Among persons with major depression, significant correlations were reported for IQ and performance on all three subtests, but not for persons with schizophrenia. In contrast, among persons with schizophrenia, significant negative correlations (-.33 to -.41) were reported between symptom severity (Brief Psychiatric Rating Scale) and performance on all three subtests (especially for symptoms of thought disorganization), but not for persons with major depression. Similarly, in a separate study, symptom severity among depressed patients was not significantly related to UTD performance (c).

critique. Some caution is warranted in regard to internal consistency for persons without mental disorders. It is possible that higher internal consistency was obtained for patients as an artifact of general negative effects of psychopathology on performance (affecting item responses relatively consistently), whereas understanding is less consistent (that is, more dependent on inherent differences in difficulty of the various elements in the disclosure) for persons without mental disorders.

UTD scores were lower for diagnostic groups and measures of psychopathology that would be expected to increase the risk of poor understanding of information presented in the context of informed consent procedures. The correlation between symptom severity and performance within the schizophrenia group, however, indicates that "poor understanding" as measured by the UTD is not synonymous with serious mental disorder. Variability among persons with schizophrenia was considerable, whereas performance was uniformly high among persons in the non-ill comparison.

Predictive or Classificatory Utility

Studies thus far have not examined the degree to which UTD scores correspond to external criteria for understanding of other information or for competence to consent to treatment.

critique. Tests of the relation of UTD scores to courts' determinations of competence to consent to treatment would need to take into account that courts may find patients incompetent for many reasons other than deficiencies in their ability to understand treatment information.

Potential for Expressing Person-Situation Congruency

Various disorders and treatment situations differ in the complexity of their options, risks, benefits, and the difficulty level of the concepts or treatments that must be understood. The disclosed treatment situations used in the UTD were purposely standardized to meet the needs of a research study. Thus they provided no opportunity to examine patients' performance in relation to varying demands in terms of complexity of the treatment options.

References

(a) Appelbaum, P., & Grisso, T. (1988). Assessing patients' capacities to consent to treatment. New England Journal of Medicine, 319, 1635-1638.

(b) Appelbaum, P., & Grisso, T. (1995). The MacArthur Treatment Competence Study, I: Mental illness and competence to consent to treatment. Law and Human Behavior, 19, 105-126.

(c) Frank, L., Smyer, M., Grisso, T., & Appelbaum, P. S. (1999). Measurement of advance directive and medical treatment decision-making capacity of older adults. Journal of Mental Health and Aging, 5, 257-274.

(d) Grisso, T., & Appelbaum, P. S. (1992). Manual for Understanding Treatment Disclosures. Worcester MA: University of Massachusetts Medical School.

(e) Grisso, T., & Appelbaum, P. S. (1995). The MacArthur Treatment Competence Study, III: Abilities of patients to consent to psychiatric and medical treatment. Law and Human Behavior, 19, 149-174.

(f) Grisso, T., Appelbaum, P. S., Mulvey, E., & Fletcher, K. (1995). The MacArthur Treatment Competence Study, II: Measures of abilities related to competence to consent to treatment. Law and Human Behavior, 19, 127-148.

(g) Pruchno, R., Smyer, M., Rose, M., Hartman-Stein, P., & Henderson-Laribee, D. (1995). Competence of long-term care residents to participate in decisions about their medical care: A brief, objective assessment. The Gerontologist, 35, 622-629.

Perceptions of Disorder (POD)

Authors

Appelbaum, P. S., & Grisso, T.

Primary Author Affiliation

Department of Psychiatry, University of Massachusetts Medical School, Worcester MA

Primary Reference

Appelbaum, P. S., & Grisso, T. (1992). Manual for Perceptions of Disorder. Worcester MA: University of Massachusetts Medical School. (Available from the authors.)

Description

Perceptions of Disorder (POD) (b) was developed as a research instrument to measure patients' acknowledgement of their disorders and acknowledgement of the potential value of treatment for their illnesses. The POD was developed for use in a project, the MacArthur Treatment Competence Study (c), focused on identifying patients' capacities to make decisions about consent to or refusal of treatment when they are hospitalized for mental disorders. The authors clearly describe the POD as a tool for research investigation of non-acknowledgement of one's disorder or the potential for treatment to be beneficial, not as a clinical instrument to be used in forensic assessments intended for use in determining a patient's competence to consent to treatment (b).

The instrument is designed for use with patients who have a mental disorder or medical illness. The instrument has three parts, two of which are called "Non-Acknowledgement of Disorder" (NOD) and "Non-Acknowledgement of Treatment Potential" (NOT). (The third part is not discussed here, because it consisted of exploratory items for which no scoring or research results were reported.) Both of these procedures have three interview questions.

For "Non-Acknowledgement of Disorder" (NOD), the first item is introduced by the examiner using standardized wording to describe symptoms, but allows the patient's own symptoms (from their hospital chart) to be inserted in the narrative. Patients are then asked whether they believe that they have those symptoms, and are asked to indicate the degree of affirmative or negative opinion they have about this on a card showing a 6-point range of options. The second NOD item asks patients how serious they believe their symptoms are and obtains their answer on the same rating card. The third item uses a standardized format to indicate the name of their disorder (e.g., "schizophrenia"), to characterize its symptoms, and to tell the patient that the patient's doctor has made this diagnosis in their case. Then the patient is asked (using the same 6-point rating card) the degree to which the patient agrees or disagrees with the diagnosis.

Scoring of NOD Items 1 and 3 is based on the patient's rating. Scoring of Item 2 is based on a formula that indicates the correspondence or discrepancy between the patient's rating of perceived severity of symptoms and the actual severity of the patient's symptoms (which requires a measure of severity such as the Brief Psychiatric Rating Scale or the Beck Depression Inventory). Scores of 0, 1, or 2 are possible on each item, leading to NOD scores ranging from 0-6, with lower scores indicating greater non-acknowledgement of disorder.

For "Non-Acknowledgement of Treatment Potential" (NOT), the first item uses a standardized script to inform the patient of several types of treatment for the patient's disorder, then asks the patient to rate the degree to which he or she believes that "you have the kind of condition for which some types of treatment might be helpful." The second NOT item describes medication as an option, indicates that it is often helpful (for "75-90% of patients"), and obtains the patient's rating of the degree to which the patient believes that medication "might be helpful for you." Finally, the third item uses a standardized format to inform the patient that patients with mental disorders who decide not to take medications often do not improve or get worse. Then the rating card is used to obtain their level of belief that "you might get better without medication."

Whenever patients indicate any degree of disagreement with NOT items 1 or 2, or in any degree show belief that they might get better without medication on item 3, they are asked to explain their answer ("What makes you believe that..."). Their answer is then used to select from the manual any of several "hypotheticals" that have been written to "negate" the underlying presumption, or "premise," in the patient's explanation. For example, if the patient's explanation indicates the basic premise, "I'm too far gone for anything to help," the examiner selects the "Too Sick" hypothetical, and it is administered to the patient: "Imagine that a doctor tells you that there is a medication that has been shown in research to help 90% of people with problems just as serious as yours. Do you think this medication, if it existed, might be of more benefit to you than getting no treatment at all?" The patient answers on the 6-point rating form. Patients then receive a score based on their answer to the hypothetical, not on the answer they provided when rating the original question. Scores of 0, 1, and 2 are possible on each item, leading to NOT scores ranging from 0-6, with lower scores indicating greater non-acknowledgement of treatment potential.
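
The NOT scoring rule just described (the answer to the hypothetical replaces the answer to the original question whenever a hypothetical is administered) can be sketched as below. The sketch assumes each answer has already been converted from the 6-point rating card to a 0-2 item score; that conversion is defined in the manual and is not reproduced here, and the function names and example values are hypothetical.

```python
def score_not_item(original_score, hypothetical_score=None):
    """Score one NOT item on the 0-2 scale.

    If a hypothetical was administered, its score is used and the score for
    the original question is discarded; otherwise the original score stands.
    """
    if hypothetical_score is not None:
        return hypothetical_score
    return original_score


def score_not(items):
    """Sum three (original_score, hypothetical_score) pairs into a 0-6 total."""
    return sum(score_not_item(original, hypothetical) for original, hypothetical in items)


# Hypothetical patient: agrees on item 1 (no hypothetical needed); disagrees on
# items 2 and 3, shifts partly on the item 2 hypothetical, not at all on item 3.
patient_items = [(2, None), (0, 1), (0, 0)]
not_total = score_not(patient_items)
print(not_total)
```

The design choice this captures is that only a belief maintained in the face of contradictory (hypothetical) information lowers the score.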

Conceptual Basis

concept definition. The decision to measure a construct of Appreciation was based on the authors' review of relevant law that identified four constructs that are used in legal definitions of competence to consent to treatment: Understanding, Appreciation, Reasoning, and Ability to Express a Choice (a, c; also see h). (See the previous and following reviews for instruments related to the other legal constructs.) The authors identified the construct to be measured as "appreciation of the significance of the information (in a treatment disclosure) for one's own circumstances" (b). The authors further defined Appreciation as "non-acknowledgement of disorder" and "non-acknowledgement of treatment potential." They identified several reasons that a person might disavow that they have a disorder or that treatment might be of value. These include reality-based reasons, value-based reasons, coping and defense-based reasons, and organically-based reasons. Only "defense-based" disavowals are relevant for weighing competence to consent to treatment, they claimed, in that this is closest to the "lack of insight" into one's illness that is of concern in many cases of incompetence to consent to treatment, especially in clinical cases involving psychotic symptoms. Thus it is not mere disavowal of symptoms or potential values of treatment that must be measured, but rather disavowal that is the product of mechanisms of denial or psychological distortion of reality, such as is found in some cases of psychosis involving delusional thinking.

operational definition. The decision to create two types of questions, focused on possible disavowal of one's illness and disavowal of the value of treatment, was based on the authors' review of the types of concerns about "appreciation" most often raised in cases involving questionable competence to consent to treatment. Similar guidance was used to arrive at the specific items within the two parts of the measure. The decision to base NOT scores on patients' responses to hypothetical questions that negate their disavowal of the value of treatment was based on the notion that only beliefs that are held rigidly in the face of contradictory information (albeit hypothetical information) should represent lack of appreciation, as that concept is used in literature on competence to consent to treatment.

critique. There has been spirited commentary regarding the POD's operationalization of the concept of appreciation (i, j, k). The debate focused primarily on three things:

• whether the nature of the measure might discriminate against persons with ethnic/cultural or religious views that are contrary to medical or psychological notions of "disorder" and "treatment,"

• whether the POD authors adequately defined the concept of appreciation, and

• whether their operational definition of the concept adequately represents the concept.

The authors addressed the first question (raised by Stefan, k), at least regarding ethnic/cultural issues, by determining that there were no significant differences between African-Americans and other subjects in their study on relevant POD measures (f). Saks and Behnke (i), however, contend that "lack of appreciation" for purposes of competence to consent should be restricted to those cases in which a person's denial of illness or treatment potential is clearly associated with "patently false beliefs" (see also Slobogin, j). They would not interpret a person's beliefs that are contrary to psychiatric diagnoses or empirical notions of treatment benefit as being a "lack of appreciation" if they are based on religious beliefs or on mere "defensiveness."

The POD authors claim that the nature of the POD's questioning typically will not result in a low score for persons with religious beliefs against treatment (examinees are not required to "accept" the treatment or approve of it, only to acknowledge that it might be of some benefit). On the other hand, the authors have acknowledged that after considerable efforts they were unable to devise an operational definition of a "patently false belief" (f). To this extent the POD falls short of representing the narrow definition of appreciation that some theorists would prefer.

Psychometric Development

standardization. Administration procedures, interview questions, probing, and scoring criteria are very specific. The process for discovering the patient's symptoms in order to insert them in the standardized script is described, although there is no protection against individual differences between examiners in their selection of symptoms among the materials that they encounter on a patient. Some judgment is needed to discover the examinee's premise for having rejected the potential value of treatment and in selecting a hypothetical based on that premise.

reliability. Interscorer reliability was not examined in the primary study for which the POD was developed because its scoring is based solely on examinees' own ratings (g). Test-retest reliability with a two-week interval was .90 on the NOD for patients with schizophrenia and .59 for patients with major depression; on the NOT it was .66 for patients with schizophrenia and .48 for patients with major depression (g).
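Test-retest coefficients of this kind are commonly Pearson correlations between scores from the two administrations. The review does not state which coefficient was used, so that is an assumption here, and the scores below are invented for illustration only.

```python
from math import sqrt
from typing import Sequence


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Pearson correlation between two administrations of the same measure."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long score lists with at least 2 cases")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    var_x = sum((x - mean_x) ** 2 for x in first)
    var_y = sum((y - mean_y) ** 2 for y in second)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both administrations")
    return cov / sqrt(var_x * var_y)


# Invented subscale totals for six examinees, tested two weeks apart.
time_1 = [6, 4, 5, 2, 6, 3]
time_2 = [6, 3, 5, 2, 5, 4]
print(round(pearson_r(time_1, time_2), 2))
```

A coefficient near 1 indicates that examinees kept roughly the same rank order across the two sessions; values such as .48 or .59 indicate considerable reshuffling.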

norms. The POD manual does not provide POD norms, but normative data are available in the report of the study for which the instrument was developed (e). Mean scores and distributions for the three subscales are provided for 75 patients hospitalized with schizophrenia, 92 patients hospitalized with major depression, and 82 hospitalized with angina.

critique. The POD's psychometric properties suggest somewhat less stability than one would hope for. Some possible reasons for this are discussed below, with reference to internal consistency.

Construct Validation

Internal consistency based on alpha coefficients was .80 for NOD and .67 for NOT. Corrected item-subtest correlations were better for items in NOD (.59 to .70) than for items in NOT (.43 to .60) (g). A factor analysis (e) that included the two POD subtests together with three disclosure procedures of the UTD (previous review) and 8 subtests of the TRAT suggested that the POD measured a construct that was quite distinct from these other two measures (neither POD subtest loaded on the two factors that emerged). In addition, the two POD subtests themselves did not share enough similarity to represent a separate factor. Consistent with this finding, the NOD and NOT were correlated only .23.
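For readers less familiar with the alpha coefficient, the following minimal Python sketch shows the standard Cronbach's alpha computation for a small item-by-examinee table. The function name and the data are illustrative assumptions; nothing here reproduces the study's data or analysis code.

```python
from statistics import pvariance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; rows[i][j] is examinee i's score on item j."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across examinees.
    item_vars = [pvariance([row[j] for row in rows]) for j in range(k)]
    # Variance of the summed (subtest) score across examinees.
    total_var = pvariance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores must vary")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Invented 0-2 scores for four examinees on a three-item subtest.
scores = [
    [2, 1, 2],
    [1, 1, 0],
    [0, 0, 1],
    [2, 2, 2],
]
print(round(cronbach_alpha(scores), 2))
```

Alpha rises as the items vary together; with only three items per subtest, as in the NOD and NOT, modest coefficients are not surprising.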

Grisso and Appelbaum (e) reported significantly more non-acknowledgement of one's disorder (lower NOD scores) for persons hospitalized for schizophrenia (about one-third of these patients showed significant signs of non-acknowledgement of disorder) than for persons hospitalized for major depression. No significant difference was found between these two groups regarding non-acknowledgement of the potential value of treatment (NOT scores), and only about 15% of both groups showed significant signs of non-acknowledgement of treatment potential. Among schizophrenia patients, those who had high BPRS scores on "Conceptual Disorganization" were more likely to disavow their symptoms (50%) than were those with low Conceptual Disorganization scores (12%). No significant relations were found between NOD or NOT scores and measures of symptom severity, IQ, or indexes of chronicity of disorder.

critique. Concerning the evidence regarding questionable internal consistency, the authors of the POD comment that these results suggest that the POD and its two subtests might not have the properties of scales. They suggest that the "POD is best seen as a set of interview screening questions, with non-acknowledgement on any one of them raising a concern about a respondent's unrealistic rejection of the relevance of diagnostic or treatment information for his or her own circumstances" (g, p. 146). This argues against interpreting POD scores as though they represented a "trait" such as generalized denial.

The results of the factor analyses suggest that the POD measures abilities or response tendencies that are quite distinct from its two companion measures (UTD and TRAT). Moreover, the two subtests of the POD do not seem to measure the same construct, or the construct that they measure can manifest itself quite differently in references to patients' beliefs about their disorder versus their beliefs about treatment.

The relations between the POD and clinical variables are generally as one would expect regarding the presumed relation between failure to acknowledge one's illness and the extent of one's psychopathology. For example, the evidence suggests that POD scores were not impaired generally across persons with significant psychopathology, but that they were impaired for persons with delusional (that is, conceptually disorganized) thinking.

Predictive or Classificatory Utility

Studies thus far have not examined the degree to which POD scores correspond to external criteria for appreciation, other measures of "insight," or other judgments about competence to consent to treatment.

critique. Tests of the relation of POD scores to courts' determinations of competence to consent to treatment would need to take into account that courts may find patients incompetent for many reasons other than deficiencies in their ability to appreciate the relevance of treatment information for their own circumstances.

Potential for Expressing Person-Situation Congruency

The questions associated with the POD were purposely standardized to meet the needs of a research study. Thus they provided no opportunity to examine patients' performance in relation to varying demands in terms of complexity of the treatment options.

References

(a) Appelbaum, P., & Grisso, T. (1988). Assessing patients' capacities to consent to treatment. New England Journal of Medicine, 319, 1635-1638.

(b) Appelbaum, P., & Grisso, T. (1992). Manual for Perceptions of Disorder. Worcester MA: University of Massachusetts Medical School.

(c) Appelbaum, P., & Grisso, T. (1995). The MacArthur Treatment Competence Study, I: Mental illness and competence to consent to treatment. Law and Human Behavior, 19, 105-126.

(d) Appelbaum, P., & Roth, L. (1982). Competency to consent to research: A psychiatric overview. Archives of General Psychiatry, 39, 951-958.

(e) Grisso, T., & Appelbaum, P. S. (1995). The MacArthur Treatment Competence Study, III: Abilities of patients to consent to psychiatric and medical treatment. Law and Human Behavior, 19, 149-174.

(f) Grisso, T., & Appelbaum, P. (1996). Values and limits of the MacArthur Treatment Competence Study. Psychology, Public Policy, and Law, 2, 167-181.

(g) Grisso, T., Appelbaum, P. S., Mulvey, E., & Fletcher, K. (1995). The MacArthur Treatment Competence Study, II: Measures of abilities related to competence to consent to treatment. Law and Human Behavior, 19, 127-148.

(h) Roth, L., Meisel, A., & Lidz, C. (1977). Tests of competency to consent to treatment. American Journal of Psychiatry, 134, 279-284.

(i) Saks, E., & Behnke, S. (1999). Competency to decide treatment and research: MacArthur and beyond. Journal of Contemporary Legal Issues, 10, 103-129.

(j) Slobogin, C. (1996). "Appreciation" as a measure of competency: Some thoughts about the MacArthur group's approach. Psychology, Public Policy, and Law, 2, 18-30.

(k) Stefan, S. (1996). Race, competence testing, and disability law: A review of the MacArthur competence research. Psychology, Public Policy, and Law, 2, 31-44.

Thinking Rationally About Treatment (TRAT)

Authors

Grisso, T., & Appelbaum, P. S.

Primary Author Affiliation

Department of Psychiatry, University of Massachusetts Medical School, Worcester MA

Primary Reference

Grisso, T., & Appelbaum, P. S. (1993). Manual for Thinking Rationally about Treatment. Worcester MA: University of Massachusetts Medical School. (Available from the authors.)

Description

Thinking Rationally about Treatment (TRAT) (d) was developed as a research instrument to measure patients' cognitive functions that are employed in the process of deciding among alternative treatments. The TRAT was developed for use in a project, the MacArthur Treatment Competence Study (a), focused on identifying patients' capacities to make decisions about consent to or refusal of treatment when they are hospitalized for mental disorders. The instrument consists of two parts, the "TRAT Vignette" and the "TRAT Tasks."

The TRAT Vignette is provided in three forms: Schizophrenia, Depression, and Ischemic Heart Disease. (In the study for which the instrument was developed, patients received the form that corresponded to their own disorder.) The individual is asked to "assist" the hypothetical patient by recommending one of three treatment alternatives, which are provided in the vignette along with a number of their benefits and liabilities. In a standardized interview procedure, a series of questions is asked in order to elicit the subject's explanation for his or her choice, providing data for scores on five of the eight cognitive functions measured in the TRAT. These five functions are conceptually and operationally defined as follows:

• Seeking Information: A person's tendency to seek information beyond that which is provided in the disclosure of a decisionmaking problem. Credit is received if the examinee requests further specific information when offered a chance to do so.

• Consequential Thinking: A person's consideration of the consequences of a treatment alternative when deciding whether to reject or accept that alternative (or others). Credit for this function is received if the examinee's explanation for choice of an alternative manifests the use of consequences in the reasoning for the choice.

• Comparative Thinking: A person's "simultaneous" processing of information about two treatment alternatives, such that they receive consideration in relation to each other, not merely as separate facts. Credit is received if the examinee's explanation for choosing a particular alternative refers to the consequences of two alternatives in reasonably close juxtaposition.

• Complex Thinking: A person's attention to the range of treatment alternatives available within a decision problem, even if only to reject them, rather than avoiding or neglecting consideration of some alternatives. Credit for this function is given if the examinee's explanation for an alternative manifests reference to the full range of treatment alternatives (three) offered in the vignette.

• Generating Consequences: A person's capacity to generate potential real-life consequences of the liabilities described in an informed consent disclosure of a treatment alternative. Credit is received if examinees are able to describe ways that medical consequences (e.g., medication side-effects) might influence their own everyday activities.

The examinee's responses to the TRAT Vignette interview questions are recorded verbatim on a structured response sheet by the examiner. They are then scored (2, 1, 0) for each of the five functions according to guidelines that are provided in the manual in two forms: as scoring criteria, and as a set of decision rules outlined in the form of flow diagrams.

The three TRAT Tasks are unrelated to the vignette. They are individual tasks that assess three additional cognitive functions associated with decision making or problem solving:

• Weighing Consequences: A person's tendency for consistent application of his/her own preferences when evaluating the desirability of consequences of the various alternatives. This function is assessed in two stages separated by other tasks. In Part I, examinees are presented with ten cards, each offering a pair of everyday activities (e.g., "Buy something at a bargain"), thus presenting all possible pairs of five activities, and in each case they are asked to choose which of the two activities they prefer. These are recorded on a response sheet. In Part II, examinees are asked to place in order of preference each of the five activities now listed singly on cards. The scoring procedure allows the examiner to score (2, 1, 0) the consistency of the examinee's stated preferences on Part I and Part II.

• Transitive Thinking: A person's functioning on a task requiring logical inferences about the relative quantitative relationships between several alternatives based on paired comparisons. Credit for this function is given for the examinee's performance on several transitive problems (e.g., A is larger than B, B is larger than C: Choose the largest).

• Probabilistic Thinking: A person's demonstration of the ability to distinguish correctly the relative values of numerical (percentage) probabilities. This is assessed with several problem questions requiring an understanding of simple percentage probability statements.
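The logic of the Weighing Consequences task above (ten paired choices in Part I, checked against the rank order given in Part II) can be illustrated with a short Python sketch. The function name, the placeholder activity labels "A" through "E", and the data are hypothetical; the manual's rule for converting the consistency count to a 2, 1, or 0 score is not reproduced here.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Sequence


def consistent_pairs(pair_choices: Dict[FrozenSet[str], str],
                     ranking: Sequence[str]) -> int:
    """Count Part I choices that agree with the Part II rank order.

    pair_choices maps each unordered pair of activities to the activity
    preferred in Part I; ranking lists the activities from most to least
    preferred, as ordered in Part II.
    """
    agree = 0
    # combinations() preserves input order, so `higher` always precedes
    # `lower` in the Part II ranking.
    for higher, lower in combinations(ranking, 2):
        if pair_choices[frozenset((higher, lower))] == higher:
            agree += 1
    return agree


ranking = ["A", "B", "C", "D", "E"]
# Start from fully consistent choices, then contradict one of them.
pair_choices = {frozenset(pair): pair[0] for pair in combinations(ranking, 2)}
pair_choices[frozenset(("B", "C"))] = "C"
print(consistent_pairs(pair_choices, ranking), "of", len(pair_choices))
```

Five activities yield the ten pairs that the text describes; in this invented example nine of the ten paired choices agree with the later ranking.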

Scores are obtained for each of the eight functions, then summed to produce a total TRAT score. For reasons explained below, the authors eventually developed a TRAT-2 score, based on the sum of scores on six of the TRAT subtests (deleting Weighing Consequences and Seeking Information).
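The relation between the two summary scores can be shown in a few lines of Python. This is a sketch of the summation described above; the function and constant names are hypothetical, and the subtest scores are assumed to have been obtained already from the manual's scoring criteria.

```python
from typing import Dict

TRAT_SUBTESTS = (
    "Seeking Information",
    "Consequential Thinking",
    "Comparative Thinking",
    "Complex Thinking",
    "Generating Consequences",
    "Weighing Consequences",
    "Transitive Thinking",
    "Probabilistic Thinking",
)

# The two subtests dropped when forming TRAT-2.
TRAT2_EXCLUDED = {"Weighing Consequences", "Seeking Information"}


def trat_totals(scores: Dict[str, int]) -> Dict[str, int]:
    """Return the eight-subtest TRAT total and the six-subtest TRAT-2 total."""
    missing = [name for name in TRAT_SUBTESTS if name not in scores]
    if missing:
        raise ValueError(f"missing subtest scores: {missing}")
    total = sum(scores[name] for name in TRAT_SUBTESTS)
    trat2 = sum(scores[name] for name in TRAT_SUBTESTS
                if name not in TRAT2_EXCLUDED)
    return {"TRAT": total, "TRAT-2": trat2}
```

An examinee scoring 1 on every subtest would thus obtain a TRAT total of 8 and a TRAT-2 total of 6.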

Conceptual Basis

concept definition. The authors explain that their intention in developing the TRAT was to measure the capacity to process information rationally in order to arrive at a treatment decision. The decision to measure this capacity was based on the authors' review of relevant law that identified four constructs that are used in legal definitions of competence to consent to treatment: Understanding, Appreciation, Reasoning, and Ability to Express a Choice (a). (See the previous two reviews for instruments related to the other legal constructs.) They provide evidence that this capacity is relevant (among other capacities) for legal determinations of competence to consent to treatment.

To select and define the various abilities to be assessed, the authors reviewed psychological theories of decision making and problem solving in order to identify abilities that seemed conceptually related to the capacity with which the law was concerned. The specific ability constructs were borrowed from several models of decision making and problem solving with special attention to abilities that the various theories had in common (e.g., c, g, h, i, j).

operational definition. The decision to create TRAT vignettes for three different disorders was based on the authors' intentions to use these three diagnostic patient groups in a major study of patients' treatment decision-making abilities. The various treatment alternatives provided in each vignette were based on actual treatment options that were typical for the disorders in question. Separate "TRAT tasks" were developed for three of the eight functions because they would have been difficult to include in the vignette procedure without greatly complicating the process. The tasks themselves were developed wholly as the authors' inventions, not borrowed from other tests.

critique. To the extent that the law is interested in patients' capacity to process information in reaching a treatment decision, the TRAT presents a reasonable effort to represent that capacity operationally. The specific abilities were selected, however, solely on the basis of their relation to psychological theories of decision making. Thus the relevance of the specific functions measured in the TRAT (e.g., Generating Consequences) for legal concerns is theoretical, not specifically referenced in past legal decisions.

Psychometric Development

standardization. Administration procedures, interview questions, probing, and scoring criteria are clearly described in the manual. The degree of scorer judgment required by the criteria for scoring the vignette subscales is somewhat greater than for the instrument's companion measures (the UTD and POD, reviewed earlier).

reliability. Interscorer reliability (f) was determined for 10 recently trained research assistants, based on independent scoring of 20 protocols. Kappa correlations for individual TRAT items were .60 or above for 76% of the comparisons, although about 7% of the comparisons were below the accepted level of statistical significance. Intraclass correlations for total TRAT scores all were above .88, and kappa correlations for individual TRAT subtests were .60 or above for 77% of the comparisons.

Test-retest reliability (f) with a two-week interval ranged from .66 to .68, depending on the subtest and the diagnostic identification of the participant samples, with no significant difference between means at first and second administration.
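The kappa statistic corrects raw agreement between two scorers for the agreement expected by chance. The Python sketch below computes Cohen's kappa for one pair of scorers; it assumes that the "comparisons" reported above were pairwise, which the source does not state, and the ratings are invented.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters scoring the same responses."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    n = len(rater_a)
    # Proportion of responses on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal score frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented 2/1/0 scores given by two scorers to the same ten responses.
scorer_1 = [2, 2, 1, 0, 1, 2, 0, 1, 2, 0]
scorer_2 = [2, 2, 1, 0, 0, 2, 0, 1, 1, 0]
print(round(cohen_kappa(scorer_1, scorer_2), 2))
```

Here the scorers agree on 8 of 10 responses, but because about a third of that agreement would be expected by chance, kappa is noticeably lower than the raw agreement rate.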

norms. The TRAT manual does not provide norms, but normative data are available in the report of the study for which the instrument was developed (e). Mean scores and distributions for the three subscales are provided for hospitalized patients, 75 with schizophrenia, 92 with major depression, and 82 with angina, as well as three comparison groups of similar size comprising persons in the community without mental disorders.

critique. The TRAT's psychometric properties are generally acceptable, but not as good as those of the UTD (reviewed earlier). Interrater reliability is somewhat weak; some of the criteria for scoring responses to the vignette procedure, although carefully described, are complex.

Construct Validation

Internal consistency based on alpha coefficients and corrected item-total correlations was best for patients with schizophrenia (alpha = .74; average item-total r = .42), and lowest for persons with ischemic heart disease (alpha = .39; average item-total r = .25) (f). A series of factor analyses of the subscales indicated that Weighing Consequences and Seeking Information were outliers in relation to the other six subscales. Thus for research purposes the authors employed only the six inter-related subscales to form a TRAT score called TRAT-2.

In a factor analysis that included the 8 TRAT subtests together with the 3 disclosure procedures of the UTD and the two subtests of the POD, the highest loading measures for the second factor were four of the TRAT subtests that employ the TRAT vignette (f). This is consistent with the assumption that the instrument measures a construct that is distinct from the other legal constructs employed in competence determinations.

Grisso and Appelbaum (e) reported significantly lower TRAT-2 scores for persons hospitalized for mental disorders (schizophrenia, or major depression) than for their community (non-mentally ill) comparison groups. Similarly, Frank et al. (b) found a significant negative relation between depressive symptom severity and TRAT scores.

Differences between patients with schizophrenia and their community comparison group (e) were apparent across most of the eight TRAT subscales, but apparent on only 4 of the subscales for patients with major depression and their community comparison group. TRAT-2 scores were significantly related to verbal cognitive functioning (measured with selected subtests from the Wechsler Adult Intelligence Scale-Revised), but were not significantly related to chronicity of disorder (age at first hospitalization, number of prior admissions).

critique. Two of the subscales apparently do not tap the same construct as the remainder of the TRAT subscales. One of these, Seeking Information, may simply measure inquisitiveness or degree of involvement, rather than rational processing of information. The other, Weighing Consequences, measures the stability of one's everyday preferences, a characteristic that may influence decision making but which is quite different in nature from the other cognitive functions measured by the TRAT. These two subtests may be worth administering for the additional information that they provide, but the authors' data suggest that they impair the internal consistency of the TRAT when they are allowed to contribute to a TRAT total score.

TRAT-2 scores were lower for diagnostic groups with disorders that would be expected to manifest poorer processing of information presented in the context of a treatment decision. But the fact that TRAT-2 scores were unrelated to symptom severity is not consistent with expectation, presuming that greater severity of symptoms would interfere with information processing. The study's measure of symptom severity (Brief Psychiatric Rating Scale), however, is based on many types of symptoms that are not cognitive in nature, and this could account for the nonsignificant correlation between symptom severity and TRAT-2 scores.

Predictive or Classificatory Utility

Studies thus far have not examined the degree to which TRAT scores correspond to external criteria for processing of information to make a decision or for clinical or judicial judgments about competence to consent to treatment.

critique. Tests of the relation of TRAT scores to courts' determinations of competence to consent to treatment would need to take into account that courts may find patients incompetent for many reasons other than deficiencies in their ability to process information rationally.

Potential for Expressing Person-Situation Congruency

Various disorders and treatment situations differ in the complexity of their options, risks, benefits, and the difficulty level of the concepts or treatments that must be understood. The treatment situations used in the TRAT vignettes were purposely standardized to meet the needs of a research study. Thus they provided no opportunity to examine patients' performance in relation to varying demands in terms of complexity of the treatment options.

References

(a) Appelbaum, P. S., & Grisso, T. (1995). The MacArthur Treatment Competence Study, I: Mental illness and competence to consent to treatment. Law and Human Behavior, 19, 105-126.

(b) Frank, L., Smyer, M., Grisso, T., & Appelbaum, P. S. (1999). Measurement of advance directive and medical treatment decision-making capacity of older adults. Journal of Mental Health and Aging, 5, 257-274.

(c) Goldfried, M., & D'Zurilla, T. (1969). A behavioral-analytic model for assessing competence. In C. D. Spielberger (Ed.), Current topics in clinical and community psychology (pp. 151-196). New York: Academic Press.

(d) Grisso, T., & Appelbaum, P. S. (1992). Manual for Thinking Rationally About Treatment. Worcester MA: University of Massachusetts Medical School.

(e) Grisso, T., & Appelbaum, P. S. (1995). The MacArthur Treatment Competence Study, III: Abilities of patients to consent to psychiatric and medical treatment. Law and Human Behavior, 19, 149-174.

(f) Grisso, T., Appelbaum, P. S., Mulvey, E., & Fletcher, K. (1995). The MacArthur Treatment Competence Study, II: Measures of abilities related to competence to consent to treatment. Law and Human Behavior, 19, 127-148.

(g) Hogarth, R. (1987). Judgement and choice: The psychology of decision. New York: John Wiley.

(h) Janis, I., & Mann, L. (1977). Decision making: A psychological analysis of conflict, choice, and communication. New York: Free Press.

(i) Spivack, G., Platt, J., & Shure, M. (1976). The problem solving approach to adjustment. San Francisco, CA: Jossey-Bass.

(j) Spivack, G., & Shure, M. (1974). Social adjustment of young children: A cognitive approach to solving real-life problems. San Francisco, CA: Jossey-Bass.


Source: Grisso, T. (2002). Evaluating Competencies: Forensic Assessments and Instruments (2nd ed.). Springer. 564 p.
