STANDARDS FOR FORENSIC ASSESSMENT INSTRUMENTS
Mental health professionals will be aware of many sources that provide and explain standards for psychological tests and measurements. Those standards should apply to the development and evaluation of FAIs (see Heilbrun, 2001), just as they do to other instruments for assessing human attributes.
The relation of FAIs to legal constructs and legal uses, however, suggests a number of special issues concerning the application of scientific standards to the development and evaluation of FAIs. Some of these issues will be raised in this section, in preparation for later reviews of FAIs, whereas possible solutions will be considered in later chapters.
The outline used in this section will be applied in the chapters in which each FAI is reviewed. Standards are discussed in the following order: (1) conceptual basis, (2) psychometric development, (3) construct validation, (4) predictive or classificatory utility, and (5) potential for expressing person-situation congruency.
Conceptual Basis: Defining Legally Relevant Functional Abilities
The Functional component of a legal competence construct (see Chapter 2) requires that a FAI should be an index of functional abilities. The functional abilities to be assessed should relate to the performance of a role in an environmental context specified by the legal competence construct. The Causal component suggests that the functional abilities assessed should be related to cognitive and behavioral constructs found in the basic theories and empirical findings of psychology and psychiatry. This is necessary if one expects to use existing theory and research to guide interpretations of deficits in legally relevant abilities.
These demands have important implications for the first steps in the development or evaluation of a FAI: concept definition and operational definition.
Concept Definition
This activity requires the determination of a set of functional ability concepts or dimensions (C, Figure 1) that is related logically to two, more general sets of constructs: the legal competence standard (A, Figure 1) and psychology's or psychiatry's basic theories and empirical knowledge about human attributes (B, Figure 1).
When developing a FAI, these functional ability concepts must be carefully determined, then thoughtfully and completely defined by verbal description. An instrument is not likely to manifest construct validity in later evaluations if the instrument is not based at the outset on careful identification of the attribute dimensions to be measured (Guion, 1983).
When seeking to formulate a set of well-defined dimensions of functional ability for a specific purpose, it is convenient to think in terms of a hypothetical domain of abilities. The parameters or boundaries of this domain are specified generally by the legal competence construct, often in two ways. First, the competence construct will refer to an environmental context, as explained in Chapter 2, which generally identifies a performance role for the individual within that context. Second, legislative wording of legal competence standards, as well as appellate court opinions and other legal writings, may provide phrases referring to global functions with which the law is concerned. These verbal formulations of the legal competence construct, found in the law itself, sketch the boundaries of functioning within which are located an as yet undefined set of legally relevant functional abilities.
One then strives for a solution to the question of the functional ability dimensions that lie within that domain. The content validity of an instrument begins here. Content validity refers to the adequacy with which a test has sampled the behaviors or concepts associated with a particular domain. If an instrument begins with certain concept dimensions that are not relevant for the identified domain, or a set of dimensions that does not represent adequate coverage of the domain, then test items that later are devised to represent these dimensions cannot have adequate content validity for the domain.
Content validity does not rest with the mere appearance of relevance. The dimensions derive their relevance also from the method or process used for attaining the set of dimensions to represent the domain in question.
Later chapters will review the methods employed by developers of FAIs for identifying functional ability concepts within legal competence domains. Generally, the methods and the task bear some resemblance to the efforts of industrial and organizational psychologists when doing a job analysis in preparation for the construction of job-related assessment instruments. Some jobs may be analyzed for their requisite tasks and functional abilities by systematically observing people performing the job or role. More complex jobs require the use of a theory about the job and its task requirements. Similarly, concept definition for FAIs may use empirical methods, a consideration of psychological or psychiatric theories about human behavior, and the less formal theories of experts who have been in a position to form impressions of the requirements of a particular environmental context. When the environmental context is one with which legal professionals have special knowledge (e.g., the performance of defendants in trial processes), the opinions of judges and lawyers may be sought concerning the relevant functional dimensions associated with the domain. Whatever the methods, the content validity of the resulting set of ability dimensions is judged by the quality of this process.
The related but distinct question of face validity takes on special significance in the selection of ability dimensions for use in legally relevant assessment instruments. In contrast to content validity, face validity refers to the appearance of relevance: that is, judgments concerning whether the final set of ability dimensions looks relevant to the purposes intended for the instrument.
With reference to FAIs, there can be no more important criterion for face validity than the opinions of judges and lawyers as prospective consumers of the results of one's anticipated assessment instrument. One should not progress beyond this point in test construction or selection without obtaining their judgments concerning the relevance of the ability dimensions for addressing the legal competence.
This may save the test developer from the frustration of proceeding through the tasks of item construction and validation, only to learn that the conceptual dimensions at the foundation of the instrument simply do not comport with judges' legal sense of the competence construct in question. It is possible that no amount of empirical validity will be able to overcome a judicial belief in the face invalidity of an instrument's dimensions in relation to the legal construct.
One must realize, however, that the set of functional ability concepts that one eventually selects will not constitute a conceptual definition of the legal competence construct itself. The FAIs' ability concepts are only a conceptual definition of the functional abilities that appear to be relevant for the legal competence construct. As explained in Chapter 2 (Interactive and Judgmental components), a legal competence construct may refer to far more than a person's functional abilities alone. It may require, for example, considerations of situational variables, economic circumstances, and moral values of society. FAIs can never define legal competence, either conceptually or empirically. Therefore, their construction does not begin with this goal, but rather with conceptual definition of those aspects of the legal competence construct that refer to human functioning: those things that people know, understand, believe, or can do.
Finally, when one is forming a set of functional ability concepts related to the legal competence, it is equally important to be able to describe the assumed or hypothetical relations between these concepts and psychology's basic theories, constructs, and data for understanding human attributes. (This is the relation between B and C in Figure 1.) Various types of legal competencies may require reference to constructs in theories of psychopathology, theories of personality, developmental theories, or theories of emotion, motivation and cognition. These conceptual links between one's legally relevant ability concepts and basic theories of human behavior will be very important at a later point, when dealing with issues of construct validation and the interpretation of the meaning of functional deficits (see Causal component, Chapter 2).
Operational Definition
Once the legally relevant functional abilities are defined as concepts, they must be translated into content items for the instrument. The format of the items might be questionnaire statements with true-false or Likert-type response formats, or some other format that is more appropriate for various purposes: for example, checklists, structured and semistructured interview items, or categories of behavior to observe and record as manifested by examinees in naturalistic or assigned-task situations. In addition, some method for categorizing, rating, or scoring responses must be devised. The sum total of the content of the items, the method of administration, the response format, and the criteria for scoring or rating responses constitutes the operational definition of the legally relevant functional abilities (C in Figure 1). These are the procedures with which the attributes of functional ability will be defined and measured.
Several issues require special comment at this phase of construction of a FAI. First, the same issues of content validity that were discussed in relation to concept definition are raised again in operational definition. Now, however, one is concerned with the process for arriving at a set of tasks or test items that will assure adequacy of sampling (in terms of content relevance and content coverage) of the domain of behavior within each functional ability to be assessed. This process is the second step by which content validity will be determined.
When we arrive at a set of tasks or items, we are faced again with the question of face validity as well. We may know by now that the functional ability concepts appear to be relevant to the legal competence construct, according to our assessment of the opinion of legal professionals. Yet is the specific content of the items that now operationally define the dimensions also perceived as legally relevant? Some procedure may be required to determine this before one proceeds further.
Second, the involvement of legal professionals during test construction becomes increasingly important if one intends to develop a system to score responses according to some absolute standard of correctness. For example, a test may instruct examiners to score examinees' responses as "adequate," "marginal," or "inadequate," and examinees' scores are then interpreted accordingly. When a FAI is constructed to incorporate judgments about adequacy of response into the scoring criteria, the assistance of legal authorities in the formation of those criteria is especially important. Chapter 2 identified questions of sufficiency of ability as being legal judgments—not empirical facts or psychological judgments—when data about the ability are applied to a legal definition of competence. Scoring criteria that define as "adequate" or "inadequate" an examinee's answer to a question or performance on a task at least imply a legal judgment when the test itself was developed to address aspects of a legal competence construct. The scoring criteria, therefore, should not be based on the opinions of the test developer alone.
A third issue is the similarity or dissimilarity between the test format and the environmental context to which the legal competence refers. For example, three tests assessing a defendant's ability to communicate with an attorney all might include an item with content that examines defendants' willingness to disclose information to their lawyer. Yet one instrument may pose the question in a true-false item on a paper-and-pencil task, the second in a hypothetical situation posed by a forensic examiner in a structured interview, and the third in a procedure in which the examiner observes the examinee interacting with the examinee's lawyer. The last of the three tests clearly has the appearance of sampling behavior that most closely approximates the environmental context (communicating with an attorney) to which the construct of competence to stand trial refers.
It is often difficult, however, to approximate closely the environmental context in an examination procedure. In the area of job-related employment testing, for example, often it is not possible to reconstruct in standardized test formats the specific conditions that will confront an examinee on the job. This has produced considerable difficulty in legal determinations of the validity or fairness of employment tests (Haney, 1982). Similarly, certain legally relevant environmental contexts cannot (and often should not) be closely approximated in formats for FAIs. For example, one investigation (Ferguson & Douglas, 1970) confronted school children with an unexpected, quasi-investigative interview intended to produce an ecologically valid examination of children's abilities to understand Miranda warnings prior to waiver of rights. Ethical considerations for psychological harm to the examinee, however, may far outweigh the anticipated value of such research or testing procedures. Thus there is no simple answer to the question of the desirable or acceptable degree of similarity between test format and the legally relevant environmental context. One merely must be aware that this may be of concern, when examiners eventually are required to interpret the degree to which the results of FAIs can be generalized to real-life environmental contexts.
Finally, when constructing ability, attitude, and personality-trait instruments, psychometricians are accustomed to employing various statistical methods for producing homogeneity of items within a scale. The objective is to attain unidimensionality, or some assurance that the items refer or contribute to a single dimension or concept (the one that guided the process of content selection for the scale).
Psychometricians who attempt to develop instruments related to legal constructs might face conflict between this general practice and the demands for content relevance and coverage in relation to legal questions. For example, certain items may correlate poorly with other items in an ability scale of which they are a part, rendering them as suspect in one's search for a homogeneous set of items. Automatic exclusion of these items, however, may have the effect of decreasing both content and face validity, given that the items originally were included as a result of a careful job of content selection.
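The conflict just described can be made concrete with a small sketch. The Python below computes corrected item-total correlations (each item against the sum of the remaining items) and only flags weakly correlated items for review by content experts, rather than deleting them automatically. The function names, the 0.20 threshold, and the response data are hypothetical, chosen solely for illustration; they are not drawn from any published FAI.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either list has zero variance


def corrected_item_total(responses):
    """responses: one row per examinee, one column per item."""
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total without item j
        correlations.append(pearson(item, rest))
    return correlations


def flag_for_review(responses, threshold=0.20):
    """Return indexes of items to send back to content experts (assumed cutoff)."""
    return [j for j, r in enumerate(corrected_item_total(responses)) if r < threshold]


# Hypothetical responses: 6 examinees x 4 items, scored 0/1.
hypothetical_responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]

flagged = flag_for_review(hypothetical_responses)
print("Items flagged for expert review, not automatic removal:", flagged)
```

In this sketch the last item is flagged. Whether it stays in the scale would then depend on a judgment about content relevance and coverage, not on the statistic alone.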
Psychometric Development
Within the category of psychometric development are standards related to instrument standardization, reliability, and norms.
Standardization
The materials, administration, and scoring or rating procedures should be established and carefully described for examiners. The issue here, of course, is replicability. Attempts to establish the reliability or validity of a poorly standardized instrument are likely to be fruitless, due to error in measurement allowed by the vagueness of the test procedures themselves. Careful standardization, in turn, maximizes later gains in documenting reliability and validity. In addition, an examiner who can clearly describe to a court the standard procedure with which assessment data were obtained is likely to engender greater understanding on the part of judge and jury and a more firm base for credibility.
Standardization is not synonymous with the elimination of discretion in administration or scoring. Some abilities and attributes are assessed better with methods that are more flexible than questionnaire items and dichotomous response choices. Some behaviors cannot be scored by summing Likert-type responses; instead, examiners sometimes must be asked to rate the behaviors they observe. The objective of standardization is not to reduce all assessment to mechanical procedures, but rather to minimize bias and situational or examiner error to the degree that this is possible with the instrument's administration and quantitative format. Note, however, that this goal requires greater care and effort when defining procedures for subjective rating than when an instrument can be mechanically scored.
Reliability
There are many ways to examine the reliability of instruments, each offering an estimate of the error in measurement that derives from some source. Some forms of reliability estimate examiner error (due to variations in administration, rating, or scoring), whereas others estimate error in terms of changes in responses over time. Still others examine the relationship of items within a scale to each other, providing an index of internal consistency of the scale.
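As an illustration of the last of these forms, one widely used index of internal consistency is coefficient alpha (Cronbach's alpha). The text does not name a particular statistic, so its use here is an assumption, and the item scores below are hypothetical. This is a minimal sketch rather than a recommended analysis.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for rows of item scores (one row per examinee).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two examinees")
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: 4 examinees x 3 items on a 1-5 rating format.
hypothetical_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]

alpha = cronbach_alpha(hypothetical_scores)
print(f"coefficient alpha = {alpha:.3f}")
```

A sample this small is useful only for showing the arithmetic. An estimate meant to support use of an instrument would need far more examinees.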
Not all FAIs will require high coefficients (demonstrations of low error variance) for all possible types of reliability. This is because low reliability coefficients need not be interpreted as error in all circumstances. For example, low test-retest reliability need not be interpreted as a sign of error, if the instrument claims to measure attributes that are presumed in theory to be unstable for any given person across time (e.g., transient emotional states).
Error due to changes in examinees' responses over time takes on special significance in forensic assessments. Most legally relevant functional abilities probably will be conceptualized as relatively stable attributes (though modifiable with therapeutic or educational intervention). Typically, then, one might want FAIs to demonstrate acceptable coefficients of stability. Yet forensic assessments often occur at times and in places that subject examinees to unusual stress or affective arousal. In addition, the anticipated consequences of legal decisions may motivate some examinees to perform worse (malingering), or to exhibit more socially desirable responding (dissimulation), than is representative of their typical performance or attitudes.
All of these sources of error produce special challenges for evaluating the reliability of FAIs. Further, the above examples point up the importance of examining the reliability of a FAI by obtaining test samples in the types of settings, and with the types of populations, for which the instrument eventually will be employed. Reliability coefficients based on administration of a FAI to college sophomores cannot be trusted to provide estimates of measurement error that are meaningful for use of the instrument in pretrial examinations of criminal defendants or parents seeking custody of a child.
All FAIs should demonstrate acceptably low levels of error variance associated with examiner administration and scoring. Later chapters will refer to "inter-examiner reliability" when evidence addresses error due to different administrators, and "inter-scorer or inter-rater reliability" when error is assessed as a function of different scorers or raters of test samples. These are two distinctly different sources of examiner error, although test developers often do not separate the two in their calculations of examiner-related reliability.
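Inter-rater agreement on categorical scores can be sketched as follows. The example uses Cohen's kappa, a common chance-corrected agreement index. The choice of statistic is an assumption rather than something prescribed in this chapter, and the ratings are hypothetical. Note that the sketch addresses only inter-rater error, in which two raters score the same test samples. Inter-examiner error would require separate administrations.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on the same test samples."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of samples")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance, from each rater's marginal category rates.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Undefined (division by zero) if both raters used one identical category.
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight responses by two independent raters.
rater_a = ["adequate", "adequate", "marginal", "inadequate",
           "adequate", "marginal", "inadequate", "adequate"]
rater_b = ["adequate", "marginal", "marginal", "inadequate",
           "adequate", "marginal", "adequate", "adequate"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```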
Norms
The special purposes of FAIs do not call for any single standard concerning the development of normative data. Different purposes will call for different standards, and some of these require explanation.
First, some FAIs might be constructed to describe examinees' abilities in an absolute sense. That is, the purpose may be merely to describe that which the examinee can and cannot do, rather than to express the level of performance in relation to other persons. Instruments of this type, of course, are referred to as "content-referenced" or "criterion-referenced," rather than "norm-referenced" instruments. They do not require the development of sample distributions of scores with which to compare the performance of an examinee. We will return in a moment to discuss certain values of normative data in forensic situations. The matter of content- and criterion-referenced instruments for forensic assessments, however, requires special comment.
When an instrument is used in a content-referenced manner, an examinee's score is expressed merely as some proportion of the continuum of possible scores on the instrument. For example, on an instrument assessing a defendant's knowledge of trial proceedings, one might report that the defendant correctly answered 75% of the items related to functions of trial participants and 45% of the items dealing with trial procedures. One could also, of course, report the specific things that the defendant seemed not to know, that is, the content of items incorrectly answered. This type of evaluation does not rely on comparison to any other external criteria. A court receiving this information would be left to consider the overall adequacy of the defendant's understanding according to any discretionary standard that it wished to apply. This type of data might be quite appropriate for many legal situations. Indeed, it is consistent with the perspective discussed in Chapter 2 (see Judgmental component): that is, description, rather than evaluative statements about the sufficiency of examinees' abilities, may represent the most appropriate or important function for examiners in legal proceedings.
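The content-referenced report described above can be sketched in a few lines. The code below reproduces the 75% and 45% figures from the example and lists the items missed, without comparing the result to any external criterion or norm group. The item identifiers, area labels, and function name are hypothetical.

```python
def content_referenced_report(item_results):
    """Summarize (area, item_id, correct) records as percent correct per area."""
    by_area = {}
    for area, item_id, correct in item_results:
        entry = by_area.setdefault(area, {"n_items": 0, "n_correct": 0, "missed": []})
        entry["n_items"] += 1
        if correct:
            entry["n_correct"] += 1
        else:
            entry["missed"].append(item_id)
    for entry in by_area.values():
        entry["percent_correct"] = 100 * entry["n_correct"] / entry["n_items"]
    return by_area


# Hypothetical item-level results for one defendant.
hypothetical_results = [
    ("functions of trial participants", "F1", True),
    ("functions of trial participants", "F2", True),
    ("functions of trial participants", "F3", True),
    ("functions of trial participants", "F4", False),
]
# Twenty hypothetical procedure items, of which the first nine are correct.
hypothetical_results += [("trial procedures", f"P{i}", i <= 9) for i in range(1, 21)]

report = content_referenced_report(hypothetical_results)
for area, entry in report.items():
    print(f"{area}: {entry['percent_correct']:.0f}% correct; missed {entry['missed']}")
```

The output is purely descriptive. Any judgment about whether 45% is adequate is left to the court, consistent with the Judgmental component.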
In contrast, instruments that are called criterion-referenced employ an external criterion with which to evaluate the quality of an examinee's score. Often this external criterion is some index of the degree of performance necessary to satisfy requirements in a situation external to the test. For example, examinees taking an exam for state licensing as a psychologist may have to obtain a particular threshold score in order to qualify, or an employer may require that applicants for a clerical position must type or take shorthand notes at a specified rate of words per minute. The cutoff score generally is set by a standard-setting group that makes a discretionary judgment that will apply across examinees. This approach is basically non-normative; the criterion score, not examinees' performances in relation to each other, determines the evaluation.
In one sense, criterion-referenced interpretations of FAIs are in conflict with requirements for forensic assessments outlined in Chapter 2. The Interactive component of legal competence constructs suggests that degrees of examinees' functional abilities are not to be viewed as sufficient or insufficient in and of themselves. Instead, sufficiency depends on a comparison of the person's degree of ability to the performance demands of their specific environmental situation (e.g., the demands of their upcoming trial, the needs of a particular child). Law does not instruct judges to consider any particular level of ability as indicative of competence or incompetence across cases. From this perspective, neither the mental health professional nor the judiciary is authorized to set cutoff scores as determinants of examinees' legal competencies.
On the other hand, one can argue that both mental health professionals and judges are free to determine cutoff scores for other purposes. For example, they might set a cutoff score on an instrument intended merely to screen defendants for those who are in need of more extensive evaluation for competence to stand trial. The cutoff score might be set conservatively, so that it screens out only defendants who are most clearly competent.
The danger in cutoff scores for FAIs, therefore, lies in their use. A test developer is not violating professional standards by publishing a cutoff score, if the developer satisfies the requirements to make explicit the acceptable and unacceptable uses of such criteria. On the other hand, test developers cannot control the use of their instruments, and the publication of cutoff scores inevitably will lead some examiners and legal professionals to apply them as definitions of a competence decision, rather than as one type of data among many for making a decision. For purposes of evaluating FAIs, therefore, one must consider whether the test developer may be endorsing or encouraging (without intent) the misuse of a FAI by adopting the criterion-referenced approach described above.
Turning now to norm-referenced considerations, we note that various legal questions may be addressed with a comparison of the examinee's abilities to those of normative groups. One major question will be the choice of a sample on which to develop test norms. Test developers and users typically are aware of the hazards of comparing examinees' performances to norms based on samples that differ markedly in their sociodemographic or other characteristics from those of the examinee.
This matter is not settled, however, by a simple admonition to avoid such a comparison, because the legal decision process sometimes might require it. For example, a court might wish to know the degree to which a juvenile's understanding of certain rights (for purposes of evaluating validity of waiver of rights) is different from that of adults. Thus an adult sample may provide the appropriate normative data for evaluating a juvenile's ability. In another instance, a court might recognize that practically no defendant will know everything about the roles and functions of participants in a trial. It may be helpful to obtain a perspective on a particular defendant's knowledge of these matters in relation to that of the "average person." Sometimes this "average person" will be represented better by random samples of the general population than by sampling from within populations with defendants' typical sociodemographic characteristics.
The general standard, then, is to select samples for development of FAI norms with careful attention to the intended use of the norms in legal settings. These might vary considerably from one area of legal competence to another.
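The effect of the choice of normative sample can be shown with a brief sketch. The same raw score is converted to a percentile rank against two different hypothetical norm groups, echoing the juvenile and adult example above. The scores, sample sizes, and the mid-rank treatment of ties are illustrative assumptions only.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm sample scoring below `score`, counting ties as half."""
    if not norm_scores:
        raise ValueError("normative sample is empty")
    below = sum(s < score for s in norm_scores)
    equal = sum(s == score for s in norm_scores)
    return 100 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical normative samples on a rights-comprehension measure.
adult_norms = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20]
juvenile_norms = [6, 8, 9, 10, 10, 11, 12, 13, 14, 15]

juvenile_examinee_score = 12
rank_vs_adults = percentile_rank(juvenile_examinee_score, adult_norms)
rank_vs_juveniles = percentile_rank(juvenile_examinee_score, juvenile_norms)

print(f"relative to adults:    percentile rank {rank_vs_adults:.0f}")
print(f"relative to juveniles: percentile rank {rank_vs_juveniles:.0f}")
```

The same performance looks very different depending on the comparison group, which is why the legal question, not convenience, should determine which sample is used.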
Construct Validation
Historically, psychometricians attempted to identify several types of validity or validation procedures, making distinctions between content validity, concurrent validity, predictive validity, and construct validity. Each of these validities is somewhat different in its meaning and implication for the purposes for which tests may be used. Among them, however, the notion of construct validity is primary. As noted by Messick, "Construct validity is... the unifying concept of validity that integrates criterion and content considerations into a common framework for testing rational hypotheses about theoretically relevant relationships" (Messick, 1980, p. 1015). Thus the validity of FAIs will be reviewed in this book without making fine, categorical discriminations between the traditional types of validity, including them generally under the heading of construct validation.
One argument against this choice might be the importance generally attached to predictive validity. This term refers to the degree to which an instrument is successful in predicting some criterion event, examinee behavior, or examinee standing on some other variable, given a time interval between examination and the future criterion index. A special interest in the predictive validity of FAIs is understandable, because legal competence constructs often refer to future consequences (e.g., subsequent trials, the future rearing of a child).
Therefore, the outline for evaluating FAIs in later chapters includes a special category entitled "Predictive or Classificatory Utility," which reviews studies that demonstrate what is generally referred to as predictive validity. Those studies, of course, will contribute also to our overall assessment of the construct validity of an instrument. The term utility was chosen, however, purposely to draw attention away from the tendency to perceive predictive validity as the sine qua non for instrument validity. The reasons that predictive data should not play such a strong role in overall evaluation of FAIs will be made clear in the discussion of utility, to which we will turn in a moment.
Construct validation refers to an accumulation of evidence concerning the degree of confidence with which a FAI can be interpreted as an index of the functional ability concepts that it claims to define operationally. Construct validity is not an absolute condition. Contrary to some test authors' claims, no instrument is ever simply "valid." At any given time, the evidence accumulated merely increases or decreases our confidence in the FAI as an operational definition of its functional ability concepts.
Many types of evidence can contribute to a FAI's construct validity (e.g., factor or item content analysis, and various methods for examining concurrent and predictive validity). One of the most important types of evidence, however, is the relation between the FAI as a measure of an ability construct and indexes of other psychological constructs that are expected on theoretical grounds to be related to the ability construct.
Let us refer back to Figure 1 to clarify this point. We noted that functional ability concepts for a FAI (C) are selected with close attention to the domain of a legal competence construct (A). In addition, though, the ability concepts should be defined with some conceptualization of their assumed relation to theories, constructs, and empirical findings in psychology and psychiatry (B) that are used to describe and understand human behavior generally. If this process has been carried out conscientiously, then we are prepared for the construct validation process noted earlier. That is, we can examine whether our FAI index of the functional ability (C') relates (dotted line) to indexes of the theoretical constructs in psychology (B') in the hypothesized manner.
To provide an example, imagine a FAI that operationally defines "comprehension of Miranda rights" by obtaining and scoring examinees' explanations of the meaning of the Miranda warnings (Grisso, 1981). We might expect that general intelligence or cognitive developmental maturity contributes to Miranda comprehension, but that the ability to understand these specific message contents may be influenced by other variables as well: for example, amount of past exposure to the warnings. Thus Miranda comprehension as an ability concept is perceived as related to the general cognitive constructs, but not necessarily overlapping them completely. All of these assumptions may be tested by comparing FAI scores to scores on measures of general intelligence, cognitive developmental maturity, some definition of "amount of prior exposure" (e.g., number of prior arrests), and so forth. The results contribute to construct validity—a sense of the meaning of Miranda comprehension as measured by the FAI—if the pattern of results emerges as expected.
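A minimal sketch of such a test of assumptions follows. It correlates hypothetical FAI scores with a measure of general intelligence and with number of prior arrests, then checks whether the pattern matches the hypothesized one: both relations positive, neither perfect. All data and variable names are invented for illustration and do not reproduce any findings.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for six examinees.
fai_scores = [10, 14, 18, 22, 26, 30]        # Miranda comprehension (FAI)
iq_scores = [80, 85, 95, 100, 105, 115]      # general intelligence measure
prior_arrests = [0, 1, 0, 2, 1, 3]           # proxy for prior exposure

r_iq = pearson(fai_scores, iq_scores)
r_arrests = pearson(fai_scores, prior_arrests)

# Hypothesized pattern: related to both constructs, identical to neither.
pattern_as_expected = 0 < r_arrests < 1 and 0 < r_iq < 1

print(f"r(FAI, IQ) = {r_iq:.2f}, r(FAI, prior arrests) = {r_arrests:.2f}")
print("pattern consistent with hypotheses:", pattern_as_expected)
```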
In addition to providing general support for the instrument, results of this type become especially useful in relation to the causal characteristic of a legal competence construct (see Chapter 2). Construct validation research provides the logic with which an examiner may support an explanation for an examinee's functional deficits as measured by a FAI, or for considering the plausibility of various possible explanations. Thus when causal information and reasoning are requested by the court, the examiner may be able to use theory in an empirically informed, less speculative manner.
Predictive or Classificatory Utility
The category of predictive or classificatory utility considers the utility of a FAI for identifying persons who, at a later time, engage in a particular behavior or are classified by other means as having a particular psychiatric or legal status. For example, a FAI may be examined for its ability to predict a later manifestation of the functional ability, behavior, or attitude that it claims to assess, as when an examinee who manifests certain deficiencies in child-rearing abilities on a FAI later manifests deficiencies in actual child-rearing practice. Other examples would include the relation between a FAI for abilities related to competence to stand trial, and later performance of the defendant in the courtroom, or later judicial decisions about legal competence or incompetence of defendants. Special types of analyses (e.g., Receiver Operating Characteristics, or ROC, analysis) are available to demonstrate the degree to which a FAI performs better than chance in classifying examinees according to the criterion behaviors that one intends to predict.
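The ROC approach mentioned above can be illustrated with a short sketch. It computes the area under the ROC curve (AUC) directly as the proportion of criterion-positive and criterion-negative pairs in which the positive case has the higher FAI score, with ties counted as half. An AUC of 0.5 corresponds to chance. The deficit scores and later outcomes are hypothetical.

```python
def roc_auc(scores, outcomes):
    """AUC by pairwise comparison of criterion-positive and criterion-negative cases."""
    positives = [s for s, o in zip(scores, outcomes) if o]
    negatives = [s for s, o in zip(scores, outcomes) if not o]
    if not positives or not negatives:
        raise ValueError("need at least one case in each criterion group")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical FAI deficit scores and whether the criterion behavior later occurred.
deficit_scores = [8, 7, 6, 4, 5, 3, 2, 1]
criterion_occurred = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(deficit_scores, criterion_occurred)
print(f"AUC = {auc:.4f} (0.5 would indicate chance-level classification)")
```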
When a FAI can be related empirically to future events, this contributes to its construct validity and to its possible utility for assisting the legal system in anticipating future consequences. Further, from both a scientific and public point of view, there is probably no other empirical evidence of an instrument's integrity that is more impressive than its demonstrated ability to predict the future accurately.
Nevertheless, FAIs should not be required to stand or fall on the basis of their predictive utility. In fact, the following discussion will argue that in the case of FAIs, predictive utility:
• may not be possible to test
• when it can be tested, is not sufficient by itself to justify predictive uses
• is not essential in relation to legal definitions or scientific standards
• is not a rational expectancy in light of current knowledge concerning the determinants of behavior; and
• is not appropriate, when the objective is the prediction of a legal (judicial) decision.
First, various circumstances pertaining to legal procedures and criteria suggest that we may not be able to test the predictive utility of some FAIs. A few examples will be offered, and others will arise in subsequent chapters.
For example, one would hope that functional deficits on a FAI that assesses abilities for competence to stand trial would be related to defendants' actual performances in subsequent trials. Yet if defendants manifest serious deficits in the course of a forensic examination, they are not likely even to reach trial until the court has evidence that the deficits have been remediated. As a consequence, researchers may never have the opportunity to test the relation between deficits measured by a FAI and behaviors observed in actual trial situations.
For some FAIs, no future criterion can be used in validity studies because the appropriate criterion event is in the past. For example, a FAI that measures capacities to understand Miranda warnings is intended to assist courts in judging whether suspects were able to understand their rights as described to them when they were arrested by police and prepared for interrogation. In order to test the utility of the FAI as a postdictive indicator of understanding, the researcher would need to compare FAI performance to measures of understanding manifested by suspects earlier, at the time of arrest. This would be very difficult in light of the uncontrollable circumstances and variations in the arrest and interrogation process. In summary, the real world might not allow one to test the predictive or postdictive validity of FAIs.
Second, the predictive power of a FAI is not sufficient by itself to justify or support its use as a predictive tool for legal or scientific purposes. Measurement theorists generally are in agreement that an empirical relationship between a measure and a future event does not justify the instrument's relevance or use if there is no underlying rationale for the relationship (e.g., Messick, 1975). This may be true especially in legal circumstances, where evidence must pass a legal test of probative value with regard to the question at hand. For example, imagine that one found that defendants' competence to stand trial could be predicted with high accuracy by adding their height to the number of state capitals they could name. No matter how powerful the prediction, it would not pass legal scrutiny concerning a rational or reasonable relationship between the scores and the issue at hand. In scientific terms, the measure would lack the support of underlying constructs that provide a cogent rationale for the relation of the index to the criterion.
Third, FAIs do not necessarily need evidence of predictive utility in order to be used to assist legal decision making. Construct validity itself can justify the use of an index for contributing to decisions when it is not possible to do critical predictive validity studies. One must simply avoid using or referring to the instrument as though it is known to account for some substantial proportion of variance in future performance, or as though it can provide actual probability estimates of future behaviors.
Fourth, current theories of the determinants of specific behaviors would not lead one to expect that any measure of personal attributes alone will produce accurate behavioral predictions. Considerable research has shown that a given individual does not behave consistently across all environmental and interpersonal situations. Instead, situations themselves elicit, modify, or inhibit the influences of personal attributes on behavior (Mischel, 1984; Monahan, 1981). Thus we should not require a FAI that measures personal abilities alone to have a high degree of predictive power; at best it might make some contribution to prediction when used with situational variables.
Similarly, a FAI might be unable to predict specific outcomes accurately, yet it might identify people who are at greater risk for certain future outcomes. For example, if a FAI is correct in predicting future child abuse in only 20 out of every 100 cases having a high score on the instrument, this may still be useful information for certain purposes if the base rate of child abuse in the catchment population is much lower than 20%. The FAI would be a very poor predictor of child abuse; yet it could serve an alerting function for certain purposes.
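The arithmetic behind this alerting function can be made explicit. The sketch below uses the figure from the example (20 correct predictions per 100 high scorers) together with an assumed base rate of 5%, which is purely hypothetical and chosen only to represent a rate "much lower than 20%." The function name `risk_lift` is likewise an illustration.

```python
def risk_lift(true_positives, flagged, base_rate):
    """Return (positive predictive value, lift over the base rate).

    true_positives -- high scorers who later show the outcome
    flagged        -- all examinees with a high score
    base_rate      -- proportion of the whole population showing the outcome
    """
    if flagged <= 0 or not 0 <= true_positives <= flagged:
        raise ValueError("true_positives must lie between 0 and flagged")
    if not 0 < base_rate <= 1:
        raise ValueError("base_rate must be in the interval (0, 1]")
    ppv = true_positives / flagged
    return ppv, ppv / base_rate


# 20 of 100 high scorers show the outcome; assumed population base rate of 5%.
ppv, lift = risk_lift(true_positives=20, flagged=100, base_rate=0.05)
false_alarms = 100 - 20

print(f"PPV = {ppv:.2f}, lift = {lift:.1f}x, false alarms per 100 = {false_alarms}")
```

Under these assumptions a high score would be wrong four times out of five as a prediction, yet it would mark a group whose rate of the outcome is several times that of the population at large. Both facts would need to be reported if the score were used for alerting purposes.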
Finally, it is inappropriate to judge the predictive utility of a FAI on the basis of its ability or inability to predict judicial decisions about legal competence or incompetence. FAIs are not (or should not be) constructed for the purpose of making such predictions. Chapter 2 discussed legal competence decisions as depending on a consideration of person-environment interactions and incongruencies (interactive characteristic), and moral senses of justice (judgmental characteristic). In contrast, FAIs seek only to define and measure functional abilities that are relevant for these legal decisions. Judicial decisions to which the FAI is compared may have taken into account far more variables than the FAI was intended to measure. It is even possible that legal decision makers who form the predictive criteria in a validation study might have failed to take into account the very abilities that the legal standard requires and that the FAI claims to assess. Thus FAIs should not necessarily be evaluated negatively when they cannot mimic judicial decisions.
Potential for Expressing Person-Situation Congruency
The person-situation congruency standard for evaluating FAIs is the only one that does not generally appear in standards applied to most psychological instruments. It is offered here not as an essential standard, but as a quality that may enhance the value of a FAI. The standard is related to the Interactive component of legal competence constructs. As described in Chapter 2, legal decisions about competencies depend in part on incongruency between an examinee's functional ability and the degree of demand for that ability in a specific environmental context faced by the examinee. For example, courts may consider not only a defendant's degree of ability to inform a lawyer about matters related to a defense, but also the degree to which the anticipated trial may require this ability. Similarly, a court may weigh the importance of a parent's pattern of child-rearing abilities or deficiencies against the caretaking needs of a specific child.
These comparisons of ability to situational demand suggest the need for assessment methods to describe both sets of information. Further, as noted in Chapter 2, especially desirable would be instruments that assess both an examinee and a specific situation using parallel sets of concepts or dimensions: for example, assessing a parent's ability to provide structure and a child's degree of need for structure.
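One way to picture parallel description of person and situation is to rate both on the same dimensions and examine the difference on each. The sketch below is hypothetical throughout: the dimension names (other than structure, taken from the example above), the 1 to 5 rating scale, and the function `congruency_gaps` are assumptions made for illustration and do not describe any existing instrument.

```python
def congruency_gaps(ability, demand):
    """Ability minus demand on each shared dimension.

    A negative value means the examinee's ability falls short of what the
    specific situation requires on that dimension.
    """
    if ability.keys() != demand.keys():
        raise ValueError("ability and demand must use the same dimensions")
    return {dim: ability[dim] - demand[dim] for dim in ability}


# Hypothetical ratings on a 1-5 scale, using parallel dimensions.
parent_ability = {"structure": 2, "supervision": 4, "emotional_support": 3}
child_need = {"structure": 4, "supervision": 3, "emotional_support": 3}

gaps = congruency_gaps(parent_ability, child_need)
shortfalls = [dim for dim, gap in gaps.items() if gap < 0]

print(gaps)
print("Dimensions where ability is below demand:", shortfalls)
```

A simple difference is only one possible way of expressing incongruency, and it presumes that the two sets of ratings are on comparable scales. How much incongruency matters remains a judgment for the legal decision maker.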
Subsequent chapters will provide examples of parallel, person-environment assessment instruments that can address certain legal questions, although very few instruments currently offer this opportunity. When evaluating most FAIs, therefore, we will consider their potential to be translated into parallel dimensions for describing environmental contexts. A FAI that offers this potential has a better chance, given further development, to assist examiners in addressing the interactive questions in legal competence constructs.
Orientation to the Instrument Reviews
The five categories of standards that we have just discussed will be used in the subsequent reviews of FAIs (Chapters 4 through 9) pertaining to six legal competencies. Before proceeding to those reviews, it may be helpful to describe the manner in which instruments were selected for review, as well as the standardized outline that is used across reviews.
Discovery and Selection of the Instruments
For the first edition, a national mailed survey of forensic mental health professionals was used to discover instruments for review, because in the 1980s there were few published sources of information about forensic assessment instruments. In contrast, current books, journal articles, and internet search options provided ample resources for identifying instruments related to the six legal competencies of interest. Thus the national survey was not replicated in preparation for the second edition.
Certain general criteria were employed to select instruments for review:
• The instrument was developed specifically to address a forensic question of legal competence in one of the six areas.
• The instrument offered a method for expressing results in quantitative form.
• A published manual was available for the instrument, and its development was described in at least one journal article, book, or monograph.
In general, the choice of instruments to be included in the second edition was more selective than in the first edition. Far fewer instruments were available in the 1980s, and often instruments were included in the review even when they had not been developed for forensic purposes (and their use in forensic cases was unknown). In the second edition, the greater number of instruments available made it necessary to focus primarily on instruments that were developed specifically for forensic use.
The selection of instruments also required somewhat different considerations in each of the assessment areas addressed in the book, resulting in the exclusion of certain instruments. These special selection criteria, and some of the instruments excluded from review, are noted in each of the review chapters.
Selection of instruments for this review should not be considered an endorsement of their value in forensic assessments. The purpose was to review how FAIs are developed, and to review current evidence for their utility and directions for further refinement. In most instances, the reviews provide no summary judgment concerning the overall quality of the various instruments. Judgment frequently cannot be passed on the basis of the characteristics of an instrument alone. Its acceptability will depend also on the specific situation and purpose for which its use is being considered. An instrument may be acceptable for some purposes and not for others. Clinicians and legal professionals themselves must make those judgments, weighing the qualities of the instruments as described here against the demands of specific circumstances that arise in their practice.
Purposes and Outline of the Review Chapters
The objective of each of the next six chapters (Chapters 4-9) is to review forensic assessment instruments:
• to test the value of the conceptual model of legal competence constructs (Chapter 2) as a tool for structuring forensic assessments and examining FAIs
• to review existing FAIs as case studies of test development and application, in order to identify issues and potential solutions to problems in the development of specialized FAIs, and
• to evaluate the utility and limitations of FAIs in their current state, for application by mental health professionals in forensic examinations.
Each of the six chapters achieves these purposes according to an identical outline with three major sections: The Competence Question; Evaluation of the Forensic Assessment Instruments; and Current Status of the Field.
The Competence Question
The first section of each chapter provides a description of legal and assessment issues associated with the legal competence with which that chapter is concerned. It has two subsections:
• Law and Current Practice: This subsection identifies the history, intent, and statutory definition of the legal competence ("Legal Standard"); legal and empirical information on the process for arriving at competence decisions ("Legal Process"); and current assessment practices by mental health professionals, as well as commentary and recommendations for assessment practice that exist in the literature ("Competence Assessment: Current Practice").
• From Legal Standard to Forensic Assessment: This subsection applies the five components of legal competencies (as defined in Chapter 2) in an analysis of the legal competence in question. This analysis is used to describe that which is required of forensic assessments to increase their legal relevance in future practice.
Review of Forensic Assessment Instruments
A major section on forensic assessment instruments reviews each of the selected FAIs separately. The review of each FAI follows the outline for evaluating FAIs described earlier in this chapter:
• Basic Description and Objectives
• Conceptual Basis (conceptual and operational definition of the legally relevant functional abilities)
• Psychometric Development (standardization, reliability, norms)
• Construct Validation
• Predictive or Classificatory Utility, and
• Potential for Expressing Person-Situation Congruency.
Current Status of the Field
A final section in each chapter provides a synthesis and discussion that uses the FAIs in the previous review to identify critical issues for FAI development and use in the legal competence area in question. It is divided into two subsections:
• Research Directions: Focus is on issues in the research and development of FAIs. The subsection is organized according to the five characteristics of a legal competency construct (Chapter 2).
• Clinical Applications: This subsection summarizes the uses and limitations of the FAIs, in their current state, when employed by forensic examiners. Emphasis is on guidelines for the FAIs collectively within the legal competence area in question, although special suggestions for certain individual instruments are noted as well. The subsection is organized according to four general objectives of assessments: Description of an examinee; Explanation for abilities and deficits; Prediction of an examinee's behavior; and Examiner Conclusions concerning the implications of the assessment results for the questions facing the legal decision maker.
Finally, Chapter 10 represents a synthesis across the Current Status sections of the six review chapters. It uses the material in the Research Directions and Clinical Applications subsections in each of the previous chapters to achieve the broadest level of generalization concerning recommendations for the future development and use of FAIs.
Grisso, T. Evaluating Competencies: Forensic Assessments and Instruments. 2nd edition. Springer, 2002. 564 p.