Concepts in Psychological Assessment
by Laljit Sidhu
Introduction
Psychological assessment is a broad sub-specialty within clinical psychology. In
order to understand the nature of assessment, a few terms and concepts must
be understood. This page serves to provide a basic review of these concepts.
Types of Evaluators
Psychological assessment has long been at the core of applied psychology, and historically applied psychologists often
did little more than assessment. However, with increased specialization, the pool of those who provide assessment services has
become limited. Today, most states and provinces have governing boards which limit the provision of
psychological assessment services to licensed practitioners, and the breadth of services they can provide
is limited by the level of training. Generally, academic psychologists (i.e., those who are not licensed clinicians) cannot provide assessment
services outside of purely research purposes. Individuals often sub-specialize even within assessment:
Clinical Psychologists are those individuals who provide diagnostic and treatment-oriented evaluations,
often at the request of other therapists or patients. They often provide the broadest range of assessment
services and function in various settings, ranging from private practice offices to major medical hospitals.
Forensic Psychologists are more limited in the types of evaluations performed and settings in which
services are rendered. Often called upon by courts and/or attorneys, forensic psychologists evaluate
individuals for very specific purposes, such as competency to stand trial, mental status at time of offense
(for insanity pleas) in criminal cases, custody evaluations for family courts, and disability evaluations for
administrative hearings to name just a few. Due to the complex nature of forensic evaluations, these
psychologists often receive advanced training in forensic evaluation.
School Psychologists vary in the degree of training and the nature of the evaluations performed. Some
have doctorate level degrees (PhD or EdD) and provide a broad range of services equivalent to those of
clinical psychologists, whereas others are trained at the master's level and are limited to the provision of
educational and intellectual assessment. Limitations in breadth are offset by the depth of
training received in their area of expertise. As the name implies, these psychologists often function
in public and private school settings.
Industrial/Organizational Psychologists provide services to industry, business, and various public and
private organizations. With regard to assessment, these services often focus on pre-employment
screenings and job-performance evaluations. These psychologists are generally trained at the doctorate level, and the types of
evaluations they perform are limited only by the nature of the setting.
PhDs and PsyDs
Traditionally, doctorate level psychologists have been awarded the PhD (Doctor of Philosophy), reflecting the
academic nature of the profession. However, with increasing numbers of psychologists pursuing applied degrees,
institutions began awarding the PsyD degree (Doctor of Psychology). The major difference between the two
degrees relates to the type and level of training. PhD programs place a greater emphasis on research, whereas
PsyD programs often focus on application. Depending upon the training institution, PsyD candidates may also complete
their programs earlier than PhD candidates because of less rigorous publication and research demands. As to which
degree is better suited to providing psychological services, there is currently little evidence to support a claim of
superiority for either, and discussions of this issue often result in little more than insulting diatribes and self-serving
criticisms of each other. For practical purposes, both degrees are recognized throughout the world and
psychologists with either degree can be licensed and have identical professional privileges.
Types of Evaluations
Psychological evaluations occur in any setting where it is necessary to objectively evaluate human behavior. However, evaluations are often categorized into the following domains:
Diagnostic Evaluations focus on assessing the normal and abnormal nature of a person's overall
functioning. The specific nature of these evaluations varies depending on the issues to be addressed, but they involve at least the following components: a clinical interview, a mental status examination, and one
or more psychological tests. These evaluations can be brief, with the psychological testing limited to one
or two short questionnaires which focus on specific concerns, or they can be long, comprehensive
evaluations consisting of lengthy objective and projective measures of personality. Sometimes
measures found in other evaluations may be included for more breadth, such as intellectual assessment
instruments or brief neuropsychological screening measures.
Forensic Evaluations are often very limited in scope, with the specific nature of the evaluation depending on
the clinical and legal issues at hand. As with diagnostic evaluations, these include a clinical
interview, but may or may not involve a mental status examination or psychological testing. When psychological testing is
included, psychologists often take measures to evaluate the accuracy of the testing in order to account
for possible deception. Because of their limited scope, these evaluations are often brief. However, custody evaluations can often prove to be the exception. Family courts and/or private
attorneys often require detailed information regarding almost everyone involved in a custody dispute to
come to equitable arrangements. These evaluations may involve testing step-parents and distant
third parties who may be only minimally involved (e.g., a new companion of a parent or grandparents
who have involvement with the children). Thus, these evaluations can be lengthy and very detailed,
involving interviews, testing, and periods of family observation.
Neuropsychological Evaluations are often the most focused, but also the lengthiest, of evaluations. Designed to
assess the neuropsychological functioning of an individual, these evaluations consist of batteries of tests
that can require up to 8 hours to complete. Comprehensive in depth and breadth, they focus on the
process of neurological functioning and should not be confused with neurological tests, often performed
with MRIs and/or CAT scans, which focus on structure. Neuropsychological evaluations are
performed by trained neuropsychologists and require extensive training for appropriate administration
and interpretation. The two most commonly used batteries are the Halstead-Reitan
Neuropsychological Battery and the Luria-Nebraska Neuropsychological Battery. There are cases
where a complete battery is not necessary and brief screenings suffice; however, in such cases the
screening is part of a more comprehensive evaluation focusing on another issue.
Educational Evaluations are often the most limited in scope, focusing only on the intellectual functioning
of individuals, with limited forays into areas that may impact the educational requirements of the
person. These evaluations are generally performed in schools and are often done to determine the
services a child may need, the type of class in which he or she should be placed, and/or to determine
the type of guidance the child may need in the future.
Vocational Evaluations focus on helping individuals make career choices whether as adolescents and
young adults trying to determine a possible career path or adults considering a change. Such testing
may occur in college counseling centers or in employment settings.
Components of an Evaluation
Clinical Interviews are the primary method of gathering psychological information and no assessment is
complete without one. Interviews vary from completely unstructured approaches in which the clinician follows the story told by the individual to structured interviews in which specific questions are asked in a
precise order. However, most interviews lie somewhere in between these extremes. The core of a
clinical interview is history gathering, focusing on the development of the person and on the
development of the presenting problems. The depth to which this history is explored depends on the
nature and context of the evaluation.
Mental Status Examinations are often conducted as part of the clinical interview, and may not even be
directly addressed if the clinician is able to assess the mental status of a patient from observation alone.
A mental status examination is a means of assessing the person's current thought processes, emotions, and
interpersonal qualities. An individual's mental state can impact the rest of an evaluation and provides a
clinician with a gauge to qualitatively assess and interpret data from other areas of an evaluation. The
mental status can also provide clues to areas that may need to be addressed in follow-up sessions or
outside referrals.
Objective Personality Tests are paper-and-pencil self-report inventories that consist of true-or-false
or multiple choice questions. They come in a variety of forms, from lengthy measures of global
functioning, such as the Minnesota Multiphasic Personality Inventory (MMPI-2) and the Personality
Assessment Inventory (PAI) to short measures focusing on specific concerns, such as the Beck
Depression Inventory (BDI-II). Generally no more than one global measure is included in a clinical
battery and this measure often forms the core of such a battery. Furthermore, global measures are
available in a number of different forms that can be used with different age groups.
These measures often have strong psychometric properties and are interpreted by comparing them to
data gathered from a population sample that is considered the norm. The degree to which this
comparative population is truly the norm has implications for accurate interpretation and this is one of
the reasons that measures such as these are continually revised. For example, the original MMPI was
normed on a sample gathered around 1940 that was no longer reflective of the United States population,
resulting in a re-norming when the MMPI-2 was developed.
Projective Personality Tests are a diverse set of tools. The commonly used approaches include
inkblots (i.e., the Rorschach) in which the individual must describe what is seen in a given inkblot,
story-telling tests such as the Thematic Apperception Test (TAT) in which stories are told regarding a
series of pictures, word association tests, and drawing tests such as the Draw a Person or the Kinetic
Family Drawing. These measures are often used as supplements to objective tests, with the
Rorschach one of the most frequently administered measures.
These tests are controversial in their psychometric properties, and most clinicians regard them as clinical
tools rather than tests. Interpretations are often based on clinical judgment with only minimal
objectivity. A notable exception is the Rorschach, for which admirable attempts have been made
to make scoring and interpretation more objective. However, projective measures are commonly employed and, in the
hands of a skilled clinician, are found to yield clinically relevant idiographic data that could not be assessed
using other methods.
Aptitude Tests are measures specifically designed to assess an individual's cognitive and intellectual
functioning. These tests can be divided into two sub-categories: intelligence tests and achievement tests. The former measure a person's intellectual functioning in terms of their ability to learn and provide
information in the form of an IQ, while the latter measure what a person has learned and provide
information in terms of grade equivalents. As with other psychological measures, these tests can be
lengthy and comprehensive or short screening measures. Commonly used intelligence tests include the
Wechsler Adult Intelligence Scale III (WAIS-III) and the Wechsler Intelligence Scale for Children
III (WISC-III), while major achievement tests include the Wechsler Individual Achievement Test (WIAT)
and the Wide Range Achievement Test (WRAT).
The historical antecedents of modern day testing, these tests are considered the hallmarks of
psychometric strength. Yet, controversies abound regarding their use: How does one define
intelligence? Is IQ an accurate measure? Are intelligence tests fair to minorities?
Specialty Measures include any number of tests designed to address specific questions. These types of
tests may compose a battery unto themselves (e.g., the special neuropsychological tests that make up the
Halstead-Reitan) or may be specialized instruments that supplement other measures (e.g., measures
designed to assess for deception in forensic evaluations or tests designed to answer a specific legal
question).
Psychometrics
Psychological assessment is a science based on objectively measuring characteristics of human behavior. In order to
do so, psychological measures must meet certain criteria to be considered objective measures. A complete
discussion of psychometric theory is beyond the scope of this brief exploration, but the following characteristics are
important to consider.
Norms are used as a reference against which psychological test data is interpreted. They consist of the
test performance of a standardization sample that is reflective of the general population under
consideration. Without normative data, the information obtained regarding a person is meaningless. Norms provide a means of assessing a person's relative standing in comparison to others. There are
various types of norms that serve specific purposes, with the two most common types presented here:
Developmental norms are used to assess how far along the normal developmental path a
person has progressed. Examples include mental age, on which early IQ scores were based, and grade equivalents.
Within-group norms are used to evaluate a person's performance in terms of a similar
comparison group. Examples include comparing a child's performance to that of other children of the same
age, or comparing a patient with schizophrenia to other patients with schizophrenia.
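As a rough illustration of how within-group norms give a raw score its meaning, the sketch below converts a raw score into a z-score, a standard score, and a percentile. The norm sample, the function name, and the mean-100/SD-15 scale are assumptions chosen for illustration; they are not taken from any published test, and the percentile step assumes scores are normally distributed in the norm group.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical raw scores from a standardization sample (illustrative only).
NORM_SAMPLE = [38, 42, 45, 47, 50, 50, 52, 55, 58, 63]


def interpret_raw_score(raw, sample=NORM_SAMPLE):
    """Express a raw score relative to a norm sample.

    Returns (z, standard_score, percentile). The percentile assumes the
    trait is normally distributed in the norm group.
    """
    m = mean(sample)
    sd = stdev(sample)               # sample standard deviation
    z = (raw - m) / sd               # distance from the norm mean, in SD units
    standard_score = 100 + 15 * z    # rescaled to mean 100, SD 15 (assumed scale)
    percentile = 100 * NormalDist().cdf(z)
    return z, standard_score, percentile


z, standard, pct = interpret_raw_score(58)
print(f"z = {z:.2f}, standard score = {standard:.0f}, percentile = {pct:.0f}")
```

The same raw score would produce a different standing against a different norm sample, which is why the choice of comparison group matters so much for interpretation.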
Reliability refers to the consistency of scores obtained by a person when re-examined. The degree to
which a test is reliable determines the precision with which it assesses the person. A test which is reliable
does not necessarily measure what it is supposed to (that is validity, addressed next). For example, a
person who consistently throws darts on the border of a dartboard is reliable in her performance;
however, she is not valid in the sense that she is not doing what is supposed to be done (hitting in the
scoring region of the board). There are different methods for assessing reliability, each of which is
useful under different circumstances:
Test-retest reliability is determined through the administration of an identical test over
different occasions and shows the extent to which scores on a test can be generalized over
different administrations. Although generally useful, this type of reliability has limitations. Some behaviors fluctuate extensively and it is necessary to take this into consideration
when using test-retest reliability. More serious are issues related to the impact practice
may have on a test. This type of reliability is useful for measures of stable personality
traits, but not for measures of aptitude, where practice severely impacts performance on
future administrations.
Alternate form reliability combines test-retest reliability with the administration of two
different versions of a test. In using this method of reliability analysis, it is imperative that
the tests truly be parallel versions.
Split-half reliability uses statistical procedures to determine reliability from the single
administration of one form of a test. This is the most commonly used approach for
determining reliability with aptitude tests and the most effective. It is also appropriate in
cases where the trait assessed is likely to fluctuate extensively.
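The reliability methods above are usually reported as correlation coefficients. The following is a minimal sketch, assuming a Pearson correlation for test-retest reliability and an odd-even item split with the Spearman-Brown correction for split-half reliability. The source does not name a specific statistical procedure, so these are common textbook choices rather than the author's method, and all data and function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def retest_reliability(first, second):
    """Correlate scores from two administrations of the same test."""
    return pearson(first, second)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per examinee.

    Items are split into odd- and even-numbered halves (an assumed
    splitting rule), the half totals are correlated, and the result
    is stepped up with the Spearman-Brown formula.
    """
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson(odd, even))


# Hypothetical data: five examinees tested twice, then six scored items each.
first_session = [12, 15, 9, 20, 17]
second_session = [13, 14, 10, 19, 18]
items = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
]
print(f"test-retest r = {retest_reliability(first_session, second_session):.2f}")
print(f"split-half (corrected) = {split_half_reliability(items):.2f}")
```

The correction step is needed because each half is only half as long as the full test, and shorter tests tend to yield lower reliability estimates.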
Validity addresses what a test actually measures and how well it does that, and tells the examiner what
can be interpreted from test scores. Tests are validated with regard to a particular use; one cannot say
that a particular test has "high" or "low validity" in general terms (however, in common parlance an
established test is often described as having high validity because it is commonly understood what
the test measures). Basically, the validity of any test is determined by comparing it to another test or
some observable fact (i.e., validity is always based on external relationships). The types of validity are
as follows:
Content validity refers to the systematic determination of whether the content of a test
measures the traits that it is designed to measure. This type of validity is built into the test
when it is constructed through the selection of appropriate items.
Related to content validity is face validity, which refers to the degree to which a test superficially appears to
measure the trait at hand. Although it is considered desirable for a test to have face
validity, this may not always be the case. For example, on measures geared toward the
assessment of malingering and deception, low face validity may aid in more effective
detection.
Criterion validity refers to the degree to which a test predicts the person's performance on
future, specified activities. In such cases, the performance on the test is compared with
performance on the predicted task.
Construct validity refers to the degree to which the test measures the underlying theoretical
construct. Such validation is often the core of theoretically derived tests such as the MCMI-III and the PAI in personality assessment, or the Woodcock-Johnson Psycho-Educational Battery-Revised
in aptitude assessment.
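Of the validity types above, criterion validity lends itself most directly to a numerical check, since test scores are compared with later performance on the predicted task. The sketch below assumes the comparison is summarized as a Pearson correlation (often called a validity coefficient). The screening scores, the supervisor ratings, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def criterion_validity(test_scores, criterion_scores):
    """Correlate test scores with performance on the predicted criterion."""
    if len(test_scores) != len(criterion_scores):
        raise ValueError("each examinee needs one test score and one criterion score")
    return pearson(test_scores, criterion_scores)


# Hypothetical data: pre-employment screening scores and the same
# employees' supervisor ratings collected later.
screening = [52, 61, 47, 70, 66, 58]
ratings = [3.1, 3.8, 2.9, 4.5, 3.9, 3.6]
print(f"validity coefficient = {criterion_validity(screening, ratings):.2f}")
```

A coefficient computed this way applies only to the particular use examined, here predicting job ratings, which is consistent with the point that a test is validated for a specific purpose rather than in general.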
~~~~~~~~~~
Source - This article appeared on the former web site Psychological Assessment Online,
and has been posted here with the author's permission.