Convergence Between Thematic Apperception Test
RESEARCH ARTICLE
KEYWORDS
five‐factor model, personality assessment, personality assessment
inventory, projective methods
1 | INTRODUCTION
The longstanding lack of convergence between self‐report and performance‐based measures, termed
by Bornstein (2002) the heteromethod convergence problem, continues to represent a controversial issue in
personality assessment. While compelling hypotheses for the poor convergence of these methodologies have been
offered (Bornstein, 2002, 2011; Meyer, 1997; Teglasi, 2013), counterevidence for these hypotheses has also been
demonstrated (Archer, 1996; Krishnamurthy, Archer, & House, 1996), and to date there remains no definitive
explanation for the widespread observance of weak heteromethod interrelationships. However, one additional
hypothesis regarding the role of assessment method variance warrants further investigation. Specifically,
differences in response format may be an important factor precluding the convergence of self‐report and
performance‐based instruments, and it is plausible that adapting performance‐based methodologies to more closely
resembling contemporary structured assessment approaches may lead to greater observed convergence between
these distinct methods.
In an exploratory study of this hypothesis, Morey and McCredie (2018) investigated correlations between a
multiple‐choice adaptation of the Rorschach Inkblot Test, the Amplified Multiple Choice Test (Harrower & Steiner,
1951), and two self‐report instruments: the Personality Assessment Inventory (PAI; Morey, 1991), a multiscale
measure of personality and psychopathology, and the International Personality Item Pool (IPIP) Representation of
the Revised NEO Personality Inventory (IPIP‐NEO‐50), a measure of the five broad personality domains. Contrary
to previous findings regarding poor convergence between the Rorschach and self‐report instruments such as the
Minnesota Multiphasic Personality Inventory (e.g., Archer & Krishnamurthy, 1993), hypothesized interrelationships
of considerable magnitude were observed. Indeed, these hypothesized correlations were on average of similar
magnitude to previously demonstrated average relationships between the Rorschach and externally assessed
indicators (e.g., Mihura, Meyer, Dumitrascu, & Bombel, 2013), suggesting that the adapted multiple‐choice version
correlated with self‐report constructs to a similar degree as might be expected between standard Rorschach
scoring and behavioral responses. These preliminary findings suggest that it is feasible for similar constructs on
self‐report and performance‐based instruments to correlate meaningfully when comparable structured response
formats are utilized, thus offering evidence for the role of method variance in the convergence of these
methodologies.
The Thematic Apperception Test (TAT), developed originally by Morgan and Murray (1935), is an additional
performance‐based methodology that like the Rorschach has been a subject of longstanding controversy (e.g.,
Hunsley, Lee, & Wood, 2003; Lilienfeld, Wood, & Garb, 2000, 2005). Despite evidence that the TAT maintains
equivalent psychometric properties to other personality measures (Meyer, 2004), the reliability and validity of the
TAT have proven to be particularly difficult to study systematically given the diverse array of available
administration and scoring systems (Keiser & Prather, 1990). In addition, poor reliability has been noted due to
person‐system interactions, such that individuals’ personal backgrounds interact uniquely with the stimuli, as well
as drive saturation, such that individuals who respond to a particular drive on one card are less likely to respond as
highly to that drive again on the next card, leading to a fluctuating pattern of responses (Gruber & Kreuzpointner,
2013). Importantly, however, low internal consistency stemming from ipsative variability has been shown to be
associated with better criterion validity, thus rendering typical reliability estimations a challenge for implicit motive
measurement. As such, some researchers have suggested that the psychometric criteria used to evaluate
structured, self‐report instruments are not necessarily appropriate for open‐ended performance‐based methods
(Gruber & Kreuzpointner, 2013; Jenkins, 2017).
Questions of validity have also been raised given the uncertainty as to how many mental health practitioners
are using standardized scoring systems in their interpretations of the TAT at all (Lilienfeld et al., 2000, 2005).
Nonetheless, several specific TAT scoring systems have garnered substantial empirical support. One of the most
empirically‐supported uses of the TAT has been in the assessment of achievement motivation, using a scoring
protocol originally developed by McClelland and Koestner (1992) and McClelland, Atkinson, Clark, and Lowell
(1953) to assess Murray’s (1938) need for achievement. A meta‐analysis conducted by Spangler (1992) reported
that achievement motivation as assessed by the TAT and self‐report questionnaires were both predictive of
operant outcomes, with the TAT on average demonstrating superior predictive validity. Similarly, indicators of
achievement motivation as assessed by the TAT and self‐report questionnaires have been shown to demonstrate
comparable validity as predictors of entrepreneurial performance and career choice (Collins, Hanges, & Locke, 2004).
McCREDIE AND MOREY | 3
The Social Cognition and Object Relations Scale‐Global rating method (SCORS‐G; Stein & Slavin‐Mulford, 2018) has
also emerged as a relatively new narrative scoring system that when applied to the TAT has shown promising
sensitivity to a variety of psychological phenomena, including cognitive, psychopathology, and personality functioning
(Stein, Slavin‐Mulford, Sinclair, Siefert, & Blais, 2012). SCORS‐G ratings of TAT responses have also been shown to
correlate with important external outcomes, including the history of childhood trauma, psychiatric hospitalizations,
and suicidality (Stein et al., 2015).
Despite substantial validity evidence for several TAT scoring approaches (e.g., McClelland et al., 1953; Stein
et al., 2012), interrelationships between related constructs assessed using self‐report and TAT scoring protocols
have tended to vary substantially. For instance, self‐report and TAT assessments of achievement motivation have a
lengthy history of poor convergence (Köllner & Schultheiss, 2014; McClelland, Koestner, & Weinberger, 1989).
Across the 36 correlations included in the meta‐analysis, Spangler (1992) reported a modest average correlation of
0.09 between TAT and self‐report measures of achievement motivation. Likewise, Köllner and Schultheiss (2014)
reported a corrected Spearman’s rho value of 0.14 between implicit and explicit measures of achievement
motivation, although other researchers have demonstrated somewhat larger cross‐method relationships (e.g., 0.23,
Emmons & McAdams, 1991; 0.22, Thrash & Elliot, 2002). However, as in the case of the Rorschach (Meyer, 1996,
1997), proponents of the TAT have argued that significant intercorrelations between the TAT and self‐report
methodologies are not to be expected, given that the implicit motives assessed by the TAT are theoretically distinct
from the explicit motives assessed by self‐report (Cramer, 1999; McClelland et al., 1989). Specifically, some
researchers (Brunstein & Maier, 2005; McClelland et al., 1989; Rawolle, Schultheiss, & Schultheiss, 2013) have
noted that performance‐based and self‐report measurements of the same psychological variable often demonstrate
quite different patterns of correlates, further indicating the distinct nature of the constructs.
Nevertheless, these predictions have been challenged by correlations of substantial magnitude when specific
TAT scoring systems and self‐report questionnaires are used. Stein et al. (2012) reported correlations of medium to
large magnitude between the SCORS‐G component two, capturing the self‐esteem and identity and coherence of
self variables, and several PAI indicators, including the suicidal ideation (SUI; r = −0.50), anxiety (r = −0.42), anxiety‐
related disorders (r = −0.44), and depression (r = −0.44) scales. In addition, several significant relationships emerged
with respect to SCORS‐G ratings of the TAT and five‐factor normative personality traits. Specifically, significant
negative relationships were observed between SCORS‐G component two and Neuroticism (r = −0.40), and
significant positive relationships were observed between SCORS‐G component one, capturing the emotional
investment in values and moral standards and experience and management of aggressive impulses (AGG) variables,
and agreeableness (r = 0.35) and conscientiousness (r = 0.27). Stein et al. (2018) additionally reported significant
relationships between level of personality organization as assessed by SCORS‐G scoring of the TAT and reported
symptomatology on the PAI, as well as significant relationships between level of personality organization and the
study‐derived domains of regulation/control and energy level using the NEO Five‐Factor‐Inventory. Thus, despite
mixed evidence regarding the convergence of TAT and self‐report methodologies, it appears that significant
interrelationships between these two methods are feasible with the use of specific scoring systems and self‐report
criteria.
As explored using a multiple‐choice adaptation of the Rorschach (Morey & McCredie, 2018), another possible
explanation for the inconsistent convergence between the TAT and self‐report is the method variance associated
with the different response format of these measures, with the fixed nature of self‐report response options
contrasting with the open‐ended nature of TAT narrative responses. As in the case of the Amplified Multiple Choice
Test (Harrower & Steiner, 1951), there has also been a history of efforts to develop structured scoring protocols to
accompany the TAT and related measures (Holmstrom, Silber, & Karp, 1990; Hurley, 1955; Johnston, 1957;
Schultheiss, Yankova, Dirlikov, & Schad, 2009), and the use of these adapted approaches may serve to control some
of the method variance precluding heteromethod convergence. For instance, Holmstrom et al. (1990) developed
the Apperceptive Personality Test (APT), in which participants are presented with a series of eight emotionally
ambiguous pictures and a set of structured open‐ended questions pertaining to each picture. The APT was found to
demonstrate small but significant relationships with the majority of hypothesized constructs on the MMPI (Karp,
Silber, Holmstrom, & Kellert, 1992), suggesting that imposing greater structure on the response format may
increase convergence with highly structured measures such as self‐report. Schultheiss et al. (2009) developed a
cue‐ and response‐matched questionnaire version of the Picture Story Exercise (PSE‐Q) in which participants were
asked to imagine themselves as one of the people in each of eight pictures and respond to a series of 14 items per
picture regarding what they might think, feel, desire, or do in each situation. Although the PSE‐Q correlated to a
greater degree with the self‐report Personality Research Form than the standard performance‐based PSE, the PSE
and the PSE‐Q were not significantly correlated. Notably, however, the PSE‐Q deviated from the traditional PSE
such that participants were explicitly asked to insert themselves into the picture story and respond as they
themselves might in that particular situation, thus seemingly resembling self‐report more closely than implicit
motive measurement.
The Iowa Picture Interpretation Test (IPIT; Hurley, 1955) was developed to offer a more time‐ and cost‐efficient
procedure for assessing the McClelland et al. (1953) achievement motivation variable, in light of issues of interrater
reliability and the extensive training necessary to score TAT free responses. The IPIT acts as a forced‐choice
stimulus attribution measure such that participants are presented with a TAT card and instructed to rank a set of
alternatives according to perceived fit with the given card. The original structured protocol utilizes a selection of 10
TAT cards, and each of these cards is presented to participants with four possible response choices representing
one‐sentence stories corresponding to the individual card. Participants are instructed to rank the four response
choices according to their perception of how well each story fits the story picture presented for each of the 10 TAT
cards. Each of the four response choices corresponds to one of four response categories (i.e., achievement imagery
[AI], insecurity [I], blandness [B], and hostility [H]), and scores for each of these categories are calculated by
summing the rankings across the 10 cards.
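The scoring rule described above (sum each category's rank across cards) can be illustrated with a minimal sketch. The function name `score_ipit`, the dictionary layout, and the three-card toy data are hypothetical. The sketch also assumes that ranks run from 1 to 4 and are summed as given; the text does not state whether ranks are reversed before summing.

```python
# Hypothetical sketch of IPIT category scoring: one rank per category per card,
# summed across cards. Category labels follow the text (AI, I, B, H).
CATEGORIES = ("AI", "I", "B", "H")


def score_ipit(rankings):
    """Return the summed rank for each category.

    rankings: list of dicts, one per card, mapping category -> rank (1-4).
    """
    totals = {category: 0 for category in CATEGORIES}
    for card in rankings:
        # Each card must rank all four categories using 1, 2, 3, 4 exactly once.
        if set(card) != set(CATEGORIES) or sorted(card.values()) != [1, 2, 3, 4]:
            raise ValueError("each card needs ranks 1-4, one per category")
        for category, rank in card.items():
            totals[category] += rank
    return totals


# Toy data for three cards (illustrative only, not study data).
example = [
    {"AI": 1, "I": 3, "B": 2, "H": 4},
    {"AI": 2, "I": 4, "B": 1, "H": 3},
    {"AI": 1, "I": 2, "B": 3, "H": 4},
]
totals = score_ipit(example)
```

Because every card distributes the same four ranks, the four category totals always sum to 10 times the number of cards, so the scores are ipsative: a higher total in one category necessarily lowers the others.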
Early investigation of the IPIT demonstrated some validity evidence for the measure, particularly concerning
achievement imagery (Hurley, 1957; Johnston, 1955; Kight & Sassenrath, 1966). Individuals scoring higher on
achievement imagery per the IPIT were shown to work faster and produce fewer errors when completing
laboratory tasks, such as self‐guided instructional booklets (Kight & Sassenrath, 1966) and an electric maze
(Johnston, 1955), and demonstrated better academic performance (Fakouri, 1972). In addition, IPIT ratings of
hostility were shown to correlate significantly with several indicators on a self‐report measure of hostility and
aggression, including assault (r = 0.40), resentment (r = 0.45), and suspicion (r = 0.49), among male psychiatric clients
(Buss, Fischer, & Simmons, 1962). However, despite utilizing a structured approach to TAT scoring, some reports
continued to demonstrate poor convergence between the IPIT scoring categories and similar constructs assessed
via self‐report (Barnette, 1961; Goodstein, 1954), and the measure demonstrated questionable internal consistency
(Hurley, 1955). In an effort to improve upon the reliability of the IPIT, Johnston (1957) increased the length from
10 to 24 TAT cards and removed response options that were either extremely popular or unpopular among
participants, resulting in a revised version demonstrating modest improvements in split‐half internal consistency
(AI: α = .33, I: α = .48, B: α = .64, and H: α = .55) and 1 month test–retest reliability (AI: r = 0.60, I: r = .58, B: r = 0.69,
and H: r = 0.73).
The interrelationships observed between the Amplified Multiple Choice Test version of the Rorschach and
the PAI and IPIP‐NEO‐50 (Morey & McCredie, 2018) offer preliminary evidence to suggest that method variance
may be an important impeding factor in the convergence of performance‐based and self‐report methodologies.
Thus, the possibility that a structured scoring system of the TAT may produce similarly sized correlations in
relation to contemporary self‐report assessment approaches warrants further exploration. In a continued effort
to examine the role of method variance in heteromethod convergence, the present study sought to conduct an
exploratory analysis of the interrelationships between the revised IPIT scoring system for the TAT (Johnston,
1957) and the PAI and IPIP‐NEO‐50 ratings of the domain traits of the five‐factor model. Although the
exploratory nature of this analysis should be recognized and emphasized, several specific hypotheses are
proposed. Given known relationships between conscientiousness and structured assessments of achievement
motivation (e.g., Busato, Prins, Elshout, & Hamaker, 2000; Costa & McCrae, 1988), it is expected that the
Conscientiousness domain of the IPIP‐NEO‐50 will correlate with the achievement imagery scoring category of
the IPIT. In addition, given demonstrated relationships between the hostility scoring category of the IPIT and
self‐report indicators of assault, resentment, and suspicion (Buss et al., 1962), the IPIT hostility scoring category
is expected to correlate positively with the paranoia (PAR) and AGG scales of the PAI and negatively with the
agreeableness domain of the IPIP‐NEO‐50.
Finally, in light of the limited empirical examination of the blandness and insecurity scoring categories, hypotheses
are tentatively proposed for these categories based on intuitive relationships among constructs. Specifically, it is
anticipated that blandness may correlate with PAI validity indicators, including a positive relationship with the
Positive Impression Management (PIM) scale and a negative relationship with the Negative Impression
Management (NIM) scale, and that insecurity will correlate with the PAI anxiety (ANX) and depression (DEP)
scales. However, in keeping with the exploratory nature of these analyses, a wide range of possible correlations will
be examined in addition to these proposed hypotheses to identify potential avenues for future research.
2 | METHOD
2.1 | Participants
A total of 455 participants (57% male) were recruited from Amazon’s Mechanical Turk (MTurk), with the
requirement that they be at least 18 years old and located in the United States (as verified by their IP address).
Participants ranged from 19 to 70 years of age, with a mean age of 35.5 years (standard deviation [SD] = 10.4). The
majority of participants identified as European American (76.5%), followed by African American (7.9%), Asian
American (7.7%), Hispanic or Latino/a (5.5%), Native American (1.5%), and Other (0.9%). Individuals located outside
the United States were restricted from participation given that all study materials were in English and previous
research has shown that MTurk data samples demonstrate equivalence to more traditional data collection methods
only when IP addresses are restricted to native English speakers (Feitosa, Joseph, & Newman, 2015). No
restrictions based on task approval rate or any other qualification measure were used. Participants were
compensated $7.50 for completion of the study materials, which included additional measures not examined in the
present study (548 total items; mean response time = 67 min). The study protocol was approved as meeting ethical
standards per the university institutional review board, and participants had the option to discontinue the study at
any time.
2.2 | Measures
2.2.1 | Personality assessment inventory
The PAI (Morey, 1991) is a 344‐item structured, self‐administered questionnaire that assesses a variety of personality
and psychopathology domains. The PAI consists of 22 nonoverlapping scales, including 4 validity scales, 11 clinical
scales, 5 treatment scales, and 2 interpersonal scales. Response alternatives range from “totally false, not at all
true” (1) to “very true” (4). Means, SDs, and internal consistencies of the PAI scales and subscales in the present
sample are published in a previous study (McCredie & Morey, 2019). Internal consistency alpha coefficients ranged
from .81 (Stress) to .95 (ANX) for the full scales and .63 (obsessive–compulsive; ARD‐O) to .91 (affective
depression; DEP‐A) for the subscales.
to one of five broad personality dimensions: neuroticism (α = .92), extraversion (α = .90), agreeableness (α = .82),
conscientiousness (α = .89), and openness to experience (α = .80). The provided response alternatives ranged from
“very little or not at all descriptive of me” (0) to “extremely descriptive of me” (3).
Achievement imagery
Individuals high in achievement imagery are characterized by a strong desire to be successful and demonstrate
excellence. These individuals are motivated by competition with others and strive to achieve goals that are
considered high accomplishments per social standards.
Insecurity
Individuals high in insecurity are characterized by feelings of failure or anticipated failure in achieving sought after
goals. These individuals demonstrate a fear of deprivation or risk of deprivation from these desirable goals,
particularly those of an interpersonal nature (e.g., affection).
Blandness
Individuals high in blandness are characterized by depersonalization of situations or events. These individuals are
reluctant to express feelings or motives associated with the self or others.
Hostility
Individuals high in hostility often express feelings of irritability, anger, or resentment. These individuals
demonstrate or verbalize a desire or intention to act threateningly or adversely towards others.
The IPIT utilizes the following 24 TAT cards (Murray, 1943): 1, 2, 3GF, 4, 5, 6BM, 6GF, 7BM, 7GF, 8BM, 9BM,
9GF, 10, 12F, 12M, 13B, 13G, 13MF, 14, 17BM, 17GF, 18BM, 18GF, and 20.
3 | RESULTS
Table 1 presents the means, SDs, and coefficient alphas for each of the revised IPIT scoring categories. There was
appreciable internal consistency observed for the four scoring categories (mean α = .74), ranging from an alpha of
.70 for achievement imagery to .77 for hostility. All four scoring categories were utilized at all four ranking
positions, and no one scoring category appeared to be over‐ or underutilized by participants. Of the four scoring
TABLE 2 Pearson correlations between IPIT categories and PAI scales and IPIP‐NEO‐50 domains

                                        Achievement
                                        imagery      Insecurity   Blandness    Hostility
PAI
Inconsistency (ICN)                     −0.30**      0.16**       −0.16**      0.27**
Infrequency (INF)                       −0.18**      0.04         −0.21**      0.32**
Negative impression management (NIM)    −0.36**      0.20**       −0.31†,**    0.44**
Positive impression management (PIM)    0.35**       −0.28**      0.19†,**     −0.24**
Somatic complaints (SOM)                −0.33**      0.21**       −0.27**      0.37**
Anxiety (ANX)                           −0.44**      0.34†,**     −0.32**      0.40**
Anxiety‐related disorders (ARD)         −0.43**      0.32**       −0.30**      0.39**
Depression (DEP)                        −0.49**      0.39†,**     −0.33**      0.40**
Mania (MAN)                             −0.17**      0.05         −0.24**      0.33**
Paranoia (PAR)                          −0.48**      0.30**       −0.36**      0.51†,**
Schizophrenia (SCZ)                     −0.45**      0.32**       −0.36**      0.46**
Borderline features (BOR)               −0.46**      0.32**       −0.36**      0.47**
Antisocial features (ANT)               −0.27**      0.13**       −0.29**      0.40**
Alcohol problems (ALC)                  −0.22**      0.16**       −0.23**      0.27**
Drug problems (DRG)                     −0.21**      0.12*        −0.26**      0.33**
Aggression (AGG)                        −0.27**      0.13**       −0.26**      0.37**
Suicidal ideation (SUI)                 −0.42**      0.32**       −0.27**      0.35**
Stress (STR)                            −0.34**      0.25**       −0.26**      0.32**
Nonsupport (NON)                        −0.51**      0.38**       −0.32**      0.43**
Treatment rejection (RXR)               0.38**       −0.31**      0.23**       −0.29**
Dominance (DOM)                         0.25**       −0.25**      0.08         −0.08
Warmth (WRM)                            0.37**       −0.32**      0.18**       −0.22**
IPIP‐NEO‐50
Neuroticism                             −0.42**      0.34**       −0.26**      0.33**
Extraversion                            0.27**       −0.28**      0.13**       −0.12*
Openness to experience                  0.21**       −0.15**      0.14**       −0.19**
Agreeableness                           0.40**       −0.26**      0.27**       −0.38†,**
Conscientiousness                       0.33†,**     −0.31**      0.24**       −0.25**

Abbreviations: IPIP, international personality item pool; IPIT, Iowa picture interpretation test; PAI, personality assessment
inventory.
†Hypothesized correlations.
*p < .05.
**p < .01.
categories, blandness tended to be the most preferred response category (mean ranking = 2.22) across TAT
pictures and Hostility was the least preferred (mean ranking = 2.92).
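For readers unfamiliar with the statistic, coefficient alpha for one scoring category can be sketched as follows, treating each card's rank for that category as an "item". The function name `cronbach_alpha` and the toy matrix are hypothetical, and the sketch assumes the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores); the text does not describe the authors' exact computation.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a respondents-by-items score matrix.

    item_scores: list of respondents, each a list of k item scores
    (here, one category's rank on each of k cards).
    """
    k = len(item_scores[0])
    # Transpose so each tuple holds one item's scores across respondents.
    items = list(zip(*item_scores))
    sum_item_variances = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Toy data: four respondents, three cards (illustrative only, not study data).
toy = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 3, 4],
]
alpha = cronbach_alpha(toy)
```

Because IPIT ranks are ipsative within each card, alphas computed this way for the four categories are not independent of one another, which is worth keeping in mind when comparing them.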
Table 2 presents the observed Pearson product‐moment correlations (r) between the IPIT scoring variables and
the PAI scales and IPIP‐NEO‐50 domains. Hypothesized correlations between the IPIT scoring categories and the
corresponding PAI and IPIP‐NEO‐50 indicators were all statistically significant, with magnitudes primarily falling in
the medium range. Of the hypothesized correlations, the strongest association was between the hostility scoring
category and the PAI PAR scale (r = 0.51). As expected, a positive relationship with AGG and a negative relationship
with agreeableness were also observed for hostility. As anticipated, achievement imagery demonstrated a significant
positive relationship with conscientiousness, and insecurity was positively associated with the PAI indicators of
DEP and ANX. Blandness was also correlated significantly with the validity scales of the PAI as hypothesized,
demonstrating a negative relationship with NIM and to a slightly lesser magnitude a positive relationship with PIM.
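As a reminder of how the coefficients in Table 2 are defined, the following minimal sketch computes a Pearson product‐moment correlation and the t statistic commonly used to test it against zero. The function names and toy vectors are hypothetical, and the t test with n − 2 degrees of freedom is an assumption; the text does not specify the significance test used or the number of complete cases per correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


# Toy vectors (illustrative only, not study data).
category_score = [40, 52, 47, 61, 55]
scale_score = [12, 20, 15, 30, 22]
r = pearson_r(category_score, scale_score)

# Assumes all 455 participants contributed to the correlation.
t_small = t_statistic(0.12, 455)
```

With a sample of this size, even small correlations yield sizable t values, so statistical significance alone says little about effect magnitude; this is one reason the discussion of effect sizes matters here.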
In addition to the hypothesized correlations, subsequent exploratory analyses revealed significant correlations
between the IPIT scoring categories and the majority of PAI scales and IPIP‐NEO‐50 domains, with several
approaching a large effect size. In general, achievement imagery and blandness tended to be negatively associated
with indicators of pathology and positively associated with indicators of psychological health. For instance, strong
negative relationships with effect sizes approaching the large range were observed between achievement imagery
and PAI indicators of DEP, PAR, and poor social support (nonsupport; NON), whereas positive relationships were
observed with agreeableness. To a slightly lesser magnitude, blandness also demonstrated strong negative
associations with indicators of pathology, including PAR, schizophrenia (SCZ), and borderline features (BOR) scales
of the PAI, as well as positive relationships with personality traits such as extraversion and agreeableness. The
opposite pattern was observed for the IPIT insecurity and hostility categories, which tended to be positively
associated with indicators of psychopathology and negatively associated with indicators of adaptive psychological
functioning. For instance, in addition to the hypothesized relationships with the PAI ANX and DEP scales, insecurity
also demonstrated medium effect size relationships with neuroticism, as well as the PAI indicators of ARD, poor
social support (NON), SCZ, BOR, and SUI. Likewise, individuals scoring higher on insecurity tended to be less
conscientious and interpersonally warm. Relationships between hostility and the PAI and IPIP‐NEO‐50 indicators
demonstrated a similar pattern, with additional positive associations observed with PAI indicators of BOR, SCZ, and
NON, as well as a generally negative self‐presentation (NIM).
4 | DISCUSSION
This study extends previous research (Morey & McCredie, 2018) utilizing structured adaptations of performance‐
based measures to further explore the role of method variance in heteromethod convergence. Although existing
research has produced mixed findings regarding interrelationships between comparable indicators on self‐report
and TAT scoring protocols (e.g., McClelland et al., 1989; Spangler, 1992; Stein et al., 2012), the results of the
present study suggest that TAT performance can correlate meaningfully with similar constructs assessed using self‐
report methodology when comparable structured response formats are used. As anticipated, all four scoring
categories demonstrated significant relationships in the expected direction with hypothesized self‐report
constructs. A general pattern emerged such that achievement imagery and blandness tended to be associated
with indicators of psychological health and adaptive functioning, whereas insecurity and hostility were primarily
associated with psychopathology and maladaptive functioning. Notably, approximately half of the observed
correlations between the IPIT scoring categories and PAI scales and IPIP‐NEO‐50 domains exceeded the
“personality ceiling” (e.g., Mischel, 1968) of approximately 0.30, with correlation magnitudes comparable to those
observed between well‐validated TAT scoring protocols and behavioral outcomes (e.g., SCORS‐G; Stein et al.,
2015). Thus, these findings offer further empirical support for the critical role of response format method variance
in cross‐method convergence (Morey & McCredie, 2018).
The four scoring categories additionally demonstrated better internal consistency than might be expected.
Although some researchers have argued that coefficient alpha indicators of internal consistency are inappropriate
given the heterogeneous nature of TAT cards (e.g., Cramer, 1999; Gruber & Kreuzpointner, 2013; Lundy, 1985), the
present findings suggest that there is actually a substantial degree of consistency in motivational themes as
measured across the different cards. While it is possible that the limited number of motivational themes evaluated
(i.e., four) may have contributed to the internal consistency observed presently, it is nevertheless notable that these
observed internal consistency coefficients were noticeably higher than those reported in the initial validation study
of the revised IPIT (Johnston, 1957).
As in the case of the Amplified Multiple Choice Rorschach (Harrower & Steiner, 1951), it should be recognized
that these relationships were observed with a structured version of the TAT that is dramatically different from the
variety of free response TAT protocols currently used. This is particularly evident given the observation that TAT
administration and scoring appear to vary substantially from clinician to clinician, with some ambiguity regarding
how many TAT administrators are employing any standardized system at all (Keiser & Prather, 1990; Lilienfeld
et al., 2000, 2005). Nevertheless, despite clear differences between the typical open‐ended TAT response format
and the structured IPIT format used in the present study, the rank ordering response format utilized by the IPIT is
still arguably performance‐based in nature. It should also be noted that this adaptation of the TAT is not proposed
to function as a comprehensive replacement for other TAT scoring systems, as there are several well‐validated
standardized scoring protocols with substantial empirical support (e.g., SCORS‐G) that assess a number of
additional domains beyond the four scoring categories of the IPIT. However, this structured version of the TAT may
show particular utility in contexts in which a performance‐based measure may be advantageous over self‐report,
such as instances in which participants lack insight about their psychological difficulties or are motivated to engage
in impression management strategies (Meyer, 1997), but setting or time constraints preclude a traditional TAT
scoring and administration protocol.
As also observed with the Amplified Multiple Choice Test (McCredie & Morey, 2019), the vast number of
significant interrelationships raises potential issues of discriminant validity that warrant further investigation.
Although hypothesized correlations were observed in the expected directions, the additional number of medium to
large effect size correlations across nearly all of the PAI and IPIP‐NEO‐50 indicators suggests that these scoring
categories may be limited in their ability to detect discrete psychological or personality features. Instead, it appears
that these response categories may be pulling for a general pattern of psychological adjustment and functioning,
such that individuals tending to rank stories from the achievement imagery and blandness categories higher than
the hostility and insecurity categories appear to demonstrate overall more adaptive functioning across personality
and psychopathology domains, while the opposite is also true. Thus, the IPIT may be better suited as a screening
tool for potential psychopathology or adjustment issues as opposed to a psychodiagnostic tool, although future
research will be necessary to examine this measure from a clinical utility perspective.
Two important limitations of the present study also shared by the investigation of the Amplified Multiple
Choice Test (Morey & McCredie, 2018) offer potential avenues for future research in this area. First, the present
study utilized a nonclinical sample, and thus it remains uncertain as to whether this pattern of relationships will be
similarly observed in a psychiatric sample, particularly given that the PAI is often used to assess constructs related
to psychopathology. Future research of both the IPIT and the Amplified Multiple Choice Test in clinical samples will
be necessary to determine the extent to which these patterns of findings generalize among psychiatric clients, as
well as to explore the potential clinical utility of these structured performance‐based protocols. Second, it should
be noted that the relationships observed using a multiple‐choice ranking variation of the TAT will not necessarily
generalize to standard TAT administration and scoring protocols, and it remains unclear to what extent
constructs assessed by a multiple‐choice variation of the TAT align more closely with related constructs as assessed
by the standard TAT scoring approach or self‐report. As in the case of the Rorschach, there are a variety of process‐
related aspects of TAT administration that may be lost using a multiple‐choice approach, including the
interpersonal dynamic between the respondent and examiner, the demand upon the respondent to provide a
coherent narrative, and the opportunity for further inquiry and clarification. It may be challenging to recapture
these process‐related aspects while maintaining the efficiency of the multiple‐choice structure; future research will
be necessary to explore this possibility for both instruments. Furthermore, while structuring these performance‐
based methodologies in a multiple‐choice format may have served to eliminate some of the method variance
thought to impede convergence between performance‐based and self‐report instruments, future research using
alternative types of outcome criteria is necessary to empirically evaluate the influence of this methodological
adaptation, as well as to explore the extent to which the different methods may provide valid, but unshared,
variance in predicting these outcomes.
The results of this study contribute to a growing body of evidence (Morey & McCredie, 2018) suggesting that
method variance may offer a compelling explanation for the poor convergence historically observed between
performance‐based and self‐report methodologies. Although distinct from contemporary performance‐based
approaches, multiple‐choice adaptations of traditional TAT and Rorschach procedures appear to assess a number of
similar constructs, while also converging meaningfully and to a more substantial degree with related constructs on
self‐report measures. Furthermore, although further investigation and empirical validation of these instruments will
undoubtedly be necessary to determine the extent to which these adapted instruments capture similar constructs
as assessed by traditional performance‐based methodology, it is possible that these structured instruments could
offer viable time‐ and cost‐effective alternatives to standard performance‐based administration and scoring
protocols in contexts where resources are limited. Further exploration of these instruments utilizing novel samples
and outcome criteria will continue to illuminate the nature of these observed relationships, with the ultimate
possibility of reintroducing these time‐ and cost‐effective instruments into mainstream clinical practice.
REFERENCES
Archer, R. P. (1996). MMPI‐Rorschach interrelationships: Proposed criteria for evaluating explanatory models. Journal of
Personality Assessment, 67, 504–515. https://doi.org/10.1207/s15327752jpa6703_7
Archer, R. P., & Krishnamurthy, R. (1993). A review of MMPI and Rorschach interrelationships in adult samples. Journal of
Personality Assessment, 61, 277–293. https://doi.org/10.1207/s15327752jpa6102_9
Barnette, W. L., Jr. (1961). A structured and a semi‐structured achievement measure applied to a college sample. Educational
and Psychological Measurement, 21, 647–656.
Bornstein, R. F. (2002). A process dissociation approach to objective‐projective test score interrelationships. Journal of
Personality Assessment, 78, 47–68. https://doi.org/10.1207/S15327752JPA7801_04
Bornstein, R. F. (2011). Toward a process‐focused model of test score validity: Improving psychological assessment in
science and practice. Psychological Assessment, 23, 532–544.
Brunstein, J. C., & Maier, G. W. (2005). Implicit and self‐attributed motives to achieve: Two separate but interacting needs.
Journal of Personality and Social Psychology, 89, 205–222.
Busato, V. V., Prins, F. J., Elshout, J. J., & Hamaker, C. (2000). Intellectual ability, learning style, personality, achievement
motivation and academic success of psychology students in higher education. Personality and Individual Differences, 29,
1057–1068. https://doi.org/10.1016/S0191-8869(99)00253-6
Buss, A. H., Fischer, H., & Simmons, A. J. (1962). Aggression and hostility in psychiatric patients. Journal of Consulting
Psychology, 26, 84–89. https://psycnet.apa.org/10.1037/h0048268
Collins, C. J., Hanges, P. J., & Locke, E. A. (2004). The relationship of achievement motivation to entrepreneurial behavior: A
meta‐analysis. Human Performance, 17(1), 95–117. https://doi.org/10.1207/S15327043HUP1701_5
Costa, P. T., Jr., & McCrae, R. R. (1988). From catalog to classification: Murray's needs and the five‐factor model. Journal of
Personality and Social Psychology, 55, 258–265. https://doi.org/10.1037/0022-3514.55.2.258
Costa, P. T., & McCrae, R. R. (1992). Normal personality assessment in clinical practice: The NEO Personality Inventory.
Psychological Assessment, 4, 5–13. https://doi.org/10.1037/1040-3590.4.1.5
Cramer, P. (1999). Future directions for the thematic apperception test. Journal of Personality Assessment, 72, 74–92.
https://doi.org/10.1207/s15327752jpa7201_5
Emmons, R. A., & McAdams, D. P. (1991). Personal strivings and motive dispositions: Exploring the links. Personality and
Social Psychology Bulletin, 17, 648–654. https://doi.org/10.1177/0146167291176007
Fakouri, M. E. (1972). Achievement motivation and cheating. Psychological Reports, 31, 629–630.
https://doi.org/10.2466/pr0.1972.31.2.629
Feitosa, J., Joseph, D. L., & Newman, D. A. (2015). Crowdsourcing and personality measurement equivalence: A warning
about countries whose primary language is not English. Personality and Individual Differences, 75, 47–52.
https://doi.org/10.1016/j.paid.2014.11.017
Goldberg, L. R. (1999). A broad‐bandwidth, public domain, personality inventory measuring the lower‐level facets of several
five‐factor models. In I. Mervielde, I. Deary, F. De Fruyt, & F. Ostendorf (Eds.), Personality psychology in Europe (Vol. 7, pp.
7–28). Tilburg, Netherlands: Tilburg University Press.
Goodstein, L. (1954). Interrelationships among several measures of anxiety and hostility. Journal of Consulting Psychology,
18, 35–39. https://psycnet.apa.org/10.1037/h0063512
Gruber, N., & Kreuzpointner, L. (2013). Measuring the reliability of picture story exercises like the TAT. PLoS One, 8(11),
e79450. https://doi.org/10.1371/journal.pone.0079450
Harrower, M. R., & Steiner, M. E. (1951). Large scale Rorschach techniques (2nd ed.). Springfield, IL: Charles C. Thomas.
Holmstrom, R. W., Silber, D. E., & Karp, S. A. (1990). Development of the apperceptive personality test. Journal of Personality
Assessment, 54, 252–264.
Hunsley, J., Lee, C. M., & Wood, J. M. (2003). Controversial and questionable assessment techniques. In S. O. Lilienfeld, S. J.
Lynn, & J. M. Lohr (Eds.), Science and pseudoscience in clinical psychology (pp. 39–76). New York, NY: Guilford Press.
Hurley, J. R. (1955). The Iowa picture interpretation test: A multiple‐choice variation of the TAT. Journal of Consulting
Psychology, 19, 372–376. https://psycnet.apa.org/10.1037/h0040550
Hurley, J. R. (1957). Achievement imagery and motivational instructions as determinants of verbal learning. Journal of
Personality, 25, 274–282. https://psycnet.apa.org/10.1111/j.1467-6494.1957.tb01526.x
Jenkins, S. R. (2017). The narrative arc of TATs: Introduction to the JPA special section on thematic apperceptive
techniques. Journal of Personality Assessment, 99, 225–237. https://doi.org/10.1080/00223891.2016.1244066
Johnston, R. A. (1955). The effects of achievement imagery on maze‐learning performance. Journal of Personality, 24,
145–152. https://psycnet.apa.org/10.1111/j.1467-6494.1955.tb01180.x
Johnston, R. A. (1957). A methodological analysis of several revised forms of the Iowa picture interpretation test. Journal of
Personality, 25, 283–293. https://psycnet.apa.org/10.1111/j.1467-6494.1957.tb01527.x
Karp, S. A., Silber, D. E., Holmstrom, R. W., & Kellert, H. (1992). Prediction of MMPI clinical scores from the apperceptive
personality test. Perceptual and Motor Skills, 74, 779–786. https://doi.org/10.2466/pms.1992.74.3.779
Keiser, R. E., & Prather, E. N. (1990). What is the TAT? A review of ten years of research. Journal of Personality Assessment,
55, 800–803. https://doi.org/10.1080/00223891.1990.9674114
Kight, H. R., & Sassenrath, J. M. (1966). Relation of achievement motivation and test anxiety to performance in programed
instruction. Journal of Educational Psychology, 57, 14–17. https://psycnet.apa.org/10.1037/h0022926
Köllner, M. G., & Schultheiss, O. C. (2014). Meta‐analytic evidence of low convergence between implicit and explicit
measures of the needs for achievement, affiliation, and power. Frontiers in Psychology, 5, 1–20.
https://doi.org/10.3389/fpsyg.2014.00826
Krishnamurthy, R., Archer, R. P., & House, J. J. (1996). The MMPI‐A and Rorschach: A failure to establish convergent
validity. Assessment, 3, 179–191. https://doi.org/10.1177/107319119600300210
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques. Psychological Science in the
Public Interest, 1, 27–66. https://doi.org/10.1111/1529-1006.002
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2005). What's wrong with this picture? Scientific American Mind, 16, 50–57.
Lundy, A. (1985). The reliability of the thematic apperception test. Journal of Personality Assessment, 49, 141–145.
https://doi.org/10.1207/s15327752jpa4902_6
McClelland, D. C., Atkinson, J. W., Clark, R. A., & Lowell, E. L. (1953). The achievement motive. Century psychology series.
East Norwalk, CT: Appleton‐Century‐Crofts.
McClelland, D. C., & Koestner, R. (1992). The achievement motive. In C. P. Smith (Ed.), Motivation and personality: Handbook
of thematic content analysis (pp. 143–152). New York, NY: Cambridge University Press.
McClelland, D. C., Koestner, R., & Weinberger, J. (1989). How do self‐attributed and implicit motives differ? Psychological
Review, 96, 690–702. https://psycnet.apa.org/10.1037/0033-295X.96.4.690
McCredie, M. N., & Morey, L. C. (2019). Who are the Turkers? A characterization of MTurk workers using the Personality
Assessment Inventory. Assessment, 26(5), 759–766. https://doi.org/10.1177/1073191118760709
Meyer, G. J. (1996). The Rorschach and MMPI: Toward a more scientifically differentiated understanding of cross‐method
assessment. Journal of Personality Assessment, 67, 558–578. https://doi.org/10.1207/s15327752jpa6703_11
Meyer, G. J. (1997). On the integration of personality assessment methods: The Rorschach and MMPI. Journal of Personality
Assessment, 68, 297–330.
Meyer, G. J. (2004). The reliability and validity of the Rorschach and Thematic Apperception Test (TAT) compared to other
psychological and medical procedures: An analysis of systematically gathered evidence. In M. J. Hilsenroth, D. L. Segal,
& M. Hersen (Eds.), Comprehensive handbook of psychological assessment (Vol. 2, pp. 315–342). Hoboken, NJ: John Wiley &
Sons, Inc.
Mihura, J. L., Meyer, G. J., Dumitrascu, N., & Bombel, G. (2013). The validity of individual Rorschach variables: Systematic
reviews and meta‐analyses of the comprehensive system. Psychological Bulletin, 139, 548–605.
https://psycnet.apa.org/10.1037/a0029406
Mischel, W. (1968). Personality and assessment. New York, NY: John Wiley and Sons, Inc.
Morey, L. C. (1991). Personality Assessment Inventory professional manual. Odessa, FL: Psychological Assessment Resources.
Morey, L. C., & McCredie, M. N. (2018). Convergence between Rorschach and self‐report: A new look at some old
questions. Journal of Clinical Psychology, 75, 202–220. https://doi.org/10.1002/jclp.22701
Morgan, C. D., & Murray, H. A. (1935). A method for investigating fantasies: The thematic apperception test. Archives of
Neurology and Psychiatry, 34, 289–306.
Murray, H. A. (1938). Explorations in personality. New York, NY: Oxford University Press.
Murray, H. A. (1943). Thematic apperception test. Cambridge, MA: Harvard University.
Rawolle, M., Schultheiss, M., & Schultheiss, O. C. (2013). Relationships between implicit motives, self‐attributed motives,
and personal goal commitments. Frontiers in Psychology, 4, 1–7. https://doi.org/10.3389/fpsyg.2013.00923
Schultheiss, O. C., Yankova, D., Dirlikov, B., & Schad, D. J. (2009). Are implicit and explicit motive measures statistically
independent? A fair and balanced test using the Picture Story Exercise and a cue‐ and response‐matched questionnaire
measure. Journal of Personality Assessment, 91, 72–81. https://doi.org/10.1080/00223890802484456
Spangler, W. D. (1992). Validity of questionnaire and TAT measures of need for achievement: Two meta‐analyses.
Psychological Bulletin, 112, 140–154.
Stein, M. B., & Slavin‐Mulford, J. (2018). The Social Cognition and Object Relations Scale‐Global Rating Method (SCORS‐G): A
comprehensive guide for clinicians and researchers. New York, NY: Routledge.
Stein, M. B., Slavin‐Mulford, J., Siefert, C. J., Sinclair, S. J., Smith, M., Chung, W. J., & Blais, M. A. (2015). External validity of
SCORS‐G ratings of Thematic Apperception Test narratives in a sample of outpatients and inpatients. Rorschachiana,
36, 58–81.
Stein, M. B., Slavin‐Mulford, J., Sinclair, S. J., Chung, W. J., Roche, M., Denckla, C., & Blais, M. A. (2018). Extending the use of
the SCORS–G composite ratings in assessing level of personality organization. Journal of Personality Assessment, 100,
166–175. https://doi.org/10.1080/00223891.2016.1195394
Stein, M. B., Slavin‐Mulford, J., Sinclair, S. J., Siefert, C. J., & Blais, M. A. (2012). Exploring the construct validity of the Social
Cognition and Object Relations Scale in a clinical sample. Journal of Personality Assessment, 94, 533–540.
https://doi.org/10.1080/00223891.2012.668594
Teglasi, H. (2013). The scientific status of projective techniques as performance measures of personality. In D. H. Saklofske,
C. R. Reynolds, & V. L. Schwean (Eds.), The Oxford handbook of child psychological assessment (pp. 113–128). New York,
NY: Oxford University Press.
Thrash, T. M., & Elliot, A. J. (2002). Implicit and self‐attributed achievement motives: Concordance and predictive validity.
Journal of Personality, 70, 729–756. https://doi.org/10.1111/1467-6494.05022
How to cite this article: McCredie MN, Morey LC. Convergence between thematic apperception test and
self‐report: Another look at some old questions. J. Clin. Psychol. 2019;1–12.
https://doi.org/10.1002/jclp.22826