Handbook of Bilingualism Psycholinguistic Approaches (Judith F. Kroll, Annette M. B. de Groot)
Psycholinguistic Approaches
EDITED BY
JUDITH F. KROLL
ANNETTE M. B. DE GROOT
2005
Oxford University Press, Inc., publishes works that further
Oxford University's objective of excellence
in research, scholarship, and education.
Printed in the United States of America
on acid-free paper
Preface and Acknowledgments
As recently as 10 years ago, the topic of bilingualism was somewhat outside the mainstream of experimental cognitive psychology. There were many studies on disparate topics, but no systematic body of research that could be identified as constituting a clear focus within the field. In the time since, activity in this field has accelerated at a dizzying pace. There are now journals, a variety of books, international meetings, and cross-disciplinary graduate programs in psychology, linguistics, applied linguistics, and education, all dedicated to second language acquisition and bilingualism. In 1997, we edited a book, Tutorials in Bilingualism (Erlbaum), to provide students and researchers with overviews of the topics that we considered central to the emerging psycholinguistics of bilingualism. At the time, we could not possibly anticipate the rapid developments in this field that have occurred.

As we try to understand why interest in cognitive approaches to bilingualism has grown, we can point to the global economy, to the increasing multilingual presence in the United States and elsewhere where monolingualism was once the accepted norm, to debates regarding bilingual education, and to the introduction of exciting new methods for revealing brain activity during language processing. But, what we really believe is the main reason for this increased interest is that cognitive scientists have come to appreciate that learning and using more than one language is a natural circumstance of cognition. Not only does research on second language learning and bilingualism provide crucial evidence regarding the universality of cognitive principles, but it also provides an important tool for revealing constraints within the cognitive architecture.

The chapters in this book represent what we take to be the essence of the new psycholinguistics of bilingualism, one that is informed by developments in linguistics and neuroscience and that builds on the rigor of experimental cognitive science. As in any young field, there are some topics that garner more attention than others and some questions that historically have been underrepresented. It is our hope that the chapters in this volume will satisfy the interest of students who wish to learn about psycholinguistic approaches to bilingualism and at the same time encourage researchers across a range of fields to see that there are still many important questions yet to be answered.

We have had the good fortune of being colleagues and collaborators for 15 years. During this time, we have exchanged ideas and students, we have co-taught a course, visited each other's labs, and shared a special friendship. This book, like our previous edited volume, is a full and equal collaboration between us.

There are many people we wish to thank for their support in the process of compiling this volume. At the top of the list are the contributors; they were generous with their time, patient with us in the process of assembling a handbook of this length, and wrote outstanding reviews of the research in their respective areas. We thank Catharine Carlin, our editor at Oxford, who was extremely encouraging, incredibly patient, responsive to all of our questions; she made us feel throughout that the project was as exciting in the thick of revisions as on the first day it was proposed.

We have been fortunate to work with a wonderful group of students, visitors, and colleagues who spent time in our labs during this period and enriched our lives both professionally and personally. They include Teresa Bajo, Susan Bobb, Susanne Borgwaldt, Kate Cheng, Ingrid Christoffels, Philip Delmaar, Sara Hasson, Noriko Hoshino, Cristina Izura, April Jacobs, Nan Jiang, Rineke Keijzer, Martin van Leerdam, Jared Linck, Lorella Lotto, Pedro Macizo, Erica Michael, Natasha Miller, Maya Misra, Pilar Pinar, Petra Poelmans, Rik Poot, Carmen Ruiz, Mikel Santesteban, Beryl Schulpen, Ana Schwartz, Bianca Sumutka, Gretchen Sunderman, Natasha Tokowicz, Rosanne
van den Brink, Ellen van den Eijnden, and Zoa Wodniecka.

The quality of our intellectual lives has also been supported by a fantastic group of colleagues on both sides of the Atlantic; they made discussions about bilingualism a vibrant source of stimulation that has led to enduring collaborations. We especially thank Dorothee Chwilla, Albert Costa, Ton Dijkstra, Giuli Dussias, Chip Gerfen, David Green, Jan Hulstijn, Wido La Heij, Jaap Murre, Scott Payne, Nuria Sagarra, Janet van Hell, Vincent van Heuven, and Dan Weiss.

Finally, each of us would like to thank some special people in our lives. Judy would like to thank her parents, Ruth Kroll and Sol Kroll, who have always been a source of support; her twin daughters, Nora Kroll-Rosenbaum and Sarah Kroll-Rosenbaum, who know what it means to be on the team and how to make jokes about psycholinguistic models that might never occur to anyone else in the field; her sister Elise Kroll, who is the only real bilingual in the immediate family; and especially David Rosenbaum, her partner of 28 years, who understands that for a couple to have two careers is a bit like having two languages: they are always active to a high level, they compete, and somehow they manage to speak in a single voice that sustains them both. It is to them that she dedicates this effort.

Annette would like to thank her father, Johan de Groot, who at a very respectable age is still closely monitoring the well-being of each member of his large family; her son Jan, just for being the nice young man he is; her sisters Francis de Groot, Monique de Groot, Birgitte van den Elzen, and especially Marion de Groot, who over the years gradually filled the void that was left following the death of Annette's twin sister, Jeannette de Groot. It is to the memory of Jeannette and of her mother, Cher de Groot, that she dedicates this effort.
Contents
Contributors xi
I. Acquisition
Introduction to Part I: Acquisition 3
Nick C. Ellis
1. The Learning of Foreign Language Vocabulary 9
Annette M. B. de Groot and Janet G. van Hell
SYNTAX
BIOLOGICAL BASES
5. What Does the Critical Period Really Mean? 88
Robert DeKeyser and Jenifer Larson-Hall
6. Interpreting Age Effects in Second Language Acquisition 109
David Birdsong
7. Processing Constraints on L1 Transfer 128
Manfred Pienemann, Bruno Di Biase, Satomi Kawaguchi, and Gisela Hakansson
8. Models of Monolingual and Bilingual Language Acquisition 154
Jaap M. J. Murre
II. Comprehension
Introduction to Part II: Comprehension 173
Natasha Tokowicz and Charles A. Perfetti
9. Bilingual Visual Word Recognition and Lexical Access 179
Ton Dijkstra
10. Computational Models of Bilingual Comprehension 202
Michael S. C. Thomas and Walter J. B. van Heuven
COGNITIVE CONSEQUENCES
20. Consequences of Bilingualism for Cognitive Development 417
Ellen Bialystok
21. Bilingualism and Thought 433
Aneta Pavlenko
22. Simultaneous Interpreting: A Cognitive Perspective 454
Ingrid K. Christoffels and Annette M. B. de Groot
23. Clearing the Cobwebs From the Study of the Bilingual Brain: Converging Evidence
From Laterality and Electrophysiological Research 480
Rachel Hull and Jyotsna Vaid
24. What Can Functional Neuroimaging Tell Us About the Bilingual Brain? 497
Jubin Abutalebi, Stefano F. Cappa, and Daniela Perani
ACQUISITION
Nick C. Ellis
Introduction to Part I
Acquisition
picture association methods for learning foreign language vocabulary, and they evaluate their effects on receptive and productive learning, speed of access, and resistance to forgetting. Words vary on various dimensions, such as their concreteness; their morphological, phonological, and orthographic complexity; their frequency; and their cognate status. De Groot and Van Hell show that all of these factors affect the ease of learning a word and its eventual representation. Concrete words are easier to learn than abstract words, a result of their greater information content, richer representation, and greater opportunity for anchoring and retrieval. Word forms that are phonologically familiar to the learner are easier than those that sound more foreign. These two factors compound in making cognate words particularly easy to learn. De Groot and Van Hell analyze these effects in terms of their implications for the structure of the bilingual lexicon, that is, whether it is compound, coordinate, or subordinate, an issue considered in parts II and III of this volume as it applies to proficient bilingual representation.

Whatever the structure of the bilingual lexicon at fluency, at which point thousands of hours of contextualized L2 vocabulary use have ground direct connections between the L2 forms and their meanings, the evidence here suggests a word association organization of the low-proficiency learner by which the processing of L2 is mediated via the L1. Early L2 vocabulary acquisition is parasitic on L1 phonological representations, L1 conceptual representations, and L1 word-concept mappings; L2-L1 independence only comes as a result of considerable L2 experience.

Syntax

In chapter 2, De Houwer focuses on early bilingual acquisition of morphosyntax. In acquiring two languages from birth with parents who accord to the "one person, one language" principle, a situation referred to as BFLA, do children undergo a double acquisition process in which the two morphosyntactic systems are acquired in parallel as fundamentally independent closed systems (the Separate Development Hypothesis, SDH)? Alternatively, does BFLA produce a single hybrid, a "Mish-Mash" that results from systematic morphosyntactic influence of each language on the other? Research in the 1970s suggested the single-system hypothesis held, with children systematically applying the same syntactic rules to both languages.

De Houwer corrects this misapprehension. She begins with a clear methodological analysis of the types of evidence required to test the SDH, particularly that separate development must be evident for most of the comparable morphosyntactic structures in the child's speech that reflect differences in the input languages. She then reviews the majority of the longitudinal studies published in the last 15 years that have looked at morphosyntactic development in BFLA children. Her analysis of the speech productions of these 29 children between the ages of 1 and nearly 6 years, who together acquired 12 languages in 13 different combinations, showed that no child produced the sort of language repertoire that would be predicted to develop in bilingual children in line with a transfer theory. Young bilingual children reflect the structural possibilities of both languages of exposure and are able to produce utterances that are clearly relatable to each of their different languages; from very early on, the morphosyntactic development of the one language does not have any fundamental effect on the morphosyntactic development of the other.

In general, BFLA children's language-specific development within one language differs little from that of monolingual acquisition, except of course that bilingual children do it for two languages at a time. Equally, like adult bilinguals, young BFLA children are able to switch between languages very easily, either at utterance boundaries or within utterances. De Houwer also claims that there is no evidence that hearing two languages from birth leads to language delay. Empirical confirmation of the SDH entails that young bilingual children are keenly attuned to the specific linguistic environments in which they find themselves.

In chapter 3, MacWhinney considers SLA. In contrast to infant (B)FLA, L2 learners already know a great deal about the world, their brains are committed and entrenched in their L1, and they cannot rely on an intense system of social support from their caregivers. These differences have led some researchers to propose that SLA requires a totally separate understanding from L1A. Yet the many similarities of microprocess in first and second language acquisition and the fact that L2 learning is influenced by transfer from L1 mean that a model of SLA must take into account the acquisition and structure of L1.

For these reasons MacWhinney sketches the plan of a new unified model in which the mechanisms of L1 learning are seen as a subset of the mechanisms of L2 learning. This unified account builds on his earlier Competition Model, which maintains that the
learner's task is to learn the forms of language that serve as the most reliable cues to interpretation: in essence, trying to learn the probability distribution P(interpretation|cue, context), a mapping from form to meaning conditioned by context, with the different interpretations competing for realization in any particular context according to their cue strength. All language processing can be viewed thus as a set of competitive interactions driven by either auditory and formal cues in comprehension or functional cues in production.

The unified model supports this theory of cue validity by extending it here with additional theoretical constructs for dealing with cue cost and cue support. Cue cost relates to the salience of formal cues, particularly the forms that are not salient to the learner because of their expectations that have developed from their first language experience: These are aspects of learned selective attention resulting from transfer. To acquire these low-salience cues properly, L2 learners can support their implicit learning with additional cognitive mechanisms, such as combinatorial learning, chunking, and use of analogy in the acquisition of new linguistic constructions, mnemonics, and other metalinguistic knowledge and the use of social support strategies.

The unified model incorporates a grounded cognition, functional explanation of grammar as a set of devices that marks the flow of perspective across five cognitive domains: direct perception, space-time deixis, causal action, social roles, and belief systems. In these ways, MacWhinney links research in bilingualism to mainstream cognitive psychology and to cognitive and functional linguistics. All of these areas predict that there will be considerable transfer in SLA: Connectionism predicts it, spreading activation predicts it, the notion of "thinking for speaking" predicts it, and perceptual learning and interference theory predict it. MacWhinney reviews the factors that promote, and those that protect against, language transfer in phonology, lexicon, syntax, morphology, and pragmatics.

Phonology and Bilingualism

is betrayed by their nonnative accent; in listening, they often fail to perceive foreign sounds correctly. Sebastian-Galles and Bosch consider L1A, BFLA, and adult SLA of the range of systems of phonological representation. At 4.5 months of age, bilingual infants can separate their languages, recognizing when there is a switch from one to the other, even if they are rhythmically very similar. By 6 months old, monolingual infants show maternal language-specific phoneme perception behavior, and their sensitivity to nonnative phonetic contrasts declines during the first year of life.

Thus, acquisition reflects processes of perceptual reorganization that result from linguistic experience, with monolingual children's phonological system becoming perceptually tuned to categorize their native language optimally. For bilingual children, the outcome of these perceptual reorganization processes should result in two sound systems that correspond to the two languages of their experience. It does, but their perceptual learning takes longer: It is only by 14-21 months of age that bilinguals show evidence of categorizing stimuli in each of their two languages as do monolinguals.

If these discrimination capacities of BFLA children are temporarily delayed in comparison to monolinguals, this is nothing compared to the difficulties of second language learners when processing nonnative phonemes. Sebastian-Galles and Bosch review theories of why it is so difficult to perceive some foreign contrasts and why these difficulties are not universal, but depend on the first language of the listener: The ease or difficulty with which two phonemes will be discriminated depends on the similarities and differences between L1 and L2 phoneme systems. They then ask these same questions for bilingual acquisition of the perception of stress, phonotactics, and receptive and productive lexicons. The lexical activation studies addressed in this volume in chapters concerning adult bilingual comprehension, production, and control persuasively demonstrate that, even when placed in a totally monolingual mode, phonological input activates both of the bilingual's auditory lexicons. The acquisition and processing of phonology is riddled with transfer effects.
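
The Competition Model's idea, described earlier in this introduction, that interpretations compete for realization according to cue strength can be made concrete with a small sketch. Everything specific here is an assumption made for illustration: the cue names, the numeric strengths, and the rule of summing the strengths behind each interpretation and normalizing are hypothetical and are not taken from MacWhinney's chapter.

```python
def compete(cues):
    """Return the share of support each interpretation receives.

    `cues` is a list of (interpretation, strength) pairs: each cue present
    in the input votes for one interpretation with a given strength.
    Summing and normalizing is an illustrative assumption, not the
    Competition Model's own combination rule.
    """
    support = {}
    for interpretation, strength in cues:
        support[interpretation] = support.get(interpretation, 0.0) + strength
    total = sum(support.values())
    return {name: value / total for name, value in support.items()}


# Hypothetical sentence in which word order favors the first noun as agent
# while a weaker animacy cue favors the second noun.
cues_in_sentence = [
    ("noun1_is_agent", 0.9),  # assumed strength of a preverbal-position cue
    ("noun2_is_agent", 0.3),  # assumed strength of an animacy cue
]

for interpretation, share in compete(cues_in_sentence).items():
    print(interpretation, round(share, 2))
```

On this reading, transfer would correspond to an L2 learner starting out with cue strengths tuned to the L1, so that the same input yields a different winner until the strengths are relearned.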
younger learners. Both chapters agree on this. There is a large body of empirical evidence showing that age of acquisition (AoA) is strongly negatively correlated with ultimate second language proficiency, for grammar as well as for pronunciation. But, close scrutiny of this effect reveals a range of different interpretations, the implications of which are currently under debate in the literature.

In chapter 5, DeKeyser and Larson-Hall present a detailed review of the published results relating AoA and proficiency. These studies have tested speakers of a wide variety of languages and used a wide variety of testing formats and dependent variables, albeit with grammaticality judgments as the most common measure of morphosyntax and global pronunciation ratings the most common index of phonological proficiency. The large majority of these studies demonstrated substantial child-adult differences or strong correlations between AoA and L2 proficiency. L2 learners' performance in morphosyntax varied as a function of age more when grammaticality items were presented in oral rather than written form, and not all areas of the target language grammar were equally susceptible to age effects.

Despite these variations, DeKeyser and Larson-Hall argue that the evidence is doubtful that any person has learned a second language perfectly in adulthood, claiming that four studies showing overlap between adult and native acquirers for morphosyntax can be explained to result from methodological factors, and that the rare observations that some learners can achieve very high levels of nativelike pronunciation are limited to performance on constrained rather than spontaneous production tasks. Their subsequent analysis considers whether the age effect may be caused by confounded variables such as quantity and quality of input, amount of practice, level of motivation, and other social variables differentiating child and adult learners, but they discount the role of these confounds because these variables play a limited role when the effect of AoA is removed statistically; AoA maintains a large and significant role when the social and environmental variables are removed.

Despite their clear conclusion that there is a maturational decline in second language learning capacity during childhood, DeKeyser and Larson-Hall caution that it is important not to overinterpret the implications of this finding for educational practice. The observation that "earlier is better" only applies to certain kinds of naturalistic learning, which schools typically cannot provide. The implication of this research for education is that instruction should be adapted to the age of the learner, not that learners should necessarily be taught at a young age. If early language teaching is needed, it should be based on communicative input and interaction, whereas adolescents and adults need additional focus on form to aid explicit learning mechanisms, which at least some of them can substitute for implicit learning with a satisfactory degree of success.

In chapter 6, Birdsong subjects many of these same studies relating age and SLA to an equally admirable methodological scrutiny. He cautions that there is a constant need to assess independently the effects of length of residence (and consequent amount of L2 exposure) and AoA. But, his major critique concerns not the effect of age per se, but rather whether the reported relationships between AoA and attainment conform to a strict notion of a critical period. The orthodox conception of a critical period hypothesis is that there is a circumscribed developmental period before adulthood during which SLA is essentially guaranteed and after which mastery of an L2 is not attainable. Accordingly, there should be discontinuities in the function relating age and ultimate attainment. In particular, there should be an offset that coincides with the point at which full neurocognitive maturation is reached and after which no further age effects are predicted.

Birdsong's analysis of end-state SLA research reveals little congruence with these geometric and temporal features of critical periods. The geometry of the age function (its slope and any discontinuities), and temporal features of the age function (the points at which AoA begins to, and ceases to, correlate significantly with outcomes) vary from study to study, depending on such factors as the linguistic feature tested, amount of L2 use, and L1-L2 pairing. The general conclusion is that there is no apparent period within which age effects are observed, but rather that they persist indefinitely.

Birdsong also reviews these studies to determine whether there are any cases of nativelike attainment in late bilinguals. He concludes that this is possible in rare but nonnegligible frequencies, and that the 5% or greater incidence of nativelikeness in late bilinguals, which is roughly as predicted from the slope of the age function, is a substantial enough incidence to warrant rejection of a strong critical period hypothesis for SLA.
The Human Language Processor, Grammar, Transfer, and Acquisition

In chapter 7, Pienemann, Di Biase, Hakansson, and Kawaguchi describe processability theory (PT), a psycholinguistic analysis of the human language processor and its operation according to linguistic analyses using lexical-functional grammar, a unification grammar attractive in its typological and psychological plausibility. The basic logic underlying PT is that structural options are produced in the learner's interlanguage only if the necessary processing procedures are available. Language acquisition routes are thus constrained by the architecture of the human language processor because, for linguistic hypotheses to transform into executable processing skills, the processor needs to have the capacity for processing the structures relating to those hypotheses.

PT can be applied cross-linguistically to investigate the nature of the computational mechanisms involved in the processing and acquisition of different L1s. PT can also be used to analyze the interplay between L1 transfer and psycholinguistic constraints on L2 processability: It assumes that the initial state of the L2 does not necessarily equal the final state of the L1 because there is no guarantee that a given L1 structure is processable by the underdeveloped L2 parser. In other words, L1 transfer is constrained by the capacity of the language processor of the L2 learner irrespective of the typological distance between the two languages.

Pienemann et al. present a cross-linguistic survey of L1 transfer effects in SLA and demonstrate (a) that learners of closely related languages do not necessarily transfer grammatical features at the initial state even if these features are contained in both L1 and L2, providing the features are located higher up the processability hierarchy; (b) that such features are transferred when the interlanguage has developed the necessary processing prerequisites; and (c) that typological distance and differences in grammatical marking need not constitute a barrier to learning if the feature to be learned is processable at the given point in time. These findings strongly qualify theories that emphasize extensive L1 transfer at the initial state, and they demonstrate the ways that processability moderates L1 transfer.

In chapter 8, Murre reviews computational models of monolingual and bilingual acquisition. As is abundantly clear from the chapters preceding it, the ways by which exposure to tens of hundreds of hours of language input results in the mental representation of language are hugely complex. There are too many variables to hold in mind for a properly considered complete theory. Therefore, language researchers take recourse to computer modeling by which the test of the simulation is whether competences emerge that parallel those of human language learners exposed to similar input. In this way, the debate between deductive and inductive approaches to language acquisition is being rephrased in terms of well-articulated models and real-world data.

Murre reviews computational simulation research into language acquisition using subsymbolic inductive connectionist approaches. Such research demonstrates that, despite being very noisy and inconsistent, the nature of language input is nevertheless sufficient to support inductive mechanisms by which seemingly rulelike behavior emerges from a data-driven learning process. Examples are given from a variety of language domains (including stress assignment, phonology, past tense formation, localization, and certain aspects of semantics) using a variety of exemplar-based and connectionist architectures (including feedforward networks, simple recurrent networks, Hopfield nets, and Kohonen self-organizing maps for monolingual perceptual and semantic representation and a Self-Organizing Connectionist Model of Bilingual Processing) and a variety of theoretical frameworks (including latent semantic analysis, the Competition Model, the Interactive Activation Model and its bilingual extensions Bilingual Interactive Activation and Bilingual Model of Lexical Access, and the Bilingual Speech Learning Model).

Different aspects of language are best modeled using different architectures, a finding that accords well with the individualities outlined at the beginning of this introduction. Murre concludes that, compared to the thriving field of computational psycholinguistics and the developing subfields of models of language acquisition or models of bilingual processing, there are still very few models of bilingual language acquisition. Murre suggests a number of areas of bilingual acquisition ripe for simulation research.

As each of these chapters shows, we have come a long way in our understanding of these complex
issues. The most telling insight, which only becomes apparent from the compendium of a handbook like this, is what is seen from the alignment and comparison of what is currently known about these issues when taken together. It is clear that a true understanding can only come from the synthesis of these different questions and approaches. Three themes stand out in my mind in illustration.

The first is the age factor and how it engages aspects of interaction and contexts of acquisition, education, transfer, and brain. Although DeKeyser and Larson-Hall and Birdsong might disagree over continuity/discontinuity in the AoA/SLA end-state function and about the possibility of nativelike attainment in late bilinguals, they are in clear accord that SLA is less successful in older learners. There follows the question of why this should be, a single question that begs considerable further research. What are the brain mechanisms that underpin such loss of plasticity? Are they a function of age or increasing L1 entrenchment? What is the role of linguistic variables in determining the timing and shape features of the age function? What are the cognitive developmental factors relating to these differences, particularly those relating to implicit and explicit learning potential in adults and children? What are the implications for the promotion of multilingualism?

The second is second language processing (Pienemann et al.). We require a psycholinguistically plausible account of grammar, one with processing stages that are clearly specified, and one that can be applied to different languages in a principled way. We need a well-specified theory of the architecture of the human language processor. We need to understand how this processor develops and how new routines are acquired as a result of exposure to the linguistic evidence available from the input. We need to understand language typology and distance. We need to understand the interplay between language transfer and language-specific growth.

The third is transfer itself. These chapters clearly demonstrate linguistic transfer, most reliably regarding the acquisition of L2 phonology (Sebastian-Galles and Bosch; De Groot and Van Hell), but with examples spanning lexicon, orthography, syntax, and pragmatics. Hence, MacWhinney's general Competition Model dictum that "everything that can transfer will." But, there are situations that also seem to protect against transfer. BFLA seems to promote rapid language-specific morphosyntax acquisition to the standards expected of monolinguals, not some messy Mish-Mash. To what extent is it the FLA aspect of this equation that allows this success or the clear environmental cuing that comes from "one person, one language"? Is the acquisition of two separate syntactic systems really as rapid as the acquisition of just one? If so, why is this true for syntax (De Houwer), whereas the BFLA of phonology is somewhat delayed in comparison to monolingual acquisition (Sebastian-Galles and Bosch)?

Pienemann et al. similarly provide evidence of lack of transfer in L2 sentence production. Is it the case that transfer has its effects via selective attention, the way learners perceive the L2 input, and the hypotheses they generate about the second language, whereas the processing of the L2 rapidly becomes L2-content driven, with the modularity of the eventual L2 grammar driven by the combinatorial possibilities of L2 lexical forms and constructions and unsullied by cross-linguistic influence? Modular systems are implicit, the sorts of system that are well simulated by connectionist models (Murre). They are automatic in their inhibition of cross-linguistic competitors. How is this selective interference of a multilingual's other languages controlled? Consciousness unites, with the potential to pull together everything we know. To what extent is transfer an implicit learning phenomenon, and to what extent is it determined by explicit learning under attentional control? What are the cues that multilingual individuals use to determine which language is spoken, how are these mentally represented, and how do they function in language processing?

We have to know all of these things. The beginnings of answers to some of these questions are to be found elsewhere in this handbook, but in sum, only some. The further concerted efforts of individuals in cognitive neuroscience, linguistics, psychology, and education are required to fully appreciate the complex nature of bilingualism. It has been claimed that binary variables have properties of all other scales: In a paradoxical way, the two values meet requirements of nominal, ordinal, interval, and ratio scales. The evidence of this part shows that it is less of a stretch to claim that bilingual language acquisition has properties of first language acquisition and much more besides.
Annette M. B. de Groot
Janet G. van Hell
1
The Learning of Foreign
Language Vocabulary
ABSTRACT This chapter reviews experimental research into learning foreign language
(FL) vocabulary, focusing on direct methods of teaching, such as keyword mnemonics,
paired association learning (including rote rehearsal), and picture association learning.
We discuss the relative effectiveness of these methods, the constraints in using them,
and the way they interact with other factors, most notably the amount of experience a
learner has had with learning foreign languages. We review research that shows that
some types of words are easier to learn than others and discuss the reasons why this
might be so. We also discuss the important role that good phonological skills play in
successful FL vocabulary learning and review preliminary research that suggests that
background music may be beneficial for some FL learners but detrimental for others.
Finally, acknowledging the fact that FL learning via one of the direct methods discussed
only provides the starting point for FL word learning, we discuss more advanced stages
of the full-fledged learning process.
10 Acquisition
access to the words stored in the reader's mental lexicon is a prerequisite of fluent reading. If word recognition fails (because the word encountered is unknown to the reader or because it is known but cannot be accessed rapidly or automatically), reading comprehension breaks down. The reason is that, in the case of laborious, nonautomatic word recognition, precious attentional capacity (precious because only a limited amount of attentional capacity is available at any moment in time) has to be allocated to figuring out the word and its meaning, leaving too little of the remaining attentional capacity to be allocated to higher level processes, such as finding the antecedent for a pronoun.

On acknowledging the importance of vocabulary knowledge and fast access to and retrieval of this knowledge for fluent FL use, teachers and FL learners appear to face an immense and daunting task. A language contains many tens of thousands of words, far too many to teach and learn via a method of direct teaching. Moreover, for each word, ultimately seven types of information have to be learned: phonological and orthographic, syntactic, morphological, pragmatic, articulatory, idiomatic, and semantic information (Schreuder, 1987).

The majority of these words have multiple meanings. It has been suggested that the number of meanings per word amounts to 15 to 20, none of which, contrary to what is often thought, can be singled out as the word's basic or real meaning (Fries, 1945, in Boyd Zimmerman, 1997). Add to this the fact that word meanings are not stable but instead, just as a language's phonology, develop gradually over time (see Pavlenko, chapter 21, this volume), and it can easily be imagined that the teaching and learning of a full-fledged FL vocabulary is an impossible task that may discourage both teachers and learners of FL and direct their efforts to more manageable components of FL knowledge instead.

However, several studies indicated that familiarity with a relatively small, carefully selected, number of words suffices for adult language comprehension (Laufer, 1992; Nation, 1993; see Hazenberg & Hulstijn, 1996, for a review). Nation argued that a vocabulary of the 3,000 most frequent word families (about 5,000 lexical items; but see Bogaards, 2001) provides around 95% coverage of written texts in English, which should enable an adequate level of comprehension of these texts (but see Hazenberg & Hulstijn, 1996). This point of view has clear implications for FL learning: If the FL learner needs to attain an initial vocabulary of only a few thousand words, direct (explicit) vocabulary instruction becomes a feasible means of instruction. The remaining vocabulary can subsequently be learned implicitly, similar to the way native speakers and early bilinguals acquire vocabulary from an early age (e.g., Ellis, 1995) and through extensive reading in the FL.

This chapter focuses on research that has employed direct methods of FL vocabulary teaching (or, from the learner's viewpoint, on direct methods of FL vocabulary learning) in (primarily) experimental settings. The first section discusses the various methods used and their effectiveness and constraints. The next two sections focus on the differential learning effects that have been obtained with different types of words. A description of these word-type effects precedes a discussion of plausible theoretical explanations of their occurrence.

A considerable amount of recent research points at the importance of good phonological skills in vocabulary learning. This work constitutes the topic discussed in the next part of this chapter. It is followed by a section that shows that much more is involved in FL vocabulary learning than just storing the FL word's name in memory. The final two sections discuss, first, a topic of obvious pedagogical importance, namely, the beneficial or detrimental effects that background music may have on FL vocabulary learning and, second, a number of the causes of the large differences in FL vocabulary learning outcomes and learning ability that exist across studies and between groups of FL learners and individual FL learners.

Direct Methods of Learning Foreign Language Vocabulary

Keyword Mnemonics

A well-known, imagery-based instruction method for the learning of novel vocabulary, including FL vocabulary, is the keyword method. The keyword method is a mnemonic technique in which learning is divided into two steps. In the first step, one learns to associate the novel word (e.g., mariposa) to a keyword (e.g., marinade). A keyword is a word in the native language that looks or sounds like the novel word that must be learned. In the second step, the learner creates a mental image in which both the keyword and the first language (L1) translation (here butterfly) of the novel word interact (e.g., a butterfly swimming in the marinade). The keyword mnemonic thus establishes
both a form and a semantic connection (by means of the interactive image) between the novel word and its L1 translation. After learning, presentation of the novel FL word will elicit the keyword, which in turn will evoke the interactive image between the keyword and the novel word, after which the learner can produce the L1 translation.

The keyword method may seem a rather laborious procedure for learning FL vocabulary. Many studies have found, however, that the keyword method facilitates foreign vocabulary learning and enhances recall in comparison to rote rehearsal (in which the novel word and its L1 translation are subvocally repeated) and unstructured learning (in which learners may choose their own strategy; for reviews, see Cohen, 1987; Hulstijn, 1997; Pressley, Levin, & Delaney, 1982). Beneficial effects of the keyword method on learning and immediate recall of FL vocabulary have been obtained in a wide variety of languages, including Chinese (Wang & Thomas, 1992), English (Elhelou, 1994; Rodríguez & Sadoski, 2000), German (e.g., Desrochers, Wieland, & Cote, 1991), Russian (Atkinson & Raugh, 1975), and Tagalog (e.g., Wang, Thomas, & Ouellette, 1992).

The keyword method has been successful in a wide variety of settings, including laboratory experiments (as in Atkinson & Raugh, 1975) and studies in more natural settings, often a classroom (Levin, Pressley, McCormick, Miller, & Shriberg, 1979; Rodríguez & Sadoski, 2000). The method benefited FL vocabulary learning and recall of learners of various ages, ranging from children (e.g., Elhelou, 1994; Pressley, Levin, & Miller, 1981) to elderly learners (Gruneberg & Pascoe, 1996).

The keyword method's success can be illustrated by the classical study of Atkinson and Raugh (1975), which instigated a wealth of studies on keyword mnemonics. These authors had university students learn 120 Russian words on three consecutive days (40 words a day). The learners, all native speakers of English with no prior knowledge of Russian, received instructions to follow the keyword method or were instructed to use any learning method they wished. Atkinson and Raugh found that keyword learners outperformed the own-strategy learners on all recall tests.

A second striking example concerns a study by Beaton, Gruneberg, and Ellis (1995), who studied the 10-year retention of a FL vocabulary of 350 words learned by a 47-year-old university lecturer via the Linkword Italian course. In this course, subsequently published by Gruneberg (1987, in Beaton et al., 1995), the keyword method of vocabulary learning is integrated with basic grammar. After 10 years, without any use of Italian, this person remembered 35% of the previously learned FL vocabulary, and after 10 minutes of relearning, added an additional 93 words to the list of recalled words. Although the learner's performance in acquiring Italian could have been facilitated by his knowledge of other languages, including French, Spanish, German, and Greek, and long-term retention with other instruction methods has not been evaluated, the amount of vocabulary retained after so long is still remarkable.

Theoretical explanations of the benefits of the keyword method point toward an important role of imagery. According to the dual-coding theory of Paivio and colleagues (e.g., Paivio, 1986; Paivio & Desrochers, 1981), the keyword method enhances learning and recall because the method uses both the verbal system and the image system in human memory. During learning, both a verbal and an image code are encoded in memory. Assuming that these codes have additive effects, retrieval of the FL word is facilitated because there are two memory codes for the learning event, either of which can support recall. An alternative explanation was proposed by Marschark and his colleagues, who suggested that imaginal processing facilitates recall by increasing the relative relational value and distinctiveness of the information generated during learning (Marschark, Richman, Yuille, & Hunt, 1987; Marschark & Surian, 1989).

Although many studies reported positive effects of the use of keyword mnemonics in FL vocabulary learning, the findings of other studies suggested that the method may not be effective under all conditions. Questions that have been raised pertain to the long-term benefits of the keyword method and intentional versus incidental learning conditions, its usefulness for certain word types, the effects on retrieval speed, the benefits for experienced learners, and its usefulness for receptive and productive learning and recall. These findings potentially constrain and qualify the general applicability of this method. We discuss each of these topics next.

Durability of Memory Traces. In the majority of studies reporting long-term benefits of the keyword method, the delay interval between learning and testing is typically manipulated within subjects: Each subject is tested both on the immediate test and on subsequent delayed tests. In a series of studies, Wang and Thomas questioned the viability of this approach for measuring long-term effects of
the keyword method because the immediate test potentially provides an additional learning trial or allows testing the adequacy of retrieval paths (Wang & Thomas, 1992, 1995; Wang, Thomas, & Ouellette, 1992). They examined the long-term effectiveness of the keyword method by treating the delay interval as a between-subjects variable, testing some learners immediately after study and others only after a delay of several days. Their manipulation also changed the learning set from intentional learning instructions (in which the learners know in advance that their newly acquired knowledge will be tested after learning) to incidental learning instructions. Wang and Thomas convincingly showed that, under these conditions, long-term forgetting is greater for keyword learners than for rote learners (Wang et al., 1992; Wang & Thomas, 1992, 1995; but see Gruneberg, 1998). The poorer retention for keyword learners observed by Wang and Thomas may have surfaced because of the between-subjects manipulation, which prevented additional learning or retrieval rehearsal on the immediate test.

The Role of Word Type. A second potential constraint on the applicability of the keyword method concerns the diversity of the words presented in these studies. In most keyword studies, the FL vocabulary items are concrete words, referring to easily imaginable concepts. This sample of words does not represent adult vocabulary knowledge and language usage faithfully. Moreover, the exclusive use of concrete words may have overestimated the merits of the keyword method: Creating an interactive image between the keyword and the L1 equivalent of the novel FL word, a crucial step in the keyword method, is likely to be easier for concrete words (e.g., butterfly) than for abstract words (e.g., duty). Ellis (1995) even conjectured that the keyword method would be of little use in learning abstract vocabulary.

However, the few studies that explicitly tested the applicability of the keyword method to words that varied in imageability or concreteness did not seem to substantiate this idea (Delaney, 1978; Pressley et al., 1981; Van Hell & Candia Mahn, 1997; cf. Ellis & Beaton, 1993a). For example, Van Hell and Candia Mahn presented abstract and concrete FL words to keyword learners and rote learners. They found that concrete words were learned and remembered better than abstract words under rote rehearsal instructions (as is commonly found; see the word-type effects discussed in the next part of this section). However, the advantage of concrete words over abstract words was not notably larger under keyword instructions.

Another type of FL word that may be less suitable for learning via the keyword method is the cognate. Remember that the keyword is an L1 word that looks or sounds like the to-be-learned FL word. In learning cognates, for instance, for the Spanish word rosa, the most obvious keyword would be its translation, here rose. The keyword method thus seems an unnecessarily laborious and ineffective method for learning cognates, particularly considering the large advantage that cognates have over noncognates in the more straightforward learning methods of word association and picture association learning (see the detailed discussion of the role of word type in FL vocabulary learning).

Retrieval Speed. In the keyword literature, the benefits of learning are typically expressed in terms of the percentage or proportion of correctly recalled words, often measured in a cued recall task. In the cued recall task, one of the elements in a pair (the cue) is presented during testing, and the participant is asked to come up with the other element of the pair. In the cross-language variant of the cued recall task, as frequently applied in FL vocabulary learning studies, the cue is a word in one language, and the element to come up with is its translation in the other language; the cross-language version of the cued recall task is thus essentially a word translation task. The cued recall retrieval measure expressed as percentage of correctly recalled words is assumed to reflect the items successfully encoded in long-term memory during learning. However, as discussed in this chapter, fluent language use is determined not only by retrieval accuracy, but also by the speed with which a word can be retrieved from memory. Nearly three decades ago, Atkinson (1975) raised a similar point. He assumed that FL learning via the keyword method would not slow subsequent retrieval of the learned FL words as compared to methods in which word retrieval is less complex, like rote rehearsal.

Remarkably few studies, however, have examined the effect of keyword instruction on FL word retrieval speed (see Van Hell & Candia Mahn, 1997, and Wang & Thomas, 1999, for exceptions). In two experiments, Van Hell and Candia Mahn examined retrieval speed by comparing retrieval times of keyword and rote learners for newly learned FL words in a timed cued recall task. Performance was assessed in three tests: immediately after the learning phase, after a 1-week delay, and after a 2-week delay. In all tests, they observed
considerably shorter retrieval times for rote learners than for keyword learners (with the differences ranging between 452 and 966 ms). The faster retrieval times for rote learners were not compromised by poor recall performance. Rather, the proportion of correctly recalled words of rote learners was higher than (Experiment 1) or equal to (Experiment 2) that of the keyword learners. Wang and Thomas (1999) corroborated these results by measuring response times via a timed recognition task (treating the delay interval as a between-subjects factor).

Together, these findings showed that keyword learners need more time to retrieve the newly learned words from memory than rote learners do, suggesting that the retrieval of newly learned words may be slowed by the use of keyword mnemonics. Moreover, it appears that the keyword does not become superfluous, but is still used as a retrieval cue well after learning (cf. Atkinson, 1975). This may impede an important goal of FL learning, namely, the attainment of verbal fluency.

The Role of Experience in Foreign Language Learning. A fourth factor that may constrain the applicability and suitability of the keyword method concerns the learner's amount of FL learning experience. In the majority of keyword studies, the participants were inexperienced FL learners. Studies using more advanced learners suggested that these learners may benefit less from keyword mnemonics than inexperienced learners do. Levin et al. (1979), Moore and Surber (1992), and Hogben and Lawson (1994) used learners who had followed FL classes for at least a year and observed that the typical beneficial effects of keyword mnemonics were less robust with more advanced learners of the target language. These findings were extended by Van Hell and Candia Mahn (1997) to another group of experienced learners, namely, multilingual language users with a considerable amount of experience in learning FL vocabulary (i.e., in English, French, and German), but who had no prior knowledge of the target language, Spanish. In these learners, keyword instructions were less effective than rote rehearsal instructions in both immediate and delayed recall.

These studies suggested that keyword mnemonics are relatively ineffective in experienced FL learners, both advanced learners of the target language and inexperienced learners of the target language who had experience with learning a number of other FLs. Apparently, there is no single most effective way of FL vocabulary learning, but a particular type of learner benefits most from a particular learning method. (Another experimental result that substantiates this claim is presented in the section The Effect of Background Music on Learning Foreign Language Vocabulary.)

Direction of Testing. Another factor that may qualify the benefits of the keyword method concerns the direction of recall. Most keyword studies have used a receptive cued recall task in which the newly learned FL word is presented and the L1 translation must be produced; this task corresponds to backward word translation (see, e.g., De Groot, Dannenburg, & Van Hell, 1994). The reverse task, productive cued recall (or forward translation), is used less frequently. Ellis and Beaton (1993a) found that keyword mnemonics are effective for receptive recall, but less effective than rote rehearsal instructions for productive recall.

In conclusion, numerous studies reported the beneficial effect of using keyword mnemonics in FL vocabulary learning. Yet, a drawback of the method is that it seems to impede word retrieval after learning, and that its success is constrained by a number of factors, including the learner's experience with FL learning and the type of words to be learned. One of the learning methods discussed in the next section, the word association method, does not suffer from these constraints.

Paired Associate Learning

Two other common methods used in FL vocabulary learning studies are versions of a general learning method that has been used in verbal learning and memory research for decades, namely, the so-called paired associate paradigm. In studies employing this method, pairs of stimuli are presented during learning. At testing, the cued recall task is often employed; one of the elements in a pair (the cue) is presented, and the participant is asked to come up with the second element of the pair. Alternatively, whole pairs are presented at testing that were or were not presented as such during learning, and the participants are asked to indicate whether the presented stimulus pair is old (presented during learning) or new (not presented during learning; recognition). The stimuli as complete pairs, and the separate elements within a pair, may vary on many dimensions, such as the modality of presentation (e.g., auditory or visual) and the nature of the stimuli. Line drawings of
common objects or the objects themselves, nonsense shapes, words of various grammatical categories, nonsense combinations of letters, single letters, numerals, and, indeed, foreign words have been used as stimulus materials in paired associate studies (see Runquist, 1966, for an early description of the essentials of the method).

The two versions of this general paradigm that have often been used in FL vocabulary learning research are the word association and picture association methods. In the word association method, the paired associates presented during learning are two words, one a native language word and the second its translation in the target FL. The FL words to be learned may be actual words in a natural language or invented, artificial words that do not occur as such in any natural language. In the latter case, the FL word to be learned may be a letter sequence that is formed according to the orthographic and phonological systems of the learner's native language but that carries no meaning (a pseudoword) or an orthographically or phonologically illegal letter string that does not follow the orthographic or phonological rule systems of the learner's native language (a nonword). In the picture association method, one of the elements in the study pairs is the targeted FL word and the second is a picture (or a line drawing) depicting the referent of this word. Typically, in both these methods the words are presented visually, but in word association (and for the FL words in the picture association condition), auditory presentation is a feasible alternative as well and may indeed sometimes be the only option (when the learners are illiterate).

The term word association method is used here to stress the fact that, in this method, two words are paired in each learning trial. The term is neutral with respect to the exact learning strategy the participants actually use. Often, no specific instructions regarding which strategy to adopt are given to the participants, a learning setting that is also referred to as unstructured learning. Under these circumstances, learners report the use of various learning strategies (e.g., associating the two words in the pair; rehearsing them silently; detecting similarities between the words in a pair; forming mental images of the words; constructing sentences containing the words in the pair; inventing memory aids; De Groot & Van den Brink, 2004); different participants in the same experiment may use different strategies, but individual participants may also replace a strategy employed early in the learning episode with a new strategy. In other studies, the instructions are somewhat more specific. For instance, in studies employing the rote learning technique, the participants are instructed to rehearse and memorize the presented materials silently (this is how the term was employed above).2

Of the two paired associate learning methods, the word association technique can be applied more widely than the picture-word association method. As pointed out, the success and applicability of the keyword method, although effective in many circumstances, is constrained by a number of factors. One of these is the fact that the method is not optimally suited for the learning of abstract words and is unsuitable for learning cognates. The picture association technique suffers from one of these constraints as well and to an even larger extent than the keyword method: Whereas with some effort it is possible to employ the keyword method in learning abstract words (Van Hell & Candia Mahn, 1997), it is virtually impossible to depict abstract words, which by definition cannot be experienced by the senses, including the eye. (Unlike the keyword method, there is no restriction to limit the picture association method to noncognates.)

The word association method does not suffer from any of these constraints; it can be used, and indeed has been used, to study the learning of concrete and abstract words and cognates and noncognates (and frequent and infrequent words, but this variable also does not constitute a constraint for the picture association and keyword methods). The pertinent studies and the effects found are discussed in the section on word type effects.

Why then, if its applicability is restricted to the study of only a subset of words in a language, is the picture association method used at all? An important reason presumably is that it lends itself rather naturally to study vocabulary learning in young children because the method closely resembles a common form of L1 vocabulary acquisition in these children, namely, the association of a word with the corresponding object in the child's environment. Experimental data collected by Wimer and Lambert (1959) suggested that this association of the to-be-learned FL word with environmental objects and events is a relatively effective FL vocabulary learning method for adult learners as well, but a more recent study (Lotto & De Groot, 1998) refuted this claim (see the section Individual Differences in Learning Foreign Language Vocabulary for details).

When the picture-word association method is used with very young children, it can only be exploited in an auditory form (presenting a picture
with the spoken form of its FL name) because these children will typically still be illiterate. Whereas visual presentation of the FL word is an option for young children who have just passed the very initial stages of learning to read, it is not a recommended mode of presentation for this learner group either. The reason is that, for these children, word reading has not been automatized yet and therefore coming up with the correct sound structure of the visually presented words (via the written forms) often constitutes a real challenge to them. This cognitive limitation cannot be ignored in studies of vocabulary acquisition because it is a well-established fact that generating the phonological forms of visually presented words by means of overt or subvocal speech is an essential component of successful vocabulary acquisition (see The Role of Phonology in Foreign Language Vocabulary Learning section).

Learning Words in Context

In the FL vocabulary learning methods discussed above (i.e., keyword learning, rote rehearsal, word association learning, and picture association learning), the newly learned words are presented in highly impoverished contexts. Language users, including FL learners, typically perform in contextually richer situations. This evokes the idea that an FL word may be better learned in a larger, more meaningful linguistic context like a sentence. In the field of FL vocabulary learning studies using direct instruction methods, the question whether such learning is more effective using restrictive contexts, as in the studies discussed above, or using a larger linguistic context has received relatively little empirical attention (but see, e.g., Moore & Surber, 1992; Prince, 1996). One prerequisite of learning FL vocabulary in an FL sentence context is that the FL learners have a basic level of knowledge of the FL that is at least sufficient to understand the sentence context.

Prince (1996) examined more advanced FL learners who had studied the FL (English) for 5 to 8 years and instructed them to learn new FL words in either a sentence context condition or a word association condition. He found that more words were recalled with word association than with sentence context instructions. It should be noted, however, that recall of the relatively weak learners (but not of the more advanced learners) in the word association condition was notably poorer when measured via a sentence completion task than via a cued recall task. This finding suggests that FL learners may differ in the extent to which they can successfully transfer new vocabulary learned via contextually restricted methods (here via word association) to more meaningful and contextually richer FL situations.

Word-Type Effects

Word-Type Effects on Learning

Words vary on a number of dimensions. For instance, words may refer to concrete objects or to abstract entities (the variable concreteness); they may share (a large part of their) visual or auditory form with their translation in another language (cognate status); they may be used often or rather sparsely in speech and writing (frequency); they may be morphologically simple or complex (morphological complexity) or may differ in structural complexity for other reasons (e.g., they may contain more or less-complex consonant clusters).

The effect of some of these variables, most notably concreteness, cognate status, and word frequency, has been studied frequently in bilingual representation studies, which focus on the way translation pairs are represented in bilingual memory (e.g., as compound, coordinate, or subordinate structures in the words of Weinreich, 1953/1974, or as word-association or concept-mediation structures in the terminology of Potter, So, Von Eckardt, and Feldman, 1984; see De Groot, 1993; Kroll, 1993; and Kroll & Tokowicz, chapter 26, this volume, for reviews). The tasks most commonly employed in these studies are word translation (e.g., De Groot et al., 1994), word association (e.g., Kolers, 1963; Van Hell & De Groot, 1998a), and semantic priming across languages (e.g., De Groot & Nas, 1991; Keatley, Spinks, & De Gelder, 1994).

In contrast to the bilingual representation studies, relatively few FL vocabulary learning studies have manipulated word-type variables, even though doing so is likely to provide relevant information on the learning process and the ensuing memory representations. Furthermore, results of such studies may inform FL curricula, especially the sequencing of the vocabulary to be learned by the students (e.g., Meara, 1993).

A plausible reason why only a few of these learning studies varied word type is that typically the word set presented for learning in these studies consisted of rather few words, too few to contain a
sufficiently large number of each type (e.g., con- Compared to the effects of word concreteness and
crete noncognates) to obtain reliable effects of cognate status, the effect of this variable is not
the variables concerned. For instance, studies by robust. If it occurs at all in a particular study, it is
Cheung (1996), Papagno, Valentine, and Baddeley rather small (effects of 3% to 7% in De Groot &
(1991), and Wimer and Lambert (1959) presented Keijzer, 2000; De Groot & Van den Brink, 2004;
only three, eight, and nine words, respectively, for and Lotto & De Groot, 1998), and in two of these
which an FL word was to be learned. studies (De Groot & Keijzer, 2000; De Groot &
As the representation studies, the few FL vocab- Van den Brink, 2004), this small effect (with better
ulary learning studies that manipulated word type performance for high-frequency words than for low-
showed reliable effects of two of the above variables: frequency words) was attributable to a subset of the
word concreteness (De Groot & Keijzer, 2000; De items only.
Groot & Van den Brink, 2004; Ellis & Beaton, The FL vocabulary learning studies discussed
1993b; Service & Craik, 1993; Van Hell & Candia in this section employed different methods of FL
Mahn, 1997) and cognate status (De Groot & learning. As mentioned, Van Hell and Candia Mahn
Keijzer, 2000; Ellis & Beaton, 1993b; Kroll, Mi- (1997) contrasted the keyword method and rote
chael, & Sankaranarayanan, 1998; Lotto & De rehearsal; De Groot and Keijzer (2000) and De
Groot, 1998). For some of these studies, namely, Groot and Van den Brink (2004) used the word
those that have employed an orthogonal (not a association technique; and Kroll et al. (1998) and
correlational) design, it is possible to determine the Lotto and De Groot (1998) contrasted the word as-
actual size of the effects. These analyses show that sociation and picture association methods. Maybe
the effects are substantial: Across the relevant stud- the most noteworthy word-type effect reported in
ies, the magnitude of the concreteness effects varies these studies combined is the nding by Kroll et al.
between 11% and 27%, meaning that the recall and Lotto and De Groot that an effect of cognate
scores are from 11% to 27% higher for concrete status not only materialized in the word association
words than for abstract words (De Groot & Keijzer, condition, but also in the picture association con-
2000; De Groot & Van den Brink, 2004; Van Hell & dition. What is more, the cognate effect was equally
Candia Mahn, 1997). Similarly, the magnitude of large in these two conditions. The reason to qualify
the effect of cognate status varies between 15% and this nding as noteworthy is that it is generally as-
19% when highly experienced FL learners were the sumed that the form relation between translation
participants in the vocabulary learning studies (De equivalent terms underlies the effects of cognate
Groot & Keijzer, 2000; Lotto & De Groot, 1998). status in both representation and learning studies.
When less-experienced FL learners served as par- But of course, a word and a picture representing this
ticipants, the cognate effect even appears to be word do not share any form similarity.
substantially larger (about 25% in a receptive test- The effect of cognate status in the picture-
ing condition and about 50% in a productive testing learning condition thus suggested that the presen-
condition; Kroll et al., 1998, p. 383). tation of a picture activates the corresponding L1
Acknowledging the fact that uent use of a FL word form (Lotto & De Groot, 1998, pp. 5859),
not only requires that FL knowledge (here, the and that the learner then recognizes the similarity
knowledge of FL vocabulary) is stored in memory, between the generated L1 word form and the to-be-
but also that this knowledge is accessed and re- learned FL word form accompanying the picture.
trieved rapidly (see also the section on keyword This awareness then somehow (see the section
mnemonics), the ve studies that employed an or- Cognate Status for more detail) facilitates the
thogonal design measured retrieval times as well. learning of the new form. In theory, the form con-
The results of these analyses generally converged cerned could be phonological, orthographic, or both
with the analyses on the recall scores, although because the two elements within the cognate pairs
fewer of the effects were statistically signicant. used in these studies are typically similar both in
But, whenever a signicant effect occurred, its di- spelling and in phonology, and the learners recog-
rection strengthened the conclusions drawn from nition of either type of relationship might facilitate
the analyses of the recall scores. That is, responses learning. Lotto and De Groot, however, argued that
to concrete words and cognates were generally the forms involved presumably are the phonological
faster than those to abstract words and non- forms (see the original reference for details). Fur-
cognates, respectively. thermore, they noted that such a conclusion ts in
A third variable that has been manipulated in nicely with the results of a number of related studies
some of the above studies is word frequency. that all suggested an important role for phonology
Learning Foreign Language Vocabulary 17
The second explanation is in terms of the differential informational density of memory representations for concrete and abstract words within an amodal, monolithic memory system (De Groot, 1989; Kieras, 1978; Van Hell & De Groot, 1998b; Van Hell & Sjarbaini, 2004). Within this framework, the memory representations of concrete and abstract words are only assumed to differ quantitatively, not qualitatively: Those of concrete words are assumed to contain more information elements than those of abstract words (see De Groot, 1989, for experimental support). Again, this allows more anchoring opportunities in the case of learning a FL word form for concrete L1 words. Lotto and De Groot (1998) proposed this same explanation for the (relatively small) frequency effect in FL vocabulary learning that has sometimes (but not reliably) been obtained.

This explanation of the concreteness effects in FL vocabulary learning cannot account for the analogous effects in L1 vocabulary acquisition by toddlers. The reason is that the former effects result from differences in memory structures for concrete and abstract words that presumably reflect the outcome, not the beginning, of the L1 acquisition process. At the onset of L1 vocabulary acquisition, representations are not likely to exist in memory for either concrete words or abstract words; in other words, at that stage concrete and abstract words do not differ with respect to their memory representations; the buildup of memory information for both types of words presumably starts from scratch. A plausible explanation for the concreteness effect in L1 vocabulary acquisition was already provided above: Only the acquisition of concrete words, not that of abstract words, is supported by the perceptual presence of these words' referents in the child's environment.

Cognate Status

Lotto and De Groot (1998) and De Groot and Keijzer (2000) suggested three possible sources for the superior FL vocabulary learning performance for cognates, considering both the learning stage (storage) and the testing stage (retrieval) as possible loci of the effect. The first explanation extends a view of bilingual memory representation that assumes shared representations for cognates, but language-specific representations for noncognates (Kirsner, Lalor, & Hird, 1993; Sanchez-Casas, Davis, & García-Albea, 1992; see also Sanchez-Casas & García-Albea, chapter 11, this volume). In fact, a cognate relation between two words is considered a special case of a morphological relation that may exist between words within the same language and that is reflected in the joint storage of morphologically related words in memory. According to this view, bilingual memory, just as monolingual memory, is organized by morphology, not by language. For instance, a French-English bilingual has one memory representation containing both the English words marry, marriage, and married and the French words marier and mariage (Kirsner et al., 1993). If true, the learning of a FL word that shares a noncognate relation with the corresponding L1 word involves creating a new entry in memory, whereas learning a cognate word may only involve adding new information to, or adapting, a representation already stored there prior to the learning episode. The latter process may be less demanding than the former, causing the learning advantage of cognates over noncognates.

A second possible cause for the cognate advantage is that in the case of learning a FL cognate, which shares form with its translation, less has to be learned than when a noncognate FL word has to be learned. Finally, because of the form overlap between cognate translations and the absence of such overlap in the case of noncognates, when a cognate is presented as the testing stimulus, it will constitute a strong cue for the retrieval of its translation equivalent in the target language. These three suggested causes of the effects of cognate status do not have to be mutually exclusive, but may all contribute to the effect.

Word-Type Dependent Effects on Forgetting

The differential forgetting of concrete words and cognates on the one hand and abstract words and noncognates on the other suggests that, in terms of Atkinson (1972), immediately after training abstract words and noncognates are in a T (for temporal) state relatively often. This means that the newly learned word is only known temporarily, and that subsequent learning of other words will cause interference, causing forgetting of the previously known word. The second state Atkinson distinguishes is a P (permanent) state for newly learned words that have gained a permanent status in memory immediately after training. The data suggest that concrete words and cognates have reached a P state relatively often at the conclusion of the training phase. A third possible state that
words presented for learning can be in, and that abstract words and noncognates are in relatively often immediately after training, is the U (unknown) state. Of course, distinguishing between these three retention states only concerns a rephrasal of the effects obtained, not an explanation. A true explanation may ultimately be provided in terms, again, of differential memory representations for different types of words (e.g., being embedded in a denser representation and, as such, being linked to a relatively large number of information elements in memory might render a newly learned FL word relatively immune to forgetting).

The Role of Phonology in Foreign Language Vocabulary Learning

The cognate effect observed in the picture association learning condition in the work of Lotto and De Groot (1998) and Kroll et al. (1998) suggested that participants generated the names of the presented pictures during learning (see "Word-Type Effects on Learning"). This was regarded as support for the view that phonology plays an important role in FL vocabulary learning. Gathercole and Thorn (1998) reviewed the relevant literature and provided overwhelming support from various sources for this view.

For instance, Papagno et al. (1991) showed that an experimental technique called articulatory suppression disrupts the learning of FL vocabulary (although suppression had little effect on meaningful paired-associate learning in L1). The articulatory suppression technique involves the repeated uttering of a sound (e.g., bla) while learning the paired associates consisting of, say, an L1 word and its FL translation. Suppression interferes with the phonological recoding of visually presented items, thus preventing their short-term phonological storage. Furthermore, suppression interferes with subvocal rehearsal, a process that is deemed necessary for transfer from short-term memory into long-term memory.

Service (1992), in a 3-year longitudinal study of Finnish children learning English as a FL, showed a close relationship between the children's ability at the start of the program to repeat presented pseudowords and their grades in English at the end of the program. Subsequent work (Service & Kohonen, 1995) suggested that this relationship was mediated by English vocabulary knowledge. Pseudoword repetition is assumed to involve phonological memory, and the level of accuracy at which the task is performed is thought to reflect phonological-memory skills and capacity. Therefore, these data also suggest a relation between phonological memory and FL vocabulary learning. This conclusion is strengthened further by neuropsychological evidence: Baddeley et al. (1988) showed that their patient P. V., who had a reduced phonological store capacity, was unable to repeat back pseudowords longer than three syllables and to learn auditorily presented pseudowords paired with real words.

The important role of phonology in FL vocabulary learning is further supported by studies using experienced FL learners. Papagno and Vallar (1995) observed that polyglots performed better than nonpolyglots in phonological memory tasks and in FL paired associate learning, suggesting a relation between phonological-memory capacity and FL vocabulary learning.

Van Hell and Candia Mahn (1997) observed that experienced FL language learners benefited more from rote rehearsal learning than from keyword learning. They proposed that subvocal rehearsal of the FL word and its translation activates phonological codes, and that experienced learners in particular benefit from using phonological information in learning novel FL words. Specifically, experienced FL learners not only may have better phonological memory skills (as suggested by Papagno and Vallar's 1995 study), but also may possess more refined long-term knowledge of phonological structures. For example, the experienced FL learners in Van Hell and Candia Mahn's study had all learned the subtle, yet important, differences in the pronunciation of the cognate hotel across the Dutch, English, French, and German languages. This fine-grained and broad repertoire of phonological knowledge, along with better phonological memory skills, may make experienced FL learners more receptive to the phonological information novel FL vocabulary contains and may thus guide and facilitate the learning of novel FL words.

Finally, the typicality of the FL words to be learned affects their learning; that is, if the sound structure of the to-be-learned words conforms to the phonotactic rules of the learner's native language, learning is more successful than when phonotactically alien FL words are presented for learning. Gathercole, Martin, and Hitch (in Gathercole & Thorn, 1998) varied the nonwords in word–nonword pairs on wordlikeness (in terms of sound structure) and demonstrated that more wordlike nonwords than non-wordlike nonwords were learned. Similarly, immediately after learning,
word forms or, later, directly would imply the use of a strong semantic accent; the reason is that translation equivalents seldom share all aspects of their meaning: The meaning aspects specific to the word in L1 would be implied when using its L2 (second language) equivalent (see MacWhinney, chapter 3, this volume, for other types of L1 transfer in FL learning). Highly technical words possibly constitute the only exception to the apparent rule that the meanings of a word and its closest translation do not overlap perfectly (Fries, 1945, in Boyd Zimmerman, 1997, p. 11), although for particular classes of words (concrete words) the overlap in meaning between the two languages is larger than for other classes (abstract words; emotion words). For this reason, De Groot (1992; see also Van Hell & De Groot, 1998a) proposed the distributed feature model of bilingual lexical representation as an alternative to the more common localist models. In this model, word meaning is represented in memory as a set of semantic features, some of which are shared between a pair of translations, whereas others are unique to either the L1 word or the FL word. Translations of concrete words share more of these semantic features than translations of abstract words (see Kroll & Tokowicz, chapter 26, this volume, for further details).

Furthermore, assigning a FL word the meaning of its translation equivalent entails the flawed assumption that a word has only one meaning, whereas the truth is that words typically have many different meanings (some claim from 15 to 20 in English; Fries, 1945, in Boyd Zimmerman, 1997, p. 11), some of which are related, but others apparently are unique. Which of a word's many meanings should be assigned to it when it is encountered in speech or reading depends on the context of use. This plethora of meanings and shades of meaning words may have and the context dependence of word meaning have frustrated the attempts by many to obtain exact definitions of words and have led others to accept the view that "word meanings cannot be pinned down, as if they were dead insects. Instead, they flutter around elusively like live butterflies. Or perhaps they should be likened to fish which slither out of one's grasp" (Aitchison, 1987, p. 40). Or, in the words of Labov (1973, in Aitchison, 1987): "Words have often been called slippery customers, and many scholars have been distressed by their tendency to shift their meanings and slide out from under any single definition" (p. 40). In keyword mnemonics, word association learning, and picture association learning, only one of this plethora of meanings is singled out (either by the stimulus itself, e.g., the picture of a mug, or by the learner), leaving all remaining meanings of the FL word yet to be learned through other means.

Insight into learning the meaning of words in more advanced FL vocabulary learning was provided by Bogaards (2001). He studied the learning of new meanings for known words and for combinations of known words in learners of French, all native Dutch speakers, who were in their fourth year of learning this FL in high school. The results of this study (see the original reference for details) suggest that both previously learned word forms and word meanings may promote the learning of new meanings for familiar forms and expressions comprised of familiar forms.

In sum, for ultimate use of a FL word in a nativelike way, the FL word form must provide access to meaning and be retrieved from conceptual representations directly, bypassing the form representation of its L1 translation. The meaning that is initially associated with the FL word (the meaning of its L1 translation) must gradually be narrowed (to get rid of the unique L1 meaning parts), extended (to also cover the unique L2 meaning parts or be used in multiword expressions) and refined such that it covers all of its FL meanings and captures the specific connotations of each.

Needless to say, gaining such a detailed level of FL vocabulary knowledge requires extensive practice of the FL words in contexts varied enough to acquaint the learner with the finesse of all their meanings. Apart from extended immersion in an environment in which the FL is the dominant language, only extensive reading in that language is likely to provide that outcome. The initial, flimsy representations set up via the direct instruction methods discussed here provide no more than the means to bootstrap into this time-consuming learning process, but as such are extremely valuable.

The Effect of Background Music on Learning Foreign Language Vocabulary

When performing cognitively demanding tasks, some people prefer a quiet environment, claiming to be hindered by noise, including music, whereas others seem not to be bothered by a certain noise level or even prefer (a particular type of) background music while performing the task, claiming to perform better under those circumstances. This observation, if confirmed and understood in
rigorous research, has obvious pedagogical implications as it might, for instance, inform teachers about how to create the optimal learning environment in the classroom and advise students with respect to the most effective circumstances to do their homework. Of course, the potential impact of well-controlled studies into this topic reaches far beyond the classroom because cognition is involved in the majority (if not all) tasks to be performed by humans, even tasks performed automatically most of the time.

Acknowledging its potential importance, the effect of background music (and other types of noise ignored in the present discussion) on task performance has been a topic of study by several groups of researchers, most notably applied psychologists, cognitive psychologists, and personality psychologists. The applied psychologists among these researchers primarily tried to find out whether music affects workers' satisfaction and morale or their productivity at work. The cognitive psychologists' goal was to look at ways in which music affects attention and processing in various tasks. The personality psychologists' focus was on the way music and different musical styles interact with individual differences in personality. See Furnham and Allass (1999) and Furnham and Bradley (1997) for a historical overview of this work.

The role of background music in learning has also received the attention of teachers and educators with an interest in a field of study carrying the esoteric name of Suggestopedia, a name based on a teaching method thus dubbed and introduced in Bulgaria by Lozanov (1978, in Felix, 1993). The innovative element this learning method introduced in the classroom was the systematic use of music in the instruction process. Especially, classical baroque music was thought to support the learning process. Felix (1993) reviewed the pertinent studies and concluded that positive effects of music played during learning have been reported for vocabulary learning and reading performance; that effects of music played during testing do not consistently occur; and that playing the same music during both learning and testing leads to the best achievement. The latter finding exemplifies the well-known phenomenon of context-dependent memory, that is, that test performance is better the more similar the circumstances under which testing occurs are to the circumstances present while learning (e.g., Godden & Baddeley, 1975).

De Groot and Van den Brink (2004) looked at the effect of background music on learning FL words (which in fact were pronounceable and nonpronounceable nonwords) for a set of Dutch words. The participants were all drawn from the same population of relatively experienced FL learners. Half of them learned the FL words in silence; the other half learned them while part of the Brandenburg Concerto by J. S. Bach was playing in the background. During testing, no music was played to either group of participants. The results were promising, but not in all respects conclusive: The recall scores were higher (by 8.7%) in the music condition than in the silent condition, but this effect only generalized over items, not over participants. This finding suggests that only a subset of the participants in the music condition benefited from the presence of background music. It also suggests that the remaining participants in this condition also were not hindered by it because otherwise an overall null effect of the music manipulation might have been expected.

Studies by Furnham and Bradley (1997) and Furnham and Allass (1999) hinted at an exciting explanation of why the effect of the music manipulation did not generalize over participants. Inspired by Eysenck's (1967) theory that introverts and extraverts differ in their levels of cortical arousal, they predicted that background music might have a detrimental effect on cognitive task performance in introverts, but a beneficial effect on such performance in extraverts. Manipulating this personality trait, Furnham and Allass observed that introverts performed substantially better in the silent condition than in the (pop) music condition in a reading comprehension task and a recall task, whereas for extraverts exactly the opposite pattern of results was obtained. The detrimental effect of music for the introverts was larger in a condition in which the music played was complex than in a condition in which it was simpler. Again, this pattern reversed for the extraverts.

Furnham and Bradley (1997) also demonstrated an interaction between the introvert/extravert variable and the music variable on two cognitive tests, one a reading comprehension test and the second a memory test, and Daoussis and McKelvie (1986) showed a similar interaction in a study looking at reading comprehension. The results of the last two studies differed from those of Furnham and Allass (1999) in that music had a detrimental effect on the cognitive performance of introverts, whereas extraverts appeared immune to the effects of the music manipulation. But, all three studies converge on the same conclusion: The introvert/extravert personality trait plays an important role in the effects of background music on cognitive performance.
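
The notion of an interaction between the personality variable and the music variable can be made concrete as a difference of differences over cell means. The sketch below is illustrative only: the percentages are invented for the example and are not taken from Furnham and Allass (1999), Furnham and Bradley (1997), Daoussis and McKelvie (1986), or any other study, and the names CellMeans, music_effect, interaction_contrast, and overall_music_effect are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellMeans:
    """Mean percent correct per cell of a 2 (personality) x 2 (music) design.

    All values used below are invented for illustration only.
    """
    introvert_silence: float
    introvert_music: float
    extravert_silence: float
    extravert_music: float


def music_effect(cells: CellMeans, group: str) -> float:
    """Music score minus silence score for one group."""
    if group == "introvert":
        return cells.introvert_music - cells.introvert_silence
    if group == "extravert":
        return cells.extravert_music - cells.extravert_silence
    raise ValueError(f"unknown group: {group}")


def interaction_contrast(cells: CellMeans) -> float:
    """Difference of differences: extravert music effect minus introvert music effect."""
    return music_effect(cells, "extravert") - music_effect(cells, "introvert")


def overall_music_effect(cells: CellMeans) -> float:
    """Music effect averaged over both groups (assumes equal group sizes)."""
    return (music_effect(cells, "introvert") + music_effect(cells, "extravert")) / 2


# Hypothetical crossover pattern: music hurts introverts and helps extraverts.
crossover = CellMeans(70.0, 60.0, 60.0, 70.0)
# Hypothetical one-sided pattern: music hurts introverts, extraverts are unaffected.
one_sided = CellMeans(70.0, 60.0, 65.0, 65.0)

for label, cells in (("crossover", crossover), ("one-sided", one_sided)):
    print(label, interaction_contrast(cells), overall_music_effect(cells))
```

In both hypothetical patterns the interaction contrast is nonzero, whereas the overall music effect is zero in the first case and negative in the second. An effect averaged over all participants can therefore hide or distort effects that differ by personality type.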
The authors of the three studies just discussed all turned to Eysenck (1967) to account for this intriguing interaction between the introvert/extravert personality trait and the presentation of music during learning. Eysenck posited that introverts have a lower neurological threshold of arousal and therefore experience greater arousal in response to lower-intensity stimulation than extraverts; this results in introverts' satisfaction at relatively low levels of stimulation. It was posited that in introverts optimum performance is reached at moderate levels of arousal. In contrast, extraverts require relatively high levels of arousal for optimal performance (Furnham & Allass, 1999, pp. 28–29).

Presumably without awareness of this alleged underlying physiological cause, introverts and extraverts are apparently aware of the effect of background music on their study success because extraverts claim to play background music more often while studying than introverts (Daoussis & McKelvie, 1986; Furnham & Bradley, 1997).

This account of music effects on learning provides a possible explanation for the above finding by De Groot and Van den Brink (2004) that the effect of the music manipulation did not generalize over all participants. In that study, the introvert/extravert personality trait was not taken into account, and the participant sample most likely included both introverts and extraverts. The extraverts may have benefited from background music, causing the overall higher recall scores in this condition. The fact that a net positive effect of background music was obtained suggests that the introverts were neither helped nor hindered by background music.

The role of a number of other factors that may affect music's effect on learning success, such as music preference (see Etaugh & Michals, 1975, who studied the effect of this variable on reading comprehension), vocal versus nonvocal music (Belsham & Harman, 1977), and musical styles (e.g., classical, jazz, and popular; Sogin, 1988), is still largely unknown. The evident pedagogical implications of filling this knowledge gap on creating optimal learning environments warrant increased research efforts devoted to unraveling the relevant variables and their interactions.

Individual Differences in Learning Foreign Language Vocabulary

At various points in the preceding sections, we alluded to the existence of individual differences in the learning of FL vocabulary, both differences between learner groups and differences within groups of learners. For instance, it was pointed out that advanced (experienced) learners of a particular target language benefit less from keyword mnemonics than less-advanced (inexperienced) learners of that language do (e.g., Moore & Surber, 1992), and that for multilingual language users, who have considerable experience with learning FLs, rote rehearsal is a more effective learning method than keyword mnemonics is (Van Hell & Candia Mahn, 1997). Lotto and De Groot (1998) obtained a similar result: They showed that multilingual language users, sampled from the same population as the participants in Van Hell and Candia Mahn's study, learned more FL vocabulary when a word association method was used than when the picture association method was employed.

In contrast, Wimer and Lambert (1959), comparing word association learning with object association learning (in which the word to be learned is paired with an object rather than a picture of that object), obtained better recall performance with object association than with word association. They concluded that environmental events are more effective stimuli for the acquisition of foreign-language responses than are native-language equivalents for the new words, at least for the learning of a simple, basic vocabulary (p. 35). The results of Lotto and De Groot (1997) and (if imaging objects plays the same role in learning as actual objects or pictures of actual objects do) those of Moore and Surber (1992) and Van Hell and Candia Mahn (1997) suggest that this conclusion does not hold for all groups of learners. Possibly, the participants in Wimer and Lambert's study were relatively inexperienced FL learners. If so, this combined set of studies would suggest that learner group and learning method interact such that, for experienced FL learners, the word association technique (or rote rehearsal, as one particular implementation of this technique) is more effective than learning techniques that employ the visual (imagined or actual) analogues of the FL words to be learned, and that for less-experienced learners the opposite holds.

The results of Kroll et al. (1998; Experiment 1) that, just as Lotto and De Groot (1998) contrasted word association and picture association learning, provide some direct support for this suggestion: Whereas Lotto and De Groot, testing experienced FL learners, obtained better results overall with word association learning than with picture association learning (82% correct for word association learning vs. 77% correct for picture association
learning; only productive testing was employed), FL vocabulary learning (Baddeley et al., 1998; Pa-
Kroll et al., who tested less-experienced language pagno & Vallar, 1995; see also Michael & Gollan,
learners, obtained the opposite pattern of results chapter 19, this volume). As we have seen, pho-
(78.5% and 39.5% correct for word association nological coding appears to play an important role
learning in receptive and productive testing con- in transferring newly learned words from transient
ditions, respectively, vs. 82% and 42% for these memory stores into permanent memory, and the
testing conditions, respectively, following picture presence of fine-grained phonological knowledge in
association learning; all data collapsed across a test long-term memory may increase the learner's re-
condition that tested with picture stimuli and one ceptiveness to subtle phonological differences in
that tested with word stimuli). That the partici- the learning material.
pants in Kroll et al.'s study were less-experienced Baddeley et al. (1998) suggested that the pho-
learners than those of Lotto and De Groot is nological loop function differs between individuals,
strongly suggested by the far lower learning scores and that gifted language learners are characterized
in the productive testing condition in the work of by an excellent such function. The amount and
Kroll et al. than in that of Lotto and De Groot. subtlety of phonological information in memory
Furthermore, to achieve an overall recognition is obviously a function of the amount of language
accuracy of 70% in the (relatively easy) receptive experience, native and foreign, a learner has, so
testing condition, the data of only half of the par- that ultimately language learning experience may
ticipants (45 of 99) could be included in the ana- underlie (a substantial part of) the effects of pho-
lyses (see Kroll et al., 1998, pp. 379 and 381). In nological skills on FL language learning. It remains
Lotto and De Groot (1998), to achieve at least to be seen whether, if all other things (such as lan-
60% accuracy in the (relatively hard) productive guage learning experience) are equal, a thing such
testing condition (the only condition that they as talent for learning FLs can still be identied.
tested), only 8 of the 64 participants tested had to
be removed from the analyses (p. 43).
The amount of FL learning experience is un-
likely to be the only variable that interacts with the Conclusion
specics of the learning environment. That other
factors may be relevant as well was implicit in our This review of studies on FL vocabulary learning
discussion of the effect of background music on has highlighted some of the factors that need to
learning FL words. As shown, the relevant litera- be taken into account to gain a complete under-
ture suggests that the personality trait introversion/ standing of successful learning performance; it has
extraversion interacts with a role of background only briey touched on, or even completely ig-
music. We hypothesized that the pattern of results nored, other factors. For instance, much attention
obtained by De Groot and Van den Brink (2004), was devoted to contrasting the various direct FL
who tested experienced FL learners exclusively, vocabulary learning methods and pointing out their
emerged from an interaction between this person- limitations and the ways they interact with learner
ality trait and the music manipulation. If that characteristics such as FL learning experience
analysis is correct, the results of that study indicate and phonological skills. Similarly, the fact that
that FL learning experience is only one of the fac- various word characteristics determine the success
tors that determine what the optimal learning cir- of learning FL equivalents for L1 words and the way
cumstances are. In other words, the effects of FL these effects can be explained were discussed at
learning experience and background music both length.
suggest that there is no single optimal procedure of We also reviewed at some level of detail the re-
learning FL vocabulary, but that instead the opti- search that tries to resolve the dispute regarding the
mal procedure depends on learner characteristics. role that background music may play in FL vocab-
Different learners may benet most from different ulary learning. Finally, some discussion was devoted
circumstances, and the same learner may benet to the later stages of FL vocabulary acquisition, in
most from different circumstances at different which the newly learned FL words are functionally
stages of learning. detached from their L1 counterparts, and their
Differences in phonological knowledge and meaning representations gradually develop toward
processes and other aspects of working memory, those of L1 users of the FL concerned.
such as working memory capacity, were mentioned Other aspects of FL vocabulary learning re-
as yet another source of individual differences in ceived little or no attention, for instance, the role of
proximity of the to-be-learned FL to the learner's L1. This issue was only briefly touched on in the
discussion of the effect of word typicality on learning performance. The larger the distance between
L1 and the FL to be learned, the more FL word forms to be learned will be atypical for the learner,
the more alien the meanings of the FL words will be to the learner, and the more mapping problems
between elements in the L1 and the FL the FL learner will encounter. FL vocabulary learning studies
that test a FL similar to the learner's L1 (or that test the learning of pseudowords, which by
definition have phonological forms akin to the learner's L1) may overestimate learning performance as
compared to testing more distant FLs. Such effects of language proximity/distance warrant a more
thorough discussion than received here.

A further neglected topic concerns the large difference in performance that is typically obtained
between productive and receptive testing conditions, with receptive testing producing better
results. Mention was made of these two ways of testing newly learned FL vocabulary, but without
providing theoretical accounts of this effect (see De Groot & Keijzer, 2000, pp. 43–45, for a
discussion). Finally, hardly anything has been said on the crucial differences between late FL
vocabulary learning, which, albeit implicitly, was the topic of the present discussion, and early
bilingual vocabulary acquisition (see De Houwer, chapter 2, this volume). These learning processes
differ crucially because, in early bilingual vocabulary acquisition, as in L1 vocabulary
acquisition, the acquisition of word form and word meaning proceed in parallel, whereas in late FL
vocabulary learning, a meaning for the new word to be learned is already in place (although it
requires adjustment; see the section Freeing and Fine-Tuning the Newly Learned Foreign Language
Words). Future reviews of studies on FL vocabulary learning might shift the focus to these and other
issues neglected here.

Notes

1. A foreign language is a language that is not a native language in a country. In North America,
foreign language and second language are often used interchangeably in this sense. In British usage,
a distinction between the two is often made, such that a foreign language is a language taught in
school but not used as a medium of instruction in school, nor is it a language of communication
within a country (e.g., English in France). In contrast, a second language is a language that is not
a native language in the country, but is widely used as a medium of communication (e.g., in
education and government) and is used alongside another language or languages (e.g., English in
Nigeria). In both Britain and North America, the term second language describes the native language
in a country as learned by immigrants who have another first language (Longman Dictionary of
Language Teaching and Applied Linguistics). In this chapter, we consistently use the term foreign
language (FL) to cover all these usages, although most of the studies described concern the learning
of a FL in experimental settings by learners whose native language is the dominant (and only
official) language in the country where they live.

2. Note that the term word association learning should not be confused with the word association
technique often employed in semantic memory research, in which the structure of semantic memory is
revealed by presenting participants with words they know, and they are asked to provide the first
word they think of after they are given a stimulus word.

References

Aitchison, J. (1987). Words in the mind: An introduction to the mental lexicon. Oxford, UK: Basil Blackwell.
Altarriba, J., & Mathis, K. M. (1997). Conceptual and lexical development in second language acquisition. Journal of Memory and Language, 36, 550–568.
Atkinson, R. C. (1972). Optimizing the learning of a second-language vocabulary. Journal of Experimental Psychology, 96, 124–129.
Atkinson, R. C. (1975). Mnemotechnics in second-language learning. American Psychologist, 30, 821–828.
Atkinson, R. C., & Raugh, M. R. (1975). An application of the mnemonic keyword method to the acquisition of a Russian vocabulary. Journal of Experimental Psychology: Human Learning and Memory, 104, 126–133.
Baddeley, A. D., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105, 158–173.
Baddeley, A. D., Papagno, C., & Vallar, G. (1988). When long-term learning depends on short-term storage. Journal of Memory and Language, 27, 586–595.
Bahrick, H. P., & Phelps, E. (1987). Retention of Spanish vocabulary over 8 years. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 344–349.
Beaton, A., Gruneberg, M., & Ellis, N. (1995). Retention of foreign vocabulary learned using the keyword method: A 10-year follow-up. Second Language Research, 11, 112–120.
Belsham, R. L., & Harman, D. W. (1977). Effect of vocal versus non-vocal music on visual recall. Perceptual and Motor Skills, 44, 857–858.
Bogaards, P. (2001). Lexical units and the learning of foreign language vocabulary. Studies in Second Language Acquisition, 23, 321–343.
Boyd Zimmerman, C. (1997). Historical trends in second language vocabulary instruction. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition (pp. 5–19). Cambridge, England: Cambridge University Press.
Brown, R. W. (1957). Linguistic determinism and the part of speech. Journal of Abnormal and Social Psychology, 55, 1–5.
Chen, H.-C., & Leung, Y.-S. (1989). Patterns of lexical processing in a nonnative language. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 316–325.
Cheung, H. (1996). Nonword span as a unique predictor of second-language vocabulary learning. Developmental Psychology, 32, 867–873.
Cohen, A. D. (1987). The use of verbal and imagery mnemonics in second-language vocabulary learning. Studies in Second Language Acquisition, 9, 43–62.
Daoussis, L., & McKelvie, S. J. (1986). Musical preferences and effects of music on a reading comprehension test for extraverts and introverts. Perceptual and Motor Skills, 62, 283–289.
De Groot, A. M. B. (1989). Representational aspects of word imageability and word frequency as assessed through word association. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 824–845.
De Groot, A. M. B. (1992). Bilingual lexical representation: A closer look at conceptual representations. In R. Frost & L. Katz (Eds.), Orthography, phonology, morphology, and meaning (pp. 389–412). Amsterdam: Elsevier Science.
De Groot, A. M. B. (1993). Word-type effects in bilingual processing tasks: Support for a mixed-representational system. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 27–51). Amsterdam: Benjamins.
De Groot, A. M. B., Dannenburg, L., & Van Hell, J. G. (1994). Forward and backward word translation by bilinguals. Journal of Memory and Language, 33, 600–629.
De Groot, A. M. B., & Keijzer, R. (2000). What is hard to learn is easy to forget: The roles of word concreteness, cognate status, and word frequency in foreign language vocabulary learning and forgetting. Language Learning, 50, 1–56.
De Groot, A. M. B., & Nas, G. L. J. (1991). Lexical representation of cognates and noncognates in compound bilinguals. Journal of Memory and Language, 30, 90–123.
De Groot, A. M. B., & Poot, R. (1997). Word translation at three levels of proficiency in a second language: The ubiquitous involvement of conceptual memory. Language Learning, 47, 215–264.
De Groot, A. M. B., & Van den Brink, R. (2004). Effects of background music, word concreteness, word frequency, and word typicality on learning foreign language vocabulary. Manuscript in preparation.
Delaney, H. D. (1978). Interaction of individual differences with visual and verbal elaboration instructions. Journal of Educational Psychology, 70, 306–318.
Desrochers, A., Wieland, L. D., & Cote, M. (1991). Instructional effects in the use of the mnemonic keyword method for learning German nouns and their grammatical gender. Applied Cognitive Psychology, 5, 19–36.
Elhelou, M. W. A. (1994). Arab children's use of the keyword method to learn English vocabulary words. Educational Research, 36, 295–302.
Ellis, N. C. (1995). The psychology of foreign language vocabulary acquisition: Implications for CALL. Computer Assisted Language Learning, 8, 103–128.
Ellis, N., & Beaton, A. (1993a). Factors affecting the learning of foreign language vocabulary: Imagery keyword mediators and phonological short-term memory. Quarterly Journal of Experimental Psychology, 46A, 533–558.
Ellis, N. C., & Beaton, A. (1993b). Psycholinguistic determinants of foreign language vocabulary learning. Language Learning, 43, 559–617.
Ellis, N. C., & Sinclair, S. G. (1996). Working memory in the acquisition of vocabulary and syntax: Putting language in good order. Quarterly Journal of Experimental Psychology, 49A, 234–250.
Etaugh, C., & Michals, D. (1975). Effects on reading comprehension of preferred music and frequency of studying to music. Perceptual and Motor Skills, 41, 553–554.
Eysenck, H. (1967). The biological basis of personality. Springfield, IL: Thomas.
Felix, U. (1993). The contribution of background music to the enhancement of learning in suggestopedia: A critical review of the literature. Journal of the Society for Accelerative Learning and Teaching, 18, 277–303.
Furnham, A., & Allass, K. (1999). The influence of musical distraction of varying complexity on the cognitive performance of extraverts and
Wang, A. Y., & Thomas, M. H. (1992). The effect of imagery-based mnemonics on the long-term retention of Chinese characters. Language Learning, 42, 359–376.
Wang, A. Y., & Thomas, M. H. (1995). Effect of keywords on long-term retention: Help or hindrance? Journal of Educational Psychology, 87, 468–475.
Wang, A. Y., & Thomas, M. H. (1999). In defence of keyword experiments: A reply to Gruneberg's commentary. Applied Cognitive Psychology, 13, 283–287.
Wang, A. Y., Thomas, M. H., & Ouellette, J. A. (1992). Keyword mnemonic and retention of second-language vocabulary words. Journal of Educational Psychology, 84, 520–528.
Weinreich, U. (1974). Languages in contact: Findings and problems. The Hague, The Netherlands: Mouton. (Original work published 1953)
Wimer, C. C., & Lambert, W. E. (1959). The differential effects of word and object stimuli on the learning of paired associates. Journal of Experimental Psychology, 57, 31–36.
Annick De Houwer
2
Early Bilingual Acquisition
Focus on Morphosyntax and the
Separate Development Hypothesis
contrast, were concerned with the acquisition of two first languages (L1s), as it were, that is, with
cases for which the children in the study heard two languages from birth (and continued to do so at
least until the time of study).

The present overview chapter also focuses on the acquisition of two languages from birth. Children
who hear two languages from birth are undergoing a process of what Meisel (1989) called bilingual
first language acquisition (BFLA; see also De Houwer, 1990). In BFLA, there is no second language in
the chronological sense. It thus makes no sense to speak of an L1 or an L2. To refer to the two
languages that play a role in BFLA, I use the terms language A and language Alpha (terminology
borrowed from Wolck, 1984). This does not necessarily imply that both these languages need be on the
same footing; that is, they need not be used in equal proportion or with equal frequency or
regularity. Rather, the terms here refer to the input languages and specify that both input
languages start to be used in regular communication with the child at the same time in development
(viz., from birth or very soon afterward).

This chapter, then, reviews recent studies of children under the age of 6 years exposed to two
spoken languages from birth who continued to hear these languages fairly regularly and frequently
until the time of data collection (for a rare study that focused on the bilingual development of a
signed and a spoken language in young children, see van den Bogaerde & Baker, 2002). Studies of
children who have been regularly addressed in three or even more languages from birth do not feature
in this review: So far, none appear to have been published (see Quay, 2001). There are, however,
studies of children acquiring two languages from birth who start hearing a third language regularly
once they are just a little older (see, e.g., Quay, 2001; Widdicombe, 1997).

I define bilingual input as dual-language input consisting mainly of substantial numbers of
utterances that both lexically and structurally belong to one language only. Mixed utterances (i.e.,
utterances containing morphemes and/or lexemes from two languages) may account for some of the input
as well. Even if the people in the child's environment address the child mainly in either of two
languages and thus follow the "one person, one language" strategy (Ronjat, 1913), they will
occasionally use mixed utterances. However, if a child hears nothing but mixed utterances, as might
be the case in some so-called bilingual communities, I would argue that the child is not exposed to
two languages from birth, but rather to one. After all, all the people interacting with the child
would be using the same types of utterances, regardless of whether linguists could describe these as
consisting of elements from two languages (cf. mixed languages in the sense of, e.g., Bakker, 1992).
Bilingual input as understood here involves variation between strictly unilingual utterances in at
least two languages, but will in most cases include mixed utterances as well.

I only refer to children acquiring varieties of what are commonly seen as distinct languages rather
than a standard language and a regional variety of that same language, although the actual formal
differences between them may in fact be similar to those between two different languages. The
overview focuses on aspects of language production as it can be observed and recorded in
naturalistic interactional settings. All studies mentioned here concern children growing up without
any known handicaps or language learning problems.

A Frame of Reference for Studying Morphosyntactic Development in Young Bilinguals

In modern studies of monolingual acquisition, morphosyntactic development continues to be the most
frequently investigated area of research. The same is true for recent work on bilingual acquisition.
For instance, in an article giving an overview of many different aspects of BFLA published since
1985 (De Houwer, 1999b), 35 of the 64 original research articles or book chapters cited concerned
morphosyntax; the 29 remaining texts were spread out over six other major research topics (i.e., the
role of the input, the lexicon, phonological development, the use of mixed utterances, and language
choice; I discuss all these aspects, in addition to morphosyntactic development, in De Houwer,
1995b, as well).

As the field of bilingual acquisition research grows and flourishes (see, e.g., the volume edited by
Cenoz & Genesee, 2001), more and more different topics are under investigation. Nevertheless, more
is currently known about morphosyntactic development in bilingual children than about any other area
of language functioning. This justifies the primary focus here on morphosyntactic issues in early
bilingual development.

In the field of language acquisition research, there have for a long time been divergent views
regarding the role and status of morphosyntactic categories in early language development and when
it makes sense to use morphosyntactic categories for describing children's early language
productions. The controversies focus mainly on what is commonly termed the two-word stage (compare,
e.g., Lieven, Pine, & Baldwin, 1997, and Vihman, 1999). New lines of research in developmental
psycholinguistics that focus on transitions and connections between different kinds of knowledge
(phonological, lexical, morphological, and syntactic) hold great promise for greater insight into
the roots of morphosyntactic development (see the contributions in Weissenborn & Hohle, 2001).

For the purposes of the discussion in this chapter, I consider morphosyntactic development in
production to be evident once a child growing up bilingually has begun to produce utterances
containing at least three clause constituents or two-word utterances containing at least one bound
morpheme, whichever comes first. This is not to imply that from this point on children have an
awareness or abstract knowledge of the morphosyntactic categories they are using, and I do not mean
to imply that no such knowledge is available prior to this (as, e.g., Golinkoff, Hirsh-Pasek, &
Schweisguth, 2001, have suggested, it is quite possible that children as young as 18 months have a
representation of some morphological categories well before they use these categories in
production).

Space does not permit an extensive explanation, but I believe that the fairly conservative position
taken here strikes a reasonable balance between overestimating and underestimating a child's
grammatical skills. At the same time, it takes into consideration the huge typological differences
between different languages as far as their reliance on constituent order versus bound morphology is
concerned. Clearly, my position here excludes the one-word stage as a relevant focus of interest for
a discussion of morphosyntactic development. This corresponds to what appears to be a consensus in
the field of language acquisition in general: Morphosyntactic analyses of single-word utterances
when children are still in the one-word stage are conspicuous by their absence (it is acknowledged,
however, that at the one-word stage, precursors of bound morphology may be already present; see,
e.g., Peters, 1983).

For the so-called two-word stage (which may be very drawn out or so brief it is hardly noticeable),
there is less consensus (see above), but my proposal for bilingual data here is in line both with
Meisel's (1994) reluctance to see children's early two-word utterances as exhibiting syntactic
properties and Deuchar and Quay's (2000, pp. 82–83) view that later two-word utterances that show
morphological markings are in principle analyzable in morphosyntactic terms. Once children produce
a large proportion of multiword utterances, child language researchers seem to agree that it is
fully appropriate to describe their language use in morphosyntactic terms.

The Relationship Between a Child's Two Developing Languages:
The Status of the Separate Development Hypothesis

Ronjat (1913) was not only the first to publish an empirical study on a bilingual individual's
language use, but was also the first to formulate generalizations regarding the relationship between
a young bilingual child's two languages. In addition, Ronjat was the first to address, based on
empirical data, the issue of the relationship between a bilingual speaker's two languages.

It is this relationship between bilingual children's two languages that continues to be in the
limelight in bilingual acquisition studies today. Basically, the question is to what extent and at
what point in overall development a bilingual child's two separate input languages are processed as
two independent systems. As researchers develop more sophisticated tools to investigate bilingual
infants' perceptual capabilities and earliest vocalizations (see, e.g., Bosch & Sebastian-Galles,
2001; Poulin-Dubois & Goodz, 2001), this question may finally have a chance of being answered.
However, both the methodological and the analytical problems are quite formidable and have led some
researchers to question whether in fact it will be possible to address fully the issue for
children's very earliest stages of linguistic development. In particular, determining whether
bilingual children's early phonologies develop as separate systems or not is quite a daunting task
(Johnson & Lancaster, 1998; cf. also Deuchar & Quay, 2000, p. 111).

Earlier publications strongly defended either the Independent Development Hypothesis (e.g., Bergman,
1976), which claims that from the very beginning of language development infants who were hearing
two languages from birth develop two independent systems, or, alternatively, they strongly
supported the one hybrid system interpretation
(e.g., Leopold, 1939–1949/1970, Vol. 2, p. 206; Volterra & Taeschner, 1978, p. 312), which posits
an initial processing of two input languages as one hybrid system. Both these opposing points of
view made their claims regarding all basic levels of language functioning (i.e., phonology, lexicon,
morphosyntax). Within the hybrid system view, it then became crucial to try to explain just how
children did in fact eventually manage to differentiate between their languages (see, e.g., Arnberg
& Arnberg, 1992). Today, researchers are fortunately much more aware of the methodological and
theoretical complexities involved in explaining the very earliest stages of bilingual development
and understandably reluctant to make definitive claims.

For the development of morphosyntax in production, however, the issue of the extent to which
bilingual children speak like the people acting as models for their two input languages is in
principle much more amenable to investigation. Once children start showing clear signs of
morphosyntactic development in production, which typically occurs around their second birthday (cf.
the previous section outlining a frame of reference for studying morphosyntactic development), their
phonologies tend to be more stable, and the huge problems of identifying language sources for
children's vocalizations start to decrease steadily. It comes as no surprise, then, that many
studies of language development in toddlers who grow up with two languages from birth have given a
lot of attention to the relationship between children's developing morphosyntactic systems.

On the basis of an in-depth case study of a Dutch-English bilingual child, Kate, I proposed the
Separate Development Hypothesis (SDH), which states that children regularly exposed to two languages
from birth according to the "one person, one language" principle develop two distinct
morphosyntactic systems in that the morphosyntactic development of the one language does not have
any fundamental effect on the morphosyntactic development of the other (De Houwer, 1990, p. 66). At
the time, there were only a few published studies that provided empirical support for the SDH (or
the Differentiation Hypothesis, as Meisel, 2001, termed it), and the Kate study was the first to
address the issue based on a very wide variety of morphosyntactic phenomena as present in the speech
of one and the same child.

The 1990s saw an explosion of other studies providing additional support for the SDH. The fact that
there is also a study (Deuchar & Quay, 2000) that supports the SDH even though the subject of this
study did not quite hear her languages according to the "one person, one language" principle
suggests that the input condition that is part of my original formulation of the SDH may in fact not
be necessary. However, more studies are needed to investigate this issue. Also, I know of no studies
that have explored young children's language development under mainly mixed conditions (i.e., when
children heard most of the people in their environment speak two languages to them).

Because of the potentially very large role of input conditions, it is too early, then, to generalize
the SDH to all children growing up with two languages from birth (see also De Houwer, 1990). At the
same time, I know of no study that clearly shows evidence against the SDH. There appears to be a
broad consensus among researchers today that the Separate Development Hypothesis accurately
characterizes the basic process of morphosyntactic development in young bilingual children (see also
Meisel, 2001, p. 16).

In the conclusion to my 1990 monograph, I speculated on the reasons that make separate development
possible. One basic reason must be that young children pay very close attention to the variable
nature of the input. Without at least this, it would appear impossible for young bilingual children
to produce utterances that are clearly relatable to each of their input languages.

Given the existence of widely available earlier and in-depth reviews (Meisel, 1989; De Houwer, 1990,
pp. 36–47, and 1995b; Lanza, 1997b; Deuchar & Quay, 2000), I only briefly mention a few of the many
criticisms that have over the years been leveled at earlier claims concerning the initial stages of
morphosyntactic development in bilingual children. These claims were part of the general
single-system hypothesis (cf. the discussion in the third paragraph of this section). For early
morphosyntactic development, they posited that children systematically apply the same syntactic
rules to both languages (Volterra & Taeschner, 1978, p. 312), thus implying that very young
bilingual children do not follow the ways of speaking of the people around them. In this view,
bilingual children are seen as unable to keep two grammatical systems separate (Meisel, 1989), a
process that has been called fusion in the bilingualism literature (Wolck, 1984). The authors making
these claims do not refer to input conditions, but all the data supposedly supporting the claims
come from children growing up according to the "one person, one language" principle.
A first basic problem is that the nature of the empirical support offered by Volterra and Taeschner
(1978) and later by Taeschner (1983) is very unclear, and that the few analyses given showed
internal inconsistencies and were often inaccurate (see, e.g., Mills, 1986; Meisel & Mahlau, 1988).
A more analytical problem is that Volterra and Taeschner, like Leopold (1939–1949/1970, Vol. 1,
p. 179, and Vol. 3, p. 186), interpreted the use of lexically mixed utterances as evidence for a
fused system. As I have argued (De Houwer, 1990, p. 39), the use of utterances that contain lexical
items from two languages is not necessarily a reflection of one underlying language system. If so,
all bilingual speakers would necessarily be operating with one fused system since all bilingual
speakers at least occasionally use lexically mixed utterances. Rather, young bilingual children's
lexically mixed utterances first and foremost need a sociolinguistic explanation: It needs to be
investigated under which sociolinguistic conditions they do and do not appear and whether children
are socialized in an environment that encourages their use or not (see also Lanza, 1997b). Once this
is clear, psycholinguistic models can be constructed to explain the occurrence of mixed utterances
and their form.

Volterra and Taeschner (1978) also discussed instances of lexically unilingual utterances that they
claimed showed interference between their subject's two languages. They considered such utterances
to be evidence for their single-system hypothesis. As Meisel (1989) pointed out, the notion of
interference requires the existence of two systems that can exert influence on each other. This is
very different from positing, as Volterra and Taeschner (1978) and Taeschner (1983) did, one single
rule system that gives rise to an undifferentiated language that by implication has as its output a
type of language production that differs substantially from each input system.

The single-system or Mish-Mash hypothesis (a term used by Bergman, 1976) is not incompatible with
the strong version of what I have termed a transfer theory of bilingual development (De Houwer,
1987, pp. 138–140, and 1990, p. 66). In its stronger version, such a theory assumes that any
morphosyntactic device belonging to input system A will be used in the child's speech production in
utterances containing only lexical items from language B and vice versa (De Houwer, 1990, p. 66).
Stated in these empirically testable terms, support

language Alpha than of utterances with lexical items and structural features from the same language
(De Houwer, 1987, p. 138, and 1990, p. 66).

Following Slobin (1973), a weaker version of the transfer theory, which is based on a kind of
continuous comparison procedure between structures in both input languages, predicts transfer only
if a particular morphosyntactic feature of input system A is less complex than a functionally
equivalent feature of input system Alpha. Other proponents of this weaker version, such as Arnberg
(1987, p. 68), however, tend to refer only to differences in formal complexity and ignore a crucial
aspect in Slobin's original proposal: functional equivalence. The weaker version of the transfer
theory is less easily testable since it is very difficult, if not impossible, to compare levels of
formal complexity across languages (cf. De Houwer, 1987, pp. 138–139, and 1990, pp. 56–58).

Neither version of the transfer theory explains how children eventually do become able to combine
lexical items from language Alpha with morphosyntactic features of the same language. Note also that
the transfer theory presupposes a very great deal of creative tenacity in the young bilingual child
that manifests itself even in the face of continuous contradictory and nonsupporting evidence as
provided in the dual-language input.

So far, no studies have empirically shown the actual existence of the kinds of language repertoires
predicted by the transfer theory in children with bilingual input from birth. For children
undergoing a process of early L2 acquisition, though, clear and frequent signs of transfer may
appear in one of their languages once children are beyond the silent stage (Ervin-Tripp, 1974;
Tabors, 1987) and the formulaic stage (Wong-Fillmore, 1979). Preschool-aged children who start out
hearing only one language from birth and who start regularly hearing an L2 on top of that at, say,
age 3, may produce quite a few utterances with lexical items from L2 but structural features mainly
from L1 (Fantini, 1985; Ekmekci, 1994; Pfaff, 1994). The proportion of these kinds of utterances in
relation to the child's overall production in L2 is not known, but they appear to be quite common.

The characteristics of children's L2 speech production are quite different from what is generally
reported for young children with bilingual exposure from birth. These children's language pro-
for the theory would consist of a quantitatively duction shows on the whole very little evidence of
much higher proportion of utterances with lexical morphosyntactic transfer from one input system to
items from language A, but structural features of the other (see also the section on studies of BFLA
that offer support for the SDH). Rather, most of young bilingual learners' utterances with words from language A have morphosyntactic features that are relatable to the same input language. The same goes for language Alpha as well. This is precisely what the SDH predicts.

In the next section, I discuss in more detail the basis for concluding whether there is separate development. First, though, it needs to be emphasized that, to be able to interpret the morphosyntactic features of their two input languages, bilingual children must have processing mechanisms that are able to approach each input language as a morphosyntactically closed set. So far, there have been no reports that bilingual toddlers or preschoolers are somehow slow or have difficulty in real-time comprehension of their input languages or switches between them, whether utterance-internal or not. However, this issue has to my knowledge not been explicitly addressed as yet. Since in young children language comprehension generally precedes and paves the way for language production (see, e.g., Bates, Dale, & Thal, 1995), it is not unlikely that separate development in comprehension is partly what makes separate development in production possible.

Methodological Requirements for Addressing the Separate Development Hypothesis

Once children with bilingual input from birth start to use morphosyntactic elements in their utterances, they use three types of utterances: (a) lexically unilingual utterances in language A, (b) lexically unilingual utterances in language Alpha, and (c) mixed utterances, which contain lexical items or bound morphemes from languages A and Alpha. These are also the types of utterances that older bilingual speakers produce and that will be present in the child's bilingual input.

The basic question to be answered is whether a child with bilingual input from birth follows a target-language-like developmental path in two languages (cf. the third section). Thus, it needs to be investigated to what extent the child's lexically unilingual utterances in language A use morphosyntactic features from language A and to what extent the child's lexically unilingual utterances in language Alpha use morphosyntactic features from language Alpha. The answer to this question will show the extent to which young bilingual children's lexically unilingual utterances resemble those used by the people around them and thus to what extent the SDH is an accurate descriptive generalization of early bilingual development. The SDH thus depends on analyses of lexically unilingual utterances only (De Houwer, 1990, p. 69, and 1994, 1998, p. 256; De Houwer & Meisel, 1996).

Of course, should a child's repertoire consist mainly of mixed utterances, it becomes impossible to investigate the SDH (or its counterpart, the transfer theory). As it turns out, though, most of young bilingual children's utterances are lexically unilingual and thus offer ample opportunity for investigating the extent to which these unilingual utterances resemble target structures present in each of the input languages.

In principle, the SDH should be addressed on the basis of children's acquisition of aspects of morphosyntax that clearly differ between the child's two input languages but that are comparable in that they fulfill more or less the same function (cf. De Houwer, 1990; Meisel, 1989; for a particularly penetrating argumentation explaining this need, see Serratrice, 2002). After all, when both input systems use different morphosyntactic means for expressing a particular function, there are different expectations for their use in the child's language A than in language Alpha. When both input systems closely resemble each other for a particular feature, the child could not be expected to use different features.

An example will help clarify this point: In English yes-no questions involving lexical verbs, there is use of do-support as in "Do you want some tea?" In contrast, Dutch yes-no questions involving lexical verbs do not use do-support ("Wil je thee?"; literally, "Want you tea?"). The SDH would predict do-support only in the child's lexically English questions and would not expect any do-support in the child's lexically Dutch questions. The transfer theory would expect either no do-support in English or do-support in Dutch. On the other hand, English and Dutch yes-no questions involving the copula have exactly the same structure: "Is that tea?" or "Is dat thee?" (literally, "Is that tea?"). Application of the Dutch rule to English or the English rule to Dutch gives the same result. Hence, English and Dutch yes-no questions with a copula are not constructions that can provide insight into whether children transfer rules from one language to the other.

Children may not always agree with linguists regarding what should count as a particular structure. Getting back to the example of the yes-no questions, it is quite possible that, for English,
a bilingual child has not yet learned that questions with a copula are structured differently from questions with a lexical verb. The child might use do-support for all English questions, including those with a copula. As long as the child does not use do-support in Dutch questions, though, there is no evidence of transfer. Rather, the child's overuse of do-support in English even lends stronger evidence for the SDH than if the child was producing do-support only when required.

But, what if the child does not use any do-support? The fact that she or he fails to use it when English requires it is not necessarily a result of transfer since English questions with a copula provide evidence that in English do-support is not necessary in all questions. The child may be overgeneralizing in English on the basis of English input evidence, and the child's lack of do-support may have nothing to do with influence from Dutch. The child's lack of do-support in English, then, cannot be interpreted. It is not support or lack of support for either the transfer theory or the SDH.

As I suggested elsewhere (De Houwer, 1994, p. 45), one way of getting around this interpretative problem might be to look at data from monolingual acquisition: If the bilingual child uses forms similar to those used in the same language by a monolingual peer, there is a possibility that the forms are intralinguistically determined. However, such a comparative approach can never entirely settle the issue since a similarity of form does not necessarily indicate a similarity in processing. Hence, intrinsically ambiguous forms in the bilingual data will often have to remain just that.

Clear evidence for the SDH, then, consists of the child using comparable structures that differ across both input languages in utterances with lexical items from the appropriate language. Although evidence for the SDH is not expected to be noticeable when both input systems closely resemble each other for a particular feature, such evidence might in fact occur if, at the same age, the child does use this particular feature, but only in one language. An example of this can be found in a study by Almgren and Idiazabal (2001). Their Basque-Spanish bilingual subject, Mikel, started using imperfective pasts to refer to imaginary events in Spanish 9 months before he did this in Basque. Yet, imperfective pasts can be used to refer to imaginary events in both Spanish and Basque.

To be able to conclude that one particular child is developing two separate morphosyntactic systems, a wide spectrum of morphosyntactic features must be studied. After all, it is possible that separate development holds in some areas of morphosyntactic functioning, but not in others.

For the SDH to be confirmed, then, separate development must be evident for most of the morphosyntactic structures in the child's speech that reflect differences in the input languages. Occasional instances of apparent transfer in lexically unilingual utterances of features that differ across the input languages do not detract from the validity of the SDH, but should of course only be very occasional (see further discussion in this section). They should occur in no more than a few percent of the relevant cases within a brief time frame (say, in all the recordings made in a month's time). Structures that appear to push the two systems apart even more than necessary are obviously additional evidence for the SDH (cf. the theoretically possible example above for which do-support is used in English yes-no questions with a copular verb). Morphosyntactic features that appear in both input languages as well as in the child's unilingual utterances are neutral to the SDH.

Analyzing all or most of the morphosyntactic features used by a bilingual child is highly time consuming. Most child language researchers therefore prefer to limit their analyses to specific subparts of children's language production. When these different analyses are combined, though, we actually get a random sample of a variety of structures used by different children living in different parts of the world and acquiring different language pairs. If the SDH is not valid, such a database should reveal this fairly easily. However, as I show in the next section, quite the contrary is the case.

Studies of Bilingual First Language Acquisition That Offer Support for the Separate Development Hypothesis

In this section, I give an overview of a large portion of the empirical studies published in the last 15 years that have looked at morphosyntactic development in children growing up with two languages from birth. All these studies show evidence of the separate development of morphosyntax, whether this was made explicit by the authors or not (Table 2.1). The analyses of the data in the studies listed in Table 2.1 that provide support for the SDH all refer to children's unilingual utterances and to aspects of morphosyntax that clearly differ across the two languages investigated.
Table 2.1 Empirical Studies on Bilingual Acquisition that Confirm the Separate Development Hypothesis
In the field of child language research, several quite different methods are used to collect data. It is still the case, however, that data based on natural, spontaneous interaction are the most desirable when little is known about the developmental course of a particular language or pair of languages. Given the very scant knowledge about early bilingual acquisition up until about 15 years ago, it will come as no surprise that most of the studies reviewed here are longitudinal case studies that used spontaneous speech as their main database. For this reason, Table 2.1 is organized as a function of the children studied. This has the advantage of giving a clear picture of the current database on which present-day knowledge of morphosyntactic development in young bilinguals is based. Aside from listing the language combination acquired, Table 2.1 also shows the age ranges from which data were drawn in the studies reporting on a particular child's speech.

As Table 2.1 shows, the current database for studies that support the SDH consists of the speech productions of 29 children (17 boys, 12 girls) between the ages of 1 and nearly 6 years, who together are acquiring 12 languages in 13 different combinations. All but 2 of those 12 languages belong to the group of Indo-European languages (Catalan, Dutch, English, French, German, Italian, Latvian, Slovak, Spanish, and Swedish). The 2 non-Indo-European languages that have been studied in publications addressing the SDH are Basque and Japanese. As in child language acquisition research involving monolingual children, English is much more heavily represented than any other language: Of the total of 13 different language combinations listed in Table 2.1, 9 include English. Four of the combinations include French. The more language combinations show support for the SDH, the less likely the chance that evidence for the SDH is somehow a result of the specific languages investigated and the more likely the chance that the SDH indeed captures an important aspect of the bilingual acquisition process in general (see also De Houwer, 1994, p. 45).

It is often claimed that bilingual children reported on in the literature are primarily children of (psycho-)linguists (see, e.g., Romaine, 1999). Whereas this might have been the case in the past, it certainly no longer is today: Only 6 of the 29 children in Table 2.1 (viz., Andreu, Manuela, Natalie, Odessa, Sonja, and Zevio) are children of linguists or psychologists (viz., correspondingly, Perez-Vidal, Deuchar, Stefanik, Jisa, Schelleter, and Krasinski). As in most studies of child language in general, the children studied primarily live in a middle class environment that, on the whole, is fairly common in the Western world (most of the children studied live in Western Europe and North America).

Most of the children listed in Table 2.1 have been exposed to their two languages according to the "one person, one language" principle. As discussed here, the SDH was originally formulated to apply only to children growing up in these circumstances. However, at least one child in Table 2.1 (Deuchar's daughter Manuela) quite clearly was not raised according to the "one person, one language" principle. Instead, Manuela's bilingual parents spoke English to her when there were other English speakers present and Spanish in all other circumstances. She heard English from monolingual English speakers.
Yet, she developed her two languages along two separate morphosyntactic paths as well.

Many studies listed in Table 2.1 analyzed data from the same children, but investigated different subtopics in morphosyntactic development (see below). Also, they do not always use data from the same age period, even though they may concern the same child. Most notable here are the many studies published by Meisel and his collaborators in the framework of the Hamburg DUFDE project (Deutsch und Französisch – Doppelter Erstspracherwerb [German and French – Double First Language Acquisition]; for overviews, see Koppe, 1994a, and Schlyter, 1990b). The children Annika, Caroline, Christoph, Francois, Ivar, Pascal, and Pierre were all studied in the framework of this influential project.

The studies in Table 2.1 investigated a wide variety of morphosyntactic subtopics. These subtopics are listed in Table 2.2 together with the studies that have a particular topic as their main focus. When studies concern more than one subtopic, they appear more than once in the table.

Table 2.2 Morphosyntactic Topics Investigated in Empirical Studies of Bilingual First Language Acquisition Confirming the Separate Development Hypothesis

Topic: Study/Studies
Morphology of the nominal constituent: Almgren and Barrena, 2000; Barrena, 1997; De Houwer, 1990; Ezeizabarrena and Larranaga, 1996; Idiazabal, 1988, 1991; Koehn, 1994; Meisel, 1986; Muller, 1995; Parodi, 1990; Sinka and Schelleter, 1998; Stefanik, 1995, 1997; Stenzel, 1994, 1996
Syntactic gender: De Houwer, 1990; Muller, 1990a, 1994, 1995; Sinka and Schelleter, 1998; Stefanik, 1995, 1997
Pronouns/clitics: Almgren and Barrena, 2000; De Houwer, 1990; Kaiser, 1994; Muller et al., 1996; Serratrice, 2002
Determiners: Barrena, 1997; De Houwer, 1990; Muller, 1994; Paradis and Genesee, 1997
Pluralization: Barrena, 1997; De Houwer, 1990; Deuchar and Quay, 1998; Muller, 1994; Sinka and Schelleter, 1998
Verb morphology: Almgren and Barrena, 2000; Almgren and Idiazabal, 2001; De Houwer, 1990; Deuchar, 1992; Ezeizabarrena and Larranaga, 1996; Jisa, 1995; Meisel, 1996; Meisel and Muller, 1992; Muller, 1990b; Paradis and Genesee, 1997; Serratrice, 2001; Sinka and Schelleter, 1998
Aspect and/or time markings: Almgren and Barrena, 2000; Almgren and Idiazabal, 2001; De Houwer, 1990, 1997; Jisa, 1995; Krasinski, 1995; Meisel, 1985, 1994; Mishina-Mori, 2002; Serratrice, 2001; Schlyter, 1990a, 1995
Congruence/agreement: Almgren and Barrena, 2000; De Houwer, 1990; Deuchar, 1992; Meisel, 1989, 1990, 1994; Meisel and Muller, 1992; Muller, 1990b; Paradis and Genesee, 1996; Serratrice, 2002; Sinka and Schelleter, 1998
Negation: Mishina-Mori, 2002; Paradis and Genesee, 1996, 1997
Syntactic word order: Almgren and Barrena, 2000; De Houwer, 1990; Hulk and Van der Linden, 1996; Koppe, 1994b; Meisel, 1986, 1989; Meisel and Muller, 1992; Muller, 1990b, 1993; Parodi, 1990; Sinka and Schelleter, 1998
Complex sentences: Barrena, 2001; De Houwer, 1990; Muller, 1993, 1994b
Subject realization: Juan-Garau and Perez-Vidal, 2000; Serratrice, 2002
General development: De Houwer, 1990; Juan-Garau and Perez-Vidal, 2000

As discussed in the previous section, it is important to investigate all or most of the morphosyntactic elements used by a particular bilingual child in order to have firm evidence for the SDH. If the information from Tables 2.1 and 2.2 is combined, it is clear that for quite a few children in Table 2.1 many different morphosyntactic aspects have been investigated. This is particularly the case for the children Andreu, Caroline, Christoph, Kate, Ivar, Maija, Manuela, Mikel, Pascal, Pierre, Sonja, and Carlo (see also Serratrice, 1999, besides the publications listed in Tables 2.1 and 2.2). These children, then, were definitely approaching their two languages as fundamentally closed morphosyntactic sets. Whether the same can be said for the other children is yet to be determined; in any case, the
remaining children show no signs of interlinguistically determined development in any of the areas that happen to have been investigated.

The fact that young, actively bilingual children essentially develop their two morphosyntactic systems separately from each other implies that one language may be further developed than the other. The children studied by Jisa (1995), Juan-Garau and Perez-Vidal (2000), Schlyter (1995), and Stefanik (1995, 1997) (viz. Odessa, Andreu, Jean, Mimi, Anne, and Natalie), for instance, showed quite different language abilities for at least some time during the period they were studied. Most of the other children were at roughly the same level of development in each of their languages at the time of data collection (that is, if it is accepted that levels of development can in fact be meaningfully compared across languages, a point I am not so sure of unless the differences are blatantly obvious; see De Houwer, 1998).

Given the general lack of relevant data that could speak to this issue of uneven (but still separate) development, however, it is not clear what the range of possibilities here is: For instance, it is theoretically possible that a bilingual child produces complex sentences in one language while in the other language only two-word utterances appear. But are there any children growing up bilingual from birth who exhibit these sorts of patterns? So far, reports showing these kinds of divergent paths in skilled child speakers are lacking. The few studies that do show very differing levels of language ability across bilingual children's two languages (cf. above) happen to concern very young children who are just entering the multiword stage in one of their languages. Also, it remains to be investigated which factors determine gross differences across bilingual children's abilities in either language. In any case, it is a common observation that young bilingual children who have been regularly addressed in two languages from birth do not necessarily speak their two languages equally well.

Interlinguistic Influence in Unilingual Utterances

As suggested, even if a child is found to develop two morphosyntactic systems as fundamentally closed sets, occasionally the child may use unilingual utterances in language A that could well be explained as drawing on structural features of language Alpha. Such utterances will be nonadultlike, except when they are in fact modeled in the actual input to which the child is exposed (in which case the use by the child of structurally similar utterances is to be expected and as such not surprising or in need of special analytic treatment).

An example of such an utterance that might be a result of interlinguistic influence is "I want another," produced by my Dutch-English bilingual subject, Kate, at age 3 years (cf. also De Houwer, 1995b, p. 236). The pronominalizer "one" would have been expected here from an adult's perspective. Its nonrealization could be a result of simply insufficient, immature knowledge of the English system (i.e., a developmental explanation); after all, children often sound unlike adults because they omit a particular word or phrase. As it happens, Kate often used the pronominalizer "one" at around the same age in other (and similar) sentences, so this explanation might be less likely.

It should be noted, though, that it is typical for young children to show variability in their language use: For instance, at the same point of development, Dutch-speaking monolingual 3-year-olds may correctly say "ik heb" ("I have") and incorrectly "*ik heeft" ("I has") (De Houwer & Gillis, 1998). Alternatively, the utterance "I want another" might be considered a speech error. Or, the utterance might be explained by reference to Dutch (i.e., by influence from one language on another one), in which saying "I want another" but with Dutch words as in "Ik (I) wil (want) een ander (another)" is perfectly fine. Often, it will be impossible to choose between these three explanatory possibilities.

If potentially interlinguistically generated utterances of a similar nature are very rare on the whole, they are of little theoretical consequence: It will usually be impossible to verify their exact status with any degree of certainty, and because they are so rare, they will hardly be able to exert any lasting effects on the rest of the child's developing systems. So far, there has to my knowledge only been one study that has expressly looked at possibly interlinguistically generated utterances within a corpus of bilingual child speech, and that has published precise quantitative data regarding the frequency of occurrence of such utterances: Sinka (2000, p. 171) reported that in the corpus for her Latvian-English subject Mara, spanning nearly 1 year of data collection, only 2 (which is less than a tenth of a percent) of a total of 5,275 unilingual utterances were possibly cases showing interlinguistic influence on the syntactic level; for her second Latvian-English subject Maija, there were 13 (or less than a quarter of a percent) such utterances of a total of
5,537 utterances recorded in a year (Sinka noted that these utterances might in fact be performance errors). Clearly, with such small numbers, further analyses are quite pointless.

Bilingual and Monolingual Acquisition Compared

Already in the very first study of bilingual acquisition by Ronjat (1913), the question was raised how bilingual development compares to monolingual development (see also my summary of Ronjat's views in De Houwer, 1990, pp. 51–52). Ronjat complained that in effect he could not really address this issue in any detail since there simply were no sources for monolingual comparisons available. Since 1913, the situation has improved, although some of Ronjat's problems are still with us today (see below). The interest in comparing bilingual and monolingual development, however, has not changed.

Children who have been regularly and frequently exposed to two languages from birth and who actually speak those languages (not all bilingual exposure results in active bilingualism; see, e.g., De Houwer, 1999a) are no different from children growing up with just one language as far as the general course of morphosyntactic development is concerned. The main distinction between actively bilingual children on the one hand and monolingual children on the other is that the former are able to make themselves understood in two languages whereas the latter are not. Apart from this, there are no major differences. Both bilingual and monolingual children start off their conventionally meaningful language production using single-word sentences or holophrases. They then go on to produce two-word sentences, and after producing multiword sentences for a while, they start to use complex sentences as well.

On the morphological level, depending on the language that is acquired, both bilingual and monolingual children may use a number of bound morphemes at a very early stage in development. From the two-word stage onward, both monolingual and bilingual children speak a clearly identifiable language (for a critique of earlier theories that implicitly denied this as far as bilingual children are concerned, see De Houwer, 1995b).

Although young bilingual and monolingual children clearly speak a particular language from a very early age onward, they still differ quite dramatically from how the adults in their environment speak that language. Both bilingual and monolingual preschool children make morphological and syntactic errors, and they both produce only a fraction of the range of morphosyntactic devices available to mature speakers.

Further global similarities between bilingual and monolingual children concern the timing of a number of important milestones in language development. Except for the huge range of normal individual variation that exists between monolingual children (and which also exists among bilingual children), there are no systematic differences between normally developing bilingual and monolingual children in the ages at which basic language skills are acquired. Just like his or her monolingual friend, a bilingual 2-year-old can be expected to be able to carry on a brief, but largely comprehensible, conversation with a familiar adult using an occasional two-word utterance. A great deal more can be expected from a bilingual 3-year-old (just as can be expected of a 3-year-old monolingual): The child should be able to produce utterances containing three or four words and should be quite comprehensible to strangers.

There is as yet no empirical basis for the claim that, as a group, bilingual children develop their languages more slowly than monolingual children.

Finally, there are quite detailed similarities to be noted for bilingual and monolingual children concerning the developmental course of one specific language. In other words, if comparisons are made, for instance, of the English language use of a bilingual child and that of a monolingual child of approximately the same age, the similarities are quite striking. It is impossible to say on the basis of a corpus of English utterances by a 3-year-old whether they were produced by a bilingual or a monolingual child. Monolingual and bilingual children acquiring the same language from birth use that language in very similar ways: They produce the same sorts of utterances (some studies even reported identical utterances; see, e.g., De Houwer, 1990) with similar types of errors and characteristics.

Detailed comparisons between bilingual and monolingual children so far have been undertaken for Basque, Dutch, English, French, German, and Spanish. Obviously, it must not be forgotten that, in comparisons between bilingual and monolingual children acquiring a common language, there may be a great deal of variation between individual children. That individual variation makes it quite difficult in some cases to determine whether a small
point of difference is relatable to the fact that the is rather infrequent in comparison with their use
bilingual child is simultaneously acquiring another of lexically unilingual utterances (that is, for those
language or not. children for whom we have data).
Future studies will have to show to what extent Switching between different types of utterances
the minimal differences that do crop up here and is apparently not a problem: To my knowledge,
there in very detailed comparisons are to be ex- there have been no reports of bilingual children
plained in terms of individual variation or other who had trouble switching between unilingual ut-
factors. One problem here is that often there is little terances from languages A and Alpha or vice versa
material available for monolingual acquisition or between unilingual utterances and mixed utter-
that could be used as a dependable basis for com- ances. The use of far more hesitations in one
parison (this problem sometimes occurs even for language than another, though, might give rise to
English, the most frequently researched language in less-uent transitions from one type of utterance
acquisition studies). Another problem is that studies to another. In my study, the only one I am aware
of spontaneous child speech often have few quan- of that counted a bilingual childs (Kates) hesita-
titative data, so that it is impossible to decide the tions and analyzed their use in both languages
extent of quantitative differences between monolin- (De Houwer, 1990, pp. 96, 331), I found no dif-
gual and bilingual children in the frequency of oc- ferences between the languages. In the absence of
currence of particular types of linguistic structures evidence to the contrary, uent switching between
(for a more in-depth comparison of monolingual different types of utterances seems to be part and
and bilingual acquisition, see De Houwer, 2002). parcel of early bilingual production in children
So far, I have mainly emphasized the similari- raised with two languages from birth.
ties between the morphosyntactic development of Parents in bilingual families are sometimes sur-
bilingual and monolingual children. Those simi- prised to hear their young children use mixed ut-
larities highlight the robust nature of the primary terances, especially when they see themselves as not
language development process, which seems im- using mixed utterances (however, as Goodz (1989)
mune to whether a child is growing up learning two has shown, for instance, there may be quite a dif-
languages or just one. Note, however, that I have so ference between self-reported and actual language
far discussed bilingual children's morphosyntactic use in bilingual situations). Often, parents (and in
development only on the basis of a portion of their the past, researchers as well) see the use of mixed
speech production (viz. on the basis of lexically and utterances by their young bilingual children as
morphologically unilingual utterances). All young evidence of language confusion. As Lanza (1997b)
bilingual children, however, also produce lexically admirably demonstrated, young bilingual children's
and morphologically mixed utterances (which, by early use of mixed utterances cannot be seen as a
definition, monolingual children cannot). It is these result of language confusion, but can be explained
to which I turn next. by the language socialization practices in the family
and children's sensitivity to them. Young bilingual
children are in general very responsive vis-à-vis the
Structural Aspects of Bilingual sociolinguistic norms that exist in their environment
Children's Mixed Utterances regarding language choice (see, e.g., De Houwer,
1990; Deuchar & Quay, 2000). Also, the use of
Mixed utterances in bilingual speech are here mixed utterances can be explained in terms of this
defined as utterances with surface realization that sensitivity: Children will use more mixed utterances,
clearly includes lexical items or bound morphemes and will continue to use them, the more tolerance
(or both) from two languages (I leave aside the there is for them in their environment. The use of
theoretical issue of the extent to which mixed ut- mixed utterances, then, is in most cases not reducible
terances can be seen as instances of code switching to a lack of language skill.
or code mixing; for an in-depth discussion of this There have been only a few studies of bilingual
regarding young bilingual children, see, e.g., Lanza, children's morphosyntactic development that have
1997a). The very youngest of bilingual speakers use both looked in detail at lexically unilingual and
mixed utterances from the first stages of morpho- mixed utterances produced by the same child
syntactic development. The use of mixed utter- and have tried to draw comparisons between them
ances, then, is an integral part of early bilingual (De Houwer, 1990; Sinka, 2000). The general pic-
development, although on the whole, children's use ture gained from these studies is that the structure
of lexically or morphologically mixed utterances of mixed utterances tends to reflect the overall
structure of lexically unilingual utterances produced by the bilingual child at the same age: Both the global length and linguistic complexity of mixed utterances resemble those of the unilingual utterances the child is producing at the time.

There have been rather more studies of bilingual children's mixed utterances per se, although again the number of studies focusing on their morphosyntactic characteristics is limited. Because young bilingual children are still quite immature speakers, they will often produce very short utterances consisting of just two words. For these, it will be impossible to investigate which elements are mixed into what (cf. De Houwer, 1990, 1995a; but see Lanza, 1997b, for an alternative view). For longer mixed utterances, it may be possible to identify the consistency of the mixed elements.

The empirical data available so far, regardless of the particular language combination studied (see, e.g., De Houwer, 1990, 1995a; Saunders, 1988; Sinka, 2000; Wanner, 1996) show that, in utterances that clearly are utterances in language A with one or more elements inserted from language Alpha (or vice versa), the insertions from language Alpha mainly consist of single nouns when children are under age 4 years (at a somewhat later age, insertions mainly consist of noun phrases in addition to single nouns; cf. Bentahila & Davies, 1994). Also, in bilingual adults, noun insertions are the most commonly inserted category in mixed utterances (see, e.g., Romaine, 1995).

In De Houwer (1995a), I applied an analytical method based on utterance length and guest and host language status to mixed utterances produced by 11 preschool children acquiring five language combinations. The data for this study were drawn from the spoken language corpora archive CHILDES (MacWhinney, 1991) as well as from several published sources. The main finding was that, regardless of the actual language pair involved, children's mixed multiword utterances consisted mainly of free morpheme insertions of the guest languages into the host language. These free morpheme insertions were most often nouns.

More analyses that apply one specific method for cross comparisons are needed, however, to obtain a clearer picture of the main characteristics of young bilingual children's mixed utterances.

Conclusion

In acquiring two languages from birth, children are undergoing a sort of double acquisition process in which two morphosyntactic systems are acquired as fundamentally separate and closed systems. This does not imply, of course, that structural influence from one language on the other is not possible, but until now no evidence has been found of systematic morphosyntactic influence from one language on the other in children who have been regularly and frequently exposed to two languages from birth. Young bilingual children reflect the structural possibilities of both languages to which they have been exposed and are able to produce utterances that are clearly relatable to each of their different languages from very early. This would not be possible without very close attention to the variable nature of the input (De Houwer, 1990).

In general, bilingual children's language-specific development within one language differs little from that of monolingual acquisition, except of course that bilingual children do it for two languages at a time. There is no evidence that hearing two languages from birth leads to language delay.

In being able to produce unilingual utterances in two languages, bilingual children closely resemble bilingual adults. In addition, just like adult bilinguals, young bilingual children are able to switch between languages very easily, either at utterance boundaries or within utterances. Utterances in which lexical or morphological switching occurs are an integral mark of bilingual functioning both in young child bilinguals and in more mature bilingual speakers. In mixed utterances produced by either child or adult bilinguals, noun insertions are a common feature. Naturally, though, in both mixed and unilingual utterances, child bilinguals do not yet exhibit the full wealth and breadth of the sorts of structures of which adult bilinguals are capable.

As they acquire two separate linguistic systems, young bilingual children learn from a very tender age which norms for language choice exist in their environment, and in general they are able to apply those norms in their own language production. The use of mixed utterances is to be seen as one of the language choice possibilities within the socialization patterns present in bilingual children's linguistic environments rather than as a sign of insufficient linguistic skill.

It is clear, then, that young bilingual children are very much attuned to the specific linguistic environment in which they find themselves, and that they are very much influenced by this environment. The real challenge for explaining bilingual development is to discover the precise links between that environment and bilingual child language use.
Leopold, W. (1970). Speech development of a bilingual child. A linguist's record. New York: AMS Press. (Original work published 1939–1949)
Lieven, E., Pine, J., & Baldwin, G. (1997). Lexically-based learning and early grammatical development. Journal of Child Language, 24, 187–219.
MacWhinney, B. (1991). The CHILDES project: Tools for analyzing talk. Hillsdale, NJ: Erlbaum.
Meisel, J. (1985). Les phases initiales du développement de notions temporelles, aspectuelles et de modes d'action. Étude basée sur le langage d'enfants bilingues français-allemand. Lingua, 66, 321–374.
Meisel, J. (1986). Word order and case marking in early child language. Evidence from simultaneous acquisition of two first languages: French and German. Linguistics, 24, 123–183.
Meisel, J. (1989). Early differentiation of languages in bilingual children. In K. Hyltenstam & L. Obler (Eds.), Bilingualism across the lifespan. Aspects of acquisition, maturity and loss (pp. 13–40). Cambridge, U.K.: Cambridge University Press.
Meisel, J. (1990). INFL-ection: Subjects and subject-verb agreement. In J. Meisel (Ed.), Two first languages. Early grammatical development in bilingual children (pp. 237–298). Dordrecht, The Netherlands: Foris.
Meisel, J. (1994). Getting FAT: Finiteness, agreement and tense in early grammars. In J. Meisel (Ed.), Bilingual first language acquisition. French and German grammatical development (pp. 89–130). Amsterdam: Benjamins.
Meisel, J. (2001). The simultaneous acquisition of two first languages. Early differentiation and subsequent development of grammars. In J. Cenoz & F. Genesee (Eds.), Trends in bilingual acquisition (pp. 11–41). Amsterdam: Benjamins.
Meisel, J., & Mahlau, A. (1988). La adquisición simultánea de dos primeras lenguas. Discusión general e implicaciones para el estudio del bilingüismo en euzkadi. In Actas del II Congreso Mundial Vasco: Congreso sobre la Lengua Vasca (Vol. 3). Vitoria, Spain: Servicio de publicaciones del Gobierno Vasco.
Meisel, J., & Müller, N. (1992). Finiteness and verb placement in early child grammars. Evidence from simultaneous acquisition of two first languages: French and German. In J. Meisel (Ed.), The acquisition of verb placement. Functional categories and V2 phenomena in language acquisition (pp. 109–138). Dordrecht, The Netherlands: Kluwer.
Mills, A. (1986). Review of T. Taeschner's The Sun Is Feminine. Linguistics, 24, 825–833.
Mishina-Mori, S. (2002). Language differentiation of the two languages in early bilingual development: A case study of Japanese/English bilingual children. International Review of Applied Linguistics, 40, 211–233.
Müller, N. (1990a). Developing two gender assignment systems simultaneously. In J. Meisel (Ed.), Two first languages. Early grammatical development in bilingual children (pp. 193–236). Dordrecht, The Netherlands: Foris.
Müller, N. (1990b). Erwerb der Wortstellung im Französischen und Deutschen. Zur Distribution von Finitheitsmerkmalen in der Grammatik bilingualer Kinder. In M. Rothweiler (Ed.), Spracherwerb und Grammatik. Linguistische Untersuchungen zum Erwerb von Syntax und Morphologie (pp. 127–151). Opladen, Germany: Westdeutscher Verlag.
Müller, N. (1993). Komplexe Sätze. Der Erwerb von COMP und von Wortstellungsmustern bei bilingualen Kindern (Französisch/Deutsch). Tübingen, Germany: Gunter Narr Verlag.
Müller, N. (1994a). Gender and number agreement within DP. In J. Meisel (Ed.), Bilingual first language acquisition. French and German grammatical development (pp. 53–88). Amsterdam: Benjamins.
Müller, N. (1994b). Parameters cannot be reset: evidence from the development of COMP. In J. Meisel (Ed.), Bilingual first language acquisition. French and German grammatical development (pp. 235–270). Amsterdam: Benjamins.
Müller, N. (1995). L'acquisition du genre et du nombre chez des enfants bilingues (Français-Allemand). AILE (Acquisition et Interaction en Langue Étrangère), 6, 65–99.
Müller, N., Crysmann, B., & Kaiser, G. (1996). Interactions between the acquisition of French Object drop and the development of the C-system. Language Acquisition, 5(1), 35–63.
Paradis, J., & Genesee, F. (1996). Syntactic acquisition in bilingual children: Autonomous or interdependent? Studies in Second Language Acquisition, 18, 1–25.
Paradis, J., & Genesee, F. (1997). On continuity and the emergence of functional categories in bilingual first-language acquisition. Language Acquisition, 6(2), 91–124.
Parodi, T. (1990). The acquisition of word order regularities and case morphology. In J. Meisel (Ed.), Two first languages. Early grammatical development in bilingual children (pp. 157–192). Dordrecht, The Netherlands: Foris.
Pavlovitch, M. (1920). Le langage enfantin. Acquisition du serbe et du français. Paris: Champion.
Peters, A. (1983). The units of language acquisition. Cambridge, U.K.: Cambridge University Press.
Pfaff, C. (1994). Early bilingual development of Turkish children in Berlin. In G. Extra & L. Verhoeven (Eds.), The cross-linguistic study of bilingualism (pp. 75–97). Amsterdam: North-Holland.
Poulin-Dubois, D., & Goodz, N. (2001). Language differentiation in bilingual infants: Evidence from babbling. In J. Cenoz & F. Genesee (Eds.), Trends in bilingual acquisition (pp. 95–106). Amsterdam: Benjamins.
Quay, S. (2001). Managing linguistic boundaries in early trilingual development. In J. Cenoz & F. Genesee (Eds.), Trends in bilingual acquisition (pp. 149–200). Amsterdam: Benjamins.
Romaine, S. (1995). Bilingualism (2nd ed.). Oxford, U.K.: Blackwell.
Romaine, S. (1999). Early bilingual development: from elite to folk. In G. Extra & L. Verhoeven (Eds.), Bilingualism and migration (pp. 61–73). Berlin: Mouton de Gruyter.
Ronjat, J. (1913). Le développement du langage observé chez un enfant bilingue. Paris: Champion.
Saunders, G. (1988). Bilingual children: From birth to teens. Clevedon, U.K.: Multilingual Matters.
Schlyter, S. (1990a). The acquisition of tense and aspect. In J. Meisel (Ed.), Two first languages. Early grammatical development in bilingual children (pp. 87–122). Dordrecht, The Netherlands: Foris.
Schlyter, S. (1990b). Introducing the DUFDE project. In J. Meisel (Ed.), Two first languages. Early grammatical development in bilingual children (pp. 73–86). Dordrecht, The Netherlands: Foris.
Schlyter, S. (1995). Formes verbales du passé dans des interactions en langue forte et en langue faible. AILE (Acquisition et Interaction en Langue Étrangère), 6, 129–152.
Serratrice, L. (1999). The emergence of functional categories in bilingual first language acquisition. Unpublished doctoral dissertation, University of Edinburgh, U.K.
Serratrice, L. (2001). The emergence of verbal morphology and the lead-lag pattern issue in bilingual acquisition. In J. Cenoz & F. Genesee (Eds.), Trends in bilingual acquisition (pp. 43–70). Amsterdam: Benjamins.
Serratrice, L. (2002). Overt subjects in English: evidence for the marking of person in an English-Italian bilingual child. Journal of Child Language, 29(2), 1–29.
Sinka, I. (2000). The search for cross-linguistic influences in the language of young Latvian-English bilinguals. In S. Döpke (Ed.), Cross-linguistic structures in simultaneous bilingualism (pp. 149–174). Amsterdam: Benjamins.
Sinka, I., & Schelleter, C. (1998). Morphosyntactic development in bilingual children. International Journal of Bilingualism, 2, 301–326.
Slobin, D. (1973). Cognitive prerequisites for the development of grammar. In C. Ferguson & D. Slobin (Eds.), Studies of child language development (pp. 175–208). New York: Holt, Rinehart, and Winston.
Stefanik, J. (1995). Grammatical category of gender in Slovak-English and English-Slovak bilinguals. Journal of East European Studies, 4, 155–164.
Stefanik, J. (1997). A study of English-Slovak bilingualism in a child. Journal, 30, 721–734.
Stenzel, A. (1994). Case assignment and functional categories in bilingual children. In J. Meisel (Ed.), Bilingual first language acquisition. French and German grammatical development (pp. 161–208). Amsterdam: Benjamins.
Stenzel, A. (1996). Development of prepositional case in a bilingual child. Linguistics, 34, 1029–1058.
Tabors, P. (1987). The development of communicative competence by second language learners in a nursery school classroom: An ethnolinguistic study. Unpublished doctoral dissertation, Harvard University, Boston, MA.
Taeschner, T. (1983). The sun is feminine: A study on language acquisition in bilingual children. Berlin: Springer Verlag.
Van den Bogaerde, B., & Baker, A. (2002). Are young deaf children bilingual? In G. Morgan & B. Woll (Eds.), Directions in sign language acquisition (pp. 183–206). Amsterdam: Benjamins.
Vihman, M. (1999). The transition to grammar in a bilingual child: Positional patterns, model learning, and relational words. International Journal of Bilingualism, 3, 267–301.
Volterra, V., & Taeschner, T. (1978). The acquisition and development of language by bilingual children. Journal of Child Language, 5, 311–326.
Wanner, P. (1996). A study of the initial codeswitching stage in the linguistic development of an English-Japanese bilingual child. Japan Journal of Multilingualism and Multiculturalism, 2, 20–40.
Weissenborn, J., & Höhle, B. (Eds.). (2001). Approaches to bootstrapping (Vols. 1 and 2). Amsterdam: Benjamins.
3
A Unified Model of Language Acquisition
For example, in English, the positioning of the subject before the verb is a form that expresses the function of marking the perspective or agent. Or, to give another example, the pronoun "him" is a form that expresses the functions of masculine gender and the role of the object of the verb. The Competition Model focuses primarily on the use of forms as cues to role assignment, coreference, and argument attachment as outlined in MacWhinney (1987a). Mappings are social conventions that must be learned for each of the eight linguistic arenas, including lexicon, phonology, morphosyntax, and mental models.

4. Storage. The learning of new mappings relies on storage in both short-term and long-term memory. Gupta and MacWhinney (1997) developed an account of the role of short-term memory in the construction of memories for the phonological forms of words and the mapping of these forms into meaningful lexical items. Short-term memory is also crucially involved in the online processing of specific syntactic structures (Gibson, Pearlmutter, Canseco-Gonzalez, & Hickok, 1996; MacWhinney & Pléh, 1988). MacWhinney (1999) examined how the processes of perspective switching and referent identification can place demands on verbal memory processes during mental model construction. The operation of these memory systems constrains the role of cue validity during both processing and acquisition. For example, the processing of subject-verb agreement for inverted word orders in Italian is not fully learned until about age 8 years (Devescovi, D'Amico, Smith, Mimica, & Bates, 1998) despite its high cue validity and high cue strength in adult speakers.

5. Chunking. The size of particular mappings depends on the operation of processes of chunking. Work in L1 acquisition has shown that children rely on both combinatorial processing and chunking to build up syllables, words, and sentences. For example, a child may treat "what's this" as a single unit or chunk, but will compose phrases such as "more cookie" and "more milk" by combination of "more" with a following argument. MacWhinney (1978, 1982) and Stemberger and MacWhinney (1986) showed how large rote chunks compete with smaller analytic chunks in both children and adult learners.

6. Codes. When modeling bilingualism and L2 acquisition, it is important to have a clear theory of code activation. The Competition Model distinguishes two components of the theory of code competition. The first component is the theory of transfer. This theory has been articulated in some detail in Competition Model work in terms of predictions for both positive and negative transfer in the various linguistic arenas. The second component is the theory of code interaction, which determines code selection, switching, and mixing. The Competition Model relies on the notion of resonance, discussed next, to account for coactivation processes in both L2 learners and bilinguals. The choice of a particular code at a particular moment during lexicalization depends on factors such as activation from previous lexical items, the influence of lexical gaps, expression of sociolinguistic options (Ervin-Tripp, 1968), and conversational cues produced by the listener.

7. Resonance. Perhaps the most important area of new theoretical development in the Unified Competition Model is the theory of resonance. This theory seeks to relate the Competition Model to research in the area of embodied or embedded cognition, as well as newer models of processing in neural networks.

The seven-component model sketched out here includes no separate component for learning. This is because learning is seen as an interaction between each of the various subcomponents during the processes of competition and resonance. Each of the seven components of the model is now explored in more detail.

Competition

The basic notion of competition is fundamental to most information-processing models in cognitive psychology. In the unified model, competition takes on slightly different forms in each of the eight competitive arenas. These arenas are not thought of as encapsulated modules, but as playing fields that can readily accept input from other arenas when that input is made available. In the course of work on the core model and related mechanisms, my colleagues and I have formulated working computational models for most of these competitive arenas.

1. In the auditory arena, competition involves the processing of cues to lexical forms based on both bottom-up features and activation from lexical forms. Models of this process include those that emphasize top-down activation (Elman & McClelland, 1988) and those that exclude it (Norris, 1994). In the Competition Model, bottom-up activation is primary, but top-down activation will occur
in natural conditions and in those experimental tasks that promote resonance.

2. In the lexical arena, competition occurs within topological maps (Li, Farkas, & MacWhinney, 2004) in which words are organized by semantic and lexical type.

3. In the morphosyntactic arena, there is an item-based competition between word orders and grammatical markings centered on valence relations (MacDonald, Pearlmutter, & Seidenberg, 1994; MacWhinney, 1987b).

4. In the interpretive arena, there is a competition between fragments of mental models as the listener seeks to construct a unified mental model (MacWhinney, 1989) that can be encoded in long-term memory (Hausser, 1999).

5. In the arena of message formulation, there is a competition between communicative goals. Winning goals are typically initialized and topicalized.

6. In the arena of expressive lexicalization, there is a competition between words for the packaging and conflation of chunks of messages (Langacker, 1989).

7. In the arena of sentence planning, there is a competition of phrases for initial position and a competition between arguments for attachment to slots generated by predicates (Dell, Juliano, & Govindjee, 1993).

8. In the arena of articulatory planning, there is a competition between syllables for insertion into a rhythmic phrasal output pattern (Dell et al., 1993).

Cues

Experimental work in the Competition Model tradition has focused on measurement, using a simple sentence interpretation procedure, of the relative strength of various cues to the selection of the agent. Subjects listen to a sentence with two nouns and a verb and are asked to say who was the actor. In a few studies, the task involved direct object identification (Sokolov, 1988, 1989), relative clause processing (MacWhinney & Pléh, 1988), or pronominal assignment (MacDonald & MacWhinney, 1990; McDonald & MacWhinney, 1995); usually, the task was agent identification. Sometimes, the sentences were well-formed grammatical sentences, such as "The cat is chasing the duck." Sometimes, they involved competitions between cues, as in the ungrammatical sentence "*The duck the cat is chasing." Depending on the language involved, the cues in these studies included word order, subject-verb agreement, object-verb agreement, case marking, prepositional case marking, stress, topicalization, animacy, omission, and pronominalization. These cues were varied in a standard orthogonalized analysis of variance design with three or four sentences per cell to increase statistical reliability. The basic questions were always the same: What is the relative order of cue strength in the given language, and how do these cue strengths interact?

In English, the dominant cue for subject identification is preverbal positioning. For example, in the English sentence "The eraser hits the cat," it is assumed that "the eraser" is the agent. However, a parallel sentence in Italian or Spanish would have "the cat" as the agent. This is because the word order cue is not as strong in Italian or Spanish as it is in English. In Spanish, the prepositional object marker "a" is a clear cue to the object, and the subject is the noun that is not the object. An example of this is the sentence "El toro mató al torero" ("The bull killed to the bullfighter"). No such prepositional cue exists in English.

In German, case marking on the definite article is a powerful cue to the subject. In a sentence such as "Der Lehrer liebt die Witwe" ("The teacher loves the widow"), the presence of the nominative masculine article "der" is a sure cue to identification of the subject. In Russian, the subject often has a case suffix. In Arabic, the subject is the noun that agrees with the verb in number and gender, and this cue is stronger than the case marking cue. In French, Spanish, and Italian, when an object pronoun is present, it can help identify the noun that is not the subject.

Thus, we see that Indo-European languages can vary markedly in their use of cues to mark case roles. When we go outside Indo-European languages to languages like Navajo, Hungarian, or Japanese, the variation becomes even more extreme.

To measure cue strength, Competition Model experiments rely on sentences with conflicting cues. For example, in "The eraser push the dogs," the cues of animacy and subject-verb agreement favor "the dogs" as agent. However, the stronger cue of preverbal positioning favors "the eraser" as agent. As a result, English-speaking adult subjects strongly favor "the eraser" even in a competition sentence of this type. However, about 20% of the participants will choose "the dogs" in this case. To measure the validity of cues in the various languages studied, we rely on text counts in which we list the cues in favor of each noun and track the relative availability and reliability of each cue. Cue
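The text-count procedure just described can be illustrated with a small sketch. It assumes the usual Competition Model definitions, which the passage names but does not spell out: availability is the share of sentences in which a cue is present, reliability is the share of those sentences in which the cue points to the actual agent, and validity is the product of the two. The cue names, the `CueCount` class, and all counts below are hypothetical and were invented purely for illustration; they are not data from any study.

```python
from dataclasses import dataclass


@dataclass
class CueCount:
    """Tallies for one cue in a (hypothetical) text count."""
    present: int  # sentences in which the cue is present
    correct: int  # of those, sentences in which the cue points to the real agent


def cue_stats(count: CueCount, total_sentences: int) -> dict:
    """Return availability, reliability, and validity for one cue.

    Assumed definitions (not given in the surrounding text):
      availability = present / total_sentences
      reliability  = correct / present
      validity     = availability * reliability
    """
    availability = count.present / total_sentences
    reliability = count.correct / count.present if count.present else 0.0
    return {
        "availability": availability,
        "reliability": reliability,
        "validity": availability * reliability,
    }


# Invented counts for an English-like sample of 200 sentences.
TOTAL = 200
counts = {
    "preverbal_position": CueCount(present=180, correct=171),
    "agreement": CueCount(present=40, correct=38),
    "animacy": CueCount(present=100, correct=80),
}

stats = {cue: cue_stats(c, TOTAL) for cue, c in counts.items()}

# Rank cues from highest to lowest validity.
ranking = sorted(stats, key=lambda cue: stats[cue]["validity"], reverse=True)

for cue in ranking:
    s = stats[cue]
    print(f"{cue}: availability={s['availability']:.2f}, "
          f"reliability={s['reliability']:.2f}, validity={s['validity']:.3f}")
```

With these invented counts, a cue that is highly reliable but rarely available (agreement) ends up with lower validity than a less reliable cue that is available more often (animacy), which is the kind of trade-off that a text count of this sort is meant to expose.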
processes is the learning of the skill of simultaneous translation (Christoffels & De Groot, chapter 22, this volume). Practitioners of this art are able to listen in one language and speak in the other in parallel while performing a complex mapping of the message of the input language to the very different syntax of the output language. The very existence of simultaneous translation underscores the extent to which two languages can be coactivated (Spivey & Marian, 1999) for long periods of time (Meuter, chapter 17, this volume).

The problems involved in simultaneous translation nicely illustrate how language can place a heavy load on functional neural circuits. Let us take a simple case to illustrate the problem. Consider a German sentence with a verb in final position. If the German sentence is short, the interpreter will have little problem converting the German SOV (subject-object-verb) order to English SVO (subject-verb-object) order. For example, a sentence like "Johannes hat den Mann mit dem dunkelen Mantel noch nicht kennengelernt" ("John has not yet met the man with the dark coat") will cause few problems because the interpreter can lag behind the speaker enough to take in the whole utterance along with the verb before starting to speak. The interpreter prepares an utterance with a subject and object already in final form. When the verb comes along, it is simply a matter of translating it to the English equivalent, dropping it into the prepared slot, and starting articulatory output. However, if there is additional material piled up before the verb, the problem can get worse. Typically, simultaneous interpreters try not to lag more than a few words behind the input. To avoid this, one solution would be to store away the short subject and dump out the large object as the head of a passive as in, "The man with the dark coat has not yet been met by John." Another, rather unhappy, solution is topicalization, as in "John, in regard to the man with the dark coat, he hasn't seen him yet." Similar problems can arise when translating from relative clauses in languages with VSO (verb-subject-object) order, such as Tagalog or Arabic. Studies of Hungarian (MacWhinney & Pléh, 1988) and Japanese (Hakuta, 1981) show that the stacking up of unlinked noun phrases can be even worse in SOV languages.

If interpreters had access to an unlimited verbal memory capacity, there would be little worry about storing long chunks of verbal material. However, we know that our raw memory for strings of words is not nearly large enough to accommodate the simultaneous interpretation task. In fact, the conventional estimate of the number of items that can be stored in short-term memory is about four. The interpreter's task is made even more difficult by the fact that they must continue to build mental models of incoming material (MacWhinney, 1999) while using previously constructed mental models as the basis for ongoing articulation. To do this successfully, the interpreter must be able to delineate chunks of comprehended material that are sufficient to motivate full independent output productions. In effect, the interpreter must maintain two separate conceptual foci centered about two separate points in conceptual space. The first attentional focus continues to take in new material from the speaker in terms of new valence and conceptual relations. The second attentional focus works on the comprehended structure to convert it to a production structure. The location of the production focus is always lagged after that of the comprehended structure, so the interpreter always has a split in conceptual attention. As a result of the load imposed by this attentional split and ongoing activity in two channels, interpreters often find that they cannot continue this line of work past the age of 45 years or so.

Interpreters are not the only speakers who are subject to load on their use of functional neural circuits. It is easy to interfere with normal language processing by imposing additional loads on the listener or speaker. Working within a standard Competition Model experimental framework, Kilborn (1989) has shown that even fully competent bilinguals tend to process sentences more slowly than monolinguals. However, when monolinguals are asked to listen to sentences under conditions of white noise, their reaction times are identical to those of the bilinguals. Similarly, Blackwell and Bates (1995) and Miyake, Carpenter, and Just (1994) have shown that, when subjected to conditions of noise, normal individuals process sentences much like aphasics. Gerver (1974) and Seleskovitch (1976) reported parallel results for the effects of noise on simultaneous interpretation.

Chunking

The component of chunking is a recent addition to the Competition Model. However, this idea is certainly not a new one for models of language learning. Chunking operates to take two or more items that frequently occur together and combine them into a single automatic chunk. Chunking
is the basic learning mechanism in Newell's general cognitive model (Newell, 1990), as well as in many neural network models. MacWhinney and Anderson (1986) showed how the child can use chunking processes to build up larger grammatical structures and complex lexical forms. Ellis (1994) has shown how chunking can help us understand the growth of fluency in L2 learning. Gupta and MacWhinney (1997) showed how chunking can also apply to the learning of the phonological shape of individual words for both L1 and L2.

Chunking plays a particularly interesting role in the acquisition of grammar. For L2 learners, mastering a complex set of inflectional patterns is a particularly daunting challenge. These problems are a result of the tendency of L2 learners to fail to pick up large enough phrasal chunks. For example, if learners of German not only would pick up that Mann means "man," but also would learn phrases such as der alte Mann, meines Mannes, den jungen Männern, and ein guter Mann, then they not only would know the gender of the noun, but also would have a good basis for acquiring the declensional paradigm for both the noun and its modifiers. However, if they analyze a phrase like der alte Mann into the literal string "the old man" and throw away all of the details of the inflections on der and alte, then they will lose an opportunity to induce the grammar from implicit generalization across stored chunks. If, on the other hand, the learner stores larger chunks of this type, then the rules of grammar can emerge from analogic processing of the stored chunks.

Chunking also leads to improvements in fluency (Segalowitz & Hulstijn, chapter 18, this volume). For example, in Spanish, L2 learners can chunk together the plan for buenos with the plan for días to produce buenos días. They can then combine this chunk with muy to produce muy buenos días ("very good morning"). Chunking (Ellis, 1994) allows the learner to get around problems with Spanish noun pluralization, gender marking, and agreement that would otherwise have to be reasoned out in detail for each combination. Although the learner understands the meanings of the three words in this phrase, the unit can function as a chunk, thereby speeding production.

Codes and Transfer

Any general model of L2 learning must be able to account for interlanguage phenomena such as transfer and code switching. In addition, it must offer an account of age-related learning effects that have been discussed in terms of critical periods and fossilization. Because of space limitations, I will not include a discussion of code-switching theory here and focus instead on the theory of transfer and its impact on age-related effects.

The basic claim is that whatever can transfer will. This claim is theoretically important for at least two reasons. First, because the Competition Model emphasizes the interactive nature of cognitive processing, it must assume that, unless the interactions between languages are controlled and coordinated, there would be a large amount of transfer. Second, the model needs to rely on transfer to account for age-related declines in L2 learning ability without invoking the expiration of a genetically programmed critical period (Birdsong, chapter 6, this volume; DeKeyser and Larson-Hall, chapter 5, this volume).

For simultaneous bilingual acquisition (De Houwer, chapter 2, this volume), the model predicts code blending in young children only when parents encourage this or when there are gaps in one language that can be filled by borrowing from the other. This prediction follows from the role of resonance in blocking transfer. When the child's two languages are roughly similar in dominance or strength, each system generates enough system-internal resonance to block excessive transfer. However, if one of the languages is markedly weaker (Döpke, 2000), then it will not have enough internal resonance to block occasional transfer. The situation is very different for L2 learners because the balance between the languages is then tipped so extremely in favor of L1. To permit the growth of resonance in L2, learners must apply additional learning strategies that would not have been needed for children. These strategies focus primarily on optimization of input, promotion of L2 resonance, and avoidance of processes that destroy input chunks.

In the next sections, the evidence for transfer from L1 to L2 is briefly reviewed. There is clear evidence for massive transfer in audition, articulation, lexicon, sentence interpretation, and pragmatics. In the area of morphosyntax and sentence production, transfer is not as massive, largely because it is more difficult to construct the relations between L1 and L2 forms in these areas. Pienemann, Di Biase, Kawaguchi, and Håkansson (chapter 7, this volume) have argued that transfer in these areas is less general than postulated by the Competition Model. However, their analysis underestimates transfer effects in their own data.
the L1 conceptual world en masse to L2. Young bilinguals can also benefit from this conceptual transfer. When learners first acquire a new L2 form, such as silla in Spanish, they treat this form as simply another way of saying "chair." This means that initially the L2 system has no separate conceptual structure, and that its formal structure relies on the structure of L1. Kroll and Tokowicz (chapter 26, this volume) review models of the lexicon that emphasize the extent to which L2 relies on L1 forms to access meaning rather than accessing meaning directly. In this sense, we can say that L2 is parasitic on L1 because of the extensive amount of transfer from L1 to L2. The learner's goal is to reduce this parasitism by building up L2 representations as a separate system. They do this by strengthening the direct linkage between new L2 forms and conceptual representations.

Given the fact that connectionism predicts such massive transfer for L1 knowledge to L2, it might be asked why more transfer error in L2 lexical forms is not seen. There are three reasons for this:

First, a great deal of transfer occurs smoothly and directly without producing error. Consider a word like "chair" in English. When the native English speaker begins to learn Spanish, it is easy to use the concept underlying "chair" to serve as the meaning for the new word silla in Spanish. The closer the conceptual, material, and linguistic worlds of the two languages are, the more successful this sort of positive transfer will be. Transfer only works smoothly when there is close conceptual match. For example, Ijaz (1986) showed how difficult transfer can be for Korean learners of English in semantic domains involving transfer verbs, such as "take" or "put." Similarly, if the source language has a two-color system (Berlin & Kay, 1969), as in Dani, acquisition of an eight-color system, as in Hungarian, will be difficult. These effects underscore the extent to which L2 lexical items are parasitic on L1 forms.

Second, learners are able to suppress some types of incorrect transfer. For example, when a learner tries to translate the English noun "soap" into Spanish by using a cognate, the result is sopa ("soup"). Misunderstandings created by "false friend" transfers such as this will be quickly detected and corrected. Similarly, an attempt to translate the English form "competence" into Spanish as competencia will run into problems because competencia means "competition." Dijkstra (chapter 9, this volume) notes that, in laboratory settings, the suppression of these incorrect form relatives is incomplete, even in highly proficient bilinguals. However, this persistent transfer effect is probably less marked in nonlaboratory contexts.

Third, error is minimized when two words in L1 map onto a single word in L2. For example, it is easy for an L1 Spanish speaker to map the meanings underlying saber and conocer (Stockwell, Bowen, & Martin, 1965) onto the L2 English form "know." Dropping the distinction between these forms requires little in the way of cognitive reorganization. It is difficult for the L1 English speaker to acquire this new distinction when learning Spanish. To control this distinction correctly, the learner must restructure the concept underlying "know" into two new related structures. In the area of lexical learning, these cases should cause the greatest transfer-produced errors.

Transfer in Sentence Comprehension

Transfer is also pervasive in the arena of sentence interpretation. There are now over a dozen Competition Model studies that have demonstrated the transfer of a "syntactic accent" in sentence interpretation (Bates & MacWhinney, 1981; De Bot & Van Montfort, 1988; Gass, 1987; Harrington, 1987; Kilborn, 1989; Kilborn & Cooreman, 1987; Kilborn & Ito, 1989; Liu, Bates, & Li, 1992; McDonald, 1987a, 1987b; McDonald & Heilenman, 1991; McDonald & MacWhinney, 1989). Frenck-Mestre (chapter 13, this volume) presents a particularly elegant design demonstrating this type of effect during online processing. These studies have shown that the learning of sentence processing cues in an L2 is a gradual process. The process begins with L2 cue weight settings that are close to those for L1. Over time, these settings change in the direction of the native speakers' settings for L2.

This pattern of results is perhaps most clearly documented in McDonald's studies of English-Dutch and Dutch-English L2 learning (McDonald, 1987b). This pattern shows the decline in the strength of the use of word order by English learners of Dutch over increased levels of competence. In Fig. 3.2, the monolingual cue usage pattern for English is given on the left and the monolingual Dutch pattern is given on the right. Between these two patterns, we see a declining use of word order and an increasing use of case inflection across three increasing levels of learning of Dutch. In Fig. 3.3, we see exactly the opposite pattern for Dutch learners of English. These results and others like them constitute strong support for
the Competition Model view of L2 learning as the gradual growth of cue strength.

[Figure: plot of percentage variance accounted for (y-axis) by noun animacy, case inflection, and word order.]

The Competition Model view of the two languages as interacting in a variety of ways is further supported by evidence of effects from L2 back to L1. Sentence processing studies by Liu et al. (1992) and Dussias (2001) have demonstrated the presence of just such effects. Although the Competition Model requires that the strongest transfer effects should be from L1 to L2, the view of competition as interactive leads us to expect some weaker amount of transfer from L2 back to L1.

Transfer in Pragmatics

The acquisition of pragmatic patterns is also heavily influenced by L1 transfer. When we first begin to use an L2, we may extend our L1 ideas about the proper form of greetings, questions, offers, promises, expectations, turn taking, topic expansion, face-saving, honorifics, presuppositions, and implications. If the two cultures are relatively similar, much of this transfer will be successful. However, there will inevitably be some gaps. In many cases, the L2 learner eventually will need to reconstruct the entire system of pragmatic patterns in the way they were learned by the child acquiring L1.

Much of this learning is based on specific phrases and forms. For example, the L1 learner's understanding of greetings is tightly linked to use of specific phrases such as Guten Morgen or Bye-bye. Learning about how and when to use specific speech acts is linked to learning about forms such as "Could you?", "Listen," and "Why not?" Learning these forms in a concrete context is important for both L1 and L2 learners.

However, pragmatics involves much more than simple speech act units or pairs. We also need to learn larger frames for narratives, argumentation, and polite chatting. By following the flow of perspectives and topics in conversations (MacWhinney, 1999), models of how discourse represents reality in both L1 and L2 can eventually be internalized.

Transfer in Morphology

Learning of the morphological marking or inflections of an L2 is very different from learning of the other areas we have discussed. This is because, in morphosyntax, it is typically impossible to transfer from L1 to L2. For example, an English learner of German cannot use the English noun gender system as a basis for learning the German noun gender system. This is because English does not have a fully elaborated noun gender system. Of course, English does distinguish between genders in the pronouns (he vs. she), and this distinction is of some help in learning to mark German nouns that have natural gender such as der Vater (the-masc
father) and die Mutter (the-fem mother). However, one really does not need to rely on cues from English "he" and "she" to realize that fathers are masculine and mothers are feminine. On the other hand, there can be some real transfer effects to German from other languages that have full nominal gender systems. For example, a Spanish speaker might well want to refer to the moon as feminine on the basis of la luna in Spanish and produce the erroneous form die Mond in German rather than the correct masculine form der Mond.

Figure 3.3 Changes in cue strength as Dutch speakers learn English (McDonald, 1987b). D/E 1, D/E 2, D/E 3 indicate Dutch-English learning levels. [Plot: y-axis, percentage variance accounted for; x-axis, group; cues plotted: noun animacy, case inflection, word order.]

Similarly, a Spanish learner of Chinese cannot use L1 knowledge to acquire the system of noun classifiers because Spanish has no noun classifiers. Chinese learners of English cannot use their L1 forms to learn the English contrast among definite, indefinite, and zero articles. This is because Chinese makes no overt distinctions in this area, leaving the issue of definiteness to be marked in other ways, if at all.

The fact that morphosyntax is not subject to transfer is a reflection of the general Competition Model dictum that everything that can transfer will. In the areas of phonology, lexicon, orthography, syntax, and pragmatics, there are attempts to transfer. However, in morphology there is no transfer because there is no basis for transfer. The exception here is between structurally mappable features, as in the example of gender transfer from Spanish to German.

Although there is no transfer of the exact forms of morphosyntax and little transfer of secondary mappings such as thinking that the moon is feminine, there is important positive and negative transfer of the underlying functions expressed by morphological devices. Concepts such as the instrumental, locatives, or benefactives often have positive transfer between languages. For example, many languages merge the instrumental "with" and the comitative "with." If L1 has this merger, it is easy to transfer the merged concept to L2. Similarly, semantically grounded grammatical distinctions such as movement toward and movement from can easily be transferred across languages.

However, in other areas, transfer is less positive. One remarkable area of difficulty is in the learning of article marking in English by speakers of Chinese, Japanese, or Korean. These languages have no separate category of definiteness, instead using classifiers and plurals to express some of the functions marked by the English definite. Moreover, the complexity of the subcomponents of definiteness in
English stands as a major barrier for speakers of these languages.

Transfer in Sentence Production

Pienemann et al. (chapter 7, this volume) present evidence that the Competition Model claim that "everything that can transfer will" does not hold in the area of L2 sentence production. Instead, they suggest that only those linguistic forms that the learner can process can be transferred to the L2. Their analysis of this issue is exceptionally detailed, and the additional evidence they bring to bear is bound to lead to very helpful sharpening of the issues at stake.

Pienemann et al. (chapter 7, this volume) present the case of the learning of the German V2 rule by speakers of L1 Swedish. The V2 rules in Swedish and German allow speakers to front adverbs like "today" or "now" as long as the verb immediately follows in V2 position. This rule produces sentences such as "Today likes Peter milk." The surprising finding is that Swedes do not produce this order from the beginning, starting instead with "Today Peter likes milk." This finding is only surprising if one believes that what learners transfer are whole syntactic frames for whole sentences. However, the Competition Model holds that the basic unit of both L1 and L2 acquisition is the item-based pattern. In this case, learners first learn to place the subject before the verb, as in "Peter likes milk." Later, they add the adverb to produce "Peter likes milk today." Only in the final stages of learning do they then pick up the item-based frame that allows adverbs to take the initial slot.

The important point here is that, in this part of sentence production, much as in morphology, the mapping from L1 to L2 is low level and conservative. Thus, the failure to see a transfer of the V2 rule from Swedish to German is based on the fact that Swedes are learning German from item-based patterns, not by picking up whole sentence frames at a time. The emphasis on learning from item-based patterns should hold for all beginning L2 learners. For example, early transfer to Italian of the English cleft structure would not be expected, although the structure is present in both languages, and learners will eventually make the mapping. The problem is that during the first stages of learning, learners are just not working on the sentence level.

The opposite side of this coin is that, when L2 structures can be learned early on as item-based patterns, this learning can block transfer from L1. Pienemann et al. (chapter 7, this volume) present the example of learning of Japanese SOV order by speakers of L1 English. These learners almost never generalize English SVO to Japanese. Of course, the input to L2 learners consistently emphasizes SOV order and seldom presents any VO sequences, although these do occur in colloquial Japanese. This learning is best understood in terms of the account of MacWhinney (1982, 1987a). Learners acquire a few initial Japanese verbs as item-based constructions with slots for objects in preverbal position marked by the postposition o and topics in initial position marked by the postpositions wa or ga. After learning a few such items, they generalize to the feature-based construction of SOV. This is positive learning based on consistent input in L2. If L1 were to have a transfer effect at this point, it would be extremely brief because L2 is so consistent, and these item-based constructions are in the focus of the learner's attention.

What these two examples illustrate is that L1 transfer in the areas of sentence production and morphosyntax is limited by the fact that morphosyntax is the most language-specific part of a target language. Because the mappings are hard to make, transfer in this area is minimized. Once relations between the two languages can be constructed, as in the case of the transfer of the English cleft to Spanish, some positive transfer can be expected. However, we should not expect to see consistent early transfer in this particular area. Thus, the analyses of Pienemann et al. (chapter 7, this volume) are remarkably close to those found in the Competition Model once the importance of item-based patterns is recognized.

Resonance

As mentioned, the Unified Competition Model includes three new components that were not found in the classic model. These are chunking, codes, and resonance. The theory of chunking is certainly not new and could well have been included in the model many years ago. The theory of code relations is also not entirely new because it incorporates and extends ideas about transfer that have been in development within the Competition Model for nearly 15 years. The component of resonance, on the other hand, is new to the theory. Despite this newness to the model, it plays an important central role in understanding code separation, age-related effects, and the microprocesses of learning and processing.
It is fairly easy to get an intuitive grasp of what resonance means in L1 and L2 learning. Resonance occurs most clearly during covert inner speech. Vygotsky (1962) observed that young children would often give themselves instructions overtly. For example, a 2-year-old might say "pick it up" while picking up a block. At this age, the verbalization tends to guide and control the action. By producing a verbalization that describes an action, the child sets up a resonant connection between vocalization and action. Later, Vygotsky argues, these overt instructions become inner speech and continue to guide our cognition. The L2 learners go through a process much like that of the child. At first, they use the language only with others. Then, they begin to talk to themselves in the new language and start to think in the second language. At this point, the L2 begins to assume the same resonant status that the child attains for the L1.

Once a process of inner speech is set into motion, it can also be used to process new input and relate new forms to other forms paradigmatically. For example, if I hear the phrase ins Mittelalter in German, I can think to myself that this means that the stem Alter must be das Alter. This means that the dative must take the form in welchem Alter or in meinem Alter. These resonant form-related exercises can be conducted in parallel with more expressive resonant exercises in which I simply try to talk to myself about things around me in German or whatever language I happen to be learning.

On a mechanistic level, resonance is based on the repeated coactivation of reciprocal connections. As the set of resonant connections grows, the possibilities for cross associations and mutual activations grow, and the language starts to form a coherent coactivating neural circuit. Although this idea of resonance seems basic and perhaps obvious, it is important to note that modern connectionist models (Murre, chapter 8, this volume; Thomas & Van Heuven, chapter 10, this volume) have provided virtually no place for learning in resonant models. This is because current popular neural network models, such as backpropagation, work in only a feedforward fashion, so resonant links cannot be established or utilized. Self-organizing maps such as the DisLex model of Li et al. (in press) can provide local resonance between sound and meaning, but have not yet been able to model resonance on the syntactic level. Grossberg's (1987) adaptive resonance theory would seem to be one account that should capture at least some ideas about resonance. However, the resonant connections in that model only capture the role of attentional shifts in motivating the recruitment of additional computational elements.

Perhaps the model that comes closest to expressing the core notion of resonance is the interactive activation model of the early 1980s. Interactive activation models such as the bilingual interactive activation model and the bilingual interactive model of lexical access (Thomas & Van Heuven, chapter 10, this volume) have succeeded in accounting for important aspects of bilingual lexical processing. Although these models have not explicitly examined the role of resonance, they are at least compatible with the concept.

We can also use resonance as a way of understanding certain dynamic multilingual processes. For example, variations in the delays involved in code switching in both natural and laboratory tasks can be interpreted in terms of the processes that maintain language-internal resonant activations. If a particular language is repeatedly accessed, it will be in a highly resonant state. Although another language will be passively accessible, it may take a second or two before the resonant activation of that language can be triggered by a task. Thus, a speaker may not immediately recognize a sentence in a language that has not been spoken in the recent context. On the other hand, a simultaneous interpreter will maintain both languages in continual receptive activation while trying to minimize resonant activations in the output system of the source language.

Like La Heij (chapter 14, this volume), I would argue that multilingual processing relies more on activation and resonance than on inhibition (Green, 1998). Of course, it is known that the brain makes massive use of inhibitory connections. However, these are typically local connections that sharpen local competitions. Inhibition is also important in providing overt inhibitory control of motor output, as in speech monitoring. However, inhibition by itself cannot produce new learning, coactivation, and inner speech. For these types of processing, resonant activation is more effective.

The cognitive psychology of the 1970s (Atkinson, 1975) placed much emphasis on the role of strategic resonance during learning. More recently, the emphasis has been more on automatic processes of resonance, often within the context of theories of verbal memory. The role of resonance in L1 learning is an area of particular current importance. It is known that children can learn new words with only one or two exposures to the new sounds. For this to work successfully, children must resonantly activate the phonological store for that
word. In the model of Gupta and MacWhinney (1997), this resonance will involve keeping the phonological form active in short-term memory long enough for it to be reliably encoded into the central lexical network (Li et al., in press). This preservation of the auditory form in the phonological buffer is one form of resonant processing.

Resonance can facilitate the sharpening of contrasts between forms. Both L1 and L2 learners may have trouble encoding new phonological forms that are close to words they already know. Children can have trouble learning the two new forms pif and bif because of their confusability, although they can learn pif when it occurs along with wug (Stager & Werker, 1997).

This same phonological confusability effect can have an impact on L2 learners. For example, when I came to learn Cantonese, I needed to learn to pay careful attention to marking with tones so I would not confuse "mother," "measles," "linen," "horse," and "scold" as various forms of /ma/. Once a learner has the tonal features right, it is still important to pay attention to each part of a word. For example, when I was learning the Cantonese phrase for "pivoting your foot inward," I initially encoded it as kau geu instead of the correct form kau geuk. This is because there is a tendency in Cantonese to reduce final /k/. However, the reduced final /k/ is not totally absent and has an effect on the quality of the preceding vowel. At first, I did not attend to this additional component or cue. However, after my encoding for kau geu became automated, my attentional focusing was then freed enough so that I could notice the presence of the final /k/. This expansion of selective attention during learning is a very general process.

Once the auditory form is captured, the learner needs to establish some pathway between the sound and its meaning. Because few words encode any stable conventional phonological symbolism, pathways of this type must be constructed anew by each language learner. It has been proposed that activation of the hippocampus (McClelland, McNaughton, & O'Reilly, 1995) is sufficient to encode arbitrary relations of this sort. If this were true, L2 learners would have virtually no problem picking up long lists of new vocabulary items. Although the hippocampus certainly plays a role in maintaining a temporary resonance between sound and meaning, it is up to the learner to extract additional cues that can facilitate the formation of the sound-meaning linkage.

Resonant mappings can rely on synesthesia (Ramachandran & Hubbard, 2001), onomatopoeia, sound symbolism, postural associations (Paget, 1930), lexical analysis, or a host of other provisional relations. It is not necessary that this symbolism be in accord with any established linguistic pattern. Instead, each learner is free to discover a different pattern of associations. This nonconventional nature of resonant connections means that it will be difficult to demonstrate the use of specific resonant connections in group studies of lexical learning.

However, it is known that constructive mnemonics provided by the experimenter (Atkinson, 1975) greatly facilitate learning. For example, when learning the German word Wasser, the sound of water running out of a faucet can be imagined and associated with the /s/ of Wasser. For this word, the sound of the German word can also be associated to the sound of the English word water. At the same time, we can associate Wasser with collocations such as Wasser trinken, which themselves resonate with Bier trinken and others. Together, these resonant associations among collocations, sounds, and other words help to link the German word Wasser into the developing German lexicon.

It is likely that children also use these mechanisms to encode the relations between sounds and meanings. Children are less inhibited than adults in their ability to create ad hoc symbolic links between sounds and meanings. The child learning German as an L1 might associate the shimmering qualities of Wasser with a shimmering aspect of the sibilant; or the child might imagine the sound as plunging downward in tone in the way that water comes over a waterfall. The child may link the concept of Wasser tightly to a scene in which someone pours ein Glas Wasser, and then the association between the sound of Wasser and the image of the glass and the pouring is primary. For the L1 learner, these resonant links are woven together with the entire nature of experience and the growing concept of the world.

A major dimension of resonant connections is between words and our internal image of the human body. For example, Bailey, Chang, Feldman, and Narayanan (1998) characterize the meaning of the verb stumble in terms of the physical motion of the limbs during walking, the encountering of a physical object, and the breaking of gait and posture. As Tomasello (1992) noted, each new verb learned by the child can be mapped onto a physical or cognitive frame of this type. In this way, verbs and other predicates can support the emergence of a grounded mental model for sentences. Workers
in L2 (Asher, 1977) have often emphasized the Chinese writing system is based largely on radical
importance of action for the grounding of new elements that have multiple resonant associations
meanings, and this new literature in cognitive with the sounds and meanings of words.
grammar provides good theoretical support for Resonance can also play an important role in
that approach. Item-based patterns are theoreti- the resolution of errors. For example, I recently
cally central in this discussion because they provide noted that I had wrongly coded the stress on the
a powerful link between the earlier Competition Spanish word abanico (fan) on the second syllable,
Model's emphasis on processing and cue validity as in abánico. To correct this error, I spent time
and the newer theories of grounded cognition both rehearsing the correct stress pattern a few
(MacWhinney, 1999). times and then visualizing the word as spelled
Resonance can make use of analogies between without the stress mark or with the stress on the
stored chunks, as described in the theories for second syllable, which is normally not written in
storage and chunking. Gentner and Markman Spanish spelling. I also tried to associate this pat-
(1997), Hofstadter (1997), and others have for- tern in my mind with the verb abanicar (fan) and
mulated models of analogical reasoning that have even the rst person singular of this verb that has
interesting implications for language acquisition the form abanico. Having rehearsed this form in
models. Analogies can be helpful in working out these various ways and having established these
the rst examples of a pattern. For example, a child resonant connections, the tendency to produce the
learning German may compare steh auf! (stand only incorrect form was somewhat reduced, al-
up!) with er mub aufstehen (He must get up). The though it will take time to banish fully the traces of
child can see that the two sentences express the the incorrect pattern.
same activity, but that the verbal prex is moved in
one. Using this pattern as the basis for further
resonant connections, the child can then begin to Age-Related Effects
acquire a general understanding of verbal prex
placement in German. It may be helpful to review here how the Unied
The adult L2 learner tends to rely on rather less- Competition Model accounts for age-related
imaginative and more structured resonant linkages. changes in language learning ability. As Dekeyser
One important set of links available to the adult is and Larson-Hall (chapter 5, this volume) note, the
orthography. When an L2 learner of German learns default account in this area has been the critical
the word Wasser, it is easy to map the sounds of the period hypothesis, which holds that, after some
word directly to the image of the letters. Because time in late childhood or puberty, L2s can no
German has highly regular mappings from or- longer be acquired by the innate language acquisi-
thography to pronunciation, calling up the image tion device, but must be learned painfully and
of the spelling of Wasser is an extremely good way incompletely through explicit instruction.
of activating its sound. When the L2 learner is il- Following the work of Birdsong (chapter 6, this
literate or when the L2 orthography is unlike the volume), the Unied Competition Model attributes
L1 orthography, this backup system for resonance the observed facts about age-related changes to
will not be available. The L2 learning of Chinese by very different sources. The model emphasizes the
speakers of languages with Roman scripts illus- extent to which repeated use of L1 leads to its on-
trates this problem. In some signs and books in going entrenchment. This entrenchment operates
mainland China, Chinese characters are accompa- differentially across linguistic areas, with the stron-
nied by romanized pinyin spellings. This allows gest entrenchment occurring in output phonology
the L2 learner a method for establishing resonant and the least entrenchment in the area of lexicon,
connections among new words, their pronuncia- for which new learning continues to occur in L1
tion, and their representations in Chinese orthog- in any case. To overcome entrenchment, learners
raphy. However, in Taiwan and Hong Kong, must rely on resonant processes that allow the
characters are seldom written out in pinyin in ei- edgling L2 to resist the intrusions of L1, particu-
ther books or public notices. As a result, learners larly in phonology (Colome, 2001; Dijkstra,
cannot learn from these materials. To make use of Grainger, & Van Heuven, 1999). For languages
resonant connections from orthography, learners with familiar orthographies, resonance connections
must then focus on the learning of the complex can be formed among writing, sound, meaning, and
Chinese script. This learning itself requires a large phrasal units. For languages with unfamiliar or-
investment in resonant associations because the thographies, the domain of resonant connections
will be more constrained. This problem has a severe impact on older learners because they have become increasingly reliant on resonant connections between sound and orthography.

Because learning through resonant connections is highly strategic, L2 learners will vary markedly in the constructions they can control or that are missing or incorrectly transferred (Birdsong, chapter 6, this volume). In addition to the basic forces of entrenchment, transfer, and strategic resonant learning, older learners will be affected by problems with restricted social contacts, commitments to ongoing L1 interactions, and declining cognitive abilities. None of these changes predict a sharp drop at a certain age in L2 learning abilities. Instead, they predict a gradual decline across the life span.

Conclusion

This concludes the examination of the Unified Competition Model. Many of the pieces of this model have already been worked out in some detail. For example, there is a good model of cue competition in syntax for both L1 and L2. There are good models of L1 lexical acquisition. There are good data on phonological and lexical transfer in L2. There are clear data on the ways in which processing load has an impact on sentence processing in working memory. We are even learning about the neuronal bases of this load (Booth et al., 2001). Other areas provide targets for future work. But, the central contribution of the unified model is not in terms of accounting for specific empirical findings. Rather, the unified model provides a high-level road map of a very large territory that can now be filled out in greater detail.

References

Asher, J. (1977). Children learning another language: A developmental hypothesis. Child Development, 48, 1040–1048.
Atkinson, R. (1975). Mnemotechnics in second-language learning. American Psychologist, 30, 821–828.
Bailey, D., Chang, N., Feldman, J., & Narayanan, S. (1998). Extending embodied lexical development. Proceedings of the 20th Annual Meeting of the Cognitive Science Society, 64–69.
Bates, E., & MacWhinney, B. (1981). Second language acquisition from a functionalist perspective: Pragmatic, semantic and perceptual strategies. In H. Winitz (Ed.), Annals of the New York Academy of Sciences Conference on Native and Foreign Language Acquisition (pp. 190–214). New York: New York Academy of Sciences.
Bates, E., & MacWhinney, B. (1982). Functionalist approaches to grammar. In E. Wanner & L. Gleitman (Eds.), Language acquisition: The state of the art (pp. 173–218). New York: Cambridge University Press.
Berlin, B., & Kay, P. (1969). Basic color terms: Their universality and evolution. Berkeley: University of California Press.
Blackwell, A., & Bates, E. (1995). Inducing agrammatic profiles in normals: Evidence for the selective vulnerability of morphology under cognitive resource limitation. Journal of Cognitive Neuroscience, 7, 228–257.
Bley-Vroman, R., Felix, S., & Ioup, G. (1988). The accessibility of universal grammar in adult language learning. Second Language Research, 4, 1–32.
Booth, J. R., MacWhinney, B., Thulborn, K. R., Sacco, K., Voyvodic, J. T., & Feldman, H. M. (2001). Developmental and lesion effects during brain activation for sentence comprehension and mental rotation. Developmental Neuropsychology, 18, 139–169.
Clahsen, H., & Muysken, P. (1986). The availability of UG to adult and child learners: A study of the acquisition of German word order. Second Language Research, 2, 93–119.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: Language specific or language independent. Journal of Memory and Language, 45, 721–736.
De Bot, K., & Van Montfort, R. (1988). Cue-validity in het Nederlands als eerste en tweede taal. Interdisciplinair Tijdschrift voor Taal en Tekstwetenschap, 8, 111–120.
Dell, G., Juliano, C., & Govindjee, A. (1993). Structure and content in language production: A theory of frame constraints in phonological speech errors. Cognitive Science, 17, 149–195.
Devescovi, A., D'Amico, S., Smith, S., Mimica, I., & Bates, E. (1998). The development of sentence comprehension in Italian and Serbo-Croatian: Local versus distributed cues. In D. Hillert (Ed.), Syntax and semantics: Vol. 31. Sentence processing: A cross-linguistic perspective (pp. 345–377). San Diego, CA: Academic Press.
Dijkstra, A., Grainger, J., & Van Heuven, W. J. B. (1999). Recognizing cognates and interlingual homographs: The neglected role of phonology. Journal of Memory and Language, 41, 496–518.
Döpke, S. (2000). Generation of and retraction from cross-linguistically motivated structure in bilingual first language acquisition. Bilingualism: Language and Cognition, 3, 209–226.
Dussias, P. E. (2001). Bilingual sentence parsing. In J. L. Nicol (Ed.), One mind, two languages: Bilingual sentence processing (pp. 159–176). Cambridge, MA: Blackwell.
Ellis, R. (1994). A theory of instructed second language acquisition. In N. C. Ellis (Ed.), Implicit and explicit learning of language (pp. 79–114). San Diego, CA: Academic Press.
Elman, J. L., & McClelland, J. L. (1988). Cognitive penetration of the mechanisms of perception: Compensation for coarticulation of lexically restored phonemes. Journal of Memory and Language, 27, 143–165.
Ervin-Tripp, S. M. (1968). Sociolinguistics. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 4, pp. 91–165). New York: Academic Press.
Felix, S., & Wode, H. (Eds.). (1983). Language development at the crossroads. Tübingen, Germany: Gunter Narr.
Flege, J., & Davidian, R. (1984). Transfer and developmental processes in adult foreign language speech production. Applied Psycholinguistics, 5, 323–347.
Flege, J., Takagi, J., & Mann, V. (1995). Japanese adults can learn to produce English r and l accurately. Language Learning, 39, 23–32.
Flynn, S. (1996). A parameter-setting approach to second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 121–158). San Diego, CA: Academic Press.
Gass, S. (1987). The resolution of conflicts among competing systems: A bidirectional perspective. Applied Psycholinguistics, 8, 329–350.
Gentner, D., & Markman, A. (1997). Structure mapping in analogy and similarity. American Psychologist, 52, 45–56.
Gerver, D. (1974). The effects of noise on the performance of simultaneous interpreters: Accuracy of performance. Acta Psychologica, 38, 159–167.
Gibson, E., Pearlmutter, N., Canseco-Gonzalez, E., & Hickok, G. (1996). Recency preference in the human sentence processing mechanism. Cognition, 59, 23–59.
Goldberg, A. E. (1999). The emergence of the semantics of argument structure constructions. In B. MacWhinney (Ed.), The emergence of language (pp. 197–213). Mahwah, NJ: Erlbaum.
Green, D. M. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Grossberg, S. (1987). Competitive learning: From interactive activation to adaptive resonance. Cognitive Science, 11, 23–63.
Gupta, P., & MacWhinney, B. (1997). Vocabulary acquisition and verbal short-term memory: Computational and neural bases. Brain and Language, 59, 267–333.
Hakuta, K. (1981). Grammatical description versus configurational arrangement in language acquisition: The case of relative clauses in Japanese. Cognition, 9, 197–236.
Hancin-Bhatt, B. (1994). Segment transfer: A consequence of a dynamic system. Second Language Research, 10, 241–269.
Harrington, M. (1987). Processing transfer: Language-specific strategies as a source of interlanguage variation. Applied Psycholinguistics, 8, 351–378.
Hauser, M., Newport, E., & Aslin, R. (2001). Segmentation of the speech stream in a non-human primate: Statistical learning in cotton-top tamarins. Cognition, 78, B53–B64.
Hausser, R. (1999). Foundations of computational linguistics: Man-machine communication in natural language. Berlin: Springer.
Hofstadter, D. (1997). Fluid concepts and creative analogies: Computer models of the fundamental mechanisms of thought. London: Allen Lane.
Ijaz, H. (1986). Linguistic and cognitive determinants of lexical acquisition in a second language. Language Learning, 36, 401–451.
Kilborn, K. (1989). Sentence processing in a second language: The timing of transfer. Language and Speech, 32, 1–23.
Kilborn, K., & Cooreman, A. (1987). Sentence interpretation strategies in adult Dutch-English bilinguals. Applied Psycholinguistics, 8, 415–431.
Kilborn, K., & Ito, T. (1989). Sentence processing in Japanese-English and Dutch-English bilinguals. In B. MacWhinney & E. Bates (Eds.), The crosslinguistic study of sentence processing (pp. 257–291). New York: Cambridge University Press.
Krashen, S. (1994). The input hypothesis and its rivals. In N. C. Ellis (Ed.), Implicit and explicit learning of languages (pp. 45–78). San Diego, CA: Academic Press.
Langacker, R. (1989). Foundations of cognitive grammar. Vol. 2: Applications. Stanford, CA: Stanford University Press.
Li, P., Farkas, I., & MacWhinney, B. (2004). Early lexical development in a self-organizing neural network. Neural Networks, 17, 1345–1362.
Liu, H., Bates, E., & Li, P. (1992). Sentence interpretation in bilingual speakers of English
and Chinese. Applied Psycholinguistics, 13, 451–484.
Lively, S., Pisoni, D., & Logan, J. (1990). Some effects of training Japanese listeners to identify English /r/ and /l/. In Y. Tohkura (Ed.), Speech perception, production and linguistic structure (pp. 46–55). Tokyo: OHM.
MacDonald, M. C., & MacWhinney, B. (1990). Measuring inhibition and facilitation from pronouns. Journal of Memory and Language, 29, 469–492.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). Lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676–703.
MacWhinney, B. (1975). Pragmatic patterns in child syntax. Stanford Papers and Reports on Child Language Development, 10, 153–165.
MacWhinney, B. (1978). The acquisition of morphophonology. Monographs of the Society for Research in Child Development, 43(1).
MacWhinney, B. (1982). Basic syntactic processes. In S. Kuczaj (Ed.), Language acquisition: Vol. 1. Syntax and semantics (pp. 73–136). Hillsdale, NJ: Erlbaum.
MacWhinney, B. (1987a). The Competition Model. In B. MacWhinney (Ed.), Mechanisms of language acquisition (pp. 249–308). Hillsdale, NJ: Erlbaum.
MacWhinney, B. (1987b). Toward a psycholinguistically plausible parser. In S. Thomason (Ed.), Proceedings of the Eastern States Conference on Linguistics. Columbus: Ohio State University.
MacWhinney, B. (1989). Competition and lexical categorization. In R. Corrigan, F. Eckman, & M. Noonan (Eds.), Linguistic categorization (pp. 195–242). Philadelphia: Benjamins.
MacWhinney, B. (1999). The emergence of language from embodiment. In B. MacWhinney (Ed.), The emergence of language (pp. 213–256). Mahwah, NJ: Erlbaum.
MacWhinney, B. (2002). The gradual emergence of language. In T. Givon & B. F. Malle (Eds.), The evolution of language out of pre-language (pp. 231–263). Philadelphia: Benjamins.
MacWhinney, B., & Anderson, J. (1986). The acquisition of grammar. In I. Gopnik & M. Gopnik (Eds.), From models to modules (pp. 3–25). Norwood, NJ: Ablex.
MacWhinney, B., Feldman, H. M., Sacco, K., & Valdes-Perez, R. (2000). Online measures of basic language skills in children with early focal brain lesions. Brain and Language, 71, 400–431.
MacWhinney, B., & Pleh, C. (1988). The processing of restrictive relative clauses in Hungarian. Cognition, 29, 95–141.
McClelland, J. L., McNaughton, B. L., & O'Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102, 419–457.
McDonald, J. L. (1987a). Assigning linguistic roles: The influence of conflicting cues. Journal of Memory and Language, 26, 100–117.
McDonald, J. L. (1987b). Sentence interpretation in bilingual speakers of English and Dutch. Applied Psycholinguistics, 8, 379–414.
McDonald, J. L., & Heilenman, K. (1991). Determinants of cue strength in adult first and second language speakers of French. Applied Psycholinguistics, 12, 313–348.
McDonald, J. L., & MacWhinney, B. (1989). Maximum likelihood models for sentence processing research. In B. MacWhinney & E. Bates (Eds.), The crosslinguistic study of sentence processing (pp. 397–421). New York: Cambridge University Press.
McDonald, J. L., & MacWhinney, B. J. (1995). The time course of anaphor resolution: Effects of implicit verb causality and gender. Journal of Memory and Language, 34, 543–566.
Menn, L., & Stoel-Gammon, C. (1995). Phonological development. In P. Fletcher & B. MacWhinney (Eds.), The handbook of child language (pp. 335–360). Oxford, U.K.: Blackwell.
Miyake, A., Carpenter, P., & Just, M. (1994). A capacity approach to syntactic comprehension disorders: Making normal adults perform like aphasic patients. Cognitive Neuropsychology, 11, 671–717.
Moon, C., Cooper, R. P., & Fifer, W. P. (1993). Two-day infants prefer their native language. Infant Behavior and Development, 16, 495–500.
Newell, A. (1990). A unified theory of cognition. Cambridge, MA: Harvard University Press.
Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189–234.
Paget, R. (1930). Human speech. New York: Harcourt Brace.
Ramachandran, V. S., & Hubbard, E. M. (2001). Synaesthesia: A window into perception, thought and language. Journal of Consciousness Studies, 8, 3–34.
Seleskovitch, D. (1976). Interpretation: A psychological approach to translating. In R. W. Brislin (Ed.), Translation: Application and research (pp. 113–134). New York: Gardner.
Snow, C. E. (1999). Social perspectives on the emergence of language. In B. MacWhinney (Ed.), The emergence of language (pp. 257–276). Mahwah, NJ: Erlbaum.
Sokolov, J. L. (1988). Cue validity in Hebrew sentence comprehension. Journal of Child Language, 15, 129–156.
Sokolov, J. L. (1989). The development of role assignment in Hebrew. In B. MacWhinney & E. Bates (Eds.), The crosslinguistic study of sentence processing (pp. 158–184). New York: Cambridge University Press.
Spivey, M., & Marian, V. (1999). Cross talk between native and second language: Partial activation of an irrelevant lexicon. Psychological Science, 10, 281–284.
Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word learning tasks. Nature, 388, 381–382.
Stemberger, J., & MacWhinney, B. (1986). Frequency and the lexical storage of regularly inflected forms. Memory and Cognition, 14, 17–26.
Stockwell, R., Bowen, J., & Martin, J. (1965). The grammatical structures of English and Spanish. Chicago: University of Chicago Press.
Tomasello, M. (1992). First verbs: A case study of early grammatical development. Cambridge, U.K.: Cambridge University Press.
Tomasello, M. (2000). The item-based nature of children's early syntactic development. Trends in Cognitive Sciences, 4, 156–163.
Vygotsky, L. (1962). Thought and language. Cambridge, MA: MIT Press.
Werker, J. F. (1995). Exploring developmental changes in cross-language speech perception. In L. Gleitman & M. Liberman (Eds.), An invitation to cognitive science: Language, Volume 1 (pp. 87–106). Cambridge, MA: MIT Press.
Nuria Sebastian-Galles
Laura Bosch
4
Phonology and Bilingualism
different brain areas, some of them already functional at birth; others may take several months, perhaps years, to be fully developed (Posner, Rothbart, Farah, & Bruer, 2001).

In this scenario, the meaning of early or late exposure to a language may differ according to which brain areas are recruited in a particular process. We are just beginning to understand the neural substrate of most of our mind, and we are even more at the beginning of understanding how the brain develops.1 Our knowledge of how speech is processed in the brain is also not very profound. Nevertheless, at least for some particular domains, our knowledge about the functioning of the speech processing system is quite robust. Behavioral research done in the past years to understand the basic mechanisms of acquisition and processing of nonnative languages has provided insightful results, solid bases, and hypotheses to be tested.

In this chapter, we review different aspects of the processing of the sound systems in bilinguals. Although we cover a wide range of phonological domains, we do not address in an exhaustive way relatively well-known areas (like, for instance, the processing of nonnative phonemes) for which other reviews already exist. We also restrict our review to oral languages, and bilingualism in written and sign languages is not addressed (basically because almost nothing exists for the domain of phonological processing).

The organization of this chapter follows a developmental sequence in describing the different phonological subsystems. We start by what is known about newborns' speech processing capacities, and we subsequently review data of older infants, from their second semester of life up to the age when they have already begun to build a receptive vocabulary and can easily associate word labels to novel objects, as in the classical word-learning task experiments. This approach helps stress the importance we give to the developmental pattern.2

Becoming a Bilingual: The Discovery of the Existence of Different Sound Systems

One of the first prerequisites to become a bilingual is to be able to distinguish the existence of two different sound systems spoken in the environment. Data of monolingual newborn infants show that they can differentiate between some pairs of languages, but not between any language pair. For instance, newborns can distinguish between Spanish and English and between English and Japanese, but not between Dutch and English or between Spanish and Italian. That is, newborns can distinguish between languages that differ fundamentally with respect to their rhythmic or prosodic structure (Abercrombie, 1967), but not between languages that belong to the same rhythmic category (Mehler et al., 1988; Nazzi, Bertoncini, & Mehler, 1998). This ability is not specific to human beings because it can be traced back in other mammals, like tamarin monkeys (Ramus, Hauser, Miller, Morris, & Mehler, 2000) and rats (Toro, Trobalon, & Sebastian-Galles, 2003). However, independent of the phylogenetic origins of this capacity, the early acquaintance with the prosodic structure of the language of exposure has been considered the core of the prosodic bootstrapping hypothesis for the lexical and syntactic acquisition by different researchers (Guasti, Nespor, Christophe, & Van Ooyen, 2001).

It has been suggested that the prosody of speech may provide infants with crucial information to start acquiring the language. For instance, it has been hypothesized that some major syntactic parameters, such as the direction of recursivity, might be set at a prelexical stage based on the prosodic prominence of whole utterances to which the infant is exposed early in life (Christophe, Guasti, Nespor, Dupoux, & Van Ooyen, 1997). For infants in bilingual environments, prosody could also be initially helpful in setting the languages apart after a short period of exposure, possibly just a few months. Prosodic information could facilitate the discovery of two different language systems and perhaps help infants start the building of this information in two separate systems before they reach the lexical stage in their language development. Although no data are available on language discrimination capacities in newborns or very young infants (up to 4 months of age) growing up in bilingual environments, it is reasonable to say that, at least theoretically, newborns simultaneously exposed to languages belonging to different rhythmic categories should be able to tell apart both sound systems at an early age; newborns exposed to languages with more similar prosodic structures would face a rather different starting point, with perhaps a later differentiation.3

How long does it take these bilingual infants to separate both systems? Research in our laboratory with infants exposed from birth to Catalan and Spanish (two Romance languages belonging to the same rhythmic group; Ramus, Nespor, & Mehler, 1999) has shown that as early as 4.5 months of age,
bilingual infants can separate both languages. In that research, infants were familiarized to six different sentences in their maternal language (that is, the language predominantly spoken by their mother, Catalan or Spanish; thus, there were two different groups of bilingual infants in the study). After 2 minutes of familiarization, they were tested on their attention time to novel utterances for eight test trials, half in the same language of the familiarization phase and half with a switch in the language (always the same female speaker with a motherese style). Both groups of infants showed significant mean attention time differences to the switch and same trials, with longer listening time to trials with a switch in the language (Bosch & Sebastian-Galles, 2001b). Similar results were obtained with two parallel groups of infants from monolingual environments, thus indicating that these two Romance languages can be differentiated rather early in life independent of the level of exposure. The crucial point to be stressed here is the fact that simultaneous bilingual exposure was not creating any specific trouble in the process of language differentiation for this pair of languages, a case for which a later differentiation had been predicted (Mehler, Dupoux, Nazzi, & Dehaene-Lambertz, 1996).

Thus, the possibility of separating both languages, even if they are rhythmically very similar, would already be present in the first half of the first year of life, before any language-specific behavior has been observed (the first one is vowel perception; see the next section). Interestingly, other studies with monolingual infants have also shown that by 5 months of age they are already able to make some discriminations within rhythmic group for stress-based languages, provided that the languages in the test belong to the native rhythmic class.

For instance, Jusczyk and collaborators found that 5-month-old American monolingual infants can discriminate between British English and Dutch (both stress based) and between two English dialects (British English and American English); however, they cannot distinguish between Dutch and German, again two stress-based languages (Nazzi, Jusczyk, & Johnson, 2000). This result is interesting because it indicates, as suggested by the authors, that the ability to make within-category discriminations emerges as a consequence of infants' gradual learning of the specific rhythmic features of the native language, and this ability is related to how similar the languages are when compared to the native language or native dialect. This discrimination capacity thus seems to be tied to the specific characteristics of the native language representation, which gradually includes fine-grained details on the specific rhythmic and prosodic information of the language.

Parallel experiments have been done in our laboratory with syllable-based languages to assess further this perceptual refinement that seems to take place approximately from 2 to 5 months of age. Infants from monolingual Spanish and monolingual Catalan families have been analyzed in their capacities to discriminate between Catalan, Spanish, and Italian at 4.5 months of age (Bosch & Sebastian-Galles, 2000b). These monolingual infants can distinguish not only Spanish from Catalan (see previous discussion in this section), but also Italian from Catalan; however, they cannot distinguish between Spanish and Italian. These particular results are interpreted not in terms of prosodic differences (which seem to be minimal, as data from low-pass-filtered materials with adult participants have suggested; Bosch & Sebastian-Galles, 2002), but as a function of the specific frequency and distribution of vowels in the fluent speech of these three languages: Italian and Spanish would show a more similar distribution of vowel sounds than Catalan, a language in which central vowels /a/ and schwa count for more than half of the total number of vowels present in fluent speech.

These results emphasize the importance of distributional cues, segmental rather than prosodic, to help reach finer discriminations between the native language and a nonfamiliar one. This hypothesis is currently under exploration with the contrast between Spanish and Basque, a non-Indo-European language with a subject-object-verb (SOV) basic order. This pair of languages shows similar vowel repertoires and distributional properties of the segments; rhythmic and syntactic differences are notorious.4

Even though all these results have been obtained with monolingual infants, they are relevant in identifying the specific acoustic cues in the speech signal that are useful to discriminate languages and to start creating a native language representation that, through extended exposure, will become more and more refined. As for the bilinguals, the only studies in which a native language (one of them, the one spoken by the mother) has been contrasted with a nonfamiliar language have been developed in our laboratory.

Some evidence exists showing that 4.5- as well as 6-month-old bilingual infants do discriminate between their maternal language and English (syllable-based vs. stress-based languages), although
their pattern of results is in clear contrast to the one obtained with monolingual infants (Bosch & Sebastian-Galles, 1997, 2001a). Moreover, a language differentiation has also been observed between the maternal language and Italian, a within-category distinction not yet available for monolinguals at 4.5 months of age when Spanish and Italian are contrasted (Bosch & Sebastian-Galles, 1997).

Altogether, the results seem to suggest that, in bilingual exposure, attention to specific prosodic and distributional cues of syllabic or segmental units in the speech signal may help the infant reach an early differentiation between the languages, although the specific mechanisms involved may vary for different pairs of different languages, and possible delays in differentiation might be observed even for languages quite different in prosodic terms. Refinement in the specification of the types of cues available must wait until new data are gathered from bilingual language acquisition studies.

It has been proposed (Cutler & Mehler, 1993; Mehler et al., 1996; Mehler & Christophe, 2000) that learning a particular language belonging to a specific rhythmic group in infancy has lasting consequences in the way adults perceive speech. According to these authors, individuals exposed to stress languages (like English or Dutch) would perceive the speech signal in a different way than speakers of syllabic languages (like French or Spanish). Different studies have made use of this cross-linguistic difference to analyze how bilinguals perceive their languages. But, let us see first what it means to perceive a language in a syllable way.

It has been observed that, when performing a syllable detection task, native French listeners are faster when the target to detect coincides with the initial syllable of the carrier word. For example, when asked to detect the sequence /ba/ at the beginning of a French word like balance or balcon, French speakers are faster in the former case. Reaction times are reversed when the syllable to be detected is bal (Mehler, Dommergues, Frauenfelder, & Segui, 1981). In contrast, when English native listeners are asked to perform an equivalent task with stimuli like the English words balance and balcony, they do not show a syllable advantage effect; in fact, in several studies using this tech-

performing this task, both in English and in French (Cutler, Mehler, Norris, & Segui, 1989, 1992). The results showed a complex configuration of results; basically, two different patterns were observed: Either participants behaved like English monolinguals (no syllabic effect in French and in English) or they showed a syllabic effect for French, but not for English materials. The authors concluded that the results were more consistent with the notion that, even for simultaneous bilinguals, there is a dominance of one language over the other (in their sample, the dominant language was most often the language of the mother).5

The consequences of these different parsing strategies of the speech signal caused by rhythmic differences across languages have also been observed in other experimental situations. It seems that the ability to adapt to time-compressed speech is also mediated by the rhythmic properties of the maternal language. In a series of experiments (Altmann & Young, 1993; Dupoux & Green, 1997; Mehler et al., 1993; Pallier, Sebastian-Galles, Dupoux, Christophe, & Mehler, 1998; Sebastian-Galles, Dupoux, Costa, & Mehler, 2000), it was shown that listeners can adapt to highly time-compressed sentences rather rapidly. In these experiments, participants are presented highly compressed utterances in a particular language (or condition), and they are asked either to listen passively to them or to try to transcribe as much as they can. After the habituation phase, participants are presented with a set of target sentences (in their maternal language), and they are asked to write them down (there can also be a control situation in which participants are not presented with any habituation sentences, and they are just asked to try to write down the test sentences).

Sebastian-Galles et al. (2000) observed that transfer of adaptation occurred within rhythmic group languages, but not across rhythmic group languages; that is, listening to time-compressed Catalan, Italian, or Greek (three syllabic languages) helped Spanish natives better perceive the target Spanish time-compressed sentences than listening to time-compressed English or Japanese, languages not belonging to the syllabic group.6 Pallier et al. (1998) reported no transfer for monolingual English speakers from adapting to time-compressed French,
nique, English natives have failed to show any and they showed intermediate transfer (compared
syllabic trace (Bradley, Sanchez-Casas, & Garca with habituation to English) with Dutch sentences.
Albea, 1993; Cutler, Mehler, Norris, & Segui, In this last study, Pallier et al. (1998) also studied
1983, 1986). different groups of bilinguals. They studied two
Cutler and colleagues also studied highly groups of Catalan-Spanish and English-French
balanced French-English simultaneous bilinguals bilinguals (highly procient in both languages,
although they had learned the L2 in early childhood, not from birth). The results showed that, even in the case of highly proficient bilinguals, transfer across rhythmic groups did not occur. That is, Spanish-Catalan bilinguals showed benefits from exposure to time-compressed Catalan when tested with time-compressed Spanish (and vice-versa). But, French-English bilinguals did not benefit from habituation to French and then tested in English and vice-versa.

The Discovery of the Building Blocks (Phonemes)

Monolingual infant research has shown that, as early as 6 months of age, infants show maternal language-specific phoneme perception behavior. Several studies have addressed the issue of the developmental changes in the perception of vowel and consonant contrasts. The pioneering work by Werker and Tees (1984) showed a decline in infants' sensitivity to nonnative phonetic contrasts during the first year of life. Canadian infants from English-speaking families were tested with an operant head-turn procedure on three different contrasts, one native (ba-da), and two nonnative (the Hindi retroflex/dental stop contrast and the Salish velar/uvular contrast). Although younger infants were all able to discriminate these three contrasts, by 8 to 10 months only some of the infants showed a sensitivity for the nonnative contrasts, and by 10–12 months of age infants were no longer able to discriminate any of the two nonnative contrasts; sensitivity remained only for the native contrast. These results were also subsequently replicated in a longitudinal study with fewer subjects, but the tendency was confirmed. Moreover, infants exposed to Hindi and Salish showed sensitivity to these contrasts by 12 months of age.

Other studies have also shown the same decline in sensitivity for consonants by the end of the first year of life in what has been called perceptual reorganization processes that result from linguistic experience (Best, 1994; Werker & Lalonde, 1988). In a different study, Best and collaborators tested different groups of infants with the apical/lateral Zulu click contrast; in this case, discrimination could be observed in infants of all age groups (and in adults) for these sounds that do not occur in English (Best, McRoberts, & Sithole, 1988). Thus, linguistic experience was not the only factor to take into account to explain these perceptual changes in the discrimination of consonant sounds. The specific properties of the sounds to be contrasted with respect to the frequently experienced sounds in the language of exposure make discrimination easily available because they cannot be assimilated to the native sounds, and they are possibly treated as noises.

Similar perceptual reorganization processes have been identified for vowel sounds and at an earlier time in development. Work by Kuhl and collaborators showed language-specific effects on vowel perception by 6 months of age (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992). American and Swedish infants performed differently in a discrimination task involving either the American /i/ or the Swedish /y/ as the background stimulus contrasted with different variants of these two prototypical exemplars. Discrimination was reduced for exemplars around the prototypical vowel of the native language (/i/ for American infants and /y/ for Swedish infants), thus indicating that linguistic experience was already playing a role in the building of the first native language vowel categories. The native language magnet model fully accounts for these language-specific perceptual biases that reflect the role of language exposure (Kuhl, 1998, 2000). A different study has also addressed the issue of infants' vowel perception changes; in general terms, the results are congruent with the notion of an earlier decline for vowels (by 6–8 months of age) than for most of the consonants (Polka & Werker, 1994; but see Polka & Bohn, 1996, for a controversial result).

Support for these language-specific changes in vowel and consonantal perception has been obtained by electrophysiological studies. The study of electric brain responses to selected vowel stimuli offers evidence of age-related changes in mismatch negativity (MMN)7 amplitude measures that are interpreted as a consequence of the phonemic status certain sounds gradually acquire during the second semester of life (Cheour et al., 1998). Other research has addressed the analysis of the processing steps involved in the discrimination of phonemes (mainly consonants by means of syllable presentation), showing at least two processing stages corresponding to an increasingly refined analysis of the auditory input in the temporal lobes where phonetic changes are detected (Dehaene-Lambertz & Dehaene, 1994).

The study of perceptual reorganization processes and the development of language-specific sound categories in bilinguals have been addressed in our laboratory. For bilingual children, the
outcome of these perceptual reorganization processes should be compatible with the existence of two sound systems corresponding to the two languages in their environment. If the attunement to the phonetic contrasts in the native language takes place during the second semester of life, earlier for vowel sounds than for consonants, and if we agree that in many bilingual situations a general language differentiation can be reached before 6 months of age, then no great differences would be expected when exploring the sound discrimination capacities in monolingual and bilingual infants.

However, first results in our laboratory revealed a specific developmental pattern in the bilingual group not found in the monolingual population (Bosch & Sebastián-Gallés, 2003b).

Infants at 4 and 8 months old from monolingual (Spanish or Catalan) and bilingual (Spanish-Catalan) environments were tested with a familiarization-preference procedure on a vowel contrast present only in Catalan: /e/–/ɛ/ (/dedi/ vs. /dɛdi/). Not surprisingly, younger infants were all able to perceive this contrast, independent of the language of exposure, because at that young age they respond in an acoustic rather than a phonemic way. However, by 8 months, only infants from Catalan monolingual environments succeeded, and although a decline in sensitivity was expected for the monolingual Spanish group, the bilingual results were clearly unexpected because Catalan was one of the ambient languages. An additional group of bilingual infants, by now 12 months of age, was also tested to analyze the stability and time course of this particular pattern of response. Results from this additional experiment at 12 months of age indicated that bilinguals finally achieved discrimination, and their behavior was similar to that found in monolinguals 4 months younger (Bosch & Sebastián-Gallés, 2003b).

These results challenge the view that mere exposure is enough to maintain the capacity to perceive a contrast and suggest specific perceptual reorganization processes and a different time course in infants who have to cope with the building of two separate systems. It is possible to argue that the specific Catalan contrast tested was particularly problematic because it partially overlaps with the single midfront vowel /e/ of the Spanish system. Thus, it might be argued that discrimination cannot be easily reached in this case, as if the infants were judging the stimuli as tokens of the same word, in the same way as Spanish infants were responding to this experiment. (However, there was nothing in the test situation that might suggest to the infant that the stimuli belonged to one specific language, so an interpretation of the results based on infants having adopted a Spanish perceptual way of listening to the material cannot be a priori supported.)

A simpler interpretation can derive from a distributional account. Infants exposed to Catalan and Spanish hear an increased number of Spanish /e/ vowel tokens compared to the two Catalan mid-front vowels (Bosch & Sebastián-Gallés, 2003b), altogether forming a unimodal distribution that reduces discrimination, as suggested by the work of Maye, Werker, and Gerken (2002). Consequently, discrimination cannot be initially reached until some mechanism triggers differentiation; one possible candidate is the gains in lexical knowledge usually observed by the end of the first year of life. If this is correct, then similar processes should be observed for other vowel contrasts, especially those that are infrequent or that do not perfectly overlap when both systems are compared. More recent data in our laboratory exploring the discrimination capacities for other vowel contrasts suggest that this explanation may be right, with infants showing the same reduced discrimination pattern by 8 months and reaching the contrastive categories by 12 months of age (Bosch & Sebastián-Gallés, 2003a).

A similar trend has been observed in work by Werker and collaborators (Burns, Werker, & McVie, 2002). In a series of experiments, they investigated the developmental time course and nature of infant phonetic representations relative to the phonetic category boundaries for [b], [p], and [pʰ]. When comparing monolingual and bilingual infants, differences arose by 10–12 months of age, when monolinguals had already placed the category boundary in the appropriate location for their native language, and bilinguals still did not show a categorization compatible with either one or both of their native languages. However, later in time, that is, by 14–21 months of age, bilinguals showed evidence of either categorizing the stimuli as monolinguals in one of their two languages or finally having the boundaries that correspond to both of the languages of exposure (French and English). The parallelism between this study and our work with vowel discrimination in bilinguals lies in the fact that, at a certain age (8 months in the vowel study and 10–12 months in the consonant boundary study), infants from bilingual environments seemed to be still in the process of organizing (or reorganizing) their phonetic representations, and their discrimination capacities thus were, temporarily, delayed compared with monolinguals.
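
The distributional account described above can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the acoustic dimension is an arbitrary standardized scale, and the category means, spreads, and relative token frequencies (the constants below) are hypothetical values chosen for the example, not measurements from Bosch and Sebastián-Gallés or from Maye, Werker, and Gerken. The sketch models each vowel category as a Gaussian and counts the peaks of the pooled input distribution.

```python
import math

# Hypothetical values on an arbitrary, standardized acoustic dimension.
# None of these numbers come from the studies cited in the text.
# Each entry is (relative token frequency, mean, standard deviation).
CATALAN_CLOSE_E = (1.0, -1.5, 1.0)  # Catalan /e/
CATALAN_OPEN_E = (1.0, 1.5, 1.0)    # Catalan /ɛ/
SPANISH_E = (3.0, 0.0, 1.0)         # Spanish /e/: assumed intermediate and more frequent

CATALAN_ONLY_INPUT = [CATALAN_CLOSE_E, CATALAN_OPEN_E]
BILINGUAL_INPUT = [CATALAN_CLOSE_E, CATALAN_OPEN_E, SPANISH_E]


def gaussian(x, mean, sd):
    """Normal probability density at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(x, components):
    """Density of the pooled input: a frequency-weighted sum of categories."""
    total = sum(weight for weight, _, _ in components)
    return sum(w * gaussian(x, m, s) for w, m, s in components) / total


def count_modes(components, low=-6.0, high=6.0, steps=1200):
    """Count strict local maxima of the pooled density on an evenly spaced grid."""
    xs = [low + (high - low) * i / steps for i in range(steps + 1)]
    ys = [mixture_density(x, components) for x in xs]
    return sum(
        1 for i in range(1, steps) if ys[i] > ys[i - 1] and ys[i] > ys[i + 1]
    )


if __name__ == "__main__":
    print("Catalan-only input modes:", count_modes(CATALAN_ONLY_INPUT))
    print("Bilingual input modes:", count_modes(BILINGUAL_INPUT))
```

With these assumed values, the Catalan-only input has two peaks, one per category, whereas adding a frequent intermediate Spanish /e/ merges the pooled input into a single peak. Whether real infant input behaves this way depends on the actual acoustic overlap and token frequencies, which the sketch does not model.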
If tuning the phoneme system to the maternal language occurs very rapidly and with a reduced amount of exposure (it takes place within a few months), this does not seem to be the case for adults when learning new sound categories. Probably the most explored domain within phonological processing in bilinguals and L2 learners is that of the processing of nonnative phonemes. There are two different groups of questions that need to be addressed in this field. First, it must be explained why it is so difficult to perceive some foreign contrasts and why difficulties are not universal, but depend on the first language (L1) of the listener. These questions relate to the static aspects of the relationship between an already-existing phonetic system and the one to be acquired. Second, what is the impact of the age of exposure (and the amount of exposure) on the adult level of competence in the L2, and what are the consequences of the acquisition of a second sound system on the already-existing one? These last questions refer to the dynamics of the learning process.

Two different models have been developed to address these questions, the perceptual assimilation model of Best and Strange (Best, 1994, 1995; Best & Strange, 1992) and the speech learning model (SLM) of Flege (1992, 1995, 2003). Although the first model exclusively deals with the first set of problems, the second model stresses the importance of the dynamic factors. Basically, both models make similar predictions when referring to the particular relationships between L1 and L2 (static aspects), although from quite different theoretical perspectives. Summarizing their proposals, both models assume that the ease or difficulty with which two phonemes will be discriminated will depend on the similarities and differences between L1 and L2 phonemes.

In particular, Best's model proposes three types of perceptual assimilations: (a) the new L2 sound will be assimilated to an already existing (L1) category as a more or less good realization; (b) it will be perceived as a new sound; or (c) it can be perceived as a nonlinguistic sound (like when listeners of English or Spanish hear the Zulu clicks). According to this model, when two contrastive L2 sounds are assimilated to a single L1 category, but are equally deviant (or equally acceptable) from the L1 sound, they will be very hard to distinguish (as happens with the popular example of the difficulties of Japanese listeners in perceiving the English /r–l/ contrast).

The SLM adds to this level of explanation the dynamic aspects that address the issues of the impact of the age of acquisition of the L2 and the amount of use of the L1. One important characteristic of this model is that, for Flege, the capacity to acquire new phonemes remains intact all life long; therefore, at least theoretically, it should be possible, at any moment in life, to learn any new L2 sound. It would justify a whole chapter (and a book) to cover in a minimally comprehensive way the literature existing in this domain. Instead, we concentrate on an apparent paradox that exists when some behavioral and neurophysiological measures are compared.

Recent research, mostly using electrophysiological measures, has shown that the human auditory system keeps a high degree of plasticity to learn new speech sounds. One first group of studies refers to the comparison of event-related potential (ERP) signatures (in particular, MMN) to maternal and foreign languages. Different studies (Näätänen et al., 1997; Phillips et al., 1995; Rivera-Gaxiola, Csibra, Johnson, & Karmiloff-Smith, 2000; Sharma & Dorman, 2000; Winkler et al., 1999) have shown that the MMN for a deviant that crosses a phonemic boundary (which is categorized as a distinct linguistic unit) is larger than the MMN for a deviant that falls within a phonemic boundary (which is categorized as a different token of the same phoneme as the standard). That is, MMNs are larger for between-category contrasts and smaller for within-category ones. Unfortunately, these studies have never been done, to our knowledge, with either early (and highly proficient) bilinguals or simultaneous bilinguals, so it is unknown if, with extended exposure (and learning), nonnative contrasts can elicit MMN of the same amplitude as native ones.

Although the dynamics of extended exposure have never been addressed, several studies have analyzed the impact of short-term training. In these studies, monolingual individuals were trained with difficult L2 contrasts (nonexisting in their own language). ERP signatures were measured before and after training and after each training phase. Tremblay and coworkers (Tremblay, Kraus, Carrell, & McGee, 1997; Tremblay, Kraus, & McGee, 1998) observed that training resulted in a significant increase in the amplitude of the MMN (a characteristic that the studies reviewed in the preceding paragraph have shown to distinguish between native and foreign contrasts). Importantly, these changes were lateralized to the left, and they generalized to new, nontrained contrasts, indicating that they were not restricted just to the trained sounds.

Taken together, these data indicate that, as the SLM model postulates, the auditory system of
adults keeps a high degree of plasticity. Then, the question that remains is to understand why with natural stimuli and with very early and extensive exposure many bilinguals fail to acquire new phonetic categories. The feeling that most bilinguals and L2 learners have is that acquiring new phonetic categories is not an easy task; on the contrary, both everyday experience and behavioral research seem to point just in the opposite direction.

Several studies performed in our laboratory with highly skilled Spanish-Catalan bilinguals raised as monolinguals for the first 3–6 years of their lives, but who from this age on had been exposed to both languages intensively and who received all their education in a bilingual system, have shown that a large percentage of this population totally fails in learning a nonnative contrast. In particular, we have extensively tested the perception of the Catalan-specific contrast /e–ɛ/ by Spanish-dominant bilinguals (Spanish only has one /e/ vowel falling roughly between Catalan /e–ɛ/). Using a wide variety of tasks (categorical perception, gating, odd-ball discrimination, among others), both with synthesized and natural tokens, presented in isolation or within word contexts, Spanish-dominant bilinguals systematically failed to perceive the contrast (Bosch, Costa, & Sebastián-Gallés, 2000; Pallier, Bosch, & Sebastián, 1997; Pallier, Colomé, & Sebastián-Gallés, 2001; Sebastián-Gallés & Soto-Faraco, 1999).8 Results pointing in the same direction have been obtained in other studies (Mack, 1989) with French-English bilinguals and a consonantal contrast. That is, these results show that, for those difficult contrasts, as predicted by both Best's perceptual assimilation model and Flege's SLM model, the influence of the first exposure is not easily overcome.

One possible explanation could be that, once the window of opportunity is gone, only laboratory training in highly controlled situations may help in these circumstances (but see Takagi, 2002; Takagi trained monolingual Japanese listeners to identify English /r/ and /l/ in a quite intensive way and concluded that "truly native-like identification of /r/ and /l/ may never be achieved by adult Japanese learners of English," p. 2887).

In research in our laboratory, Echeverría (2002) indicated that this is not the case. Echeverría trained Spanish-dominant Spanish-Catalan bilinguals (who had acquired Catalan before the age of 4 years and who were highly proficient in both languages) both with synthesized and natural stimuli, using two different training techniques: the fading technique (Jamieson & Morosan, 1986) with synthesized materials, similar to the one used in the Tremblay et al. (1997, 1998) studies, and the high variability procedure developed by Lively and coworkers (Lively, Logan, & Pisoni, 1993; Lively, Pisoni, Yamada, Tohkura, & Yamada, 1994; Logan, Lively, & Pisoni, 1991) with natural tokens.

In both techniques, participants are asked to identify stimuli as belonging to one category or the other. One interesting characteristic of Echeverría's study is that, contrary to previous studies, the evaluation of the perceptual capacities of the bilinguals was not restricted to the stimuli or tasks used in the training phases. Indeed, bilingual perceptual competence was also assessed using several tasks: gating, identification, discrimination, and lexical decision. Tasks were classified either as directly related with the training (i.e., the task to be performed was an identification task, the same as the one used in the training) or as indirectly related with the training (i.e., discrimination, gating, and lexical decision).

The results showed that Spanish-dominant bilinguals increased their perceptual capacities in the posttest in a percentage similar to those previously observed with the same techniques and materials. However, the improvement was restricted to the tests directly related to the training. No trace of improvement was observed in the posttest for the tasks indirectly related with the training. Furthermore, the improvement faded away for almost all of the tasks when participants were retested 6 months later. This fact is particularly relevant because, contrary to most training studies (in particular, Lively et al., 1993, 1994; Logan et al., 1991; Tremblay et al., 1997), the individuals were trained in a contrast that was produced in their environment, and thus they had all the opportunities after training was completed to use the acquired contrast in their ordinary life.

Getting Stress: Some Suprasegmentals

It has been suggested that, in the same way that nonnatives may show great difficulties in perceiving contrasting phonemes not existing in the maternal language, this may apply to other speech dimensions. One aspect that has been particularly analyzed and for which data on bilinguals have been gathered is that of the perception of stress.

Languages differ not only in their phonemic inventory, but also in the presence or absence of
lexical stress. English and Spanish are languages that have contrastive stress; that is, pairs of stimuli just differing in the position of their stress (forbear–forebear in English and sábana–sabana in Spanish) can be found. On the contrary, French is a fixed-stress language in that all content words bear stress on the last syllable. Although the ability to perceive differences between pairs of syllables just differing in one phoneme seems to be present at birth (at least for a vast majority of the phonemes), the empirical evidence that newborns can discriminate stimuli on the basis of stress is not conclusive.

Newborns (French) have been shown to be sensitive to changes in stress when tested with the high-amplitude sucking paradigm on disyllabic phonetically unvaried words ([ˈmama] vs. [maˈma]), on trisyllabic consonant-varied words ([takala] vs. [takala]), and on two sets of disyllabic words varied in consonants but keeping the same vowel [a] in all syllables (Sansavini, Bertoncini, & Giovanelli, 1997). So, newborns' processing of stress does not seem to be affected by consonant variations, but no published work has reported the same ability for stimuli in which the vowel sounds are changed. On the contrary, it has been suggested that vowel variations might affect the processing of stress patterns, because of the perceptual saliency of vowels (Sansavini, 1994).

Although no conclusive data exist on newborns and very young infants, several studies have shown the preference for certain stress patterns and the usefulness of stress cues in word segmentation tasks in older infants. Monolingual data, mainly from English-learning infants, just indicated that by 9 months of age, but not earlier, they showed a trochaic bias in their listening preferences; that is, they preferred to listen to lists of words with a strong–weak pattern over lists consisting of words with an iambic or weak–strong pattern (Jusczyk, Friederici, Wessels, Svenkerud, & Jusczyk, 1993). Echols, Crowhurst, and Childers (1997) also replicated the trochaic preferences in 9-month-olds but not in younger infants, and they showed the easiness of segmenting trochaic sequences over iambic ones at this age.

Research by Morgan and Saffran (Morgan, 1994; Morgan & Saffran, 1995) also examined the role of rhythm in grouping syllabic units in the input and showed how a trochaic pattern seems to produce greater cohesiveness in the perception of disyllables at 8 and 9 months of age. In general, the failure to observe a preference for a specific rhythmic pattern in younger infants of 6–7 months of age has been interpreted as an indication that it may result from experience with the language because in English the disyllabic trochee is a frequent word type. Preference studies with infants learning languages with words that do not predominantly follow the trochee would shed light on this issue.

The question still remains whether stress discrimination follows a developmental pattern similar to the one found in the contrastive sound category formation, that is, with an initial period of a general sensitivity to perceive stress changes in any direction followed by a language-specific attunement that should determine perceptual differences between native speakers of languages with a fixed stress and speakers of languages with contrastive stress. Even less is known at present about the specific time course and nature of the stress discrimination abilities in the case of bilingual infants.

In contrast to this lack of conclusive evidence from the developmental studies, adult research has shown parallel results in the perception of stress with those observed in the domain of segment perception. Dupoux, Christophe, Sebastián-Gallés, and Mehler (1997) observed that Spanish and French natives showed contrasting results in performing an ABX task with nonwords that contrasted in the position of the stressed syllables. In this experiment, participants were presented with pairs of CVCVCV strings (uttered by Dutch natives, so they sounded foreign for all subjects), and then they were asked to decide if a third stimulus sounded like the first or the second one. Stimulus pairs differed either in the position of stress (/tamido/ vs. /tamido/) or in one phoneme (/tamido–pamido/). In the latter case, the contrast existed in both languages, and thus both populations were expected to perform equally well. However, if French participants had difficulties in perceiving the position of stress, they should perform worse than the Spanish ones. This is indeed what was found. French participants were significantly worse than Spanish ones, but just in the stress condition and not in the phoneme condition.

These data have been confirmed by Dupoux, Peperkamp, and Sebastián-Gallés (2001), who used a short-term memory task. In this task, participants were asked to memorize and repeat sequences of different length (from 2 to 6) composed of random alternations of members of minimal pairs. Pairs were disyllabic nonwords differing in either one phoneme or in the position of the stress (/tuku–tupu/ and /ˈmipa–miˈpa/). This task totally separated French and Spanish participants in that there was no overlap between the scores obtained by native
the aim of this study was to analyze the building of phonotactic knowledge in bilingual infants exposed to two languages with similar prosodic structure, but with a few differing phonotactic patterns.

It is reasonable to think that infants exposed to two languages during the first year of their lives have less exposure to each language than monolingual infants. Thus, it could be the case that bilingual infants are less sensitive to certain phonotactic patterns of their environmental languages because they have fewer chances to observe them. It is also plausible to hypothesize that sensitivity is developed toward the common sound patterns in both languages, even if the languages can be distinguished. On the other hand, the opposite tendency could be observed: Attending to differential sound patterns would become a strategy to exploit because this could help infants increase the perceptual distance between the languages of exposure. Consequently, two different outcomes were possible, one suggesting a delay in bilinguals and the other suggesting a similar developmental trend for bilinguals and monolinguals. Furthermore, as there were two groups of bilinguals in this study, classified according to the language more frequently spoken by their mother, a third possibility also existed, with bilinguals showing a language dominance already at this early age. In this case, only the Catalan-dominant group would behave similarly to the Catalan monolingual group of infants.

The results obtained fit better with this last possibility: Spanish monolingual infants showed no preference for either type of stimuli. Catalan monolingual and Catalan-dominant bilingual infants showed a similar pattern of preference for legal over illegal sequences, although Spanish-dominant bilingual infants showed an ambiguous pattern halfway between the Catalan and the Spanish monolingual groups. Thus, the results are more consistent with the notion of an early language dominance in infants raised in a bilingual environment, although complete support for this hypothesis would require further research in which Spanish-dominant bilinguals would also show a pattern of preference for some language-specific structure present only in Spanish.

In this same study, adult Spanish-Catalan bilinguals were also examined. The participants, although highly proficient bilinguals (of the same type as those described above) were not exposed from birth to both languages (Catalan-dominant bilinguals had only been exposed to Catalan during early childhood, in the same way as Spanish-dominant bilinguals had only been exposed to Spanish early in life). Using a subset of the same materials as those in the infant studies, two different tasks were employed: a phoneme-monitoring task and a modified gating task (in the version employed in the experiment, participants were given two alternatives, and they had to choose one of them). The rationale behind these experiments was that, if Spanish-dominant bilinguals had not properly acquired Catalan phonotactics, they would perform equally with both types of materials (both in monitoring the consonants and in doing the gating task). In contrast, Catalan-dominant bilinguals should take advantage of the restrictions imposed by the phonology of their maternal language in what can constitute a final word complex coda and perform better with legal than with illegal stimuli.

The results partially supported this hypothesis. Catalan-dominant bilinguals performed better: They were more accurate in rejecting consonants in illegal than in legal nonwords in the phoneme-monitoring task, and they identified earlier the legal stimuli in the gating task. But, Spanish-dominant bilinguals also showed this pattern of results. However, in both tasks the advantage of legal over illegal stimuli was larger for the Catalan-dominant than for the Spanish-dominant bilinguals, indicating that although both groups of participants had acquired the Catalan phonotactic constraints, those individuals who had been exposed to this language in the first years of their lives still showed an advantage over those who had been exposed to it in their early childhood.

Weber (2002) also addressed the question of the processing of phonotactic information, although in the context of lexical access. Weber (2000, 2002) studied the recognition of English words by native listeners and skilled nonnative listeners (German L1). It has been shown that the fact that some phoneme sequences never occur at the beginning of a word in a particular language is used as a reliable cue to signal a potential word onset; in particular, the fact that some phonemes never co-occur within a syllable would signal a potential word boundary between those phonemes.

For instance, McQueen (1998) observed that Dutch natives found it more difficult to detect the word rok ("skirt") when it appeared in the sequence /fi.drok/ than in the sequence /fim.rok/. Weber used analogous materials and tasks to compare English natives and German-English late bilinguals. She observed that English natives found it easier to spot lunch embedded in moyshlunch than in moycelunch because no English word starts
or ends in shl: Moyshlunch will activate no competitors overlapping with lunch, and it will be relatively easily detected. In contrast, many English words begin with sl; thus, moycelunch will activate many potential competitors, making recognition slower (as compared with moyshlunch). However, in German the sequence probabilities are reversed. No German words begin or end with sl, but many begin with shl-. German natives showed the opposite pattern from the English natives. One explanation would be that the native vocabulary is activated by the nonnative input. This leads us to the last group of studies addressing the question of the representation and processing of phonology in bilinguals.

Building-up One or Two Lexicons? The Representation of Words

It is surprising that, although there is extensive literature on morphosyntax (see De Houwer, chapter 2, this volume, for a review) and on the development of the productive lexicon in bilingual infants, data on the development of the receptive lexicon are very sparse. As we show next, the same situation appears to hold for studies of auditory lexical access: There is a large number of studies on visual lexical access in adult bilinguals, but the data on the auditory modality are scarce.

Jusczyk and Aslin (1995) showed that the ability to segment words from fluent speech was present by 7.5 months of age, and that this ability involved rather detailed phonetic knowledge because, when words were modified by just one feature in their initial consonant (cup → tup) they were no longer recognized. This ability was explored in bilingual infants from French/English environments (Polka, Sundara, & Blue, 2002). Initial results indicated that there is no delay in this ability when bilinguals are compared to monolinguals (infants in both groups were 7.5 months old), and that the bilinguals are able to segment disyllabic words in both of the languages of exposure, in this case, rhythmically different languages with different stress patterns in words (trochaic in English and iambic in French). If the tendency observed in a small sample is eventually confirmed, then it will be necessary to analyze further whether bilinguals are developing separate strategies for segmenting words in each of the languages of exposure (behaving as monolinguals in each of the languages) or whether they rely on particular types of segmentation cues that could be useful in both languages, thus relying on strategies rather different from the ones found in monolinguals.

What kind of information do infants use to recognize a word? Most models of adult word recognition (see discussion further in this section) consider that adults are highly sensitive to fine-grained phonetic detail (potential word candidates are penalized, in ways that differ depending on the model, when subphonemic mismatches between the speech input and the lexical representation occur). How fine is the phonetic information that infants use to recognize words? Considering the reduced size of infant vocabularies, it could be the case that no such fine-grained detail is needed (at least not until a certain vocabulary size is reached).

There is now an extended body of work, mainly developed by Werker and colleagues, that shows the continuity between the prelexical phonetic categories developed during the second semester of life and the use of these categories in the functional representations of words once the child has started to develop a receptive lexicon. Also, spoken word recognition studies with 18- to 23-month-olds suggested that children's representations of familiar words are phonetically well specified (Swingley & Aslin, 2002). However, some controversial results were obtained (Stager & Werker, 1997) that showed an inefficient use of language-specific sensitivities when mapping sound onto meaning. That is, infants of 14 months of age failed to map two novel words forming a minimal pair (/bih–dih/) to two different objects, although they were able to perceive the contrast when the words were not linked to objects. Then, in a series of follow-up experiments, it was shown that this inability was no longer present at 17 and 20 months of age. A possible explanation for this temporary deficit was that it was the consequence of the computational demands of the word-learning situation. It was further suggested that more expert word learners (either older children or children with a bigger productive vocabulary size) should be better able to pick up fine phonetic detail, and this hypothesis was confirmed (Werker & Fennell, 2004).

Werker and colleagues also explored the behavior of bilingual children in this word-learning task. According to their hypothesis of a resource limitation that interferes with the ability to learn minimal pairs, the bilingual population seems suitable to explore this hypothesis further, especially if considering the increasing demands of having to acquire
two lexical systems simultaneously. Results indicated that they not only showed a similar difficulty by 14 months of age, but also were still experiencing the same difficulty by 17 months of age, when most monolinguals were already able to learn the minimally different words correctly (Fennell & Werker, 2000). So, despite having to deal with two different language systems, the bilingual children did not use more phonetic detail in learning phonetically similar object labels. Access to phonological detail seems to be reached later in bilinguals than in monolinguals, at least in this word-learning task. Further research with a different methodological approach is required before firmly establishing this position.

As stated, the data on auditory lexical access in adults are not very extensive. Although different models for (monolingual) auditory word recognition have been postulated (Gaskell & Marslen-Wilson, 1997; Luce & Pisoni, 1998; McClelland & Elman, 1986; Norris, 1994), to our knowledge, there is no operating model for bilingual auditory word recognition. Until very recently, the only existing works were the pioneering studies of François Grosjean that resulted in the proposal of the bilingual interactive model of lexical access (Grosjean, 1988; Lewy & Grosjean, 1996, 1999). This model is based on the architecture proposed by McClelland and Elman in the TRACE model. It assumes the existence of two separate language networks, independent but interconnected at the same time. Although developed to account for bilingual lexical access in general, the case for both assumptions rests on arguments about language selection and code-switching situations. For Grosjean and Lewy, both networks needed to be independent because bilinguals are able to speak just one language without any effort. Also, networks need to be interconnected because bilinguals can easily switch from one language to the other. Although the model makes interesting assumptions about the way phonology is represented (it assumes a single feature level, but separate phoneme levels, duplicating common phonemes), we do not review it here because its central goals fall outside the scope of this chapter (see Thomas & Van Heuven, chapter 10, this volume, for further details).

Different studies have addressed questions that tackle in a more specific way the relationship between phonology and lexical representation. Pallier et al. (2001) used a repetition priming technique to study the representation of minimal pairs in L2 in Spanish-Catalan bilinguals. As noted above, most Spanish-dominant Spanish-Catalan bilinguals have difficulties in perceiving some Catalan-specific contrasts. The question Pallier et al. addressed was whether Catalan minimal pairs like /per[e]–per[e]/ (meaning "pear" and "Peter") could be represented as homophones.

Alternatively, it could be the case that lexical entries would also represent the acoustic information present in the signal, as has been postulated by episodic models (Goldinger, 1996) and by the work of Tremblay et al. (1998) mentioned above, which suggests that no conscious awareness would be needed to perceive nonnative contrasts. In this way, /per[e]–per[e]/ could constitute separate phonological lexical entries even for those Spanish-dominant Spanish-Catalan bilinguals. If Spanish-dominant bilinguals had shared lexical phonological entries for the words of Catalan minimal pairs like /per[e]–per[e]/, in a repetition priming paradigm they should show equivalent repetition savings when one member of the pair was preceded by itself or by the other member of the pair; that is, the amount of facilitation of /per[e]/ preceded by /per[e]/ or by /per[e]/ should be the same. But, if they had separate entries, there should be no repetition savings when the other member of the pair was presented before; this pattern of results should not differ from that of Catalan-dominant bilinguals, who do perceive the contrast. The results supported the shared-entries prediction. Spanish-dominant bilinguals showed equivalent repetition effects with exact repetitions and with minimal pairs, but Catalan-dominant bilinguals only showed repetition effects when the same word was presented twice.

Another relevant line of research is that initiated by Spivey and Marian (1999), who analyzed auditory lexical access in Russian-English (late) bilinguals. In this study, participants were simultaneously presented with four objects; following auditory instructions, they were asked to pick up one of them (the target) and place it at a specific location while eye movements were recorded. Critically, in one condition, one of the distractor objects had a translation that shared initial phonetic features with the target word (the interlingual distractor); for instance, if participants were instructed in Russian to pick up the stamp (marku in Russian), one of the distractors was a marker. The results showed that bilinguals looked briefly at the interlingual distractor object more often than at a control object bearing no phonetic relationship to either the target word or its translation. These results indicated that, in spite of a totally monolingual mode, bilinguals activate both auditory lexicons.
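
The logic of the interlingual-distractor condition in the Spivey and Marian (1999) design can be illustrated with a minimal sketch. It is not the authors' procedure or model: it assumes that letters can stand in for phonetic segments, that a candidate stays active as long as its name in either language is compatible with the input heard so far, and that the display contains the hypothetical filler objects listed below (only the target marku and the distractor marker come from the study as described here; the other names are placeholders).

```python
# Illustrative sketch only: letters stand in for phonetic segments, and every
# object name except "marku" and "marker" is a hypothetical placeholder.

def candidates(heard_so_far, display):
    """Return the objects with a name, in either language, that begins
    with the portion of the spoken word heard so far."""
    return [
        obj
        for obj, names in display.items()
        if any(name.startswith(heard_so_far) for name in names.values())
    ]


# Hypothetical four-object display: a target, an interlingual distractor,
# and two unrelated control objects.
display = {
    "stamp": {"ru": "marku", "en": "stamp"},
    "marker": {"ru": "flomaster", "en": "marker"},
    "ruler": {"ru": "lineyku", "en": "ruler"},
    "key": {"ru": "klyuch", "en": "key"},
}

# Track which objects remain compatible as the Russian target word unfolds.
for heard in ("m", "mar", "mark", "marku"):
    print(heard, candidates(heard, display))
```

Under these assumptions the English name of the marker remains compatible with the Russian input until the final segment of marku, which is the window in which looks to the interlingual distractor would be expected if both lexicons are active.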
processes. That is, as in any other cognitive domain, less-skilled bilinguals have to devote more effort to process their L2 than more skilled bilinguals. Research with brain imaging techniques in other cognitive domains (for instance, the work of Raichle et al., 1994, with a verbal task) has shown that when a new task becomes automatic, it is not just performed more rapidly in the same brain structures, but that a real transfer to other brain areas occurs; thus, important reorganization takes place.

A last possibility is that proficiency and age of acquisition may be affecting different processes involved in language processing. It may be that more core linguistic knowledge, like different aspects of phonology and grammar, is more sensitive to age of acquisition differences. For instance, Wartenburger and colleagues (2003), studying grammatical processing of Italian-German bilinguals, found that it was age of acquisition and not proficiency that determined how L1 and L2 were represented in the brain.

Brain imaging techniques undoubtedly enable new and interesting ways to investigate language acquisition and processing, but careful attention needs to be paid to the kinds of linguistic knowledge that are under study at any given moment.

Finally, the importance of the early exposure followed by an interruption of this exposure has been addressed in two studies. In the first one, Pallier et al. (2003) studied forgetting the maternal language in a group of adult Koreans who were adopted by French families in childhood and who never again were exposed to Korean. The data of this study showed that there were no remaining signs of Korean in either behavioral tests (language identification and word recognition) or imaging data (listening to sentences in four different languages while performing a fragment detection task). The only observed difference between French natives and the adopted Koreans was obtained in the imaging task, when the activation patterns to French sentences in both populations were compared: The activation covered a larger brain area in the native French subjects than in the adopted ones. The authors interpreted this difference as indicating that, although an L1 can be forgotten and an L2 thus replace the first, this replacement is not complete.

The other study was reported by Au, Knightly, Jun, and Oh (2002), who analyzed the effects of exposure to an L2 during early childhood (and subsequently not again for a long period of time) on (re)learning this L2. In this study, two different linguistic domains were explored: phonology and morphosyntax. Savings of early exposure were observed, but only for phonological aspects of language processing and not for morphosyntax. The particular interest of this study is that it indicated that the language system cannot be considered as a whole, and that the impact of early (and continued) exposure on its different subsystems, because of the specific brain areas involved, may vary.

To summarize, in this chapter we reviewed various pieces of work that give evidence of the adaptation of the human brain to handle two sound systems. The impact of this bilingual exposure has been shown at different levels of phonological processing. However, the precise nature of the way the bilingual brain deals with this exposure needs further research to be understood fully.

Acknowledgments

Preparation of this chapter was facilitated by a grant from the James S. McDonnell Foundation (Bridging Brain, Mind, and Behavior Program), by grant BSO2001-3492-C04-01 from the Spanish Ministerio de Ciencia y Tecnología, and by a grant from the Catalan Government (2001SGR00034).

Notes

1. There is an ongoing debate about the existence of critical periods in L2 learning (see, for instance, Birdsong, chapter 6, this volume, and DeKeyser & Larson-Hall, chapter 5, this volume). The interesting research from the domain of biology under development in this area (see Knudsen, 2003) should not be neglected.
2. Before starting, let us say a few words about what will be understood as a bilingual and the different subtypes. We do not focus our presentation on L2 learners with a relatively poor competence in their L2. Ideally, we would have liked just to consider individuals extremely proficient in both languages, that is, individuals who use both languages in an equivalent way in all aspects of their lives. Then, within this group, we would distinguish between simultaneous and successive bilinguals. However, the data in some aspects of phonology are quite scarce, and less-proficient bilinguals will be considered when necessary.
3. To our knowledge, the precise consequences of an early versus late differentiation have not been thoroughly analyzed, and the implications in terms of the unfolding of the first steps in the language acquisition processes remain unclear.
4. Preliminary data indicated that discrimination cannot be reached at either 4 months or 6 months of age, using different experimental procedures. If these data are confirmed, the hypothetical use of the language discrimination capacities in newborns and infants in helping to establish some syntactic knowledge should be revised.
5. The authors proposed an explanation in terms of markedness. They suggested that syllable segmentation is a marked speech-processing routine, and that it would develop only when the language which encourages use of the marked routine dominates the language which encourages use of the unmarked routine (Cutler et al., 1989, p. 160).
6. French, a syllabic language, gave an intermediate pattern of results, falling between English and the other syllabic languages.
7. The MMN is an event-related potential (ERP) component that is elicited when a different stimulus (the deviant) is presented in the context of a repeated series of particular stimuli (the standard). One particularly interesting property of this measure for the research in L2 phonological processes is that it does not require consciousness.
8. This is not to say that no Spanish-dominant bilingual perceived the contrast, but that as a group they did not. Depending on the task, between 10% and 25% of Spanish-dominant bilinguals fell within the range of the Catalan-dominant group.

References

Abercrombie, D. (1967). Elements of general phonetics. Edinburgh, U.K.: Edinburgh University Press.
Altmann, G. T. M., & Young, D. (1993, September). Factors affecting adaptation to time-compressed speech. Paper presented at the Eurospeech '93, Berlin, Germany.
Au, T. K., Knightly, L. M., Jun, S., & Oh, J. S. (2002). Overhearing a language during childhood. Psychological Science, 13, 238–243.
Best, C. T. (1994). The emergence of native-language phonological influence in infants: A perceptual assimilation model. In J. C. Goodman & H. C. Nusbaum (Eds.), The development of speech perception: The transition from speech sounds to spoken words (pp. 167–224). Cambridge, MA: MIT Press.
Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech perception and linguistic experience (pp. 171–206). Baltimore: York Press.
Best, C. T., McRoberts, G. W., & Sithole, N. N. (1988). The phonological basis of perceptual loss for non-native contrasts: maintenance of discrimination among Zulu clicks by English-speaking adults and infants. Journal of Experimental Psychology: Human Perception and Performance, 14, 345–360.
Best, C. T., & Strange, W. (1992). Effects of language-specific phonological and phonetic factors on cross-language perception of approximants. Journal of Phonetics, 20, 305–330.
Bosch, L., Costa, A., & Sebastián-Gallés, N. (2000). First and second language vowel perception in early bilinguals. European Journal of Cognitive Psychology, 12, 189–222.
Bosch, L., & Sebastián-Gallés, N. (1997). Native-language recognition abilities in four-month-old infants from monolingual and bilingual environments. Cognition, 65, 33–69.
Bosch, L., & Sebastián-Gallés, N. (2000a, July). Coping with two languages: Early differentiation and beyond. Paper presented at the International Workshop on Speech Perception Development in Early Infancy: Behavioural, Neural-Modelling and Brain Imaging Data, Barcelona, Spain.
Bosch, L., & Sebastián-Gallés, N. (2000b, July 19–20). Exploring 4-month-old infants' abilities to discriminate languages from the same rhythmic class. Paper presented at the International Conference on Infant Studies, Brighton, U.K.
Bosch, L., & Sebastián-Gallés, N. (2001a). Early language differentiation in bilingual infants. In J. Cenoz & F. Genesee (Eds.), Trends in bilingual acquisition (pp. 71–93). Amsterdam: Benjamins.
Bosch, L., & Sebastián-Gallés, N. (2001b). Evidence of early language discrimination abilities in infants from bilingual environments. Infancy, 2, 29–49.
Bosch, L., & Sebastián-Gallés, N. (2003a, April 30–May 3). Developmental changes in the discrimination of vowel contrasts in bilingual infants. Paper presented at the Fourth International Symposium on Bilingualism, Arizona State University, Tempe.
Bosch, L., & Sebastián-Gallés, N. (2003b). Simultaneous bilingualism and the perception of a language-specific vowel contrast in the first year of life. Language and Speech, 46, 217–243.
Bradley, D. C., Sánchez-Casas, R., & García-Albea, J. E. (1993). The status of the syllable in the perception of Spanish and English. Language and Cognitive Processes, 8, 197–233.
Burns, T. C., Werker, J. F., & McVie, K. (2002, November). Development of phonetic categories in infants raised in bilingual and monolingual environments. Paper presented at the Boston University Conference on Language Development, Boston, MA.
Cheour, M., Ceponiene, R., Lehtokoski, A., Luuk, A., Allik, J., Alho, K., et al. (1998). Development of language-specific phoneme representations in the infant brain. Nature Neuroscience, 1, 351–353.
Christophe, A., Guasti, M. T., Nespor, M., Dupoux, E., & Van Ooyen, B. (1997). Reflections on phonological bootstrapping: its role for lexical and syntactic acquisition. Language and Cognitive Processes, 12, 585–612.
Cutler, A., & Mehler, J. (1993). The periodicity bias. Journal of Phonetics, 21, 103–108.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1983). A language-specific comprehension strategy. Nature, 304, 159–160.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1986). The syllable's differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385–400.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1989). Limits on bilingualism. Nature, 340, 229–230.
Cutler, A., Mehler, J., Norris, D. G., & Segui, J. (1992). The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology, 24, 381–410.
Dehaene-Lambertz, G., & Dehaene, S. (1994). Speed and cerebral correlates of syllable discrimination in infants. Nature, 370, 292–295.
Dupoux, E., Christophe, A., Sebastián-Gallés, N., & Mehler, J. (1997). A distressing deafness in French. Journal of Memory and Language, 36, 406–421.
Dupoux, E., & Green, K. (1997). Perceptual adjustment to highly compressed speech: Effects of talker and rate changes. Journal of Experimental Psychology: Human Perception and Performance, 23, 914–927.
Dupoux, E., Peperkamp, S., & Sebastián-Gallés, N. (2001). A robust method to study stress deafness. Journal of the Acoustical Society of America, 110, 1606–1618.
Echeverría, S. (2002). El aprendizaje de contrastes fonéticos no nativos: Límites y reversibilidad. Unpublished doctoral thesis, Universitat de Barcelona, Spain.
Echols, C. H., Crowhurst, M. J., & Childers, J. B. (1997). The perception of rhythmic units in speech by infants and adults. Journal of Memory and Language, 36, 202–225.
Fennell, C. T., & Werker, J. F. (2000, July). Does bilingual exposure affect infants' use of phonetic detail in a word learning task? Poster presented at the ICIS 2000 meeting, Brighton, U.K.
Flege, J. E. (1992). Speech learning in a second language. In C. A. Ferguson, L. Menn, & C. Stoel-Gammon (Eds.), Phonological development: Models, research and implications (pp. 565–604). Timonium, MD: York.
Flege, J. E. (1995). Second language speech learning: Theory, findings and problems. In W. Strange (Ed.), Speech perception and linguistic experience (pp. 233–272). Baltimore: York.
Flege, J. E. (2003). Assessing constraints on second-language segmental production and perception. In N. Schiller & A. Meyer (Eds.), Phonetics and phonology in language comprehension and production (pp. 319–355). Berlin, Germany: Mouton de Gruyter.
Friederici, A. D., & Wessels, J. M. I. (1993). Phonotactic knowledge and its use in infant speech perception. Perception & Psychophysics, 54, 287–295.
Gaskell, M. G., & Marslen-Wilson, W. (1997). Integrating form and meaning: A distributed model of speech perception. Language and Cognitive Processes, 12, 613–656.
Goldinger, S. D. (1996). Words and voices: Episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1166–1183.
Golestani, N., Paus, T., & Zatorre, R. J. (2002). Anatomical correlates of learning novel speech sounds. Neuron, 35, 997–1010.
Grosjean, F. (1988). Exploring the recognition of guest words in bilingual speech. Language and Cognitive Processes, 3, 233–274.
Guasti, M. T., Nespor, M., Christophe, A., & Van Ooyen, B. (2001). Pre-lexical setting of the head-complement parameter through prosody. In J. Weissenborn & B. Hoehle (Eds.), How to get into language: Approaches to bootstrapping early language development (pp. 231–248). New York: Benjamin.
Jamieson, D. G., & Morosan, D. E. (1986). Training non-native speech contrasts in adults: Acquisition of English /θ/–/ð/ contrast by francophones. Perception and Psychophysics, 40, 205–215.
Jusczyk, P. W., & Aslin, R. N. (1995). Infants' detection of sound patterns of words in fluent speech. Cognitive Psychology, 29, 1–23.
Jusczyk, P. W., Friederici, A. D., Wessels, J., Svenkerud, V. Y., & Jusczyk, A. M. (1993). Infants' sensitivity to the sound patterns of native language words. Journal of Memory and Language, 32, 402–420.
Jusczyk, P. W., Hohne, E. A., & Bauman, A. (1999). Infants' sensitivity to allophonic cues for word segmentation. Perception and Psychophysics, 61, 1465–1476.
Jusczyk, P. W., Luce, P. A., & Charles-Luce, J. (1994). Infants' sensitivity to phonotactic
rhythm. Journal of Experimental Psychology: Human Perception and Performance, 24, 756–766.
Nazzi, T., Jusczyk, P. W., & Johnson, E. K. (2000). Language discrimination by English-learning 5-month-olds: Effects of rhythm and familiarity. Journal of Memory and Language, 43, 1–19.
Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189–234.
Pallier, C., Bosch, L., & Sebastián-Gallés, N. (1997). A limit on behavioral plasticity in vowel acquisition. Cognition, 64, B9–B17.
Pallier, C., Colomé, À., & Sebastián-Gallés, N. (2001). The influence of native-language phonology on lexical access: Exemplar-based vs. abstract lexical entries. Psychological Science, 12, 445–449.
Pallier, C., Dehaene, S., Poline, J.-B., LeBihan, D., Argenti, A.-M., Dupoux, E., et al. (2003). Brain imaging of language plasticity in adopted adults: Can a second language replace the first? Cerebral Cortex, 13, 155–161.
Pallier, C., Sebastián-Gallés, N., Dupoux, E., Christophe, A., & Mehler, J. (1998). Perceptual adjustment to time-compressed speech: A cross-linguistic study. Memory and Cognition, 26, 844–851.
Peperkamp, S., & Dupoux, E. (2002). A typological study of stress deafness. In C. Gussenhoven & N. Warner (Eds.), Papers in laboratory phonology 7 (pp. 203–240). Berlin, Germany: Mouton de Gruyter.
Peperkamp, S., Dupoux, E., & Sebastián-Gallés, N. (2002, October). Stress deafness in early and late French-Spanish bilinguals. Paper presented at the Structure of Learner Language, Kolymbari, Greece.
Perani, D., Dehaene, S., Grassi, F., Cohen, L., Cappa, S., Dupoux, E., et al. (1996). Brain processing of native and foreign languages. NeuroReport, 7, 2439–2444.
Perani, D., Paulesu, E., Sebastián-Gallés, N., Dupoux, E., Dehaene, S., Bettinardi, V., et al. (1998). The bilingual brain: Proficiency and age of acquisition of the second language. Brain, 121, 1841–1852.
Phillips, C., Marantz, A., McGinnis, M., Pesetsky, D., Wexler, K., Yellin, A., et al. (1995). Brain mechanisms of speech perception: A preliminary report. MIT Working Papers in Linguistics, 26, 125–163.
Polka, L., & Bohn, O. S. (1996). A cross-language comparison of vowel perception in English-learning and German-learning infants. Journal of the Acoustical Society of America, 100, 577–592.
Polka, L., Sundara, M., & Blue, S. (2002, June). The impact of language experience on word recognition. Paper presented at the 143rd meeting of the Acoustical Society of America, Pittsburgh, PA.
Polka, L., & Werker, J. F. (1994). Developmental changes in perception of non-native vowel contrasts. Journal of Experimental Psychology: Human Perception and Performance, 20, 421–435.
Posner, M., Rothbart, M., Farah, M., & Bruer, J. (Eds.). (2001). The developing human brain [Special issue]. Developmental Science, 4(3).
Raichle, M. E., Fiez, J. A., Videen, T. O., MacLeod, A. M. K., Pardo, J. V., Fox, P. T., et al. (1994). Practice-related changes in human brain functional anatomy during nonmotor learning. Cerebral Cortex, 4, 8–26.
Ramus, F., Hauser, M. D., Miller, C., Morris, D., & Mehler, J. (2000). Language discrimination by human newborns and by cotton-top tamarin monkeys. Science, 288, 349–351.
Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73, 265–292.
Rivera-Gaxiola, M., Csibra, G., Johnson, M., & Karmiloff-Smith, A. (2000). Electrophysiological correlates of cross-linguistic speech perception in native English speakers. Behavioural Brain Research, 111, 13–23.
Sansavini, A. (1994). Percezione della prosodia del linguaggio nei primi giorni di vita [Perception of the prosody in the first days of life]. Unpublished doctoral thesis, Università di Bologna, Italy.
Sansavini, A., Bertoncini, J., & Giovanelli, G. (1997). Newborns discriminate the rhythm of multisyllabic stressed words. Developmental Psychology, 33, 3–11.
Sebastián-Gallés, N., & Bosch, L. (2002). The building of phonotactic knowledge in bilinguals: The role of early exposure. Journal of Experimental Psychology: Human Perception and Performance, 28, 974–989.
Sebastián-Gallés, N., Dupoux, E., Costa, A., & Mehler, J. (2000). Adaptation to time-compressed speech: Phonological determinants. Perception and Psychophysics, 62, 834–842.
Sebastián-Gallés, N., & Soto-Faraco, S. (1999). On-line processing of native and non-native phonemic contrasts in early bilinguals. Cognition, 72, 112–123.
Sharma, A., & Dorman, M. F. (2000). Neurophysiologic correlates of cross-language phonetic perception. Journal of the Acoustical Society of America, 107, 2697–2703.
Skehan, P. (1989). Individual differences in second-language learning. London: Arnold.
Spivey, M. J., & Marian, V. (1999). Cross talk between native and second languages: Partial activation of an irrelevant lexicon. Psychological Science, 10, 281–284.
Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 388, 381–382.
Swingley, D., & Aslin, R. N. (2002). Lexical neighborhoods and the word-form representations of 14-month-olds. Psychological Science, 13, 480–484.
Takagi, N. (2002). The limits of training Japanese listeners to identify English /r/ and /l/: Eight case studies. Journal of the Acoustical Society of America, 111, 2887–2896.
Toro, J. M., Trobalón, J. B., & Sebastián-Gallés, N. (2003). The use of prosodic cues in language discrimination tasks by rats. Animal Cognition, 6, 131–136.
Tremblay, K., Kraus, N., Carrell, T., & McGee, T. (1997). Central auditory system plasticity: Generalization to novel stimuli following listening training. Journal of the Acoustical Society of America, 6, 3762–3773.
Tremblay, K., Kraus, N., & McGee, T. (1998). The time course of auditory perceptual learning: neurophysiological changes during speech-sound training. NeuroReport, 9, 3557–3560.
Wartenburger, I., Heekeren, H. R., Abutalebi, J., Cappa, S. F., Villringer, A., & Perani, D. (2003). Early setting of grammatical processing in the bilingual brain. Neuron, 37, 159–170.
Weber, A. (2000, May). The role of phonotactics in the segmentation of native and non-native continuous speech. Paper presented at the Workshop on Spoken Access Processes, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
Weber, A. (2002). Language-specific listening: The case of phonetic sequences. Unpublished doctoral thesis, Katholieke Universiteit Nijmegen, The Netherlands.
Weber, A., & Cutler, A. (2004). Lexical competition in non-native spoken-word recognition. Journal of Memory and Language, 50, 1–25.
Werker, J. F., & Fennell, C. T. (2004). Listening to sounds versus listening to words: Early steps in word learning. In D. G. Hall & S. Waxman (Eds.), Weaving a lexicon (pp. 79–111). Cambridge, MA: MIT Press.
Werker, J. F., & Lalonde, C. E. (1988). Cross-language speech perception: Initial capabilities and developmental change. Developmental Psychology, 24, 672–683.
Werker, J. F., & Tees, R. C. (1984). Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 7, 49–63.
Winkler, I., Lehtokoski, A., Alku, P., Vainio, M., Czugler, I., Csepe, V., et al. (1999). Pre-attentive detection of vowel contrasts utilizes both phonetic and auditory memory representations. Cognitive Brain Research, 7, 357–369.
Robert DeKeyser
Jenifer Larson-Hall
5
What Does the Critical Period Really Mean?
ABSTRACT A large amount of empirical evidence shows that age of acquisition is strongly negatively correlated with ultimate second language proficiency for grammar as well as for pronunciation. It is even doubtful that any evidence exists at this point of any person having learned a second language perfectly in adulthood. Some researchers have rightly pointed out that correlation is not causation, and that the age effect may be caused by confounded variables such as quantity and quality of input, amount of practice, level of motivation, and other social variables. Many studies, however, have shown that these variables play a very limited role when the effect of age of acquisition is removed statistically, but age of acquisition keeps playing a large role when the social and environmental variables are removed. Other researchers have objected to a critical period interpretation of such age effects because these do not show the discontinuities that would be expected under the critical period hypothesis. We argue here, however, that quite a few studies have documented discontinuities, and that their absence in some studies may be because of a variety of confounding variables and other methodological problems. Assuming there is indeed a maturational decline in second language learning capacity during childhood, then there is a need to investigate whether this decline affects competence, performance, or both and what the ultimate cause of this decline is. Increasingly, evidence points toward fundamental maturational changes in certain aspects of memory. The challenge for critical period researchers is to tie such changes to both specific neurological antecedents and specific psycholinguistic corollaries. Regardless of one's view of the critical period, it is important not to overinterpret its implications for educational practice. The observation that "earlier is better" only applies to certain kinds of learning, which schools typically cannot provide. Therefore, the implication of critical period research seems to be that instruction should be adapted to the age of the learner, not that learners should necessarily be taught at a young age.
cation to this day (Marinova-Todd et al., 2000; Patkowski, 1994; Penfield & Roberts, 1959; see especially Scovel, 1988). The "younger is better" phenomenon has also been given a number of far-reaching theoretical interpretations, stretching from cognitive psychology (see, e.g., Newport, 1990), to neurology (e.g., Long, 1990; Pinker, 1995; Pulvermüller & Schumann, 1994; Scovel, 1988; Ullman, 2001; Walsh & Diller, 1981), to evolutionary theory (e.g., Hurford, 1991; Hurford & Kirby, 1999).

The origins of the modern debate¹ about AoA in L2 learning are usually traced back to Penfield's epilogue on the learning of languages in Penfield and Roberts's (1959) book Speech and Brain-Mechanisms. In the 1950s and 1960s, the neurologist Penfield was a staunch advocate of early immersion education (see Scovel, 1988), but the researcher whose name became most strongly associated with the issue of age in language learning is Lenneberg, whose 1967 book Biological Foundations of Language contains the first use of the term critical period (CP) in the context of language acquisition (p. 158, pp. 175 ff.). A number of authors have suggested that this term has too absolute a ring to it, and that optimal or sensitive would be a better term than critical, but the term critical has stuck and is the only one we use in what follows, without necessarily implying an absolute end point for all language acquisition. See Oyama (1978) and Schachter (1996) for more discussion of the terms critical and sensitive and Scovel (1988) for further discussion of the history of the CP debate in the 20th century.

Terminological squabbles aside, it is important to be clear about what we mean by CP. Far too often, in our opinion, the idea of a CP is rejected because of specific interpretations of it, rather than because of the core idea. In particular, a number of authors seem to think that the term should be rejected if no clear cause for the "younger is better" phenomenon can be found in what is known about neurological development (e.g., Flege, 1987; Snow, 1987) or if processing rather than representation is at issue (e.g., McDonald, 2000). We adhere to a broader interpretation of the term, which does not prejudge its causes and is more in line with Lenneberg's (1967) definition: It is "automatic acquisition from mere exposure" that "seems to disappear after this age" (p. 176), regardless of the exact nature of the underlying maturational causes. As Oyama (1978) put it:

It is a developmental phenomenon not in that it is determined by the genes in some rigid or direct way, but rather insofar as it reflects an intricate sequence of interactions between the developing phenotype and the environment, which is sufficiently typical of the species that it appears despite individual differences and widely varying experiences. (p. 10)

More specifically, we use the term critical period hypothesis (CPH) in this chapter to designate the idea that language acquisition from mere exposure (i.e., implicit learning), the only mechanism available to the young child, is severely limited in older adolescents and adults.

The hypothesis applies to both first language (L1) acquisition and SLA. Evidence from L1 acquisition not only is the most dramatic, but also, fortunately, is limited in quantity. Until fairly recently, it consisted of hard-to-interpret findings about "feral children," who had largely failed to acquire language at an older age having been deprived of normal input during the CP. The best-documented cases are Victor, "the wild boy of Aveyron" (Lane, 1976), Genie (Curtiss, 1977), and Chelsea (Curtiss, 1988), but even in these cases, it was impossible to determine to what extent extreme social deprivation, maybe even food deprivation or sensory deprivation, may have been confounded with language input deprivation. In the last 15 years or so, more systematic research with deaf children born to hearing parents, and therefore deprived of good signed input until grade school or later, has demonstrated the strength of the AoA effect when lack of input was not confounded with extreme forms of social deprivation: The older the age of first exposure to American Sign Language is, the worse the ultimate attainment is (Emmorey, Bellugi, Friederici, & Horn, 1995; Grimshaw, Adelstein, Bryden, & MacKinnon, 1998; Mayberry, 1993; Mayberry & Eichen, 1991; Newport, 1990). Evidence about how the hypothesis applies to ultimate attainment in SLA is discussed in detail in the remainder of this chapter.

The hypothesis does not apply, however, to rate of acquisition. Since Krashen, Long, and Scarcella (1979) stressed the difference between rate of learning and ultimate attainment, this distinction has been generally accepted. The fact that adults or adolescents seem faster or at least no slower in the initial stages of acquisition compared to children (for morphosyntax, Slavoff & Johnson, 1995, and Snow & Hoefnagel-Höhle, 1978; for phonology, Ekstrand, 1976; Fathman, 1975; Harley, 1986; Loewenthal & Bull, 1984; Morris & Gertsman, 1986; Olson & Samuels, 1973; and Thogmartin,
1982) has no direct bearing on the CPH, even though ultimately the two phenomena (i.e., faster initial acquisition but more limited ultimate attainment by older learners) may be due to the same underlying cause, as is argued below.

In what follows, then, we largely restrict the discussion to ultimate attainment in SLA. We first present a fairly detailed and systematic summary of empirical findings and then discuss possible interpretations.

Empirical Findings on Age of Acquisition and Second Language Acquisition

The Basic Argument: Age of Acquisition–Proficiency Correlations

Evidence From Oral Grammaticality Judgment Tests
Since the late 1970s, studies have been accumulating that document a strong negative correlation between AoA and L2 proficiency. A variety of techniques have been used to assess L2 proficiency in this kind of research, but the most commonly used instrument has been grammaticality judgments of auditorily presented stimulus sentences. In part, this is because of the visibility of Johnson and Newport's (1989) study, which led to a number of replications or semireplications. Johnson and Newport (1989) had a sample of Chinese or Korean speakers listen to 276 short English sentences and judge them as correct or incorrect. The sentences were designed to represent 12 basic grammar structures of English. Johnson and Newport found a correlation of −.77 between AoA and L2 proficiency (−.87 for AoA below 16 and −.16 for AoA above 16).

Other studies have found correlations that were not as high, but similar (see Table 5.1), not only with native speakers of Korean and Chinese, but also with speakers of Vietnamese, Spanish, Hungarian, and Russian. Although most studies used the number of items correct (or the error rate) on the grammaticality judgment test (GJT) as the only outcome variable, three others also documented the reaction time (RT; R. Kim, 1993; McDonald, 2000; Shim, 1993). All found negative correlations between AoA and test score and positive correlations with error rate or RT. A number of studies also presented correlations of length of residence (LoR) with proficiency or correlations of AoA with proficiency after LoR was partialed out. They typically found that AoA is a much better predictor than LoR (DeKeyser, 2000, even found a zero correlation for LoR and proficiency), with the one exception of proficiency in terms of error rate in the work of Shim (1993), for which LoR was a better predictor than AoA.

In conclusion, all the studies that used grammaticality judgments of auditory stimuli found a moderate to very strong negative correlation between AoA and L2 proficiency. In most cases, there was a strong decline up to a certain age and a leveling off for adults or a strong decline around puberty with little age differentiation within the child and adult groups. The studies of Bialystok and Miller (1999) and Birdsong and Molis (2001) were the only ones that found a strong AoA effect within the adult group. In the former study, this may have been because of the very low LoR of some members of the adult group, which means their test results were not representative of ultimate attainment. In the latter study, the correlation appears to be largely because of two outliers (two of the five oldest members of the group had extremely low scores).² The majority of studies, then, clearly documented a decline (of considerably varying degree) during childhood, followed by a low to zero correlation through adulthood.

Evidence From Other Tests of Morphosyntax
Written GJTs have tended to yield lower, but still significant, AoA–proficiency correlations (see Table 5.2). Especially interesting are the correlations obtained with the oral and written presentation of the same stimuli. Johnson (1992), with the same stimuli as in Johnson and Newport (1989) and the same subjects tested a year later, found an AoA–GJT correlation of −.54 for the group as a whole and −.73 for AoA less than 15 (as opposed to −.77 and −.87, respectively, for oral presentation). Jia (1998) found an AoA–GJT correlation of −.35 for written presentation (as opposed to −.68 for oral presentation). Bialystok and Miller (1999), however, found no significant correlation with AoA for written presentation (the actual r was not reported).

A couple of studies used written grammaticality judgments only and did not calculate correlations between AoA and GJT scores, but simply compared the mean scores for different AoA groups: Coppieters (1987) and Sorace (1993) found significant differences between natives and adult acquirers (in Coppieters's case, there was even no overlap between natives and nonnatives), in spite of the fact
that participants in both studies were considered nativelike.

Some researchers have used other kinds of passive testing (not requiring production). Lee and Schachter (1997) used a picture/sentence matching task, and Oyama (1978) used a listening comprehension test with varying levels of masking with white noise.

Several other studies were based on production data. Hyltenstam (1992) administered oral and written production tasks; Ball (1996) used the Bilingual Syntax Measure; and Patkowski (1980, see also 1990) had native-speaking judges give global syntactic proficiency ratings on the basis of written transcriptions of 5-minute audio recordings.

The pattern for these various types of morphosyntactic outcome variables is very much the same as for the oral GJTs discussed above: strong negative correlations between AoA and morphosyntactic score, no overlap between native-speaking and non-native-speaking groups, or at the very least a significant difference between native and adult acquirers.

Table 5.1 Correlations Between Age of Acquisition (AoA) and Second Language (L2) Proficiency as Measured by Oral Grammaticality Judgment Tests

| Study | L1 | L2 | LoR | n | AoA | r | Comments |
|---|---|---|---|---|---|---|---|
| Johnson and Newport, 1989 | Chinese, Korean | English | 3–26 | 46 | 5–39 | −.77 | For all learners |
| | | | | | | −.87 | For AoA < 16 |
| | | | | | | −.16 | For AoA > 16 |
| Johnson and Newport, 1991 | Chinese | English | 5–15 | 21 | 4–16 | −.63 | All AoA < 16 |
| R. Kim, 1993 | Korean | English | Minimum 3 | 30 | 3–35 | .66 | For error rate |
| | | | | | | .55 | For reaction time |
| Shim, 1993 | Korean | English | 5–21 | 60 | 0–29 | .45 | For error rate |
| | | | | | | .71 | For reaction time |
| Jia, 1998 | Various | English | 5–32 | 105 | 3–34 | −.68 | |
| Flege, Yeni-Komshian, and Liu, 1999 | Korean | English | Minimum 8 | 240 | 1–23 | −.71 | For AoA < 15 |
| | | | | | | −.23 | For AoA > 15 |
| Bialystok and Miller, 1999 | Chinese | English | 1–18 | 33 | 1–32 | −.82 | For AoA < 15 |
| | | | | | | −.57 | For AoA > 15 |
| | Spanish | English | 2–23 | 28 | 3–41 | −.68 | For AoA < 15 |
| | | | | | | −.51 | For AoA > 15 |
| DeKeyser, 2000 | Hungarian | English | Minimum 10, Mean 34 | 57 | 1–40 | −.63 | For all learners |
| | | | | | | −.26 | For AoA < 16 |
| | | | | | | −.04 | For AoA > 16 |
| McDonald, 2000 | Spanish | English | 3–24 | 28 | 0–20 | −.61 | For test score; for RT, late acquirers differ from all others |
| | Vietnamese | English | 9–23 | 24 | 0–10 | −.59 | For test score; for RT, all L2 acquirers differ from L1 acquirers |
| Birdsong and Molis, 2001 | Spanish | English | Minimum 10 | 61 | 3–44 | −.24 | For AoA < 16 |
| | | | | | | −.69 | For AoA > 16 |

Table 5.2 Correlations Between Age of Acquisition (AoA) and Second Language (L2) Proficiency as Measured by Other Tests of Morphosyntax

| Study | L1 | L2 | LoR | n | AoA | r | Comments |
|---|---|---|---|---|---|---|---|
| Johnson, 1992 | Chinese, Korean | English | 4–27 | 46 | 5–39 | −.54 | For all learners |
| | | | | | | −.73 | For AoA < 15 |
| Jia, 1998 | Various | English | 5–72 | 105 | 3–34 | −.35 | |
| Bialystok and Miller, 1999 | Chinese | English | 1–18 | 33 | 1–32 | ns | Exact r not reported |
| | Spanish | | 2–23 | 28 | 3–41 | | |
| Coppieters, 1987 | Various | French | Minimum 5.5, Mean 17.4 | 21 | Minimum 18 | Not reported | No overlap with NS group |
| Sorace, 1993 | English | Italian | 5–15 | 24 | 18–27 | Not reported | Significant difference with NS group |
| | French | | | 20 | | | |
| Lee and Schachter, 1997 | Korean | English | 2.2–4.6 | 76 | 3–24 | Not reported | AoA > 15 worst; AoA 11–15 best |
| Oyama, 1978 | Italian | English | 5–20 | 60 | 6–20 | −.57 | r is for AoA–L2 with LoR removed |
| Hyltenstam, 1992 | Finnish, Spanish | Swedish | No information (sound native) | 24 | <15 | Not reported | No overlap with NS group |
| Ball, 1996 | Greek | English | 10–72 | 102 | 1–40 | −.62a | Marked decline for AoA > 16 |
| Patkowski, 1980 | Various, mostly Indo-European | English | 6–61 | 67 | 55 | −.74 | |

Evidence From Phonological Measures
Almost no research has been conducted to test any specific area of phonological acquisition comparing child and adult L2 learners. A rare exception is Ioup and Tansomboon's (1987) small-scale experiment on the acquisition of Thai tone by L1 English
speakers. Their child SL learners did well on measures of tone; among adult learners, even the most advanced performed poorly.

In contrast, many studies have examined the pronunciation of individuals with differing ages of first exposure to L2 through global measures of phonological competence, such as ratings of sentence reading or free production tasks (see Table 5.3). These ratings were based on production, not perception. Judges usually rated the learners on a scale of accentedness, for which one end of the scale indicated "native speaker" and the other end indicated "strong foreign accent." This means that judges were taking many kinds of phonological evidence into account in their ratings, including segments, syllable structure, stress, intonation, and rhythm. Thus, it may be said that studies in this area measured global pronunciation ability and not necessarily underlying phonological abilities. In general, the longer and less constrained the speech sample to be judged was, the more sharply foreign accents were noted by judges (cf. Neufeld, 1988, for an example). In addition, being found accent free in an L2 does not seem to be guaranteed for any AoA, but has been found statistically more probable for earlier AoA.

The first large-scale study that examined the pronunciation of immigrants whose AoA varied from child to adult seems to be that done by Oyama (1976), who tested Italian immigrants to the United States and who had AoA from 6 to 20 years. Both a reading task and a sample of spontaneous speech showed less accent than paragraph readings; pronunciation ratings correlated highly with AoA when LoR was partialed out. Subsequent measures of pronunciation ability have tended to follow the same format as that of Oyama (1976).

Patkowski (1980) included a replication of Oyama's (1976) work with 67 immigrants of various backgrounds on a spontaneous speech task. Bongaerts, Planken, and Schils (1995) reexamined Patkowski's data and noted that 15 of the 33 subjects with AoA less than 15 obtained a perfect accent rating; none with a higher AoA received as high a score. Thompson (1991) tested Russian-speaking immigrants to the United States on sentence and passage reading and on a spontaneous speech task. Asher and García (1969) had Spanish-speaking immigrants read four sentences. In the work of Tahta, Wood, and Loewenthal (1981), immigrants to the United Kingdom read paragraphs that were judged on a 0–2 scale (no foreign accent to marked accent).

Several more recent studies have upheld the findings of these earlier studies, but have avoided some of their problems by consistently including controls, having a wider range of AoA, and reporting scores by AoA, not just at a certain predetermined cutoff point, such as above and below 15. Flege, Munro, and MacKay (1995) tested sentence production by Italian immigrants to Canada. Any bilinguals who received a mean rating that fell within two standard deviations of the mean native speaker rating were considered to have native accents. Flege, Yeni-Komshian, and Liu (1999) evaluated the performance of Korean immigrants to the United States in a similar way.

As can be seen in Table 5.3, all the studies that have used global accent ratings on sentences or paragraphs that were read or spontaneous speech have found that degree of accent increases as AoA increases. Many studies have found that even learners with AoA less than 6 cannot achieve the same ratings as native speakers. This may be a result of the postponement of onset of true exposure to the L2 until age 5–6 years even when AoA was earlier, however. At this point, the studies indicated that earlier is better as far as the probability of achieving accent-free L2 speech is concerned. In studies that conducted multiple regression analyses, AoA accounted for at least 50% of the variance in accent scores, with other factors such as sex, length of residence, motivation, identification with L2 culture, or self-confidence either not significant or adding only 5% or less to the total variance (see sections on input and social-psychological variables for more detail).

Evidence From Other Dependent Measures
A few studies used global self-assessments of proficiency rather than a test. Bialystok and Hakuta (1999) obtained data from the 1990 U.S. population census and used proficiency self-assessments on a 5-point scale from the Spanish and Chinese speakers in the state of New York with LoR less than 10. This yielded a sample of 38,787 for L1 Spanish and 24,903 for L1 Chinese. AoA–proficiency correlations for these two groups were −.52 and −.44, respectively.

Stevens (1999) also used the 1990 census data, but restricted her sample to subjects who were between 18 and 40 years old at the time of the census and did not exclude subjects with limited LoR. She did not report raw correlation coefficients, but graphically presented curvilinear relationships; AoA appeared to be a much stronger predictor of proficiency than LoR, at least for individuals with LoR greater than 5.
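The variance-partitioning logic behind the multiple regression results mentioned above (AoA accounting for at least half of the variance in accent scores, with other factors adding 5% or less) can be illustrated with a minimal sketch. The data are simulated and purely hypothetical: the variable names (`aoa`, `motivation`, `accent`), the effect sizes, and the choice of motivation as the second predictor are assumptions made for illustration and are not taken from any of the studies cited. The sketch compares the variance explained by AoA alone with the variance explained once a second predictor is added.

```python
import math
import random


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r2_two_predictors(y, x1, x2):
    """R^2 of a least-squares regression of y on x1 and x2 (with intercept),
    computed from the three pairwise correlations."""
    r1, r2, r12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


# Hypothetical sample: accent strength driven mainly by AoA,
# with a small independent contribution from motivation.
rng = random.Random(1)
aoa = [rng.uniform(1, 40) for _ in range(400)]
motivation = [rng.gauss(0, 1) for _ in aoa]
accent = [0.2 * a + 0.5 * m + rng.gauss(0, 2) for a, m in zip(aoa, motivation)]

r2_aoa = pearson_r(accent, aoa) ** 2           # step 1: AoA entered alone
r2_both = r2_two_predictors(accent, aoa, motivation)  # step 2: add motivation
added = r2_both - r2_aoa                        # variance added by motivation

print(f"R^2 (AoA only):          {r2_aoa:.3f}")
print(f"R^2 (AoA + motivation):  {r2_both:.3f}")
print(f"Added by motivation:     {added:.3f}")
```

In this simulated sample AoA carries most of the explained variance by construction; the point is only to show how "adding 5% or less to the total variance" is computed, not to reproduce any reported result.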
Table 5.3 Correlations Between Age of Acquisition (AoA) and Second Language (L2) Proficiency as Measured by Global Phonological Ratings

| Study | L1 | L2 | LoR | n | AoA | r | Comments |
|---|---|---|---|---|---|---|---|
| Oyama, 1976 | Italian | English | 5–20 | 60 | 6–20 | .69 (stories) | For error rate; LoR partialed out |
| | | | | | | .83 (paragraph) | |
| Patkowski, 1980 | Various | English | 6–61 | 67 | 5–50 | .76 | |
| Thompson, 1991 | Russian | English | Not given | 36 | 4–42 | .81 | For error rate |
| Asher and Garcia, 1969 | Spanish | English | Mean 5 | 71 | 1–19 | Not reported | For AoA > 12: only 7% near native |
| Tahta et al., 1981 | Various | English | Minimum 2 | 109 | Minimum 6 | .66 | For error rate; AoA < 6: all perfect |
| Flege, Munro, et al., 1995 | Italian | English | 15–44 | 240 | 2–23 | Not reported | AoA accounts for 60% of variance; AoA > 16: none native-like |
| Flege et al., 1999 | Korean | English | Minimum 8 | 240 | 1–23 | .62 | For AoA < 12 |
| | | | | | | .5 | For AoA > 12 |
| Yeni-Komshian, Flege, and Liu, 2000 | Korean | English | Minimum 8 | 240 | 1–23 | .85 | Significant difference between NS and all Koreans |
| Yeni-Komshian, Robbins, and Flege, 2001 | Korean | English | Minimum 8 | 192 | 6–23 | .31 (Consonants) | For error rate |
| | | | | | | .69 (Vowels) | |
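The nativelikeness criterion described above for Flege, Munro, and MacKay (1995), a mean rating within two standard deviations of the mean native-speaker rating, can be sketched as follows. Everything concrete here is an assumption for illustration: the ratings, the 9-point scale, the use of the sample standard deviation, and the function names `native_range` and `is_nativelike` are hypothetical and not taken from the study.

```python
import statistics


def native_range(native_ratings, k=2.0):
    """Return the (low, high) band of mean +/- k sample standard deviations
    computed over the native-speaker control group."""
    mean = statistics.mean(native_ratings)
    sd = statistics.stdev(native_ratings)  # sample SD; an assumption
    return mean - k * sd, mean + k * sd


def is_nativelike(rating, native_ratings, k=2.0):
    """True if a learner's mean rating falls inside the native band."""
    low, high = native_range(native_ratings, k)
    return low <= rating <= high


# Hypothetical mean accent ratings (9 = no foreign accent).
native = [8.6, 8.9, 8.4, 9.0, 8.7, 8.8]
bilinguals = {"AoA 4": 8.5, "AoA 11": 7.9, "AoA 19": 4.2}

labels = {who: is_nativelike(r, native) for who, r in bilinguals.items()}
print(native_range(native))
print(labels)
```

Note that the width of the band depends on the variability of the native control group: a more heterogeneous control group makes the criterion easier to meet.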
Seliger, Krashen, and Ladefoged (1975) sampled 394 immigrants to the United States and Israel to obtain self-reports on pronunciation. The researchers asked the participants if they thought their English/Hebrew would be viewed as nativelike by the native speakers of the country where they resided. Of the participants who arrived in the country before age 10 years, 85% believed they would be taken for native speakers; only 7% of those who arrived at 16 years or older claimed the same.

Finally, a few researchers have used neurological rather than linguistic measures as dependent variables in a morphosyntactic task. K. H. S. Kim, Relkin, Lee, and Hirsch (1997) had 12 highly proficient English L2 speakers of various L1 backgrounds engage in a silent narration task and measured activity in narrowly defined segments of Broca's and Wernicke's areas. Comparing subjects exposed to L2 in infancy and in early adulthood, functional magnetic resonance imaging revealed a significant difference in the location of strongest activity within Broca's area.

Weber-Fox and Neville (1996) used event-related potentials to compare the brain activity of Chinese-English bilinguals with AoA varying from 0 to 16. They found significant differences compared with natives for semantic processing tasks in speakers with AoA above 11 and for syntactic processing tasks in speakers with AoA above 4. The same researchers also documented accuracy differences in the same group of bilinguals for AoA above 16 in semantic judgments and AoA above 4 in syntactic judgments (even for AoA > 1 in the case of subjacency violations). Also using ERP, Hahne (2001) found similar effects for Russian-German bilinguals with AoA greater than 10, with small differences in semantic processing, but clear qualitative differences for syntactic processing. For a discussion of these and other findings in neuroanatomical terms, see the work of Ullman (2001).

Some Counterevidence
A few studies are often cited as having documented equivalence of native speakers and adult acquirers: Bialystok (1997), Ioup, Boustagui, El Tigi, and Moselle (1994), Birdsong (1992), White and Genesee (1996) for morphosyntax, and Ioup et al. (1994) and work by Bongaerts and colleagues for phonology. Bialystok (1997) reported on two studies that actually showed an advantage for adults over children. This very result, however, along with the fact that no minimal LoR was reported, leads one to suspect that what was at issue here was a rate advantage of adults in early stages of learning (adults are known to acquire faster initially; see above).

In terms of morphosyntax, Ioup et al. (1994) merely documented that two adult learners of Arabic as an L2 (and language teachers themselves) did very well, not that they were completely indistinguishable from natives. Birdsong (1992) showed substantial overlap between a group of adult learners and a group of native speakers, but this may have been because of the high degree of variability within the native speaker group. Typically, CP studies are conducted on the acquisition of basic structures, for which native speakers obtain virtually perfect scores; this was not the case here.

White and Genesee (1996) showed that proficiency among a group of near-native English L2 speakers was not correlated with AoA, but not much correlation with any variable can be expected in a group that, by definition, can show almost no variation. Furthermore, the researchers acknowledged that the similarity of L2 to the L1 of many near-natives may have played a role (most members of the near-native category spoke a Germanic or Romance language, especially French).

Studies of exceptional learners in the realm of phonology by Bongaerts and his associates (Bongaerts, 1999; Bongaerts et al., 1995; Bongaerts, Mennen, & Van der Slik, 2000; Bongaerts, Van Summeren, Planken, & Schils, 1997) have shown that some learners who sound very nativelike in ordinary conversation can obtain accent ratings in the native speaker range when they read sentences. Bongaerts and colleagues tested participants who learned L2 Dutch and L2 French in both naturalistic and instructed settings. Bongaerts et al. (1997) claimed that passing for a native speaker in a sentence-reading task is a feat that can only be accomplished if the learner has a nativelike underlying competence, but we would like to see such feats accomplished on passages of constrained free speech (like those found in the work of Ioup et al., 1994) before such a claim could be definitively accepted, especially when some researchers claim that Bongaerts et al.'s findings show that nonnative speakers can speak the L2 without a detectable foreign accent (Flege & Liu, 2001, pp. 549–550).

Conclusion
A large number of studies have documented large child-adult differences or strong correlations between AoA and L2 proficiency. These studies have used a wide variety of testing formats and dependent variables (even though grammaticality judgments are the most common for morphosyntax, and global pronunciation ratings are
the most common for phonology) with speakers of The AoA effects are too pervasive to be explained
a wide variety of languages (even though Chinese completely by lack of high-quality input and
and Korean speakers are most strongly represented interaction.
for morphosyntax). Only four studies have found If amount of practice could explain the AoAL2
substantial overlap between adult and native ac- correlation, then why would there be no correla-
quirers for morphosyntax, and in all four, the con- tion between L2 prociency and LoR in many
tradictory results could probably be explained by studies of morphosyntax? In virtually every study
conceptual problems or methodological issues. cited above, LoR was a nonsignicant predictor of
Those studies cited for phonology have shown that L2 after the effect of AoA was removed; in some
some learners can achieve very high levels of nati- other studies, even the raw correlation was non-
velike pronunciation in mostly constrained tasks, signicant, and in the work of DeKeyser (2000),
but have yet to show that late learners can achieve it was exactly zero. Flege et al. (1999) looked at
the same level of phonology as native speakers in LoRL2 relationships for specic structures and
spontaneous production. found that LoR was the best predictor for lexically
based items, but not rules.
For phonology, correlations of accent ratings
Counterarguments: Reinterpreting with LoR were not signicant in some of the
the Age of Acquisition Effect studies cited above (Oyama, 1976; Tahta et al.,
1981); others found a correlation that did not add
A number of subtler counterarguments against the CPH have been raised. Although few researchers doubt anymore that there is a strong effect of AoA on ultimate L2 attainment, many question the maturational interpretation of this effect, arguing that AoA is confounded with a number of environmental variables that are the true cause of the decline in ultimate attainment. Others do believe the cause of the AoA-related decline is to be found within the individual, but they question whether the shape of the AoA–proficiency function is compatible with the traditional concept of a CP coming to an end around puberty.

Input and Practice. One of the oldest counterarguments concerns the role of environmental input. Adults are likely to receive input that is quantitatively and qualitatively different from that which children obtain from their caregivers. Strongly simplified language aimed at improving comprehension, and often believed to contribute to acquisition, is probably often avoided in interaction with adults for fear it may cause offense. On the other hand, many adults do not receive input with a variety of social functions, as would be the case for a child, because the use of L2 in their professional lives is limited to impoverished, almost stereotyped interactions, such as initial meetings and shopping scenarios. Although this argument cannot be rejected for some individual learners, it is clear that many adults fossilize at a level high enough to make highly simplified input irrelevant. Many others marry native speakers of the L2, yet fail to reach native levels, even after decades of using the L2 most of the time in their social lives.

significantly or only very little (<5%) to variance in a multiple regression (Flege, Munro, et al., 1995, 1999; Thompson, 1991).

However, some significant correlations have been found in studies, such as those of Asher and García (1969), Snow and Hoefnagel-Höhle (1978), Suter (1976), and Purcell and Suter (1980). Snow and Hoefnagel-Höhle only looked at acquisition in the first year of naturalistic exposure; Asher and García were apparently testing young children who were probably still enjoying the beneficial effects of a CP. Suter (1976) and Purcell and Suter (1980) found significant effects for LoR, but the range of LoR and AoA of the participants was unspecified. Flege, Takagi, and Mann (1995) found significant LoR effects for adult L2 learners, but a replication of this study by Larson-Hall (2001) could not duplicate Flege, Takagi, et al.'s results. Flege and Liu (2001) found a significant and positive partial correlation with LoR and results on a phonological identification task by Chinese learners who were students, but the nonstudents with shorter LoR scored just as highly initially as the students with longer LoR. Thus, it is not clear whether LoR or input was the real factor leading to these L2 speakers' proficiency.

In addition, even if a correlation between LoR and L2 proficiency could be solidly established, correlation is not causation, just like the correlation between AoA and L2 proficiency does not automatically imply causation. Even more important, the practice argument is hard to use in the case of immigrants who have used the L2 almost exclusively for 30 or 40 years. By then, they may have had twice as much practice in L2 as a monolingual college student in L1.
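The confounding problem discussed above can be made concrete with a small numerical sketch. The Python code below is an illustration under stated assumptions, not a reanalysis of any cited study: the data are simulated, the variable names and the helper `partial_corr` are hypothetical, and proficiency is constructed to depend on AoA alone while LoR is merely tied to AoA (as happens when all participants are tested at a similar age). Under those assumptions, LoR correlates strongly with proficiency, yet the correlation largely disappears once AoA is partialed out.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y with z partialed out."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


random.seed(1)
n = 2000

# Hypothetical sample: age of arrival between 1 and 40 years.
aoa = [random.uniform(1, 40) for _ in range(n)]
# Assumed confound: everyone is tested at a similar age, so LoR shrinks as AoA grows.
lor = [45 - a + random.gauss(0, 3) for a in aoa]
# Assumed ground truth: proficiency depends on AoA only, plus noise.
prof = [100 - 1.5 * a + random.gauss(0, 8) for a in aoa]

r_zero = pearson(lor, prof)               # zero-order LoR-proficiency correlation
r_partial = partial_corr(lor, prof, aoa)  # same correlation with AoA removed

print(f"zero-order r(LoR, proficiency): {r_zero:.2f}")
print(f"partial r with AoA removed:     {r_partial:.2f}")
```

A partial correlation that remained substantial after removing AoA would instead suggest a contribution not accounted for by AoA, though still not a causal one.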

More recently, however, the input-and-interaction argument has taken a new direction and led to more quantifiable hypotheses. Bialystok and Miller (1999), Jia (1998), and McDonald (2000) all documented a strong relationship between individuals' knowledge or use of L1 and their proficiency in L2 morphosyntax. Jia (1998) found a correlation of .61 between AoA and L1 proficiency, the inverse of and almost as strong as the correlation of .68, which she found for AoA and L2. Bialystok and Miller (1999) found a correlation of .63 between L1 and L2 proficiency, but only for the Chinese, not the Spanish, sample in their study and only for oral, not written, grammaticality judgments.

Such findings about the influence of the use of L1 also exist for phonology. Earlier studies such as that of Tahta, Wood, and Loewenthal (1981) found that the factor of use of English at home accounted for 9% of variance in a multiple regression analysis. Flege, Frieda, and Nozawa (1997) found that, within AoA groupings, Italian immigrants to Canada differed in scores on sentence accent depending on whether they reported high or low use of Italian, with those reporting lower use of the L1 scoring better. Yeni-Komshian, Flege, and Liu (2000) found that there was a significant negative correlation (r = −.65) between L1 and L2 pronunciation scores for Korean immigrants to the United States. This inverse relationship was only significant, however, for the groups with AoA 1–11 and not for adult learners. These findings are important, and the hypothesis that a high level of proficiency in any language requires an enormous amount of practice, and that therefore bilinguals must show a trade-off between L1 and L2 proficiency, is certainly a plausible one (cf. Grosjean, 1998), but again the correlation with use does not explain the very robust AoA effects documented.

Social-Psychological Variables. It is well known that variables such as integrative motivation, risk taking, self-consciousness, attitude toward the L2 community, and identification with the L2 culture are significant predictors of success in L2, particularly in naturalistic contexts (see, e.g., Gardner, 1985; Krashen, 1981; Skehan, 1989, 1998; Spolsky, 1989, 2000). Therefore, early discussions of the CPH often include the hypothesis that the reason for the AoA–L2 correlation is to be found in social-psychological factors. Older learners tend to be more self-conscious, tend to have less of a need to integrate fully into the L2 community, and tend to identify less fully with the L2 culture; therefore, they would be expected to be less successful in L2. Again, however, just as for the input variable, these correlations do not explain the fact that the AoA effect is so pervasive, even for people who seem to give off sparks of enthusiasm for the L2, its community, and its culture. Moreover, every single study on AoA and L2 morphosyntax that includes stepwise regressions, or partial correlations for social-psychological variables with the effect of AoA partialed out, has found these variables not to be significant (Jia, 1998; Johnson & Newport, 1989; Oyama, 1978). For phonology, in studies that have compared both younger and older learners, such variables have contributed either not at all (Oyama, 1976; Thompson, 1991; Yeni-Komshian et al., 2000) or very little (Flege, Munro, et al., 1995; Flege et al., 1999). One exception was the work of Moyer (1999), but this study only included speakers with AoA above 11, which means there was little room left for variation as a function of AoA.

Maturation Without a Critical Period. Inherent in the idea of a CP is the concept of an end point, a point beyond which learning becomes difficult or impossible. This very idea of an end point implies that by then maturation has taken its course, and not much further decline for the same maturational reasons can be expected. Therefore, a litmus test for the CPH seems to be whether there is a discontinuity in the AoA–proficiency function. If ultimate attainment kept declining as a function of AoA at more or less the same rate, even for immigrants who arrived in the L2 environment well into adulthood, then this would constitute a serious challenge to the CPH (see, e.g., Birdsong, chapter 6, this volume).

Bialystok and Hakuta (1999) and Stevens (1999) both documented a significant decline through adulthood when L2 proficiency was operationalized as self-assessment on the 5-point scale used by the 1990 U.S. Census. There are two reasons, however, to be very cautious in interpreting their figures. First, the validity of self-assessments is particularly problematic in this research context. The idea that younger is better for L2 learning is widespread in the population, but the idea of a sharp decline at a very specific AoA is not. Therefore, subject expectancy is a serious problem when using this particular instrument for this particular research question: The older immigrants were when they arrived, the lower their expectations for L2 proficiency. Moreover, the particular scale used by the U.S. Census has a strange quirk. The 5 points on
the scale are labeled as "not at all," "not well," "well," "very well," and "speak only English." It is clear that being monolingual in English does not necessarily mean speaking better than a bilingual, yet this is what this scale seems to imply. (And even if only the percentage of respondents who say "very well" as a function of age was analyzed, that would still be in part a function of how many people said "speak only English" because younger immigrants who have become monolingual cannot say "very well.")

Birdsong and Molis (2001) also found a significant decline through adulthood, even though they used the same grammaticality judgment items as Johnson and Newport (1989) and several other studies. In fact, they found a much stronger decline through adulthood than through childhood and early adolescence. As argued here, however, the strong AoA–L2 correlation for adults in this study appears to be largely because of the effect of outliers. (Note also that this study did find a discontinuity around AoA 15–18, even though in the opposite direction compared to Johnson and Newport.) Finally, Bialystok and Miller (1999) found a continuing decline in early adulthood, but only for Chinese learners, for whom the LoR (from 1 to 6 years) makes it clear that the scores do not reflect ultimate attainment.

Some studies in phonology have found that the highest probability of being judged as having a nativelike accent is found for speakers with an AoA lower than 6 (Flege et al., 1999; Oyama, 1976; Thompson, 1991), but because others have reported their data based on predetermined groupings (such as before and after AoA 15 in the work of Patkowski, 1980), it is not always easy to see where discontinuities might appear within those groups. For further comments on the shape of the regression line, see the section on Methodology.

Lack of Qualitative Differences. If the effect of AoA on ultimate attainment does not reflect mere differences in input or use, it must reflect differences in learning mechanisms. Therefore, as Hakuta (2001) stated, it is important for proponents of the CPH to show different patterns of acquisition in adults and children. Hakuta (2001) argued that such differences have not been found in studies on the role of L1 in L2, studies of the role of universal grammar (UG), and studies on acquisition orders. Although this may be largely true, almost no research has been carried out to establish such differences (even among the many studies on acquisition orders, for instance, very few make any child–adult comparisons, with the 1987 work of Pak a rare study that did), and many other possible qualitative differences have not been researched at all.

On the other hand, the couple of studies that have tried to assess qualitative differences indirectly by assessing which aptitude factors played the biggest role for younger and older learners did find significant differences. DeKeyser (2000) showed that verbal aptitude was a significant predictor for adults, but not for children (or conversely, that there was no significant AoA effect for high-aptitude learners, but there was for the others). Harley and Hart (1997) showed that aptitude was the best predictor for older learners; memory was the best predictor for young children. The claim that no qualitative differences in L2 learning exist between children and adults appears premature, to say the least.

Conclusion. A number of important questions have been raised about the shape of the AoA–proficiency function and its interpretation. It appears that a number of factors, such as differences in input, use of L1 and L2, and a variety of social-psychological factors may reinforce the AoA effect, but they far from fully explain it. Nor is there sufficient evidence at this point of a continuous decline through childhood and adulthood, which would threaten the CP interpretation of the AoA–proficiency correlation. Many studies, on the other hand, have documented the kind of discontinuity in the AoA function that is expected under the CPH.

Interpreting the Findings

Even if the arguments above allow us to reject nonmaturational explanations for the AoA effects documented, that still leaves us with a lot of unanswered questions. What exactly is the nature of the developmental changes that underlie the "younger is better" phenomenon? To understand them better, the data certainly must be examined more closely. What kinds of L2 structures are affected the most? Do the same differences appear with different testing formats? Are AoA differences reflected equally in accuracy and RT? And, what does all of that tell us about the qualitative nature of the CP?
learning capacities somewhere between ages 4 and 18 years. That still leaves the question of how this qualitative change should be characterized.

Bley-Vroman (1988) argued that there is a fundamental change in the sense that children use domain-specific learning procedures and adults general problem-solving systems. DeKeyser (2000), on the basis of the different role aptitude plays at different AoAs (as documented in his study and in the work of Harley and Hart, 1997), further interpreted this difference in terms of children as largely limited to implicit learning and adults largely limited to explicit learning. Ullman (2001), on the basis of neurolinguistic evidence, came to a similar conclusion about implicit/procedural versus explicit/declarative learning as a function of AoA.

Whether these interpretations are correct, they still beg the question of why such qualitative differences would obtain. Several strands of research, however, provided some intriguing suggestions, mostly along the lines of age differences in the size of working memory, referred to variously as the less-is-more hypothesis (Newport, 1990, 1991), the advantage of starting small (Elman, 1993), or the consequences of learning through a narrow window (Kareev, 1995).

Newport first argued that "language learning declines over maturation precisely because cognitive abilities increase" (1990, p. 22). The reasoning behind this claim is that reduced storage of language input because of memory limitations actually simplifies the computation of basic form–meaning relationships at the morpheme level and thereby avoids or reduces formulaic learning, which initially speeds up acquisition, but turns out to be a developmental dead end. Goldowsky and Newport (1993) provided evidence for this view in the form of mathematical modeling.

Cochran, McDonald, and Parault (1999) showed that adults who were given instructions to practice American Sign Language strings holistically or who had to perform under dual-task conditions learned sign language structure better than adults under normal conditions. In the same vein, Kersten and Earles (2001) found that adults learned the morphology of a miniature linguistic system better when they were initially presented with only small segments of the language. Elman (1993) showed that training neural networks to process complex sentences succeeded only when working memory was limited in initial stages of learning and argued in some detail why such findings could be seen to follow logically from various characteristics of neural networks.

Kareev and his associates made a slightly different argument. Capacity limitations are beneficial not only because of their input-filtering effect, but also because they act as an amplifier for correlation patterns in the data. Theoretical analyses of this phenomenon were provided in the work of Kareev (1995, 2000), and empirical corroboration through experimental data from adults with different working memory capacities and exposed to samples of different sizes can be found in the study of Kareev, Lieberman, and Lev (1997).

In spite of this variety of theoretical and empirical arguments, however, there are reasons to remain skeptical. First, there is no direct proof that memory limitations are the cause of children's initially slower, but ultimately more successful, learning; there is only an indirect chain of reasoning in the sense that children have been shown to have limited memory capacities, and that limited memory capacities have been shown to be advantageous in adult language learning and in various forms of mathematical and computational modeling. Second, some empirical research on neural networks has failed to replicate Elman's results (Rohde & Plaut, 1999). Finally, the formulaic language learning that supposedly helps to speed up learning initially, but is detrimental for ultimate analysis, has been documented repeatedly in L2 learning by young children (e.g., Wong-Fillmore, 1976; Peters, 1983). Thus, it seems to characterize child L2 learning at least as much as adult L2 learning and can therefore not be attributed to a disadvantage in the area of language learning brought about by general cognitive maturation.

Whether the less-is-more hypothesis or any other cognitive explanation of an ontogenetic nature is accepted, it is obvious that the cognitive changes postulated will have to be represented somehow in the brain. We reported above on a number of studies that have documented differences in neurological representation between language acquirers of different AoAs. There, of course, the documented neurological differences are the result, not the cause, of the different ages of exposure. But, is there any evidence of the opposite, that is, maturationally determined differences in the brain that lead to differences in level (and mechanism) of acquisition? Lenneberg (1967), Long (1990), Penfield and Roberts (1959), Pinker (1995), Pulvermüller and Schumann (1994), Scovel (1988), and innumerable others have pointed to various aspects of neurological maturation that take place between birth and puberty. To this day, however, nobody
has even hypothesized, let alone proven, a correlation between the development or decline of a specific neurophysiological mechanism and the incomplete acquisition of specific elements of language that is so characteristic of late learners.

This is not to say that such a link cannot be found. Perhaps there is a way in which specific aspects of neurological maturation lead to a serious reduction in the capacity for implicit learning of linguistic structures, for instance (which may affect different structures differently depending on their salience and corollary ease of explicit learning), but any such explanation remains highly speculative. Clearly, just about everything develops between birth and puberty, so that causal interpretations of correlations are even more suspicious here than they are in general.

Practical Implications

Although the ultimate causes of the CP are still a matter of speculation, few issues in SLA theory have had as much practical impact as the CPH. From Canadian-style immersion programs in the early 1960s to foreign language in the elementary school (FLES) programs designed and implemented in various countries at this point in time, many administrators have justified the very existence of their curricula with references to the literature that shows "younger is better." As a result, opponents of the CPH, such as Hakuta (2001) and Marinova-Todd et al. (2000), argued that the CPH has had a very questionable impact on practice.

In reality, however, the correctness of the CPH is largely irrelevant to arguments for or against formal language teaching at an early age. FLES programs (and probably form-focused partial immersion programs) do not capitalize on the implicit learning skills of the child because of their focus on form and, more important, because of the limited time involved. Implicit learning works slowly and requires many years of massive input and interaction, which even 12 years in an early immersion program apparently cannot provide in sufficient quantity to lead to near-native proficiency (cf., e.g., Swain, 1985). Therefore, even if the CPH is correct, one cannot expect any substantial proficiency after several years of FLES (typically less than an hour a day), as many parents have come to realize. Patkowski (1994) already pointed this out, but has often been ignored.

On the other hand, some college students become discouraged when they hear about the CPH because they think it implies adolescents and adults can no longer learn a foreign language well. This inference clearly also is not justified given that many CP studies have documented a number of highly successful (if imperfect) adult learners.

Both misunderstandings stem from a superficial interpretation of "younger is better." As was pointed out in the introduction, younger learners are not faster. Nor are they any better at understanding rules of grammar; just the contrary. Whatever may be thought of the various tentative explanations for the CP mentioned here, the fact remains that what children are good at is ultimate attainment as a result of prolonged intensive exposure. What many adults, especially the more verbally gifted, are good at is relatively quick grasping of certain abstract patterns that can easily be made explicit.

DeKeyser (2000) and Harley and Hart (1997) showed that SLA success among children depends more on memory and success among adults more on analytical skill. Robinson (2002) showed that analytical skill plays a big role in explicit, not implicit/incidental, learning, and that (working) memory plays a bigger role in incidental learning. Taken together, these results strongly suggest that children and adults use different mechanisms for learning, which draw on different aptitudes, and that these different aptitudes play a different role depending on the instructional approach.

Further confirmation of this hypothesis came from more pedagogically oriented studies, such as that of von Elek and Oskarsson (1973), which showed that, with an implicit method, children learned better than adults; with an explicit method, adults learned more than children. Muñoz (2001) documented that older learners (starting at age 11 years) performed better on a variety of tests than younger learners (starting at age 8 years) after the same number of hours of (relatively form-focused) English as a foreign language (EFL) classroom instruction. Curriculum designers and program administrators should take these differential strengths and weaknesses into account for learners of all ages.

The main practical implication of the CP literature, then, is not to call for early programs of any kind, but to adapt programs very thoroughly to the age of the learner. Children can learn very little explicitly; adults can learn very little implicitly. Therefore, if early language teaching is needed, it should rely on large doses of communicative input and interaction; adolescents and adults need focus on form to boost the explicit learning mechanisms, which at least some of them can substitute for
implicit learning with a satisfactory degree of success.

Methodological Recommendations

The literature review in this chapter makes it clear that a very large percentage of CP studies have relied on the same kinds of instruments, both in phonology and morphosyntax. This is especially obvious in the morphosyntax area, for which many researchers have used largely the same items as Johnson and Newport (1989), usually in a listening format. Although this approach is advantageous from the point of view of comparability of results, it also entails the danger of introducing a method effect, that is, an artifact in the form of an AoA effect caused by the peculiarities of the instrument. Therefore, the use of different structures, different items, and different testing formats is certainly advisable.

Another consequence of the attention drawn by Johnson and Newport (1989) is the high percentage of studies that have worked with Chinese or Korean immigrants. This also is a bit of a threat to generalizability because it is known that L1 plays a large role in many aspects of language learning, and there is no reason to assume it would not be reflected in AoA effects. It does seem advisable, however, to keep conducting research with speakers from one specific L1 background in a given study to avoid noise in the analysis. It is better to strive for generalizability by conducting a number of studies with different L1s than to try to generalize from one study with many L1s, which would sacrifice internal to external validity. L1–L2 pairings of various kinds should be useful here as long as the pattern is not obscured by including structures for which the particular L1–L2 pairing poses no learning problem at all.

A related point is the desirability of a careful qualitative analysis of the specific learning problems posed by a given L2 structure (e.g., adverb placement) for a given L1 (e.g., L1 French, L2 English). This analysis can be linguistic, psycholinguistic, or cognitive-psychological in nature, showing how the structures with a specific AoA effect can be characterized in terms of salience, markedness, prototypicality, semantic complexity, form-meaning transparency, UG status, processing difficulty, and so on. Several studies have shown that the strength of the AoA effect varies dramatically depending on the structure; a qualitative analysis of this pattern is indispensable if the CPH is to have explanatory adequacy (cf. Hyltenstam & Abrahamsson, 2003). The processing issue can be seen as one particular example of this problem.

It is important to pursue work along the lines of Bialystok and Miller (1999), Juffs and Harrington (1995), or McDonald (2000) in order to find out to what extent AoA effects are due to increased processing problems, not necessarily to argue for or against the CPH, but to give AoA effects their proper interpretation, thus refining the concept rather than debating it. Different forms of data collection (oral and written, with and without time pressure, and with focused RT measurements) will be necessary to that end. Both Juffs and Harrington and McDonald provided very useful methodological suggestions for such work.

The crux of the CP debate, however, seems to be the shape of the AoA–proficiency function. A number of vague hypotheses about the biological reasons for the CP have led various researchers to hypothesize boundaries at certain AoAs (for instance, AoA 15 because of puberty or AoA 5 because of putative end of lateralization). However, the testing of premature hypotheses about underlying causes should not be confused with the hypothesis itself that there is a maturationally defined CP; in other words, turning discontinuities at certain predetermined AoAs into a litmus test for the CPH seems beside the point. Empirical establishment of discontinuities is needed before interpretation of them.

This is easier said than done, however, for several reasons. First, several authors have argued that there may be different CPs for different broad areas, such as phonology or morphosyntax (Eubank & Gregg, 1999; Schachter, 1996; Seliger, 1978), or even more specifically for narrowly defined structures (see especially Lee & Schachter, 1997; Weber-Fox & Neville, 1996). If these hypotheses were found to be correct, this alone would preclude a boundaries test. Even for a very specific structure, it must be taken into account that data are necessarily averaged over individuals, which will lead to a smoothing out of discontinuities because of interindividual differences. And, even for a given individual and a specific L2 structure, acquisition mechanisms presumably are not switched off overnight.

As a result, establishing the exact shape of various AoA functions will require data for narrowly defined (groups of) structures from large numbers of individuals with the same L1 or at least L1s that are equivalent in terms of the structure at issue.
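The smoothing effect of averaging over individuals, mentioned above, can be illustrated with a short simulation. This Python sketch rests entirely on hypothetical assumptions: each simulated learner has an abrupt, all-or-nothing cutoff, and cutoffs vary across learners with a mean AoA of 12 and a standard deviation of 3 years. None of these values come from the studies cited; they only serve to show that a group curve can decline gradually even when every individual function is sharply discontinuous.

```python
import random


def individual_score(aoa, cutoff):
    """Hypothetical learner: full attainment (1.0) before a personal cutoff, 0.0 after."""
    return 1.0 if aoa < cutoff else 0.0


random.seed(0)

# Assumed interindividual variation: cutoffs scattered around AoA 12 (SD 3 years).
cutoffs = [random.gauss(12, 3) for _ in range(5000)]

ages = list(range(0, 25))  # AoA 0 through 24, in whole years

# Group curve: mean score over all simulated individuals at each AoA.
group_mean = [
    sum(individual_score(a, c) for c in cutoffs) / len(cutoffs) for a in ages
]

# Largest year-to-year drop in the averaged curve (each individual drops by 1.0).
max_drop = max(group_mean[i] - group_mean[i + 1] for i in range(len(ages) - 1))

for a, m in zip(ages, group_mean):
    print(f"AoA {a:2d}: mean score {m:.2f}")
print(f"largest one-year drop in the group curve: {max_drop:.2f}")
```

Under these assumptions, every individual drops from 1.0 to 0.0 within a single year, whereas the averaged curve loses only a fraction of that amount from one year to the next, which is why large samples and narrowly defined structures are needed before a discontinuity can be located.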
The only CP study so far that has met even this elementary requirement with real test data is that of Flege et al. (1999).

The next problem is that of statistical analysis. Many studies have presented correlation coefficients for AoA and L2 proficiency, shown linear regression lines, or both. Obviously, these techniques are not ideal if there are reasons to suspect the AoA function may not be linear. Other researchers have basically let their computers find a regression line of any kind, whether linear or polynomial (cubic, quadratic, etc.). Such an analysis, of course, is extremely sensitive to even slight outliers, especially with the rather small samples typically used. The very least that should be done in such cases is to test whether the regression equation that provides the best fit is significantly better than the others. The study that comes closest to meeting this requirement is probably that of Birdsong and Molis (2001).

A related problem is the issue of hypothesized discontinuities and cutoff points. Here is the very heart of the problem. On the one hand, if there are good reasons to hypothesize such a point of discontinuity, one can try to fit different functions to different segments of the AoA continuum, but we stated that these cutoff points should be considered an empirical matter at this point. On the other hand, if there is a polynomial function that is significantly better than any other, then this is a great help in positing a point of discontinuity within a narrow confidence interval, but this only increases the pressure on the statistical analysis to identify every bend in the slope, no more, no less, and to peg it to a precise AoA. Neither situation is ideal, but it seems to us that, at least with large and otherwise homogeneous samples, this latter, inductive, approach is likely to be most fruitful at this point. Otherwise, the risk is too big that testing a broader hypothesis becomes confounded with testing poorly justified values for some of its parameters.

Difficult as these issues may be, there is yet a further complication. Even if there is such a thing as a CP for SLA, this does not automatically imply that this CP is the only maturational age effect that plays a role. Clearly, individuals go through gradual physical and psychological changes of all kinds throughout their lifespan, and there is no reason to believe these changes in visual/auditory acuity, attention, memory, analytical ability, and so on play no role in SLA. In other words, such changes may be superimposed on the CP phenomenon and add an extra layer of difficulty to the analysis. The AoA–proficiency curve may reflect a variety of influences, and the challenge is neither to equate all of them with the CP nor to reject the CP because it cannot explain all AoA-dependent variation. Ultimately, what is needed for the understanding of these AoA-related functions is an equivalent of Fourier analysis in acoustics (which analyzes complex sound waves into their underlying sine waves).

Finally, researchers should take great care to avoid both ceiling and floor effects. An inverted S shape of the kind that has been found in some CP studies (no significant change up to a certain AoA, gradual change for a certain AoA range, and then no additional significant change) could be caused by the combination of a ceiling effect affecting the youngest learners and a floor effect affecting the oldest learners (cf. Hyltenstam & Abrahamsson, 2003; Jia, 1998). Different tests may be needed for the youngest and oldest arrivals to establish early and late discontinuities; testing formats other than yes-no grammaticality judgments may also help to avoid ceiling and floor effects.

Conclusion

Evidence from numerous studies has shown that, although adults may be faster than children in initial stages of L2 learning, their ultimate attainment is most likely to fall short of native speaker standards. This may seem paradoxical to some (e.g., Harley & Wang, 1997). In our view, however, the two phenomena can both be explained by the same underlying difference in learning mechanisms: Children necessarily learn implicitly; adults necessarily learn largely explicitly. As a result, adults show an initial advantage because of the shortcuts provided by the explicit learning of structure, but falter in those areas in which explicit learning is ineffective, that is, where rules are too complex or probabilistic in nature to be apprehended fully with explicit rules. Children, on the other hand, cannot use shortcuts to the representation of structure, but eventually reach full native speaker competence through long-term implicit learning from massive input. This long-term effect of age of onset is most obvious to the casual observer in pronunciation, but on closer inspection appears to be no less robust in the domain of grammar.

Such widely documented AoA effects should not make one jump to the conclusion that they are maturational in nature. It is certainly logically
possible that they are caused by environmental and other variables that tend to correlate strongly with AoA, and several such variables have been investigated. The preponderance of the evidence suggests, however, that they cannot explain away the very robust effects of AoA. On the other hand, researchers are still far away from providing complete explanatory adequacy for the CP concept. What is it about maturation that causes a decline in the ability to learn specific aspects of language? Clearly, to answer this question continuing research in SLA will be needed, as will advances in the explanatory capabilities of developmental neuropsychology as it relates to language acquisition. Until that is accomplished, however, we see no reason to reject the concept of a CP. Rejecting it would confound a robust empirical finding with some of its fledgling explanations.

Notes

1. The idea that children are somehow better at language learning and should therefore be immersed in a language at a very tender age goes back much further and was discussed quite explicitly by Renaissance authors such as de Montaigne and Comenius.
2. Footnote 2 in Birdsong and Molis (2001) mentioned that, with the three latest arrivals removed, the correlation was just marginally significant (r = -.36; p = .05). The fifth latest arrival, however, constituted the most extreme outlier. If the five oldest arrivals were removed (or simply the three lowest scores rather than the three latest arrivals), then a p > .05 would result, given that a p value of exactly .05 was found with removal of less-extreme scores.
3. We derive this from the R² of 37.8 given in Ball's (1996) Table 4.2.

References

Asher, J. J., & García, R. (1969). The optimal age to learn a foreign language. The Modern Language Journal, 53, 334–341.
Ball, J. (1996). Age and natural order in second language acquisition. Unpublished doctoral dissertation, University of Rochester, New York.
Bialystok, E. (1997). The structure of age: in search of barriers to second language acquisition. Second Language Research, 13, 116–137.
Bialystok, E. (2002). On the reliability of robustness: A reply to DeKeyser. Studies in Second Language Acquisition, 24, 481–488.
Bialystok, E., & Hakuta, K. (1999). Confounded age: linguistic and cognitive factors in age differences for second language acquisition. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 161–181). Mahwah, NJ: Erlbaum.
Bialystok, E., & Miller, B. (1999). The problem of age in second-language acquisition: Influences from language, structure, and task. Bilingualism: Language and Cognition, 2(2), 127–145.
Birdsong, D. (1992). Ultimate attainment in second language acquisition. Language, 68, 706–755.
Birdsong, D., & Molis, M. (2001). On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language, 44, 235–249.
Bley-Vroman, R. (1988). The fundamental character of foreign language learning. In W. Rutherford & M. Sharwood Smith (Eds.), Grammar and second language teaching: A book of readings (pp. 19–30). New York: Newbury House.
Bongaerts, T. (1999). Ultimate attainment in L2 pronunciation: the case of very advanced late L2 learners. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 133–159). Mahwah, NJ: Erlbaum.
Bongaerts, T., Mennen, S., & Van der Slik, F. (2000). Authenticity of pronunciation in naturalistic second language acquisition: The case of very advanced late learners of Dutch as a second language. Studia Linguistica, 54, 298–308.
Bongaerts, T., Planken, B., & Schils, E. (1995). Can late learners attain a native accent in a foreign language? A test of the critical period hypothesis. In D. Singleton & Z. Lengyel (Eds.), The age factor in second language acquisition (pp. 30–50). Clevedon, U.K.: Multilingual Matters.
Bongaerts, T., Van Summeren, C., Planken, B., & Schils, E. (1997). Age and ultimate attainment in the pronunciation of a foreign language. Studies in Second Language Acquisition, 19, 447–465.
Cochran, B. P., McDonald, J. L., & Parault, S. J. (1999). Too smart for their own good: The disadvantage of a superior processing capacity for adult language learners. Journal of Memory and Language, 41, 30–58.
Coppieters, R. (1987). Competence differences between native and near-native speakers. Language, 63, 544–573.
Curtiss, S. R. (1977). Genie: A linguistic study of a modern day wild child. New York: Academic Press.
Curtiss, S. R. (1988). Abnormal language acquisition and the modularity of language.
Critical Period 105
Spolsky, B. (1989). Conditions for second language learning. Oxford, UK: Oxford University Press.
Spolsky, B. (2000). Language motivation revisited. Applied Linguistics, 21, 157–169.
Stevens, G. (1999). Age at immigration and second language proficiency among foreign-born adults. Language in Society, 28, 555–578.
Suter, R. W. (1976). Predictors of pronunciation accuracy in second language learning. Language Learning, 26, 233–253.
Swain, M. (1985). Communicative competence: some roles of comprehensible input and comprehensible output in its development. In S. M. Gass & C. G. Madden (Eds.), Input in second language acquisition (pp. 235–253). Rowley, MA: Newbury House.
Tahta, S., Wood, M., & Loewenthal, K. (1981). Foreign accents: Factors relating to transfer of accent from the first language to a second language. Language and Speech, 24, 265–272.
Thogmartin, C. (1982). Age, individual differences in musical and verbal aptitude, and pronunciation achievement by elementary school children learning a foreign language. IRAL International Review of Applied Linguistics, 41, 66–72.
Thompson, I. (1991). Foreign accents revisited: the English pronunciation of Russian immigrants. Language Learning, 41, 177–204.
Ullman, M. T. (2001). The neural basis of lexicon and grammar in first and second language: The declarative/procedural model. Bilingualism: Language and Cognition, 4, 105–122.
Von Elek, T., & Oskarsson, M. (1973). A replication study in teaching foreign language grammar to adults (Research Bulletin No. 16). Gothenburg School of Education, Gothenburg, Sweden.
Walsh, T. M., & Diller, K. C. (1981). Neurolinguistic considerations on the optimal age for second language learning. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 3–21). Rowley, MA: Newbury House.
Weber-Fox, C. M., & Neville, H. J. (1996). Maturational constraints on functional specializations for language processing: ERP evidence in bilingual speakers. Journal of Cognitive Neuroscience, 8, 231–256.
White, L., & Genesee, F. (1996). How native is near-native? The issue of ultimate attainment in adult second language acquisition. Second Language Research, 12, 233–265.
Wong-Fillmore (1976). The second time around: Cognitive and social strategies in second language acquisition. Unpublished doctoral dissertation, Stanford University, California.
Yeni-Komshian, G., Flege, J. E., & Liu, S. (2000). Pronunciation proficiency in the first and second languages of Korean-English bilinguals. Bilingualism: Language and Cognition, 3, 131–149.
Yeni-Komshian, G., Robbins, M., & Flege, J. E. (2001). Effects of word class differences on L2 pronunciation accuracy. Applied Psycholinguistics, 22, 283–299.
David Birdsong
6
Interpreting Age Effects in
Second Language Acquisition
ABSTRACT Age effects in second language acquisition (SLA) are often construed as evidence for a maturationally based critical period. However, an analysis of end-state SLA research reveals little congruence with geometric and temporal features of critical periods. In particular, there is no apparent period within which age effects are observed; rather, they persist indefinitely. We see that not only are maturation and aging distinct biological processes, but also their behavioral effects are not realized in comparable ways. Our understanding of age effects in SLA must also take into account the significant incidence of nativelike attainment among late learners. This incidence can be roughly predicted from the slope of the age function. The chapter concludes with a discussion of factors that influence the slope of the age function.
on such factors as the linguistic feature tested, amount of L2 use, and L1-L2 pairing (see Flege, Yeni-Komshian, & Liu, 1999; Moyer, 1999; Scovel, 1988; Seliger, 1978; as well as further sections of this chapter). Transcending these assorted variabilities, however, a simple generalization emerges, as is shown in this chapter: Observed age effects in SLA are not confined to a temporally bounded period, but persist over the age of arrival spectrum.

A number of studies have examined neurological dimensions of language processing and representation among late versus early L2 learners and high- versus low-proficiency L2 learners. The evidence in these studies comes most often from event-related brain potentials, functional magnetic resonance imaging, and positron emission tomography. These methods are applied as subjects perform such tasks as passive listening to stories, cued word production, and reactions to syntactic and semantic anomalies. Because the focus of this chapter is on behavioral evidence for age effects in SLA, a review of the neurological evidence is beyond its scope; for overviews, see the work of Abutalebi, Cappa, and Perani (2001, and chapter 24, this volume), Birdsong (in press-b), and Ullman (2001a).

However, note in passing a provocative issue raised in this research: Do polyglots engage distinct neurological substrates in processing/representing each language? Results vary somewhat as a function of task and method, but a good deal of evidence converges on the generalization that high levels of L2 proficiency are associated with a common neurofunctional organization for the L1 and the L2. (Although L2 proficiency normally correlates negatively with age of arrival, the separate effects of these two factors can be teased out by varying them independently in the experimental design or by statistical factoring and partialing techniques.) That is, if high levels of proficiency are attained, late SLA does not entail loss of plasticity or massive functional reorganization, with broad recruitment of neural circuitry outside areas subserving the L1. Further discussion of age of arrival and neurofunctional organization in SLA may be found in the work of Brovetto (2002) and in further sections of this chapter.

Geometric and Temporal Features of Critical Periods

Generically, a critical period is considered the temporal span during which an organism displays a heightened sensitivity to certain environmental stimuli, the presence of which is required to trigger a developmental event. Typically, there is an abrupt onset or increase of sensitivity, a plateau of peak sensitivity, followed by a gradual offset or decline, with subsequent flattening of the degree of sensitivity.

As Bornstein (1989, p. 183) observed, it is sometimes assumed that the degree of sensitivity remains constant over the course of the critical period. By a strict understanding of this assumption, any increase in sensitivity would take place prior to the critical period per se, and any decrease in sensitivity would be a phenomenon occurring after the critical period. This view is represented in Fig. 6.1A. Taking as given that level of attainment is derivable from level of sensitivity, in Fig. 6.1A and in subsequent figures the vertical axis represents both sensitivity and attainment. Newport (1991, p. 114) specified the mathematical relationship between sensitivity and attainment.

Under the conception of the critical period given in Fig. 6.1A, the highest level of attainment is possible only during the temporal span of the critical period. It is this notion, apparently, that Towell and Hawkins (1994) had in mind when they maintained that, in SLA, parameter values become progressively resistant to resetting with age, following the critical period (p. 126). This view, although not incompatible with the sensible notion that full attainment is possible only during a specified window of opportunity, does not incorporate the transitions that lead up to and follow the highest level of sensitivity, that is, the onset and the offset, respectively. As Bornstein (1989) noted, "By definition, the sensitive period endures within the confines of its onset and offset" (p. 182). Accordingly, the span of a critical period is properly understood as beginning at the moment when sensitivity starts to increase and ending at the point at which sensitivity is at its lowest level. Thus, as opposed to a brief plateau of peak sensitivity (Fig. 6.1A), the critical period extends from the beginning of the onset to the end of the offset; Fig. 6.1B represents this orthodox conception of a critical period.

Consistent with the representation in Fig. 6.1B, henceforth we consider a critical period to include all heightened sensitivity, the transitions as well as the uppermost level. (Note that it is of no consequence if one or both of the transitional phases is extremely abrupt or even absent. The idea here is to incorporate the transitions if they are present, in contrast to their exclusion under the conception represented in Fig. 6.1A.) As for behavioral outcomes, the degree of attainment that is reached
Figure 6.1 (A) Unconventional representation of a critical period as coextensive with the term of peak
sensitivity; (B) orthodox representation of a critical period that includes transitional and peak sensitivities.
when learning begins within a critical period is not limited to full attainment, but includes lower levels as well.

A further potential confusion to confront is terminological in nature. Some researchers use sensitive period in place of critical period to emphasize the gradual nature of the phenomenon and to suggest interindividual differences in the timing of onset and offset. In other instances, such as in the quotation from Bornstein in this section, sensitive period is understood more neutrally, that is, in reference to the bounded duration of enhanced sensitivity to relevant environmental input. In the present contribution, I refrain from use of sensitive period to avert possible misconstruals of the term. My use of critical period is consonant with the generic characterization given at the beginning of this section and incorporates the dimensions of gradualness and interindividual variability.

Let us now consider the characteristics of a putative critical period for language acquisition. Various studies have shown that sensitivity to sounds found in natural languages, such as the ability to distinguish [ba] from [pa], is present among newborns only a few hours old (Eimas, Siqueland, Jusczyk, & Vigorito, 1971). Although this and other sensitivities essential to language learning may become more acute in the months immediately following birth, for practical purposes it can be argued that the highest level of sensitivity is effectively at birth or so close to it that the slope of the onset would be extremely steep. By most accounts, language learning (first and second) to full adult competence is possible if begun as late as 4 to 7 years. It is at this point that the beginning of the offset of the critical period could be roughly located. As for the end of the offset, and thus the end of the critical period, Johnson and Newport (1989) argued that critical period effects relating to language acquisition should cease at the point of complete neurocognitive maturation, roughly in the mid- to late teens.

As shown in Fig. 6.2, the prototypical geometric features of a hypothesized critical period for language acquisition form what may be called a stretched Z. Regarding the hypothesized temporal features, the period of maximal sensitivity to linguistic input, with full attainment of grammatical competence assured, extends through early childhood (to simplify the image, no onset is represented in Fig. 6.2 and figures that follow). At this juncture, the beginning of the offset of the critical period is observed. The end of the offset coincides with the point at which full neurocognitive maturation is reached. After this point, no further age effects are predicted. Note the addition of a horizontal leg extending rightward past the end of the critical period per se. This flat function represents a floor of
Age Effects in Second Language Acquisition 113
Figure 6.3 Unbounded age functions with (A) postmaturational offset, (B) prematurational offset, and (C)
linear decline.
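The contrast between a bounded, stretched Z function and the unbounded functions named in the caption of Fig. 6.3 can be made concrete with a short sketch. It is an illustration only, not a formalization given in the chapter: the function names, the break points (offset beginning at 7 years and ending at 17 years), and the peak, floor, and slope values are assumptions chosen for readability, and the mapping of parameter settings onto panels A to C is one plausible reading of the caption.

```python
def stretched_z(age, offset_start=7.0, offset_end=17.0, peak=1.0, floor=0.2):
    """Bounded critical-period function: plateau, linear offset, then a flat floor.

    All default values are hypothetical; they only echo the rough ages
    mentioned in the text (offset beginning in childhood, ending in the teens).
    """
    if age <= offset_start:
        return peak
    if age >= offset_end:
        return floor
    fraction = (age - offset_start) / (offset_end - offset_start)
    return peak - fraction * (peak - floor)


def unbounded_decline(age, decline_start=17.0, peak=1.0, slope=0.02):
    """Unbounded function: flat until decline_start, then declining with no floor.

    decline_start=17 approximates a postmaturational offset, a smaller value
    a prematurational offset, and decline_start=0 a linear decline throughout.
    The result is clipped at 0 only because sensitivity cannot be negative.
    """
    if age <= decline_start:
        return peak
    return max(0.0, peak - slope * (age - decline_start))


if __name__ == "__main__":
    for age in (5, 12, 20, 30, 40):
        print(age, round(stretched_z(age), 3), round(unbounded_decline(age), 3))
```

The diagnostic difference lies in the right tail: under the stretched Z, learners arriving at 30 and at 40 receive the same predicted value, whereas under any of the unbounded variants the later arrival is always predicted to score lower.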
Figure 6.5 Self-rated English L2 oral proficiency by age of arrival for (A) Chinese native speakers and (B) Spanish native speakers living in the United States. Reprinted from Bialystok and Hakuta (1999, pp. 174–175), copyright 1999, with permission of Lawrence Erlbaum Associates.
of discontinuity in the regression of accent ratings against age of arrival. However, for the same group of subjects, the researchers noted a slight discontinuity in the function relating age of arrival to scores on a 144-item test of knowledge of English morphosyntax adapted from Johnson and Newport (1989). This inflection point occurred at age of arrival of approximately 12 years. Flege et al. point out, however, that age effects persisted for the late-arriving participants. Separate analyses were performed for age of arrival >12 years and age of arrival >15 years; significant correlations in both cases were reported, suggesting an ongoing decline in sensitivity.

In studies such as those in this section, data from early-arriving and late-arriving subjects are pooled into a single linear regression analysis; in these cases, the negative correlation of attainment level and age of arrival typically reaches significance. Indeed, this generalization holds even with data that have been presented as evidence for the CPH/SLA. For example, testing for knowledge of L2 English morphosyntax, DeKeyser (2000) obtained r = -.63, p < .001, and Johnson and Newport (1989) found r = -.77, p < .01 in their correlations of grammaticality judgment scores with age of arrival for all subjects.

Patkowski (1990) reported significant negative correlations of accuracy of L2 English pronunciation with age of arrival for all subjects (r = -.76, p < .0001). For two separate measures of degree of nonnativelike accent, Oyama (1976) obtained significant positive correlations with age of arrival (r = .83, p < .001, and r = .69, p < .001; partial correlations, removing LoR). The observation of effects across the spectrum of age of arrival clashes with the notion that the effects should be bounded within a maturationally dictated temporal span.

Disaggregation Analyses

Different, and sometimes contradictory, pictures emerge when early-arriving and late-arriving subjects are segregated for separate analysis. In some cases, results suggest discontinuities in the sensitivity function over age of arrival. For example, in Patkowski's (1990) reanalysis of English L2 pronunciation data from his dissertation (1980), the researcher disaggregated the sample into early learner (age of arrival younger than 15, n = 33) and late learner (age of arrival older than 15, n = 34) subgroups. As noted, the pooled data for all 67 subjects had suggested a strong age effect over the span of age of arrival (range = 5–50 years). In contrast, the analyses for subgroups suggested different distributions of accentedness among early versus late learners. Separate analyses revealed that the slope of the early arrivals' regression line was a fairly steep -.052, whereas the regression line for the
late arrivals had a less-pronounced slope of -.028. Patkowski (1990) reported that these results are in line with syntactic proficiency data from the same sample of subjects (Patkowski, 1980) and viewed these findings as evidence for a discontinuity in the age of arrival function. (However, no statistical significance for the slope differences is reported.) It should be noted that the data do not strongly suggest a termination of declining sensitivity because the age effect among the late learners approaches significance (r = -.288, p = .098). What is also curious about these results is that the correlation of age of arrival and performance was not significant (r = -.245, p = .169) among early learners, that is, for just those subjects whose ages of arrival (5–15 years) represented the temporal span during which, by the CPH/SLA, sensitivity should be predictably declining.

The disaggregation methodology was also employed in the landmark study of Johnson and Newport (1989). Participants were 46 Korean and Chinese learners of English, all of whom had lived in the United States for 5 years or more. Half the sample were early learners, having arrived in the United States at age 15 years or younger; the others were late learners who had immigrated between the ages of 17 and 39 years. Participants were asked to provide grammaticality judgments for 276 English sentences presented on an audiotape. Separate linear correlations of age of arrival with test scores were performed for early and late arrivals. Consistent with the idea that age effects should be observed early, there was a strong age effect among the early-arriving subjects (r = -.87, p < .01); moreover, the correlation of age of arrival with performance for late arrivals was not significant (r = -.16, p > .05).

As noted, Johnson and Newport (1989, p. 79) explicitly related these results to their understanding of the CPH/SLA, emphasizing that for postmaturational age of arrival performance should not continue to decline over age, and that, with increasing age of arrival among late learners, there should be no change up or down in level of attained proficiency. Johnson and Newport (1989, p. 79) stated that this was exactly the pattern of results they observed. To bolster this conclusion, Newport (1991, pp. 122–123) took the late and early learners' scores from Johnson and Newport (1989) and plotted the mean performance of seven respondent subgroups: natives and ages of arrival 3–7, 8–10, 11–15, 17–22, 23–28, and 31–39 years. Lines connecting the subgroup means produced an image conforming precisely to the stretched Z geometry illustrated in Fig. 6.2 (Newport, 1991, p. 123). However, this image is an artifact of graphing means as points without regard to intragroup variance. The illusion of a roughly flat and tidy function (i.e., the right tail of the stretched Z graphic in Newport, 1991) is achieved by connecting the points that represent the mean scores of the ages of arrival 17–22, 23–28, and 31–39 years subgroups. Consider Fig. 6.6, which displays the actual scores of these subjects (from Birdsong & Molis, 2001, adapted from Johnson & Newport, 1989). Plainly, the 23 individual late learners' scores are not distributed in an orderly manner parallel to the x axis. This essentially random distribution of scores does not license the conclusion that through adulthood the function is low and flat or the corresponding interpretation that the shape of the function thus supports the claim that the effects of age of acquisition are effects of the maturational state of the learner (Newport, 1991, pp. 122–123).

To expand on this observation, note that a best-fit regression line will be visibly flat and the correlation coefficient will approach 0 under two conditions: when the distribution of points is

Figure 6.6 Plot of L2 English grammaticality judgment scores from late-arriving Korean and Chinese participants. Data are from Johnson and Newport (1989), reproduced in Birdsong and Molis (2001). AoA, age of arrival. Reprinted from Journal of Memory and Language, 44, D. Birdsong and M. Molis, On the evidence for maturational constraints in second-language acquisition, pp. 235–249, Copyright 2001, with permission from Elsevier Science.
random and when the points are distributed horizontally. However, it is only in the latter case that it can be concluded that sensitivity has stabilized. Recall that, for their late learners, Johnson and Newport (1989) reported a correlation of age of arrival and scores with a coefficient close to 0 (r = -.16). This result reflected the near-random distribution of scores. It is properly interpreted as indicative of no systematic relationship [of performance] to age of exposure and definitely is not a leveling off of ultimate performance among those exposed to the language after puberty (Johnson & Newport, 1989, p. 79).

Related studies that employed the disaggregation analysis have likewise produced ambiguous results. In DeKeyser (2000), Hungarian learners of English were tested with a shortened version of the Johnson and Newport (1989) instrument. For all subjects, the correlation of judgment accuracy and age of arrival was significant (r = -.63, p < .001). However, for neither the late arrival subgroup (n = 42) nor the early arrivals (age of arrival younger than 16 years, n = 15) did the correlation reach significance. Although the results are viewed as consonant with those of Johnson and Newport (1989) and as evidence of the robustness of critical period effects, the absence of an age effect among early learners, along with the significant age effect over the spectrum of age of arrival, call this interpretation into question.

Separate analyses for the groups of early and late ages of arrival were also carried out by Birdsong and Molis (2001) in a strict replication of the Johnson and Newport (1989) study. Participants were Spanish natives (n = 29 early arrivals, n = 32 late arrivals). Early arrivals (age of arrival ≤ 16 years) performed at or near ceiling (r = -.24, p = .22), producing a nearly flat function. In contrast, the performance of late arrivals (age of arrival ≥ 17 years) was strongly predicted by age of arrival (r = -.69, p < .0001). As in other studies discussed here, age of arrival was predictive of performance over all subjects (r = -.77, p < .0001).

The disparate results obtained by Birdsong and Molis (2001) and Johnson and Newport (1989) may be viewed in Fig. 6.7. Visually, the two pairs of regressions are very different. In the Johnson and Newport data, there was a sharp decline for early
Figure 6.7 Number of items correct as a function of age of arrival. Solid regression lines are fit to the Birdsong and Molis, 2001 (B&M), data; dashed lines are fit to the Johnson and Newport, 1989 (J&N89), data. Division of late and early age of arrival groups equals 16 years. Reprinted from Journal of Memory and Language, 44, D. Birdsong and M. Molis, On the evidence for maturational constraints in second-language acquisition, pp. 235–249, Copyright 2001, with permission from Elsevier Science.
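Two statistical points in this discussion can be illustrated with a minimal sketch. First, a correlation near 0 arises both when scores lie along a horizontal line and when they are widely scattered, so r alone cannot show that sensitivity has stabilized. Second, fitting separate regression lines on either side of a cutoff can yield very different slopes from the pooled fit. Every number below is hypothetical and was constructed for the illustration (the scattered scores are built so that r is exactly 0); none of it is data from the studies discussed, and the helper names are assumptions of this sketch.

```python
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Pearson correlation coefficient; assumes neither variable is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    return cov / var_x


# Point 1: two hypothetical late-arrival samples with the same ages of arrival.
late_ages = [18, 22, 26, 30, 34, 38]
flat_scores = [210, 211, 209, 209, 211, 210]       # tight horizontal band
dispersed_scores = [250, 160, 220, 220, 160, 250]  # wide, unsystematic scatter

r_flat = pearson_r(late_ages, flat_scores)
r_dispersed = pearson_r(late_ages, dispersed_scores)

# Point 2: hypothetical pooled sample, at ceiling up to the cutoff, declining after.
ages = [4, 8, 12, 16, 20, 24, 28, 32]
scores = [270, 270, 270, 270, 250, 230, 210, 190]
CUTOFF = 16  # early arrivals: age of arrival <= CUTOFF (assumed, following Fig. 6.7)

early = [(a, s) for a, s in zip(ages, scores) if a <= CUTOFF]
late = [(a, s) for a, s in zip(ages, scores) if a > CUTOFF]
early_slope = ols_slope(*zip(*early))
late_slope = ols_slope(*zip(*late))
pooled_slope = ols_slope(ages, scores)

if __name__ == "__main__":
    print("r (flat band):", r_flat, "spread:", round(pstdev(flat_scores), 2))
    print("r (scattered):", r_dispersed, "spread:", round(pstdev(dispersed_scores), 2))
    print("slopes (early, late, pooled):", early_slope, late_slope, round(pooled_slope, 2))
```

Both late-arrival samples give r of 0, but only the tight band is compatible with a floor; the spread of the scores, not the correlation, separates the two cases. In the second sample the early slope is 0 and the late slope is steeply negative, a pattern of the kind reported for ceiling performance followed by postmaturational decline.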
arrivals and a less-steep function for late arrivals; in the Birdsong and Molis results, essentially the obverse pattern was seen. To determine if these differences are statistically significant, Birdsong and Molis carried out numerous post hoc comparisons of their data with those of Johnson and Newport and used various cutoff points for early- versus late-arriving subjects. The two sets of results differed with respect to the regression slopes over all subjects and for the early and late age of arrival subgroups.

Returning to the question of discontinuity, analyses of the disaggregated samples in the work of both Johnson and Newport (1989) and Birdsong and Molis (2001) indicated that there are separable linear functions. However, as noted, these functions are not comparable in terms of their slopes. In the Birdsong and Molis study, the nearly flat function at ceiling for the early arrivals was followed by a steep decline for late arrivals; thus, unlike the Johnson and Newport results, the age effect came into play after the end of the putative critical period. Moreover, a series of analyses that placed the cutoff point for early versus late arrivals at various ages of arrival between 15 and 27.5 years consistently showed a significant age effect for late arrivals. To see if the age effect for late arrivals was an artifact of scores at the low end of performance, Birdsong and Molis performed additional regression analyses, removing the lowest two and lowest three scores; resulting values were still significant.

Interestingly, Bialystok and Hakuta's (1994) reanalysis of the Johnson and Newport (1989) results revealed that, by moving the early arrival/late arrival group cutoff point to an age of arrival of 20 years, the resulting linear correlation for late arrivals increased to statistical significance (r = -.50, p < .05). (In a separate analysis of the Johnson and Newport data, Bialystok and Hakuta demonstrated that the best-fitting linear functions were obtained with a cutoff at age of arrival = 20 years.)

Finally, consistent with this emergent picture of age effects among late arrivals (DeKeyser, 2000, is the exception), Birdsong's (1992) study of Anglophone late learners of French (age of arrival 11.5–28 years) showed a significant decline in performance on a grammaticality judgment task with increasing age of arrival (r = -.51, p = .02).

Summary

Reviewing the analyses reported in this section, the following are observed:

1. For pooled early and late learners, analyses of age of arrival effects on linguistic performance revealed little evidence of nonlinearity suggestive of the beginning of a sensitivity decline.
2. In all cases of pooled early and late learners, age of arrival effects persisted indefinitely, with no flattening of the function signaling the end of a sensitivity decline and subsequent stabilization of sensitivity level.
3. In disaggregated samples, there was inconsistent evidence of significant age of arrival effects for early learners, which would indicate prematurational decline in sensitivity.
4. In disaggregated samples, most of the surveyed evidence showed significant age of arrival effects for late learners, indicating postmaturational decline in sensitivity.
5. In disaggregated samples, there was no evidence that the performance of late learners flattened out with increasing age of arrival, which would suggest stabilization of sensitivity at its lowest level.

These conclusions may be summarized by recalling the age functions for language acquisition illustrated in Figs. 6.2 and 6.3A–6.3C. Results from pooled samples tended to conform to Fig. 6.3C. Data from disaggregated samples resembled either Fig. 6.3A or Fig. 6.3B. None of the results matched up with the stretched Z features of a critical period for SLA given in Fig. 6.2. (The results also did not resemble a stretched L, a figure representing the possibility that the sensitivity decline begins close to birth. Like the stretched Z, the stretched L image captures the eventual stabilization of sensitivity at low levels.) In particular, the Johnson and Newport (1989) data did not pattern with critical period geometry; the random dispersion of their late arrivals' scores was not consistent with a floor effect in the sensitivity function. Moreover, Bialystok and Hakuta's (1994) reanalysis of the Johnson and Newport data revealed that declines of performance with increasing age of arrival persisted among subjects with age of arrival ≥ 20 years.

Thus, the available behavioral evidence suggests that age effects in SLA do not operate within a well-defined temporal period. Further, the occasional nonlinearities in the age function do not reliably map onto predicted developmental and geometric patterns of declining sensitivity under the CPH/SLA.

What these results do suggest is an effect that persists over the span of age of arrival. To what can this ongoing decline be attributed? In the Age and Other Factors section of this chapter, a speculative
account is considered that relates to neurocognitive aging. Alternatively, one might try to salvage maturation as a causal mechanism and couple it with other factors. That is, multiple mechanisms could be invoked that underlie the generation of a single linear function. For example, under such a view maturational effects might take their toll on language learning ability up through the midteens, then general age effects would be responsible for subsequent declines. Likewise, limits on attainment arising from a synergistic blend of biology and linguistic representation could be imagined. Perhaps maturation is an early constraining factor; as L1 representations become progressively entrenched (both during and beyond the maturational period), they inhibit the establishment of competing linguistic representations.

It is not the purpose of this chapter to evaluate such multimechanism formulations, although it must be said that they do not have parsimony to recommend them. However, it is again emphasized that maturational effects and age effects are not synonymous. The Merck Manual of Geriatrics (Beers & Berkow, 2000) explains that maturation is a circumscribed phase within the biological process of aging. This relationship is played out in quite different predictions, and behavioral evidence, for age effects and maturational effects in L2 learning.

Nativelike Attainment

Interpreting the Incidence of Nativelike Attainment Among Late Learners

Nativelikeness is defined in the experimental context as L2 learners' performance that falls within the range of native control subjects (some studies employ stricter criteria, such as performance within a standard deviation above or below natives' means). The attainment of nativelikeness in late SLA has been viewed as evidence for falsification of the CPH/SLA. Under the criterion proposed by Long (1990), "a single learner who began learning after the [critical period] closed and yet whose underlying linguistic knowledge . . . was shown to be indistinguishable from that of a monolingual native speaker would serve to refute the [CPH]" (p. 255). (Here, with reference to the discussion of the geometric features of a critical period, it is assumed that Long's reference to the closure of the critical period relates to the point at which sensitivity has bottomed out, i.e., the end, not the beginning, of the offset.)

At the time Long (1990) set forth his Popperian criterion, there was little evidence of nativelikeness to threaten the CPH/SLA. For example, with respect to pronunciation, none of the roughly 20 subjects with age of arrival of 12 years in Oyama's (1976) study and only one of Patkowski's (1980) 34 subjects with age of arrival older than 15 years performed within the range of native controls. In the area of morphosyntax, none of Coppieters's (1987) 21 adult learners of French (ages of arrival not specified) and none of Johnson and Newport's (1989) 23 late learners of English (age of arrival of 17 years) performed within the range of natives. (For additional discussion of nonnativelike outcomes, see Hyltenstam, 1992; Hyltenstam & Abrahamsson, 2000; Long, 1990.)

However, several experimental studies have demonstrated that nativelike attainment is not an impossibility for late learners. A sample of research in which nativelikeness was observed includes the work of Birdsong (1992, 2003); Bongaerts (1999); Cranshaw (1997); Ioup, Boustagui, El Tigi, and Moselle (1994); Juffs and Harrington (1995); Mayberry (1993); Montrul and Slabakova (2001); Van Wuijtswinkel (1994); and White and Genesee (1996). These studies dealt with a variety of target languages and L1s (including American Sign Language) and covered a range of grammatical features, including wh-movement and tense/aspect distinctions, as well as pronunciation. In most of these studies, the incidence of nativelike attainment ranged from 5% of the sample to 15% or above. For a summary of evidence for nativelikeness, see Birdsong (1999a).

Some evidence relating to nativelikeness in late SLA is ambiguous. For example, 6% of the late-arriving participants in the study of Flege et al. (1995) performed with nativelike pronunciation, but all had ages of arrival younger than 16 years. The late learners in the work of Birdsong and Molis (2001) obtained scores well above those in Johnson and Newport's (1989) study, but only 1 subject of 32 late arrivals had a score in the native range, and this subject's age of arrival was a relatively young 18 years. However, within this group, 13 participants achieved 92% or higher accuracy scores, and 3 of these scores were above 95%.

Evidence of nativelike attainment in late SLA and its relevance to the CPH/SLA must be considered with due caution. For example, if valid comparisons with native controls are to be made, the learner sample should not differ from the
Age Effects in Second Language Acquisition 121
sample of natives in terms of education level and chronological age. Care must also be taken not to overestimate the incidence of nativelikeness by basing the estimate on a sample of the "cream of the crop," that is, learners screened for high levels of attainment prior to experimentation, as was the case in the work of Montrul and Slabakova (2001) and White and Genesee (1996).

Normally, the success rate is based on a random sample of participants who meet a residency requirement, often an LoR of 10 years or longer. As noted, this is a methodological move intended to ensure that subjects are at their end state of SLA, not to ensure high levels of proficiency, with the result that varying degrees of nativelikeness are represented in the sample. With this range of outcomes, researchers are able to carry out informative correlations with various biographical factors, such as amount of L2 use. Further, the use of unscreened samples allows for safer generalization of observed incidences of nativelikeness to broader populations (Birdsong, 2004).

Similarly, an understatement of the rate of nativelikeness can be an artifact of sampling procedures. To target individuals who are at or near their SLA end state, and thus who have reached their limits of attainment, the bilingual sample should consist of those subjected to benign exogenous conditions such as extended residence and interaction with natives. Irrelevant to determining the incidence of nativelikeness are individuals who have had occasional naturalistic exposure to an L2 or whose exposure was limited to foreign language course work.

As a way to imagine the relatively small size of the appropriate L2 population from which to sample, consider the input conditions of L1 acquisition. To be ecologically comparable to L1 learners in the first 5 years of life, each of the individuals representing the L2 population should have had more than 6 million target language utterances directed to him or her by native speakers (Birdsong, 1999a). It is likely that statements about the insignificant rate of success in SLA, below 5% according to Bley-Vroman (1989) and Selinker (1972), are based on consideration of individuals with considerably less input than this. That is, the low estimates may have been pegged to a much larger and less-relevant population.

What is to be made of a 5% or greater rate of nativelikeness in a sample of a relevant population of late bilinguals? Certainly, such an incidence exceeds Long's (1990) criterion for rejection of the CPH/SLA. Further, viewed as parameters within a normal distribution, such numbers clearly occupy a meaningful area under the bell curve, moving outward from a point less than two standard deviations away from the mean. In this sense, they are not "outliers" to be treated dismissively. Moreover, a 5% or greater incidence of nativelikeness would imply that there are substantial numbers of late learners who have attained near-nativelike proficiency. Age in SLA is not so constraining a factor that it prevents late learners from making remarkable strides toward nativelikeness, a fact recognized by Lenneberg (1967, p. 176).

Some researchers argued, however, that referencing of attainment to a monolingual standard is inappropriate for research in bilingualism (see Cook, 1997; Grosjean, 1989). Bilinguals are not "two monolinguals in one" in any social, psycholinguistic, or cognitive neurofunctional sense. From this perspective, it is of questionable methodological value to quantify bilinguals' linguistic attainment as a proportion of monolinguals' attainment, with those bilinguals reaching 100% levels of attainment considered nativelike. However, as Mack (1997) and Birdsong and Molis (2001) pointed out, demonstrations of nativelike performance by late bilinguals are at least of heuristic utility because they constitute a challenge to received views that the upper limits of late SLA are inevitably inferior to those of L1 acquisition.

Nativelike Attainment and the Age Function

Let us now reconsider the finding by Birdsong and Molis (2001) and Flege et al. (1995) that nativelikeness in the late age of arrival samples is observed among learners with relatively early ages of arrival, that is, those with ages of arrival in the late teens. First, note that further empirical study needs to be carried out to determine if the incidence of nativelikeness is indeed confined to "early" late learners. If this does turn out to be a valid generalization, how may it be interpreted?

On the one hand, advocates of the CPH/SLA could argue that their position is not threatened if the only apparent exceptions are those learners with ages of arrival on the fringe of a period with approximative temporal milestones. A contrasting interpretation is subtler. Recall the premise that the midteen years roughly mark the end of the offset of sensitivity, after which sensitivity should not continue to decline, but should level off. By this account, individuals with ages of arrival of 15 years
and beyond cannot be differentiated in terms of sensitivity level; therefore, among late learners the incidence of nativelikeness should not be expected to be confined to just those with ages of arrival in the late teens and early 20s. In contrast, if sensitivity continues to decline indefinitely, then the probability of nativelike attainment should steadily decline with all ages of arrival past the period of peak sensitivity. Thus, among late L2 learners, a decreased incidence of nativelikeness with advancing age of arrival would suggest that the function is not in fact characterized by a flattening out of sensitivity.

This section concludes by putting a fine point on the notion that the later the age of arrival is, the lower the incidence of nativelike performance will be. To understand this logical entailment fully, the slope of the function must be considered. If the slope of the age-related decline in performance is shallow, then at a given age of arrival the incidence of nativelikeness will be greater than if the slope is steep (this comparison assumes that both functions are linear, and that their declines begin at the same age of arrival). This rather self-evident relationship is illustrated in Figs. 6.8A and 6.8B. For illustrative purposes, an age of arrival of 25 years was chosen. For both the relatively shallow age function in Fig. 6.8A and the steeper function in Fig. 6.8B, let us assume a normal distribution of scores at this age of arrival and similar kurtosis of the bell curves. Let us further assume that the regression lines in both cases intersect with the central point (mean, median, mode) of the distributions of scores at that age of arrival. Plainly, for the distribution illustrated in Fig. 6.8A, the incidence of scores falling within the range of native controls' performance is greater than in Fig. 6.8B. Thus, it can be seen that, from a shallow slope, the incidence of nativelikeness can be predicted to exceed the rate of nativelikeness associated with a steeper slope.

In the next section, various factors are considered that may affect the slope of the age function. As this slope and the rate of nativelikeness have been shown to covary, it can be predicted that these factors likewise influence the probability that nativelikeness will be observed at a given age of arrival.

Age and Other Factors

It is frequently noted in the SLA literature that age effects are moderated by other variables. For example, with respect to self-reported English proficiency of Chinese and Spanish immigrants, Bialystok and Hakuta (1999) found that varying years of education were associated with different slopes of the age function. Flege and colleagues (e.g., Flege et al., 1999; Flege, Frieda, & Nozawa, 1997) showed that greater use of the L2 is associated with less-accented L2 pronunciation across various ages of arrival. The contrasting slopes seen in Fig. 6.7 are arguably a consequence of the different native language backgrounds of the subjects (Chinese and Korean for Johnson & Newport, 1989; Spanish for Birdsong & Molis, 2001).
Figure 6.8. Hypothetical (A) shallow attainment by age of arrival (AoA) function and (B) steeper function,
with superimposed distribution of attainment values at AoA 25 years.
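
The relationship between the slope of the age function and the incidence of nativelikeness, illustrated in Fig. 6.8, can be made concrete with a small numerical sketch. The code below is only an illustration under the assumptions stated in the text (linear functions whose declines begin at the same age of arrival, and a normal distribution of scores centered on the regression line). Every number in it (the ceiling score, the onset age, the two slopes, the standard deviation, and the lower bound of the native range) is hypothetical and is not taken from any study discussed in this chapter. The native range is simplified to a lower bound only.

```python
from statistics import NormalDist


def mean_attainment(aoa, slope, onset_age=5.0, ceiling=100.0):
    """Linear age function: the mean score drops by `slope` points for
    each year of age of arrival (AoA) past `onset_age` (hypothetical values)."""
    return ceiling - slope * max(0.0, aoa - onset_age)


def incidence_of_nativelikeness(aoa, slope, sd=8.0, native_floor=90.0):
    """Proportion of a normal score distribution, centered on the
    regression line at this AoA, that reaches the (hypothetical) native range."""
    scores = NormalDist(mu=mean_attainment(aoa, slope), sigma=sd)
    return 1.0 - scores.cdf(native_floor)


# Same AoA (25 years, as in Fig. 6.8), two hypothetical slopes.
shallow = incidence_of_nativelikeness(aoa=25, slope=0.5)
steep = incidence_of_nativelikeness(aoa=25, slope=1.5)
print(f"shallow slope, AoA 25: {shallow:.3f}")
print(f"steep slope,   AoA 25: {steep:.3f}")

# With a function that keeps declining, incidence falls as AoA advances.
for aoa in (15, 20, 25, 30, 35):
    print(aoa, round(incidence_of_nativelikeness(aoa, slope=0.5), 3))
```

Under these made-up values, the shallow function yields a larger share of scores in the native range at an age of arrival of 25 years than the steep one, and for either slope the share shrinks as age of arrival increases. A function that flattened out after the midteens would instead give the same incidence at every later age of arrival.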
A variety of other cognitive, task-related, attitudinal, experiential, demographic, aptitude, and training variables may affect the slope of the age function and, as discussed, the rate of nativelikeness at a given age of arrival. For further consideration of these factors, see Bialystok and Miller (1999); Klein (1995); Marinova-Todd et al. (2000); Moyer (1999, 2001); Pulvermüller and Schumann (1994); Singleton (1989, 2001); Skehan (1989); and Stevens (1999).

Of particular theoretical interest is the role of linguistic variables in determining the timing and shape features of the age function in SLA. As noted by Eubank and Gregg (1999), Flege et al. (1999), Flynn and Manuel (1991), Seliger (1978), and others, the effect of age of arrival is not uniform for all aspects of L2 knowledge.

A study by Birdsong and Flege (2001) illustrated the heuristic potential of investigating possible interactions of age of arrival and linguistic variables, specifically a theorized distinction between regulars and irregulars. Pinker (1999, inter alia) argued that computation of regular inflectional morphology in verb pasts (e.g., talk-ed) and noun plurals (e.g., pen-s) involves processing of the compositional features stem plus affix, which are represented symbolically. In contrast to this rule-based computation, irregulars (e.g., bought and geese) are accessed as units from lexical (associative) memory. Ullman (2001b) reviewed an array of behavioral and neurofunctional evidence that pointed to the dissociability of rule-based and lexical knowledge and proposed that different types of memory and different neural substrates are involved in processing regulars versus irregulars. (Crucial details of the dual-mechanism model and challenges to it are beyond the scope here; see McClelland & Patterson, 2002; Pinker & Ullman, 2002.)

For the SLA context, Flege et al. (1999) had noted that, with increasing age of arrival, Korean learners of English were less accurate in their judgments of items exemplifying arbitrary features of English than for items that exemplified predictable regularities. My 2001 study with Flege was designed to pursue further this interaction of age of arrival by regularity. We recruited L2 English subjects who were Spanish (n = 30) and Korean (n = 30) natives with LoRs between 10 and 16 years; thus, they were at or near SLA asymptote. For each native language, groups of 10 participants represented ages of arrival of 6–10, 11–15, and 16–20 years. To disconfound somewhat the factors of age at testing and age of arrival, subjects' chronological age ranges overlapped the ranges of age of arrival. Thus, age at testing for the group with an age of arrival of 6–10 years ranged from 16 to 26 years; for the group with an age of arrival of 11–15 years, the chronological age was 21–31 years, and participants in the group with an age of arrival of 16–20 years were between 26 and 36 years of age at testing.

Figure 6.9. Approximate accuracy scores on regular versus irregular items, by age of arrival (AoA) group, for Korean and Spanish participants in the study of Birdsong and Flege (2001).

We (Birdsong & Flege, 2001) presented 80 sentences containing regular and irregular English verb pasts and noun plurals in a multiple-choice format on a laptop computer, and subjects were asked to indicate which of five options was the correct inflected form of the verb or noun. Within each class of regulars and irregulars, equal numbers of high- and low-frequency nouns and verbs were represented. (An additional 20 sentences contained phrasal verbs, e.g., look in on, which are understood to be a class of irregular or idiosyncratic features of the lexicon.) Response latencies and accuracies were analyzed. Among both Korean and Spanish subjects, we observed strong frequency effects for irregulars, but only weak ones for regulars, a result consonant with the basic premises of the dual-mechanism model. Age effects were also asymmetrical: For regulars, increasing age of arrival had little effect on either response times or accuracy, whereas for irregulars, response times increased and accuracy decreased with advancing age of arrival. Fig. 6.9 represents approximate
accuracy values for pooled Spanish and Korean subjects. Details of the accuracy data underscore the interaction of age of arrival and the linguistic variable in question. Among Spanish natives, proportions correct by increasing age of arrival group were .94, .89, and .92 for regulars and .71, .68, and .51 for irregulars. Corresponding values for Koreans were .98, .95, and .93 for regulars and .87, .74, and .69 for irregulars. Brovetto (2002) suggested that regular-irregular dissociations can be observed among higher proficiency L2 learners (thus mirroring the organization observed at the L1 acquisition end state), but not among learners at lower levels of L2 proficiency.

This behavioral dissociation is an example of a research finding that speaks to the need for finer-grained perspectives on the SLA age question than the simple "earlier is better" rule of thumb. In addition, it opens the door to new speculation on the possible mechanisms underlying age effects. Ullman (2001b) argued that declarative memory is involved in learning and storing assorted facts, whereas computations of regular forms and other symbolic or rule-based knowledge are coordinated by the procedural memory system.

I suggested (Birdsong, 2004) that declarative memory and its associated neuroanatomy are discrepantly affected by aging. For example, cortisol levels increase with age, leading to atrophy of the hippocampal area and impairing learning and declarative memory function (Lupien, Lecours, Lussier, Schwartz, Nair, & Meaney, 1994; Lupien et al., 1998). In addition, starting as early as 30 years of age, neurofibrillary tangles and neuritic plaques develop in the normal brain. These degenerative histological features are most prevalent in the neural regions that subserve declarative memory, specifically the hippocampus and temporal-associative cortex (Scheibel, 1996). Starting at age 20 years, normal aging is associated with declines in dopamine D2 receptors in the hippocampus and frontal cortex areas (Li, Lindenberger, & Sikström, 2001). However, such declines are also found in the basal ganglia, anterior cingulate cortex, and amygdala, leading to an important clarification: The associative areas of the brain are not the only ones affected by aging, but age-related declines may be more severe in the associative areas than in other areas. Consistent with this idea is the general observation that declarative memory abilities decline dramatically more with age than procedural memory functions.

To sum up this line of speculation, the neural substrates underlying the learning and processing of declarative information may be more negatively impacted by aging than other brain areas, particularly more so than those implicated in coordinating activities in real time (i.e., the combinatorial operations involved in regular affixation). Note that this putative deficit is not thought to affect the ability to accumulate new lexical items (for example, as we age we add neologisms like e-mail and blogger to our vocabulary) but impairs the retrieval of prescribed (irregular) phonological forms.

This avenue of investigation lends itself to the formulation of several falsifiable hypotheses:

1. With respect to linguistic features, age effects in SLA are not indiscriminate, because they disrupt the learning and retrieval of idiosyncratic, irregular forms (which are usually ascribed to the lexicon) more than abstract elements of the grammar.
2. The slopes of the age functions associated with knowledge of regulars and irregulars are distinct.
3. The age functions for both regulars and irregulars are not characterized by discontinuities or bottoming out, but by linear, unbounded performance decrements.
4. The class of irregulars may include not only irregular inflections on nouns and verbs, but also other idiosyncratic linguistic facts, such as the choice and placement of particles and prepositions in phrasal verbs.
5. In terms of neurofunctional anatomy, there are distinct loci of age effects in SLA that are relatable to the regular-irregular distinction.
6. Cognitively, SLA age effects are most apparent in a declarative memory system that is not dedicated uniquely to facts of language.
7. Declining linguistic performance is not related specifically to maturation, but to the aging process more generally.

Conclusion

The objective of the preceding section was not to propose a research agenda, but to illustrate how the understanding of SLA age effects might benefit from principled, granular investigations of linguistic variables that may interact with the age factor. Similarly, the other sections of the chapter were intended to suggest ways that age effects in SLA can be interpreted without being bound to ill-suited constructs
such as the CPH/SLA and to received notions such as an insignificant incidence of nativelikeness.

The observed patterns of linguistic behavior reviewed in this chapter suggest that the "use it, then lose it" characterization of SLA age effects is imprecise on several counts. First, the decline in attained L2 proficiency is not linked to maturational milestones, but persists over the age spectrum; this is progressive loss, not decisive loss. At the same time, any number of exogenous and endogenous variables may come into play that can flatten the slope of the decline and result in significant numbers of nativelike attainers. Not everybody is "losing it" to the same degree. Finally, the "it" of the characterization suggests a single monolithic learning faculty. However, it is likely that L2 learning involves distinct cognitive and neural components with differential susceptibilities to the effects of age.

References

Abutalebi, J., Cappa, S. F., & Perani, D. (2001). The bilingual brain as revealed by functional neuroimaging. Bilingualism: Language and Cognition, 4, 179–190.
Au, T. K.-F., Knightly, L. M., Jun, S.-A., & Oh, J. S. (2002). Overhearing a language during childhood. Psychological Science, 13, 238–243.
Beers, M. H., & Berkow, R. (Eds.). (2000). The Merck manual of geriatrics. Whitehouse Station, NJ: Merck.
Bever, T. G. (1981). Normal acquisition processes explain the critical period for language learning. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 176–198). Rowley, MA: Newbury House.
Bialystok, E., & Hakuta, K. (1994). In other words: The science and psychology of second-language acquisition. New York: Basic Books.
Bialystok, E., & Hakuta, K. (1999). Confounded age: Linguistic and cognitive factors in age differences for second language acquisition. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 161–181). Mahwah, NJ: Erlbaum.
Bialystok, E., & Miller, B. (1999). The problem of age in second language acquisition: Influences from language, task, and structure. Bilingualism: Language and Cognition, 2, 127–145.
Birdsong, D. (1992). Ultimate attainment in second language acquisition. Language, 68, 706–755.
Birdsong, D. (1999a). Introduction: Whys and why nots of the critical period hypothesis. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 1–22). Mahwah, NJ: Erlbaum.
Birdsong, D. (Ed.). (1999b). Second language acquisition and the critical period hypothesis. Mahwah, NJ: Erlbaum.
Birdsong, D. (2003). Authenticité de prononciation en français L2 chez des apprenants tardifs anglophones: Analyses segmentales et globales. Acquisition et Interaction en Langue Étrangère, 18, 17–36.
Birdsong, D. (2004). Second language acquisition and ultimate attainment. In A. Davies & C. Elder (Eds.), The handbook of applied linguistics (pp. 82–105). London: Blackwell.
Birdsong, D., & Flege, J. E. (2001). Regular-irregular dissociations in the acquisition of English as a second language. In A. H.-J. Do, L. Domínguez, & A. Johansen (Eds.), BUCLD 25: Proceedings of the 25th Annual Boston University Conference on Language Development (pp. 123–132). Boston: Cascadilla Press.
Birdsong, D., & Molis, M. (2001). On the evidence for maturational effects in second language acquisition. Journal of Memory and Language, 44, 235–249.
Bley-Vroman, R. (1989). What is the logical problem of foreign language learning? In S. Gass & J. Schachter (Eds.), Linguistic perspectives on second language acquisition (pp. 41–68). Cambridge, U.K.: Cambridge University Press.
Bongaerts, T. (1999). Ultimate attainment in foreign language pronunciation: The case of very advanced late foreign language learners. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 133–159). Mahwah, NJ: Erlbaum.
Bornstein, M. H. (1989). Sensitive periods in development: Structural characteristics and causal interpretations. Psychological Bulletin, 105, 179–197.
Brovetto, C. (2002). The representation and processing of verbal morphology in the first and second language. Unpublished doctoral dissertation, Georgetown University, Washington, DC.
Cook, V. J. (1997). Monolingual bias in second language acquisition research. Revista Canaria de Estudios Ingleses, 34, 35–49.
Coppieters, R. (1987). Competence differences between native and near-native speakers. Language, 63, 544–573.
Cranshaw, A. (1997). A study of Anglophone native and near-native linguistic and metalinguistic performance. Unpublished doctoral dissertation, Université de Montréal, Canada.
Curtiss, S. R. (1977). Genie: A linguistic study of a modern day wild child. New York: Academic Press.
Marler, P. (1991). The instinct to learn. In S. Carey & R. Gelman (Eds.), The epigenesis of mind: Essays on biology and cognition (pp. 37–66). Hillsdale, NJ: Erlbaum.
Mayberry, R. (1993). First-language acquisition after childhood differs from second-language acquisition: The case of American Sign Language. Journal of Speech and Hearing Research, 36, 1258–1270.
McClelland, J. L., & Patterson, K. (2002). Rules or connections in past-tense inflections: What does the evidence rule out? Trends in Cognitive Sciences, 6, 465–472.
Montrul, S., & Slabakova, R. (2001). Is native-like competence possible in L2 acquisition? In A. H.-J. Do, L. Domínguez, & A. Johansen (Eds.), BUCLD 25: Proceedings of the 25th Annual Boston University Conference on Language Development (pp. 522–533). Boston: Cascadilla Press.
Moyer, A. (1999). Ultimate attainment in L2 phonology. Studies in Second Language Acquisition, 21, 81–108.
Moyer, A. (2001). Beyond ultimate attainment: Contextualizing language acquisition inquiry for a multicultural Germany. Unpublished manuscript, University of Maryland, College Park.
Neter, J., Kutner, M., Nachtsheim, C., & Wasserman, W. (1996). Applied linear statistical models. Chicago: Irwin.
Newport, E. L. (1991). Contrasting conceptions of the critical period for language. In S. Carey & R. Gelman (Eds.), The epigenesis of mind (pp. 111–130). Hillsdale, NJ: Erlbaum.
Oyama, S. (1976). A sensitive period for the acquisition of a nonnative phonological system. Journal of Psycholinguistic Research, 5, 261–283.
Patkowski, M. S. (1980). The sensitive period for the acquisition of syntax in a second language. Language Learning, 30, 449–472.
Patkowski, M. S. (1990). Age and accent in a second language: A reply to James Emil Flege. Applied Linguistics, 11, 73–89.
Pinker, S. (1994). The language instinct: How the mind creates language. New York: Morrow.
Pinker, S. (1999). Words and rules. New York: Basic Books.
Pinker, S., & Ullman, M. (2002). The past and future of the past tense. Trends in Cognitive Sciences, 6, 456–463.
Pulvermüller, F., & Schumann, J. H. (1994). Neurobiological mechanisms of language acquisition. Language Learning, 44, 681–734.
Scheibel, A. B. (1996). Structural and functional changes in the aging brain. In J. E. Birren & K. W. Schaie (Eds.), Handbook of the psychology of aging (4th ed.) (pp. 105–128). San Diego, CA: Academic Press.
Scovel, T. (1988). A time to speak: A psycholinguistic inquiry into the critical period for human speech. Rowley, MA: Newbury House.
Seliger, H. W. (1978). Implications of a multiple critical periods hypothesis for second language learning. In W. Ritchie (Ed.), Second language acquisition research: Issues and implications (pp. 11–19). New York: Academic Press.
Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 209–231.
Singleton, D. (1989). Language acquisition: The age factor. Clevedon, U.K.: Multilingual Matters.
Singleton, D. (2001). Age and second language acquisition. Annual Review of Applied Linguistics, 21, 77–89.
Skehan, P. (1989). Individual differences in second-language learning. London: Arnold.
Stevens, G. (1999). Age at immigration and second language proficiency among foreign-born adults. Language in Society, 28, 555–578.
Towell, R., & Hawkins, R. (1994). Approaches to second language acquisition. Clevedon, U.K.: Multilingual Matters.
Ullman, M. T. (2001a). The neural basis of lexicon and grammar in first and second language: The declarative/procedural model. Bilingualism: Language and Cognition, 4, 105–122.
Ullman, M. T. (2001b). A neurocognitive perspective on language: The declarative/procedural model. Nature Reviews Neuroscience, 2, 717–727.
Van Wuijtswinkel, K. (1994). Critical period effects on the acquisition of grammatical competence in a second language. Unpublished thesis, Katholieke Universiteit, Nijmegen, The Netherlands.
White, L., & Genesee, F. (1996). How native is near-native? The issue of ultimate attainment in adult second language acquisition. Second Language Research, 12, 238–265.
Yeni-Komshian, G., Flege, J. E., & Liu, S. (1997). Pronunciation proficiency in L1 and L2 among Korean-English bilinguals: The effect of age of arrival in the U.S. Journal of the Acoustical Society of America, 102(A), 3138.
7
Manfred Pienemann
Bruno Di Biase
Satomi Kawaguchi
Gisela Håkansson
ABSTRACT This chapter focuses on the interplay between first language (L1) transfer
and psycholinguistic constraints on second language (L2) processability. The theoret-
ical assumptions underlying this chapter are those made in processability theory (PT)
(Pienemann, 1998), which include, in particular, the following two hypotheses: (a) that
L1 transfer is constrained by the processability of the given structure, and (b) that the
initial state of the L2 does not necessarily equal the final state of the L1 because there is
no guarantee that the given L1 structure is processable by the underdeveloped L2
parser. In other words, it is assumed that L1 transfer is constrained by the capacity
of the language processor of the L2 learner (or bilingual speaker) irrespective of the
typological distance between the two languages. Using the PT hierarchy as a com-
parative matrix, we demonstrate on the basis of empirical studies of L2 acquisition that
learners of closely related languages do not necessarily transfer grammatical features at
the initial state even if these features are contained in L1 and L2, providing the features
are located higher up the processability hierarchy. We further demonstrate that such
features will be transferred when the interlanguage has developed the necessary pro-
cessing prerequisites. In addition, we demonstrate that typological distance and dif-
ferences in grammatical marking need not constitute a barrier to learning if the feature
to be learned is processable at the given point in time. All of this demonstrates that
processability is a key variable in L1 transfer.
L1 Transfer 129
By the 1970s, the outlook on L1 transfer had changed under the influence of the newly emerging discipline of second language acquisition research, which was fueled initially in particular by the idea that learners construct their own linguistic systems that may be quite independent of the L1 and L2. With the new emphasis on the creative construction process (e.g., Dulay & Burt, 1974) in L2 acquisition, the notion of L1 transfer appeared a less attractive explanatory concept. It became clear that specific L2 acquisition theories were needed that would be able to predict the exact conditions for creative construction on the one hand and for L1 transfer on the other hand because these explanations compete with each other.

In this context, Felix (1980) argued that the role of L1 transfer can be determined in empirical studies only if the null hypothesis is tested in a typological manner. This requires a systematic typological comparison of the given linguistic feature in L1 and L2 and in the interlanguage (i.e., the learner language) in the following constellation:

First language    Second language    Interlanguage
Feature x
Feature x

Felix (1980) argued as follows: When the interlanguage contains an L2 deviation that is structurally similar to the L1, it can be assumed that this structure has been transferred from the L1 only if this structure does not appear in the interlanguages of other learners of the same L2 whose L1 does not contain the feature in question. Using this matrix in empirical studies, Felix demonstrated, however, that in the above constellation features are often assigned as follows:

First language    Second language    Interlanguage
Feature x
Feature x

Felix (1980) illustrated this with the distinction between why and because, which is made in English and German (warum, weil), but not in spoken Italian. He showed that, despite this linguistic contrast (in their L1), both Italian learners of German and German learners of English fail to observe this distinction in their interlanguage at a certain stage. In other words, looking at the Italian-German contrast by itself, there appears to be a case for transfer. However, once the contrast inherent in the German-English constellation is included, it can be seen that the lack of a why–because distinction appears in the interlanguage independent from the typological contrast between L1 and L2. In such cases, the transfer hypothesis has to be rejected. Instead, it has to be concluded that this is a genuine interlanguage feature. Unfortunately, this logic has not always been adhered to in later research. However, in making a case for developmental constraints on L1 transfer, we apply exactly this logic in evaluating the significance of empirical evidence for or against competing theoretical positions, including our own.

Another example of an interlanguage feature that may, at first glance, be taken as an instance of L1 transfer appeared in the acquisition of English question formation. Lightbown and Spada (1999) found that French learners of English made a distinction between subject-verb inversion with pronominal and referential subjects (containing a noun). Lightbown and Spada claimed that the preference of the French learners of English for pronominal subjects was caused by influence from the L1 because French requires inversion only with pronouns, not with referential subjects. However, when the same task was given to a group of Swedish learners of English (Ewehag & Jarnum, 2001), the same pattern was found. Again, inversion was preferred with pronominal subjects. Given that Swedish does not make a distinction between pronominal and referential subjects, the preference shown by the Swedish learners cannot be attributed to transfer.

A key factor in evaluating research on L1 transfer is its underlying theoretical basis. As Weinreich (1953/1974) pointed out, a great deal of research before his time was inexplicit or even prescientific in its theoretical foundation. The majority of studies that were conducted in the two decades following Weinreich were associated with behaviorist ideas that, in the view of many L2 acquisition researchers, were discredited by the above-mentioned rationalist critique. This may be one reason why in the past three decades a great deal of research on L1 transfer has been carried out within a largely rationalist paradigm that makes two key assumptions: (a) the modularity of mind and (b) the existence of a universal grammar to which learners may or may not have access. To describe more clearly the research carried out in this tradition, it may be useful to sketch the background of these two assumptions.
130 Acquisition
In the rationalist tradition, learnability analyses have been based on four components that must be specified in any learnability theory (e.g., Pinker, 1979; Wexler & Culicover, 1980): (a) the target grammar, (b) the data input to the learner, (c) the learning device that must acquire that grammar, and (d) the initial state.

The rationale for assuming these components is rooted in the way in which learnability theory has been formulated in response to the logical problem in language acquisition (cf. Wexler, 1982). The logical problem basically describes the following paradox: Children acquire the basic principles of their native language in a relatively short period of time and on the basis of limited linguistic input. Many of these principles are said to be impossible to infer from the observations made by the learner.

In other words, the rationalist approach proposed by Wexler (1982) characterizes a theory of language learnability as a solely linguistic problem of the relationship between the representation of linguistic knowledge and the acquisition of that knowledge. This is why the four components of such a theory are described as (a) the target grammar, which describes linguistic knowledge; (b) linguistic input; and (c) the learning device, which has to acquire the target grammar given a certain set of knowledge contained in the (d) initial state.

These assumptions go hand in hand with the assumed autonomy of syntax. Chomsky (1990) claimed that natural languages display properties learnable only if the principles underlying these properties do not have to be acquired explicitly but can be inferred from the structure of an innate cognitive system (spelled out in universal grammar, which contains universal principles and parameters¹). Fodor (1981) argued that such principles cannot be reduced to principles of other domains of cognition, and that therefore it must be assumed that they are specific to the linguistic domain of cognition, which is similar in its specificity to the visual domain of cognition. This assumption is known as the modularity hypothesis and forms the basis for limiting the components of a theory of language learnability to the above four components: If there is an independent linguistic module of cognition, then it is possible to study it in isolation. This does not exclude interaction with other cognitive systems, but it justifies the reductionism present in Wexler's (1982) assumption.

Scholars who accept that universal grammar plays a role in L2 acquisition attribute different roles to it. These roles vary according to the degree to which L2 learners are thought to have access to universal grammar and according to the degree to which L1 knowledge is transferred to the L2.

The most radical position is that of Schwartz and Sprouse (1994, 1996), who proposed the full transfer/full access model. These authors assumed that "the initial state of L2 acquisition is the final state of L1 acquisition" (Schwartz & Sprouse, 1996, p. 40). Schwartz and Sprouse assumed that L2 learners have full access to universal grammar. However, they believed that parameters are already set as in the L1. In this perspective, L2 acquisition is seen as the process of restructuring the existing system of grammatical knowledge. In keeping with this, if positive evidence in the input is needed to restructure aspects of L1 knowledge, and this evidence is not available or obscure, then this can lead to fossilization. The last process is thought to explain why, contrary to child language, in L2 acquisition convergence with the TL [Target Language] grammar is not guaranteed (Schwartz & Sprouse, 1996, p. 42).

The full transfer/full access model has also been assumed in research within a nonparametric framework (LaFond, Hayes, & Bhatt, 2001) based on optimality theory (Tesar & Smolensky, 1998). In this context, universal grammar is seen as the basis of innately specified grammatical knowledge, and optimality theory is used as the learning mechanism that allows the learner to restructure the L1 constraint hierarchies to conform with the hierarchies found in the L2. Unfortunately, alternative models of access to universal grammar were not evaluated in this research.

The position of Vainikka and Young-Scholten (1994, 1996) differed from Schwartz and Sprouse's (1996) full access/full transfer position in the amount of transfer assumed to occur from the L1 to the L2 setting of parameters. For Vainikka and Young-Scholten, transfer is limited to lexical categories; they assumed that the L2 initial state contains only lexical categories and their projections (including the directionality of their heads). However, functional categories are not transferred from the L1. They therefore described their position as the minimal tree position, which in effect includes the assumption that L1 word order is transferred.

A further position was proposed by Eubank (1993), who hypothesized that lexical and functional categories can be transferred to the L2, but that the feature strength associated with functional categories is not transferred. Eubank argued that the feature strength of the inflection will not be
transferred because the affixes themselves may be fundamentally different from language to language.

Platzack (1996) proposed a universal initial hypothesis of syntax based on the Minimalist Program. His study focused on the acquisition of Swedish word order, and he demonstrated that word order constellations can be captured by the weak/strong distinction in functional heads.² He assumed that the default value of functional heads is weak. If all functional heads are weak in a sentence, a universal default word order subject-verb-complement will be generated. Only if a functional head is strong can the position of grammatical functions change.

Platzack (1996) claimed that the initial syntactic hypothesis of the child must be that all syntactic features are weak (p. 375). He further claimed that the child has access to the full range of functional categories already at the time of first sentence-like utterances (p. 377). In other words, he claimed that every human being is expected to assume from the outset that any unknown language s/he is exposed to, including the first language, has the word order subject-verb-complement: this is the order obtained if there are no strong features at all (p. 378). Regarding L2 acquisition, he claimed that we initially go back to . . . [the initial hypothesis of syntax] when trying to come to grips with a second language (p. 380).

The above views all have in common that L2 learners are believed to have full access to universal grammar. However, several scholars hold other views. Felix (1984), Clahsen (1986), and Meisel (1983, 1991) all developed models in which L2 learners have limited or indirect access to universal grammar. This assumption is congruent with the observation that L2 acquirers do not necessarily become native speakers of the L2. Given that the limited availability of universal grammar creates an explanatory void, these authors all made proposals for a more general cognitive substitute that can account for the somewhat deficient process present in L2 acquisition.

The competition model (Bates & MacWhinney, 1981, 1982, 1987; MacWhinney, chapter 3, this volume) represents a fundamentally different approach to language acquisition from the rationalist tradition. It is a functionalist approach that is based on the assumption that linguistic behavior is constrained, among other things, by general cognition (rather than a language-specific cognitive module) and communicative needs. In keeping with the functionalist tradition, Bates and MacWhinney (1981) assumed that the surface conventions of natural languages are created, governed, constrained, acquired, and used in the service of communicative functions (p. 192).

The competition model has been applied to child language, language processing, and L2 acquisition. According to this model, it is the task of the language learner to discover the specific relationship between the linguistic forms of a given language and their communicative functions. The linguistic forms used to mark grammatical and semantic roles differ from language to language. For instance, agreement marking, word order, and animacy play different roles in the marking of subjecthood and agency in different languages. Linguistic forms are seen as cues for semantic interpretation in online comprehension and production, and different cues may compete, as in the above case of the marking of subjecthood, hence the name competition model.

In the competition model, the process of learning linguistic forms is driven by the frequency and complexity of form–function relationships in the input. In this context, the majority of L2 learning problems are modeled in connectionist terms. MacWhinney (1987) exemplified this with the preverbal positioning of a linguistic form as a (processing) cue for the semantic actor role. He stated that the strength of this cue can be viewed as the weight on the connection between the preverbal positioning node (an input node) and the actor role (an output node). If the preverbal positioning node is activated, it then sends activation to the actor node in proportion to the weight on the connection (p. 320).

The competition model has formed the conceptual basis of experiments on bilingual sentence processing (e.g., Gass, 1987; Harrington, 1987; Kilborn & Ito, 1989; McDonald & Heilenman, 1991; Sasaki, 1991). In these studies, bilingual speakers of different languages need to identify the function of different cues in L1 and L2. The input material is designed to reflect the coordination and competition of cues. For instance, Harrington (1987) studied the (competing) effect of word order, animacy, and stress on the comprehension of Japanese and English sentences by native speakers and nonnative speakers of the two languages who were all speakers of both languages. Obviously, the three cues have different weights in the two target languages concerned. The results showed that L2 learners transferred their L1 processing strategies (i.e., weighting of cues) when interpreting L2 sentences. This overall result was predicted by the competition model because, within
this framework, processing cues are not initially separated by languages, and it is therefore to be expected that their weighting is transferred.

MacWhinney (chapter 3, this volume) also attributes a key role to L1 transfer in the acquisition (as opposed to the processing) of an L2. This is motivated mainly by the stark contrast in learning outcomes in L1 and L2 acquisition that has also been noted by many rationalist researchers (e.g., Bley-Vroman, 1990; Clahsen & Muysken, 1989; Meisel, 1991). The logic behind this is straightforward: Both L1 and L2 learners rely on cue strength in acquisition. The reason the outcomes are different is that, in the case of L2 learners, L1 patterns interfere with L2 learning. On the basis of the above assumptions, MacWhinney (1997) developed a strong view on L1 transfer that is in effect similar to the full transfer/full access hypothesis of Schwartz and Sprouse (1996), despite their fundamentally different theoretical orientation:

[T]he early second language learner should experience a massive amount of transfer from L1 to L2. Because connectionist models place such a strong emphasis on analogy and other types of pattern generalization, they predict that all aspects of the first language that can possibly transfer to L2 will transfer. This is an extremely strong and highly falsifiable prediction. (p. 119)

MacWhinney (1997) illustrated his point about structurally impossible transfer using German and English as an example. German nouns are implicitly marked for grammatical gender, whereas English nouns are not. He concluded that German learners therefore have no basis for transferring the German gender system to English. Therefore, this set of features is not included in the list of things that will be transferred.

Our own approach to cross-linguistic influences in L2 acquisition does not take the initial state or general learning mechanisms as its point of departure, but instead argues in terms of processing constraints (e.g., Håkansson, Pienemann, & Sayehli, 2002; Pienemann, 1998). As mentioned, the theoretical assumptions underlying our approach are those made in processability theory (PT; Pienemann, 1998), which include, in particular, the following hypotheses: (a) that L1 transfer is constrained by the processability of the given structure and (b) that the initial state of the L2 does not necessarily equal the final state of the L1 (contrary to the assumption made by Schwartz and Sprouse, 1996) because there is no guarantee that the given L1 structure is processable by the underdeveloped L2 parser.

The key assumption of the processing perspective in L2 acquisition is that L2 learners can produce only those linguistic forms for which they have acquired the necessary processing prerequisites (Pienemann, 1998). Therefore, PT predicts that, regardless of linguistic typology, only those linguistic forms that the learner can process can be transferred to the L2. These claims are operationalized in PT by embedding them in a coherent theoretical framework of L2 processing. To illustrate this operationalization, we give a summary of PT and characterize the lexical and hence language-specific nature of the processing of key morphosyntactic features within this framework.

The assumption that L1 transfer may be developmentally constrained is not new in L2 acquisition research. Wode (1976, 1978) demonstrated such constraints for the acquisition of negation and interrogatives. He showed that German learners of English produce certain forms that exist in the L1 and the L2 only after they have developed the structural prerequisites in the L2. Zobl (1980) observed similar phenomena, as did Kellerman (1983). What PT (Pienemann, 1998) adds to the concept of developmental constraints on transfer is an explicit formal framework for specifying these constraints. This framework is described in the following sections before it is applied to sets of data that allow us to test the null hypothesis for transfer in typological minimal pairs.

A Sketch of Processability Theory

The basic logic underlying PT is this: Structural options that may be formally possible will be produced by the language learner only if the necessary processing procedures are available. In this perspective, the language processor is seen, in agreement with Kaplan and Bresnan (1982), as the computational mechanisms that operate on (but are separate from) the native speaker's linguistic knowledge. PT primarily deals with the nature of those computational mechanisms and the way in which they are acquired.

The fundamental point behind PT is that recourse needs to be made to key psychological aspects of human language processing to account for the developmental problem³ because describable developmental routes are at least partly caused by the architecture of the human language processor. For linguistic hypotheses to transform into
executable procedural knowledge (i.e., a certain processing skill), the processor needs to have the capacity for processing the structures relating to those hypotheses.

Processability theory is based on a universal hierarchy of processing procedures that is derived from the general architecture of the language processor. This hierarchy is related to the requirements of the specific procedural skills needed for the target language. In this way, predictions that can be tested empirically can be made for language development.

The view of language production followed in PT is largely that described by Levelt (1989). It also overlaps to some extent with the computational model of Kempen and Hoenkamp (1987), which emulates much of Garrett's work (e.g., Garrett, 1976, 1980, 1982). The basic premises of that view are the following:

Premise 1. Processing components, such as procedures to build NPs (noun phrases) and the like, are relatively autonomous specialists that operate largely automatically. Levelt (1989) described such grammatical procedures as "stupid" because their capacity is strictly limited to the very narrow but highly efficient handling of extremely specific processing tasks (e.g., NP procedures and verb phrase [VP] procedures). The automaticity of these procedures implies that their execution is not normally subject to conscious control.

Premise 2. Processing is incremental. This means that surface lexico-grammatical form is gradually constructed while conceptualization is still ongoing. One key implication of incremental language processing is the need for grammatical memory. For the next processor to be able to work on still-incomplete output of the current processor and for all of this to result in coherent surface forms, some of the incomplete intermediate output has to be held in memory.

Premise 3. The output of the processor is linear, even though it may not be mapped onto the underlying meaning in a linear way. This is known as the linearization problem (Levelt, 1981), which applies both to the mapping of conceptual structure onto linguistic form and to the generation of morphosyntactic structures. One example of this is subject-verb agreement, as illustrated in the sentence, "She gives him a book." The affixation of the agreement marker to the verb depends, among other things, on the storage of information about the grammatical subject (namely, number and person), which is created before the verb is retrieved from the lexicon.

Premise 4. Grammatical processing has access to a grammatical memory store. The need for a grammatical memory store derives from the linearization problem and the automatic and incremental nature of language generation. Levelt (1989) assumed that grammatical information is held temporarily in a grammatical memory store that is highly task specific and in which specialized grammatical processors can deposit information of a specific nature (e.g., the value of diacritic features, such as the values for person and number). In Kempen and Hoenkamp's (1987) Incremental Procedural Grammar, the specialized procedures that process NPs, VPs, and the like are assumed to be the locus of the grammatical buffer. Pienemann (1998) presented evidence from online experiments and aphasia in support of these assumptions (e.g., Cooper & Zurif, 1983; Engelkamp, 1974; Paradis, 1994; Zurif, Swinney, Prather, & Love, 1994).

The process of incremental language generation as envisaged by Levelt (1989) and Kempen and Hoenkamp (1987) is exemplified in Fig. 7.1, which illustrates some of the key processes involved in the generation of the example sentence "a child gave the mother the cat." The concepts underlying this sentence are produced in the Conceptualizer. The conceptual material produced first activates the lemma CHILD in the lexicon. This activation starts from within the lexicalization system, a subsystem of the grammatical encoder. The lemma contains the category information N, which calls the categorial procedure NP. This procedure can build the phrasal category in which N is head, that is, NP. The categorial procedure inspects the conceptual material of the current iteration (the material currently being processed) for possible complements and specifiers and provides values for diacritic features. Given certain conceptual specifications, the lemma A is activated, and the NP procedure attaches the branch Det to NP.

During this process, the diacritic features of Det and N are checked against each other. This implies that the grammatical information "singular" is extracted from each of the two lemmas at the time of their activation and is then stored in NP until the
head of the phrase is produced. This process of exchange of grammatical information is a key feature of language production. Below, we utilize Lexical-Functional Grammar (cf. Bresnan, 1982, 2001), which has the capacity to model the exchange of grammatical information by feature unification.

The production process has now proceeded to the point at which the structure of a phrase has been created, and the associated lemmata are activated. What is missing to make this the beginning of a continuous and fluent utterance is the establishment of a relation between the phrase and the rest of the intended message. This is accomplished by assigning a grammatical function to the newly created phrase.

Although the process was still ongoing, the next conceptual fragment would have been processed in parallel, and the output of the Formulator⁴ would have been delivered to the Articulator. This means that new conceptualization occurs while the conceptual structure of the previous iteration is produced. The whole process then moves from iteration to iteration.
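The checking of diacritic features between Det and N described above for the NP procedure can be illustrated with a minimal sketch of feature unification. The toy lexicon, the flat feature dictionaries, and the function names `unify` and `np_procedure` are assumptions made purely for illustration; they are not the formalism of Lexical-Functional Grammar or of Kempen and Hoenkamp's (1987) model, both of which are considerably richer.

```python
from typing import Dict, Optional

Features = Dict[str, str]

# Toy lexicon (hypothetical entries): each lemma carries its category
# and its diacritic features.
LEXICON = {
    "a": {"cat": "Det", "features": {"number": "singular"}},
    "child": {"cat": "N", "features": {"number": "singular"}},
    "children": {"cat": "N", "features": {"number": "plural"}},
}


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two flat feature sets; return None if any value clashes."""
    merged = dict(a)
    for name, value in b.items():
        if name in merged and merged[name] != value:
            return None  # conflicting values, so unification fails
        merged[name] = value
    return merged


def np_procedure(det: str, noun: str) -> Optional[dict]:
    """Build an NP only if the features of Det and N unify.

    The unified features are stored on the NP node, mirroring the idea
    that the phrase holds this information while it is being assembled.
    """
    stored = unify(LEXICON[det]["features"], LEXICON[noun]["features"])
    if stored is None:
        return None
    return {"cat": "NP", "features": stored, "words": [det, noun]}


print(np_procedure("a", "child"))     # NP with number = singular
print(np_procedure("a", "children"))  # None: singular clashes with plural
```

In this sketch the NP node serves as the only memory store: a learner who lacks the phrasal procedure would, on this picture, have nowhere to hold the value of "number" while Det and N are matched.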
Kempen and Hoenkamp's (1987) research implied that, in the process of incremental language generation, the following processing procedures and routines are activated in the following sequence:

1. Lemma access
2. The category procedure
3. The phrasal procedure
4. The S procedure
5. The subordinate clause procedure, if applicable

Pienemann (1998) hypothesized that these key grammatical encoding procedures are arranged according to their sequence of activation in the language generation process, and that this sequence follows an implicational pattern in which each procedure is a necessary prerequisite for the following procedures. The basic thesis of PT is that, in the acquisition of language processing procedures, the assembly of the component parts will follow the above-mentioned implicational sequence. The key to predicting which grammatical structures are processable and in which sequence is based on a matrix of information transfer that determines which pieces of grammatical information can be exchanged between which constituents given the availability of the different procedures and their storage capacity. Pienemann pointed out that these processing procedures are operational only in mature users of a language, not in language learners:

While even beginning second language learners can make recourse to the same general cognitive resources as mature native language users, they have to create language-specific processing routines. In this context it is important to ensure that Levelt's model (and Kempen and Hoenkamp's specific section of it) can, in principle, account for language processing in bilinguals, since second language acquisition will lead to a bilingual language processor. (1998, p. 73)

Processability theory utilizes, among other things, De Bot's (1992) work to apply the processability hierarchy to bilingual language production. De Bot (1992) adapted Levelt's model to language production in bilinguals. Based on work by Paradis (1987), he argued that information about the specific language to be used is present in each part of the preverbal message, and this subsequently informs the selection of language-specific lexical items and of the language-specific routines in the formulator. The key assumption of De Bot's work for L2 processing is that, in all cases for which the L2 is not closely related to the L1, different (language-specific) procedures have to be assumed. Pienemann (1998, p. 78) therefore concluded that most of the processing procedures discussed in this section have to be acquired by the L2 learner. He cited diacritic features such as tense, number, gender, and case, which vary between languages, as obvious examples of cross-linguistic differences in the lexical prerequisites for language processing.

Recall that it is hypothesized by PT that the time course in the activation of grammatical encoding procedures determines the sequence in which these procedures are acquired by L2 learners. The reader may wonder how language can be produced when a given learner has not developed a specific encoding procedure. This is in fact the case for every stage of acquisition before mastery of the target language. All of the grammatical forms not yet developed are caused by the absence of specific processing procedures. PT assumes that the hierarchy of processing procedures will be cut off in the procedural grammar of the learner at the point of the missing processing procedure. The rest of the hierarchy will be replaced by a direct mapping of conceptual structures onto surface form as long as there are lemmata that match the conceptually instigated searches of the lexicon. In other words, it is hypothesized by PT that the processing procedures and the capacity for the exchange of grammatical information will be acquired in their implicational sequence as depicted in Table 7.1, where t1, t2, and so on refer to different points in the course of language development.

Table 7.1
                                        t1   t2                  t3                             t4                                  t5
S′ procedure (embedded S)
S procedure                                  Simplified          Simplified                     Interphrasal information exchange   Interphrasal information exchange
Phrasal procedure (head)                                         Phrasal information exchange   Phrasal information exchange        Phrasal information exchange
Category procedure (lexical category)        Lexical morphemes   Lexical morphemes              Lexical morphemes                   Lexical morphemes
Word/lemma

Memory Stores in Language Processing

In characterizing some of the key psychological constraints on language production, we repeatedly made reference to the storage of linguistic information. It is therefore useful to clarify to some extent the role of the storage of grammatical information in the process of language production.⁵ There are several factors that necessitate the storage of linguistic information in language production. At various points in the production process, propositional or grammatical information has to be held in memory. One factor is the linearization problem (Levelt, 1981): When conceptualization and articulation are not temporally aligned,
processed material has to be held in memory until it can be used by the articulator. What is needed here is a store with fast access time.

The information generated by the formulator is specifically syntactic in nature and therefore has to be deposited in a store that is suited to handle this type of information, and attention to it (conscious or non-conscious) is not necessary for this operation. For instance, one does not need to be aware of, or control, the fact that the information concerning person and number matches the lexical entries of the verb and the grammatical subject. In fact, it is possible to attend to only a small number of such processes. Otherwise, with the normal speed of language generation, attentional memory resources would get clogged up. On the other hand, attention must be focused on the propositional content because it reflects the conceptualization the speaker wants to express. Propositional information is therefore temporarily stored in working memory, which functions as the resource for temporary attentive processes6 that include conceptualizing and monitoring (Baddeley, 1990; Broadbent, 1975).

Levelt (1989) assumed that grammatical information is held temporarily in a syntactic buffer, a memory store which is highly task specific and nonattentive. Specialized grammatical processors can deposit information of a specific nature in the syntactic buffer, which is needed to synchronize the availability of surface structure fragments for phonological encoding because surface structure fragments may be available before they need to be produced.

Specialized ultra-short-term stores are also known in other cognitive fields, for instance, in vision, for which Gough (1972) argued on the basis of experimental evidence that letters are taken out of the visual buffer at the rate of about 15 ms per letter.

The lexicon is stored in permanent memory and is at least partly open to conscious processing. It is therefore a store of declarative knowledge that can be activated for language production.

In other words, there is a fundamental division between procedural (implicit) and declarative (explicit) memory stores that is crucial to the architecture of the Formulator. Overwhelming empirical evidence supporting the dissociation of procedural and declarative knowledge was amassed by Paradis (1994). He summarized the available clinical evidence as follows:

Lesions in the hippocampal and amygdalar system as well as in parietal-temporal-occipital and frontal association cortices compromise recognition and recall, and cause selective anterograde impairment of declarative memory while preserving procedural memory such as the acquisition and execution of complex skills. On the other hand, lesions of the basal ganglia, cerebellum, and other non-limbic-diencephalic sites, as well as circumscribed neocortical lesions, selectively affect learning and memory for skilled, automatised functions (Mayes, 1988) such as language (aphasia) and well practised voluntary movements (apraxia). (p. 396)

Paradis (1994, p. 396) cited a wealth of studies demonstrating the dissociation of procedural and declarative knowledge in patients with Alzheimer's
L1 Transfer 137
a:     DET,  SPEC = A
             NUM = SG
man:   N,    PRED = MAN
             NUM = SG
             PERS = 3
owns:  V,    PRED = OWN (SUBJ)(OBJ)
             TENSE = PRESENT
             SUBJ NUM = SG
             SUBJ PERS = 3
many:  DET,  SPEC = MANY
             NUM = PL
dogs:  N,    PRED = DOG
             NUM = PL

Figure 7.5 Functional structure.

The slots to the right of the verb that are filled by SUBJ[ECT] and OBJ[ECT] in Fig. 7.5 list the arguments of the predicate: first the owner, then the item owned.
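
The matching of feature values that these lexical entries support can be made concrete with a small sketch. The Python fragment below is an illustration under stated assumptions only: the dictionary encoding, the names LEXICON, unify, noun_phrase, and agrees, and the keys SUBJ_NUM and SUBJ_PERS are hypothetical conveniences and are not part of Lexical-Functional Grammar or of PT. It treats each lexical entry for "A man owns many dogs" as a bundle of features, matches number inside a noun phrase, and then matches the subject's features against those the verb requires of its subject.

```python
from typing import Optional

Features = dict[str, str]

# Hypothetical encoding of the lexical entries for "A man owns many dogs".
LEXICON: dict[str, Features] = {
    "a": {"CAT": "DET", "SPEC": "A", "NUM": "SG"},
    "man": {"CAT": "N", "PRED": "MAN", "NUM": "SG", "PERS": "3"},
    "owns": {"CAT": "V", "PRED": "OWN", "TENSE": "PRESENT",
             "SUBJ_NUM": "SG", "SUBJ_PERS": "3"},
    "many": {"CAT": "DET", "SPEC": "MANY", "NUM": "PL"},
    "dogs": {"CAT": "N", "PRED": "DOG", "NUM": "PL"},
}

# Only these features are matched in this sketch (an assumption).
AGREEMENT_FEATURES = ("NUM", "PERS")


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge the agreement features of two bundles; None signals a clash."""
    merged: Features = {}
    for feature in AGREEMENT_FEATURES:
        values = {entry[feature] for entry in (a, b) if feature in entry}
        if len(values) > 1:
            return None  # conflicting values, e.g. SG vs. PL
        if values:
            merged[feature] = values.pop()
    return merged


def noun_phrase(det: str, noun: str) -> Optional[Features]:
    """Match determiner and noun inside the phrase."""
    return unify(LEXICON[det], LEXICON[noun])


def agrees(subject: Features, verb: str) -> bool:
    """Match the subject's features against the verb's SUBJ requirements."""
    entry = LEXICON[verb]
    required = {f: entry[f"SUBJ_{f}"] for f in AGREEMENT_FEATURES
                if f"SUBJ_{f}" in entry}
    return unify(subject, required) is not None
```

Under this encoding, noun_phrase("a", "man") and noun_phrase("many", "dogs") both succeed, a mismatched pair such as noun_phrase("a", "dogs") returns None, and agrees(...) succeeds for "a man" with "owns" but not for "many dogs" with "owns". The first check needs information from inside one phrase only; the second needs information from two different constituents.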
The PRED entry of the f-structure therefore makes it possible to relate the different constituents to the roles (actor, patient, etc.) described by the sentence. This forms the link between the syntactic form and its underlying predicate-argument relations.

As mentioned, feature unification is one of the key concepts that relates Lexical-Functional Grammar to the psycholinguistic model of language generation presented in the work of Levelt (1989). Therefore, it may be useful to have this key notion illustrated in light of the sketch of Lexical-Functional Grammar. In the context of this chapter, this may best be achieved with reference to morphological structures.

In Lexical-Functional Grammar, the morphological component operates on the basis of a functional description of the sentence. The sentence "A man owns many dogs" may illustrate this. Note that lexical entries contain information that is relevant here. The relevant pieces of information are listed in Table 7.2. The well-formedness of sentences is guaranteed, among other things, by ensuring that functional descriptions of the sentence and lexical entries match, that is, the phrase "a man" is functionally well formed because, among other things, the value for NUM[BER] is SG (i.e., singular) in the subsidiary function NUM = SG under SUBJ as well as in the lexical entry for "man." In the same way, the noun phrase "many dogs" is well formed because of a match of the feature NUM in "many" and "dogs." The actual structure of the morphological component is not crucial to the present line of argument. The central point here is that feature unification entails the matching of feature values (such as NUM[BER]) within and across constituents.

Implementing a Processing Hierarchy Into Lexical-Functional Grammar

The implementation of the processability hierarchy into a description of a given language based on Lexical-Functional Grammar affords a prediction of the stages in which the language can develop in L2 learners. The main point of the implementation is to demonstrate the flow of grammatical information in the production of linguistic structures. We demonstrate this with the example of three morphological rules and two word order rules, both relating to English.

The brief discussion of feature unification in the previous section may serve as a useful basis for an illustration of the flow of grammatical information in morphological structures. Considering the unification of the feature NUM[BER] in the noun phrase "a man" (i.e., matching the values of this feature for "a" and "man"), one can see that the unification of the NUM value in noun phrases is an operation that is restricted entirely to the noun phrase. In fact, the noun phrase procedure is the locus of this operation. In PT, an affixation resulting from feature unification within a phrase is called phrasal because it occurs inside phrase boundaries (cf. Pienemann, 1998). This operation relies on the capacity of phrasal procedures to store and unify the feature values of their constituents.

In contrast, some morphological regularities rely merely on category procedures. An example is English or German tense marking (-ed or -te), the information for which can be read off the lexical entry of the verb without any further exchange of lexical information within the phrase, as can be seen in
Table 7.1. In PT, an affix resulting from the use of a category procedure is called a lexical morpheme. Subject-verb agreement, in contrast, involves the matching of features in two distinct constituents, namely NPsubj and VP. The insertion of the -s affix for subject-verb agreement marking requires the following grammatical information:

-s   Vaffix   TENSE = present
              SUBJ NUMBER = sg
              SUBJ PERSON = 3

The value of the first equation (the one relating to tense) is read off the functional description of sentences as illustrated in Fig. 7.5. The values for NUMBER and PERSON must be identical in the functional structure of SUBJ and the lexical entry of V. Hence, this information from inside both constituents has to be matched across constituent boundaries. This process may be described informally as follows:

[A man]NPsubj      [{owns} . . . ]VP   (Present, imperfective)
PERSON = 3         PERSON = 3
NUM = sg           NUM = sg

From a processing point of view, the two morphological processes, plural agreement in noun phrases and subject-verb agreement, have a different status. Whereas the first occurs exclusively inside one major constituent, the second requires that grammatical information be exchanged across constituent boundaries. This type of morphological process is referred to as interphrasal affixation. This operation relies on the capacity of the sentence procedure to store and unify the feature values of sentence constituents.

The basic point to bear in mind for the discussion of word order within the framework of Lexical-Functional Grammar is the fact that word order is defined through constituent structure. This is because this theory of grammar contains only one level of constituent structure, and no intervening representations occur. In other words, no actual linguistic material is assumed to be moved from one place to another, as proposed in transformational grammar. Instead, in Bresnan's (1982) and Pinker's (1984) account of English word order, constituent structure allows a range of different word order constellations. To achieve the correct constellation in a given context, these authors made constituent structure dependent on control equations as in Rule 1, which stipulate constituents to be filled by certain lexical classes.

S'' → XP  S'                        Rule 1
      { wh  =c +
        adv =c + }
      SENT MOOD = INV

Rule 1 describes the occurrence of wh-words and adverbs in focus position as in "yesterday he went home." Note that this position (XP) can only be filled by wh-words and adverbs because this is defined in the constraint equations.

English inversion can be accounted for by Rule 2:

S' → V  S                           Rule 2
     { aux  =c +
       ROOT =c + }
     SENT MOOD =c Inv

It is the interaction of Rule 1 and Rule 2 that creates the correct word order (for instance, "Where has he been?"). A lexical redundancy rule for wh-words ensures that the filling of the focus position creates the information MOOD = inv (inversion). This information then feeds into the equation in Rule 2, which licenses an auxiliary verb in a position left of NPsubj. In other words, grammatical information is created through the processing of one constituent, and that information is utilized during the processing of another constituent. In terms of exchange of information, then, inversion is an example of exchange of grammatical information at the level of the sentence procedure (cf. Table 7.1) because the information MOOD = inv present in XP and in Aux is matched in the sentence procedure.

The second word order example is canonical word order, which in Lexical-Functional Grammar is expressed simply through constituent structure rules:

S → NPsubj V NPobj1 NPobj2          Rule 3

S → NPsubj NPobj1 NPobj2 V          Rule 4

Rule 3 accounts for a subject-verb-object (SVO) language, whereas Rule 4 accounts for a subject-object-verb (SOV) language. Because grammatical functions are assigned at the level of constituent structure, a strict canonical order obviously does not
involve the unification of any features across major constituent boundaries. In other words, no information on lexical features has to be transferred between constituents. It is quite possible to produce canonical sentence schemata without phrasal categories by using a flat constituent structure and by mapping semantic roles directly onto c-structure in the initial stage of syntactic development.

However, canonical word order is not the only possible organization principle of early syntax. A parallel type of organization principle is based on the morphological marking of semantic roles. This would involve an affixation process driven directly by the conceptual structure and based merely on the lexical class of the lexical material. Such affixes could be inferred directly from constituent structure and would not involve any agreement marking. Slobin (1982) supplied evidence in support of this prediction. His data showed that, in the acquisition of Turkish, a nonconfigurational8 language, children acquired morphological markers of grammatical functions at the same developmental point in time as fixed word order was acquired in configurational languages.

We are now in a position to locate five English morphosyntactic phenomena within the hierarchy of processability. The structures discussed here are highlighted in Table 7.3, which includes lexical, phrasal, and interphrasal morphemes as well as canonical word order and inversion. The table also lists a number of further structures and their position within the hierarchy. However, because of limited space, a full exposition of English as a second language development within PT is not possible here (for further details, cf. Pienemann, 1998).

The Interface of Lexical-Functional Grammar and Processability Theory

Given the key role that Lexical-Functional Grammar plays in PT, it may be useful for the coherence of this chapter to highlight briefly the relationship of this theory of grammar to language production and to sketch out additional constraints that the architecture of the Formulator imposes on Lexical-Functional Grammar. At the most general level, the language processor is seen, in the PT perspective, concurrent with Kaplan and Bresnan (1982) as the computational routines that operate on but are separate from the native speaker's linguistic knowledge. We pointed out that feature unification, which is one of the main characteristics of Lexical-Functional Grammar, captures a psychologically plausible process that involves (a) the identification of grammatical information in the lexical entry, (b) the temporary storage of that information, and (c) its utilization at another point in the constituent structure. Pienemann (1998) also demonstrated that feature unification is one of the key processes in morphology and word order. Every level of the hierarchy of processing procedures can be represented through feature unification. In other words, the essence of that hierarchy can be captured through feature unification in Lexical-Functional Grammar.9

The main proviso on this made by Pienemann (1998) is that the procedures that underlie Lexical-Functional Grammar cannot be understood to represent psychological procedures themselves. Instead, they can be considered a shorthand notation that contains all the necessary elements to relate structures to a hierarchy of processability. The formalism of Lexical-Functional Grammar is designed to be highly noncommittal about when unifications are performed. They can be done incrementally, as each phrase is built, or at the end, when an entire constituent structure has been constructed (see Maxwell & Kaplan, 1995, for some discussion on this point). Because PT assumes strict limits on grammatical memory, it would follow that unifications ought to be done as soon as possible (cf. Pienemann, 1998).

These limitations on memory are relevant to a further feature of Lexical-Functional Grammar,
which is that the theory in its present form imposes no limitations on the amount or nature of information that can be transferred between constituents by unification. For example, arbitrarily complex substructures can be built in different constituents and checked for consistency. This has been shown to make it possible to write Lexical-Functional Grammars for highly unnatural kinds of languages (Berwick & Weinberg, 1984, pp. 107-114) and to lead to computational intractability (Barton, Berwick, & Ristad, 1987, pp. 103-114). In PT, rather than having an unlimited and unconstrained ability to unify information from different constituents, learners are assumed initially to have no such ability, but to acquire it gradually. This argues that Lexical-Functional Grammar should be modified so that information flow between constituents is inherently restricted. Pienemann (1998) used the system of Lexical-Functional Grammar with the informal assumption that unification occurs at the lowest node shared by the two constituents between which information needs to be unified.

In short, Lexical-Functional Grammar affords a valid application of Pienemann's (1998) hierarchy of processing procedures; it is also readily available, compatible with Levelt's model, and attractive from a typological point of view.

Processing Constraints on First Language Transfer

The internal mechanics of PT imply processing constraints on L1 transfer for the following reason. Given the architecture of the language generator, there is no guarantee that one can simply utilize L1 procedures for the L2. Pienemann (1998) argued that such bulk transfer would lead to internal problems:

PT does not imply, however, that the learner will never attempt to form diacritic features and functorization rules that reflect L1 regularities. Instead, the theory does imply processing constraints on L1 transfer:

[B]ulk-transfer of the L1 Formulator would lead to very unwieldy hypotheses. German learners of English, for instance, would have to invent large sets of diacritic features for nouns, verbs and adjectives without any evidence of their existence in the L2, since German definite determiners express a complex set of diacritic features of the noun (three genders and two numbers). Since English nouns do not contain these diacritic features the complex system of definite determiners presented in Table 7.4 corresponds to merely one English grammatical morpheme ("the"). (p. 81)

In this case the simplest structural solution would be to abandon the L1 diacritic features altogether. This would in fact reproduce a situation which is close to the English determiner system. However, the relationship between L1 and L2 diacritic features may be more complex than in the above example, with two intersecting sets of diacritic features and different form-function relationships in L1 and L2. In other words, there is potentially a multitude of L1 features only some of which are applicable to the L2. (p. 81)

In essence, the lack of psychological plausibility present in the bulk transfer approach forms a logical argument in favor of processing constraints on L1 transfer, the position assumed by PT:

I hypothesize that the L1 Formulator will not be bulk-transferred. Instead, the learner will reconstruct the Formulator of the L2. This would
not exclude that in the course of this process L1 procedures be utilized. However, I hypothesize that such L1 transfer always occurs as part of the overall reconstruction process. (pp. 81-82)

The case of constraints on the transfer of morphological and lexical regularities is relatively straightforward, as the example of the determiner illustrates. Similar constraints are operational on word order. This point has been illustrated by Pienemann (1998, pp. 99-102) with the acquisition of German separable verbs, but is not repeated here for reasons of space. Suffice it to say that word order phenomena depend crucially on the correct annotation of lexical entries that differ in their diacritic features even between related languages.

In other words, according to PT, both the construction of the L2 from square one and developmental constraints on L1 transfer follow from the hierarchical nature of the learning task. In this scenario, there is no other logical point of departure for this L2 construction process than the beginning of the processability hierarchy because the hierarchy at this point is stripped of all language-specific lexical features and syntactic routines. It would therefore be logical for this L2 construction process to follow the path described in the processability hierarchy and for L1 knowledge and skills to become accessible once they are processable in the developing system.

To sum up, PT implies the hypothesis that the L1 Formulator will not be bulk transferred because the processing of syntax is lexically driven, and the processor relies on highly language-specific lexical features. Instead, the learner will construct the Formulator of the L2 from scratch. This would not exclude that, in the course of this process, L1 procedures will be utilized. However, it is hypothesized that such cases of L1 transfer occur as part of the overall L2 construction process. This means that L1 transfer is developmentally moderated and will occur only when the structure to be transferred is processable within the developing L2 system.

Typological Proximity Without an Advantage

The notion of developmentally moderated transfer basically implies that certain grammatical structures identical in L1 and L2 nevertheless require the development of certain processing prerequisites before the L1 procedures can be utilized in the L2. In this section, we review a number of key studies in support of this hypothesis. What these studies have in common is that they all focus on L1 transfer in the context of typological proximity. In other words, we provide empirical evidence to show that typological proximity does not guarantee L2 learners ready access to L1 knowledge or to L1 processing skills.

Håkansson et al. (2002) provided empirical evidence to demonstrate that L1 transfer is developmentally moderated as predicted by PT. The study focuses on the acquisition of German by Swedish school children. The L1 and the L2 share the following word order regularities in affirmative main clauses:

SVO
Peter mag Milch
Peter gillar mjölk
(Peter likes milk10)

Adverb fronting (ADV)
*Heute Peter mag Milch
*Idag Peter gillar mjölk
(Today Peter likes milk)

Subject-verb inversion (INV) after ADV
Heute mag Peter Milch
Idag gillar Peter mjölk
(Today likes Peter milk)

To place this developmental sequence in the overall context of the processability hierarchy, an overview of the implementation of key morphosyntactic features of German and Swedish is provided in Tables 7.5 and 7.6. For a full exposition of this implementation process and supporting empirical
Table 7.5 Processing Procedures Applied to German Word Order and Morphology
Table 7.6 Processing Procedures Applied to Swedish Word Order and Morphology
evidence, refer to the studies of Pienemann (1998) and Pienemann and Håkansson (1999).

The results of the study by Håkansson et al. (2002) are summarized in Table 7.7, which treats all learner samples as parts of a cross-sectional study. Therefore, Table 7.7 represents an implicational scale11 (cf. Hatch & Farhady, 1982) of the data, which demonstrates that the learners follow the sequence (a) SVO, (b) ADV, and (c) INV. In other words, ADV and INV are not transferred from the L1 at the initial state even though these rules are contained in the L1 and the L2. This implies that, for a period of time, learners produce the constituent order

*adverb S V O

which is ungrammatical in the L1 as well as in the L2 (e.g., *Heute Peter mag Milch).

Table 7.7 Implicational Scale Based on All Learners in the Study by Håkansson et al. (2002)

Name               SVO   ADV   INV
Gelika (Year 1)
Emily (Year 1)
Robin (Year 1)
Kennet (Year 1)
Mats (Year 2)
Camilla (Year 2)
Johann (Year 1)
Cecilia (Year 1)
Eduard (Year 1)
Anna (Year 1)
Sandra (Year 1)
Erika (Year 1)
Mateus (Year 2)
Karolin (Year 2)
Ceci (Year 2)
Peter (Year 2)
Johan (Year 2)
Zandra (Year 2)
Zoe (Year 2)
Caro (Year 2)

ADV, adverb fronting; INV, subject-verb inversion; SVO, subject-verb-object.

Håkansson et al. (2002) argued on the basis of PT that the L2 system can utilize L1 production mechanisms only when the L2 system has developed the necessary prerequisites to process the L1 forms, and that therefore the INV procedure of the L1 cannot be utilized before the full S-procedure has been developed in the L2.

Given that, in this study, German was in fact the third language of the informants and that English was the second, it may be easy to conclude that the nonapplication of INV (or V2) was caused by transfer from English. In fact, this explanation is popular among Swedish schoolteachers of German and has also been suggested by Ruin (1996) and Naumann (1997). Many Swedish teachers of German disrespectfully termed this phenomenon the "English illness."

However, such a proposal is far from conclusive. In the study discussed in this section, ADV did not appear at the early stage, although it is also part of English grammar and could therefore be transferred. In other words, for the proposal to be conclusive, one would need to consider how the transfer-from-L2 hypothesis would be testable. Logically, the hypothesis would have to predict that all L2 word order constraints would be transferred or at least all those that are shared by the L1, the L2, and the L3. Otherwise, the transfer hypothesis would have no predictive power and could not be falsified unless one added a separate theory predicting which items are to be transferred and which are not.

In the absence of such a theory, one can only test the transfer-all hypothesis. To follow this line of argument, it is important to remember that the data from the study showed a strictly implicational development. It is evident from this analysis (see Table 7.7) that 6 of the 20 learners produced SVO only and no ADV. If one follows the transfer-from-L2 view, they would appear to have transferred selectively only one word order pattern known
from their L2 (English). This clearly falsifies the transfer-all hypothesis and leaves the selective-transfer-from-L2 hypothesis with the problem of making no testable prediction about when transfer will take place.

Further evidence supporting the hypothesis that L1 transfer is developmentally constrained comes from Johnston's study of the acquisition of English by Polish and Vietnamese learners. Johnston's study consisted of a total of 16 samples from Polish and Vietnamese adult immigrants in Australia and focused on the acquisition of L2 grammatical rules. The full distributional analysis of this study is available in the work of Johnston (1997) and is also partly reported in the work of Pienemann, Johnston, and Brindley (1988), for instance. Polish uses subject-verb agreement marking; Vietnamese does not. According to the full transfer/full access hypothesis, AGR[EEMENT] ought to be transferred from Polish to English. This would mean that Polish learners should have an advantage over the Vietnamese learners concerning this structure. However, a separate implicational analysis of the two groups revealed that both groups followed the same pattern, and in both groups AGR[EEMENT] was acquired late.

Haberzettl (2000) carried out a longitudinal study of four child learners of German aged 6 to 8 years over a period of 2 to 4 years based on monthly sessions 20-60 min long. The key finding in the present context is that the Russian learners acquired the split-verb position gradually over several months, whereas the Turkish learners acquired it categorically and with nativelike correctness once the structure emerged. Haberzettl concluded from these findings that the Turkish learners benefited from the structural overlap in word order constellations in German and Turkish. This conclusion is fully compatible with the predictions that can be made on the basis of the processability hierarchy and the notion of developmentally moderated transfer. As the discussion here has shown, the split-verb construction is in fact associated with Level 4 of the hierarchy, and both types of learners followed the predicted sequence. However, the Turkish learners could take advantage of their L1 processing skills once their interlanguage developed to the point at which they could be integrated into the L2 processor. In other words, this type of study constitutes evidence in support of the productive nature of developmentally moderated transfer; the studies reviewed in the preceding section demonstrated the constraining nature of the same notion.
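
The implicational reasoning that runs through the studies reviewed above (for example, the sequence SVO, ADV, INV) can be stated operationally. The following Python sketch is an illustration only: the function names and the learner profiles are invented for the example and do not reproduce data from any of the studies cited. A set of learner samples is taken to be implicational when every learner who produces a structure also produces all structures that precede it in the assumed order.

```python
ORDER = ("SVO", "ADV", "INV")  # assumed developmental sequence


def is_implicational(profile: dict[str, bool], order=ORDER) -> bool:
    """True if no structure is present after an earlier one is absent."""
    gap_seen = False
    for structure in order:
        if profile.get(structure, False):
            if gap_seen:
                return False  # a later structure without an earlier one
        else:
            gap_seen = True
    return True


def stage(profile: dict[str, bool], order=ORDER) -> int:
    """Number of structures produced in unbroken sequence from the start."""
    count = 0
    for structure in order:
        if not profile.get(structure, False):
            break
        count += 1
    return count


# Hypothetical learner profiles (not taken from Table 7.7).
learners = {
    "learner_1": {"SVO": True, "ADV": False, "INV": False},
    "learner_2": {"SVO": True, "ADV": True, "INV": False},
    "learner_3": {"SVO": True, "ADV": True, "INV": True},
    "learner_4": {"SVO": True, "ADV": False, "INV": True},  # breaks the scale
}

scalable = {name: is_implicational(p) for name, p in learners.items()}
```

In this invented data set, only learner_4 breaks the pattern, because INV is present while ADV is absent. A strictly implicational development, as reported for the studies above, corresponds to a data set in which no such profile occurs.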
Kawaguchi's study constituted a prime test case for the full transfer/full access model (Schwartz & Sprouse, 1994, 1996) as well as for the competition model (MacWhinney, chapter 3, this volume), which would both predict the Australian learners of Japanese to transfer English SVO word order. However, this prediction was clearly falsified by Kawaguchi's study.

The results of Kawaguchi's study for the initial word order hypothesis are displayed in Table 7.8, which is based on a longitudinal corpus collected from two informants without any previous exposure to the target language. The informants were interviewed four times in their first year of learning Japanese as an L2 starting at the very beginning of the learning process.

Table 7.8 affords a distributional analysis of the corpus in relation to the positioning of the verb in clauses. The first line in the body of Table 7.8 lists the number of target-like occurrences of the verb in clause-final position, and the second line lists the number of occurrences of the verb in nontarget positions. It is easy to see from this analysis that from the very beginning of their acquisition of Japanese, neither of the learners ever produced verbs in a nonfinal position, and this was without exception,12 despite the fact that this is in stark contrast to the structure of their L1.

To summarize, we found that all learners of Japanese studied longitudinally by Kawaguchi started with SOV word order and with subject omission although their L1s followed an SVO pattern and one of the L1s does not permit subject omission. Obviously, these findings falsified the hypothesis that L1 features are transferred to the L2 at the initial state.

This raises the question of why L2 learners would start out with a structure that is typologically rather distant from their L1. The answer is implied in PT and, more specifically, in the developmentally moderated transfer hypothesis advocated in this chapter. In relation to the initial hypothesis for word order, PT predicts the use of a canonical word order pattern. Japanese follows a canonical SOV word order, which requires no exchange of grammatical information within the sentence as it can rely on direct mapping of semantic roles onto surface structure (cf. our discussion above). In other words, because of the low demands on processability, this word order pattern can be processed at the initial stage of clause development despite the typological distance between the L1 and the L2 (for a more detailed and formal account of information distribution in Japanese syntax, see Kawaguchi, in preparation, and Di Biase and Kawaguchi, 2002).

This analysis of the initial word order in the acquisition of Japanese as an L2 also highlights a key difference between Clahsen's (1984) strategies and the processability approach. As Vainikka and Young-Scholten (1994) and Towell and Hawkins (1994) pointed out, Clahsen's strategies would predict that the initial hypothesis in L2 acquisition is formed on the perceptual array "actor, action, acted upon," thus producing universal SVO patterns for all L2s. No such assumption is made in PT. The only stipulation that exists at this level is that no grammatical information be exchanged within the sentence. This constrains the language processor to produce only structures that can be processed without such information exchange. SVO and SOV both satisfy this condition.

Di Biase (in preparation) studied another typological constellation of the same kind as Kawaguchi. In his study, he focused on the acquisition of a pro-drop language13 (Italian) as L2 by speakers of a non-pro-drop language (English). According to White's (1989, p. 87) analysis, this type of learner has to learn two things: the fact that null subjects are permitted and the circumstances in which the language makes use of null subjects. These assumptions were derived from the more general assumption that L2 learners transfer the setting of the L1 parameter to the L2.

Di Biase (in preparation) carried out a longitudinal study with two Australian informants over a 1-year period. Both informants were university students of Italian. One informant (Ernie) was
Table 7.8 The Position of the Verb in Main Clauses by Second Language Learners of Japanese (Jaz and
Lou)
t1 t2 t3 t4
empirical data. Swedish learners of German did not assume a strictly linear relationship between input
transfer V2 to German (which would yield a cor- and output, following the motto "if it is not in the
rect result), and Australian learners of Japanese did input, it cannot occur in the output." As noted,
not transfer SVO to Japanese (which would yield empirical data falsify such an assumption. This is
an incorrect result). also illustrated by the well-attested example of
The strong initial transfer assumption inherent overgeneralization in English regular past marking,
in MacWhinney's (1997, and chapter 3, this vol- such as in Cazden's "She holded the baby rabbits"
ume) competition model also produces predictions (1972, p. 96).
that were falsified by empirical data, particularly As these examples show, the assumption of a
by the Swedish-German study (Håkansson et al., strictly linear relationship between input and out-
2002), which showed that verb-second is not put and a rich transfer assumption produce pre-
transferred from Swedish to German even though dictions that are falsified by empirical data, at
this structure exists in the L1 and in the L2. All least for the domain of morphosyntax.
other cases of nontransfer discussed above proved A rich transfer assumption also is not supported
the same point. in the area of bilingual L1 acquisition. According
In addition, it may be useful to consider the to De Houwer (chapter 2, this volume) no studies
explanatory parsimony of MacWhinney's (1997) have empirically backed up the existence of the sort
assumption that "all aspects of the first language of language repertoires that would be predicted to
that can possibly transfer to L2 will transfer" (p. develop in bilingual children in line with a transfer
119). Recall that MacWhinney (1997) illustrated theory. Indeed, she maintains that the interpreta-
his point about structurally impossible transfer tion of morphosyntactic features of the two input
using German and English as an example. German languages would assume that processing mecha-
nouns are implicitly marked for grammatical gen- nisms in bilingual children would enable them to
der, whereas English nouns are not. He concluded approach each input language as a morpho-
that German learners therefore have no basis for syntactically closed set.
transferring the German gender system to English. The gist of the cross-linguistic survey of L1
Consequently, this set of features was not included transfer presented in this chapter can be summed
in his list of things that will be transferred. up in two fundamental trends: (a) Structures higher
Our point is the following. Whereas L1-L2 up the processability hierarchy are never transferred
contrasts are transparent to the linguist, the ques- at the initial state, regardless of typological con-
tion remains regarding how the learner recog- stellation; (b) initial word order may vary as long
nizes these differences. Recall that at the beginning as the flow of grammatical information is restricted
of this chapter we argued that the relationship be- to the initial stage of processability.
tween German and English diacritic features (of These trends clearly contradict any theory that
nouns) is not obvious to the learner, and that a full places emphasis on extensive L1 transfer at the
transfer hypothesis would lead to unwieldy hy- initial state and support a view on transfer that is
potheses. Conversely, it is precisely this lack of sensitive to the developmental state of the learners
transparency in the relationship between L1 and L2 language.
that makes a radical no-transfer hypothesis equally
unlikely.
Assuming a lexically driven model of language Acknowledgments
production such as the one proposed by Levelt We would like to thank the editors of this volume
(1989), gender is one of several diacritic features for their thoughtful and detailed comments on an
residing in the lexical entry for (German) nouns, earlier version of this chapter. We would also like
and the learner will have to discover for all lexical to express our gratitude to MARCS Auditory
classes (such as noun, verb, etc.) which of the L1 Laboratories, University of Western Sydney, Aus-
diacritic features are also marked in the L2, using tralia, for nancial assistance to create an oppor-
known or unknown linguistic means, and which tunity for us to meet in Sydney and discuss the
research presented in this chapter. We also want
additional diacritic features are marked using
to thank Simone Duxbury for her careful editorial
which linguistic means. This is a monumental learn- work on the manuscript. The research published
ing task. Assuming that diacritic features such as in this chapter was funded in part by a grant
gender are not transferable for structural reasons by the State Department of Higher Education of
would amount to a classical conditioning assump- North Rhine Westfalia, Germany, to Manfred
tion within the competition model that would Pienemann.
L1 Transfer 149
(Eds.), The current state of interlanguage (pp. 55–61). Amsterdam: Benjamins.
Bley-Vroman, R. (1990). The logical problem of second language learning. Linguistic Analysis, 20, 3–49.
Bresnan, J. (Ed.). (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press.
Bresnan, J. (2001). Lexical-functional syntax. Oxford, England: Blackwell.
Broadbent, D. E. (1975). The magic number seven after 15 years. In A. Kennedy & A. Wilkes (Eds.), Studies in long term memory (pp. 3–18). London: Wiley.
Cazden, C. (1972). Child language and education. New York: Holt, Rinehart, and Winston.
Chomsky, N. (1990). On the nature, use and acquisition of language. In W. G. Lycan (Ed.), Mind and cognition. A reader (pp. 627–646). Cambridge, MA: Blackwell.
Clahsen, H. (1984). The acquisition of German word order: A test case for cognitive approaches to L2 development. In R. W. Andersen (Ed.), Second languages: A cross-linguistic perspective (pp. 219–242). Rowley, MA: Newbury House.
Clahsen, H. (1986). Connecting theories of language processing and (second) language acquisition. In C. W. Pfaff (Ed.), First and second language acquisition processes (pp. 103–116). Cambridge, MA: Newbury House.
Clahsen, H. (1992). Learnability theory and the problem of development in language acquisition. In J. Weissenborn, H. Goodluck, & T. Roeper (Eds.), Theoretical issues in language acquisition: Continuity and change (pp. 53–76). Hillsdale, NJ: Erlbaum.
Clahsen, H., & Muysken, P. (1989). The UG paradox in L2 acquisition. Second Language Research, 2, 1–29.
Cohen, N. (1984). Preserved learning capacity in amnesia: Evidence for multiple memory systems. In L. R. Squire & N. Butters (Eds.), The neuropsychology of human memory (pp. 83–103). New York: Guilford Press.
Cohen, N. (1992, November). Memory, amnesia and the hippocampal system. Paper presented at the Cognitive and Neuro Science Colloquium, McGill University, Montreal, Quebec, Canada.
Cook, V. J., & Newson, M. (1996). Chomsky's universal grammar. An introduction (2nd ed.). Oxford, England: Blackwell.
Cooper, W. E., & Zurif, E. B. (1983). Aphasia: Information-processing in language production and reception. In B. Butterworth (Ed.), Language production (Vol. 2, pp. 225–256). London: Academic Press.
De Bot, K. (1992). A bilingual production model: Levelt's speaking model adapted. Applied Linguistics, 13, 1–24.
DeCamp, D. (1973). Implicational scales and sociolinguistic linearity. Linguistics, 73, 30–43.
Di Biase, B. (in preparation). Processability and subject-verb agreement in a pro-drop language. In M. Pienemann (Ed.), Cross-linguistic aspects of processability theory. Amsterdam: Benjamins.
Di Biase, B., & Kawaguchi, S. (2002). Exploring the typological plausibility of processability theory: Language development in Italian second language and Japanese second language. Second Language Research, 18, 272–300.
Dulay, H., & Burt, M. (1974). Natural sequences in child second language acquisition. Language Learning, 24, 37–53.
Engelkamp, J. (1974). Psycholinguistik. Munich, Germany: Ullstein.
Eubank, L. (1993). On the transfer of parametric values in L2 development. Language Acquisition, 3, 183–208.
Ewehag, R., & Jarnum, H. (2001). Förstaspråkets inverkan på andraspråket [First language influence on the second language]. Unpublished manuscript, Department of Linguistics, Lund University, Sweden.
Felix, S. W. (1980). Interference, interlanguage and related issues. In S. W. Felix (Ed.), Second language development. Trends and issues (pp. 93–107). Tübingen, Germany: Narr.
Felix, S. W. (1984). Maturational aspects of universal grammar. In A. Davies, C. Criper, & A. Howatt (Eds.), Interlanguage (pp. 133–161). Edinburgh, Scotland: Edinburgh University Press.
Fodor, J. (1981). Fixation of belief and concept acquisition. In M. Piatelli-Palmarini (Ed.), Language and learning. The debate between Jean Piaget and Noam Chomsky (2nd ed., pp. 143–149). Cambridge, MA: Harvard University Press.
Garrett, M. F. (1976). Syntactic processes in sentence production. In R. Wales & E. Walker (Eds.), New approaches to language mechanism (pp. 231–256). Amsterdam: North-Holland.
Garrett, M. F. (1980). Levels of processing in language production. In B. Butterworth (Ed.), Language production, Vol. 1, Speech and Talk (pp. 170–220). London: Academic Press.
Garrett, M. F. (1982). Production of speech: Observations from normal and pathological language use. In A. W. Ellis (Ed.), Normality and pathology in cognitive functions. London: Academic Press.
Gass, S. M. (1987). The resolution of conflicts among competing systems: A bidirectional perspective. Applied Psycholinguistics, 8, 329–350.
Gough, P. B. (1972). One second of reading. In J. F. Kavanagh & I. G. Mattingly (Eds.), Language by ear and by eye (pp. 331–358). Cambridge, MA: MIT Press.
Guttman, L. (1944). A basis for scaling qualitative data. American Sociological Review, 9, 139–150.
Haberzettl, S. (2000). Der Erwerb der Verbstellung in der Zweitsprache Deutsch durch Kinder mit typologisch verschiedenen Muttersprachen. Eine Auseinandersetzung mit Theorien zum Syntaxerwerb anhand von vier Fallstudien. Doctoral dissertation, Potsdam University, Potsdam, Germany.
Håkansson, G., Pienemann, M., & Sayehli, S. (2002). Transfer and typological proximity in the context of L2 processing. Second Language Research, 18, 250–273.
Harrington, M. (1987). Processing transfer: Language-specific processing strategies as a source of interlanguage variation. Applied Psycholinguistics, 8, 351–377.
Harris, R. J. (Ed.). (1992). Cognitive processing in bilinguals. New York: Elsevier Science.
Hatch, E., & Farhady, H. (1982). Research design and statistics for applied linguistics. Rowley, MA: Newbury House.
Johnston, M. (1997). Development and variation in learner language. Doctoral thesis, Australian National University, Canberra, Australia.
Kaplan, R., & Bresnan, J. (1982). Lexical-functional grammar: A formal system for grammatical representation. In J. Bresnan (Ed.), The mental representation of grammatical relations (pp. 173–281). Cambridge, MA: MIT Press.
Kawaguchi, S. (1999). The acquisition of syntax and nominal ellipsis in JSL discourse. In P. Robinson (Ed.), Representation and process: Proceedings of the Third Pacific Second Language Research Forum (Vol. 1, pp. 85–93). Tokyo: Pacific Second Language Research Forum.
Kawaguchi, S. (2002). Grammatical development in learners of Japanese as a second language. In B. Di Biase (Ed.), Developing a second language (pp. 17–28). Melbourne: Language Australia.
Kawaguchi, S. (in preparation). Syntactic development in Japanese as a second language. In M. Pienemann (Ed.), Cross-linguistic aspects of processability theory. Amsterdam: Benjamins.
Kellerman, E. (1983). Now you see it, now you don't. In S. Gass & L. Selinker (Eds.), Language transfer in language learning (pp. 112–134). Rowley, MA: Newbury House.
Kempen, G., & Hoenkamp, E. (1987). An incremental procedural grammar for sentence formulation. Cognitive Science, 11, 201–258.
Kilborn, K., & Ito, T. (1989). Sentence processing in a second language: The timing of transfer. Language and Speech, 32, 1–23.
Lado, R. (1957). Linguistics across cultures: Applied linguistics for language teachers. Ann Arbor, MI: University of Michigan.
LaFond, L., Hayes, R., & Bhatt, R. (2001). Constraint demotion and null-subjects in Spanish L2 acquisition. In J. Camps & C. R. Wiltshire (Eds.), Romance syntax, semantics and L2 acquisition (pp. 121–136). Amsterdam: Benjamins.
Levelt, W. J. M. (1981). The speaker's linearisation problem. Philosophical Transactions, Royal Society London, B295, 305–315.
Levelt, W. J. M. (1989). Speaking. From intention to articulation. Cambridge, MA: MIT Press.
Liceras, J. M., & Diaz, L. (1999). Topic-drop versus pro-drop: Null subjects and pronominal subjects in the Spanish L2 of Chinese, English, French, German and Japanese speakers. Second Language Research, 15, 1–40.
Lightbown, P., & Spada, N. (1999). How languages are learned. Oxford, England: Oxford University Press.
MacWhinney, B. (1987). Applying the competition model to bilingualism. Applied Psycholinguistics, 8, 315–327.
MacWhinney, B. (1997). Second language acquisition and the competition model. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism (pp. 113–142). Mahwah, NJ: Erlbaum.
Maxwell, J. T., & Kaplan, R. M. (1995). The interface between phrasal and functional constraints. In M. Dalrymple, R. M. Kaplan, J. T. Maxwell, & A. Zaenen (Eds.), Formal issues in lexical-functional grammar (pp. 571–590). Stanford, CA: CSLI.
Mayes, A. R. (1988). Human organic memory disorders. Cambridge, England: Cambridge University Press.
McDonald, L. J., & Heilenman, L. K. (1991). Determinants of cue strength in adult first and second language speakers of French. Applied Psycholinguistics, 12, 313–348.
Meisel, J. M. (1983). Strategies of second language acquisition: More than one kind of simplification. In R. W. Anderson (Ed.), Pidginisation and creolisation as language acquisition (pp. 120–157). Rowley, MA: Newbury House.
Meisel, J. M. (1991). Principles of universal grammar and strategies of language use: On some
152 Acquisition
art (pp. 288–315). Cambridge, England: Cambridge University Press.
Wexler, K., & Culicover, P. (1980). Formal principles of language acquisition. Cambridge, MA: MIT Press.
White, L. (1989). Universal grammar and second language acquisition. Amsterdam: Benjamins.
Wode, H. (1976). Developmental sequences in naturalistic L2 acquisition. Working Papers on Bilingualism, 11, 1–12.
Wode, H. (1978). The L1 versus L2 acquisition of English negation. Working Papers on Bilingualism, 15, 37–57.
Zobl, H. (1980). The formal and developmental selectivity of L1 influence on L2 acquisition. Language Learning, 30, 43–57.
Zurif, E., Swinney, D., Prather, P., & Love, T. (1994). Functional localization in the brain with respect to syntactic processing. Journal of Psycholinguistic Research, 23, 487–497.
Jaap M. J. Murre
8
Models of Monolingual and Bilingual
Language Acquisition
ABSTRACT Children learn language despite the very impoverished nature of the input. Since the 1960s, the symbolic-deductive paradigm has explained this with reference to an innate mental language system. For about two decades, an alternative to symbolic accounts of language and language acquisition has been offered by connectionism, which can be viewed as one of the main subsymbolic-inductive paradigms. The recent models in this paradigm test detailed models against large databases of utterances. A general conclusion from this research is that, despite being very noisy and inconsistent, the nature of language input is nevertheless sufficient to support inductive mechanisms by which seemingly rulelike behavior emerges from a data-driven learning process. Constraints on the learning process are imposed by the architectures of the models. Several models within the symbolic-deductive paradigm have now also been worked out in much more detail, and a lively discussion between proponents of the two paradigms is currently taking place. We review some of the prominent models in both paradigms, with an emphasis on the connectionist models. In particular, we look at models of the acquisition of stress assignment, phonology, past tense formation, pluralization, and certain aspects of semantics.
Models of Language Acquisition 155
enjoyed very wide popularity and seemed to have wide applicability. He also provided some important proofs that guaranteed correct learning behavior of the Perceptron. This started the first wave of widespread interest in connectionism.

Neural networks, as foreseen by Hebb (1949) and developed by Rosenblatt and many others after him, are based on the metaphor of networks of interconnected nerve cells (neurons) that exchange simple signals, called activations, over connections. What such a network can do depends on how it is wired, on which nerve cells are connected, and on how strong or efficient the connections are. Learning is achieved by adjusting the efficiency (weight) of each connection in such a way that the behavior of the network is slowly molded into desired or target behavior. This target behavior may be provided by the modeler in the form of teaching or target signals, in which case one speaks of error-correcting learning. Sometimes, neural networks are able to extract regularities from the stimuli to which they are exposed without being told their aim. They achieve this regularity learning by creating and updating internal category structures.

Thus, it is seen that network models vary in the extent to which they need to be supervised. First or second language learning will probably have elements of both unsupervised and supervised learning. Using neural networks, it is possible to study just how much supervision is necessary to achieve a certain learning performance. In some cases, the model is not given a specific target to produce, but it is merely informed how close it was to the target. This is called reinforcement learning. An often-used example is balancing a broom on one's hand or on a computer-controlled car. The broom is observed continuously, and adjustments are made to the position of the hand or car until the broom finally falls down. The reinforcing signal in this case is of the form 1, 1, 1, . . . (as long as the broom stays up), . . . , 1, 1, 1 (when it finally falls down).

An interesting class of learning models that also falls somewhere between the two extremes of fully supervised learning and reinforcement learning is imitation learning. It involves a babbling stage in which the network learns how it must control its own effectors (e.g., muscles) to achieve a certain goal. A good example is the robot arm by Kuperstein (1988). Before any goal-directed tasks are attempted, the robot first spends a period exploring its own movements by setting the joint angles of its arm to random values. Its two eyes observe the effects of this motor babbling, and the internal network connections between the visual image and the arm position are adjusted so that they become more synchronous as time goes by. After a prolonged phase of such imitation learning, the robot is able to relate the joint angles of the arm to the visual image. The robot is now able to go from an envisioned goal position to a set of joint angle values; it can move its hand to grab an object it sees.

Imitation learning may also play a role in the development of speech. During the babbling stage, an infant makes random movements with the speech organs while at the same time hearing the sounds caused by these movements. In the next stage, the infant is able to echo the sounds of his or her caretakers. During the final stage, the babbling becomes less random and drifts toward the phoneme inventory of the caretakers.

Backpropagation

The most popular form of error-correcting learning for neural networks is backpropagation (Rumelhart, Hinton, & Williams, 1986). One of its earliest applications illustrates the power of this learning algorithm. Sejnowski and Rosenberg (1987) trained a network to pronounce text by presenting it with samples of text with a phonetic transcription. The network was capable of learning the task and showed generalization of its behavior. When new texts to which it had not been exposed were presented, it correctly pronounced the majority of the words. With the advent of backpropagation and the concurrent publication of a comprehensive collection of articles on neural network models and principles, the so-called PDP (Parallel Distributed Processing) volumes (McClelland & Rumelhart, 1986; Rumelhart & McClelland, 1986b), a second wave of popularity started for connectionism (see, e.g., Bechtel & Abrahamsen, 2002, for a recent introduction to connectionism).

The underlying learning mechanism of backpropagation is based on the Perceptron learning rule pioneered by Rosenblatt. The perceptron is limited to input and output values of 0 and 1 (i.e., no graded values are allowed). For each output node, a target signal (also 0 or 1) is available. The network has to learn to produce these target signals given the input pattern. An output node's activation becomes 1 if its net input is higher than some threshold (usually 0) and is 0 otherwise.

The Perceptron learning rule works with a very simple principle: If some input contributes toward an error (mismatch between spontaneous and
target activation) in the output, adjust the weights from those inputs. Or, more precisely, if the output is already equal to the output target, the weight is left unchanged. Otherwise, if the target activation is higher than the spontaneous output, the weights to the output node are increased by some small amount, and if the target is lower, the weights are decreased by a small amount. Weights from nodes with activation value 0 are never changed. When a pattern (input-target pair) is presented, the learning rule is applied to all weights in the network. An entire training set, consisting of many patterns, usually has to be presented several times during the training procedure while small adjustments are made to the weights. This typically continues until no further improvement in the performance is observed.

The Perceptron is a two-layer network. It has no middle layers (called hidden layers), so it cannot do any internal processing. Minsky and Papert (1969) proved that two-layer networks cannot represent certain important logical relationships between input and output, including the exclusive-or function. Their analysis implied that there are many interesting pattern sets for which there exist no weight values that allow it to produce an error-free output for every input pattern. Nonetheless, if a solution does exist, the Perceptron learning rule is guaranteed to find it (Rosenblatt, 1958).

The backpropagation algorithm by Rumelhart et al. (1986) remedied the shortcomings of the perceptron algorithm because (a) it can be used in networks with one or more hidden layers (therefore, it is sometimes known as a multilayer perceptron), and (b) it can be used with networks that have graded inputs and outputs. Widrow and Hoff (1960) had published a learning rule that could be used with some types of graded activation rules: the delta rule. The backpropagation rule can be seen as a generalization of this learning rule and is, therefore, often called the generalized delta rule.

The backpropagation learning rule is very similar to the perceptron rule when applied to the weights from the hidden units to the output units; for weights from the input layers to the hidden layers (for which no explicit target output is given), the errors are backpropagated from the output layers. The local error values in the output and hidden nodes are used in a way similar to the perceptron rule.

Backpropagation will not always find an optimal solution in the form of a set of weights that maximizes the performance, but it will typically deliver at least a good solution. For many interesting learning problems, it can be proven that it is not feasible to find the globally optimal solution within a reasonable amount of time, so for these problems we must make do with a good but suboptimal solution.

Standard backpropagation works with feedforward networks only. This means that higher layers cannot be connected to lower layers (i.e., those closer to the input). This limits their use to input-output associations and makes it hard to apply them to time-varying signals, such as language utterances. In such cases, the system must retain an internal state that reflects the history of the signal thus far. This state may be compared to the stack necessary to parse context-free languages. Generalizations to backpropagation networks with recurrent connections were presented in the work of Rumelhart et al. (1986).

A simplified version of a backpropagation network that is able to learn time-varying signals is the Simple Recurrent Network by Elman (1990), which is also able to learn simple grammars. This network uses a buffer into which the hidden layer activations are copied after each learning cycle (see Fig. 8.1). The buffer enables the network to keep track of the history of past patterns encountered. One of the problems with these networks is that they are hard to train. Elman had to use a special procedure, called combined subset training, by which the network was first trained on a small set

Figure 8.1 Simple Recurrent Network (Elman, 1990). After each learning cycle, the contents of the hidden layer are copied to the buffer and held during the next cycle. In this way, new patterns are processed together with some trace of past patterns. These networks are therefore able to learn series of patterns and simple grammars.
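The copy-back mechanism described in the caption of Figure 8.1 can be illustrated with a small sketch. It covers only the forward pass of an Elman-style network, with no learning, and rests on several assumptions that are not taken from the chapter: the layer sizes, the random untrained weights, the sigmoid activation, and all names (`SimpleRecurrentNetwork`, `step`, `reset`) are chosen here purely for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SimpleRecurrentNetwork:
    """Forward pass of an Elman-style network with a copy-back buffer (no training)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)  # fixed seed so the sketch is reproducible

        def rand_matrix(rows, cols):
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

        self.w_in = rand_matrix(n_hidden, n_in)       # input  -> hidden
        self.w_buf = rand_matrix(n_hidden, n_hidden)  # buffer -> hidden
        self.w_out = rand_matrix(n_out, n_hidden)     # hidden -> output
        self.buffer = [0.0] * n_hidden                # the buffer starts out empty

    def reset(self):
        self.buffer = [0.0] * len(self.buffer)

    def step(self, pattern):
        # The hidden layer sees the new pattern together with the buffered trace.
        hidden = [
            sigmoid(sum(w * x for w, x in zip(in_row, pattern))
                    + sum(w * b for w, b in zip(buf_row, self.buffer)))
            for in_row, buf_row in zip(self.w_in, self.w_buf)
        ]
        output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in self.w_out]
        self.buffer = list(hidden)  # copy the hidden layer to the buffer for the next cycle
        return output


net = SimpleRecurrentNetwork(n_in=3, n_hidden=4, n_out=2)
A, B = [1, 0, 0], [0, 1, 0]

net.step(A)
out_b_after_a = net.step(B)  # pattern B preceded by A

net.reset()
net.step(B)
out_b_after_b = net.step(B)  # pattern B preceded by B
```

Because the buffer holds a trace of the preceding pattern, the same input B yields different outputs in the two sequences. This sensitivity to history is what a training procedure such as backpropagation would then have to exploit.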
of examples. Throughout the training, this set was gradually extended to the final set. This procedure may broadly be compared to grammar learning by infants, starting with a small set of simple utterances before a larger range is acquired. Unfortunately, as discussed in the Models section, the suitability of Simple Recurrent Networks as psychological models of grammar acquisition is very limited (Sharkey, Sharkey, & Jackson, 2000).

Hebbian Learning and Categorization

One of the main features of a neural network is usually its insensitivity to small perturbations in the input patterns. Some networks are connected such that they can store patterns. When noisy versions of these patterns are presented, they are able to correct the small mistakes in the input, a feat often dubbed pattern completion or content-addressable memory.

Hopfield (1982) showed that this behavior can be obtained with Hebbian learning and a simple activation rule. Interesting and nontrivial comparisons with certain complex systems in physics can be made, by which the concept of attractor is linked with that of energy. With this approach, an attractor becomes an activation state with low energy, and under the influence of activation updates, the activation state migrates toward a low-energy state. When analyzing the pattern completion behavior, it thus appears that distortions of a pattern are attracted toward the original pattern. A stored representation in a Hopfield network is therefore an attractor.

Hebbian learning is often used to extract regularities from the set of input patterns. For example, if the input was speech, a network could be used to find the phonemes automatically. A key ingredient in nearly all of these algorithms is a form of competition between the nodes: Only one node (or a few nodes) can be active in a layer. The node with the highest net input usually wins the competition.

The principle of competitive learning is illustrated in Fig. 8.2. The network has three input nodes, which only serve to hold the input pattern. In this case the input nodes represent the letters i, n, and o, so that the words "in" and "no" can be formed. The network also has two uncommitted representation nodes U1 and U2. Initially, they do not represent any specific pattern. One of the main goals of a competitive learning procedure is to have these nodes represent specific input patterns or categories of related input patterns. In Fig. 8.2, it is shown how first the word "in" is learned and then the word "no."

Learning proceeds in two stages: First, a process of competition among the representation nodes takes place; as a result, a single winning node remains activated (in this case, Node U2). Second, the connections to this node are adjusted. Using the Hebb rule, a weight from node i to node j is increased if (and only if) both nodes are activated. Typically, the weight is decreased if node j is activated but node i is not. After having been exposed to "in" and "no," pattern "in" will always activate the "in" node and pattern "no" will activate the "no" node. In this way, we could produce a word recognizer that learns in an unsupervised manner: Simply by presenting many words to the system, it will develop word recognition nodes.

Modularity and Innate Knowledge

The value of unsupervised learning lies in the fact that it is possible to discover regularities in the input patterns and to form categories or other higher-level units in an autonomous fashion. It is very likely that such processes play a crucial role in the acquisition of cognitive skills. Moreover, networks such as shown in Fig. 8.2 can serve as modules in larger networks. For example, there could be lower-level modules that recognize letters on the basis of handwritten patterns. On top of the letter modules, one or more word modules could then be positioned. It would suffice to provide such a model with enough handwritten words to allow it to discover both letter units and word units (see Murre, 1992, or Murre, Phaf, & Wolters, 1992, for an example of such a simulation). If it were trained with Russian input patterns, it would develop nodes recognizing Cyrillic letters and Russian words. The outcome of the learning process thus is strongly determined by the input patterns.

The above learning scheme might seem to be an example of pure induction, with little room for innate knowledge. This is not true, however, because the modular architecture of such a model has to be taken into account. How many modules does it have? How large are they? How are they connected? This overall structure provides an important constraint on what can and cannot be learned. It is, therefore, the second determinant of the outcome of the learning process, and it can be regarded as one of the points at which innate knowledge shapes the learning process.
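The pattern completion behavior attributed above to Hopfield (1982) can be made concrete with a minimal sketch. Several choices here are assumptions for illustration rather than details given in the chapter: activations are coded as +1 and -1, the Hebbian weights are the summed products of the stored activations, units are updated one at a time with a simple threshold rule, and the two stored patterns and all function names are hypothetical.

```python
def train_hebbian(patterns):
    """Hebbian storage: strengthen w[i][j] when units i and j agree, weaken it otherwise."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def energy(weights, state):
    """Energy of an activation state; stored patterns sit at low values."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(weights, state, max_sweeps=10):
    """Update units one by one with a threshold rule until nothing changes."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            net = sum(w * s for w, s in zip(weights[i], state))
            new = 1 if net >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state


stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hebbian(stored)

distorted = [-1, 1, 1, 1, -1, -1, -1, -1]  # first stored pattern with one unit flipped
completed = recall(weights, distorted)
```

In this small case the distorted pattern settles back onto the first stored pattern, and its energy drops from -12 to -24 on the way, which is the sense in which a stored representation acts as an attractor.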
Figure 8.2 Schematic overview of competitive learning. The connections with circles at the end indicate
inhibitory connections; arrowheads indicate excitatory connections. (a) Letters I and N of the word
IN have been presented. (b) Uncommitted representation nodes U1 and U2 are competing. Assume
that the weights are initially equally strong with small random variations. (c) Node U2 has won the
competition process because its connections were a little bit stronger. After resolution of the competition
process, the connections from letter nodes I and N to node U2 are strengthened (Hebbian learning); the
connection from O to U2 is weakened (anti-Hebbian learning). (d) When the word IN is presented again,
the IN node will very rapidly become activated. (e) When the word NO is presented, however, the other
node will become activated. (f) Hebbian learning further establishes the previously uncommitted node U1
as the NO node.
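The sequence in Figure 8.2 can be traced with a short sketch. It is an illustration under stated assumptions, not the simulation behind the figure: the inhibitory competition is replaced by simply picking the node with the highest net input, the "small random variations" are replaced by hand-picked starting weights so that the run is reproducible, and the update rule (move a weight toward 1 for an active letter, toward 0 for an inactive one) and the learning rate are hypothetical choices.

```python
LETTERS = ("i", "n", "o")

# Hand-picked starting weights standing in for "equally strong with small
# random variations"; U2 happens to be slightly stronger for the word "in".
weights = {
    "U1": {"i": 0.50, "n": 0.52, "o": 0.53},
    "U2": {"i": 0.55, "n": 0.52, "o": 0.50},
}


def net_input(node, word):
    """Sum of the weights from the active letter nodes to a representation node."""
    return sum(weights[node][letter] for letter in word)


def winner(word):
    """Winner-take-all stand-in for the competition process."""
    return max(weights, key=lambda node: net_input(node, word))


def present(word, rate=0.5):
    """Present a word, resolve the competition, and adjust only the winner's weights."""
    win = winner(word)
    for letter in LETTERS:
        w = weights[win][letter]
        if letter in word:
            weights[win][letter] = w + rate * (1.0 - w)  # Hebbian strengthening
        else:
            weights[win][letter] = w - rate * w          # anti-Hebbian weakening
    return win


history = [present(word) for word in ("in", "in", "no", "no")]
```

With these starting values the winners come out as U2, U2, U1, U1: U2 becomes committed to "in", its connection from O is weakened, and the previously uncommitted U1 is then free to take over "no", as in panels (c) through (f) of the figure.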
into (0.3, 0.1, 0.9). The weights of the neighbors would also be shifted toward the input pattern. Exactly how large the neighborhood is varies, and generally it shrinks in the course of the learning process. Initially, it may incorporate half the network, whereas toward the end of learning, it may include only the winner itself.

The self-organizing map is interesting because it offers a model for the acquisition and emergence of similar maps found in the brain. Such topological maps are found, for example, in the auditory areas of the temporal cortex, where neurons that are sensitive to well-defined pitches are neatly arranged from high to low in a tonotopic map. Many other examples have been documented, including the visual areas and somatosensory areas of the brain.

Kohonen maps have been applied to areas relevant for language processing, such as learning of phonemes and speech recognition (the "phonetic typewriter" by Kohonen, 1988); of handwritten graphemes, including those of cursive writing (Schomaker, 1991); and of the semantics and broad lexical class as in the semantotopic maps by Ritter and Kohonen (1990). In the last, a Kohonen map was trained on simple three-word sentences of the type {John, Mary} {walks, runs} {slowly, fast}, such as "John walks slowly" or "Mary runs slowly." Many small sets of related words and many possible set-sequences were used. In the resulting semantotopic map, nouns, verbs, and adverbs occupied consecutive positions, covering three different areas of the map. Within these areas, related words were positioned on nodes that were close together. The semantic similarity was thus reflected in the two-dimensional layout of the map.

Based on these types of maps, and with the addition of several other modules and learning algorithms, Miikkulainen (1993; also see Miikkulainen & Dyer, 1991) constructed an impressive system that can answer questions about stories and can parse and paraphrase them. Semantotopic maps have not yet been found in the brain, although there is good evidence for at least an overall organization into people, animals, and tools on the temporal cortex (Gazzaniga, Ivry, & Mangun, 2002).

The vocabulary and grammar used by Ritter and Kohonen (1990) were very small compared to real language, making it a "toy problem." Using advanced techniques from matrix algebra, however, it has proven to be possible to "harvest semantics" from raw texts (Landauer & Dumais, 1997). The approach has much in common with that of Ritter and Kohonen (1990). For example, they also used the heuristic that words that occur in a similar neighborhood are semantically related. This approach to semantics is explored further in the section "Harvesting Semantics From Texts." In the following section, a selection of models of language acquisition is reviewed, and a small selection of models of L2 acquisition is examined in the next section. A more general and more complete overview of computational psycholinguistic models can be found in the work of Dijkstra and De Smedt (1996), and a recent collection of models of (first) language acquisition can be found in the book by Broeder and Murre (2000).

Models of First Language Acquisition

When reviewing a certain area of modeling in psychology or cognitive neuroscience, different levels of maturity in the approach can be distinguished. In relatively new or difficult areas, the models are often merely aimed at "existence proofs," attempting to show that something is indeed possible. Thus, Sejnowski and Rosenberg's (1987) NetTalk demonstrated that it is indeed possible to achieve industry standard text-to-speech conversion with a highly automated method based on learning from examples. Although their text and phoneme transcriptions were aligned manually, no rules or other information entered into the learning process. A competing model, DECTalk, had been developed through laborious encoding of the pronunciation rules of English with its many exceptions. Yet, despite the fact that more explicit information was used in creating DECTalk, NetTalk achieved a comparable generalization of performance on untrained texts. Once there is such an existence proof, more sophisticated research questions may be asked, delving into the psychological plausibility of the approach and perhaps even into the biological plausibility. With connectionist models, there is often the implicit promise that similar processes may go on in the brain, although the class of models for which this claim is substantiated is still quite small.

One aspect that many models such as NetTalk have in common is that they exhibit rulelike behavior, yet rules are not explicitly represented, and no symbols are passed around in the network. There is only the flow of activation signals from input to output. The output activation pattern must be interpreted to arrive at a phonetic transcription, for example, by selecting the output node with the
highest activation value. The total output of the network is thus richer than the single phoneme it produces because it also produces a host of "second guesses." When the final output is very similar to that of rule-based systems, we still say that the network has discovered the rules of pronunciation. A set of rules in such a case can be considered a compact description of a much richer domain.

Once the existence proofs have been established, more advanced questions may be asked. In the case of language acquisition, the focus could be questions like the following: Is the model scalable? In other words, is it able to deal with large, real-world training sets that have not been extensively preprocessed? Does the system acquire the pronunciations or other task aspects modeled in roughly the same order as human subjects? What are the difficult cases? Do humans frequently falter on these as well? Is the method of training comparable to the way humans acquire the task (e.g., considering the number of training trials)? Can stages in the learning process be discerned? If the network is damaged, will the errors resemble those of human subjects with certain types of brain damage? Can anything be learned from the internal representations that have been formed during the learning process? Is there a relationship between the overall model architecture and theories of cognitive processing? Is there any such relationship with the gross anatomy of the brain? As more and more such questions are answered in a satisfactory manner, the models become more sophisticated.

During this process, there is usually a simultaneous demand for more detailed and more extensive collections of data, as can indeed be observed for the case of language acquisition. With the development of very fast personal computers, it is now feasible to model real-world data sets. Moreover, their collection, administration, and distribution are greatly boosted by the Internet. An example is the CHILDES (Child Language Data Exchange System) database at http://childes.psy.cmu.edu/, created by Brian MacWhinney (MacWhinney & Snow, 1985).

Remember-All Models of Prosody

Like many other language domains, prosody can be modeled through explicit rules and through other subsymbolic or nonsymbolic approaches. Children may learn correct stress assignment, for example, through the development of explicit rules. Alternatively, they could simply memorize the stress pattern of each individual word with the rest of the word's structure.

Gillis, Daelemans, and Durieux (2000) described a "remember-all" model that uses "lazy learning" to acquire stress assignment. The system is fed with words with stress assigned. All instances are stored and remembered. When a new word is encountered, its stress pattern can be determined by finding the words in long-term memory that resemble it. In remember-all schemes, it is usually necessary to calculate similarity between remembered instances and newly encountered ones, and much of the work done on this type of model is invested in the study of similarity metrics. Their approach is based on that by Aha, Kibler, and Albert (1991), with some extensions. Suppose, for example, that the system encounters a new Dutch word politie (/po:li:si:/, "police"). It will attempt to assign one of a limited number of stress patterns by comparing it with all words in the database (long-term memory).

Words are matched in a syllable grid, and the number of attribute value coincidences is counted. For example, /po:li:si:/ and /po:li:o:/ (polio, "polio") give a high match, whereas /po:li:si:/ and /a:Gressi:/ (agressie, "aggression") give a low match. This approach can be refined, as Gillis et al. (2000, Table 5.1) showed, because not all attributes give the same amount of information. In fact, most of the information necessary for stress assignment is carried by the nucleus (middle) and coda (end) of the final syllable. They therefore weighted each attribute (entry in the grid) by its importance. Using this revised scheme, the match between /po:li:si:/ and /po:li:o:/ becomes lower because the final syllable does not match. The match between /po:li:si:/ and /a:Gressi:/ becomes much higher, and the system decides to assign the latter's stress pattern to politie, which is correct (penultimate syllable). Using that of polio would have led to an incorrect assignment of the antepenultimate syllable.

Differential weighting of attributes also forms the cornerstone of Nosofsky's (1986) Generalized Context Model, which is instance based as well. Attributes that are important for category distinctions receive a higher weight compared to those attributes that categories tend to have in common. Intuitively, this makes sense; if animals have to be categorized as either cats or dogs, it helps very little to know that the animal to be categorized has four legs, two eyes, two ears, and a tail. Although there are four attribute values, they convey no differential information, and they would each receive a weight of 0.
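The effect of attribute weighting on the politie example can be made concrete with a small sketch. It is not the implementation of Gillis et al. (2000): the syllable grids (onset, nucleus, and coda for each of the last three syllables), the two-word memory, and the hand-set weights are all hypothetical choices made for illustration only.

```python
# Sketch of a "remember-all" stress learner with weighted attribute overlap.
# Grids, weights, and the two-word memory are hypothetical illustrations.

# Each stored word: (onset, nucleus, coda) for three syllables, plus its stress label.
MEMORY = {
    "polio": ((("p", "o:", ""), ("l", "i:", ""), ("", "o:", "")), "antepenultimate"),
    "agressie": ((("", "a:", ""), ("Gr", "e", "s"), ("s", "i:", "")), "penultimate"),
}

# The new word whose stress pattern has to be assigned.
POLITIE = (("p", "o:", ""), ("l", "i:", ""), ("s", "i:", ""))

# Every attribute counts equally.
UNIFORM = ((1.0, 1.0, 1.0),) * 3
# Hand-set weights that favor the nucleus and coda of the final syllable.
WEIGHTED = ((0.1, 0.1, 0.1), (0.1, 0.1, 0.1), (0.3, 1.0, 1.0))


def match(grid_a, grid_b, weights):
    """Sum the weights of all grid positions where the two words coincide."""
    total = 0.0
    for syl_a, syl_b, syl_w in zip(grid_a, grid_b, weights):
        for a, b, w in zip(syl_a, syl_b, syl_w):
            if a == b:
                total += w
    return total


def assign_stress(grid, weights):
    """Return the best-matching stored word and the stress pattern it lends."""
    best = max(MEMORY, key=lambda word: match(grid, MEMORY[word][0], weights))
    return best, MEMORY[best][1]


print(assign_stress(POLITIE, UNIFORM))
print(assign_stress(POLITIE, WEIGHTED))
```

In this toy setting, equal weights make polio the nearest neighbor (antepenultimate stress), whereas the weights that emphasize the final syllable make agressie the nearest neighbor (penultimate stress), in line with the example in the text. A fuller model would derive the weights from the stored instances rather than set them by hand.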
One of the most popular approaches to grammar learning is the simple recurrent network of Elman (1988, 1991), which is a special case of backpropagation through time as formulated by Rumelhart et al. (1986). The simple recurrent network is able to acquire (i.e., induce) simple phrase structure grammars so that, when given a grammatically correct sentence fragment, it is able to produce a legitimate next word. Several other studies have been carried out, all indicating that the networks are indeed able to capture grammatical regularities from an input set (e.g., Cleeremans, 1993). Servan-Schreiber, Cleeremans, and McClelland (1991) showed that their simple recurrent network could be trained to be a perfect recognizer for a finite-state grammar. So, there seem to be good possibilities for at least one class of neural network models.

Sharkey et al. (2000), however, analyzed the behavior of the simple recurrent network and pointed out several difficulties that severely limit its applicability as a psychological model of grammar acquisition. One of the problems mentioned by Sharkey et al. concerns the low maximum level of embedding that can be handled. This problem is sidestepped here because it could be argued that human working memory has severe limitations as well.

One of the other problems addressed by Sharkey et al. (2000) is the extreme difficulty of training a simple recurrent network to be a grammar recognizer. In one study, only 2 of 90 simulations yielded a successful performance. Several other problems were signaled, including one that plagues all backpropagation networks, namely, catastrophic interference (e.g., French, 1999). In this case, it would be predicted that children would have to relearn a substantial fraction of the already-learned examples each time they came across novel grammatical structures. If no relearning takes place on old patterns when novel ones are learned, the backpropagation network (including simple recurrent networks) rapidly forgets everything it has learned. Sharkey et al. (2000) did not explore remedies that have been proposed to alleviate this catastrophic forgetting (French, 1999; Murre, 1992). Once these are applied, some of the negative conclusions by Sharkey et al. (2000) may be lifted.

Harvesting Semantics From Texts

How meaning is represented in mind and brain and how it is acquired are important topics that have generated a staggering number of theories and philosophies. Compared to these, the contributions of computational models have been modest. These are nonetheless interesting because they illustrate how rough but useful semantic associations between words can be derived even from raw, unpreprocessed texts.

The semantotopic maps by Ritter and Kohonen (1990) illustrated the principle using a small-scale database. Recent techniques have used very large corpora, exceeding in some cases even the size of those to which a single human is exposed during a lifetime. The basic premise of most of this research is that words that tend to appear in the same context must be semantically related. As a first approach, all words encountered could be placed in a huge list, and a count of all words observed in the context could be kept behind each word. Context could be defined arbitrarily as, say, within 10 words to the left and 10 to the right, or some linguistic or textual unit such as the sentence or paragraph could be chosen. If any two words are selected from the list, the match can be looked at in context to gain a rough estimate of their semantic relatedness. This approach was used by several authors.

Landauer and Dumais (1997), for example, defined a feature vector for each word encountered in a text, for which the features consisted of those words found in the neighborhood. These vectors may become very large as the number of lexical types may approach 100,000. Therefore, they used methods from matrix algebra (Singular Value Decomposition, a form of generalized factor analysis) to reduce this vector and shrink it to about 80 features. Landauer and Dumais showed that this not only results in a more manageable long-term memory size, but also gives better performance (also see the material at http://lsa.colorado.edu/). Their method, called Latent Semantic Analysis, amplifies underlying semantic factors and filters out some of the noise.

A similar approach was followed by Griffiths and Steyvers (2004). Instead of dimension reduction, they used a topic-finding approach. Each word occurs in the context of certain topics. Selection of one or more topics gives a distribution of words that are likely to occur. Their topic-based system was able to successfully classify abstracts of articles published in the Proceedings of the National Academy of Sciences. An extension of their model also included word order statistics, based on Hidden Markov Models (Griffiths & Steyvers, 2003), and is able to generate pseudoabstracts after topic selection.
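The simple counting scheme described earlier in this section (a list of words, each with a count of the words seen in its context) can be sketched in a few lines. The toy corpus, the function names, and the use of cosine similarity as the measure of "match in context" are assumptions made for this illustration; the sketch covers only the counting step, not the dimension reduction of Latent Semantic Analysis or the topic-based approach.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus (hypothetical); each string is treated as one sentence.
CORPUS = [
    "cat drinks milk",
    "dog drinks water",
    "car needs fuel",
    "truck needs fuel",
]


def context_counts(sentences, window=10):
    """For every word, count the words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def relatedness(counts, a, b):
    """Cosine similarity between the context-count vectors of two words."""
    va, vb = counts[a], counts[b]
    dot = sum(va[k] * vb[k] for k in va if k in vb)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


counts = context_counts(CORPUS)
print(relatedness(counts, "cat", "dog"))  # share the context word "drinks"
print(relatedness(counts, "cat", "car"))  # share no context words
```

Note that cat and dog never occur in the same sentence of the toy corpus; they come out as related only because they share a context word, which is the premise stated above. With realistic corpora the count vectors become very long, which is the motivation for the reduction step used by Landauer and Dumais (1997).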
Although these methods have clear practical applications, they have at this point to my knowledge not been studied as models of human semantic acquisition. One step toward this would be to study the gradual pattern of acquisition rather than to process a large batch of text at once.

Children not only have to learn how words relate to each other, but also have to ground their meaning by relating them to aspects of the nonlinguistic environment. Siskind (1996, 2000) pointed out that this is a nontrivial problem because many words may be presented simultaneously with many possible aspects of the nonlinguistic context. He presents a lexical acquisition method for bootstrapping the cross-domain semantics using an online method, rejecting a remember-all approach. Although his simulations used artificial examples, he did consider the effect noise might have.

Models of Multilanguage Acquisition

Although the field of bilingualism is developing rapidly, computational modeling of multilanguage acquisition lags far behind. Some advances, however, suggest that important contributions can be expected in this area. Most models focus on the learning process of an L2 after an L1 has already been mastered. This distinction between L1 and L2 is itself subject to theorizing. How much later must L2 be learned to be considered an L2? What are the criteria?

Ullman (2001) addressed this issue in the context of his declarative/procedural model. He argued that learners at a greater age of exposure change their learning strategy of grammar from procedural to declarative, from implicit habit formation to explicit memorization of instances and schoolbook rules. The L2 thus is acquired in a fundamentally different manner. The use of rules here is quite different from the rulelike behavior that emerges in the models discussed in the preceding section. These often lack internal rules, but are able to show regularities in processing that are hard to distinguish from true rule-based processing. Ullman stressed the explicit, conscious use of rules in late L2 learning as the main mechanism.

A general approach to modeling language acquisition, the Competition Model (Bates & MacWhinney, 1982; MacWhinney, 1987) has been extended to L2 acquisition (MacWhinney, in press; also see MacWhinney, chapter 3, this volume). The model is comprehensive and addresses most areas in language processing and acquisition: phonology, lexicon, morphosyntax, and conceptualization (Levelt, 1989). The Extended Competition Model emphasizes competition between various options as the main mechanism that operates during production and comprehension. Different possibilities, such as word orders or choices of lexical items, compete. Levels may, but need not, influence each other, which is called resonance. MacWhinney (in press) argued that models of L2 learning must always take into account L1 learning because of transfer effects and common learning mechanisms. The Extended Competition Model provides a basis for implementation in a series of computational models, but this has not been accomplished yet.

Bilingual Lexicon

The two leading connectionist models of the bilingual lexicon are the Bilingual Interactive Activation (BIA) model (Dijkstra & Van Heuven, 1998; Grainger, 1993; also see chapter 10 by Thomas and Van Heuven in this volume) and the Bilingual Interactive Model of Lexical Access (BIMOLA; Grosjean, 1988, 1997), both building on the work of James McClelland and colleagues. Although neither models the acquisition process, they form an important benchmark against which acquisition models of bilingual lexicons should be tested.

The BIA model is an extension of the interactive activation model by McClelland and Rumelhart (1981), which was developed to account for context effects in letter recognition. It has three levels: letter features, letters, and words. Only four-letter words can be represented. Important in the model are the recurrent connections from the word level down to the letter level. A partially recognized word is able to help disambiguate a letter, so that a B in beer will be recognized faster than in bxxx, as is the case in humans. The BIA model adds a fourth level: language nodes. Activating a particular language node allows selection of words in that language while inhibiting words in the other languages. The words from different languages are represented in an integrated lexicon at the word level. During recognition, all word nodes that become activated through a particular language node will be strongly favored in the recognition process.

The BIMOLA is an extension of the TRACE model of McClelland and Elman (1986) and as such focuses on spoken word recognition. It has no language level, but has distinct modules for different
languages at the phoneme and word levels. Both the BIA model and BIMOLA have been applied to cross-modal priming, recognition of cognates and homographs, semantic effects, and other phenomena. Many of these are reviewed in chapter 10 by Thomas and Van Heuven in this volume.

Meara (1999) also studied the bilingual lexicon, using Random Autonomous Boolean (RAB) networks (Kauffman, 1993), which resemble somewhat the model by Hopfield (1982). RAB networks also have recurrent connections as well as attractor states. The basic idea of Meara's model is that L1 and L2 both form an attractor state, and that the network is in either of the two states. When the network is in L1, more and more lexical units from L2 may be recruited, for example, because the system is suddenly exposed to many words from L2. In that case, the attractor will rapidly shift to L2, which will then become the dominant lexical process. This model aims to explain aspects of language shifting, although it has not been systematically compared with experimental data.

Neither the BIA model nor the BIMOLA models the acquisition phase; their connections are hardwired. Two models that specifically aim to model acquisition are by French (1998) and by Li and Farkas (2002).

French (1998) applied a simple recurrent network to modeling the lexicon in bilingual memory. His basic simulation uses a grammar with only subject-verb-object (SVO) sentences and with two minivocabularies with four words per category (e.g., subject nouns in language Alpha are boy, girl, man, woman and in language Beta are garçon, fille, homme, and femme; French also mentioned a scaled-up version with 256 words per category). Sentences are generated as a continuous stream with a very small chance of switching from Alpha to Beta (never within an SVO sentence): boy lifts toy man sees pen man touches book . . . boy pushes book femme soulève stylo fille prend stylo, and so on. A 300,000-word training sequence was fed into a single recurrent network with a hidden layer of 32 nodes. With backpropagation, distributed representations are expected to develop in the hidden layer, with strong overlap between learned items.

French (1998) analyzed these representations and found that they were neatly clustered, first into the two languages and within these into the three grammatical categories. It is interesting that there was a single hidden node, Node 22, that had a very strong influence on which language was processed. When a random subset of the nodes was lesioned (deactivated), the overall characteristics of the clustering did not change, but when Node 22 was lesioned, the separation between the two languages was severely disrupted. French compared this finding to that of bilinguals with diffuse brain damage, who do not usually mix languages, and contrasted it with that of rare cases for whom a small lesion caused a loss of separation between languages (e.g., Albert & Obler, 1978).

Li and Farkas (2002) presented a self-organizing model of bilingual processing (SOMBIP) that is similar in spirit to French's model but takes a quite different approach. Its architecture is based on two coupled self-organizing maps, one of which is a semantotopic map (Miikkulainen, 1993; Ritter & Kohonen, 1990); the other represents the phonology (a phonotopic map). The system was trained on realistic bilingual speech input from the CHILDES corpus: conversations between a child (1-3 years old) and the child's native English-speaking father and native Cantonese-speaking mother. The input was analyzed for lexical co-occurrence statistics with a new technique that automatically generates a meaningful semantic representation in a way comparable to Latent Semantic Analysis (see the section "Harvesting Semantics From Texts"): Related words have more similar representations than unrelated words. When a word is encountered, the semantotopic map is presented with its semantic representation (derived earlier); simultaneously, the phonotopic map is presented with its phonological representation. Hebbian learning associates corresponding representations in the two maps while these are still emerging. Even though English and Cantonese were presented intermittently and without explicit language labeling, both the semantics and the phonology self-organized into separate regions of the maps, forming language-specific lexicons in a single integrated network. Lexical categories, such as nouns, verbs, and prepositions, organized themselves into separate regions within the language areas of the maps. The system also showed evidence of cross-language priming and interference, but the simulations were not fitted to human data.

Bilingual Phonology and Speech Perception

Hancin-Bhatt and Govindjee (1999) developed an L2 model of phonology. They used a network to explain when and why L2 learners have trouble acquiring a particular L2 phonology. They were interested in the pattern of substitutions that takes
place when, for example, German speakers of English substitute a continuant /s, z/ for the English interdentals /θ, ð/ and speakers of Hindi and Turkish use a stop /t/. Their network takes into account frequency of occurrence, and it has both a perception and a production part, so that it can be used to explain both aspects of language processing. It was trained on the L1 phonology in Hindi and Japanese and then tested on the L2 phonology of English interdentals. The pattern of errors approximated that of experimental subjects. One of their conclusions was that L1 speakers are biased in their perception of L2, and that this is the main determinant in, possibly inaccurate, feature selection.

A model of the acquisition of speech sounds in a foreign language is that of Keidel, Zevin, Kluender, and Seidenberg (2003). Their approach can be seen as an implementation of Flege's Speech Learning Model (Flege, 1995) and Best's Perceptual Assimilation Model (Best, McRoberts, & Goodell, 2001). In the Speech Learning Model, L2 speech sounds are perceived relative to existing L1 prototypes. The model predicts, among other things, that L2 speech sounds are easier to acquire if they differ phonetically from those in L1. The Perceptual Assimilation Model bears many similarities to this model, but assumes that speech perception occurs by direct perception of gestural information.

Keidel et al. (2003) presented a large number of digitized English and isiZulu CV syllables, recorded from native speakers, to a network model. The network was trained on the English speech with a generalized recurrent backpropagation algorithm (Pearlmutter, 1995), of which the Simple Recurrent Network discussed in a separate previous section is a special case. The system learned to recognize and generalize the English speech sounds well. When presented with the Zulu stimuli, these were assimilated to English phonemes in the same way as by human English speakers exposed to Zulu. Models such as this are able to make specific predictions when complex phonetic systems interact and contrastive analysis may be difficult or inconclusive.

Current and Future Developments

Compared to the thriving field of computational psycholinguistics (e.g., Dijkstra & De Smedt, 1996) and the developing subfields of models of language acquisition (Broeder & Murre, 2000) or models of bilingual processing (see Thomas & Van Heuven, chapter 10, this volume), there are still very few models of bilingual language acquisition. Considering the progress that has been made, it is possible to sketch where such models could be developed. The work on latent semantic analysis by Landauer and Dumais (1997), for example, seems very promising as a basis for a model of the bilingual lexicon, possibly using neural networks as an implementation (cf. Ritter & Kohonen, 1990) rather than matrix algebra.

Dumais, Letsche, Littman, and Landauer (1997) have already shown that their approach is able to handle multilanguage semantics. The system was trained on aligned French and English texts (i.e., translations). The languages' semantic spaces were moved into proper alignment through forced placement of selected words that were assumed to be semantically identical in both languages (e.g., names of countries). In the next training stage, texts in one language only were added to the system ("folded in"). The system was tested on cross-language retrieval: Queries in, say, English successfully retrieved texts in both English and French, including French texts that had not been presented with an aligned English translation. It will be clear that this scenario resembles semantic acquisition and translation, and it would be interesting to study the models from that perspective.

Another area in which computational models will undoubtedly contribute is the critical period debate (see chapters 2, 5, and 6, this volume). Models such as that of Keidel et al. (2003), discussed in the preceding section, may be used to elucidate the extent to which already-formed speech categories influence the acquisition of new ones. Is (biologically constrained) plasticity of the system the all-important factor, or does a firmly converged set of prototypes by itself hamper new acquisition? Can these insights be used to forge new representations, for example, by presenting synthetically exaggerated speech sounds to L2 learners, as was done by McClelland, Fiez, and McCandliss (2002)? These authors made practical use of the insights gleaned from modeling in the teaching of an L2.

In the future, the development of more of these models can tell us how to increase the acquisition speed when learning another language. Because computational models are fed with real-world data, it is even feasible to build such models into computer-assisted language learning programs, such that the learning process will be optimized for individual students. Models of bilingual language acquisition are rapidly becoming more sophisticated, providing new battlegrounds for theory and practice.
COMPREHENSION
Natasha Tokowicz
Charles A. Perfetti
Introduction to Part II
Comprehension
component processes and their relationships have been addressed in bilingual research. In particular, there is much more to say about bilingual word-level processes than higher-level comprehension processes.

Word Identification

Word identification entails lexical access through phonological and printed inputs. It is axiomatic that these inputs are linguistically specific. One hears a word with Dutch phonology or with French phonology, and a comprehender with the required language skill identifies the word accordingly. From this point, however, the details become interesting. Two of the chapters in this part relate to the study of bilingual word identification. Although it seems intuitively reasonable to skilled bilinguals that they can effectively "turn off" or attenuate one of their languages, the research by now suggests that this seldom happens. Perhaps one language can be turned down, but not quite turned off. As Dijkstra (chapter 9) demonstrates, bottom-up factors such as stimulus list composition and task demands make a difference for bilingual word recognition. Furthermore, top-down information, such as the knowledge that only one of your languages is needed for a given task, is not sufficient and can be overridden by the bottom-up information (see also MacWhinney, chapter 3).

A classic question is whether word form information for the two languages is stored together or separately. Given the above results and others, we may conclude that word form information is most likely stored in a shared way (or at least in a way that allows sufficient cross talk between the two languages; see Francis, chapter 12). As mentioned, task demands will influence whether there appears to be selective or nonselective access of word forms in the two languages.

The critical issue of how words are recognized by bilinguals recently has received much attention because of the precision available in mathematical models. Thomas and Van Heuven (chapter 10) provide a review of the two major types of computational models used in this area, localist and distributed models. Their review includes a summary of the issues that have been tackled with models; these issues include neighborhood effects, priming, and homograph/cognate effects. Although we are far from a complete model of bilingual comprehension, progress in computational modeling comes from models designed for specific problems rather than for general purposes. Bilingual word recognition has made great advances in the recent past as a result of the available models. Thomas and Van Heuven suggest that joining localist and distributed models will further our understanding of bilingual comprehension. Beyond the representational details of models, however, is the value of building competing models that address the same problems. This competition exposes basic assumptions about language processes that can be hidden when each model addresses a different problem.

Parsing

Listeners and readers must do something with the words they hear and see to construct messages. Building phrasal units from strings of words and connecting these units with each other in the way allowed by the grammar of the language is a large part of this process. How to explain parsing in the first language (L1) has proved to be difficult and contentious. How do comprehenders decide, on a word-by-word basis, how to attach a word to the current representation of a sentence? Theories that stress basic principles of simplicity and theories that stress more complex multiple constraints offer rather different solutions to this question. In the case of an L2, the question becomes even more difficult. The grammar of the L2 is not as well represented as that of the L1 in most cases. So, how does a learner of a second language go about deciding how to attach a word to a current sentence representation?

Frenck-Mestre (chapter 13) reviews some of the recent research on bilingual parsing. In particular, she considers the evidence that bilinguals use information from their L1 to process their L2. Thus, a person's L1 can indicate which particular syntactic structures will be difficult to comprehend in L2. A similar conclusion was reached by Fender (2003), who showed that Japanese and Arabic speakers of English as a second language have opposite difficulties in processing English as a result of different native language structures. The dominance of L1 syntactic structures in L2 comprehension was also evident in research by Tokowicz and MacWhinney (2002), who showed that native English speakers learning Spanish had difficulty rejecting Spanish sentences with grammatical errors when the word-by-word translation mapped directly to an acceptable English structure. Also, Tokowicz and MacWhinney (in press) found that these learners showed brain responses (measured by event-related potentials) that indicated more
sensitivity to grammatical violations in their L2 (Spanish) when the constructions were formed similarly, rather than differently, in L1 and L2. This was true despite the participants' inability to distinguish grammatically acceptable and unacceptable sentences overtly. Finally, evidence shows that nonproficient bilinguals initially comprehend L2 through an L1 lens. McDonald (1987) showed that English learners of Dutch declined in their use of word order (a valid English cue) and increased in their use of case inflection (a valid Dutch cue) to comprehend L2 sentences as their Dutch competence increased.
Semantic-Syntactic Representations
Representing meaning is central to comprehension at all levels. Word identification brings access to word meanings and their associated concepts, and parsing builds groupings of words and morphemes into phrasal units that provide both reference and semantic relationships. The result of these word identification and syntactic processes is a representation of meaning at the clausal and sentence levels. This meaning representation, corresponding to a proposition in theories of comprehension (Kintsch, 1988), can be considered the basic unit of relational meaning in a text, spoken or written.
It is our impression that there is little in bilingual research that corresponds fully to this level of analysis, although several chapters in this section focus on parts of it. For example, how words are represented in the memory of a bilingual has been a major question. Are words from the two languages stored separately in their own language or connected together by their meaning similarity? Do translation equivalents activate identical meaning representations? Are cognate translations stored differently from noncognate translations? Each of these issues is addressed in this section.
The basic answer to the first of these questions is, well, "it depends." A single pool of semantic features most likely comprises the meanings of translation equivalents. Whether translation equivalents activate exactly the same meaning may depend on the manner in which L2 was learned (e.g., in the classroom or abroad; see De Groot, 1992). However, as always, there are caveats. Generally, it seems that the differences in meaning are few and far between. For the most part, translations are just that, words that have the same meaning across languages (see Guasch, 2001; Sánchez-Casas, Suárez-Buratti, & Igoa, 1992; Tokowicz, 2000; and Tokowicz, Kroll, De Groot, & Van Hell, 2002, for more information about the consequences of imprecise meaning overlap across languages).
In answer to the question of whether cognates are stored in a special way relative to noncognates, Sánchez-Casas and García-Albea (chapter 11) conclude that there is preliminary evidence to support a special status for cognate representations. They argue that cognates are treated as morphologically related words within a language and demonstrate that they follow the same priming pattern as such words. Interestingly, Francis (chapter 12) provides evidence that translation equivalents in general are not treated as within-language synonyms.
Another factor that has been shown to influence meaning representation is age of acquisition (AoA). Izura and Ellis (2002, 2004) showed that regardless of L1 AoA, L2 words learned earlier are processed more rapidly than L2 words learned later. This pattern has been observed in several tasks, including translation recognition, lexical decision, and object naming. Thus, the age at which an L2 word is learned has an impact on the word form-to-meaning connection that is the foundation of L2 comprehension.
Text Representation and Integration (and Understanding)
Text representation and integration is an area that has received relatively little attention in the psycholinguistic literature on bilingualism and is not represented in the chapters in this part. This is true also for the level of real understanding (fifth in our list of comprehension processes), so we comment on these two together. We suspect that the neglect results from the natural focus on word- and, to a lesser extent, syntactic-level processes that are the building blocks of comprehension. In the long run, we would expect to see increased attention at least to the consequences for text representation of the lexical and syntactic processes that have been studied. Presumably, a parsing problem in reading a sentence in L2 must lead to one of two consequences: a breakdown in comprehension such that both the current sentence and subsequent sentences are misunderstood, or a reflective repair that slows the comprehension process but keeps the representation coherent. Both of these outcomes place comprehension at risk. Similarly, at the word level, does it matter downstream in the representation of sentence and clause meaning that a word read in L2 has also activated an L1 word representation for a few
milliseconds? Moreover, does sustained reading or listening to an L2 text build up some protection from this word-level interference?
Beyond these basic questions about how text-level processes might interact with lexical and parsing processes is the application of text comprehension research tools to bilingual processing. For example, computational models of comprehension (e.g., Kintsch, 1988; Van den Broek, Young, Tzeng, & Linderholm, 1999) can be sensitive to limitations in working memory, readers' knowledge and goals, and other factors that would apply to L2 comprehension as well as L1.
Individual Differences
Comprehension processes in L1 show wide-ranging individual differences in adults and children; these differences arise from such components as we reviewed above, plus others (Perfetti, 1999). Similarly, there are many individual differences that are likely to affect how one learns and processes an L2, and some of these appear to lie in L1 abilities. Michael and Gollan (chapter 19), in part III on language production and control, provide an overview of research on the effects of L1 processing skill (e.g., working memory capacity and suppression) on L2 processing. Furthermore, motivational factors can also have an impact on an individual's success in L2 learning and, ultimately, comprehension.
With recent applications of neuroimaging and electrophysiological techniques to the study of language processing, such as functional magnetic resonance imaging, positron emission tomography, and event-related potentials, we have even more methods to study bilingual comprehension. Having these added techniques, along with the advances in mathematical modeling, will undoubtedly enhance the already-rich picture of what happens during bilingual language processing. These advances will allow researchers to pose questions other than those already asked. The converging evidence from this set of increasingly diverse methods is likely to encourage the development of models of bilingual comprehension that are more complete and, at the same time, better capture the implications for general models of language comprehension that in the past have focused on monolingual experience alone.
References
Cutler, A., & Clifton, C. E. (1999). Comprehending spoken language: A blueprint of the listener. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 123–166). Oxford, England: Oxford University Press.
De Groot, A. M. B. (1992). Determinants of word translation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1001–1018.
Fender, M. (2003). English word recognition and word integration skills of native Arabic- and Japanese-speaking learners of English as a second language. Applied Psycholinguistics, 24, 289–315.
Guasch, M. (2001). Forma y significado en el procesamiento léxico de bilingües del castellano y del catalán. Unpublished master's thesis, Universitat Rovira i Virgili, Tarragona, Spain.
Izura, C., & Ellis, A. W. (2002). Age of acquisition effects in word recognition and production in first and second languages. Psicológica, 23, 245–281.
Izura, C., & Ellis, A. W. (2004). Age of acquisition effects in translation judgment tasks. Journal of Memory and Language, 50, 165–181.
Kintsch, W. (1988). The role of knowledge in discourse processing: A construction-integration model. Psychological Review, 95, 163–182.
McDonald, J. L. (1987). Sentence interpretation in bilingual speakers of English and Dutch. Applied Psycholinguistics, 8, 379–414.
Perfetti, C. A. (1999). Comprehending written language: A blueprint of the reader. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 167–208). Oxford, England: Oxford University Press.
Sánchez-Casas, R., Suárez-Buratti, B., & Igoa, J. M. (1992, September). Are bilingual lexical representations interconnected? Paper presented at the Fifth Conference of the European Society for Cognitive Psychology, Paris.
Tokowicz, N. (2000). Meaning representation within and across languages. Unpublished doctoral dissertation, The Pennsylvania State University, University Park.
Tokowicz, N., Kroll, J. F., De Groot, A. M. B., & Van Hell, J. G. (2002). Number-of-translation norms for Dutch-English translation pairs: A new tool for examining language production. Behavior Research Methods, Instruments, and Computers, 34, 435–451.
Tokowicz, N., & MacWhinney, B. (2002, April). Judging grammatical acceptability in L2: Competing grammatical systems in the second language learner. Paper presented at the Forty-Seventh Annual Meeting of the International Linguistic Association, Toronto, Canada.
Tokowicz, N., & MacWhinney, B. (in press). Implicit and explicit measures of sensitivity to violations in second language grammar: An event-related potential investigation [Special issue]. Studies in Second Language Learning.
Van den Broek, P., Young, M., Tzeng, Y., & Linderholm, T. (1999). The landscape model of reading: Inferences and the on-line construction of a memory representation. In H. van Oostendorp & S. R. Goldman (Eds.), The construction of mental representations during reading (pp. 71–98). Mahwah, NJ: Erlbaum.
Ton Dijkstra
9
Bilingual Visual Word Recognition
and Lexical Access
us. Bilingual word recognition appears to be basically language nonselective, automatic (i.e., not under control of the reader), and, although task dependent, its first processing stages might remain unaffected by nonlinguistic contextual factors.
Word Recognition and Lexical Access
Lexical access is the process of entering the mental lexicon to retrieve information about words. The mental lexicon is the database containing all words in the mind of the language user. Lexical information can be, for instance, orthographic (spelling), phonological (sound), or semantic (meaning) in kind. Word recognition can then be defined as the process of retrieving these word characteristics on the basis of the input letter string. Although these different characteristics might become active under many circumstances (for instance, phonological codes may become available automatically), particular tasks may require specific kinds of lexical information to be performed.
For instance, if one needs to decide whether a particular letter string is a word in the target language or a nonword (lexical decision), orthographic, phonological, and semantic information could in principle all be used. However, if one must name a presented word, the retrieval of its phonological information is indispensable to access the word's articulatory code. Finally, if asked to semantically categorize the object represented by the word (e.g., "Is a hammer a tool?"), the word's meaning information must be found before a response can be initiated.
Retrieving information from the mental lexicon about these characteristics of a word takes time. It may take a few hundred milliseconds to retrieve a word's meaning information. Furthermore, research in the monolingual domain indicates that the presentation of a letter string initially leads to the activation of several possible orthographic word candidates in relatively close correspondence to the input signal. For instance, it appears that all words that differ from the presented input string in only one letter position become noticeably active. Such words are called neighbors (e.g., cork is a neighbor of work).
In subsequent stages of word recognition, a more and more careful analysis of the input signal is performed, leading to a reduction of the number of possible lexical candidates and finally to the recognition of the presented word. In localist connectionist network models (see Thomas & Van Heuven, chapter 10, this volume), this viewpoint is often represented in terms of an activation process. On presentation of a letter string, a number of word candidates are initially activated, one of them the intended target word. A subsequent lateral inhibition process between word candidates leads to a reduction of the activation of nontarget candidates. Finally, the target word becomes activated the most, and it is recognized when it surpasses a recognition threshold.
Relative to the monolingual domain, two unique questions can be posed with respect to the bilingual word recognition process. The first question is whether lexical candidates from different languages that share their script are activated when a letter string is presented. For instance, is the Dutch word VORK activated on presentation of the English word PORK? The answer to this question may be "no" (implying language-specific lexical access), "yes" (implying language-nonselective access), or "it depends." This last option suggests that lexical access can be selective or nonselective depending on the circumstances. Particular tasks or experimental circumstances might induce language-specific access, for instance, by modulating the activation of representations that are or are not required for responding. The complete nonselective access view and the context-dependent nonselective access view agree in that the underlying word recognition architecture should allow language-nonselective lexical access under at least some circumstances, but they differ in their interpretation of task-dependent results.
The second question that is unique to the bilingual domain is whether language information can be used to speed up the processing of presented words. Language information could be provided by the nonlinguistic or linguistic context in which the item is presented (e.g., the instruction or stimulus list composition in an experiment or the language of a book) or by the item itself (e.g., its language membership). If information about language is provided by the context, the question is whether the word identification system can use it to reduce the number of items in the candidate set (e.g., by suppressing the activation of items from the irrelevant language). If information about language is provided by the item itself, the question is whether it is available in time to affect word recognition or arrives only after the word has already been recognized.
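The notions introduced in this section (neighbors that differ from the input in one letter position, an initially activated candidate set, lateral inhibition among candidates, a recognition threshold, and a candidate set drawn from one language or from both) can be made concrete with a small Python sketch. It is only a toy illustration under stated assumptions, not any of the models reviewed in this volume: the miniature lexicon, the language tags, the update rule, the parameter values, and all function names are hypothetical choices made for the example.

```python
# Toy sketch: orthographic neighbors and activation with lateral inhibition.
# The lexicon, language tags, update rule, and parameter values are all
# illustrative assumptions, not taken from any published model.

LEXICON = {
    "pork": "EN", "work": "EN", "cork": "EN", "fork": "EN", "word": "EN",
    "vork": "NL", "kurk": "NL",
}


def is_neighbor(a, b):
    """True if a and b have equal length and differ in exactly one letter position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def candidates(letter_string, lexicon, language=None):
    """Words identical to the input or one letter away from it.

    language=None mimics nonselective access (both languages contribute);
    language="EN" or "NL" mimics selective access to a single language.
    """
    return sorted(
        word for word, lang in lexicon.items()
        if (language is None or lang == language)
        and (word == letter_string or is_neighbor(word, letter_string))
    )


def overlap(a, b):
    """Proportion of letter positions on which two equal-length strings match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def recognize(letter_string, lexicon, language=None,
              excitation=0.3, inhibition=0.2, threshold=0.9, max_cycles=50):
    """Cycle activation until one candidate reaches the recognition threshold."""
    words = candidates(letter_string, lexicon, language)
    if not words:
        return None, 0, {}
    activation = {w: 0.0 for w in words}
    for cycle in range(1, max_cycles + 1):
        updated = {}
        for w in words:
            # Bottom-up support grows with the match to the input string.
            bottom_up = excitation * overlap(w, letter_string)
            # Lateral inhibition grows with the activation of all competitors.
            lateral = inhibition * sum(activation[o] for o in words if o != w)
            updated[w] = min(1.0, max(0.0, activation[w] + bottom_up - lateral))
        activation = updated
        best = max(activation, key=activation.get)
        if activation[best] >= threshold:
            return best, cycle, activation
    return None, max_cycles, activation


winner, cycles, final = recognize("pork", LEXICON)
print(candidates("pork", LEXICON))        # candidates from both languages
print(candidates("pork", LEXICON, "EN"))  # candidates from English only
print(winner, cycles)
```

Calling `recognize("pork", LEXICON, language="EN")` and comparing it with the nonselective run shows how the presence of a single cross-language competitor such as vork changes the time course in this toy setup; nothing about the size of that difference should be read as an empirical claim.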
In the first part of this chapter, I review the empirical evidence with respect to these two issues and follow with a brief discussion of earlier models that specifically focused on the language-selective versus nonselective access issue in the domains of reading and listening. In the second part of the chapter, I consider empirical evidence on how task demands and context factors affect the bilingual word recognition process. This part includes a discussion of more recent models and viewpoints that not only account for bilingual word recognition, but also consider its task and context dependence.
Empirical Studies: Language-Selective Access of Interlingual Homographs and Cognates
Studies that examined whether lexical candidates from different languages are activated during bilingual word recognition basically made use of two types of stimulus materials: words that are identical or very similar in meaning or form between two languages (so-called cognates and interlingual homographs), and words that exist only in one language but vary with respect to the number of similar words in the other language (interlingual neighborhood density variations). An overview of studies involving these two types of test words is presented.
Interlingual homographs are words that are identical with respect to their orthography, but not their meaning (or, most often, their phonology). Other terms used are interlexical homographs or false friends. An example is the Dutch-English word room, meaning "cream" in Dutch. Cognates are words from two languages that are identical (or very similar) in orthographic form and largely overlap in meaning. A Dutch-English example is film. Researchers have used these types of items in their research to determine if they are read by bilinguals in a different way than matched control words that occur in only one language. If reaction time (RT) differences between the two item types arise, this is probably because of their existence in two languages rather than one. Such RT differences therefore provide evidence in favor of language-nonselective access; their absence supports language-selective access.
In a number of early studies, no clear RT differences were observed between test items and controls (Caramazza & Brones, 1979; Macnamara & Kushnir, 1971; Soares & Grosjean, 1984). For instance, in a study by Gerard and Scarborough (1989), English monolinguals and Spanish-English bilinguals performed a lexical decision experiment with interlingual homographs and cognates. Stimulus words included cognates, homographic noncognates, and nonhomographic control items. Cognates and controls were either high frequency or low frequency in both English and Spanish. Homographic noncognates were high frequency in English and low frequency in Spanish or vice versa. The findings generally supported the language-selective access hypothesis. Although a significant main effect of word type was found, this was mainly caused by slow responses to homographic noncognates that were of low frequency in the target language. Word latencies varied primarily with the frequency of usage in the target language; the frequency of the word forms in the nontarget language did not affect the latencies. Finally, no significant latency differences were found between the bilinguals and monolinguals, suggesting that they were all effectively operating in a language-selective manner. In sum, this study suggested that lexical access in bilingual word recognition was restricted to only one language.
Several later studies replicated the null results under comparable experimental circumstances (De Groot, Delmaar, & Lupker, 2000, Experiment 2; De Moor, 1998; Dijkstra, Van Jaarsveld, & Ten Brinke, 1998, Experiment 1). In the first experiment from Dijkstra, Van Jaarsveld, et al. (1998), Dutch-English bilinguals performed an English lexical decision task on a list of words that included English-Dutch homographs and cognates, as well as exclusively English control words. Analogous to the earlier findings by Gerard and Scarborough for Spanish-English bilinguals, the experiment did not result in any significant RT differences for interlingual homographs relative to exclusively English control words. Although that finding again appeared to support selective access, a puzzling result was that cognates, in contrast to the homographs, did induce a significant facilitation effect. This finding was in accordance with language-nonselective access and interpreted as such by Dijkstra, Van Jaarsveld, et al. (1998). De Groot et al. (2000, Experiment 2) replicated the null results for homographs observed in Experiment 1 by Dijkstra, Van Jaarsveld, et al. with a different set of English stimulus materials and for a different sample of Dutch-English bilinguals.
However, problematic to the language-selective access view, more and more studies following Gerard
and Scarborough's work reported evidence in support of language-nonselective access, even under the experimental conditions investigated by these authors. For instance, Von Studnitz and Green (2002) found significant inhibition effects for homographs in a similar German-English experiment and suggested that a different focus of the participants on speed and accuracy or on the wordlikeness of the stimuli underlay this result. Font (2001) performed a Spanish lexical decision study involving French-Spanish bilinguals and found facilitatory effects for French-Spanish interlexical homographs that had little phonological similarity across languages.
The only way to save the selective access account, therefore, would be to assume that the evidence showing cross-language differences between interlexical homographs and control items was somehow flawed, for instance, because the item types were not really comparable or were not matched properly. However, this explanation is impossible to defend, not only because the available studies appeared to be conducted properly, but also because many other studies observed RT differences for interlingual homographs and cognates under different experimental conditions. Moreover, yet other studies observed cross-linguistic effects using different stimulus materials (see the studies on between-language neighborhood effects discussed below).
How then may the null results observed by Gerard and Scarborough (1989) and Dijkstra, Van Jaarsveld, et al. (1998) be reconciled with a language-nonselective access hypothesis? First, the null results could be a consequence of a particular combination of stimulus characteristics, stimulus list composition, and task demands. Of course, this view requires a specification of the mechanism that induced the null results.
Several proposals to this end have been made. Dijkstra, Van Jaarsveld, et al. (1998) and Grosjean (2001) interpreted the null results in terms of relative English/Dutch language activation. They noted that the experiments in question contained only purely English words and test words for which the English reading was relevant. At the same time, Dutch was the stronger native language of the participants. As a consequence, Dutch may have been activated only to a limited degree, sufficiently to induce a difference between control words and cognates, but not sufficiently to lead to a difference between controls and interlingual homographs.
Whereas the accounts by Dijkstra, Van Jaarsveld, et al. (1998) and Grosjean (2001) referred to relative language activation, De Groot et al. (2000) proposed an account that is strategic in nature. Participants would not always follow the instruction in the task ("Say yes to an English word") to the letter. On some trials, they would explicitly check the language membership of the target item to make sure that they responded to an English item (language-specific processing strategy). This would induce slower responses to homographs than to matched controls because, in a nonselective access system, not only the target reading, but also the nontarget language reading of a homograph would be activated, leading to lexical competition and slower responses. On other trials, they would not check the language membership of the item and would respond "yes" to any word they encountered. Thus, in this language-neutral processing mode the response to a homograph would be based on the availability of any reading, irrespective of language, and homographs could then be responded to faster than controls. This mixture of two processing modes would lead to a mixture of facilitation and inhibition effects for homographs, yielding an overall null result. I return to this issue in the discussion of the task dependence of bilingual word recognition studies.
Empirical Studies: Language-Nonselective Access of Interlingual Homographs and Cognates
Recent studies have demonstrated that, in spite of the observed null results for interlingual homographs, language-nonselective access indeed took place. De Moor (1998) repeated the English lexical decision study by Dijkstra, Van Jaarsveld, et al. (1998) and showed that the meaning of the Dutch reading of the interlingual homographs was apparently activated as well. She first replicated the finding of similar RTs to interlingual homographs and controls under the circumstances of Dijkstra et al.'s study. Next, on the trial after the interlingual homograph appeared, De Moor presented the English translation of the Dutch reading of the homograph. For instance, the interlingual homograph brand was followed by fire, which is the English translation of the Dutch word brand. A small but reliable translation priming effect of 11 ms was found. In a replication of this experiment with different stimulus materials, Van Heste (1999) observed a reliable 35-ms difference between
translation and control trials. These findings indicate that the Dutch word form had been activated, even though it did not affect the lexical decision latency to the homograph on the previous trial.
Furthermore, cross-linguistic effects of orthographic and semantic overlap between the different readings of cognates and interlingual homographs have been reported by many studies. Several such studies used experimental paradigms involving unmasked and masked primes. For instance, Beauvillain and Grainger (1987) had French-English bilinguals make English lexical decisions on target strings preceded by French prime words. With respect to their English reading, the homographic primes were either semantically related to the target (e.g., coin–money, where coin means "corner" in French) or unrelated. The RTs to target words were shorter for the related condition than for the unrelated condition when prime words were presented for a duration of 150 ms. Thus, although the participants knew the prime word always belonged to the French language and was (strictly speaking) irrelevant to the target decision, they were still affected by the English reading of the homographic prime.
In a Spanish-English priming study, Cristoffanini, Kirsner, and Milech (1986) compared the amount of priming that was observed when the prime was either a cognate's counterpart from the other language (obediencia, followed by obedience) or the cognate itself (obedience–obedience). Priming effects were qualitatively and quantitatively similar to those observed for inflections and derivations, suggesting that morphology and not language was the feature governing lexical organization and access (see also Sánchez-Casas and García-Albea, chapter 11, this volume). Furthermore, the effects decreased as a function of orthographic similarity.
More recent studies have masked the briefly presented primes to avoid conscious participant strategies (e.g., Bijeljac-Babic, Biardeau, & Grainger, 1997; De Groot & Nas, 1991; Sánchez-Casas, Davis, & García-Albea, 1992). In the study by Sánchez-Casas et al. (1992), Spanish-English bilinguals performed a semantic categorization task on target words preceded by masked primes with a duration of 60 ms. Prime–target pairs involved identical cognates or noncognates (rico–rico; pato–pato), translations of cognates or noncognates (rich–rico vs. duck–pato), or nonword primes combined with cognates or noncognates as targets (control condition: rict–rico vs. wuck–pato). Faster responses were obtained to targets in the identity condition than in the control condition. More important, responses to cognate translations were as fast as in the identity condition, but noncognate translations were as slow as in the control condition.
Yet other studies have shown that language-nonselective access occurs not only with respect to orthographic codes, but also for phonological codes (e.g., Brysbaert, Van Dyck, & Van de Poel, 1999; Doctor & Klein, 1992; Jared & Kroll, 2001; Jared & Szucs, 2002; Nas, 1983). Nas asked Dutch-English participants to perform an English lexical decision experiment in which half of the nonwords were cross-language pseudohomophones that looked like English, but sounded like Dutch words (according to the English spelling-to-sound rules). An example is the pseudohomophone SNAY, which sounds like the Dutch word SNEE (pronounced [snay]). The bilinguals rejected the pseudohomophones more slowly and with more errors than standard nonwords (such as PRUSK). Apparently, language-nonselective lexical access to the internal lexicon of a bilingual proceeds at least in part via nonselective phonological mediation.
Brysbaert et al. (1999) found that there is a parallel application of spelling-to-sound rules of two languages to stimulus input. In a masked priming paradigm, Dutch-French bilinguals and French monolinguals identified briefly presented French target items preceded by briefly presented and masked prime words or nonwords. The primes were French nonwords or Dutch words. French nonword primes were pseudohomophonic with the target and different in only one letter (e.g., fain–faim), pseudohomophonic with only one letter in common (fint–faim), or nonhomophonic graphemic controls (faic–faim). If the prime was a Dutch word, it was either homophonic to the French target (paar–part), a graphemic control (paal–part), or unrelated to the target (hoog–part). For the French prime–French target stimuli, the bilinguals identified fewer target words than the monolinguals, but the two groups displayed similar orthographic and phonological priming effects for the three types of nonwords. For the Dutch prime–French target stimuli, the effects of orthographic prime–target overlap were also comparable across the two groups of participants. However, with respect to phonological overlap, a different pattern emerged for bilinguals and monolinguals. Significant interlingual phonological priming effects were observed for bilinguals, but not for monolinguals.
In a study by Dijkstra, Grainger, and Van Heuven (1999), Dutch-English bilinguals performed an
English lexical decision task with English words varying in their degree of orthographic (O), phonological (P), and semantic (S) overlap with Dutch words. Their six different test conditions are exemplified by the following items: hotel (overlap in S, O, and P codes), type (SO), news (SP), step (OP), stage (O), and note (P). The first two conditions (SOP and SO conditions) consist of what are usually called cognates; the last three conditions contain interlingual homographs (OP and O conditions) or interlingual homophones (P condition). Lexical decisions were facilitated by cross-linguistic orthographic and semantic similarity relative to control words that belonged only to English. In contrast, phonological overlap led to inhibitory effects. A control experiment with American English monolinguals did not lead to systematic differences between test and control items. Because the items in this study were comparable to those in Dijkstra, Van Jaarsveld, et al. (1998), they provide a new explanation for the occurrence of null effects in the earlier study. The null effects may have been caused by mixing of two types of items (O and OP items), leading to a cancellation of O facilitation effects by P inhibition effects.

Lemhöfer and Dijkstra (2004) showed that the pattern of results varied in a systematic way when the task was changed from English lexical decision to generalized Dutch-English lexical decision. The generalized lexical decision task is a language-neutral or global variant of the lexical decision task in which bilingual participants press a "yes" button if a presented item is a word in at least one of their languages (e.g., English or Dutch) and a "no" button if it is not a word in either language. When Dutch-English bilinguals performed this task, interlingual homographs were processed faster than English control words, but about as fast as Dutch controls. Cross-linguistic phonological overlap did not affect the RTs, suggesting that participants responded primarily on the basis of the fastest available orthographic codes.

The difference between these results and those of Dijkstra et al. (1999) for the same interlingual homographs can be understood as a consequence of differences in the time-course of activation of words of the native first language (L1) and nonnative second language (L2) and the demands of the two types of lexical decision task. First, note that native language words are generally activated faster than nonnative language words. Because in the generalized lexical decision task the response could be based on the first available code, the participants probably responded mostly to the Dutch readings of the interlingual homographs. This explains why the RTs to interlingual homographs were similar to those for Dutch (L1) controls and faster than those to English (L2) controls. In the English lexical decision task, however, the target language was English, and the participant could only respond safely after verifying that the language of the presented item was English. Under these task conditions, the early available Dutch (L1) codes had time to affect the response based on the English (L2) codes. This accounts for the overlap effects (orthographic facilitation and phonological inhibition) that Dijkstra et al. (1999) observed for the interlingual homographs.

Interestingly, Lemhöfer and Dijkstra (2004) further found that cognates were recognized faster than the matched English and Dutch controls. Because at the same time the homographs (having an identical orthographic form across languages) did not show any effects (relative to Dutch controls), the effect for cognates appears to depend at least on their overlap in meaning across languages. In other words, there must have been coactivation of the cognates' semantics in both languages. In fact, it may be that cognates are represented in a special way, with a strong link between orthographic and semantic representations.

In these last two studies, the orthographic representation of cognates was identical in the two languages (e.g., film). What happens if cognates are presented that are nonidentical in the language pair the bilingual knows? For instance, are Dutch-English bilinguals affected in their recognition of the Dutch word bakker by its similarity (but nonidentity) to the English word baker? Van Hell and Dijkstra (2002) had trilinguals with Dutch as their native language, English as their L2, and French as their third language perform a word association task or a lexical decision task in their L1. Stimulus words were (mostly) nonidentical cognates such as bakker or noncognates. Shorter association and lexical decision times were observed for Dutch-English cognates than for noncognates. For trilinguals with a higher proficiency in French, faster responses in lexical decision were found for both Dutch-English and Dutch-French cognates. In other words, even when their orthographic and phonological overlap across languages is incomplete, cognates may be recognized faster than noncognates.

In a lexical decision study with French-Spanish bilinguals, Font (2001) found that cognates differing in one letter between languages (called neighbor cognates by Font) were still facilitated but
significantly less so than identical cognates. Furthermore, the amount of facilitation depended on the position of the deviating letter in the word. Neighbor cognates with the different letter at the end of the word (e.g., French texte–Spanish texto) were facilitated more than neighbor cognates with the different letter inside (e.g., French usuel–Spanish usual). In fact, facilitatory effects for the latter type of cognate disappeared, and effects tended toward inhibition when such cognates were of low frequency in both languages. Similar patterns of results were found in L1 and L2 processing.

These results make it likely that the size of RT effects observed for cognates and interlingual homographs depends on their degree of cross-linguistic overlap (also cf. Cristoffanini et al., 1986). Note that it follows logically that, for language pairs that do not share orthography (e.g., Chinese and English), no orthographically similar word candidates can be activated, but effects of phonological similarity and semantic overlap might still occur (Bowers, Mimouni, & Arguin, 2000; Gollan, Forster, & Frost, 1997).

Other studies showed that, in tasks with L1 target words, effects of L2 competitors can be obtained as well in mixed stimulus lists (Dijkstra, Timmermans, & Schriefers, 2000) and even in completely blocked (L1) lists (Van Hell & Dijkstra, 2002). Observed effects of L2 on L1 are often smaller than those of L1 on L2, but this appears to be because of the relative strength of the two languages and is therefore also dependent on L2 proficiency (cf. Jared & Kroll, 2001). To conclude, many studies support the language-nonselective access hypothesis with respect to form (orthographic and phonological) as well as semantic representations.

Empirical Studies: Orthographic and Phonological Neighborhood Effects

Perhaps the strongest results in favor of nonselective access concern experiments that used so-called neighbors as stimulus materials. As indicated in the Introduction, an orthographic neighbor is any word that differs from the target word by a single letter while preserving length and letter position (Coltheart, Davelaar, Jonasson, & Besner, 1977). For instance, work and cord are both neighbors of cork. Monolingual word identification and word naming have been shown to be sensitive to the number of orthographic neighbors (neighborhood density) of the target words and to the frequency of such orthographically similar words (neighborhood frequency). In bilingual studies, effects of number of orthographic neighbors were used as indexes of the relative influence of nontarget language words on target word recognition in different experimental tasks and conditions. Target words themselves belonged only to one language (i.e., there were no interlingual homographs, homophones, or cognates in the stimulus list).

In an English lexical decision task performed by Dutch-English bilinguals, Grainger and Dijkstra (1992) found that English words with many neighbors in Dutch (the nontarget language) were harder to recognize than neutral words with approximately the same number of neighbors in two languages, which were in turn harder to recognize than English words with more neighbors in their own language. Thus, RTs to items existing exclusively in one language were affected by the number of similar words from another language.

Van Heuven, Dijkstra, and Grainger (1998) manipulated the number of orthographic neighbors of the target words in the same and the other language of the bilinguals in a series of progressive demasking and lexical decision experiments involving Dutch-English bilinguals. Increasing the number of Dutch orthographic neighbors systematically slowed RTs to English target words. Within the target language itself, an increase in neighbors consistently produced inhibitory effects for Dutch target words and facilitatory effects for English target words. Monolingual English readers also showed facilitation because of English neighbors, but no effects of Dutch neighbors.

Simulations with a computer model of bilingual word recognition (the bilingual interactive activation [BIA] model; see Thomas & Van Heuven, chapter 10, this volume) suggest that the opposite effects of English (facilitation) and Dutch (inhibition) neighbors may be caused by differences in the specific organization of the English and Dutch lexicons. Whatever the correct explanation may be, the most important point is that neighbors from both the same and the other language are activated during the presentation of a target word. This provides evidence that, with respect to orthographic codes, the lexicon of bilinguals is integrated and nonselective in nature.

Jared and Kroll (2001) showed that the same conclusions hold for the phonological part of the bilingual lexicon. In a word naming study, they observed cross-linguistic effects of phonological word body neighbors, effects that could only have arisen during the word identification process. Word body neighbors are words that share their medial
vowels plus final consonants (word body) with the target word. For example, save and wave are body neighbors. In four experiments involving English-French and French-English bilinguals, Jared and Kroll tested if word naming in the target language (e.g., bait in English) was slowed by the existence of word body neighbors with different pronunciations in the nontarget language (e.g., fait in French). Participants named blocks of English test words that preceded and followed blocks of French filler words. For the first-presented English words, cross-language interference effects were obtained in French-English bilinguals, for whom French was a more dominant language than English. Such effects were not observed for English-French participants, whose L1 was English. For them, nontarget language spelling-to-sound correspondences were apparently more weakly activated than the target language spelling-to-sound correspondences. After a switch from naming in another language (French), spelling-to-sound correspondences from both the bilingual's languages appeared to be activated across blocks, depending on language fluency. Less-fluent English-French participants showed effects of French spelling-to-sound rules in English naming afterward if they had been presented with the enemies themselves, but not with other items having the same word bodies.

These neighborhood studies, involving target words that occur exclusively in one language, indicate that the bilingual lexicon is integrated in nature for language pairs such as Dutch-English and French-English that have a common script. At the same time, they also show that word candidates from both languages of the bilingual are activated in parallel and therefore also support a language-nonselective access hypothesis.

To conclude, the studies reviewed in the present and the previous sections provide an answer to the first question posed in the Introduction, whether lexical candidates from different languages sharing their scripts are activated when a letter string is presented. The answer appears to be an unqualified yes. In the next section, the second question posed in the Introduction is considered: Can language information be used to speed up the processing of presented words?

Language Information and Bilingual Word Recognition

Bilinguals know, of course, to which language a particular word belongs. This kind of information must be stored in the bilingual's mental lexicon for each word. It has been referred to as a language tag or a language node. Very little is known about such tags or nodes. Two representational possibilities are that the language information pertaining to an item is retrieved via the form (orthographic or phonological) representation of an item or via its lemma, a more abstract syntactic/semantic representation. Possibly, each word has its own separate language tag; alternatively, all words of one language may share their language tag.

An interesting question is at which moment in time language information becomes available relative to word identification. If such information is available soon enough, it might help to speed up word recognition by excluding lexical candidates from the nontarget language. For instance, if the task is to respond to English words, all word candidates that are not English could be excluded from consideration. Furthermore, if language information from the context is able to affect the speed of word recognition, then bilinguals might be slower to recognize a target word if it is preceded by an item of a different language relative to an item of the same language as the target.

Dijkstra, Timmermans, et al. (2000) examined the role of language information contained in the item itself. In three experiments, each with a different instruction, bilingual participants processed the same set of homographs embedded in identical mixed-language lists. Homographs of three types were used: high frequency in English and low frequency in Dutch; low frequency in English and high frequency in Dutch; and low frequency in both languages. In the first experiment (involving language decision), one button was pressed when an English word was presented, and another button was pressed for a Dutch word. In the second and third experiments, participants reacted only when they identified either an English word (English go/no go) or a Dutch word (Dutch go/no go), but they did not respond if a word of the nontarget language (Dutch or English, respectively) was presented. The overall results in the three experiments were similar to those obtained by Dijkstra, Van Jaarsveld, et al. (1998, Experiment 2) for lexical decision. In all three tasks, inhibition effects arose for homographs relative to one-language controls. Even in the Dutch go/no go task for Dutch-English bilinguals performing in their native language, participants were unable to completely exclude effects from the nontarget language on homograph identification. More important for the present discussion, however, is the finding that target language homographs
were often overlooked, especially if the frequency of their other language competitor was high. In the Dutch go/no go task, participants did not respond to low-frequency items belonging to their native language in about 25% of the cases. Inspection of cumulative distributions showed that if they did not respond after about 1,500–1,600 ms, they did not respond anymore within the time window of 2 s. The observed flattening of the cumulative distribution toward an asymptotic value suggests that recognition of the homograph reading from the nontarget language in some way prohibited the subsequent recognition of the target language reading (e.g., after recognition, all other lexical candidates may be suppressed). Thus, selection of one of the readings of the interlexical homographs takes place rather late during processing.

It is clear that the system must at some time arrive at a selection of one lexical item only, but apparently the role played by the language of that item in aiding selection is only minor. In fact, determination of the language of the item may depend on lexical selection having taken place. In addition, it does not seem possible to discard the homograph reading from the nontarget language and focus on the target reading only on the basis of the instruction that just the target language needs a response. One reason for this may be a tendency that the word, rather than its language label, triggers the response.

A number of studies have investigated whether the language of the previous item in a list can affect the recognition of a target word (e.g., Kolers, 1966; Macnamara & Kushnir, 1971). Such list effects could arise in two ways. First, after a target item in a list is recognized, it might leave behind a trace of activation of the language that it belongs to until the subsequent trial, thereby affecting the processing of the next item. For instance, a language switching effect would arise if on trial t an English word activates the English language tag, and on trial t + 1, this language tag feeds back activation to all English words or if it then inhibits all words from the Dutch lexicon.

Second, a switch effect could be observed if the decision process on the target item would be slightly changed because of the language of the previous trial or because a task switch occurred between trials as well. In line with the first view, Von Studnitz and Green (1997) found that bilinguals' RTs on switch trials were a significant 17 ms slower than on nonswitch trials in a (German-English) generalized lexical decision task. Furthermore, in a generalized lexical decision experiment by Van Heuven et al. (1998), significant language switching effects (on the order of 30–35 ms) occurred as well.

However, other research by Von Studnitz and Green (1997, 2002) and by Thomas and Allport (2000) indicated that task/decision switches generally may be much larger in size than language switches. Indeed, in a combined event-related potential and RT study involving a generalized lexical decision task on triplets of items, De Bruijn, Dijkstra, Chwilla, and Schriefers (2001) did not find any effects of prime language on the activation of the two readings of interlingual homographs. These results suggest that context language as such does not operate as a very effective factor for lexical selection in stimulus lists.

The relationship between lexical selection and language selection might be different for items for which the language membership could in principle already be determined before recognition because of the presence of language cues in the items themselves. The items could contain, for instance, language-specific bigrams or diacritical markers. In such cases, lexical search might be limited to the relevant target language from the very beginning. On the basis of earlier-presented evidence (the neighborhood studies), it is assumed that information contained in the signal is an important early determiner of the set of lexical candidates that is initially activated. Thus, it seems likely that language-specific bigrams or diacritical markers are critical in this respect, and that because of their presence, the initial set of lexical candidates activated may indeed become restricted to one language (see Mathey & Zagar, 2000, p. 200, for relevant monolingual data). However, I deem it unlikely that, for words that are well known by the bilingual, this kind of information is often used in a top-down way (i.e., the bilingual notices a particular bigram that is unique to a language and then uses this information for his or her language decision) because the automatized bottom-up recognition process will usually be much faster. (Note that the presence of cross-linguistic phonological effects in Chinese-English bilinguals indicates that they are not using script differences as a language cue to restrict their lexical selection process.)

To conclude, the empirical evidence collected so far suggests that language information associated with the presented lexical item or provided by the list context cannot be used to any great extent to speed up the processing of the target item.
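
The first account of list effects described in this section, in which a language tag activated on trial t still carries activation on trial t + 1, can be made concrete with a small toy calculation. The sketch below is only an illustration under stated assumptions: the decay rate, the feedback and inhibition weights, and the trial sequence are hypothetical values chosen for readability, not parameters taken from any of the studies or models cited here.

```python
# Toy illustration of language-tag carry-over between trials.
# All parameter values are hypothetical and chosen only for illustration.

DECAY = 0.5        # share of tag activation that survives to the next trial (assumed)
TAG_BOOST = 1.0    # activation a recognized word sends to its language tag (assumed)
FEEDBACK = 0.2     # facilitation per unit of same-language tag activation (assumed)
INHIBITION = 0.2   # inhibition per unit of other-language tag activation (assumed)


def run_list(languages):
    """Return the net top-down bias each target receives from earlier trials."""
    tags = {"English": 0.0, "Dutch": 0.0}
    biases = []
    for lang in languages:
        other = "Dutch" if lang == "English" else "English"
        # Bias on the current target from tag activation left by earlier trials.
        biases.append(FEEDBACK * tags[lang] - INHIBITION * tags[other])
        # After recognition, both tags decay and the current tag is boosted.
        for name in tags:
            tags[name] *= DECAY
        tags[lang] += TAG_BOOST
    return biases


TRIALS = ["English", "English", "Dutch", "Dutch"]  # hypothetical mixed list
biases = run_list(TRIALS)
for lang, bias in zip(TRIALS, biases):
    print(f"{lang:8s} bias = {bias:+.2f}")
```

With these arbitrary settings, a target that repeats the language of the previous trial receives a positive bias and a target on a switch trial receives a negative one. The sketch says nothing about the size of such effects in real data, which the studies reviewed above suggest is small relative to task or decision switches.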
Bilingual interactive activation model (BIA; Dijkstra & Van Heuven, 1998)
Resting level activation of words reflects the state of language activation as well as proficiency
Stimulus list composition (previous items) affects activation state of word forms
Participant expectations do not exert strong effects on the activation state of words
Top-down inhibition effects on the non-target language arise via language nodes
Identification and decision levels interact
as a characterization, not of bilingual participants, but of items in the bilingual's mental lexicon that are more or less well known (cf. De Groot & Comijs, 1995). This led to the suggestion that there might be two ways in which the equivalent of a word in another language could be determined: via concept mediation or via word association. Take, for instance, the Russian word kniga. On presentation of the Cyrillic letter string corresponding to this word, an orthographic representation might be activated that itself would activate the corresponding meaning representation. In the next step, this meaning activation could activate the English word form book. This translation procedure would be called concept mediation. However, a second possibility would be that the Russian word form kniga would directly activate the English word form book either via a direct orthographic connection (between scripts) or through a phonological connection. This translation route would be referred to as word association.

Potter, So, Von Eckardt, and Feldman (1984) contrasted the two routes experimentally by comparing bilingual performance in word translation and in picture naming. For picture naming, they assumed that access to the meaning representation on the basis of the picture was necessary before the name of the picture became available. A model assuming that the word association route is always followed predicts that translation from L1 to L2 will be faster than picture naming in L2. This is because translation can be done through a direct L1-L2 connection (one step), but the picture needs to be turned into a concept and then into the L2 word (two steps). In contrast, a pure concept mediation model predicts that the translation into L2 and picture naming in L2 will take about equally long because both tasks require a retrieval of the concept before the L2 word form can be retrieved (two steps in both routes; in a sense, the word form could be considered as a picture). In a group of highly fluent Chinese-English bilinguals, clear evidence was found in favor of the concept mediation model. Picture naming and translation in L2 led to comparable results (if anything, picture naming was faster rather than slower than translation). The same result held for a less-proficient group of English-French bilinguals.

In a number of later studies, Kroll and colleagues found evidence that early (novice) bilinguals appeared to use the word association link; more proficient bilinguals showed more evidence in favor of concept mediation. As one example, Kroll and Curley (1988) replicated Potter's study with more varied groups of bilinguals. Participants who had studied the L2 for less than 2 years showed results in correspondence with the word association model; participants with more language experience behaved in line with the concept mediation model.

On the basis of this study and others, Kroll and Stewart (1994) proposed the revised hierarchical asymmetric model, which combines the two translation routes in an elegant way. According to this model, translation from L1 into L2 requires concept mediation, in other words, two processing steps: L1 → C, C → L2. As a consequence, it takes more time than the translation from L2 into L1, which is lexically mediated and proceeds via word association (one direct link from L2 to L1). Research with respect to the revised hierarchical model has focused on the extent of asymmetry in the connections between L1 and L2 and on the degree of L2 proficiency needed to obtain a gradual shift from word association to concept mediation (because of an increasingly strong link of L2 word forms to concepts through L2 use).

More recent evidence with respect to these issues has not always been consistent, suggesting that additional factors may play a role (e.g., the nature of the mapping between the languages). For instance, although Sholl, Sankaranarayanan, and Kroll (1995) presented evidence from picture naming and translation in support of asymmetric effects, some other studies reported symmetric effects (De Groot & Poot, 1997; La Heij, Kerling, & Van der Velden, 1996). Furthermore, Kroll, Michael, Tokowicz, and Dufour (2002) found that translation from L1 to L2 (assumed to be conceptually mediated) changed more in the course of acquisition than translation in the other direction. At the same time, other studies obtained evidence supporting the presence of conceptual mediation in early bilinguals (Altarriba & Mathis, 1997; Van Hell & Candia Mahn, 1997). See also Kroll and Tokowicz's chapter 26 of this volume for additional discussion of these issues.

The Bilingual Interactive Activation Model and Bilingual Visual Word Recognition

The BIA model (Dijkstra & Van Heuven, 1998; Van Heuven et al., 1998) is an implemented localist-connectionist model of bilingual visual word recognition. Here, it is discussed only in general terms (for more details on the BIA model, including a visualization, see Thomas & Van Heuven,
chapter 10, this volume). The language-nonselective access model distinguishes four hierarchically organized levels of different linguistic representations: letter features, letters, words, and language tags (or language nodes). When a word is presented to the model, first the features of its constituent letters are registered (activated). Next, letter features activate the letters of which they are part for each letter position in the presented word. These letters in turn activate the words of which they are part in any language. Word candidates activate the language tag to which they are connected and simultaneously feed activation back to the letter level. Word candidates and letters also inhibit other word candidates and letters, respectively (lateral inhibition). Language nodes inhibit the activation of word candidates from another language (e.g., the English language node reduces the activation of Dutch word candidates). After a complex interactive process of activation and inhibition, the lexical candidate corresponding to the presented word becomes the most active word unit.

The BIA model assumes that the resting level activation of words from different languages reflects the subjective frequency of the words, that is, the number of times that the language user has encountered or used them. This level is therefore dependent on the L1 and L2 proficiency of the bilingual. In addition, the resting level activation of words depends on their recency of use. If a word has not been used for a while, its resting level activation may slowly decrease further.

The language nodes provide a potential mechanism in the BIA model through which (list and sentence) context effects can operate (Dijkstra, Van Heuven, & Grainger, 1998). If context would affect the relative activation of the language nodes, the subsequent suppression by these nodes of words from another language may change the relative activation state of words from different languages. For instance, in an English language context, activation of Dutch words could be inhibited or partially suppressed.

Although in the BIA model only orthographic representations are implemented, the model has been shown to account for a variety of empirical effects in the domain of word recognition: neighborhood density effects within and between languages, shifting neighborhood effects across an experiment, masked priming effects in bilinguals (Bijeljac-Babic et al., 1997), L2 proficiency differences in masked priming with bilinguals, effects for interlingual homographs in a go/no go task (Dijkstra et al., 1998a), and (verbally) language of previous item effects (Dijkstra, Van Jaarsveld, et al., 1998). In spite of this list, there are many aspects of bilingual word recognition that are not fully accounted for by the model. For instance, there are no phonological or semantic representations in the model, the representation of interlingual homographs and cognates is underspecified, the language node concept is not without problems, and task and context effects are not described in any detail (the BIA+ model, discussed separately, fills in some of these theoretical gaps).

The Bilingual Interactive Model of Lexical Access and Bilingual Auditory Word Recognition

With respect to spoken word recognition by bilinguals, Léwy and Grosjean (1997) have proposed a model that at first sight is rather similar to the BIA model, but nevertheless makes a number of different assumptions. This model, the bilingual interactive model of lexical access (BIMOLA), is also a localist-connectionist model. A visualization of the model is given by Thomas and Van Heuven in chapter 10 of this volume.

Similar to BIA and the monolingual TRACE model for auditory word recognition (McClelland & Elman, 1986), BIMOLA consists of three levels of nodes. First, an auditory input word activates phonological features, which are shared by the two languages. Second, features activate associated phonemes, which are organized to some extent in independent subsets for each language, but are part of a larger system. Within the phoneme level, there is subset activation and lateral inhibition. Subset activation implies that when a phoneme in a given language is activated, it sends a small positive signal to other phonemes in the language subset (indicating that the language in question is probably relevant to the situation at hand). At the same time, phonemes exert an inhibitory influence on other phonemes of the same language (lateral inhibition within the subset). Finally, phonemes activate words of which they are part. The word level is organized similar to the phoneme level, allowing subset activation and lateral inhibition. Between levels, units can be activated from both the bottom up and the top down. Furthermore, the word level receives top-down preactivation from external information, for example, reflecting the language mode (activation state of the two languages) of the bilingual and higher linguistic information from syntactic or semantic sources.
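
As a point of reference for comparing the two models, the word-level dynamics that the BIA model is described as having (bottom-up letter support, lateral inhibition among words, and language nodes that suppress words of the other language) can be sketched in a few lines. This is a deliberately reduced illustration and not the implemented BIA model: the feature level, word-to-letter feedback, and resting levels are left out, and the seven-word lexicon and all parameter values are hypothetical.

```python
# Minimal BIA-style sketch: word units receive bottom-up letter support,
# inhibit one another laterally, and are suppressed by the language node of
# the other language. Lexicon and parameter values are hypothetical.

LEXICON = {
    "cork": "English", "work": "English", "cord": "English", "word": "English",
    "vork": "Dutch", "werk": "Dutch", "kerk": "Dutch",
}
RATE = 0.1         # step size per cycle (assumed)
LATERAL = 0.1      # word-to-word lateral inhibition (assumed)
LANG_INHIB = 0.1   # language-node inhibition of other-language words (assumed)


def letter_support(word, stimulus):
    """Proportion of letter positions at which the word matches the stimulus."""
    return sum(a == b for a, b in zip(word, stimulus)) / len(stimulus)


def simulate(stimulus, cycles=10):
    """Run a fixed number of update cycles and return word activations."""
    act = {w: 0.0 for w in LEXICON}
    for _ in range(cycles):
        total = sum(act.values())
        # Each language node collects the activation of its own words.
        node = {"English": 0.0, "Dutch": 0.0}
        for w, lang in LEXICON.items():
            node[lang] += act[w]
        new = {}
        for w, lang in LEXICON.items():
            other = "Dutch" if lang == "English" else "English"
            net = (letter_support(w, stimulus)
                   - LATERAL * (total - act[w])
                   - LANG_INHIB * node[other])
            # Keep activations within the range 0 to 1.
            new[w] = min(1.0, max(0.0, act[w] + RATE * net))
        act = new
    return act


final = simulate("cork")
for word, a in sorted(final.items(), key=lambda kv: -kv[1]):
    print(f"{word} ({LEXICON[word]}): {a:.3f}")
```

In this toy run, candidates from both languages become active when the English string cork is presented, which is the sense in which access is language nonselective. The Dutch candidate vork ends up somewhat less active than the English candidate work, although both share three letters with the stimulus, because more activation accumulates on the English language node than on the Dutch one.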

A comparison of BIMOLA to the BIA model reveals some important differences between the two models. First, in contrast to the BIA model, BIMOLA assumes the presence of subset activation, making it possible that word candidates of one language (the base language) become active before words from the other language. This allows the auditory word identification system to function rather language selectively under certain circumstances. Second, BIMOLA assumes there can be top-down effects from higher-level information sources on word activation.

If these assumptions are correct, the processes and mechanisms underlying bilingual auditory and visual word recognition would operate somewhat differently. However, although undoubtedly differences in input and modality characteristics must have their consequences for processing, the abstract organization of the processing systems for the two modalities may be more similar than expected (e.g., Schulpen, Dijkstra, Schriefers, & Hasper, 2003). In other words, some of the mentioned differences may arise from researcher disagreement rather than from actual processing.

Task Dependence of Bilingual Word Recognition Results

The empirical studies and models reviewed above are in support of the view that lexical access is nonselective under many circumstances. However, in the introduction the possibility was suggested that even if access to the identification system is basically nonselective in nature, particular circumstances might allow it to operate in a language-selective way. It is therefore important to consider the extent to which the result patterns observed in different experimental situations are task dependent and how such context dependence comes about.

There is clear evidence in the literature that task demands can affect bilingual performance to a considerable extent (e.g., De Groot et al., 2000; Dijkstra, Timmermans, et al., 2000; Dijkstra, Van Jaarsveld, et al., 1998; Thomas & Allport, 2000). Dijkstra and Van Heuven (1998) compared the results of a number of different tasks and found that some of these tasks showed functional overlap, that is, they led to similar result patterns, whereas others were more different. For instance, when the same stimulus materials were used, lexical decision and progressive demasking results were often highly correlated, but language decision sometimes led to a much lower correlation with these other tasks. To clarify the nature of the cross-task comparison in this study, progressive demasking is a task in which the target stimulus gradually becomes visible because the duration of a mask is reduced and that of the target increased. In language decision, in contrast, bilingual participants press one button if a visually presented word belongs to one target language (e.g., English) and another one if it belongs to the other target language (e.g., Dutch). De Groot, Borgwaldt, Bos, and Van den Eijnden (2002) investigated in more detail what may be the common and different mental processes underlying different tasks such as (delayed) word naming, lexical decision, and perceptual identification in L1 and L2. An important conclusion based on these studies is that it may be inappropriate to talk about bilingual word recognition in general (i.e., without specifying the precise task and experimental circumstances under which it takes place) because performance is so task and context dependent.

In a series of closely related experiments, Dijkstra, Van Jaarsveld, et al. (1998) examined the effects of task demands and stimulus list composition on bilingual word recognition in some detail. In the section on language-selective access, the first experiment of this study was described. In an English lexical decision task including English-Dutch homographs and cognates as well as exclusively English control words, Dutch-English bilinguals responded about equally fast to interlingual homographs and exclusively English control words (and faster to cognates). In the second experiment of this study, exclusively Dutch words were added to the stimulus list, but the task remained English lexical decision. Participants had to respond "no" to these items because they were not real English words. In this experiment, strong inhibition effects arose for interlingual homographs relative to control items, especially for homographs that were low frequency in English and high frequency in Dutch. Dijkstra et al. explained the inhibition effects in Experiment 2 as the result of a frequency-dependent competition between the two readings of the homographs for which the participants could not ignore the nontarget language reading of the interlingual homograph.

If there was a race to recognition between the two readings, the inhibitory effects should be able to turn into facilitation effects using the appropriate task instructions. This hypothesis was tested in a generalized lexical decision task (Dijkstra, Van Jaarsveld, et al., 1998, Experiment 3). Here, participants responded as soon as either of the two
readings of a homograph became available. As the result of a race between the two readings of the homograph, a facilitation effect of homographs relative to their matched monolingual controls should arise. In addition, the degree of facilitation observed should be a function of the frequency of both the English and the Dutch readings of the homograph. The largest benefit to the RT relative to matched English controls should now be observed for the homographs with low-frequency English and high-frequency Dutch readings. This was indeed what happened. Given the instruction "Say yes to English and/or Dutch words," participants were able to use either reading of an interlingual homograph to speed up their decision process. Furthermore, the presence of both English and Dutch word frequency effects suggests that participants reacted on the basis of the first available reading of the homograph. Therefore, the homographs that benefited most from the change in instruction in Experiment 3 were those that suffered most in Experiment 2: homographs with a low-frequency English reading and a high-frequency Dutch reading.

Figure 9.1 shows the differences between homographs and matched controls for the different frequency categories in the three experiments. It will be clear to the reader that the result patterns change across experiments in a systematic yet complex way. The result patterns obtained in experiments are not necessarily direct reflections of the underlying identification system, but depend on a complex interaction among this system, the requirements of the task to be performed, and stimulus list composition. The presence of strong inhibition effects when stimulus list composition was changed from exclusively English to mixed English-Dutch (Dijkstra, Van Jaarsveld, et al., 1998, Experiment 2) indicates that task demands (top-down sources) could not easily reduce the parallel activation of words from the two languages (bottom-up sources, Experiment 1). Otherwise, participants would have "switched off" the Dutch lexicon because it hindered task performance. At the same time, the change from inhibition to facilitation effects when the task changed from English lexical decision to generalized lexical decision indicates that participants could exploit such parallel activation to speed up their response (Experiment 3).

Dijkstra, De Bruijn, Schriefers, and Ten Brinke (2000) performed an English lexical decision experiment that combined features of Experiments 1 and 2 by Dijkstra, Van Jaarsveld, et al. (1998). Prior to the experiment, the participants were explicitly instructed that they would encounter Dutch words requiring a "no" response, but such items were presented only in the second part of the experiment. No significant RT differences were found between the interlingual homographs and matched English control items in the first part of the experiment. However, strong inhibitory effects for interlingual homographs relative to control words were observed in the second part. Examination of the transition from part 1 to part 2 showed that, as soon as Dutch items started to come in, the RTs to interlingual homographs were considerably slowed compared to control words. The results in the two parts of the experiment mimicked Experiments 1 and 2 in the study by Dijkstra et al. In contrast to earlier experiments, the instruction of the present experiment clearly indicated that Dutch words requiring a "no" response might appear. Nevertheless, no inhibition for interlingual homographs was obtained in part 1. Apparently, the participants' performance was not affected by whether the instruction did or did not mention the possibility that Dutch words would be presented.

These results not only show quite clearly that participants are sensitive to the demands of the task, but also suggest that various tasks affect the process of word recognition itself only to a limited extent. In the next section, it is suggested that the output of the identification system is used as input to a task/decision system.

How a Dutch-English Bilingual Recognizes an Interlexical Homograph in English Lexical Decision

On the basis of the empirical evidence discussed in the previous sections, now I present a detailed illustration of the bilingual word recognition process as it takes place in a specific task context. Assume that Dutch-English bilinguals perform the English lexical decision experiment by Dijkstra, De Bruijn, et al. (2000), discussed in the section on task dependence. The participants are instructed to respond "yes" to all words that had an English reading (including homographs) and "no" to Dutch words and to nonwords. To give the correct response, the participant must somehow relate (or bind) the "yes" response (e.g., pressing the right button) to English words and the "no" response to Dutch words and nonwords.

Figure 9.1 Result pattern of Experiments 1–3 in the work of Dijkstra, Van Jaarsveld, and Ten Brinke (1998). Experiment 1: English lexical decision without exclusively Dutch words; Experiment 2: English lexical decision including exclusively Dutch words; Experiment 3: generalized lexical decision (including exclusively Dutch words). HFE-HFD, high-frequency English–high-frequency Dutch; HFE-LFD, high-frequency English–low-frequency Dutch; LFE-HFD, low-frequency English–high-frequency Dutch; LFE-LFD, low-frequency English–low-frequency Dutch.
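
The race account behind Experiment 3 (generalized lexical decision) can be made concrete with a small sketch. It is an illustration only, not the authors' model: the linear relation between log frequency and recognition time, the parameter values, and all function names are assumptions chosen for the example. Each reading of a homograph is assumed to become available at a time that decreases with its frequency, the response is triggered by whichever reading finishes first, and the control word is matched on English frequency.

```python
import math

# All numbers below are hypothetical illustration values, not estimates from the study.
BASE_MS = 700.0   # assumed time for a reading with a frequency of 1 per million
SLOPE_MS = 60.0   # assumed speed-up per tenfold increase in frequency
HIGH_FREQ, LOW_FREQ = 100.0, 5.0  # assumed occurrences per million


def reading_time(freq_per_million):
    """Assumed time (ms) until one reading of a word becomes available."""
    return BASE_MS - SLOPE_MS * math.log10(freq_per_million)


def generalized_decision_time(english_freq, dutch_freq):
    """Race: the response is triggered by whichever reading finishes first."""
    return min(reading_time(english_freq), reading_time(dutch_freq))


def facilitation(english_freq, dutch_freq):
    """RT benefit (ms) relative to a control matched on English frequency."""
    control_time = reading_time(english_freq)
    return control_time - generalized_decision_time(english_freq, dutch_freq)


# The four frequency categories of Figure 9.1 as (English, Dutch) frequencies.
CATEGORIES = {
    "HFE-HFD": (HIGH_FREQ, HIGH_FREQ),
    "HFE-LFD": (HIGH_FREQ, LOW_FREQ),
    "LFE-HFD": (LOW_FREQ, HIGH_FREQ),
    "LFE-LFD": (LOW_FREQ, LOW_FREQ),
}

effects = {name: facilitation(e, d) for name, (e, d) in CATEGORIES.items()}

for name, benefit in effects.items():
    print(f"{name}: {benefit:.1f} ms faster than control")
```

Under these assumptions the benefit is largest for the LFE-HFD homographs, because only there does the Dutch reading win the race by a wide margin. The sketch is deterministic, so the other categories show no benefit at all; adding trial-to-trial variability to each reading would be needed to approximate a graded pattern, and the sketch says nothing about the inhibition found in Experiment 2.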

The upper part of Fig. 9.2 shows how this may be done in the initial phase of the experiment. For a clear presentation, the figure does not make a distinction among orthographic, phonological, and semantic codes for words, but as this review has indicated, different codes do play a role in recognition. Further, note that the "no" response to Dutch words involves a different decision process than that to nonwords (not indicated). Dutch words will generate considerable activity in the mental lexicon, and the "no" response can only be given after the language membership of the words is retrieved. In contrast, nonwords will induce much less activity in the lexicon than words, and the "no" response will be initiated when no word has been recognized after a temporal deadline has passed.

In the first half of the experiment, no exclusively Dutch words are presented to the participants. In other words, the response binding of Dutch words to the "no" response is not strengthened. As a consequence, participants will respond to interlexical homographs about as quickly as to monolingual English words because the English reading of the homograph elicits the "yes" response. The only effects that will be observed are those arising from interactions between representations in the lexicon (cf. the study by Dijkstra et al., 1999). Because the Dutch reading is not strongly connected to the "no" response, by itself it contributes little to the response. Note that the participants may respond to the presence of a word rather than to the presence of an English word. Indeed, the information that the word that is recognized is English might come in only after the response has already been initiated. If the language check is done only after response initiation or execution, the response will still always be correct. It may be that the participants notice that, in fact, they responded to the wrong reading of an interlexical homograph, namely, the Dutch reading. This might not only speed up the response in the trial in question (relative to controls), but also would make the participants more careful during the next time a homograph was presented, perhaps slowing them at that time relative to controls.

In the second half of the experiment, Dutch words are interspersed in the stimulus list. If the first word is presented, participants are often too late to check the language membership of the target item, so they respond "yes" to the Dutch word and make an error. They realize this as soon as they have retrieved the language membership of the item and then become more careful in subsequent trials. In fact, they now strengthen the relationship between the Dutch word and the "no" response, or they slow down to make sure their decision is also based on language membership information (for evidence that such modulation is possible, see Von Studnitz & Green, 2002).

Because the response to English controls remains relatively constant across the two parts of the experiment, the first option seems to be the more likely one. The lower part of Fig. 9.2 represents this option graphically. The strengthening of the stimulus-response binding for Dutch words has severe consequences for the interlingual homographs. When an interlingual homograph is presented, the Dutch reading now interferes strongly with the English reading because both the "no" response and the "yes" response are activated. In other words, response competition leads to much slower RTs to interlingual homographs than to controls (of course, there may be other reasons for such results as well).

Task and Context Effects in Bilingual Word Recognition Models

The previous sections have shown that bilingual word processing depends on the task that must be performed and the nonlinguistic and linguistic contexts in which it is performed. A number of more recent bilingual word recognition models incorporate the distinction between the actual word identification system and a task/decision system that I argued for on the basis of the empirical evidence. These models are discussed in the remainder of the chapter. Some of their basic characteristics are summarized in Table 9.1.

The Inhibitory Control Model and the Task/Decision System

Green (1986, 1998) developed a model that is to a large extent compatible with and complementary to the BIA model and the revised hierarchical model (also see chapters 17, 22, and 25, this volume). Rather than on the process of item identification itself, this inhibitory control (IC) model focuses on the importance of the demands posed by different tasks and the control (regulation) that language users can exert on their language processing by modifying levels of activation of (items in) language networks (Green, 1998, p. 68). A key

Figure 9.2 Stimulus-response bindings in the English lexical decision experiment by Dijkstra, De Bruijn, Schriefers, and Ten Brinke (2000). On the left side, simplified word and language membership representations are presented for the interlexical homograph ROOM; on the right side, they are for the matched English control word HOME. In the first part of the experiment (top), no Dutch words were included; therefore, the binding of Dutch words (including the Dutch reading of the interlexical homographs) to the "no" response was weak. In the second part (bottom), the presence of Dutch words led to stronger binding. The possibility that participants responded whenever a word was triggered (checking language membership information only later) is indicated by the shortcut from word level to response.

concept in this model is the language task schema that specifies the mental processing steps (or action sequences) that a language user takes to perform a particular language task. A language task schema regulates the output from the word identification system by altering the activation levels of representations within that system and by inhibiting outputs from the system (Green, 1998, p. 69). For instance, when a bilingual switches from one language to another in translation, a change in the language schema that is applied must take place. When an English word must be translated into French, this requires the language users to switch from the input language of the item, English, to the output language, French. Otherwise, the presented English word would be repeated (read out loud) instead of translated. Thus, the task schema for translation must actively suppress the word representations (or lemmas) with an English language tag (membership) at the stage of output selection. Because this suppression can take place only after the (lemma) representations are activated, inhibition is called "reactive." However, the exerted inhibition of English words needs to be overcome later if such words are presented on the next trial. In sum, language changes require overcoming the inhibition of the previous language tags.

Irrespective of whether the details of this approach turn out to be correct, the IC model makes the important point that bilingual language processing is a process that always takes place within a particular task context and with certain goals in mind. In other words, it is not very informative to talk about bilingual word recognition in general without providing more information about the conditions under which it takes place and the goal that needs to be achieved.

The Language Mode Framework and Nonlinguistic Context Effects

Both the IC and the BIA models assume that the relative activation of languages can be affected to some extent by stimulus context via top-down inhibition of lexical representations. More generally, task context can affect the activation state of the word recognition system. There is yet another theoretical approach that assumes that the relative activation state of words and languages is context sensitive. In the language mode framework proposed by Grosjean (1997, 1998, 2001), language processing mechanisms and languages as a whole can be active to different extents. This relative activation state of languages is called language mode, and it is continuous and sensitive to many factors. Examples of such factors in interactions are, for instance, the person spoken or listened to, the language user's language proficiency, the user's attitude toward language mixing, and the content and function of the ongoing discourse. Listeners and readers can be in a "bilingual mode" if they are talking to other bilinguals or are reading a text about which they know that there are possibly elements from another language (Grosjean, 1998, p. 137). However, if a bilingual listens to someone who is obviously monolingual, the activation state of the bilingual's languages would switch more to a "monolingual mode," in which only or mainly the context-relevant language is active. According to Grosjean, the bilingual's language mode affects perception and the speed of access to one or two lexicons, and the language mode itself is affected both by the language user's expectations and by language intermixing (whether there are words of one or more languages embedded in the stimulus list).

If this view is evaluated in terms of the presently available data on the recognition of isolated visually presented words, it appears that only the second part of this view can be correct. For instance, the role played by the reader's expectations seems to be limited. Under various circumstances, word candidates of different languages are activated if they are close enough in terms of their characteristics to the input letter string (parallel activation of word candidates also seems to take place in the auditory domain). Bilingual lexical access seems to be profoundly language nonselective, and top-down factors such as expectation do not seem to be able to change that. At the same time, Grosjean seems to be correct in asserting that stimulus list composition (language intermixing) is an important factor affecting bilingual word recognition performance. Similar comments hold with respect to the BIA and IC models: The assumption that nonlinguistic context may affect the activation state of individual presented words does not seem to be warranted.

However, note that in daily life words are usually recognized in the linguistic context of a sentence, not as isolated items in stimulus lists. The language mode hypothesis has been formulated especially for language use in such natural contexts. In the following section, the BIA+ model is considered; it proposes that the effects of nonlinguistic and linguistic context may come about through different mechanisms.

The BIA+ Model and Linguistic Context Effects

The BIA+ model (Dijkstra & Van Heuven, 2002) is an extension and adaptation of the BIA model (see Fig. 9.3). The BIA+ model contains not only orthographic representations and language nodes, but also phonological and semantic representations. All these representations are assumed to be part of a word identification system that provides output to a task/decision system. The information flow in bilingual lexical processing proceeds exclusively from the word identification system toward a task/decision system without any influence of this task/decision system on the activation state of words.

Figure 9.3 The architecture of the BIA+ model (an extension of the bilingual interactive activation [BIA] model) (Dijkstra & Van Heuven, 2002). L1, first language; L2, second language.

In this framework, nonlinguistic context effects can affect the word recognition process only indirectly, via the task/decision system. Nonlinguistic context effects arising from instruction or participant expectancies can affect the way information from the word identification system is used, but not the activation state of word candidates. In contrast, linguistic context (such as a preceding sentence) can interact directly with the word recognition system. In other words, semantic and syntactic aspects of the sentence context can modulate the activation of lexical candidates (of course, both nonlinguistic and linguistic context effects may operate at the same time).

The few studies that have so far investigated bilingual word recognition in sentence context (e.g., Altarriba, Kroll, Sholl, & Rayner, 1996; Li, 1996) suggest that semantic and syntactic aspects of sentence context may indeed modulate the bilingual word recognition process. For instance, Altarriba et al. recorded the eye movements of Spanish-English bilinguals who were reading English (L2) sentences that contained either an English (L2) or a Spanish (L1) target word (Experiment 1). Sentences provided either high or low semantic constraints on the target words. An example sentence of the high constraint and Spanish target condition is "He wanted to deposit all his dinero at the credit union," for which dinero is Spanish for "money." An interaction arose between the frequency of the target word and degree of sentence constraint for Spanish target words with respect to the first fixation duration, but not for English target words. Thus, when the Spanish target words were of high frequency and appeared in highly constrained sentences, the participants apparently experienced interference. This result suggests that sentence constraint influences not only the generation of semantic feature restrictions for upcoming words, but also that of lexical features. The high-frequency Spanish word matched the generated set of semantic features, but not the expected lexical features when the word appeared in the alternate language (Altarriba et al., p. 483). Note that word frequency (a lexical information source) and not language membership interacted with the sentence constraint. This suggests that (just as for isolated words) lexical characteristics are more important than language characteristics in the determination of word recognition in sentences.

On the basis of a review of the available studies of the processing of words in sentence context, Kroll and Dussias (2004) drew some important conclusions. First, they concluded that sentence processing in both languages is affected by the acquisition and use of more than one language. This suggests that nonselective access holds not only with respect to lexical aspects of processing, but also with respect to semantics and syntax. This conclusion leads to a number of interesting predictions. One prediction is that syntactic priming might arise between languages (e.g., the Dutch dative construction "De man voerde de hond een kluif" [The man fed the dog a bone] might prime the analogous English construction in "The woman gave the child a book"). Another prediction is that bilinguals should differ from monolinguals in certain switching tasks because bilinguals possess such well-developed cognitive skills for controlling cross-language competition. With respect to the interactions of L1 and L2 during sentence processing, Kroll and Dussias concluded that bilinguals resemble monolinguals in the semantic domain of sentence processing, but that they clearly process language differently in the syntactic domain. For instance, similar event-related brain potentials were found for L1 and L2 speakers in several studies for semantic processing, but not for syntactic processing (e.g., Hahne, 2001; Hahne & Friederici, 2001; Weber-Fox & Neville, 1996).

In sum, although the BIA+ model makes some basic assumptions to allow the development of an account for the recognition of words in sentence context, a detailed account is not available. Considerable research efforts are necessary before the contours of such an account will become visible.

Conclusion

The empirical data reviewed in this chapter indicate that the recognition process of isolated words is basically language nonselective in nature. This means that word candidates from different languages initially become active on the presentation of a letter string. This nonselectivity seems to hold for all representations that characterize words (e.g., orthographic, phonological, and semantic codes). Bilingual word recognition also seems to be automatic in the sense that the process takes place relatively unaffected by nonlinguistic contextual factors. This applies not just to words from the native language (L1), but also to words from the L2. At the same time, when words are processed in sentence context, their processing seems to be sensitive to the semantic and syntactic aspects of the sentence.

Word recognition models have aptly described the bilingual word recognition process by means of
the activation metaphor. This metaphor is useful to describe how, on the basis of stimulus characteristics such as word frequency and different degrees of similarity between input and lexical representation, lexical candidates can be activated to various degrees. Attempts have been made to apply the activation metaphor to the level of languages as a whole as well, but it remains to be determined what it means to say that languages can be activated to different degrees (Dijkstra & Van Hell, 2003).

Available models differ considerably in their views about how different tasks and contextual factors such as instruction and participant expectations affect the bilingual word recognition process. Some approaches assume considerable context sensitivity of the L1/L2 activation state in the mental lexicon; others explain context effects at strategy-sensitive decision levels rather than at the level of the word recognition system.

To conclude, an increasing amount of research in the last decade has led to important insights, for instance, that bilingual word recognition appears to be subserved by a language-nonselective access system that is sensitive to task demands and context aspects. Future research must investigate how different sorts of contextual factors affect the word recognition process and to what extent bilinguals can exert cognitive control over the different components of the language processing system. Research is also needed to disentangle lexical, syntactic, and semantic effects on words processed in sentences with various task goals in mind.

References

Altarriba, J., Kroll, J. F., Sholl, A., & Rayner, K. (1996). The influence of lexical and conceptual constraints on reading mixed-language sentences: Evidence from eye-fixation and naming times. Memory and Cognition, 24, 477–492.
Altarriba, J., & Mathis, K. M. (1997). Conceptual and lexical development in second language acquisition. Journal of Memory and Language, 36, 550–568.
Beauvillain, C., & Grainger, J. (1987). Accessing interlexical homographs: Some limitations of a language-selective access. Journal of Memory and Language, 26, 658–672.
Bijeljac-Babic, R., Biardeau, A., & Grainger, J. (1997). Masked orthographic priming in bilingual word recognition. Memory and Cognition, 25, 447–457.
Bowers, J. S., Mimouni, Z., & Arguin, M. (2000). Orthography plays a critical role in cognate priming: Evidence from French/English and Arabic/French cognates. Memory and Cognition, 28, 1289–1296.
Brysbaert, M., Van Dyck, G., & Van de Poel, M. (1999). Visual word recognition in bilinguals: Evidence from masked phonological priming. Journal of Experimental Psychology: Human Perception and Performance, 25, 137–148.
Caramazza, A., & Brones, I. (1979). Lexical access in bilinguals. Bulletin of the Psychonomic Society, 13, 212–214.
Coltheart, M., Davelaar, E., Jonasson, J. T., & Besner, D. (1977). Access to the internal lexicon. In S. Dornic (Ed.), Attention and performance VI (pp. 535–555). New York: Academic Press.
Cristoffanini, P., Kirsner, K., & Milech, D. (1986). Bilingual lexical representation: The status of Spanish-English cognates. Quarterly Journal of Experimental Psychology, 38A, 367–393.
De Bruijn, E. R. A., Dijkstra, A., Chwilla, D. J., & Schriefers, H. J. (2001). Language context effects on interlingual homograph recognition: Evidence from event-related potentials and response times in semantic priming. Bilingualism: Language and Cognition, 4, 155–168.
De Groot, A. M. B., Borgwaldt, S., Bos, M., & Van den Eijnden, E. (2002). Lexical decision and word naming in bilinguals: Language effects and task effects. Journal of Memory and Language, 47, 91–124.
De Groot, A. M. B., & Comijs, H. (1995). Translation recognition and translation production: Comparing a new and an old tool in the study of bilingualism. Language Learning, 45, 467–509.
De Groot, A. M. B., Delmaar, P., & Lupker, S. J. (2000). The processing of interlexical homographs in a bilingual and a monolingual task: Support for nonselective access to bilingual memory. Quarterly Journal of Experimental Psychology, 53, 397–428.
De Groot, A. M. B., & Nas, G. (1991). Lexical representation of cognates and noncognates in compound bilinguals. Journal of Memory and Language, 30, 90–123.
De Groot, A. M. B., & Poot, R. (1997). Word translation at three levels of proficiency in a second language: The ubiquitous involvement of conceptual memory. Language Learning, 47, 215–264.
De Moor, W. (1998). Visuele woordherkenning bij tweetalige personen [Visual word recognition in bilinguals]. Unpublished master's thesis, University of Ghent, Belgium.
Dijkstra, A., De Bruijn, E., Schriefers, H. J., & Ten Brinke, S. (2000). More on interlingual homograph recognition: Language intermixing versus explicitness of instruction.
naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
La Heij, W., Kerling, R., & Van der Velden, E. (1996). Nonverbal context effects in forward and backward translation: Evidence for concept mediation. Journal of Memory and Language, 35, 648–665.
Lemhöfer, K., & Dijkstra, A. (2004). Recognizing cognates and interlingual homographs: Effects of code similarity in language specific and generalized lexical decision. Memory and Cognition, 32, 533–550.
Lewy, N., & Grosjean, F. (1997). A computational model of bilingual lexical access. Manuscript in preparation, Neuchâtel University, Switzerland.
Li, P. (1996). Spoken word recognition of code-switched words by Chinese-English bilinguals. Journal of Memory and Language, 35, 757–774.
Macnamara, J., & Kushnir, S. (1971). Linguistic independence of bilinguals: The input switch. Journal of Verbal Learning and Verbal Behavior, 10, 480–487.
Mathey, S., & Zagar, D. (2000). The neighborhood distribution effect in visual word recognition: Words with single and twin neighbors. Journal of Experimental Psychology: Human Perception and Performance, 26, 184–205.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86.
Nas, G. (1983). Visual word recognition in bilinguals: Evidence for a cooperation between visual and sound based codes during access to a common lexical store. Journal of Verbal Learning and Verbal Behavior, 22, 526–534.
Potter, M. C., So, K.-F., Von Eckardt, B., & Feldman, L. B. (1984). Lexical and conceptual representation in beginning and proficient bilinguals. Journal of Verbal Learning and Verbal Behavior, 23, 23–38.
Sánchez-Casas, R., Davis, C. W., & García-Albea, J. E. (1992). Bilingual lexical processing: Exploring the cognate/non-cognate distinction. European Journal of Cognitive Psychology, 4, 311–322.
Schulpen, B., Dijkstra, A., Schriefers, H. J., & Hasper, M. (2003). Recognition of interlingual homophones in bilingual auditory word recognition. Journal of Experimental Psychology: Human Perception and Performance, 29, 1155–1178.
Sholl, A., Sankaranarayanan, A., & Kroll, J. F. (1995). Transfer between picture naming and translation: A test of asymmetries in bilingual memory. Psychological Science, 6, 45–49.
Soares, C., & Grosjean, F. (1984). Bilinguals in a monolingual and bilingual speech mode: The effect on lexical access. Memory and Cognition, 12, 380–386.
Thomas, M. S. C., & Allport, A. (2000). Language switching costs in bilingual visual word recognition. Journal of Memory and Language, 43, 44–66.
Van Hell, J. G., & Candia Mahn, A. (1997). Keyword mnemonics versus rote rehearsal: Learning concrete and abstract foreign words by experienced and inexperienced learners. Language Learning, 47, 507–546.
Van Hell, J. G., & Dijkstra, A. (2002). Foreign language knowledge can influence native language performance in exclusively native contexts. Psychonomic Bulletin and Review, 9, 780–789.
Van Heste, T. (1999). Visuele woordherkenning bij tweetaligen [Visual word recognition in bilinguals]. Unpublished master's thesis, University of Leuven, Belgium.
Van Heuven, W. J. B., Dijkstra, A., & Grainger, J. (1998). Orthographic neighborhood effects in bilingual word recognition. Journal of Memory and Language, 39, 458–483.
Von Studnitz, R. E., & Green, D. W. (1997). Lexical decision and language switching. International Journal of Bilingualism, 1, 3–24.
Von Studnitz, R. E., & Green, D. (2002). Interlingual homograph interference in German-English bilinguals: Its modulation and locus of control. Bilingualism: Language and Cognition, 5, 1–23.
Weber-Fox, C. M., & Neville, H. J. (1996). Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers. Journal of Cognitive Neuroscience, 8, 231–256.
Weinreich, U. (1968). Languages in contact. The Hague, The Netherlands: Mouton.
Michael S. C. Thomas
Walter J. B. van Heuven
10
Computational Models of Bilingual Comprehension
models are sometimes thought of as more theoretically opaque.
the bilingual system is acquired as well as details of its processing dynamics in the adult state.

We contend that, between them, localist and distributed models have the potential to inform every one of these issues. However, we begin with a consideration of the current status of models of bilingual word comprehension.

Localist Approaches

In psycholinguistic research, localist models of monolingual language processing have been used since the beginning of the 1980s. In 1981, McClelland and Rumelhart (1981; Rumelhart & McClelland, 1982) used a simple localist connectionist model to simulate word superiority effects. This Interactive Activation (IA) model has since been used to simulate orthographic processing in visual word recognition. The model has been extended with decision components by Grainger and Jacobs (1996) to account for a wide variety of empirical data on orthographic processing.

The IA models were used to simulate word recognition in a variety of languages (e.g., English, Dutch, French), but in each case within a monolingual framework. Dijkstra and colleagues (Dijkstra & Van Heuven, 1998; Van Heuven, Dijkstra, & Grainger, 1998) subsequently extended the IA model to the bilingual domain. They called this new model the Bilingual Interactive Activation (BIA) model. Both the IA and BIA models are restricted to the orthographic processing aspect of visual word recognition, encoding information about letters and visual word forms in their structure.

In the following sections, we focus on the BIA model and examine how this model can or cannot account for empirical findings on cross-language neighborhood effects, language context effects, homograph recognition, inhibitory effects of masked priming, and the influence of language proficiency. We end this section with a short discussion of a localist model of bilingual speech perception (BIMOLA) and a new localist bilingual model based on the theoretical BIA+ model (Dijkstra & Van Heuven, 2002), which integrates orthographic, phonological, and semantic representations (SOPHIA).

The Bilingual Interactive Activation Model

Structure. The BIA model is depicted in Fig. 10.1. It consists of four layers of nodes. It shares with the IA model the same lower-level layers of feature and letter nodes, by which the features and letters are coded for each position of a 4-letter word. There are 14 visual features and 26 letters for each position. The two top layers in the BIA model differ from the IA model. The BIA model has a word layer of all Dutch and English 4-letter words. Furthermore, the BIA model has a language node layer, assigning a single node to each language.

Visual input in the model is coded as the absence or presence of letter features. At each position, letters are excited when they are consistent with a feature and inhibited when they are not consistent with a visual feature (in Fig. 10.1, arrows with triangular heads represent excitatory sets of connections, and those with circular heads represent inhibitory sets of connections). Each letter activates words that have that letter at the same position and inhibits words that do not have that letter at that position.

An important aspect of the BIA (and the IA) model is that all nodes at the word level are interconnected; they can mutually inhibit each other's activation. This is called lateral inhibition. Furthermore, activated words feed activation back to their constituent letters. The parameters that regulate these interactions in the BIA model are identical to the ones used in the original IA model (McClelland & Rumelhart, 1981). Because words of both languages are fully connected to each other, the BIA model implements the assumption of an integrated lexicon. In addition, because the letters at the letter layer activate words of both languages simultaneously, the model implements the assumption of nonselective access. However, lateral connections allow the words of the two languages to compete and inhibit each other.

Moreover, this competition can be biased. Apart from the incorporation of two lexicons, the BIA model is special in its inclusion of language nodes, in this case one for English and one for Dutch. The language nodes collect activation of all words from one lexicon and, once activated, can suppress the word units of the other language. The parameter that controls this inhibition is important to the behavior of the model (Dijkstra & Van Heuven, 1998). In the summary of simulation results with the BIA model, we discuss the role of this top-down inhibition.

Figure 10.1 The Bilingual Interactive Activation (BIA) model. Excitatory connections are indicated by arrows (with arrowheads pointing in the direction of activation spread), inhibitory connections by lines with closed circles (pos, position). Note that although two pools of word units are depicted, one for each language, during processing all words compete with all other words via inhibitory lateral connections, representing an integrated lexicon.

Processing. The behavior of the BIA model in response to an input is determined by a combination of excitatory and inhibitory influences that cycle around the network. Three components contribute to this interaction. First, activation flows up the network, from feature, to letter, to word, to
language nodes. In each case, the higher nodes with which the input is consistent are activated, and those with which the input is inconsistent are inhibited. Second, at the word level, words (from both languages) compete with each other to be the most active. Third, activation also flows back down the network. Word units reinforce the letters of which they are comprised, and language units inhibit words of the opposing language. Letters and words are therefore not processed in isolation, but in the context of the words that contain these letters and of the languages of which words are members.

Each time step (cycle), activation flows between the layers, and the new activation of each node is calculated. After a few cycles, letters and words that are similar or identical to the input are activated. Thus, the word node that best matches the input string will reach the recognition threshold. The number of cycles it takes to reach this threshold can then be compared with human response latencies. The threshold can be at a fixed word activation level, or it can vary around a mean (Jacobs & Grainger, 1992).

Modeling Second Language Proficiency in the Bilingual Interactive Activation Model. An important aspect of the BIA model is that differences in word frequency are reflected in the resting-level activation of the words. High-frequency words have a higher resting-level activation than low-frequency words. Therefore, high-frequency words are activated more quickly and reach the recognition threshold earlier than low-frequency words.

Word frequencies of Dutch and English taken from the CELEX database (Baayen, Piepenbrock, & Van Rijn, 1993) are converted into resting-level activations in the BIA model. These word frequencies reflect frequencies from a perfectly balanced bilingual. Most studies with bilinguals, however, use participants who acquired their L2 later in life (late bilinguals) and who are less proficient in their L2. One consequence is that, for these participants, the (subjective) word frequencies of their L2 are lower than the frequencies of their L1 (especially for high-frequency L2 words). In the BIA model, this can be implemented by varying the resting-level activation range of the L2. This is not to assume that differences in proficiency can be explained solely in terms of frequency effects. For instance, greater knowledge of L2 grammar might result in relatively high L2 proficiency as well. However, the manipulation of resting activation nevertheless captures the empirical fact that L2 words tend to be comprehended and produced more slowly and less accurately than their comparable L1 translation equivalents in unbalanced bilinguals.

Language Nodes. Dijkstra and Van Heuven (1998) described several functions of the language nodes in the BIA model. For example, the language nodes represent a language tag that is activated during word identification and that indicates the language to which the word belongs. In addition, language nodes can inhibit words of the other language to reflect a stronger representation of, for example, the L1 language compared to the L2 language. As a result, L1 words will inhibit L2 words more strongly during recognition because of extra top-down inhibition from the language node. As indicated by Dijkstra, Van Heuven, and Grainger (1998), the BIA model would probably produce a similar functional behavior without top-down inhibition from the language nodes, when lateral inhibition at the word level is asymmetric between words of different languages (e.g., L1 words inhibit L2 words more than vice versa).

Furthermore, top-down inhibition from the language nodes can be used to simulate context effects in word recognition. Thus, the correct reading of an interlingual homograph depends on language context, an effect that the language nodes could implement. However, this function of the language nodes might also be replaced by a decision mechanism because results obtained with homographs suggest that bilinguals are not able to suppress nontarget language candidates even in the context of explicit instructions to do so (Dijkstra, De Bruijn, Schriefers, & Ten Brinke, 2000).

Finally, language nodes collect activation from the word level and therefore serve as an indicator of the total activation of all the word nodes in their respective languages. Summed activation is an important notion in, for example, the multiple readout model (MROM) of Grainger and Jacobs (1996). A large value, representing lots of word node activity, implies that the input must be fairly wordlike. Summed activation is used as a criterion to make a "yes" response in the lexical decision task. Furthermore, summed activation is used to adjust the deadline of the "no" response (if the input is very wordlike, give the word nodes more time for one of them to reach threshold before deciding the input is actually a nonword). Although the MROM does not explicitly implement this summed activation as a representation in the model, the language nodes of the BIA
model can be seen as an explicit implementation of this notion.

Bilingual Interactive Activation Model Simulations

Neighborhood Effects. An interesting aspect of the BIA model is that it incorporates the IA model as part of its structure. Therefore, the behavior of the IA model inside and outside of the BIA structure can be compared. Indeed, examining two IA models, one for each language, would constitute a particular theory of bilingual word recognition (one in which there was selective access of the input to each language and no top-down inhibition from the language nodes). Thus, the BIA model permits a detailed comparison of a selective access model with a nonselective access model.

In this way, Dijkstra and Van Heuven (1998) were able to compare three models of bilingual visual word recognition, each implementing a different theory: (a) a selective access model, simulated with the monolingual IA model; (b) the BIA model with top-down inhibition; (c) the BIA model without top-down inhibition. In addition, they changed the frequencies of the English words in the model by changing the resting-level range to reflect the fact that the target empirical data for the models were collected from participants who were not balanced bilinguals. All other parameter settings were identical in the three models. The performance of each model was compared with the empirical results of Van Heuven et al. (1998), who demonstrated that the presence of neighbors (words that can be constructed by changing a single letter of a target word) in the nontarget language slowed word recognition in both Dutch and English. The presence of within-language neighbors accelerated recognition times in English, but had an inhibitory effect on Dutch word recognition.

The correlation of the simulation data with the English word data showed that, irrespective of the assumed frequency range, the BIA model that included asymmetric top-down inhibition (from the Dutch language node to English words) produced better simulation results than the other models. In contrast to the English results, for Dutch, high correlations were obtained with the selective access model that incorporated only the Dutch lexicon. However, combining the results over both languages, the highest correlations over all experiments were obtained with a BIA model variant involving only top-down inhibition from the Dutch language node to active English words and a reduced English frequency range. Simulations showed that this model was able to capture several effects. For Dutch targets, it replicated the inhibitory effect of Dutch and English neighbors. For English targets, it replicated the inhibition effect of Dutch neighbors and the facilitation effect of English neighbors. Thus, the model correctly simulated the different effects of within-language neighbors in Dutch and English.

Here, an important exploratory role of computational modeling is demonstrated. Detailed comparison of the model's performance against empirical data allows the evaluation of different theoretical assumptions once implemented in the model.

Priming Effects. Effects of nontarget language neighbors have also been obtained in masked priming experiments (Bijeljac-Babic, Biardeau, & Grainger, 1997, for French and English). The BIA model can be used to simulate masked priming using the simulation technique described by Jacobs and Grainger (1992). This technique simulates masked priming by presenting the prime to the model on the first and second processing cycles; on the third cycle and following cycles, the target word is presented to the model. A simulation with the BIA model reported by Bijeljac-Babic et al. showed that the model captured the longer average target recognition times for same-language masked primes sharing orthographic similarity to the target (e.g., real–heal) than for cross-language masked primes also bearing orthographic similarity to the target (e.g., beau–beam). As with the empirical data, there was an effect of prime–target relatedness in both prime language conditions.

Bijeljac-Babic et al. (1997) also demonstrated that the size of the cross-language inhibition effect of an orthographically similar masked prime on target recognition increased as a function of the participants' level of proficiency in the prime word's language. Employing the BIA model with a French and an English lexicon and varying the resting activation of the word units in L2 to represent proficiency, Dijkstra, Van Heuven, et al. (1998) successfully simulated the dependence of the cross-language inhibition effect on L2 proficiency. The L2 neighbors have to be sufficiently active to interfere with L1 recognition.

The monolingual results from the control subjects in the study of Bijeljac-Babic et al. (1997) were then simulated with the monolingual IA model. Interestingly, the results of the monolingual simulation deviated from those of the experiment because the model predicted a facilitation effect for
the related condition whereas the empirical results produced no priming effect. According to the model, the overlap in several letters between nonword prime and target word should have resulted in faster target recognition. However, the model's predictions were in line with several studies from the monolingual literature (e.g., Ferrand & Grainger, 1992; Forster, Davis, Schoknecht, & Carter, 1987). Dijkstra et al. (1998) discussed possible stimulus confounds that may explain the divergence of empirical and modeling results in Bijeljac-Babic et al.'s monolingual controls. Overall, the simulation results indicated that the BIA model can successfully simulate the effects of different levels of proficiency on cross-language masked priming.

Interlingual Homographs and Cognates. Empirical data from studies employing cognates (e.g., the Dutch and English word film, with the same meaning in each language) and interlingual homographs (e.g., the Dutch and English word room, which means "cream" in Dutch) constitute a challenge for any model of bilingual word processing. For example, recognition latencies of interlingual homographs appear to be affected by such factors as task demands, list composition, and word characteristics (De Groot, Delmaar, & Lupker, 2000; Dijkstra, Grainger, & Van Heuven, 1999; Dijkstra, Timmermans, & Schriefers, 2000; Dijkstra, Van Jaarsveld, & Ten Brinke, 1998; see Dijkstra & Van Heuven, 2002, for a review). Only models that include components to simulate task demands and strategic modifications of decision criteria depending on list composition will stand a chance of accounting for all experimental effects. However, it is still informative and useful to investigate what an orthographic processing model like the BIA model predicts regarding how these words should be recognized, even without sophisticated task-level processing structures.

Interlingual homographs can be represented in the BIA model in two ways: (a) as a single word node with, for example, a summed frequency of the reading in each language or (b) as separate representations for each language, each with a frequency reflecting its usage in that language (Dijkstra et al., 1999).

Simulations with interlingual homographs represented as a single combined node show that these homographs are always processed faster than control words. However, empirical data (e.g., Dijkstra et al., 1998) indicated that homographs are only faster in a generalized (language-neutral) lexical decision task, in which bilinguals must identify whether the stimulus is a word in either of their languages. In an English lexical decision task, on the other hand, Dijkstra et al. found that homographs were recognized no more quickly than English control words. Thus, the representation of each pair of homographs with a single node in the BIA model cannot account for the data.

However, when the interlingual homograph is represented by two separate word nodes in the BIA, one in each language, the model fails to recognize either of the representations of the homograph (Dijkstra & Van Heuven, 1998). Both word nodes become strongly activated because of bottom-up information, but at the same time they inhibit each other as competitors at the word level. Therefore, they will stay below the standard word recognition threshold. Dijkstra and Van Heuven showed that the BIA model with language node to word inhibition can suppress the inappropriate reading of the homograph. Furthermore, they showed that with top-down inhibition from the Dutch language node to all English words, the BIA model could simulate the results of the Dutch go/no go task of Dijkstra et al. (2000), in which subjects only generate a response if the stimulus is a word rather than making a yes/no decision. The BIA model captured the frequency-dependent interference effect observed for homographs when each homograph was represented with a separate node and a resting-level activation based on its within-language frequency.

Other Localist Models

Bilingual Interactive Model of Lexical Access. A localist model has been developed to account for bilingual speech perception. This model is called the Bilingual Interactive Model of Lexical Access (BIMOLA; Lewy & Grosjean, 1997) and is depicted in Fig. 10.2. This model is based on the IA model of auditory word recognition called TRACE (McClelland & Elman, 1986). The BIMOLA has layers of auditory features, phonemes, and words just like TRACE has. However, unlike in TRACE, representations are not duplicated at each time slice. The BIMOLA has a feature level that is common to both languages. On the other hand, the phoneme and word levels are organized by language. This contrasts with the BIA model, for which the languages are not distinguished at the letter and word levels other than by the fact that L1 and L2 words are connected to different language nodes.
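
The contrast between a word level organized by language and an integrated word level can be made concrete with a toy competition sketch. What follows is not the BIA model or the BIMOLA: the update rule (a clipped linear update with no decay), the parameter values, the three-word lexicon, and the bottom-up support values are all assumptions chosen for illustration. The only thing it is meant to show is how the set of words that inhibit one another changes the number of cycles a target needs to reach threshold.

```python
def cycles_to_threshold(words, target, integrated,
                        threshold=0.7, excite=0.1, inhibit=0.1,
                        max_cycles=100):
    """Return the number of cycles until `target` reaches `threshold`.

    words: dict mapping word -> (language, bottom-up support in [0, 1]).
    integrated=True:  every word inhibits every other word
                      (integrated word level, as assumed in the BIA model).
    integrated=False: words inhibit only words of their own language
                      (word level organized by language).
    All numeric values are arbitrary illustration values, not model parameters.
    """
    act = {w: 0.0 for w in words}
    for cycle in range(1, max_cycles + 1):
        new_act = {}
        for w, (lang, support) in words.items():
            # Which other words compete with w depends on the architecture.
            rivals = [v for v in words
                      if v != w and (integrated or words[v][0] == lang)]
            # Bottom-up excitation minus lateral inhibition from rivals.
            net = excite * support - inhibit * sum(act[v] for v in rivals)
            # Keep activation within [0, 1]; updates are synchronous.
            new_act[w] = min(1.0, max(0.0, act[w] + net))
        act = new_act
        if act[target] >= threshold:
            return cycle
    return None


# Hypothetical mini-lexicon; the input string is the English word "work".
lexicon = {
    "work": ("English", 1.0),  # target: fully supported by the input
    "word": ("English", 0.5),  # within-language neighbor, partial support
    "werk": ("Dutch", 0.5),    # cross-language neighbor, partial support
}

integrated_cycles = cycles_to_threshold(lexicon, "work", integrated=True)
separate_cycles = cycles_to_threshold(lexicon, "work", integrated=False)
print(integrated_cycles, separate_cycles)
```

Under these assumptions the target takes more cycles to reach threshold when the word level is integrated, because the Dutch neighbor then joins the competition. The sketch has only inhibitory lateral effects, so it does not reproduce the facilitatory within-language neighbor effect reported for English, and it omits language nodes entirely.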
Figure 10.2 The Bilingual Model of Lexical Access (BIMOLA) (Lewy & Grosjean, 1997), a model of
bilingual speech perception.
Differences Between the Bilingual Interactive Activation Model and the Bilingual Interactive Model of Lexical Access. The BIA model and the BIMOLA are both localist models that share properties like the parallel activation of words of both languages, but there are also some clear differences. Although the BIA model has an integrated lexicon, the BIMOLA has separate lexicons for each language. This means that during recognition in the BIMOLA, L1 words only compete with other L1 words and L2 words only with other L2 words to reach threshold, whereas in BIA, all words compete with all other words through lateral inhibition in a competition that can be biased by top-down activation from the language node level.

The BIA model incorporates competition between words of different languages to account for cross-language interference effects in visual word recognition described in the sections on Neighborhood Effects, Priming Effects, and Interlingual Homographs and Cognates. As a model of speech perception, the BIMOLA has to account for the empirical effects revealed in this different modality, such as the base language effect in guest word recognition (see Grosjean, 2001). To account for language context effects in speech perception,
the BIMOLA implements a top-down language activation mechanism that uses global language information to activate words of a particular language.

There are no explicit representations of language nodes in the BIMOLA, but the top-down language activation mechanism included in this model can be seen as an implicit implementation of language nodes. However, the explicit language nodes in the BIA model differ from the top-down language activation mechanism in the BIMOLA because they do not activate the words of the language they represent; rather, the language nodes only inhibit words of the other language. The mechanisms, however, are similar in that they alter the relative activation of the two languages. Here, the different segregation of the lexical units and the different top-down dynamics illustrate the distinctive theoretical assumptions incorporated into visual and speech perception models of bilingual word recognition to account for the different empirical effects of each modality. In other words, the modelers implicitly assume that the different demands of recognition in each modality have led to different functional architectures.

The Semantic, Orthographic, and Phonological Interactive Activation Model. Dijkstra and Van Heuven (2002) proposed a new theoretical model called the BIA+ model. This model is an extension of the BIA model to include phonological and semantic representations. Language nodes are also present in the BIA+ model, but they can no longer inhibit words of the other language. The orthographic, phonological, semantic, and language node representations are part of the identification system of the BIA+ model. In addition, the model has a task/decision system that regulates control (see Dijkstra & Van Heuven, 2002).

At this moment, the identification system of the theoretical BIA+ model has been implemented in a localist connectionist model (Van Heuven & Dijkstra, 2003). This implemented model is called the SOPHIA (semantic, orthographic, and phonological interactive activation) model. The architecture of the SOPHIA model is shown in Fig. 10.3. Unique for this model are the sublexical layers of syllables and clusters. The cluster layers consist of onset, nucleus, and coda letter and phoneme representations.

So far, simulations with this model have focused on monolingual, monosyllabic word processing. The SOPHIA model is able to account for a number of effects in monolingual visual word recognition (priming data, the effects of consistency between orthographic and phonological codes, pseudohomophone effects, and the role of neighborhoods; Van Heuven & Dijkstra, 2001). The model is able to simulate effects that cannot be simulated by other models of visual word recognition that include representations of phonology, such as the dual route cascaded model (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001) and the MROM-p (Interactive Activation Multiple Read-Out Model of Orthographic and Phonological Processes; Jacobs, Rey, Ziegler, & Grainger, 1998). In particular, SOPHIA can account for the facilitatory effects found for words with many body neighbors in the lexical decision task (Ziegler & Perry, 1998). Body neighbors are those neighbors that share their orthographic rime with the target word. The model is currently applied to bilingual phenomena.

Conclusion

The BIA model, a localist model of bilingual orthographic language processing, has been successful in simulating several empirical data patterns, particularly those involving neighborhood effects in word recognition and in masked priming. A comparison with the BIMOLA, a model of bilingual speech perception, illustrated that different theoretical assumptions may be necessary to capture empirical effects in visual and auditory modalities. The SOPHIA model can already account for several monolingual empirical findings. The model has great potential, especially when it implements all aspects of the BIA+ model to be able to simulate a wide variety of empirical findings in bilingual language processing.

Distributed Approaches

Building a Distributed Model

The construction of distributed models of bilingual language comprehension differs from that of localist models in that it involves two stages. First, the modeler constructs representations (or codes) that will depict the relevant cognitive domains. These domains might include phonological representations of spoken words, orthographic representations of written words, representations of word meaning, or representations of the identity of words appearing in the sequential strings that make up sentences. In addition, the modeler constructs
a network architecture that will allow the relevant associations between the domains to be learned. However, connection strengths in the network are initially randomized, so the system begins with no content. In the second stage, the model undergoes training to learn the relevant mappings, for instance, between each word's form and its meaning. It is important to realize that, in distributed models, the modeler's theory is implemented in the way initial representations are constructed and in the architecture that is chosen by the modeler to learn the mappings.

Figure 10.3 The Semantic, Orthographic, and Phonological Interactive Activation (SOPHIA) model (Van Heuven & Dijkstra, 2001, in preparation).

Work in the area of distributed models of bilingual memory is relatively new. In the following sections, we consider three distributed models of bilingual language comprehension. We then examine the potential of distributed models to investigate a range of phenomena of interest in bilingual language processing.

The Bilingual Single Network Model

Thomas (1997a, 1997b, 1998) considered in some depth how distributed models of the monolingual language system, such as Seidenberg and McClelland's (1989) distributed model of word recognition and reading, might be extended to the bilingual case. Two hypotheses were considered: that the bilingual has separate network resources available to learn each language, along with control structures integrating the output of each network, or that the bilingual has a single combined representational resource in which both languages are stored but each language is identified by language-specific contextual information (information that may be based on differences in phonology, on differences in context of acquisition and usage, or simply an abstract language tag). The empirical evidence from visual word recognition
contains both indications of the independence of product of seeking to learn the form-to-meaning
lexical representations for each language (e.g., relations for two languages across a single repre-
recognition of interlingual homographs according sentational resource.
to within-language frequency [Gerard & Scarbor- Thomas constructed two artificial languages of
ough, 1989], a lack of long-term repetition priming 100 items each to examine the interference effects
between orthographically dissimilar translation under carefully controlled conditions. Words were
equivalents [Kirsner, Smith, Lockhart, King, & Jain, constructed around consonant-vowel templates
1984]) and evidence of interference effects in cases and included both a frequency structure and or-
of cross-language similarity, for instance, in the thographic patterns that were either shared across
slowed recognition of interlingual homographs the languages or distinct to one. The orthographic
compared to cognate homographs under some con- representations were similar to those included in
ditions (Klein & Doctor, 1992) and the speeded the BIA model, involving the position-specic en-
recognition of cognate homographs under other coding of letters in monosyllabic words. Repre-
conditions (Cristoffanini, Kirsner, & Milech, 1986; sentations were constructed to encode each words
Gerard & Scarborough, 1989). meaning, based around distributed semantic fea-
One of the features of distributed networks is ture sets (see Plaut, 1995, for a similar monolin-
that the internal representations they develop de- gual implementation; De Groot, 1992, for a related
pend on similarity patterns within the mappings theoretical proposal). Finally, a binary vector en-
they must learn (Thomas, 2002). Given that inter- coded language membership. The network archi-
ference between the bilinguals languages occurs tecture of the BSN model is shown in Fig. 10.4, with
when vocabulary items share some degree of simi- the number of units in each layer included in pa-
larity, Thomas (1997a, 1997b) decided to explore rentheses. This network was trained to learn the
the single network hypothesis. This is the idea that relationship between orthography and semantics.
interference effects are the consequence of at- In this model, word recognition begins by turn-
tempting to store two languages in a common rep- ing on the relevant units at input for the letters of the
resentational resource. The model therefore sought word. The connection weights then carry this acti-
to capture a combination of empirical effects for vation up to the internal or hidden processing units.
independence and for interference as the emergent Further connections then activate the relevant
Figure 10.4 The Bilingual Single Network (BSN) model (Thomas, 1997a, 1997b). Rectangles correspond
to layers of simple processing units.
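The two-stage flow of activation through the network in Fig. 10.4 can be illustrated with a
short Python sketch. It is a hedged illustration rather than Thomas's implementation: the layer
sizes, the random (untrained) weights, the logistic activation function, and the function names
are all assumptions made for the example, and the language membership vector is supplied at
input only.

```python
import math
import random

def make_weights(n_in, n_out, rng):
    # One row of incoming weights per receiving unit (random, i.e., untrained).
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def layer(inputs, weights):
    # Each unit sums its weighted inputs and squashes the sum into (0, 1).
    return [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
            for row in weights]

def forward(orthography, language_tag, w_hidden, w_output):
    # Stage 1: letter units plus language membership activate the hidden units.
    hidden = layer(orthography + language_tag, w_hidden)
    # Stage 2: hidden units activate the semantic feature units.
    return hidden, layer(hidden, w_output)

rng = random.Random(0)
N_ORTH, N_LANG, N_HIDDEN, N_SEM = 30, 2, 8, 12   # hypothetical layer sizes
w_hidden = make_weights(N_ORTH + N_LANG, N_HIDDEN, rng)
w_output = make_weights(N_HIDDEN, N_SEM, rng)

word = [0.0] * N_ORTH
for unit in (3, 14, 27):          # position-specific letter units switched on
    word[unit] = 1.0
hidden, semantics = forward(word, [1.0, 0.0], w_hidden, w_output)
```

Because the output is produced in a single pass, the sketch yields an activation pattern but no
settling time; training would adjust the two weight matrices so that the semantic pattern
matches each word's meaning.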
Computational Models 215
semantic features for the target words. In essence, this model transforms an activation pattern
for a word's orthography to a pattern for its meaning in two stages.

Examination of the respective activation patterns for each word across the hidden units can
give an indication of the representations that the model has developed to recognize the words
in two languages. As suggested in the discussion of modeling approaches, this set of
distributed representations is less readily interpreted because it is not hand coded but the
model's own solution. A statistical technique called principal components analysis allows
examination of the latent similarity structure in the representations that the model has
learned. Figure 10.5 depicts the structure of the internal representations plotted on the two
most prominent dimensions of a notional similarity space (which in fact has 60 dimensions,
capturing decreasing levels of variance). The position of each vocabulary item is plotted in
this two-dimensional space. Two versions are shown (a) under conditions of equal training on
each language or (b) under conditions in which the network is exposed to one language three
times as often as the other. Four pairs of words in each language are linked in the diagram,
showing the representation of a cognate homograph in each language, an interlingual homograph,
a translation equivalent with a different form but language-common orthography, and a
translation equivalent with a different form and language-specific orthography.

This figure illustrates several points. First, the two parallel, vertical bands reveal that
the network
[Figure 10.5 plot omitted: two panels, each plotting the 2nd Principal Component (vertical axis) against the 1st Principal Component (horizontal axis).]
Figure 10.5 The structure of the internal representations learned by the Bilingual Single Network model
for balanced and unbalanced networks. Diagrams show the positions of the words in each artificial
language on the two most salient dimensions of the 60-dimensional internal similarity space (L1, first
language; L2, second language).
has developed distinct representations for each language by virtue of the language membership
information included in the input and output. Second, the representations contain a similarity
structure that reflects the common set of meanings that the languages share. Thus, in Fig.
10.5a, words with common meanings are roughly at the same vertical level. However, the emergent
internal representations also capture common orthographic forms, illustrated by the related
positions of homographs. Third, in the balanced network, the orthographic characteristics of
the input have been exploited to provide further structure, such that translation equivalents
with language-general orthographic patterns are represented more similarly than those without
(in Fig. 10.5a, the line linking the translation equivalent in each language cluster is shorter
when the two forms have language-general orthography than when they have language-specific
orthography). However, this distinction is not apparent in the unbalanced network, Fig. 10.5b,
in which the dominance of L1 has not permitted the orthographic distinctions present in L2 to
become apparent. Finally, the L2 representations in the unbalanced network are less well
delineated, occupying a smaller area of representational space. The L2 has not yet been encoded
in sufficient detail.

In functional terms, the model was able to demonstrate behavior illustrating both the
independence of lexical representations and interference effects. In terms of independence
effects, interlingual homographs showed recognition accuracy that depended on within-language
frequency effects, and there was an absence of cross-language long-term priming effects for
translation equivalents (long-term priming was implemented by giving the network extra training
cycles on the prime and then testing the change in recognition accuracy of the target). In
terms of interference effects, the model demonstrated a disadvantage for interlingual
homographs compared to cognate homographs and, in the unbalanced network, a facilitatory effect
for cognate homographs in L2. Finally, the use of the common semantic output layer allowed the
model to account for cross-language semantic priming effects.

Despite reconciling effects of independence and interference, this preliminary model has
several disadvantages. For example, as discussed in the section "Interlingual Homographs and
Cognates," interlingual homograph recognition in lexical decision depends on task demands and
stimulus list composition; there is no way to achieve such flexibility in the current BSN
model. In part this is because lexical decision is a complex task that may involve the
integration of multiple sources of information (see Thomas, 1997a, for a full discussion of
lexical decision in the context of monolingual distributed models). Second, the model includes
obvious simplifications regarding the use of two small artificial vocabulary sets.

Perhaps most serious, however, this model is able to develop bilingual representations over a
single resource because its exposure to each language is simultaneous and intermixed. On the
other hand, it is well known that, under conditions of sequential learning, in which training
on one set of mappings ceases and another begins, models of this sort are liable to show
interference effects, forgetting aspects of the first set of knowledge that are inconsistent
with the second. This suggests that the BSN model might have difficulty capturing L2
acquisition. Empirically, the commonly held view is that L2 acquisition produces a bilingual
lexicon not functionally different from when the two languages are acquired simultaneously, and
L2 acquisition does not greatly interfere with L1 performance. Thomas and Plunkett (1995)
explored the conditions under which such catastrophic forgetting would occur in networks
trained on two (artificial) languages, one after the other. Interference was a genuine problem,
although it could be overcome by increasing the salience of the information encoding language
membership. We return to the issue of catastrophic forgetting in the section "Potential
Applications of Distributed Models to Bilingual Phenomena."

The Bilingual Simple Recurrent Network and Self-Organizing Model of Bilingual Processing

Two further distributed models have addressed how the representations for the bilingual's two
languages may be acquired within a single representational resource or, more specifically,
where the information comes from that allows bilinguals to separate their two languages. The
aim of these models was to examine how the implicit structure of the problem faced by the
bilingual might lead to the emergence of differentiated internal representations, in the first
case through differences in word order in sentences, in the second through differences in word
co-occurrence statistics in corpuses of each language and in the third through differences in
co-occurrence statistics in the phonology of the words in each language.

French (1998) explored whether word order information would be sufficient to distinguish the
two languages in the BSRN shown in Fig. 10.6. The input to the model was a set of sentences in
which the language could switch with a certain probability. Each language employed a different
vocabulary. French constructed a model in which the input and output representations encoded
the identity of all possible words in the vocabulary. A network was used that had cycling
activation, such that every word could be processed in the context of the words that had gone
before it in a sentence (the so-called simple recurrent network, SRN; Elman, 1990). The
network's task was to predict the next word in the sentence. To do so, the network had to
acquire representations of sentence structures in each artificial language. French found that,
as long as language switches occurred with a sufficiently low probability (0.1%), differences
in word order alone were sufficient to develop distinct representations for each language.

Li and Farkas (2002) developed an ambitious distributed model called the self-organizing model
of bilingual processing (SOMBIP), aimed at capturing both bilingual production and
comprehension. The model is shown in Fig. 10.7. This work is impressive in that, of all the
models, it incorporates most psychological detail in its representation of English and Chinese
phonology and in its use of a training set derived from a bilingual child language corpus. The
greater the detail incorporated into the model, the more closely it should be able to simulate
patterns of empirical data. The model also seeks to include stronger constraints from the
neurocomputational level in its use of self-organizing maps and in the learning algorithms
employed. Self-organizing maps are two-dimensional sheets of simple processing units that, when
exposed to a set of training patterns, develop a representation of the similarity structure of
the domain across the sheet of units (see Fig. 10.7). Such maps are found in the sensory
cortices of the brain, where different areas of the map represent sensations from different
areas of the body. In the SOMBIP, two self-organizing maps were learned, one for the
representation of the sounds of words in English and Chinese and one for the meanings of words;
associative links were then learned between the two maps.

Although the SOMBIP is ambitious in the number of psychological and neurocomputational
constraints it incorporates, potentially increasing
Figure 10.6 The Bilingual Simple Recurrent Network (BSRN) (French, 1998).
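The processing cycle of a simple recurrent network of the kind shown in Fig. 10.6 can be
sketched in Python as follows. This is a minimal illustration, not French's model: the
vocabulary size, hidden layer size, random (untrained) weights, tanh hidden units, and softmax
output are assumptions chosen for the example, and all names are hypothetical.

```python
import math
import random

def matvec(matrix, vector):
    # Weighted sum of inputs for every receiving unit.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def srn_step(word, context, w_in, w_ctx, w_out):
    # One unit per vocabulary item: only the current word's unit is on.
    x = [0.0] * len(w_in[0])
    x[word] = 1.0
    # The hidden layer sees the current word plus a copy of its own previous state.
    hidden = [math.tanh(a + b)
              for a, b in zip(matvec(w_in, x), matvec(w_ctx, context))]
    # Softmax turns output activations into a distribution over the next word.
    scores = matvec(w_out, hidden)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return hidden, [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(2)
VOCAB, HIDDEN = 6, 4              # hypothetical: words 0-2 are L1, words 3-5 are L2
w_in = rand_matrix(HIDDEN, VOCAB, rng)
w_ctx = rand_matrix(HIDDEN, HIDDEN, rng)
w_out = rand_matrix(VOCAB, HIDDEN, rng)

context = [0.0] * HIDDEN
for word in [0, 2, 1, 4, 3]:      # a sentence that switches language midway
    context, prediction = srn_step(word, context, w_in, w_ctx, w_out)
```

The copied-back hidden state is what lets the same word produce different internal patterns in
different sentence contexts; training on next-word prediction would shape those patterns.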
[Figure 10.7 diagram omitted; surviving panel labels: ENGLISH PHONOLOGY, ENGLISH SEMANTICS.]
Figure 10.7 The Self-Organizing Model of Bilingual Processing (SOMBIP; Li & Farkas, 2002).
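How a self-organizing map comes to reflect the similarity structure of its training patterns
can be illustrated with a generic Python sketch. It makes no claim to reproduce the SOMBIP: the
map size, the made-up three-feature patterns, the Gaussian neighborhood, and the linearly
shrinking learning rate and radius are all assumptions, and the names are hypothetical.

```python
import math
import random

def best_unit(weights, pattern):
    # The winner is the unit whose weight vector lies closest to the input.
    return min(weights, key=lambda pos: sum((w - x) ** 2
                                            for w, x in zip(weights[pos], pattern)))

def train_som(patterns, rows, cols, epochs, rng):
    dims = len(patterns[0])
    weights = {(r, c): [rng.random() for _ in range(dims)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # Learning rate and neighborhood radius both shrink as training proceeds.
        remaining = 1.0 - epoch / epochs
        rate = 0.5 * remaining
        radius = 0.5 + max(rows, cols) / 2.0 * remaining
        for pattern in rng.sample(patterns, len(patterns)):
            win = best_unit(weights, pattern)
            for pos, w in weights.items():
                # Units near the winner on the sheet are pulled toward the input.
                dist2 = (pos[0] - win[0]) ** 2 + (pos[1] - win[1]) ** 2
                pull = rate * math.exp(-dist2 / (2.0 * radius ** 2))
                for d in range(dims):
                    w[d] += pull * (pattern[d] - w[d])
    return weights

rng = random.Random(3)
# Made-up three-feature patterns forming two loose clusters.
patterns = ([[rng.uniform(0.0, 0.2) for _ in range(3)] for _ in range(8)] +
            [[rng.uniform(0.8, 1.0) for _ in range(3)] for _ in range(8)])
som = train_som(patterns, rows=4, cols=4, epochs=30, rng=rng)
winners = [best_unit(som, p) for p in patterns]
```

After training, `winners` gives the sheet position that responds most strongly to each pattern;
similar patterns tend to win at nearby positions because neighbors are updated together.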
the validity of subsequent model findings, current results are preliminary and have yet to be
compared directly to empirical data. Interestingly, one of the main theoretical claims of the
model is that it can account for the language-specific aspects of the bilingual lexicon without
recourse to language nodes or language tags. However, some of the design decisions within the
model belie this claim.

First, the phonological representations incorporate an additional vector (used to encode
tonality) that only Chinese words employ. Such a vector would be sufficient to serve the same
role as the language membership information built into the BSN model, information that in that
case was sufficient to distinguish two languages within a single set of representations.

Second, the representations of word meaning used in the SOMBIP are based on word co-occurrence
statistics (thought to be a valid indicator of meaning in the unilingual case because words
with similar meanings tend to occur in similar sentence contexts). However, in the bilingual
case, because the words making up the sentences of the two languages are different, this
approach has the unfortunate effect of generating two entirely separate meaning systems. This
is at odds with the generally held view that the bilingual lexicon has a single,
language-common level of semantic representation (see Francis, chapter 12, this volume; Chen &
Ng, 1989; Kirsner et al., 1984; Potter et al., 1984; Smith, 1997). Indeed, the SOMBIP ends up
developing representations for the two languages that are so different (both phonologically and
semantically) that an additional training procedure has to be included specifically to allow
translation equivalents to become associated.

Finally, the use of semantic representations based on word co-occurrence statistics leads to
some odd assumptions in the model: One part of the system is used to derive co-occurrence
statistics (not shown in Fig. 10.7). To achieve this end, this part of the system is supplied
with the identities of all the words in the bilingual's two vocabularies, coded in the
structure of their input representations. Yet, it is the very task of another part of the model
(the phonological map) to learn these identities. Why is this necessary if the knowledge is
already prewired into the system? Despite these difficulties, the SOMBIP is an interesting new
model that awaits close evaluation against the empirical data.

Assessment of Existing Distributed Models

These three distributed models explore ways in which bilingual lexical representations may
emerge
as a consequence of exposing a learning system (construed in broadly neurocomputational terms)
to the particular problem that faces the bilingual during language comprehension. The models
have in common the assumption that representations will emerge over a shared representational
resource. This assumption is driven by parsimony in relating the bilingual system to the
unilingual system: We should first explore whether single-resource systems are capable of
explaining the behavioral data before adding new assumptions to the models. In addition, the
single resource provides a ready explanation for why cross-language interference effects should
emerge in bilingual word recognition.

However, the alternate hypothesis that the bilingual's two language systems employ entirely
separate representational resources is not quite so easily dismissed because cross-language
interference effects could emerge merely from attempts to control and coordinate two competing
systems. Drawing an analogy to current debates in research on reading and on inflectional
morphology, Thomas (1998) discussed ways in which single route and dual route hypotheses can be
distinguished empirically. In these other debates, there is a question about whether language
behavior is generated by a single resource showing differential performance for different
stimuli (in this case, regular and irregular items) or whether separate processing mechanisms
handle each type of stimulus. Thomas (1998) concluded from this analogy that the question will
not be resolved simply on the basis of cross-language interference effects in adult bilinguals,
but must appeal to wider evidence concerning acquisition, language loss, and breakdown.

Distributed models of bilingual language comprehension remain at an early stage of development.
In the following section, however, we suggest that, together with existing models within the
monolingual domain, distributed approaches have the potential to address many of the key issues
within bilingual language processing.

Potential Applications of Distributed Models to Bilingual Phenomena

The issue of separate or shared representational resources is one that may be readily examined
in distributed modeling, along with interference effects and the possible encoding of language
status. However, monolingual models already exist that, if extended to the bilingual domain,
would allow investigation of many of the key issues for bilingual language processing raised in
the introduction. In this section, we outline some of the potential extensions.

Certainly in production, perhaps also in comprehension, language processes must be controlled
according to language context, for example, to achieve a switch of language in production or to
optimize recognition processes during comprehension. Several distributed models have been
proposed that examine processes of control (e.g., Cohen, Dunbar, & McClelland, 1990; Gilbert &
Shallice, 2002). For instance, Cohen et al. used a distributed model to demonstrate how a
naming process could achieve increasing degrees of automaticity and escape from attentional
control as it experienced increasing degrees of training. These models could provide a basis
for new accounts of the control of bilingual language systems and the relation of control to
language proficiency.

Of particular salience in work on L2 acquisition is the question of critical periods, that is,
the extent to which the ability to learn an L2 is constrained by the age at which acquisition
commences (see Birdsong, chapter 6, and DeKeyser & Larson-Hall, chapter 5, this volume).
Assuming that languages employ the same representational resources in the cognitive system,
connectionist work has explored how age of acquisition effects can occur in a single network
when training on one set of patterns follows that on a first (Ellis & Lambon-Ralph, 2000). The
results have demonstrated a reduction in plasticity for the second set of patterns and poorer
ultimate performance. This occurred because the first set of patterns had established dominance
over the representational resource and optimized it for its own needs. Such a model might
provide the basis for a computational exploration of critical period effects in L2 acquisition.

The assumption that languages compete over a single representational resource is of course not
a necessary one. Learning an L2 may cause the recruitment of new resources. So-called
constructivist distributed networks that recruit hidden units to learn new sets of patterns
would provide a profitable framework within which to explore this alternative (see Mareschal &
Shultz, 1996).

However, the idea of a common resource provides a ready explanation for the interference and
transfer effects during L2 acquisition (see MacWhinney, chapter 3, this volume). Early
distributed modeling work examined such transfer effects within several contexts, including the
acquisition of pronouns in an L2 (Blackwell & Broeder, 1992),
the transfer of word order properties from L1 to L2 (Gasser, 1990), and the acquisition of
gender assignment in French (Sokolik & Smith, 1992; see Broeder & Plunkett, 1994, for a
review).

The idea that L2 acquisition takes place in a language system with representations that are
conditioned by L1 processing may also provide the opportunity to explain other aspects of the
bilingual system. For instance, Kroll and Stewart's (1994) revised hierarchical model
postulates that, in L2 acquisition, L2 lexical representations initially hang-off L1 lexical
representations before making direct connections to semantics. It is quite possible that a
distributed network system, initially trained on an L1, then trained on L2, would initially
adopt the internal representations conditioned by L1 lexical knowledge to drive L2 production
and comprehension before undergoing the more laborious reorganization of the internal
representational space that would establish direct mappings between semantics and L2 lexical
knowledge. In other words, distributed models may be able to produce the revised hierarchical
model as an emergent effect of the experience-driven reorganization of language representations
within a distributed system.

Of course, L2-induced reorganization of representations implies a consequent effect on the
original structure of L1 knowledge. We have already discussed the idea that the assumption of a
single representational resource implies that decay of L1 knowledge might occur if immersion in
an L2 environment entirely replaced L1 usage. An in-depth consideration of such catastrophic
forgetting, including the conditions under which it should and should not occur according to
neural network theory, can be found in the work of Thomas (1997a). If these models are correct,
a careful study of L1 performance under intense L2 acquisition should reveal systematic
(although perhaps subtle) decrements in performance. Although there are anecdotal reports of
such decrements, to our knowledge these effects have yet to be studied systematically. (See
Seidenberg & Zevin, in press, for a recent discussion of these issues.)

The focus of distributed models on learning mappings between codes provides the potential to
account for modality-specific expertise in bilinguals because the visual versus auditory
domains and the spoken versus written domains instantiate different types of codes. Expertise
in one domain does not necessarily transfer to expertise in another. Plaut (2002) provided a
demonstration of how graded, modality-specific specialization can occur in a self-organizing
distributed model of (monolingual) naming and gesturing, an approach that is readily extendible
to bilingual language processing.

Finally, the ability of distributed systems to capture change over time allows them to address
issues of language decay, either when a language is no longer used (language attrition, e.g.,
Weltens & Grendel, 1993; see Thomas, 1997a, for discussion of insights from computational
learning systems) or when a previously working bilingual system experiences deficits following
brain damage (see Plaut, 1996, for discussion of distributed approaches in the monolingual
domain).

In sum, much of the potential of distributed modeling shown in the monolingual domain remains
to be exploited in the study of bilingual language processing.

Advantages and Disadvantages of Localist and Distributed Models

Localist and distributed connectionist models share the property that their behavior is
strongly influenced by the structure of the problem domain that they encode. There are three
principal dimensions for which the emphasis of these models differs in practice. First,
localist networks tend to include both bottom-up and top-down connections, which allows dynamic
patterns of activation to cycle through the network. Activation states therefore persist over
time, allowing localist models to study the trajectory of activation states while processing a
single input, for instance, in terms of candidates that are initially activated before the
system settles on a final solution or in terms of short-term priming effects as discussed in
the BIA model. In addition, alterations in top-down activation, for instance, from language
nodes in the BIA model or in the baseline activation of the word units in each language in the
BIMOLA, allow localist models to investigate the implications of changes in language context on
recognition times. On the other hand, many distributed models have employed only bottom-up or
feedforward connections. This means that processing in such distributed models is completed by
a single pass through the network. Activation values are computed in a single set of
calculations, with no unfolding of activation states over time. However, as the complexity of
distributed models increases, this distinction is becoming less salient. So-called attractor
networks
are trainable distributed networks with both bottom-up and top-down connections (sometimes
called recurrent connections). Both the bilingual SRN model and the SOMBIP permit cycles of
activation in their architecture (see also Thomas, 1997a, for an extension of feedforward
networks to modeling short-term priming effects using cascading activation in a feedforward
network as proposed by Cohen et al., 1990).

The second, related dimension in which the modeling approaches differ relates to the type of
data to which the output of the models is typically compared. Localist models with cycling
activation eventually settle to a solution, which is either correct or incorrect given the
task. These models therefore generate two types of data: a response time (number of cycles
until the network settles into a stable solution) and an accuracy level (the percentage of
trials on which the model settles onto the correct solution). Distributed models like the BSN
that use a single set of calculations to derive activation values have no temporal component to
processing. Although the accuracy of the output can be calculated, there is no equivalent to
response time. This restricts the data against which simple distributed models can be compared.

The third difference relates to the predominant use of handwired, fixed connections in localist
networks compared to learned connections in distributed models. As discussed, this
characteristic makes localist models more amenable to investigating the static structure of the
adult bilingual language system, whereas distributed models are more amenable to examining
processes of development and change within this system (Thomas, 2002). In principle, localist
models can learn their connection strengths (Page, 2000), but this possibility has yet to be
exploited in bilingual research.

Although we have seen both localist and distributed models explore the behavior of a bilingual
system in which proficiency is greater in one language than in the other, it is nevertheless
true that connectionist models of bilingual language comprehension have failed to address the
implications of acquiring an L2 when the processing structures for an L1 are already in place.
In the localist model, the unbalanced bilingual was simulated by giving the two sets of
language nodes different resting activation levels; in the distributed models (both the BSN and
the SOMBIP), the unbalanced bilingual was simulated by training the system on two languages
simultaneously, but with one language represented in the training set more than the other. L2
acquisition remains an area to be explored using connectionist models.

In our discussion on approaches to modeling, we highlighted two central issues arising in the
way that computational models are related to theory. The first of these was the importance of
understanding how a model works so that a successful simulation can be directly related back to
the theory that it was evaluating. This characteristic might be referred to as the semantic
transparency of a model. On this point, note that localist computational models have often been
viewed as superior because the activation of each processing node corresponds to the confidence
level that a certain concept is present in the input (whether it is a letter feature, a letter,
a word, or a language). Every activation state can therefore be readily interpreted. Even
unexpected emergent characteristics of IA models that arise from the combination of bottom-up
and top-down connections can be recharacterized in theoretical terms.

On the other hand, distributed models in their trained state produce activation patterns across
the hidden units without immediate semantic interpretations. Although analytical tools are
available to investigate the computational solutions that the distributed network has learned
(for instance, the principal components analysis that produced diagrams of the similarity
structure of the internal representations in the BSN in Fig. 10.5), the requirement of
additional analysis testifies to their reduced semantic transparency. The consequence is
increased difficulty in relating particular distributed models back to the theory that
generated them.[1]

This brings us to our second issue. Decisions about the particular processing structures chosen
in the model may implicitly influence the theoretical hypotheses considered by the modeler.
Distributed models are often harder to interpret because, during learning, the network explores
a wider range of solutions than the modeler considers when handwiring a localist model.
Although the distributed model is potentially less easy to understand, it is also potentially
richer.

Two examples suffice. The BIA model includes language nodes that receive activation from the
word units in a given language. Each word is effectively given equal membership to a language,
an assumption the modeler makes by giving the same value to each connection between a word unit
and its language node. However, in a model like the BSN, although language context information
is provided with every input, the network is under no
obligation to use this information in learning the meaning of each word. The essential point
about trainable networks is that they evolve processing structures sufficient to achieve the
task. Therefore, it is quite possible that a distributed system will use language context
information only to the extent that it is necessary to perform the task (Thomas, 2002). For
instance, it may use it to disambiguate interlingual homographs, but not cognate homographs,
for the latter have the same status whatever the language context. Thus, the distributed model
allows itself the theoretical hypothesis that language membership may not be a universal,
uniform representational construct, but a continuum that depends on the specific demands of the
task. The flip side is that network solutions embodying such "shades of gray" hypotheses may be
harder to understand, but that does not make such hypotheses a priori more unlikely.

Second, representational states adopted by distributed systems may change the way we interpret
empirical data. Here, we take an example from monolingual word recognition. Early in the
recognition of an ambiguous word like "bank," priming can be found for both the word's meanings
("money" and "river"). A localist interpretation might encourage the following view: that there
are independent, localist representations of each reading of the ambiguous word, and that both
readings are initially activated during word recognition, but that subsequently the system
settles into a single context-appropriate reading. However, a recurrent distributed network
trained to recover the meanings of words from their written forms offers another theoretical
possibility.

Kawamoto (1993) used a single representational resource to store the form-meaning mappings for
a single language. In this network, the two meanings of an ambiguous word were distinguished by
the context in which they were used. When the model was required to recognize the word in one
of those two contexts, it went through an intermediate

able to account for empirical data suggestive of a localist architecture. Again, the implicit
assumptions within different models encourage consideration of different theoretical
hypotheses.

Finally, connectionist models embody abstract principles of neural computation and, as seen in
the Li and Farkas model, the SOMBIP, a desire to include more constraints from neural
processing. This approach encourages us to hope that one day connectionist models of cognitive
processing in bilingualism may make contact with functional data gathered from the neural
substrate, both under normal circumstances and in breakdown.

What type of data might the more neurally plausible bilingual connectionist model attempt to
account for? In bilingual aphasia, evidence suggests one or both languages may be impaired by
brain damage, and patterns of recovery include the parallel recovery of both languages,
selective recovery of one language, antagonistic recovery of one language at the expense of
another, or alternate antagonistic recovery in which selective impairment of
comprehension/production can alternate between the languages during recovery (see e.g., Green &
Price, 2001). Such evidence suggests that bilingual connectionist models will have to be able
to simulate selective damage and recovery of a single language as well as selective damage of
control structures necessary to account for alternate antagonism (Green, 1986, 1998).

In bilingual brain imaging, data from comprehension studies suggested that early bilinguals,
who receive equal practice with their two languages from birth, process both languages with
common neural machinery, corresponding to the classical language areas of the brain. In late
bilinguals, the degree of language proficiency determines the pattern of neural organization,
with highly proficient L2 users showing common areas, but less-proficient subjects showing
different patterns of activation for the two languages (Abutalebi, Cappa, & Perani, 2001; see
also chapter 24, this volume). Distributed
representational state that bore similarity to both connectionist models embodying increasing neural
meanings before diverging to settle on one of the constraints should aim to reect the role of lan-
meanings. This model could account for the em- guage prociency in determining whether the bi-
pirical data without requiring the simultaneous lingual comprehension system engages separate or
activation of independent, competing representa- combined computational machinery.
tions because the intermediate, hybrid state could
prime words related to either meaning. This model
is relevant because the simultaneous activation of Conclusion
independent, subsequently competing representa-
tions is a processing assumption that is normally We have argued that the use of computational models
built into localist models, yet a distributed system is essential for the development of psycholinguistic
with a single representational resource may well be theories of bilingual language comprehension (and
indeed production). We explored localist and distributed networks, examining the bilingual data to which each model has been applied, and the types of bilingual phenomena that each model has the potential to illuminate. Nevertheless, the modeling of bilingual language processing is at an early stage. The localist approach is perhaps more advanced than the distributed approach and, at this stage of theory development, perhaps a more useful research tool. On the other hand, distributed models may have the greater potential given the range of phenomena that have been explored within the unilingual domain that have direct relevance to bilingual language processing. Currently, it is a time for researchers to use different modeling tools to investigate different issues within bilingualism. Eventually, however, these models must come together to generate a (semantically transparent) model that explains how two languages can be acquired, maintained, and controlled in a dynamically changing cognitive system.

Note

1. Interestingly, Li and Farkas (2002) claimed that their model combines the advantages of both localist and distributed models in that it is not only trainable, but also semantically interpretable. Clearly, a model that can address developmental phenomena as well as being easily comprehensible is advantageous. However, the semantic transparency of their model is achieved at some expense to psychological plausibility. The SOMBIP forces not only its words, but also its meanings to be represented over only two dimensions each. It is not clear that this is plausible on psychological grounds, or indeed on neural grounds; the inspiration for cortical maps comes from sensory cortex. Whether the representations of meaning or word forms are driven by the same organizational principles is an open question. However, the depiction of all possible word meanings as x-y coordinates on two dimensions would seem perhaps too great a simplification for the representation of semantics, compared to the more usual depiction of meaning as a large set of (perhaps hierarchically structured) features.

References

Abutalebi, J., Cappa, S. F., & Perani, D. (2001). The bilingual brain as revealed by functional neuroimaging. Bilingualism: Language and Cognition, 4, 179-190.
Baayen, H., Piepenbrock, R., & Van Rijn, H. (1993). The CELEX lexical database (CD-ROM). Philadelphia: University of Pennsylvania, Linguistic Data Consortium.
Bijeljac-Babic, R., Biardeau, A., & Grainger, J. (1997). Masked orthographic priming in bilingual word recognition. Memory and Cognition, 25, 447-457.
Blackwell, A., & Broeder, P. (1992, May). Interference and facilitation in SLA: A connectionist perspective. Paper presented at Seminar on Parallel Distributed Processing and Natural Language Processing, University of California at San Diego.
Broeder, P., & Plunkett, K. (1994). Connectionism and second language acquisition. In N. C. Ellis (Ed.), Implicit and explicit learning of languages (pp. 421-455). London: Academic Press.
Chen, H.-C., & Ng, M. L. (1989). Semantic facilitation and translation priming effects in Chinese-English bilinguals. Memory and Cognition, 17, 454-462.
Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychological Review, 97, 332-361.
Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. C. (2001). DRC: A dual cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204-256.
Cristoffanini, P., Kirsner, K., & Milech, D. (1986). Bilingual lexical representation: The status of Spanish-English cognates. The Quarterly Journal of Experimental Psychology, 38A, 367-393.
De Groot, A. M. B. (1992). Bilingual lexical representation: A closer look at conceptual representations. In R. Frost & L. Katz (Eds.), Orthography, phonology, morphology, and meaning (pp. 389-412). Amsterdam: Elsevier.
De Groot, A. M. B., Delmaar, P., & Lupker, S. J. (2000). The processing of interlexical homographs in translation recognition and lexical decision: Support for non-selective access to bilingual memory. Quarterly Journal of Experimental Psychology, 53A, 397-428.
Dijkstra, A., De Bruijn, E. R. A., Schriefers, H. J., & Ten Brinke, S. (2000). More on interlingual homograph recognition: Language intermixing versus explicitness of instruction. Bilingualism: Language and Cognition, 3, 69-78.
Dijkstra, A., Grainger, J., & Van Heuven, W. J. B. (1999). Recognition of cognates and interlingual homographs: The neglected role of phonology. Journal of Memory and Language, 41, 496-518.
Dijkstra, A., Van Jaarsveld, H., & Ten Brinke, S. (1998). Interlingual homograph recognition: Effects of task demands and language intermixing. Bilingualism: Language and Cognition, 1, 51-66.
Dijkstra, A., Timmermans, M., & Schriefers, H. (2000). Cross-language effects on bilingual homograph recognition. Journal of Memory and Language, 42, 445-464.
Dijkstra, A., & Van Heuven, W. J. B. (1998). The BIA model and bilingual word recognition. In J. Grainger & A. M. Jacobs (Eds.), Localist connectionist approaches to human cognition (pp. 189-225). Mahwah, NJ: Erlbaum.
Dijkstra, A., & Van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175-197.
Dijkstra, A., Van Heuven, W. J. B., & Grainger, J. (1998). Simulating competitor effects with the bilingual interactive activation model. Psychologica Belgica, 38, 177-196.
Ellis, A., & Lambon-Ralph, M. A. (2000). Age of acquisition effects in adult lexical processing reflect loss of plasticity in maturing systems: Insights from connectionist networks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1103-1123.
Elman, J. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Ferrand, L., & Grainger, J. (1992). Phonology and orthography in visual word recognition: Evidence from masked nonword priming. Quarterly Journal of Experimental Psychology, 42A, 353-372.
Forster, K. I. (1976). Accessing the mental lexicon. In E. C. J. Walker & R. J. Wales (Eds.), New approaches to language mechanisms. Amsterdam: North-Holland.
Forster, K. I., Davis, C., Schoknecht, C., & Carter, R. (1987). Masked priming with graphemically related forms: Repetition or partial activation. The Quarterly Journal of Experimental Psychology, 39A, 211-251.
French, R. M. (1998). A simple recurrent network model of bilingual memory. In M. A. Gernsbacher & S. J. Derry (Eds.), Proceedings of the 20th Annual Conference of the Cognitive Science Society (pp. 368-373). Mahwah, NJ: Erlbaum.
Gasser, M. (1990). Connectionism and universals of second language acquisition. Studies in Second Language Acquisition, 12, 179-199.
Gerard, L. D., & Scarborough, D. L. (1989). Language-specific lexical access of homographs by bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 305-315.
Gilbert, S. J., & Shallice, T. (2002). Task switching: A PDP model. Cognitive Psychology, 44, 297-337.
Grainger, J., & Dijkstra, A. (1992). On the representation and use of language information in bilinguals. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 207-220). Amsterdam: Elsevier Science.
Grainger, J., & Jacobs, A. M. (1996). Orthographic processing in visual word recognition: A multiple read-out model. Psychological Review, 103, 518-565.
Green, D. W. (1986). Control, activation and resource: A framework and a model for the control of speech in bilinguals. Brain and Language, 27, 210-223.
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67-81.
Green, D. W., & Price, C. J. (2001). Functional imaging in the study of recovery patterns in bilingual aphasia. Bilingualism: Language and Cognition, 4, 191-201.
Grosjean, F. (2001). The bilingual's language modes. In J. L. Nicol (Ed.), One mind, two languages: Bilingual language processing. Oxford: Blackwell.
Jacobs, A. M., & Grainger, J. (1992). Testing a semi-stochastic variant of the interactive activation model in different word recognition experiments. Journal of Experimental Psychology: Human Perception and Performance, 18, 1174-1188.
Jacobs, A. M., Rey, A., Ziegler, J. C., & Grainger, J. (1998). MROM-p: An interactive activation, multiple read-out model of orthographic and phonological processes in visual word recognition. In J. Grainger & A. M. Jacobs (Eds.), Localist connectionist approaches to human cognition (pp. 147-188). Mahwah, NJ: Erlbaum.
Kawamoto, A. H. (1993). Nonlinear dynamics in the resolution of lexical ambiguity: A parallel distributed processing account. Journal of Memory and Language, 32, 474-516.
Kirsner, K., Smith, M. C., Lockhart, R. L. S., King, M. L., & Jain, M. (1984). The bilingual lexicon: Language-specific units in an integrated network. Journal of Verbal Learning and Verbal Behavior, 23, 519-539.
Klein, D., & Doctor, E. A. (1992). Homography and polysemy as factors in bilingual word recognition. South African Journal of Psychology, 22, 10-16.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149-174.
Lewy, N., & Grosjean, F. (1997). A computational model of bilingual lexical access. Manuscript in preparation, Neuchâtel University, Switzerland.
Li, P., & Farkas, I. (2002). A self-organizing connectionist model of bilingual processing. In
R. Heredia & J. Altarriba (Eds.), Bilingual sentence processing (pp. 59-85). North-Holland, The Netherlands: Elsevier Science.
Mareschal, D., & Shultz, T. R. (1996). Generative connectionist architectures and constructivist cognitive development. Cognitive Development, 11, 571-605.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception, Part 1: An account of basic findings. Psychological Review, 88, 375-405.
Meyer, D. E., & Ruddy, M. G. (1974, June). Bilingual word recognition: Organization and retrieval of alternative lexical codes. Paper presented at the annual meeting of the Eastern Psychological Association, Philadelphia.
Morton, J. (1969). Interaction of information in word recognition. Psychological Review, 76, 165-178.
Page, M. (2000). Connectionist modelling in psychology: A localist manifesto. Behavioral and Brain Sciences, 23, 443-512.
Plaut, D. C. (1995). Semantic and associative priming in a distributed attractor network. In Proceedings of the 17th Annual Conference of the Cognitive Science Society (pp. 37-42). Hillsdale, NJ: Erlbaum.
Plaut, D. C. (1996). Relearning after damage in connectionist networks: Toward a theory of rehabilitation. Brain and Language, 52, 25-82.
Plaut, D. C. (2002). Graded modality-specific specialization in semantics: A computational account of optic aphasia. Cognitive Neuropsychology, 19, 603-639.
Potter, M. C., So, K.-F., Von Eckardt, B., & Feldman, L. B. (1984). Lexical and conceptual representation in beginning and more proficient bilinguals. Journal of Verbal Learning and Verbal Behavior, 23, 23-38.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89, 60-94.
Seidenberg, M. S. (1993). Connectionist models and cognitive theory. Psychological Science, 4, 228-235.
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.
Seidenberg, M. S., & Zevin, J. (in press). Computational models in cognitive development: The case of critical periods in language learning. In M. Johnson & Y. Munakata (Eds.), Attention and Performance 21. Oxford, U.K.: Oxford University Press.
Smith, M. C. (1997). How do bilinguals access lexical information? In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 145-168). Hillsdale, NJ: Erlbaum.
Sokolik, M., & Smith, M. (1992). Assignment of gender to French nouns in primary and secondary language: A connectionist model. Second Language Research, 8, 39-58.
Thomas, M. S. C. (1997a). Connectionist networks and knowledge representation: The case of bilingual lexical processing. Unpublished doctoral thesis, Oxford University, England.
Thomas, M. S. C. (1997b). Distributed representations and the bilingual lexicon: One store or two? In J. Bullinaria, D. Glasspool, & G. Houghton (Eds.), Proceedings of the Fourth Annual Neural Computation and Psychology Workshop (pp. 240-253). London: Springer.
Thomas, M. S. C. (1998). Bilingualism and the single route/dual route debate. In M. A. Gernsbacher & S. J. Derry (Eds.), Proceedings of the 20th Annual Conference of the Cognitive Science Society (pp. 1061-1066). Mahwah, NJ: Erlbaum.
Thomas, M. S. C. (2002). Theories that develop. Bilingualism: Language and Cognition, 5, 216-217.
Thomas, M. S. C., & Plunkett, K. (1995). Representing the bilingual's two lexicons. In J. D. Moore & J. F. Lehman (Eds.), Proceedings of the 17th Annual Cognitive Science Society Conference (pp. 760-765). Hillsdale, NJ: Erlbaum.
Van Heuven, W. J. B., & Dijkstra, A. (2001, September). The Semantic, Orthographic, and PHonological Interactive Activation Model. Poster presented at the 12th Conference of the European Society for Cognitive Psychology, Edinburgh, Scotland.
Van Heuven, W. J. B., & Dijkstra, A. (2003). The semantic, orthographic, and phonological interactive activation model. Manuscript in preparation.
Van Heuven, W. J. B., Dijkstra, A., & Grainger, J. (1998). Orthographic neighborhood effects in bilingual word recognition. Journal of Memory and Language, 39, 458-483.
Weltens, B., & Grendel, M. (1993). Attrition of vocabulary knowledge. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 135-155). Amsterdam: Benjamins.
Ziegler, J. C., & Perry, C. (1998). No more problems in Coltheart's neighborhood: Resolving neighborhood conflicts in the lexical decision task. Cognition, 68, B53-B62.
Rosa Sánchez-Casas
José E. García-Albea
11
The Representation of Cognate
and Noncognate Words
in Bilingual Memory
Can Cognate Status Be Characterized
as a Special Kind of Morphological Relation?
ABSTRACT One of the main issues addressed in bilingual research has been how bi-
linguals represent and access the words from their two languages. Studies carried out in
different languages suggest that the distinction between cognate (words that are similar
in form and meaning) and noncognate (words only similar in meaning) translations can
be relevant in determining how words are represented in the bilingual lexicon. In the
present chapter, we review a program of research in which we examined the visual
recognition of these two types of translations in Spanish-English and Catalan-Spanish
bilinguals in experiments using the priming paradigm and the lexical decision task.
These experiments showed on the one hand that facilitation effects are only obtained
with cognate translations and on the other that these effects cannot be the result of
mere form and/or meaning similarity. The latter also seems to hold for morphological
priming effects within a language. On the other hand, they showed that cognate fa-
cilitation effects do not differ from those obtained with morphologically related words
both within and between languages. On the basis of this evidence, we propose that
cognate words are represented differently from other words related across languages
(either by form or meaning), and that this representation can be characterized in
morphological terms. The implications of these data for models of bilingual word
recognition are discussed.
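
The three-way contrast that organizes this chapter (cognates, noncognate translations, and false friends) can be illustrated with a minimal sketch. It assumes that form overlap is approximated by the similarity ratio from Python's standard difflib module and that a cutoff of 0.5 separates "similar" from "dissimilar" forms; both choices, like the function names, are hypothetical and are not the operationalization used in the experiments reviewed here. Whether two words share a meaning is supplied by hand rather than computed. The word pairs are examples that appear in the chapter.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff on orthographic similarity; not taken from the chapter.
FORM_THRESHOLD = 0.5


def form_similarity(word_a: str, word_b: str) -> float:
    """Return a 0-1 orthographic similarity score (difflib's ratio)."""
    return SequenceMatcher(None, word_a.lower(), word_b.lower()).ratio()


def classify_pair(word_a: str, word_b: str, same_meaning: bool) -> str:
    """Label a cross-language word pair by its form and meaning overlap."""
    similar_form = form_similarity(word_a, word_b) >= FORM_THRESHOLD
    if same_meaning and similar_form:
        return "cognate"
    if same_meaning:
        return "noncognate"
    if similar_form:
        return "false friend"
    return "unrelated"


# Spanish-English pairs mentioned in the chapter; the same_meaning flag
# is assigned by the researcher, not derived from the strings.
PAIRS = [
    ("rico", "rich", True),
    ("torre", "tower", True),
    ("animal", "animal", True),
    ("libro", "book", True),
    ("hoja", "sheet", True),
    ("gamo", "game", False),
    ("torno", "torch", False),
]

for spanish, english, same_meaning in PAIRS:
    label = classify_pair(spanish, english, same_meaning)
    print(f"{spanish}-{english}: {label}")
```

A purely form-and-meaning classification of this kind is exactly the reduction the chapter argues against: it labels the pairs but says nothing about whether cognates share a morphological representation.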
Representation of Cognates 227
both the lexical and conceptual levels (see De Groot, 1993, 2001; Kroll, 1993; and Sánchez-Casas, 1999, for reviews).
The answer to this question appears to depend on two types of variables: variables related to the language user, such as level of proficiency, experience, and learning environment of the second language (L2) (e.g., Chen, 1990; De Groot & Poot, 1997; Kroll, 1993; Kroll & Stewart, 1994; Potter et al., 1984; Talamas, Kroll, & Dufour, 1999); and word type variables, such as cognate status, concreteness, and word frequency (e.g., Davis, Sánchez-Casas, & García-Albea, 1991; De Groot, 1993; De Groot & Nas, 1991; Dijkstra, Grainger, & Van Heuven, 1999; García-Albea, Sánchez-Casas, & Igoa, 1998; Gollan, Forster, & Frost, 1997; Sánchez-Casas, Davis, & García-Albea, 1992). (See De Groot, 2001, for a general overview of the effects of these variables.) The research reported in this chapter focuses on the role of one word type variable, cognate status. Its general aim is to provide some preliminary evidence that morphological relationships across languages determine the way cognate words are represented in the bilingual lexicon.
The influence of morphology in language processing has attracted increasing attention in psycholinguistic research in recent years, especially in relation to issues concerning lexical representation and processes. This is not surprising if one takes into account that morphological features are considered critical in defining the structure of words, thereby affecting other linguistic levels (semantic, syntactic, and phonological/orthographic) (García-Albea et al., 1998). Many studies in different languages have investigated the role of morphology in the organization of the mental lexicon (e.g., Feldman, 1995; Frost & Katz, 1992; Sandra & Taft, 1994; see the special issue of Language and Cognitive Processes, Frost & Grainger, 2000). However, these studies have been restricted to experiments with native speakers of a single language, and very few attempts have been made to explore this issue across languages (but see Cristoffanini, Kirsner, & Milech, 1986; Kirsner, Lalor, & Hird, 1993).
In monolingual research, many studies have attempted to determine whether morphological relations are represented independent of both form (orthographic/phonological) and meaning relationships (see García-Albea et al., 1998, for a review). This question is a critical one because words that are morphologically related share a common root or stem; because of this, they tend to share orthographic/phonological characteristics as well as semantic features. Therefore, morphological relations may in fact have to be reduced to a convergence of semantic, phonological, and orthographic relationships without explicit representation themselves in the lexicon. Generally, these studies have provided evidence that morphological relations are something special and different from form and meaning relationships, and that they can be coded in the lexicon as an independent level of representation (e.g., Drews & Zwitserlood, 1995; Frost, Deutsch, Gilboa, Tannenbaum, & Marslen-Wilson, 2000; Frost, Forster, & Deutsch, 1997; García-Albea et al., 1998; Marslen-Wilson, Tyler, Waksler, & Older, 1994).
In bilingual research, similar questions could be asked, now regarding word relations across languages. In particular, we deal with cognate translations that, similar to morphologically related words, have a common root and are semantically and orthographically (and at times phonologically) similar (e.g., rico-rich, torre-tower) or even identical (e.g., animal-animal). Given these similarities, it may be the case that a cognate relation between words can be reduced to a mere form or meaning similarity between them. However, it may also be the case that a cognate relation is a special kind of morphological relation, and that, like some morphologically related words, cognates are jointly represented in the bilingual lexicon. These are the two issues addressed in this chapter (see also Friel & Kennison, 2001, for a comparison of methods for operationalizing cognate status).
First, we review some studies that deal with priming effects with cognate and noncognate translations, thus showing the relevance of the distinction between these two types of translations in the study of bilingual lexical representation. Second, we describe a series of experiments carried out with Spanish-English and Catalan-Spanish bilinguals; these experiments investigated the role of form or meaning similarity in lexical representation across languages by contrasting the cognate relation between words with two other types of word relations: those for which the critical words were only similar in meaning (i.e., noncognate words, e.g., libro-book, hoja-sheet) and those for which the critical words only shared a similar form (false friends, e.g., gamo-game, torno-torch). Third, we review some preliminary evidence from two further experiments with Catalan-Spanish bilinguals, in which morphological effects across languages and within the same language are compared for both cognate and noncognate translations.
Finally, we discuss the implications of these findings for current bilingual lexical models.

The Cognate and Noncognate Distinction: Evidence From Priming Studies

As mentioned, there is evidence to suggest that one of the variables that modulates the way in which words are represented in bilingual memory concerns the characteristics of the words themselves. One of these characteristics examined in a variety of studies and across a variety of languages is the cognate status of the translation pair. In these studies, the priming paradigm was one of the procedures most often used. This paradigm tests whether the presentation of a word (the prime) facilitates the recognition of another word (the target), which is subsequently presented. The prime either can be clearly visible (unmasked or standard priming) or presented under conditions in which it is not available for conscious report (masked priming). The response times and error rates on primed target words are compared to those on unprimed words (controls) using tasks such as lexical decision, naming, or semantic categorization. In the bilingual version of this procedure, prime and target belong to different languages, which makes it possible to determine whether priming effects are obtained across languages and thus to investigate the possible connections between words from the two languages in bilingual memory.
The studies that used the standard priming procedure found different patterns of priming effects with cognate and noncognate translations. Some studies showed facilitatory effects with cognate translations (e.g., Cristoffanini et al., 1986; Gerard & Scarborough, 1989; Kerkman, 1984, cited in De Groot, 1993), and others failed to obtain facilitation with noncognate translations (e.g., Cristoffanini et al., 1986; Kirsner, 1986; Kirsner, Brown, Abrol, Chadha, & Sharma, 1980; Kirsner, Smith, Lockhart, King, & Jain, 1984; Scarborough, Gerard, & Cortese, 1984). Yet, other studies, which used shorter SOAs (stimulus onset asynchronies) than those just referred to (less than 300 ms), obtained facilitatory effects with both cognate (De Groot & Nas, 1991) and noncognate translations (e.g., Altarriba, 1992; Chen & Ng, 1989; De Groot & Nas, 1991; Jin, 1990; Schwanenflugel & Rey, 1986).
One of the problems with the standard priming technique used in these studies is that the observed facilitation effects may not be reflecting lexical processing. This may be either because the technique is sensitive to episodic contamination or because of the influences of strategic factors. That is, if the priming technique used allows subjects to consciously identify the priming stimulus, this stimulus will be recorded in episodic memory, and then it would not be possible to separate a lexical priming effect from general memory effects. This will be the case when both long and short SOAs are used (e.g., De Groot, 1983; Feldman & Moskovlijevic, 1987; Feustel, Shiffrin, & Salasoo, 1983; Forster & Davis, 1984; Oliphant, 1983). When the prime presentation is long, subjects will become aware of its relation with the target and can develop a strategy to translate the prime before the target is presented. If this happens, the observed cross-language priming effects could be caused by a reactivation of the episodic trace of the prime's translation and not by residual activation in the reactivated prime's lexical representation. In that case, one would be measuring within-language priming and not cross-language priming (Gollan et al., 1997).
Presenting the prime for a shorter period and following it immediately by the target also is not free of problems because the subjects are still aware of the prime, and the relation between prime and target becomes more transparent. This may encourage the use of certain response strategies. As an example, it is possible that subjects are more inclined to make "yes" responses when the target is preceded by a related word (a translation or a semantically related word) than when it follows an unrelated prime, thus resulting in priming for the related prime-target pairs (for a detailed discussion of the contribution of episodic and strategic factors to priming effects, see De Groot & Nas, 1991; Forster, 1998; Forster & Davis, 1984; Tenpenny, 1995).
The influence of these factors can be reduced if the prime is presented under masked conditions (Forster & Davis, 1984; Forster, Davis, Schoknecht, & Carter, 1987). In this procedure, a sequence of visual stimuli is presented in rapid succession, with each stimulus superimposed on the previous one. First, a sequence of hatches (#) is presented for 500 ms, acting as a forward mask. Then, the prime is displayed in lowercase for about 60 ms; finally, the uppercase target is presented for 500 ms. The target also acts as a backward mask. Under these conditions, subjects are not aware of the prime, which reduces the possible influence of episodic and strategic factors. Importantly, there is clear evidence that supports the lexical nature
of the masked priming effects, making this proce- masked priming effects (e.g., Davis & Schoknecht,
dure more adequate to examine issues related to 1996; Gollan et al., 1997; Jiang, 1999).
bilingual lexical memory than the unmasked pro- The relevance of the exact nature of the masking
cedures (e.g., Forster et al., 1987; Forster & Davis, procedure was pointed out by Davis et al. (1991).
1984). Specifically, De Groot and Nas (1991) and
Later studies adopted this or a similar priming Williams (1994) presented the primes in uppercase
procedure (e.g., De Groot & Nas, 1991; García- letters and the targets in lowercase. Furthermore,
Albea et al., 1998; Gollan et al., 1997; Grainger & the prime was longer than the target in number of
Frenck-Mestre, 1998). These studies aimed at in- letters. Consequently, the masking of the prime
vestigating the central concern of this chapter: the may not have been complete, and therefore the
cognate status of translation pairs. De Groot and prime could have been more available to the
Nas (1991) carried out a series of lexical decision subject than in the display used by Forster and
experiments (with English-Dutch bilinguals) in Davis (1984). In fact, reanalyzing the data reported
which subjects had to decide whether a sequence by De Groot and Nas along these lines, it can be
of letters was a word. Cognate and noncognate observed that the magnitude of the priming effects
prime-target pairs were compared using both with noncognate words decreased as a function of
masked and unmasked procedures. Both cognate prime availability. In particular, when the prime
and noncognate translations showed facilitatory was visible, the reported effect was 113 ms; when
effects, but cross-language associative priming (i.e., the prime and target were not matched in length,
priming between semantic associates) was only the effect was 40 ms; and when Forster and Davis's
observed in the case of cognate translations (e.g., display was used (lowercase primes and uppercase
baker will prime brood [bread in English], but targets), the effect was reduced to 22 ms. Given
blanket will not prime laken [sheet in English]). these considerations, the results of De Groot and
On the basis of these findings, the authors sug- Nas are not necessarily inconsistent with the pat-
gested that the representation of the words in tern of results that other authors, using masked
the two types of translation pairs are connected priming, reported for cognate and noncognate
at the lexical level of representation (as tapped by translations.
the translation priming technique), but that only Clearer evidence concerning the distinction be-
the representations of words in cognate translation tween cognate and noncognate translations was
pairs, not those of noncognate translations, are provided by Davis et al. (1991) in a series of ex-
linked at the conceptual level of representation (as periments carried out with Spanish-English bilin-
tapped by the association priming technique). guals who were competent in both languages.
Like De Groot and Nas (1991), Williams These authors also used a lexical decision task
(1994), using the same masking procedure and the combined with a masked priming procedure (For-
lexical decision task, also reported facilitatory ef- ster et al., 1987) to compare the pattern of effects
fects with noncognate translations. However, other of cognate and noncognate translations. But, con-
studies have failed to do so (Garca-Albea et al., trary to previous studies, they tested their bilingual
1998; Garca-Albea, Sanchez-Casas, Bradley, & subjects in the two language directions (i.e., Span-
Forster, 1985; Grainger & Frenck-Mestre, 1998; ish primeEnglish target and English prime
Sanchez-Casas, Davis, & Garca-Albea, 1992). Spanish target). Three priming conditions were
This inconsistency may be explained by two fac- included for each translation type: (a) an identity
tors: the language in which prime and target were condition for which prime and target were the
presented (either rst language L1-L2 or L2-L1) same word (e.g., clearclear, tailtail); (b) a trans-
and the exact masking procedure used. lation condition for which the target was the trans-
The possible relevance of the languages of the lation of the prime (e.g., claroclear, colatail); and
prime and the target was suggested by Grainger (c) a form control condition for which the target
and Frenck-Mestre (1998). In particular, the au- was preceded by a nonword prime with the same
thors proposed that, for priming effects to emerge, orthographic similarity with the target, as in the
the prime has to be in L1 and the target in L2, case of cognate translation pairs (e.g., clarnclear,
which is the way that both De Groot and Nas tairtail). In addition to this form control, Davis et
(1991) and Williams (1994) tested their bilingual al. varied the form overlap (in number of letters)
subjects. As we discuss in the next section later, this between the cognate translation pairs (e.g., rich
suggestion is consistent with the results of other rico, towertorre), because some previous un-
studies that have obtained a pattern of asymmetric masked priming results had suggested that the
cognate effects may be caused by the cognates' form similarity (Gerard & Scarborough, 1989). The
authors also examined the possible influence of language dominance by testing three groups of
bilinguals (balanced, Spanish dominant, English dominant) and a fourth group with a low competence
in English (that they called semibilinguals).

Davis et al. (1991) found that cognate translations produced a robust facilitatory effect; there was
no trend for noncognate priming at all. What is more, the size of the cognate priming effect was the
same as that of the effect observed with identity prime-target pairs. In addition, the cognate
facilitatory effects were not affected by the degree of orthographic similarity. Finally, in
contrast to other masked priming studies also examining proficient bilinguals (e.g., Gollan et al.,
1997), those effects were obtained regardless of the language of prime and target (i.e., they
occurred both when Spanish words were used as primes and English as targets and vice versa) and were
of a similar size in the three bilingual groups (balanced, Spanish dominant, and English dominant).
In some respects, the pattern of priming effects for the semibilingual group differed from that
observed for the bilingual groups. In particular, this group of Spanish speakers, less proficient in
their L2 (i.e., English), also did not show noncognate priming effects; priming effects were
obtained with cognate words, although only when the prime was in L1 (Spanish) and the target in L2
(English). This suggests that some level of competence is required for cognate priming effects to
emerge in both directions (see Mildred, 1986, for similar evidence).

Leaving aside for the moment the pattern of results for the semibilingual group, the findings
reported by Davis et al. (1991) for the proficient bilingual groups clearly demonstrated that
cognate and noncognate translations showed different patterns of priming effects. Interestingly,
they provided some initial evidence to support the view that the degree of form similarity does not
affect the magnitude of the facilitatory effects obtained for cognates, and that meaning similarity
by itself (as in noncognate words) does not suffice to produce cross-language priming effects. It
should also be stressed that cognate priming effects were not different from identity priming
effects, did not change as a function of the language direction of the prime and target, and were
not affected by the language dominance of the bilingual participants. What do these results suggest?

To answer this question, we first have to ask how masked priming effects arise. One possible
interpretation is that such priming effects are the result of persistent activation in the critical
memory representations (e.g., Evett & Humphreys, 1981). That is, when the prime is presented, it
induces activation in the corresponding word detector in memory. The induced activation is assumed
to persist for some time after stimulus offset, so that when the target is subsequently presented,
the corresponding word detector will still be in an activated state. Under this interpretation,
priming effects should be expected to grade with the degree of prime/target form overlap. However,
as shown, cognate priming effects were not smaller than identity priming effects, despite the fact
that the words in cognate pairs differed from one another in at least one letter.

An alternative interpretation was suggested by Forster and colleagues (Forster et al., 1987; Forster
& Davis, 1984). These authors interpreted the priming effect as a postaccess effect, which they
called entry opening. Their proposal was based on the idea that visual word recognition can be
viewed as a table look-up procedure, for which a stimulus is matched against a stored lexical
representation by consulting a table of learned correspondences. Specifically, they suggested that
some abstract representation of the stimulus is first used to select a set of compatible lexical
candidates. Those candidates are said to be examined in parallel for their congruency with a fuller
specification of the stimulus (postaccess check); once an appropriate match is found, the
corresponding lexical entry needs to be opened for its contents to become available to higher-order
language processes (such as parsing). Once an entry is opened, it remains in that state for a few
seconds to allow slower processes (such as semantic interpretation) to have continued access to the
lexical database. Given that entry opening takes processing time, any reaccess of an already opened
entry would save time and thus lead to a facilitation of a response based on information stored in
that entry (i.e., priming). Under this interpretation, a priming effect would occur if the priming
stimulus resembles the target word sufficiently to open its lexical entry.

Adopting this mechanism of priming, and taking into account the results mentioned above, the cognate
priming effects can be explained by assuming that cognate translations are represented jointly in
memory. That is, information concerning the words rico and rich is stored in a common entry. Given
such a situation, either word would be able to open the combined entry and so produce a priming
effect when that entry is subsequently
reaccessed by the target. This interpretation can also explain why no priming effects are found
between noncognate translations. These translations are listed separately, so the prime and target
open separate entries (Davis et al., 1991; Sánchez-Casas, Davis, et al., 1992).

There is, however, one problem with the entry-opening interpretation just described: its difficulty
in explaining why asymmetrical cognate priming effects are obtained in the semibilingual group. If
cognates are jointly represented, the same amount of priming should be expected in both language
directions. A possible explanation of the asymmetrical cognate priming effects that would be
compatible with the proposal of common lexical representations for cognate translations was
suggested by Kim and Davis (2001). These authors carried out an experiment to determine how masked
priming effects were affected by processing proficiency in naming Korean words (a logographic
writing system) and found masked priming effects only for the practiced logographic processors. To
explain the processing proficiency effect, Kim and Davis proposed that priming will only occur when
the activation of both prime and target reaches a stable state. Specifically, priming will require
combined and mutually supportive prime and target activation initially caused by form-based
activation and meaning overlap. Such an interpretation could account for the asymmetrical priming
effects in the semibilingual group (nonproficient English speakers) by suggesting that the response
to L1 targets has probably been made before the prime activation was developed to a sufficient
degree, precluding their joint activation.

An alternative processing-based explanation of the asymmetrical cognate priming effects in the
semibilingual group might be derived from Kroll and Stewart's (1994) model. This model proposes that
nonproficient bilinguals interpret words in L2 via activating their translation equivalents in L1.
If L2 words are recognized in this indirect way, it is possible that their representations take
longer to access than L1 representations, and as a consequence, there might not be enough time for
the L2 prime to activate an L1 target (for more details, see Davis, Sánchez-Casas, & García-Albea,
2002). More studies manipulating L2 proficiency are required to test further the processing-based
explanations outlined above and to determine their compatibility with an interpretation of the
difference between cognate and noncognate words in terms of their representational status based on
the entry-opening model (Forster & Davis, 1984).

Although we are aware of the importance of the role of L2 proficiency in determining masked priming
effects, the evidence reviewed in the rest of this chapter considers only the recognition of words
involving different cross-language relations in highly proficient bilinguals. In particular, in the
next section we examine a series of priming experiments with Catalan-Spanish bilinguals that
consider more closely the contribution of form and meaning to cognate priming effects and that
provide additional evidence to support the view that cognate and noncognate words are differently
represented in the bilingual lexicon.

Can Cognate Relations
Be Reduced to Mere Form
or Meaning Relationships?

Following the results reported by Davis et al. (1991), García-Albea, Sánchez-Casas, and Valero
(1996) carried out several priming experiments in which they investigated more directly the extent
to which form and meaning contribute to cross-language priming effects. They examined
Catalan-Spanish bilinguals with a high proficiency in the two languages. Spanish and Catalan are two
Romance languages that share orthography, phonology, and meaning to a large extent. Therefore, they
provide a good opportunity to examine in more detail the possible contribution of form or meaning to
priming effects across languages.

García-Albea et al. (1996) contrasted three types of word relations. In addition to cognate (e.g.,
cotxe-coche) and noncognate (e.g., gàbia-jaula) translations, they included false friends, which, as
mentioned, are similar in form but have totally different meanings (e.g., curta-curva). These word
relations allowed them to determine whether form (false friends), meaning (noncognates), or both
(cognates) underlies the priming effects to be observed. They used the same task as Davis et al.
(1991) and the same masking procedure, but unlike those authors, they manipulated neither language
dominance nor degree of form overlap of cognate translations. Cognate words, noncognates, and false
friends were selected and presented under three experimental priming conditions: identity (e.g.,
coche-coche, jaula-jaula, curva-curva); translation (e.g., cotxe-coche, gàbia-jaula) or, in the case
of false friends, an orthographically and phonologically related word (e.g., curta-curva); and a
form control (e.g., corde-coche, prama-jaula, curna-curva).
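As a rough illustration of the entry-opening account discussed earlier in this section (Forster & Davis, 1984; Davis et al., 1991), the following Python sketch treats the lexicon as a look-up table in which cognate translations share one entry and noncognate translations have separate entries. The entry labels, the two time costs, and the nonword control "ricn" are hypothetical and were chosen only for illustration; they are not estimates or materials from any of the studies reviewed here.

```python
# Toy model of "entry opening". All names and numbers are illustrative assumptions.

OPEN_COST_MS = 50      # hypothetical cost of opening a closed lexical entry
REACCESS_COST_MS = 10  # hypothetical cost of reaccessing an already open entry

# Table of learned correspondences: word form -> lexical entry.
# Cognates (rico/rich) map onto one shared entry; noncognates (cola/tail) do not.
LEXICON = {
    "rico": "entry_RICH",
    "rich": "entry_RICH",
    "cola": "entry_COLA",
    "tail": "entry_TAIL",
}


def access_cost(prime: str, target: str) -> int:
    """Return the simulated cost (ms) of accessing the target after the prime."""
    open_entries = set()
    prime_entry = LEXICON.get(prime)  # a nonword prime opens no entry
    if prime_entry is not None:
        open_entries.add(prime_entry)
    target_entry = LEXICON[target]
    if target_entry in open_entries:
        return REACCESS_COST_MS
    return OPEN_COST_MS


def priming_effect(prime: str, control: str, target: str) -> int:
    """Cost after the control prime minus cost after the related prime."""
    return access_cost(control, target) - access_cost(prime, target)


if __name__ == "__main__":
    print("identity   rich-rich:", priming_effect("rich", "ricn", "rich"))
    print("cognate    rico-rich:", priming_effect("rico", "ricn", "rich"))
    print("noncognate cola-tail:", priming_effect("cola", "tair", "tail"))
```

In this sketch the identity and cognate primes produce the same saving and the noncognate prime produces none, which mirrors only the qualitative pattern described for the proficient groups; it makes no attempt to capture the asymmetry found for the semibilingual group.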
Within each prime-target pair, the prime and target had the same number of letters (four or five)
and the language of the target could be either Spanish or Catalan. In other words, both language
directions were tested.

The results obtained when the target was presented in Catalan are shown in Fig. 11.1. As can be
observed, again cognate translations produced a facilitatory priming effect, which was not different
statistically from the corresponding identity priming effect. In contrast, noncognate translations
did not produce a significant effect. Also, no priming effect was observed for false friends. As
illustrated in Fig. 11.2, the general pattern of results was the same when the target was presented
in Spanish. Again, only cognate translations produced a facilitatory effect, and as before, identity
and cognate primes were equally effective. The effect for noncognates was negligible, and false
friends showed a nonsignificant tendency toward inhibition.

The pattern of findings obtained by García-Albea et al. (1996) confirmed the contrast between
cognate and noncognate translations found by Davis et al. (1991) with Spanish-English bilinguals. In
both studies, only for cognate translations were facilitatory effects observed. These effects were
similar in size to those found with identical prime-target pairs, and they were observed in the two
language directions (both from L1 to L2 and from L2 to L1). This last finding contrasts with other
masked priming studies that have shown asymmetric priming effects with proficient bilinguals (e.g.,
Davis & Kim, 2000; Davis & Schoknecht, 1996; Gollan et al., 1997; Jiang, 1999; Jiang & Forster,
2001), a finding that may have resulted from the fact that the language of prime and target had
different scripts in the latter set of studies. Gollan et al.'s study is particularly relevant here
because these authors compared cognates and noncognates in Hebrew and English (for which cognate
translations are similar in meaning and in phonological form, but not in orthography).

Gollan et al. (1997) tested proficient English-Hebrew bilinguals in the two language directions
(from L1 to L2 and from L2 to L1) and compared priming effects in the two types of translation, with
unrelated word pairs serving as the baseline condition from which priming effects were assessed.
Most relevant for the issue addressed here is that the authors observed facilitatory priming effects
for cognate and noncognate translations, but only when the prime was in the dominant language and
the target in the nondominant language. Gollan et al. suggested that the change in script between
prime and target caused the noncognate priming to emerge because this change functions as a powerful
orthographic cue that enables the prime to be accessed in time to facilitate the recognition of the
target (see, however, Jiang, 1999, for evidence against this account). The authors attributed the
asymmetric nature of the cognate priming effect obtained with different scripts to an overreliance
on phonology in reading L2 (i.e., Hebrew).
Figure 11.1 Size of priming effect in cognates, noncognates, and false friends as a function of priming
condition (identity and translation). The language of the target was Catalan.
Figure 11.2 Size of priming effect in cognates, noncognates, and false friends as a function of priming
condition (identity and translation). The language of the target was Spanish.
Another finding obtained by García-Albea et al. (1996) that contrasts with previous studies is the
absence of priming effects for false friends. In particular, Gerard and Scarborough (1989) obtained
facilitatory effects for false friends, but these authors used an unmasked priming procedure and
identical false friends (that is, words spelled identically, but with different meaning, e.g., red
in English and red, meaning "net," in Spanish). Moreover, more recent studies have reported
inhibitory effects under some conditions (see De Groot, Delmaar, & Lupker, 2000, and Dijkstra et
al., 1999, for reviews); however, they used homographs, and this might explain the different pattern
of effects. The fact that false friends did not show any effect in García-Albea et al.'s study
allows the conclusion that form similarity by itself cannot account for the facilitatory effect in
cognate translations. Finally, the consistent lack of facilitation with noncognate translations
suggests again that meaning by itself cannot be responsible for the priming effects obtained with
cognate translations.

The degree of semantic overlap within cognate and noncognate translations was manipulated by
Sánchez-Casas, Suárez-Buratti, and Igoa (1992) in an experiment with Spanish-English bilinguals.
Subjects were presented with word pairs and had to decide whether they were translations (this task
is called translation recognition). The authors selected cognate and noncognate translations that
had the same meaning (e.g., león-lion, vida-life) and translation pairs that shared their meaning
only partially (e.g., papel-paper, hoja-sheet). The experiment clearly showed that there is an
asymmetry in the recognition of cognate versus noncognate translations. Only the latter seem to be
affected by differences in the meaning relationship between the two words in translation pairs. When
noncognate meanings do not neatly map across the two languages, subjects take more time to recognize
the translation relation than when they do. However, in recognizing the cognate translations, this
factor does not play a role. That is, subjects recognize the cognate translations equally quickly
irrespective of the degree of meaning overlap. This same pattern of effects has been obtained when
the languages involved are Catalan and Spanish (Guasch, 2001).

Two important conclusions can be derived from the results discussed (Davis et al., 1991;
García-Albea et al., 1996). On the one hand, they suggest that cognate relations cannot be reduced
to mere form (orthographic and phonological) or meaning relationships, as has also been found in
morphologically related words (see García-Albea et al., 1998). On the other hand, they support the
claim that cognate translations may be represented jointly in the bilingual lexicon (see details
below).

Further evidence for the special representational status of cognate translations was provided by
Sánchez-Casas and Almagro (1999). These authors carried out a series of experiments, also with
highly competent Catalan-Spanish bilinguals, using the same masked priming procedure and the lexical
decision task. In particular, they were interested in determining the time course of priming effects
for cognate translations in bilingual word recognition in comparison to those for other types of
word relations across languages (i.e., noncognate translations and false friends). To achieve this
goal, they manipulated the SOA (stimulus onset asynchrony), selecting three prime durations (30, 60,
and 250 ms) that together involved two priming procedures: masked (SOAs of 30 and 60 ms) and
unmasked (SOA of 250 ms).

As in previous studies, Sánchez-Casas and Almagro (1999) compared cognates (e.g., puño-puny),
noncognates (e.g., pato-ànec), and false friends (e.g., coro-corc) in three priming conditions:
identity, translation (an orthographically and phonologically related word in the case of false
friends), and control (an unrelated word). Priming effects were examined for all three of these
priming conditions and with each of the three prime durations. On the basis of previous findings
with the same languages (García-Albea et al., 1996), only one language direction was tested (Spanish
prime and Catalan target). It should be stressed that the words across all three types of word
relations had a similar length (four or five letters), and that cognate translations and false
friends shared the same degree of form overlap (three letters on average); noncognate translations
had on average just one letter in common.

The results obtained with an SOA of 30 ms are presented in Fig. 11.3. It can be observed that
identity primes were equally effective in the three word-type conditions. The priming effects were
22, 28, and 28 ms for cognates, noncognates, and false friends, respectively. More interesting,
cross-language effects were observed for cognate translations and false friends, and they were about
equal in size (19 and 20 ms, respectively) and not different from those obtained with identical
prime-target pairs (22 and 28 ms for cognates and false friends, respectively). However, the priming
effect for noncognate translations failed to reach significance (5 ms).

Figure 11.4 shows the results with an SOA of 60 ms. In this case, the pattern of priming effects was
consistent with previous findings. Cognate translations showed facilitation (52 ms), but neither
noncognates (10 ms) nor false friends (10 ms) produced reliable effects. Identity priming effects
were observed in all three word-type conditions (61, 45, and 48 ms for cognates, noncognates, and
false friends, respectively); again, the identity priming and cognate priming conditions did not
differ (61 vs. 52 ms, respectively).

Finally, as in some earlier studies (e.g., Altarriba, 1992; Chen & Ng, 1989; De Groot & Nas, 1991;
Jin, 1990; Schwanenflugel & Rey, 1986), with an SOA of 250 ms, facilitation was observed both in
cognate (101 ms) and in noncognate translations (47 ms). However, again there was no evidence of
facilitation for false friends (6 ms). Regarding identity priming, the three types of words produced
facilitation (124, 110, and 131 ms for cognates, noncognates, and false friends, respectively). As
before, identity and translation effects in the cognate condition were similar in size (101 vs. 124
ms; see Fig. 11.5).

Three aspects of this pattern of results are particularly relevant in view of the issues addressed
here. First, cognate translations behaved differently from noncognate translations and false friends,
Figure 11.3 Size of priming effect in cognates, noncognates, and false friends as a function of priming
condition (identity and translation). The SOA (stimulus onset asynchrony) used was 30 ms.
Figure 11.4 Size of priming effect in cognates, noncognates, and false friends as a function of priming
condition (identity and translation). The SOA (stimulus onset asynchrony) used was 60 ms.
showing facilitation across all three SOAs. Interestingly, when similar SOAs were used,
morphologically related words have been shown to produce a similar pattern of priming. In
particular, Domínguez, Seguí, and Cuetos (2002) showed that morphological priming effects are
maintained across different SOAs (34, 64, and 250 ms); form and meaning priming effects are not. Both
sets of findings provide further evidence that cognate relations, like morphological relations,
differ from semantic and orthographic/phonological relationships.

A second noteworthy finding is that, across all three SOAs, cognate relations behave the same as
identity relations within a language. This has also been obtained in previous studies (see Davis et
al., 1991; García-Albea et al., 1996) and appears to provide further support for the claim that
cognate translations share lexical representations (see Sánchez-Casas, Davis, et al., 1992). Finally,
the results clearly showed that both form and meaning contribute to the process of bilingual word
recognition, but at different stages. Form seems to play a role early on in the process, as
demonstrated by the facilitation observed for false friends at a very short SOA (30 ms). This finding
is consistent with a large body of data that supports the view that access to bilingual memory is
nonselective (e.g., De Groot et al., 2000; Dijkstra, Van Jaarsveld, & Ten Brinke, 1998; see Dijkstra,
chapter 9, this volume). In contrast, meaning similarity by itself seems to exert an influence later
in the recognition process, as suggested by the results with noncognates at a longer SOA (250 ms).

The most relevant of these findings is that the pattern of cross-language priming effects with
Figure 11.5 Size of priming effect in cognates, noncognates, and false friends as a function of priming
condition (identity and translation). The SOA (stimulus onset asynchrony) used was 250 ms.
cognate words resembled closely the pattern obtained, under comparable conditions, with words that
were morphologically related within a language. For both types of word relations, it appears that
facilitation effects cannot be attributed to just form or meaning similarity. Given this finding, the
possibility could be considered that cognate translations involve a special kind of morphological
relation. As García-Albea et al. (1998) pointed out, cognate words are indeed morphologically related
words, given that they share a common root or stem across languages. Thus, it is possible that they
have the same lexical representational status as morphologically related words. This would imply that
it is morphology and not language that is critical for lexical organization in bilinguals. Evidence
concerning this possibility is presented in the next section.

Can Cognate Relations
Be Considered a Special Kind
of Morphological Relationship?

The model of word recognition proposed by Kirsner et al. (1993) is the only model that incorporates
morphology as an important factor for bilingual lexical organization. In particular, these authors
proposed the notion of related words forming a type of morphological paradigm. The critical idea was
that words that share form and meaning will undergo conjoint learning, such that when a word becomes
more fluent because of practice, other related words will also benefit. In this model, cognate
translations are viewed as a morphological relation of a special kind.

A study by Cristoffanini et al. (1986) with Spanish-English bilinguals provided support for this
model. These authors used a lexical decision task and a long-term priming paradigm. In this paradigm,
in contrast to those involving short SOAs (for which the interval between prime and target is less
than 300 ms, and no other stimuli intervene between the presentation of prime and target), a long
time elapses between the presentation of the prime and the target word (several minutes or even
longer intervals, and other stimuli intervene between the presentation of the prime and the target).

Specifically, Cristoffanini et al. (1986) designed the experiment in two phases. In the first or
study phase, subjects were presented with separate lists of Spanish and English words. In the second
or test phase of the experiment, 10 min later, the English words were presented again in English
(i.e., repetition priming condition), together with the English translation of the previously
presented Spanish words (i.e., translation priming conditions), and a comparable list of English
words was presented for the first time. Using this priming procedure, the authors examined priming
effects with noncognate translations (e.g., panadería-bakery) and with four different types of
cognate translations: cognates that were orthographically identical (e.g., festival-festival); two
types of cognates that shared the same root, followed by different suffixes that are regular in both
languages (e.g., observación-observation, crueldad-cruelty); and finally cognates with the same root,
but with irregular suffixes (e.g., calamidad-calamity). The results did not show any priming for
noncognate translations, but for cognate translations, priming effects were obtained, and these were
of a similar size for the four types of cognates. The authors interpreted these findings as
supporting the claim that cognate relations are equivalent to morphological relations.

As far as we know, the only reported studies that have further examined this claim are those carried
out by García-Albea et al. (1998) and by Sánchez-Casas, García-Albea, and Igoa (2000) with highly
competent Catalan-Spanish bilinguals. In both studies, the authors compared morphological priming
effects across languages for cognate and noncognate translations with the effects produced by
morphologically related primes within the same language and with the corresponding translation
effects.

García-Albea et al. (1998) examined gender and number inflection relations. It should be mentioned
that, in Spanish, gender is generally formed by adding the suffix -a (feminine; e.g., niña) or -o
(masculine; e.g., niño) to the stem. Regarding number formation, an -s is added if it ends in a vowel
(e.g., niñas) and an -es if it ends in a consonant (e.g., leones). Only the former was used in this
experiment (for more information about these inflections, see García-Albea et al., 1998). The authors
used bisyllabic nouns and adjectives with monosyllabic stems that had either a cognate (e.g.,
maco-majo) or a noncognate translation (e.g., boig-loco). Half of the prime-target pairs were gender
inflections, and the other half were number inflections (in the latter, the prime was always the
singular form of the word, and the target was the plural form). The language of the target was always
Spanish because previous findings had shown that language direction does not affect the priming
effects (Davis et al., 1991; García-Albea et al., 1996). In both cognate and noncognate translation
conditions, the Spanish target could be preceded by a morphologically related word in Spanish (e.g.,
maja-majo, puerta-puertas, loca-loco, pato-patos); its translation in Catalan (e.g., maco-majo,
portas-puertas, boig-loco, ànecs-patos); a morphologically related word in Catalan (e.g., maca-majo,
porta-puertas, boja-loco, ànec-patos); and an unrelated nonword control.

Figure 11.6 presents the results for the two types of translations. As would be expected if cognate
relations are in fact a special case of morphological relations, the size of the priming effect for
cognate translations was the same in the within-language morphological condition (56 ms), the
cross-language morphological condition (48 ms), and the translation condition (52 ms). However, with
noncognate words, only the within-language morphological condition produced significant facilitation
(51 ms).

A similar pattern of effects was obtained by Sánchez-Casas et al. (2000), but now with verbal
inflections within and across languages. Again, highly proficient Catalan-Spanish bilinguals were
tested, and the same masked priming procedure was used. The target was again presented in Spanish and
the prime in Catalan. In this case, the authors selected cognate and noncognate Spanish verbs from
the three different conjugations that exist in this language (-ar, -er, and -ir; e.g., olvidar,
comer, sufrir). The cognate and noncognate words were tested under the same four masked priming
conditions: (a) a within-language morphological condition (e.g., olvida-olvidar, limpio-limpiar);
(b) a translation condition (e.g., oblidar-olvidar, netejar-limpiar); (c) a cross-language
morphological condition (e.g., oblida-olvidar, netejo-limpiar); and (d) an unrelated control
condition.

The results of this experiment are shown in Fig. 11.7. As before, cognates showed facilitation when
the prime and target were morphologically related within the same language (51 ms), when they were
morphologically related across the two languages (36 ms), and when they were translations of one
another (38 ms). In contrast, noncognates only showed facilitation when the morphological relation
held within the same language (68 ms).

The two experiments just described provide clear evidence that cognate priming effects are the same
as priming effects observed with morphologically related words, thus supporting our claim that
cognate translations can be considered a special kind of morphological relation. More interesting,
the data are consistent with the view that morphology could be the critical principle of
organization not only of the monolingual lexicon, as some authors have suggested (e.g., Drews &
Zwitserlood, 1995; Frost et al., 1997; García-Albea et al., 1998; Marslen-Wilson et al., 1994), but
also of the bilingual lexicon. We now consider the implications of this hypothesis for current models
of bilingual lexical memory.
Figure 11.6 Size of priming effect in cognates and noncognates as a function of word relation: gender and number inflections. MWL, morphological relation within language; MBL, morphological relation between languages; TRAN, translation.
Figure 11.7 Size of priming effect in cognates and noncognates as a function of word relation: verbal inflections. MWL, morphological relation within language; MBL, morphological relation between languages; TRAN, translation.
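
The effect sizes reported in the text for the two experiments summarized in Figs. 11.6 and 11.7 can be collected in a small table for side-by-side comparison. The following Python sketch is only an illustrative summary and not part of the original studies: the dictionary layout and the helper name `facilitated_conditions` are assumptions introduced here, and `None` marks conditions for which the text reports no significant facilitation (the underlying nonsignificant values are not given).

```python
# Priming effects in ms, as reported in the text for Catalan-Spanish bilinguals.
# MWL = morphological relation within language, MBL = morphological relation
# between languages, TRAN = translation (abbreviations from Figs. 11.6 and 11.7).
# None = no significant facilitation reported (exact values are not given).
REPORTED_PRIMING_MS = {
    "gender and number inflections": {
        "cognate": {"MWL": 56, "MBL": 48, "TRAN": 52},
        "noncognate": {"MWL": 51, "MBL": None, "TRAN": None},
    },
    "verbal inflections": {
        "cognate": {"MWL": 51, "MBL": 36, "TRAN": 38},
        "noncognate": {"MWL": 68, "MBL": None, "TRAN": None},
    },
}


def facilitated_conditions(effects):
    """Return the conditions with a reported facilitation effect, sorted by name."""
    return sorted(name for name, ms in effects.items() if ms is not None)


for experiment, by_status in REPORTED_PRIMING_MS.items():
    for status, effects in by_status.items():
        print(f"{experiment:30s} {status:11s} {facilitated_conditions(effects)}")
```

In both experiments the cognate rows list all three conditions, whereas the noncognate rows list only the within-language morphological condition.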
Implications for Models of Bilingual Memory

We have provided evidence that confirms the idea that the cognate status of a translation is a relevant factor in determining the occurrence of priming effects. In particular, the studies discussed in this chapter have consistently shown that cognate translations produce facilitatory priming effects in bilinguals as long as the bilinguals have a reasonable level of competence in both languages. These effects do not appear to result merely from the form or meaning similarity that characterizes cognate word translations.

Two findings led us to reject the possible contribution of orthographic/phonological similarity: the presence of facilitatory priming effects in cognate translations regardless of their form overlap (e.g., noche–night, torre–tower; Davis et al., 1991) and the absence of facilitation with false friends, words that are similar in form but not in meaning (e.g., curta–curva; García-Albea et al., 1996). Regarding the contribution of meaning to cross-language priming effects, the results clearly showed that noncognate translations, which are only semantically similar to their primes, do not produce any evidence of such effects. This is not to say that form or meaning on their own do not influence bilingual word recognition at all because both false friends and noncognate translations showed facilitatory priming effects, the former with an SOA of 30 ms, and the latter with an SOA of 250 ms. Moreover, it has been found that the degree of semantic overlap within a translation pair affects the recognition of noncognates, but not that of cognates (e.g., hoja–sheet, for which the meaning of the words in this translation pair is not completely identical in the two languages, was recognized as a translation pair slower than vida–life, for which the meaning is identical in the two languages; Sánchez-Casas et al., 1992). However, the important finding to stress here is that cognates, in contrast to noncognates and false friends, produced priming effects across all prime durations (30, 60, and 250 ms).

Taken together, this set of findings provides clear evidence to support the claim that cognate priming effects across languages, as seems to be the case for morphological priming within a language, cannot be caused by mere form (false friends) or meaning (noncognate) relations. How then are cognate translations represented in bilingual memory? Several findings suggested that they share a common lexical representation, in contrast to noncognate translations, which presumably are represented separately. On the one hand, the cognate priming effects were equally large as the within-language identity priming effects. This held for both pairs of languages tested, Spanish-English and Catalan-Spanish. On the other hand, as would be expected if cognate translations are indeed represented jointly, language direction did not affect the pattern of priming effects; that is, there were equally large effects in both directions (from L1 to L2 and from L2 to L1).
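
The time-course pattern described in this section can be written down as a small lookup table. The Python sketch below is illustrative only: the names `SOAS_MS`, `FACILITATION_SOAS`, and `relations_facilitated_at` are assumptions introduced here, and the sets simply restate which prime-target relations the text reports as producing facilitation at each prime duration.

```python
# Prime durations (SOAs, in ms) discussed in the text.
SOAS_MS = (30, 60, 250)

# SOAs at which each prime-target relation is reported to produce facilitation.
FACILITATION_SOAS = {
    "cognate": {30, 60, 250},   # cognate translations
    "false_friend": {30},       # similar in form but not in meaning
    "noncognate": {250},        # similar in meaning but not in form
}


def relations_facilitated_at(soa_ms):
    """Return the relations with reported facilitation at the given SOA."""
    return sorted(r for r, soas in FACILITATION_SOAS.items() if soa_ms in soas)


for soa in SOAS_MS:
    print(soa, relations_facilitated_at(soa))

# SOAs not covered by form-only or meaning-only facilitation taken together.
covered = FACILITATION_SOAS["false_friend"] | FACILITATION_SOAS["noncognate"]
print(sorted(set(SOAS_MS) - covered))
```

In this table, 60 ms is the one duration at which only cognates show facilitation, in line with the claim that cognate priming is not reducible to form or meaning overlap alone.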
An alternative proposal regarding the representation of cognate translations in bilingual memory was advanced by De Groot and Nas (1991; see also De Groot, 1992). Assuming a model of bilingual memory in which two levels of representation are distinguished, a lexical (orthographic-phonological) level and a conceptual (meaning) level, these authors have proposed the existence of common representations at the conceptual level for cognate translations, but not for noncognate translations. They based this conclusion on the enhanced priming effects that they observed with cognate translations compared to noncognate translation and on the finding that the cross-language masked priming effects for semantic associates of the primes were only found for cognate translations (e.g., prime bakery, target brood, meaning "bread") and not for noncognate translations (prime blanket, target laken, meaning "sheet"). These two accounts of the representation of cognates in memory, however, are not necessarily incompatible. They can, for instance, be reconciled with one another if distributed memory representations are assumed (see De Groot, 1992; Kroll & De Groot, 1997). In this type of representation, cognate translations could share representational nodes or features both at the lexical (form) and at the conceptual (meaning) level. In contrast, noncognate translations might only share features at the conceptual level.

Another set of findings (García-Albea et al., 1998; Sánchez-Casas et al., 2000) led us to move a step further and propose, within the bilingual lexicon, a morphological level of representation in which cognates are jointly represented. This level is different from a pure form level, which contains the word's orthographic and phonological information, and from a concept-representation level, in which the word's meaning is represented. Support for this idea comes from the two final experiments reported in the previous section, which demonstrated that cognate translations produced facilitatory priming effects of equal magnitude as words that were morphologically related within a language and across languages (both noun and verb inflected forms). Further support for the similarity between the cognate and the morphological priming effects was provided by Domínguez et al. (2002), who found that morphological relations showed facilitatory priming across different SOAs, just as cognate relations do.

Returning to the question of how cognate translations are stored in memory, we propose that this type of words is represented on the basis of the common root, as has been suggested for morphologically related words. This would imply, for instance, that words that are morphologically related across languages, such as porta–puertas, will be represented under the same root as puerta–puertas, words that are morphologically related within the same language. The question now is how such morphological level of representation, which cognate translations share, could be incorporated into models of bilingual word recognition. We refer to two of these models: the distributed lexical/conceptual feature model (Kroll & De Groot, 1997) and the bilingual interactive activation (BIA) model (Dijkstra et al., 1998; Dijkstra & Van Heuven, 1998; see Thomas & Van Heuven, chapter 10, this volume).

The Distributed Lexical/Conceptual Feature Model

Kroll and De Groot (1997) proposed a model that incorporates a language-independent (shared) lexical feature level of representation, containing information regarding the form of words, and a conceptual feature level, for which aspects of meaning are represented. In addition to these two levels of representation, they postulated a level of lemma representations that mediates between the other two (see Fig. 11.8).

The lemma level, which includes some syntactic and semantic characteristics of words, is specific for each language, and it could be considered a means to reflect the activation patterns that result from word form to word meaning mappings. In isolated word recognition, the lemma level would only reflect these form and meaning relationships; however, when contextual information is present, this level could reflect syntactic processes that selectively activate lexical and conceptual features. Proposing language-specific lemmas would not only allow the two languages to function in an autonomous fashion, but also, to the extent that it mediates between the lexical and conceptual levels, it would allow the two languages to influence each other because they share access to common lexical and conceptual features.

As mentioned, within this model, it could be proposed that cognate translations share features at both the lexical (in our case, orthographic and phonological features) and conceptual levels of representation. The greater activation derived from their form and meaning overlap could account for the facilitation effects observed in the three SOAs used (30, 60, and 250 ms). On the contrary, having
Figure 11.8 The distributed lexical/conceptual feature model. Adapted from Kroll and De Groot, 1997. L1, first language; L2, second language.
only form (false friends) or meaning features (noncognate translations) in common would not be sufficient for these effects to occur across the different SOAs. False friends priming effects would reflect an early stage in the recognition process, when only word form has been processed; noncognate priming effects could be located at a later stage when the word meaning is taken into account.

A second possibility as to how cognate translations are represented is to postulate a morphological level of representation at which cognate translations would be jointly represented. This additional lexical level could be located, in Kroll and De Groot's (1997) model, between the form level representing orthographic and phonological features and the lemma level. The morphological level would serve two purposes: to represent morphological relations between words from the same family and to provide information to the lemma level not only about the word's form, but also about its morphological structure. Thus, according to this view language-specific lemmas would reflect connections between morphemic-meaning mappings to and from syntax. Figures 11.9 and 11.10 show how this proposal can be implemented into the model in the case of cognate and noncognate words, respectively.¹

As shown in Fig. 11.9, cognate words (e.g., porta–puerta, "door" in English) would share, in addition to form and meaning features, a common root (port-) under which words morphologically related both within and between languages would be represented. In this version of the model, cognate priming effects could be located at the morphological level. Although Kroll and De Groot (1997) did not specify how words are processed in their model, the way these effects occur could be as follows. When one of the cognate words is presented as the prime (in our case, the Catalan word; e.g., porta), it will activate its corresponding feature nodes at the form level, most of them shared by its translation (e.g., the Spanish word puerta). These nodes will then send activation to the morphological level, at which the cognates' common root is represented (e.g., port-). When the cognate word is presented as the target (i.e., puerta), both the shared feature nodes and the root node will already be activated, speeding up the target's recognition response. Later in the recognition process, the morphemic level will send activation to the corresponding lemma in the target's language and this in turn to the conceptual level (and possibly back to the morpheme level).

Figure 11.10 shows how noncognate words can be represented in the model. In contrast to cognates, the roots of noncognates would be represented separately. In addition, these words would generally not share form features. Therefore, when a noncognate prime is presented (e.g., the
Figure 11.9 The distributed lexical/conceptual feature model for cognate translations. These translations share lexical and conceptual features as well as morphological representations. Lemmas are language specific and reflect connections between the morphological-meaning mappings to and from syntax. The f, s, and N stand for feminine, singular, and noun, respectively. Version adapted from Kroll and De Groot, 1997. L1, first language, L2, second language.
Catalan word taula), activation from the form feature nodes will only activate the corresponding root node (i.e., taul-), and no activation will reach the root of its translation (i.e., mes-). Consequently, no facilitation effects would emerge. The facilitation effects obtained with noncognate words (i.e., when the prime is clearly visible and displayed for 250 ms or longer) would reflect activation at the conceptual level, at which meaning features are shared by this type of translation.

Postulating such a morphological level of representation within Kroll and De Groot's (1997) model would imply, at least in the case of regularly inflected forms,² not only that these morphologically complex forms are represented in terms of their roots and affixes, but also that they are morphologically decomposed to access the word's syntactic and semantic information. That is, the model would require the implementation of a prelexical morphological parsing mechanism that isolates the morphological root without reference to whole-word representations.

The existence of such a mechanism was originally proposed by Taft and Forster (1976) to account for the way polymorphemic words are stored and retrieved, and it has received empirical support from different studies in different languages (e.g., Caramazza, Laudana, & Romani, 1988; Deutsch, Frost, Pollatsek, & Rayner, 2000; Drews & Zwitserlood, 1995; Feldman, 1995; Forster & Azuma, 2000; Frost et al., 1997; Taft, 1985, 1994).

In the domain of visual word recognition in monolinguals, Taft (1994) proposed an implementation of a prelexical morphological parsing mechanism within the framework of an interactive
Figure 11.10 The distributed lexical/conceptual feature model for noncognate translations. These translations do not share lexical features or morphological representations, but they do share conceptual features. Lemmas are language specific and reflect the morphological-meaning mappings to and from syntax. The f, s, and N stand for feminine, singular, and noun, respectively. Version adapted from Kroll and De Groot, 1997. L1, first language, L2, second language.
activation model. In this model, Taft distinguished different levels of representations, including letters, bodies, morphemes, words, and concepts. Of relevance here is the morphological level at which only bound morphemes are represented (this would be more clearly the case for transparent prefixed and inflected words; see note 2) and morphological masked priming effects within the same language are located.

Adopting Taft's (1994) approach to bilingual word representation and access, it might be assumed that both cognate and noncognate words are represented in a decomposed format. That is, the root that is shared by cognate words (e.g., porta–puerta) and other morphologically related words within and across languages (e.g., puerta–puertas, porta–puertas) will be represented at a morphological level together with the corresponding affixes in the two languages (e.g., the root port- and the gender morpheme a and the plural morpheme s).³ In contrast, noncognate words (e.g., taula–mesa) will have language specific morphological representations, although each of them will be connected to morphologically related words within the same language via a common root (e.g., taula–taules, mesa–mesas). The recognition of both cognate and noncognate words will require a prelexical parsing procedure that has to operate on the visual stimulus to extract morphological units that can be matched onto the corresponding lexical representations.

This proposal could in principle be compatible with our version of Kroll and De Groot's (1997) distributed lexical/conceptual feature model for cognate and noncognate words⁴ (see Figs. 11.9 and 11.10), although it should be noted that neither our version of the model nor the original model itself incorporates a level at which word units are represented. In our opinion, however, this is not a problem if a form, a lemma, a morphological, and a conceptual level of representation are formulated,
as our version of Kroll and De Groot's model does. In fact, to include a word level would be redundant because all the relevant information about the word would already be contained in the four existing levels.

Before considering the BIA model (Dijkstra & Van Heuven, 1998), a final possibility could be suggested for how cognate words can be represented in the framework of Kroll and De Groot's (1997) model. This possibility would be to assimilate the morphological level to the lemma level, that is, to represent lemma entries as morphemic units with information about possible variations. This possibility was considered by Levelt (1989) in the case of inflections for languages with a rich morphology (Catalan and Spanish can be considered morphologically rich languages), for which morphological information has noticeable consequences for syntactic processes (see Cabré, 1994; Varela Ortega, 1992). According to this proposal, in contrast to the language-specific lemmas of noncognate words, the lemmas of cognate words will be shared. This common lemma will contain specifications about the variants in the two languages (e.g., concerning gender or tense) as one more of the diacritic parameters that were proposed by Levelt. Within this model, the translation and morphological masked priming effects obtained with cognate words would emerge at the lemma level, and to access the word information stored at that level, a prelexical morphological decomposition is not necessarily required.

(Dijkstra & Van Heuven, 1998). (See Fig. 11.11 and Dijkstra, chapter 9, and Thomas & Van Heuven, chapter 10, this volume.)

To account for the facilitation obtained with cognate translations, Dijkstra et al. (1998) extended the model to include a semantic level between the word level and the language node level. In this model, cognate translations are represented separately at the word level, but share their semantic representation. The facilitatory effects obtained for cognate translations could then be attributed to activation at the semantic level and the subsequent feedback from the semantic to the word level.

How could a morphological level be implemented in this tentative extended version of the model? One possibility is to locate the morpheme units between the letter and word levels, as Taft's (1994) interactive activation model suggested. As in the distributed lexical/conceptual feature model, in this case a morphological analysis would be
required for access purposes. Another possibility, suggested by one of the authors of the BIA model in the case of monolingual word recognition, is to locate a morphological level between the word level and the semantic level of representation (Giraudo & Grainger, 2000).⁵ This morphological level could serve the same functions as we suggested for Kroll and De Groot's (1997) distributed lexical/conceptual feature model, that is, to link all the words that share a common root and to operate as an interface between the word's orthographic information (the current model does not include phonology) and its meaning information. Morphological representations would be linked through excitatory connections to both the word (form) and the meaning levels. Specifically, these representations would receive activation from the word level and send it to the meaning level. This level in turn would send activation back to the morpheme level, and from this it is sent back to the word level (i.e., top-down activation). Within this model, access is assumed to be nonselective; in other words, words from both languages are initially activated.

Figures 11.12 and 11.13 show how morphological representations can be incorporated in the BIA model for cognate and noncognate words, respectively. As can be seen in Figs. 11.12 and 11.13, cognates and noncognates differ in the way their morphological roots are represented. Cognate words share a common root (e.g., porta–puerta share the root port-); noncognates do not (e.g., taula–mesa have different roots, taul- and mes-). Similar to our proposal concerning Kroll and De Groot's (1997) model, in our version of the BIA model the facilitatory masked priming effects obtained with cognate words (i.e., translation and cross-language morphological effects) would be located at the morphological level.⁶

How then do cross-language facilitatory effects for cognates emerge within the proposed version of the BIA model? In an initial stage of processing, the cognate word presented as the prime (e.g., porta) will activate the representation of features and the letters containing these features; the letters that do not correspond to those features will be inhibited. The activated letter nodes will activate the words in the two languages that share these letters; at the same time, the remaining words will be inhibited. At this point in the recognition process, the word node corresponding to the prime becomes activated, sending activation to the morphemic level, which has its root (e.g., port-). This morphemic unit will then send activation back not only to the node for the cognate word that has been presented as the prime, but also to the node for its translation in the other language that shares the same root (e.g., puerta). When the target word is presented (e.g., puerta), its corresponding root node will be activated already, and this activation state will be sustained by the bottom-up activation from the word level, as well as by the top-down activation from the meaning level. As a consequence of this joint activation at the morphological level, the recognition of the target will be facilitated, and priming effects are observed.

For the cognate priming effects to arise from the morphological level as described above, it is also necessary to assume that the inhibition sent from the relatively highly activated language node (i.e., the Catalan node) to the words in the other language (i.e., Spanish) is not so large that it suppresses entirely the activation sent off by the cognates' shared morphological root to the Spanish word node or alternatively that the inhibitory top-down feedback from the activated Catalan language node arrives too late to cancel the activation reached by the Spanish word node.

It should be noted that, in this version of the model, words are no longer connected directly to the semantic level, and that they can activate but not inhibit their morphological representation. In addition, the model does not postulate direct links from language nodes to morphological nodes. To include such links would on the one hand be redundant because language nodes already modulate lexical activity through cross-language top-down inhibition to word nodes; on the other hand, it would require the assumption that inhibitory connections from language nodes to morphological nodes only exist in the case of noncognate translations because cognate translations share a common node at the morphological level. Thus, we suggest that the only way in which inhibition from the language nodes can affect the activation level of morphological representations would be through the word units. That is, when these word units are inhibited by the language node, they will activate to a lesser degree their corresponding morphemic units.

Cross-language morphological priming effects in cognate words (e.g., porta–puertas) can be explained in the same manner as translation priming because they arise from the same shared common root (e.g., port-).

Figure 11.13 shows the adapted version of the BIA model for noncognate words. For these words, no facilitation effects would be observed because, with the presentation of the prime (e.g., taula), activation will only reach its nonshared
Figure 11.12 Cognate translations (porta–puerta; "door") as represented in the bilingual interactive activation model (BIA) (pos, position). These translations share a common root. The morphological level is proposed to mediate between meaning and form. Version adapted from Dijkstra and Van Heuven, 1998.
morphological root (e.g., taul-). When the target is presented (e.g., mesa), its root would not have been previously activated by the prime, receiving initially only form-based activation from the corresponding word unit. This would explain the lack of facilitation effects in noncognate words (at SOAs of 30 and 60 ms). The presence of facilitation effects in these words at a longer SOA (in our case, 250 ms) could be attributed to the later role that meaning-based activation possibly plays in the recognition process given that noncognate words do share a common meaning.
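
The contrast between cognate and noncognate primes in the adapted BIA model can be sketched as a simple rule-based prediction. The Python code below is a deliberately simplified illustration, not an interactive activation simulation and not the authors' implementation: the `Word` class, the toy lexicon, the function `predicted_locus`, and the `meaning_soa_ms` threshold (set to 250 ms, the SOA at which the text reports noncognate facilitation) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    form: str
    language: str
    root: str      # node at the hypothesized morphological level
    concept: str   # node at the meaning level


# Toy lexicon built from the chapter's examples (hypothetical encoding).
LEXICON = {
    w.form: w
    for w in (
        Word("porta", "Catalan", "port-", "DOOR"),
        Word("puerta", "Spanish", "port-", "DOOR"),
        Word("puertas", "Spanish", "port-", "DOOR"),
        Word("taula", "Catalan", "taul-", "TABLE"),
        Word("mesa", "Spanish", "mes-", "TABLE"),
    )
}


def predicted_locus(prime: str, target: str, soa_ms: int,
                    meaning_soa_ms: int = 250) -> Optional[str]:
    """Return the level at which facilitation is predicted, or None."""
    p, t = LEXICON[prime], LEXICON[target]
    if p.root == t.root:
        # Shared root: the prime pre-activates the target's root node.
        return "morphological"
    if p.concept == t.concept and soa_ms >= meaning_soa_ms:
        # Separate roots: only late, meaning-based activation can help.
        return "meaning"
    return None


for prime, target in [("porta", "puerta"), ("porta", "puertas"), ("taula", "mesa")]:
    for soa in (30, 60, 250):
        print(prime, target, soa, predicted_locus(prime, target, soa))
```

The sketch leaves out letter-level activation and language-node inhibition entirely; it only encodes the claim that a shared root yields facilitation at every SOA, whereas a shared concept alone yields it only at the long SOA.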
Figure 11.13 Noncognate translations (taula–mesa; "table") as represented in the bilingual interactive activation model (BIA) (pos, position). These translations do not share a common root. The morphological level is proposed to mediate between meaning and form. Version adapted from Dijkstra and Van Heuven, 1998.
Finally, the facilitation for false friends (e.g., curta–curva), which only occurred at a very short SOA (i.e., 30 ms), could be accounted for in the following way. Because in the BIA model words from both languages are initially activated, a word presented as the prime (e.g., curta) would activate the letters that it has in common with its false friend target (e.g., curva). Activation at the word level (i.e., form-based activation) could then be responsible for the presence of facilitation effects at an early stage of processing. Given that false friends, as interlingual homographs, have neither
morphological nor semantic overlap, it can be assumed that this form-based initial activation would soon be suppressed by the inhibitory connections postulated in the model. Therefore, at long SOAs (i.e., 60 and 250 ms) no facilitation for false friends occurs.

Conclusion

We began this chapter by proposing that morphology might be a critical factor in modeling the representation of cognates in bilinguals. Throughout this chapter, we have presented experimental evidence that provides preliminary support for this claim, and we have discussed how some of the current models of bilingual memory could incorporate the representation of morphology. However, we appreciate that further work is needed to explain in more detail what is the precise nature of the proposed cross-language morphological representation in the languages tested. Further work is also needed to determine whether the same interpretation applies to other morphologically related words besides inflections (e.g., derived forms) and to find out how the morphological representation relates to both the form representation and the semantic representation of words and what role, if any, it plays in lexical access processes. The present support for a morphological level of representation shared in the case of cognate words between a bilingual's two languages, however, appears to provide the beginning of a promising new research line.

Acknowledgments

The research presented in this chapter was supported by a grant from the Ministry of Science and Technology (BSO2000-1252). We thank Annette de Groot and Judith Kroll for providing many helpful comments on an earlier version of this chapter.

Notes

1. We present versions of the distributed lexical/semantic feature model and the BIA model using Catalan and Spanish words as examples because the majority of experiments reported in the chapter have tested these languages. However, the same interpretations would apply in the case of Spanish-English translations. Nevertheless, it would be interesting to explore whether the morphological characteristics of the language can play a role in the issues we have addressed.

2. In the case of derivational suffixed forms, the results are less clear because factors such as semantic transparency and productivity appear to play a role in determining how these words are represented and accessed (e.g., Drews & Zwitserlood, 1995; Feldman & Larabee, 2001; Marslen-Wilson et al., 1994; Stolz & Feldman, 1995).

3. Although in the example used, porta–puerta, the gender and plural morphemes coincide in the two languages (Catalan and Spanish), this is not always the case (e.g., mito–mite, "myth" in English) (see Mascaró, 1985, for more details).

4. In our version of this model, we have only represented the word's root, but the representation of affixes (e.g., gender and plural suffixes) could also be included in the case of morphologically complex words. Note, however, that it might be possible that suffixes of regularly inflected forms will not be lexically represented because their use is generally governed by rules.

5. Giraudo and Grainger (2000) referred to their proposal as the supralexical hypothesis to distinguish it from the sublexical hypothesis defended by Taft (1994). Only the latter requires morphological decomposition as a prior stage to access.

6. Giraudo and Grainger (2000) also suggested that the locus of morphological masked priming effects within the same language is the morphological level.

References

Altarriba, J. (1992). The representation of translation equivalents in bilingual memory. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 157–174). Amsterdam: Elsevier.
Cabré, M. T. (1994). A l'entorn de la paraula (Vols. 1 and 2). Valencia, Spain: Universitat de València. Col·lecció: Biblioteca Lingüística Catalana.
Caramazza, A., Laudana, A., & Romani, C. (1988). Lexical access in inflectional morphology. Cognition, 28, 297–332.
Chen, H.-C. (1990). Lexical processing in a non-native language: Effects of language proficiency and learning strategy. Memory and Cognition, 18, 279–288.
Chen, H.-C., & Ng, M.-L. (1989). Semantic facilitation and translation priming in Chinese-English bilinguals. Memory and Cognition, 17, 454–462.
Cristoffanini, P. M., Kirsner, K., & Milech, D. (1986). Bilingual lexical representation: The status of Spanish-English cognates. The Quarterly Journal of Experimental Psychology, 38, 367–393.
Davis, C. W., & Kim, J. (2000, July). Masked priming by translation and phonological
primes in Korean and English. In F. Y. Dore interlingual homographs: The neglected role of
(Ed.), Abstracts of the XXVII International phonology. Journal of Memory and Language,
Congress of Psychology (p. 405). Hove, U.K.: 41, 496–518.
Psychology Press. Dijkstra, A., & Van Heuven, W. J. B. (1998).
Davis, C. W., Sánchez-Casas, R., & García-Albea, The BIA model and bilingual word recogni-
J. E. (1991). Bilingual lexical representation as tion. In J. Grainger & A. Jacobs (Eds.),
revealed using the masked priming procedure. Localist connectionist approaches to human
Unpublished manuscript. cognition (pp. 189225). Hillsdale, NJ:
Davis, C. W., Sanchez-Casas, R., & Garca-Albea, Erlbaum.
J. E. (2002). Masked translation priming: Dijkstra, A., Van Jaarsveld, H., & ten Brinke, S.
Varying language experience and word type (1998). Interlingual homograph recognition:
with Spanish-English bilinguals. Manuscript Effects of task demands and language
submitted for publication. intermixing. Bilingualism: Language and
Davis, C. W., & Schoknecht, C. (1996). Lexical Cognition, 1, 5166.
processing in Thai-English bilinguals. In Pan- Domnguez, A., Segu, J., & Cuetos, F. (2002). The
Asiatic Linguistics: Proceedings of the Fourth time-course of inectional morphological
International Symposium on Language and priming. Linguistics, 40, 235259.
Linguistics, Thailand (Vol. 4, pp. 13991428). Drews, E., & Zwitserlood, P. (1995). Morpholog-
De Groot, A. M. B. (1983). The range of auto- ical and orthographic similarity in visual
matic spreading activation in word priming. word recognition. Journal of Experimental
Journal of Verbal Learning and Verbal Psychology: Human Perception and
Behavior, 22, 417436. Performance, 21, 10981116.
De Groot, A. M. B. (1992). Bilingual lexical Durgunoglu, A. Y., & Roediger, H. L. (1987).
representation: A closer look at conceptual Test differences in accessing bilingual
representations. In R. Frost & L. Katz (Eds.), memory. Journal of Memory and Language,
Orthography, phonology, morphology and 26, 377391.
meaning (pp. 389412). Amsterdam: Elsevier. Evett, L. J., & Humphreys, G. W. (1981). The use
De Groot, A. M. B. (1993). Word-type effects in of abstract graphemic information in lexical
bilingual processing tasks: Support for a mixed access. The Quarterly Journal of Experimental
representational system. In R. Schreuder & Psychology, 33A, 325350.
B. Weltens (Eds.), The bilingual lexicon Feldman, L. B. (Ed.). (1995). Morphological
(pp. 191214). Amsterdam: Benjamins. aspects of language processing. Hillsdale, NJ:
De Groot, A. M. B. (2001). Lexical representation Erlbaum.
and lexical processing in the L2 user. In V. Cook Feldman, L. B., & Larabee, J. (2001). Morpho-
(Ed.), Portraits of the language user (pp. 32 logical facilitation following prexed but not
63). Clevedon, U.K.: Multilingual Matters. sufxed primes: Lexical architecture or
De Groot, A. M. B., Delmaar, P., & Lupker, S. modality-specic processes. Journal of
(2000). The processing of interlexical Experimental Psychology: Human Perception
homographs in translation recognition and and Performance, 27, 680691.
lexical decision: Support for non-selective ac- Feldman, L. B., & Moskovlijevic, J. (1987).
cess to bilingual memory. The Quarterly Repetition priming is not purely episodic
Journal of Experimental Psychology, 53A, in origin. Journal of Experimental
397428. Psychology: Learning, Memory and
De Groot, A. M. B., & Nas, G. L. J. (1991). Lex- Cognition, 15, 112.
ical representations of cognate and non- Feustel, T. C., Shiffrin, R. M., & Salasoo, A.
cognates in compound bilinguals. Journal of (1983). Episodic and lexical contribution to
Memory and Language, 30, 90123. the repetition effect in word identication.
De Groot, A. M. B., & Poot, R. (1997). Word Journal of Experimental Psychology: General,
translation at three levels of prociency in a 112, 309346.
second language: The ubiquitous involvement Forster, K. I. (1998). The pros and cons of masked
of conceptual memory. Language Learning, priming. Journal of Psycholinguistic Research,
47, 215264. 27, 203233.
Deutsch, A., Frost, R., Pollatsek, A., & Rayner, K. Forster, K. I., & Azuma, T. (2000). Masked
(2000). Early morphological effects in word priming for prexed words with bound stems:
recognition in Hebrew: Evidence from Does submit prime permit? Language and
parafoveal preview benet. Language and Cognitive Processes, 15, 539561.
Cognitive Processes, 15, 487506. Forster, K. I., & Davis, C. W. (1984). Repetition
Dijkstra, A., Grainger, J., & Van Heuven, W. J. B. priming and frequency attenuation in
(1999). Recognition of cognates and lexical access. Journal of Experimental
Representation of Cognates 249
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Marslen-Wilson, W., Tyler, L., Waksler, R., & Older, L. (1994). Morphology and meaning in the English mental lexicon. Psychological Review, 101, 3–33.
Mascaró, J. (1985). Morfologia. Barcelona, Spain: Enciclopèdia Catalana.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception. Part I: An account of basic findings. Psychological Review, 88, 375–405.
Mildred, H. V. (1986). Masked priming effects between and within languages. Unpublished honors thesis, Monash University, Melbourne, Australia.
Oliphant, G. (1983). Repetition and recency effect in word recognition. Australian Journal of Psychology, 35, 393–403.
Potter, M. C., So, K.-F., Von Eckardt, B., & Feldman, L. B. (1984). Lexical and conceptual representations in beginning and proficient bilinguals. Journal of Verbal Learning and Verbal Behavior, 23, 23–38.
Sánchez-Casas, R. (1999). Una aproximación psicolingüística al estudio del léxico en el hablante bilingüe. In M. de Vega & F. Cuetos (Eds.), Psicolingüística del español (pp. 597–651). Madrid, Spain: Trotta.
Sánchez-Casas, R., & Almagro, Y. (1999, April). Efectos de priming entre lenguas utilizando primes enmascarados y no enmascarados y diferente asincronía estimular. Paper presented at the meeting of the IV Simposium de Psicolingüística, Madrid, Spain.
Sánchez-Casas, R., Davis, C. W., & García-Albea, J. E. (1992). Bilingual lexical processing: Exploring the cognate/noncognate distinction. European Journal of Cognitive Psychology, 4, 293–310.
Sánchez-Casas, R., García-Albea, J. E., & Igoa, J. M. (2000, July). Can cognate words be characterized as a kind of morphological relations? In F. Y. Dore (Ed.), Abstracts of the XXVII International Congress of Psychology (pp. 405–406). Hove, U.K.: Psychology Press.
Sánchez-Casas, R., Suárez-Buratti, B., & Igoa, J. M. (1992, September). Are bilingual lexical representations interconnected? Paper presented at the meeting of the Fifth Conference of the European Society for Cognitive Psychology, Paris, France.
Sandra, D., & Taft, M. (Eds.). (1994). Morphological structure, lexical representation and lexical access. Mahwah, NJ: Erlbaum.
Scarborough, D. L., Gerard, L., & Cortese, C. (1984). Independence of lexical access in bilingual word recognition. Journal of Verbal Learning and Verbal Behavior, 23, 84–99.
Schwanenflugel, P., & Rey, M. (1986). Interlingual semantic facilitation: Evidence for a common representational system in the bilingual lexicon. Journal of Memory and Language, 25, 605–618.
Smith, M. C. (1997). How do bilinguals access lexical information? In A. M. B. de Groot and J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 145–168). Mahwah, NJ: Erlbaum.
Stolz, J. A., & Feldman, L. B. (1995). The role of orthographic and semantic transparency of the base morpheme in morphological processing. In L. B. Feldman (Ed.), Morphological aspects of language processing (pp. 109–129). Hillsdale, NJ: Erlbaum.
Taft, M. (1985). The decoding of words in lexical access: A review of the morphographic approach. In D. Besner, T. G. Waller, & G. E. MacKinnon (Eds.), Reading research: Advances in theory and practice (pp. 271–294). New York: Academic Press.
Taft, M. (1994). Interactive activation as a framework for understanding morphological processing. Language and Cognitive Processes, 9, 271–294.
Taft, M., & Forster, K. I. (1976). Lexical storage and retrieval of polymorphemic and polysyllabic words. Journal of Verbal Learning and Verbal Behavior, 15, 607–620.
Talamas, A., Kroll, J. F., & Dufour, R. (1999). From form to meaning: Stages in the acquisition of second language vocabulary. Bilingualism: Language and Cognition, 2, 45–58.
Tenpenny, P. L. (1995). Abstractionist versus episodic theories of repetition priming and word identification. Psychonomic Bulletin and Review, 2, 341–368.
Varela Ortega, S. (1992). Fundamentos de morfología. Madrid, Spain: Síntesis.
Weinreich, U. (1974). Language in contact: Findings and problems. The Hague, The Netherlands: Mouton. (Original work published 1953)
Williams, J. N. (1994). The relationship between word meanings in the first and second language: Evidence for a common, but restricted, semantic code. European Journal of Cognitive Psychology, 6, 195–220.
Wendy S. Francis
12
Bilingual Semantic and Conceptual
Representation
ABSTRACT The question of whether and to what extent semantic or conceptual representations
are integrated across languages in bilinguals has led cognitive psychologists to generate over
100 empirical reports. The terms semantic and conceptual are compared, contrasted, and
distinguished from other levels of representation, and terms used to describe language
integration are clarified. The existing literature addressing bilingual episodic and semantic
memory at the level of semantic systems and at the level of the translation-equivalent word
pair is summarized. This evidence strongly favors shared semantic systems and shared
semantic/conceptual representation for translation equivalents. Translation equivalents appear
to have a different and closer cognitive status than within-language synonyms. Important
directions in future cognitive research on bilingualism include neuroscientific and
developmental approaches.
play. Any of the concepts a person can know ought to have the potential to be expressed in any
human language. Of course, the concepts actually realized in an individual's language input or
output will vary with systematic patterns across languages. Semantic representations may be
those concepts that are referred to by particular words or sentences. Thus, semantic
representations would be representations of word or sentence meaning. Word meanings, or
semantic representations of words, would be a particular type of concept. This view would be
consistent with the position of many linguists (e.g., Jackendoff, 1994). A second way to think
about this relationship is to consider semantic representations or word meanings as the
mappings of verbal labels to their concepts. Although many of the concepts a person knows can
be expressed using individual words, there are of course many more concepts that are not
associated with any particular word. Such concepts have the potential to be expressed as
sentences (or larger units of language), in which case these concepts would be the semantic
representations of the sentences that express them.

Across different researchers in this area, some use conceptual or semantic representation
exclusively to describe the focus of their research, and others use both terms interchangeably.
If semantic representations are considered as a subset of the set of possible conceptual
representations, then all three practices seem reasonable because the conceptual representation
associated with a particular word or sentence is its semantic representation. Some researchers
(e.g., Pavlenko, 1999) have advocated separation of conceptual and semantic levels of
representation. In theory, if the constructs are different, they ought to be separable. However,
with a subset relationship, the separation is unlikely to be viable in experimental practice, at
least with respect to language. To clarify, if a researcher is interested in studying those
concepts that also happen to be semantic representations of words, there is no obvious way to
separate them. On the other hand, if a researcher is interested in studying concepts that are
not semantic representations of words or sentences, then, ironically, it is not clear how to
study them using language stimuli.

The central question about bilingual semantic/conceptual representations is the degree to which
they are integrated across languages. Here, it is important to talk about systems of possible
representations versus representations of specific words. A semantic or conceptual system can be
considered to have an innumerable set of possible semantic components, of which any word meaning
is identified with a subset or a particular pattern of activation or connection weights across
the entire system. Within this framework, the degree of integration can be considered at the
systems level or at the level of individual word meanings. For both systems and units, this type
of distributed or multicomponential representation allows the possibilities of completely shared
representations, completely separate representations, and intermediate partly shared
representations (as explained by De Groot, 1992a, 1992b). However, at the present time, bilingual
researchers have not completed the types of studies needed or the analyses necessary for a true
connectionist analysis, but some progress is being made in this area (e.g., Thomas & Van Heuven,
chapter 10, this volume). We have used primarily the classical information processing approach.

The semantic/conceptual level of representation must also be distinguished from other levels of
representation. Concepts can be given verbal labels, sequences of phonemes or graphemes, which
are called words; word meanings are the concepts to which these words refer. In bilinguals,
translation equivalents are words in different languages that refer to the same concept or have
the same meaning. When referring to mental representations of words, cognitive psychologists
often call this the lexical level of representation. The term lexical literally means "having to
do with words." Although this term does distinguish the word level from, say, the sentence level,
it does not specify what information about words is designated. In linguistics, the term lexical
is used rather generally: a lexical entry contains several types of knowledge about a word,
including phonological, morphological, syntactic, and semantic information, and the lexicon is
the collection of lexical entries that any person has acquired. In psychology, the lexical level
of representation is often meant to refer just to the level of the verbal label, and the lexicon
is often meant to refer only to the collection of verbal labels. Less formally, the lexicon is
the set of words in a person's vocabulary. This differential usage is often cause for confusion,
so it would be advisable to be clear on exactly what the term is intended to mean. Because of
this ambiguity, it is tempting to avoid the word lexical altogether, but the term lexical seems
appropriate in reference to research that focuses on individual words and their meanings rather
than larger units of language such as phrases or full sentences. Such research can be framed in
terms of lexical representation, lexical processing, or lexical access.
Semantic Representation 253
along with tables of quantitative comparisons, see Francis, 1999b.) The studies are organized
according to the type of representation they address. First, they are classified according to
whether they address primarily episodic memory representations or semantic memory
representations. This distinction, proposed by Tulving (1972), is one in which the term episodic
memory is used to refer to memory for events, and the term semantic memory is used to refer to
information in a person's knowledge base. In the context of this chapter, the episodic memories
addressed are primarily for verbal events. The semantic memories addressed are primarily
knowledge of language. To determine the type of memory most directly addressed, the main
criterion was that experiments on episodic memory typically have a delay between an initial
exposure and a retrieval task of some sort, whereas experiments on semantic memory typically
deal with simultaneous or immediate sequential processing of stimuli. Of course, studies of
episodic memory may shed light on the underlying semantic memory representation of the concepts
used as stimuli, but these inferences must be identified as such.

In the following summary, the sets of studies addressing episodic and semantic memory are further
divided along a second dimension: studies that address memory systems and studies that address
pairwise relationships among corresponding units within these systems. To determine which type
of representation was addressed by a particular study, the main criterion was that experiments
that address pairwise representations of units typically deal with translation equivalents,
whereas studies that address systemwise representation typically deal with associates or
influences of items that are not translation equivalents.

Episodic Memory Systems

Studies comparing memory for mixed-language word series to memory for single-language word series
have shown that having to remember the language of input in addition to the corresponding concept
requires an additional memory load. In these studies, bilingual participants were given a free
recall test in which answers were counted as correct only if given in the appropriate language.
Recall performance for mixed-language word lists was equivalent (93–100%) to that of
single-language word lists when words in the different languages came from different semantic
categories (Lambert, Ignatow, & Krauthamer, 1968; McCormack & Novell, 1975; Nott & Lambert, 1968;
Peynircioglu & Durgunoglu, 1993; Saegert, Obermeyer, & Kazarian, 1973). However, performance for
mixed-language lists was inferior (68–86%) to that of single-language lists when items from the
same semantic category were studied in different languages (Lambert et al., 1968; Nott & Lambert,
1968; Palmer, 1972; Tulving & Colotla, 1970). A single study of recognition memory for unrelated
words showed a small but significant enhancement of recognition performance for mixed-language
lists relative to single-language lists (McCormack & Colletta, 1975).

Examination of language clustering in recall of mixed-language lists indicates that, although
language can be used as an organizer in episodic memory, it is subordinate to semantic
organization. In the original reports, there were apparent discrepancies among the results
because random sequence was used as a baseline. Because output sequences are strongly influenced
by the sequence of the input (e.g., Dalrymple-Alford & Aamiry, 1969), a more appropriate baseline
for output clustering is the degree of clustering in the input sequence. A reanalysis using the
input sequence clustering as a baseline led to a consistent pattern of results across studies
(Francis, 1999b). All five studies showed more clustering in the output sequences than in the
input sequences, indicating positive reorganization by language (Dalrymple-Alford & Aamiry, 1969;
Lambert et al., 1968; Nott & Lambert, 1968; Saegert, Obermeyer, et al., 1973; Tulving & Colotla,
1970). When compared to semantic category reorganization, which was substantial in the studies
that measured it, two studies showed less reorganization by language than by semantic category
(Lambert et al., 1968; Nott & Lambert, 1968), and one did not (Dalrymple-Alford & Aamiry, 1969).
Thus, language appears not to be as salient an organizing principle as semantic category. The
language reorganization observed may be a function of separation at the phonological level.

Several types of interference studies have shown that learning items in one language can
adversely affect the learning or retrieval of semantically similar items in the other language.
One method has been to examine interference in paired associate learning. Negative transfer
occurs when a set of paired associates is re-paired after initial learning, and performance on
the re-paired list is worse than for a new set of paired associates. Learning the re-paired set
also impairs later recall of the original set of pairs relative to a new set, a phenomenon known
as retroactive interference. Negative transfer (Lopez, Hicks, & Young, 1974; Young & Webber, 1967)
and retroactive interference (Lopez et al., 1974; Young & Navar, 1968) occur even if words in the
intervening re-paired list are in a different language from the original, showing that
associations made in one language carry over to the other language. Therefore, the concepts of
the two members of a pair were associated, not just the surface forms, indicating shared
conceptual systems for the two languages. In another paired associate learning paradigm, learning
to associate translation equivalents with two different cues was more difficult than learning to
associate two unrelated different-language words with two different cues (Kintsch & Kintsch,
1969).

Other types of conceptually based interference effects also extend across languages. Whole–part
interference is shown when learning a set of words interferes with later learning of a subset of
those words relative to a new set; part–whole interference is shown when the subset is learned
first and interferes with learning of the whole set relative to a new set. In a bilingual study
of whole–part interference and part–whole interference, the whole lists and part lists were
presented either in the same language or in different languages (Saegert, Kazarian, & Young,
1973). Between-language interference was not as strong as within-language interference, but it
was substantial in all conditions except that in the part–whole paradigm there was facilitation
instead of interference relative to a neutral list when going from the dominant to the
nondominant language. Misinformation effects induced by presenting misleading information to
eyewitnesses between an observed event and questioning (Loftus, 1975) have also been examined in
bilinguals. A study with bilingual witnesses showed that the degree of interference was
equivalent whether the misleading information was given in the same language or in a different
language from the final recall and recognition tests (Shaw, Garcia, & Robles, 1997). Under an
extreme separate concept model, encoding and retrieving concepts in one language should not
affect encoding and retrieval of concepts in the other language, but these interference results
clearly contradict this expectation. Therefore, they support a model in which concepts are at
least partly shared.

Analogical transfer in problem solving occurs when a new problem is solved by applying a solution
previously learned for a different problem with similar causal structure. Studies using
probability problems (Bernardo, 1998) and insight problems (Francis, 1999a) have shown a high
degree of analogical transfer across languages. Directed transfer rates across languages were
89–96% of the corresponding within-language rates. Spontaneous transfer rates across languages
ranged from 65% to 95% of the within-language rates. The greatest attenuation was observed as
decreased reminding or access to the source problem in spontaneous transfer for pairs of word
problems with very high surface similarity, suggesting that reductions in transfer were caused
by surface rather than conceptual characteristics of the source and target problems.

Results from other paradigms that addressed episodic memory at the systems level have been
interpreted as evidence for separate memory systems, although not necessarily by the researchers
who conducted the studies. However, these techniques do not address the semantic/conceptual level
of representation. First, when learning and recalling lists of words from the same semantic
category, performance declines with each successive list, a phenomenon known as proactive
interference (or, alternatively, proactive inhibition), but if the category changes on a
subsequent list, there is a recovery in performance or release from proactive interference
(Wickens, Born, & Allen, 1963). Performance recovers following a language change as well (Dillon,
McCormack, Petrusic, Cook, & Lafleur, 1973; Goggin & Wickens, 1971), but this effect can be
explained by phonological differences between words in different languages (O'Neill & Huot, 1984)
and therefore is not informative about the semantic level of representation. Second, the tendency
when speaking a particular language to recall autobiographical events that occurred in the same
language context (Marian & Neisser, 2000; Schrauf & Rubin, 1998, 2000) does not indicate separate
memory stores for the two languages, but rather extends the range of known context-related
effects on memory.

Episodic Memory Representations of Translation Equivalents

Studies of direct cross-language memory tests such as recall and recognition showed that items
learned in one language can be intentionally accessed through the other language as long as the
encoding and retrieval conditions encourage conceptual processing, as in free recall or
recognition (Durgunoglu & Roediger, 1987; Ervin, 1961; Kintsch, 1970). Similarly, positive
transfer learning paradigms have shown that learning of a word list (Lambert,
Havelka, & Crosby, 1958; Lopez & Young, 1974; Young & Saegert, 1966) or set of sentences (Opoku,
1992) was enhanced by previous or interpolated learning of the translations relative to previous
or interpolated learning of an unrelated list or sentence. These findings provide only weak
evidence for shared representation, because the extent to which strategic covert translation
contributes to the transfer effects observed is unknown.

To avoid this problem, several studies were designed using less direct measures of memory,
including savings and repetition priming paradigms. Savings is a phenomenon in which material
previously learned but forgotten is relearned more quickly than new material. When nonrecallable
number–word paired associates were relearned in a different language, the savings effect
(relative to unrelated word sets) was substantial, but this effect was about half the magnitude
of the savings observed when the paired associates were relearned in the same language (MacLeod,
1976). Repetition priming is a change in speed, accuracy, or bias based on previous experience
with an item and typically refers to those effects that last for at least several minutes or
after several intervening items. Under a shared-concept model, priming across languages would be
expected to the extent that priming within a language is based on conceptual processing. Under a
separate concept model, conceptual repetition priming across languages would not be expected.

Three different types of conceptually based priming paradigms have been examined in bilinguals
and have yielded substantial cross-language repetition priming. First, category–exemplar
generation priming, a bias to generate previously studied exemplars to category cues, is
substantial even when the exemplars are studied and generated in different languages (Francis,
2001; Francis & Bjork, 1992). Second, verb generation priming, a response time advantage for
repeated items in generating appropriate verbs to noun cues (e.g., generating "bark" to "dog"),
is substantial even when the language changes from the first to second occurrence (Seger, Rabin,
Desmond, & Gabrieli, 1999). Third, semantic classification of words is faster for repeated than
for new items even when the language changes. This effect has been demonstrated for
animate/inanimate decisions (Zeelenberg & Pecher, 2003), natural/manufactured decisions
(Zeelenberg & Pecher, 2003), and concrete/abstract decisions (Francis & Goldmann, 2003).

The attenuation of priming across languages in some of these tasks suggests that nonconceptual
processes also contribute to the within-language effects. Semantic processing is not considered
the primary basis of repetition priming in word fragment completion, a paradigm in which
successfully completing a word fragment like S_N_W_C_ is facilitated by prior presentation of
its completion "sandwich." However, within this paradigm, a high degree of repetition priming
across languages was observed when the encoding task required deep conceptual processing (Smith,
1991). Substantial but smaller cross-language priming effects occurred when the encoding tasks
were less conceptual (Basden, Bonilla-Meeks, & Basden, 1994; Durgunoglu & Roediger, 1987; Heredia
& McLaughlin, 1992; Peynircioglu & Durgunoglu, 1993). These studies showing repetition priming
across languages indicated that concepts encoded by means of one language are automatically
accessible to the other language even when no effort is made to retrieve them.

Studies of recall for bilingual repetitions show a spacing effect for translation-equivalent
repetitions, in that recall performance is greater when the repetitions occur after several
intervening items than when the repetitions occur in immediate succession (Glanzer & Duarte,
1971; Heredia & McLaughlin, 1992; Paivio, Clark, & Lambert, 1988). This result contradicts the
expectations of the separate concept model and supports the shared concept model. Under a
separate concept model, translation equivalents have different concepts (as would two unrelated
words), so their recall probabilities would be independent of each other and therefore not depend
on the number of intervening items. However, under a shared concept model, occurrences of
translation equivalents involve repetition of the same concept, so the well-established spacing
effect for within-language repetitions would be expected to generalize to between-language
repetitions of the concept.

The studies on memory for language of input showed that bilinguals often remembered concepts
without remembering the language in which the concepts were learned. In a basic study of
recognition memory for words, bilinguals were twice as likely to misclassify the language as they
were to misrecognize an item (Kintsch, 1970). The phenomenon was more striking in studies
designed to increase language confusion, such as using cognates as stimuli (Cristoffanini,
Kirsner, & Milech, 1986), having only script to distinguish language (Brown, Sharma, & Kirsner,
1984), or studying mixed language sets of highly related sentences (O'Neill & Dion, 1983;
Rosenberg & Simon, 1977). Similarly, in learning successive word lists in which some
words were reused in different languages, sub- These effects included the findings that a different
stantial proactive language confusion was ex- language repetition helps free recall relative to a
hibited (Liepmann & Saegert, 1974). Translation single presentation, a different language massed
intrusions also occurred when mixed-language sets repetition helps recall more than a same language
of paired associates were re-paired (Lopez et al., massed repetition, and same and different language
1974). Translation intrusion errors were exhibited spaced repetitions elicit equivalent recall perfor-
to a lesser degree in a number of free recall ex- mance (Durgunoglu & Roediger, 1987; Glanzer &
periments not specifically designed to induce con- Duarte, 1971; Heredia & McLaughlin, 1992; Ko-
fusion (Kolers, 1966; Lambert et al., 1968; Nott & lers, 1966; Kolers & Gonzalez, 1980; Paivio et al.,
Lambert, 1968; Paivio et al., 1988; Rose & Car- 1988; Winograd et al., 1976).
roll, 1974). These findings showed that the lan- The advantage for different language repetitions
guage of input is not a necessary feature of the over single presentations is expected under either
episodic memory representation. The finding that model, because the joint probability of remember-
memory for language of input for words and sen- ing one of two things with different names is al-
tences can be high under certain experimental cir- ways expected to be higher than the probability of
cumstances (Cristoffanini et al., 1986; Kintsch, remembering a specific one of those two. The ad-
1970; MacLeod, 1976; O'Neill & Dion, 1983; vantage for different language over identical repe-
Rose, Rose, King, & Perez, 1975; Rosenberg & titions in massed conditions is expected under
Simon, 1977; Saegert, Hamayan, & Ahmar, 1975; either model because the massed between-language
Winograd, Cohen, & Barresi, 1976) and can be repetitions are more distinctive than the identical
attributed to memory for the different phonology repetitions because of phonological/orthographical
or orthography of noncognate translation equiva- differences. The equivalent performance under
lents in different languages. spaced conditions for different language and iden-
Other studies dealing with episodic memory of tical repetitions is also plausible under either model
translation equivalents were not informative about because, with increased spacing, identical repeti-
whether the representations are shared at the se- tions approach independence, as would be expected
mantic/conceptual level (for more detail, see Fran- for semantic (e.g., different language) repetitions
cis, 1999b). In two cases, it was because the basis of and even unrelated items. One exception in which
the memory phenomenon under investigation was between-language spaced repetition of a story led to
not semantic in nature. First, the nding that word better recall than a within-language spaced repeti-
fragments were not good recall cues for words tion (Hummel, 1986) could be explained under ei-
studied in a different language (Watkins & Pey- ther model by paying more attention to a translated
nircioglu, 1983) does not bear on the semantic/ story than to an identical repeated story.
conceptual level of representation because word Similarly, ndings of translation-based genera-
fragments constitute cues to orthography and pho- tion effects, an advantage in memory for items gen-
nology, not meaning. Second, across four studies erated from a translation cue rather than merely read
of repetition priming in lexical decision (deciding (Arnedt & Gentile, 1986; Basden et al., 1994; Basi,
whether a letter string such as chair or glarb is a Thomas, & Wang, 1997; ONeill, Roy, & Trem-
word or not), there was no evidence of facilitation blay, 1993; Paivio & Lambert, 1981; Potter, So, Von
from noncognate translation equivalents (Cris- Eckardt, & Feldman, 1984; Vaid, 1988) and the
toffanini et al., 1986; Gerard & Scarborough, 1989; absence of generation effects under some conditions
Kirsner, Brown, Abrol, Chadha, & Sharma, 1980; (Durgonoglu & Roediger, 1987; ONeill et al.,
Kirsner, Smith, Lockhart, King, & Jain, 1984). Be- 1993; Slamecka & Katsaiti, 1987) were consistent
cause even the within-language facilitation for re- with both shared concept and separate concept
peated items in lexical decision is not thought to be models. This is because generation effects are ex-
conceptually based, the absence of priming across pected based on having to produce a word from a cue
languages in those studies was not informative on rather than simply reading it, regardless of whether
the question of conceptual representation. the cue is another word in the same language or a
In several studies dealing with effects of repeti- translation equivalent. The lack of effect in some
tion and generation on recall, there were plausible cases may be consistent with the single-language
explanations for the results under either shared or literature on the generation effect; the effect ap-
separate models. Some of the observed effects of pears to be somewhat inconsistent, with positive,
bilingual repetition on recall could be explained null, and even reversed effects across studies. (See
under either a shared or separate concept model. Steffens & Erdfelder, 1998, for a review of this
literature and evaluation of the conditions under which generation effects occur.)

Semantic/Conceptual Systems

Lexical decisions are faster when a word is immediately preceded by a semantic associate than when it is immediately preceded by an unrelated word or presented in isolation, a phenomenon known as semantic priming of lexical decision. Several studies examining this type of semantic priming across languages revealed that the advantage for a related prime held even when it appeared in a different language from the target word (Chen & Ng, 1989; De Groot & Nas, 1991; Frenck & Pynte, 1987; Grainger & Beauvillain, 1988; Jin, 1990; Keatley & De Gelder, 1992; Keatley, Spinks, & De Gelder, 1994; Kirsner et al., 1984; Schwanenflugel & Rey, 1986; Tzelgov & Eben-Ezra, 1992; Williams, 1994). This facilitation was evident whether the control condition was an item with an unrelated prime or an item with no prime. A few discrepant nonsignificant cross-language facilitation effects in these same studies appeared to be Type II errors due to insufficient power. (Although the effect disappeared under response-deadline conditions [Keatley & De Gelder, 1992], it is not clear how this finding might have an impact on the conclusions drawn.)

Noncognate different language associates elicited an effect that was on average 70–80% as large as that of the corresponding within-language associates. Priming effects for cognates were stronger, indistinguishable from within-language priming, but it cannot be determined whether the cognates were processed in the intended language, especially given findings suggesting that lexical access in word reading is nonselective with respect to language (e.g., Dijkstra, Grainger, & Van Heuven, 1999; Jared & Kroll, 2001; Jared & Szucs, 2002). These experiments showed that processing of an item in one language can be facilitated when immediately preceded by a related item in the other language.

Semantic comparisons between words from different languages took no longer than comparisons between words in the same language. The semantic comparisons included verification of category–exemplar relationships (Caramazza & Brones, 1980; Dufour & Kroll, 1995; Potter et al., 1984), choosing the more extreme member of a word pair (Popiel, 1987), and solving analogies (Malakoff, 1988). Producing the name of a superordinate category in response to an exemplar likewise took the same amount of time whether the response was given in the same language or a different language from the exemplar (Shanon, 1982). Other evidence that semantic category is a more dominant organizer of semantic memory than language comes from the finding that language clustering in generating exemplars of two categories from semantic memory (i.e., with no prior study sequence), although substantial, was subordinate to organization by semantic category (Dalrymple-Alford, 1984).

Studies of between-language interference effects have shown that processing in one language can automatically interfere with processing of another. The Stroop effect is an interference phenomenon in which the naming of ink colors is slowed when the colors are presented in the form of incongruent color words (e.g., responding "red" when the word "blue" is printed in red ink). The most common variant on this color–word task is a picture–word task, in which picture naming is slowed by having an incongruent word superimposed on or presented simultaneously with the target picture (e.g., naming a picture of a goat with the word "sheep" superimposed). Relative to a neutral condition, between-language color–word interference was consistently reliable, ranging from 58% to 95% of the magnitude of the within-language effect across studies (Abunuwara, 1992; Chen & Ho, 1986; Dalrymple-Alford, 1968; Dyer, 1971; Fang, Tzeng, & Alva, 1981; Kiyak, 1982; Lee, Wee, Tzeng, & Hung, 1992; Preston & Lambert, 1969; Smith & Kirsner, 1982). Similarly, between-language picture–word interference relative to a neutral control ranged from 75% to 140% of the within-language effect across studies (Costa, Miozzo, & Caramazza, 1999; Ehri & Ryan, 1980; Rusted, 1988; Smith & Kirsner, 1982).

Other Stroop-like interference tasks have been examined in bilinguals and have consistently yielded substantial between-language interference for incongruent relative to neutral conditions. The word–word interference task requires naming words that have distracter words superimposed on them, which slows performance in bilinguals even if the distracter word is in a different language (Chen & Tsoi, 1990). The tone–word interference task requires classification of tones as high, medium, or low, with simultaneous incongruent presentation of the written words high, medium, or low, which slows performance even if responses are to be given in a different language from the distracter words. A flanker–word interference task requires classification of a word that is flanked above and below by words that would require a different classification response, which slows responses even if the flanker words are
in a different language from the word to be classified (Fox, 1996). In all cases, between-language interference was attenuated relative to within-language interference. In a related translation paradigm, presentation of semantically related distractor words (La Heij, De Bruyn, Elens, Hartsuiker, Helaha, & Van Schelven, 1990; La Heij, Hooglander, Kerling, & Van der Velden, 1996; Miller & Kroll, 2002) or semantically related pictures (La Heij et al., 1996) slowed translation relative to neutral conditions.

Although the standard control for the Stroop interference task is a neutral condition, a number of studies have used a congruent condition, in which ink colors and color words match, as a control or comparison condition. The use of congruent trials as a control in Stroop and Stroop-like interference tasks is problematic because it mixes facilitation of the congruent condition relative to the neutral condition and interference of the incongruent condition relative to the neutral condition (MacLeod, 1991). In fact, in bilingual color–word interference studies, a congruent item in a different language has been shown to slow responses relative to neutral items (Abunuwara, 1992; Dalrymple-Alford, 1968), which could have the effect of spuriously reducing the estimate of between-language interference relative to within-language interference. Nevertheless, even relative to the congruent control, interference across languages in the color–word task is substantial (Abunuwara, 1992; Altarriba & Mathis, 1997; Dalrymple-Alford, 1968; Tzelgov, Henik, & Leiser, 1990). The net effects of using a congruent condition as a control are less clear for the other Stroop-like interference paradigms. For the picture–word task, translation equivalents facilitated picture naming (Costa et al., 1999), but in another study, words phonologically related to the first language translation equivalent slowed second language picture naming (Hermans, Bongaerts, De Bot, & Schreuder, 1998). Interference in incongruent relative to congruent conditions was evident in experiments using the picture–word task (Gerhand, Deregowski, & McAllister, 1995), the tone–word task (Hamers & Lambert, 1972), and the flanker–word task (Guttentag, Haith, Goodman, & Hauch, 1984).

Interference based on automatic processing of the nontarget language has been observed in other paradigms as well. When lexical decisions were made on a word with a translation equivalent, rather than an unrelated word, that was ignored on the previous word-naming trial, response times increased substantially, an effect known as negative priming (Neumann, McCloskey, & Felio, 1999). In a paradigm requiring a decision of whether a phoneme belonged to the name of a picture, rejecting phonemes that were part of the nontarget language name was slower than rejecting phonemes that were not part of the name in either language (Colomé, 2001). In category exemplar generation from semantic memory, part-set cueing (auditory presentation of a subset of possible responses) in a different language interfered as much as did cueing in the same language (Peynircioglu & Goksen-Erelcin, 1988). The results of these studies suggested that activation spreads from the nontarget language to the target language by means of the common concept. Together, these studies indicated that the language systems of a bilingual are interdependent and share common elements at the semantic level.

Two types of results that addressed the semantic systems level could be explained with either a shared or a separate semantic system. Differences among associates produced to words in different languages (Dalrymple-Alford & Aamiry, 1970; Kolers, 1963) were not particularly informative about the degree of integration in semantic systems, even when accounting for the likelihood of producing the same associate twice in repeated trials in the same language. Although separate semantic systems would be expected to lead to different associations, several alternative explanations are possible. For example, the co-occurrence frequencies of particular word pairs, frequencies of category exemplars, and word order rules differ across languages, and any of these could lead to differences in the associates generated. Two lexical decision findings were also ambiguous in their implications for semantic representation. Lexical decision times were as fast when bilinguals had to verify words from both languages as in single-language lexical decision (Caramazza & Brones, 1979). Rejecting words from the nontarget language was slower than rejecting items that were nonwords in both languages as long as the words were orthographically legal and pronounceable in the target language (Grainger & Beauvillain, 1987; Nas, 1983; Scarborough, Gerard, & Cortese, 1984; Thomas & Allport, 2000). Both types of lexical decision findings could be attributed to shared semantic systems. However, a plausible alternative is that the letter strings presented were processed by both language systems simultaneously (in parallel), which would not require shared semantic systems. Therefore, these methods did not have clear implications for bilingual semantic representation.
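
The concern raised earlier in this section about congruent baselines in Stroop-type tasks can be made concrete with a small numerical sketch. The response times below are invented purely for illustration (they are not taken from any cited study), and the function and variable names are hypothetical. The sketch splits the incongruent-minus-congruent difference into interference and facilitation relative to a neutral baseline, and assumes a between-language congruent condition that is slower than neutral, as the text describes.

```python
# Hypothetical mean color-naming times in milliseconds. These values are
# invented for illustration and do not come from any study cited here.
rts = {
    "within":  {"neutral": 600, "congruent": 570, "incongruent": 700},
    "between": {"neutral": 600, "congruent": 615, "incongruent": 675},
}


def effects(condition_means):
    """Split the incongruent-minus-congruent difference into two parts."""
    neutral = condition_means["neutral"]
    # Interference: slowing of incongruent trials relative to neutral trials.
    interference = condition_means["incongruent"] - neutral
    # Facilitation: speeding of congruent trials relative to neutral trials
    # (negative when congruent trials are slower than neutral trials).
    facilitation = neutral - condition_means["congruent"]
    return {
        "interference": interference,
        "facilitation": facilitation,
        "incongruent_minus_congruent": interference + facilitation,
    }


within = effects(rts["within"])
between = effects(rts["between"])

# Between-language effect as a proportion of the within-language effect,
# first against the neutral baseline, then against the congruent baseline.
ratio_neutral = between["interference"] / within["interference"]
ratio_congruent = (
    between["incongruent_minus_congruent"]
    / within["incongruent_minus_congruent"]
)

print(f"neutral baseline:   {ratio_neutral:.2f}")
print(f"congruent baseline: {ratio_congruent:.2f}")
```

With these made-up values, between-language interference is 75% of the within-language effect when measured against the neutral baseline, but only about 46% when measured against the congruent baseline, which illustrates how a congruent control can understate between-language interference.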
mutual exclusivity across languages (Au & Glusman, 1990). There were no reliable performance differences on either task between monolingual and bilingual children. This study showed that, although children reject synonymy within a language, they accept translation equivalence.

A number of the paradigms used to examine the semantic integration of translation equivalents have also been conducted with synonyms. Several studies of episodic memory show differences between translation equivalents and synonyms in situations of both positive and negative transfer. For example, recall of a word is enhanced more by a translated repetition than by a synonymous repetition (Kolers & Gonzalez, 1980; Paivio et al., 1988). Savings in recall has been demonstrated for nonrecallable translation equivalents (MacLeod, 1976), but not for nonrecallable synonyms (Nelson, 1971). Intrusion rates in free recall (under mixed-language testing conditions) are higher for translation equivalents than for within-language synonyms (Paivio et al., 1988). Studies of semantic memory and language processing yield similar patterns of positive and negative transfer. Transfer of reading speed is 100% for (word-for-word) translated sentences (MacKay & Bowman, 1969), but only partial for synonym-substituted sentences (Levy, Di Persio, & Hollingshead, 1992). Repetition blindness, a decrement in recall performance for second occurrences of words within lists or sentences in rapid serial visual presentation, has been demonstrated for translation repetitions in mixed language word lists and sentences (MacKay & Miller, 1994; Sanchez-Casas et al., 1992; but see Altarriba & Soltano, 1996). In contrast, synonym repetitions did not exhibit repetition blindness (Kanwisher & Potter, 1990).

Because very few of these studies had the explicit purpose of comparing translation equivalents and synonyms to each other, the individual cross-study comparisons ought to be considered preliminary until controlled comparison studies have been conducted. However, the consistency across paradigms gives strong evidence that translation equivalents are more closely related than are within-language synonyms.

Neuroimaging Studies of Bilingual Conceptual/Semantic Representation

Advances in neuroimaging over the last decade, in particular positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), have enabled new methods for examining bilingual cognitive organization. Most relevant to this chapter are studies that focus on bilingual semantic representation, particularly the question of shared versus separate semantic systems. If words in the two languages of a bilingual activate a common semantic system, the same cortical areas ought to be sensitive to semantic relative to nonsemantic processing in both languages, but if the two languages activate separate systems, the localization of activation ought to differ across languages.

Studies addressing semantic representation generally have found no reliable differences in localization of semantic processing across languages (e.g., Chee, Hon, Lee, & Soon, 2001; Illes et al., 1999), which is consistent with the cognitive experimental evidence for shared semantic systems discussed in this chapter. However, studies that also involve phonology, orthography, or processing of whole language, particularly when participants have limited proficiency in one language, often yield different patterns. For a review of the bilingual neuroimaging research with an emphasis on language comprehension and production processes, see Abutalebi, Cappa, and Perani, chapter 24, this volume. A key to progress in this area of research will be for bilingual cognition experts to be involved in focusing studies on questions of maximal theoretical interest, choosing tasks that are appropriate from a cognitive/bilingual perspective, and interpreting results within the context of the bilingual cognition literature.

Developmental Approaches

Researchers have only begun to address the development of conceptual/semantic structures in bilinguals. Developmental models of bilingual language acquisition are likely to be important in the future of bilingual research because they allow for changes in representation with learning. Although cognitive psychologists have studied extensively the organization of bilingual lexical and semantic representation in proficient bilinguals, far less attention has been given to the question of how the representation got to that point (one exception is Kroll's revised hierarchical model; Dufour & Kroll, 1995; Kroll & De Groot, 1997). Surprisingly little is known about what it means cognitively for a person to go from being monolingual to bilingual because data on appropriate cognitive tasks across different levels of learning are sparse. There are very few cross-sectional cognitive studies that examine bilinguals across several different levels of language acquisition in the literature (notable exceptions
are the work of Chen, 1990; De Groot & Poot, 1997; Magiste, 1984, 1985, 1992) and apparently no longitudinal cognitive studies. Therefore, the existing data are insufficient to provide empirical support to build a more comprehensive model of bilingual language development.

As De Groot (2000) pointed out, studying semantic or lexical representation as it exists in a person who is at a particular stage of second language acquisition does not assume that the representation is static, but it instead gives us a window on the acquisition process. It would be informative to capture more windows at different stages of acquisition. Perhaps combining current experimental techniques with the microdevelopmental approach would enable a better understanding of how these representations evolve.

Illustrations of the microdevelopmental approach across a variety of contexts can be found in the edited volume Microdevelopment: Transition Processes in Development and Learning (Granott & Parziale, 2002). As an example, Gelman, Romo, and Francis (2002) conducted a microdevelopmental study using notebooks kept by English as a second language students throughout a course to get windows on learning at several different points during their science learning and English language acquisition. Collecting more cognitive measures of bilingual processing or conducting a true experiment addressing semantic or lexical representation on a similar schedule would likely provide new insights on the acquisition process.

A large body of developmental or second language acquisition research does describe the patterns of language performance exhibited by second language learners at different levels of proficiency, at different ages, and in different situations. However, these studies shed virtually no light on the cognitive processes underlying the observed effects. For example, many studies have been interpreted as supporting a critical period for second language learning, yet these interpretations rarely explain which cognitive processes or mechanisms might be relatively problematic for older learners. Identification of these cognitive mechanisms would likely be useful in developing more rigorous models of second language acquisition and in developing methods to improve second language learning and instruction.

Acknowledgment

It should be noted that much of the conceptual content of this chapter was also reported in my 1999 Psychological Bulletin article (Francis, 1999b).

References

Abunuwara, E. (1992). The structure of the trilingual lexicon. European Journal of Cognitive Psychology, 4, 311–322.
Altarriba, J. (1992). The representation of translation equivalents in bilingual memory. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 157–174). Amsterdam: Elsevier Science.
Altarriba, J., & Mathis, K. M. (1997). Conceptual and lexical development in second language acquisition. Journal of Memory and Language, 36, 550–568.
Altarriba, J., & Soltano, E. G. (1996). Repetition blindness and bilingual memory: Token individuation for translation equivalents. Memory & Cognition, 24, 700–711.
Arnedt, C. S., & Gentile, J. R. (1986). A test of dual coding theory for bilingual memory. Canadian Journal of Psychology, 40, 290–299.
Au, T. K., & Glusman, M. (1990). The principle of mutual exclusivity in word learning: To honor or not to honor? Child Development, 61, 1474–1490.
Basden, B. H., Bonilla-Meeks, J. L., & Basden, D. R. (1994). Cross-language priming in word-fragment completion. Journal of Memory and Language, 33, 69–82.
Basi, R. K., Thomas, M. H., & Wang, A. Y. (1997). Bilingual generation effect: Variations in participant bilingual type and list type. Journal of General Psychology, 124, 216–222.
Bernardo, A. B. I. (1998). Language format and analogical transfer among bilingual problem solvers in the Philippines. International Journal of Psychology, 33, 33–44.
Brown, H., Sharma, N. K., & Kirsner, K. (1984). The role of script and phonology in lexical representation. The Quarterly Journal of Experimental Psychology, 36A, 491–505.
Caramazza, A., & Brones, I. (1979). Lexical access in bilinguals. Bulletin of the Psychonomic Society, 13, 212–214.
Caramazza, A., & Brones, I. (1980). Semantic classification by bilinguals. Canadian Journal of Psychology, 34, 77–81.
Chee, M. W. L., Hon, N., Lee, H. L., & Soon, C. S. (2001). Relative language proficiency modulates BOLD signal change when bilinguals perform semantic judgments. NeuroImage, 13, 1155–1163.
Chen, H. C. (1990). Lexical processing in a non-native language: Effects of language
proficiency and learning strategy. Memory & Cognition, 18, 279–288.
Chen, H. C., & Ho, C. (1986). Development of Stroop interference in Chinese-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 397–401.
Chen, H. C., & Ng, N. L. (1989). Semantic facilitation and translation priming effects in Chinese-English bilinguals. Memory & Cognition, 17, 454–462.
Chen, H. C., & Tsoi, K. C. (1990). Symbol-word interference in Chinese and English. Acta Psychologica, 75, 123–138.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: Language-specific or language-independent? Journal of Memory and Language, 45, 721–736.
Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the bilingual's two languages compete for selection? Journal of Memory and Language, 41, 365–397.
Cristoffanini, P., Kirsner, K., & Milech, D. (1986). Bilingual lexical representation: The status of Spanish-English cognates. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 38, 367–393.
Dalrymple-Alford, E. C. (1968). Interlingual interference in a color-naming task. Psychonomic Science, 10, 215–216.
Dalrymple-Alford, E. C. (1984). Bilingual retrieval from semantic memory. Current Psychological Research and Reviews, 3, 3–13.
Dalrymple-Alford, E. C., & Aamiry, A. (1969). Language and category clustering in bilingual free recall. Journal of Verbal Learning and Verbal Behavior, 8, 762–768.
Dalrymple-Alford, E. C., & Aamiry, A. (1970). Word associations of bilinguals. Psychonomic Science, 21, 319–320.
De Groot, A. M. B. (1992a). Bilingual lexical representation: A closer look at conceptual representations. In R. Frost & L. Katz (Eds.), Orthography, phonology, morphology, and meaning (pp. 389–412). Amsterdam: Elsevier.
De Groot, A. M. B. (1992b). Determinants of word translation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1001–1018.
De Groot, A. M. B. (2000). On the source and nature of semantic and conceptual knowledge. Bilingualism: Language and Cognition, 3, 7–9.
De Groot, A. M. B., & Nas, G. L. J. (1991). Lexical representation of cognates and noncognates in compound bilinguals. Journal of Memory and Language, 30, 90–123.
De Groot, A. M. B., & Poot, R. (1997). Word translation at three levels of proficiency in a second language: The ubiquitous involvement of conceptual memory. Language Learning, 47, 215–264.
Dijkstra, T., Grainger, J., & Van Heuven, W. J. B. (1999). Recognition of cognates and interlingual homographs: The neglected role of phonology. Journal of Memory and Language, 41, 496–518.
Dillon, R. F., McCormack, P. D., Petrusic, W. M., Cook, G. M., & Lafleur, L. (1973). Release from proactive interference in compound and coordinate bilinguals. Bulletin of the Psychonomic Society, 2, 293–294.
Dufour, R., & Kroll, J. F. (1995). Matching words to concepts in two languages: A test of the concept mediation model of bilingual representation. Memory & Cognition, 23, 166–180.
Durgunoglu, A. Y., & Roediger, H. L. (1987). Test differences in accessing bilingual memory. Journal of Memory and Language, 26, 377–391.
Dyer, F. N. (1971). Color-naming interference in monolinguals and bilinguals. Journal of Verbal Learning and Verbal Behavior, 10, 297–302.
Ehri, L. C., & Ryan, E. B. (1980). Performance of bilinguals in a picture–word interference task. Journal of Psycholinguistic Research, 9, 285–302.
Ervin, S. M. (1961). Learning and recall in bilinguals. The American Journal of Psychology, 74, 446–451.
Ervin, S. M., & Osgood, C. E. (1954). Second language learning and bilingualism. Journal of Abnormal and Social Psychology, 49(Suppl.), 139–146.
Fang, S. P., Tzeng, O. J., & Alva, L. (1981). Intralanguage versus interlanguage Stroop effects in two types of writing systems. Memory & Cognition, 9, 609–617.
Fox, E. (1996). Cross-language priming from ignored words: Evidence for a common representational system in bilinguals. Journal of Memory and Language, 35, 353–370.
Francis, W. S. (1999a). Analogical transfer of problem solutions within and between languages in English-Spanish bilinguals. Journal of Memory and Language, 40, 301–329.
Francis, W. S. (1999b). Cognitive integration of language and memory in bilinguals: Semantic representation. Psychological Bulletin, 125, 193–222.
Francis, W. S. (2000). Clarifying the cognitive experimental approach to bilingual research. Bilingualism: Language and Cognition, 3, 13–15.
Francis, W. S. (2001). Components of priming in category exemplar generation. Abstracts of the Psychonomic Society, 6, 47.
Francis, W. S., & Bjork, R. A. (1992, November). Cross-language conceptual priming in English-Spanish bilinguals. Poster presented at the 33rd annual meeting of the Psychonomic Society, St. Louis, MO.
Francis, W. S., & Goldmann, L. (2003). Priming of semantic judgments within and across languages in Spanish-English bilinguals. Unpublished manuscript.
Frenck, C., & Pynte, J. (1987). Semantic representation and surface forms: A look at across-language priming in bilinguals. Journal of Psycholinguistic Research, 16, 383–396.
Frenck-Mestre, C., & Vaid, J. (1992). Language as a factor in the identification of ordinary words and number words. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 265–281). Amsterdam: Elsevier Science.
Gelman, R., Romo, L. F., & Francis, W. S. (2002). Notebooks as windows on learning: The case of a science-into-ESL program. In N. Granott & J. Parziale (Eds.), Microdevelopment: Transition processes in development and learning (pp. 269–293). Cambridge, U.K.: Cambridge University Press.
Gerard, L. D., & Scarborough, D. L. (1989). Language-specific lexical access of homographs by bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 305–315.
Gerhand, S. J., Deregowski, J. B., & McAllister, H. (1995). Stroop phenomenon as a measure of cognitive functioning of bilingual (Gaelic/English) subjects. British Journal of Psychology, 86, 89–92.
Glanzer, M., & Duarte, A. (1971). Repetition between and within languages in free recall. Journal of Verbal Learning and Verbal Behavior, 10, 625–630.
Goggin, J., & Wickens, D. D. (1971). Proactive interference and language change in short-term memory. Journal of Verbal Learning and Verbal Behavior, 10, 453–458.
Gollan, T. H., Forster, K. I., & Frost, R. (1997). Translation priming with different scripts: Masked priming with cognates and noncognates in Hebrew-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1122–1139.
Grainger, J., & Beauvillain, C. (1987). Language blocking and lexical access in bilinguals. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 39A, 295–319.
Grainger, J., & Beauvillain, C. (1988). Associative priming in bilinguals: Some limits of interlingual facilitation effects. Canadian Journal of Psychology, 42, 261–273.
Granott, N., & Parziale, J. (2002). Microdevelopment: Transition processes in development and learning. Cambridge, U.K.: Cambridge University Press.
Grosjean, F. (1992). Another view of bilingualism. In R. Harris (Ed.), Cognitive processing in bilinguals (pp. 51–62). Amsterdam: Elsevier.
Guttentag, R. E., Haith, M. M., Goodman, G. S., & Hauch, J. (1984). Semantic processing of unattended words by bilinguals: A test of the input switch mechanism. Journal of Verbal Learning and Verbal Behavior, 23, 178–188.
Hamers, J. F., & Lambert, W. E. (1972). Bilingual interdependencies in auditory perception. Journal of Verbal Learning and Verbal Behavior, 11, 303–310.
Heredia, R., & McLaughlin, B. (1992). Bilingual memory revisited. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 91–103). Amsterdam: Elsevier Science.
Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1998). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213–229.
Hummel, K. M. (1986). Memory for bilingual prose. In J. Vaid (Ed.), Language processing in bilinguals: Psycholinguistic and neuropsychological perspectives (pp. 47–64). Hillsdale, NJ: Erlbaum.
Illes, J., Francis, W. S., Desmond, J. E., Gabrieli, J. D. E., Glover, G. H., Poldrack, R. A., et al. (1999). Convergent cortical representation of semantic processing in bilinguals. Brain and Language, 70, 347–363.
Jackendoff, R. (1994). Word meanings and what it takes to learn them: Reflections on the Piaget-Chomsky debate. In W. F. Overton & D. S. Palermo (Eds.), The nature and ontogenesis of meaning (pp. 129–144). Hillsdale, NJ: Erlbaum.
Jared, D., & Kroll, J. F. (2001). Do bilinguals activate phonological representations in one or both of their languages when naming words? Journal of Memory and Language, 44, 2–31.
Jared, D., & Szucs, C. (2002). Phonological activation in bilinguals: Evidence from interlingual homograph naming. Bilingualism: Language and Cognition, 5, 225–239.
Jiang, N. (1999). Testing processing explanations for the asymmetry in masked cross-language priming. Bilingualism: Language and Cognition, 2, 59–75.
Jin, Y. S. (1990). Effects of concreteness on cross-language priming in lexical decisions. Perceptual and Motor Skills, 70, 1139–1154.
Kanwisher, N. G. (1987). Repetition blindness: Type recognition without token individuation. Cognition, 27, 117–143.
Semantic Representation 265
Kanwisher, N. G., & Potter, M. C. (1990). Repe- Lambert, W. E., Havelka, J., & Crosby, C. (1958).
tition blindness: Levels of processing. Journal The inuence of language-acquisition contexts
of Experimental Psychology: Human Percep- on bilingualism. Journal of Abnormal and
tion and Performance, 16, 3047. Social Psychology, 56, 239244.
Keatley, C., & De Gelder, B. (1992). The bilingual Lambert, W. E., Ignatow, M., & Krauthamer, M.
primed lexical decision task: Cross-language (1968). Bilingual organization in free recall.
priming disappears with speeded responses. Journal of Verbal Learning and Verbal
European Journal of Cognitive Psychology, 4, Behavior, 7, 207214.
273292. Lee, W. L., Wee, G. C., Tzeng, O. J. L., &
Keatley, C. W., Spinks, J. A., & De Gelder, B. Hung, D. L. (1992). A study of interlingual
(1994). Asymmetrical cross-language priming and intralingual Stroop effect in three different
effects. Memory & Cognition, 22, 7084. scripts: logograph, syllabary, and alphabet.
Kintsch, W. (1970). Recognition memory in bilin- In R. J. Harris (Ed.), Cognitive processing in
gual subjects. Journal of Verbal Learning and bilinguals (pp. 427442). Amsterdam: Elsevier
Verbal Behavior, 9, 405409. Science.
Kintsch, W., & Kintsch, E. (1969). Interlingual Levy, B. A., Di Persio, R., & Hollingshead, A.
interference and memory processes. Journal (1992). Fluent rereading: Repetition, automa-
of Verbal Learning and Verbal Behavior, 8, ticity, and discrepancy. Journal of
1619. Experimental Psychology: Learning, Memory,
Kirsner, K., Brown, H. L., Abrol, S., Chadha, N. and Cognition, 18, 957971.
K., & Sharma, N. K. (1980). Bilingualism and Liepmann, D., & Saegert, J. (1974). Language
lexical representation. Quarterly Journal of tagging in bilingual free recall. Journal of
Experimental Psychology, 32, 585594. Experimental Psychology, 103, 11371141.
Kirsner, K., Smith, M. C., Lockhart, R. S., King, Loftus, E. F. (1975). Leading questions and the
M. L., & Jain, M. (1984). The bilingual lexi- eyewitness report. Cognitive Psychology, 7,
con: Language-specic units in an integrated 560572.
network. Journal of Verbal Learning and Lopez, M., Hicks, R. E., & Young, R. K. (1974).
Verbal Behavior, 23, 519539. Retroactive inhibition in a bilingual A-B, A-B
Kiyak, H. A. (1982). Interlingual interference in paradigm. Journal of Experimental
naming color words. Journal of Cross-Cultural Psychology, 103, 8590.
Psychology, 13, 125135. Lopez, M., & Young, R. K. (1974). The linguistic
Kolers, P. A. (1963). Interlingual word associa- interdependence of bilinguals. Journal of
tions. Journal of Verbal Learning and Verbal Experimental Psychology, 102, 981983.
Behavior, 2, 291300. MacKay, D. G., & Bowman, R. W. (1969). On
Kolers, P. A. (1966). Interlingual facilitation of producing the meaning in sentences. American
short-term memory. Journal of Verbal Learn- Journal of Psychology, 82, 2339.
ing and Verbal Behavior, 5, 314319. MacKay, D. G., & Miller, M. D. (1994). Semantic
Kolers, P. A., & Gonzalez, E. (1980). Memory for blindness: Repeated concepts are difcult to
words, synonyms, and translations. Journal of encode and recall under time pressure.
Experimental Psychology: Human Learning Psychological Science, 5, 5255.
and Memory, 6, 5365. MacLeod, C. M. (1976). Bilingual episodic mem-
Kroll, J. F., & De Groot, A. M. B. (1997). Lexical ory: Acquisition and forgetting. Journal of
and conceptual memory in the bilingual: Verbal Learning and Verbal Behavior, 15,
Mapping form to meaning in two languages. 347364.
In A. M. B. de Groot & J. F. Kroll (Eds.), MacLeod, C. M. (1991). Half a century of research
Tutorials in bilingualism: Psycholinguistic on the Stroop effect: An integrative review.
perspectives (pp. 169199). Mahwah, NJ: Psychological Bulletin, 109, 163203.
Erlbaum. Magiste, E. (1984). Stroop tasks and dichotic
La Heij, W., De Bruyn, E., Elens, E., Hartsuiker, translation: The development of interference
R., Helaha, D., & Van Schelven, L. (1990). patterns in bilinguals. Journal of Experimental
Orthographic facilitation and categorical in- Psychology: Learning, Memory, and Cogni-
terference in a word-translation variant of the tion, 10, 304315.
Stroop task. Canadian Journal of Psychol- Magiste, E. (1985). Development of intra- and
ogy, 44, 7683. interlingual interference in bilinguals. Journal
La Heij, W., Hooglander, A., Kerling, R., & Van der of Psycholinguistic Research, 14, 137154.
Velden, E. (1996). Nonverbal context effects Magiste, E. (1992). Second language learning
in forward and backward translation: Evidence in elementary and high school students.
for concept mediation. Journal of Memory European Journal of Cognitive Psychology, 4,
and Language, 35, 648665. 355365.
266 Comprehension
Malakoff, M. E. (1988). The effect of language repetition effects on recall. Journal of Experi-
of instruction on reasoning in bilingual mental Psychology: Learning, Memory, and
children. Applied Psycholinguistics, 9, 1738. Cognition, 14, 163172.
Marian, V., & Neisser, U. (2000). Language- Paivio, A., & Desrochers, A. (1980). A dual-coding
dependent recall of autobiographical memo- approach to bilingual memory. Canadian
ries. Journal of Experimental Psychology: Journal of Psychology, 34, 388399.
General, 129, 361368. Paivio, A., & Lambert, W. (1981). Dual coding
McCormack, P. D., & Colletta, P. (1975). Recog- and bilingual memory. Journal of Verbal
nition memory for items from unilingual and Learning and Verbal Behavior, 20, 532539.
bilingual lists. Bulletin of the Psychonomic Palmer, M. B. (1972). Effects of categorization, de-
Society, 6, 149151. gree of bilingualism, and language upon recall
McCormack, P. D., & Novell, J. A. (1975). Free of select monolinguals and bilinguals. Journal
recall from unilingual and trilingual lists. of Educational Psychology, 63, 160164.
Bulletin of the Psychonomic Society, 6, Pavlenko, A. (1999). New approaches to concepts
173174. in bilingual memory. Bilingualism: Language
Miller, N. A., & Kroll, J. F. (2002). Stroop effects and Cognition, 2, 209230.
in bilingual translation. Memory & Cognition, Peynircioglu, Z. F., & Durgunoglu, A. Y. (1993).
30, 614628. Effects of a bilingual context on memory per-
Nas, G. (1983). Visual word recognition in bilin- formance. In J. Altarriba (Ed.), Cognition and
guals: Evidence for a cooperation between culture: A cross-cultural approach to psychol-
visual and sound based codes during access to ogy (pp. 5775). Amsterdam: Elsevier Science.
a common lexical store. Journal of Verbal Peynircioglu, Z. F., & Goksen-Erelcin, F. (1988).
Learning and Verbal Behavior, 22, 526534. Part-set cuing across languages: Evidence for
Nelson, T. O. (1971). Savings and forgetting from both word- and concept-mediated inhibition
long-term memory. Journal of Verbal Learn- depending on language dominance. Acta Psy-
ing and Verbal Behavior, 10, 568576. chologica, 67, 1932.
Neumann, E., McCloskey, M. S., & Felio, A. C. Popiel, S. J. (1987). Bilingual comparative judg-
(1999). Cross-language positive priming dis- ments: Evidence against the switch hypothesis.
appears, negative priming does not: Evidence Journal of Psycholinguistic Research, 16,
for two sources of selective inhibition. 563576.
Memory & Cognition, 27, 10511063. Potter, M. C., So, K. F., Von Eckardt, B., &
Nott, C. R., & Lambert, W. E. (1968). Free recall Feldman, L. B. (1984). Lexical and conceptual
of bilinguals. Journal of Verbal Learning and representation in beginning and procient
Verbal Behavior, 7, 10651071. bilinguals. Journal of Verbal Learning and
ONeill, W., & Dion, A. (1983). Bilingual recog- Verbal Behavior, 23, 2338.
nition of concrete and abstract sentences. Preston, M. S., & Lambert, W. E. (1969). Inter-
Perceptual and Motor Skills, 57, 839845. lingual interference in a bilingual version of
ONeill, W., & Huot, R. (1984). Release from the Stroop colorword task. Journal of Verbal
proactive inhibition as a function of a lan- Learning and Verbal Behavior, 8, 295301.
guage of pronunciation shift in bilinguals. Rose, R. G., & Carroll, J. F. (1974). Free recall of a
Canadian Journal of Psychology, 38, 5462. mixed language list. Bulletin of the Psycho-
ONeill, W., Roy, L., & Tremblay, R. (1993). A nomic Society, 3, 267268.
translation-based generation effect in bilingual Rose, R. G., Rose, P. R., King, N., & Perez, A.
recall and recognition. Memory & Cognition, (1975). Bilingual memory for related and un-
21, 488495. related sentences. Journal of Experimental
Opoku, J. (1992). The influence of semantic cues in Psychology: Human Learning and Memory, 1,
learning among bilinguals at different levels of 599606.
proficiency in English. In R. J. Harris (Ed.), Rosenberg, S., & Simon, H. A. (1977). Modeling
Cognitive processing in bilinguals (pp. 175 semantic memory: Effects of presenting se-
189). Amsterdam: Elsevier Science. mantic information in different modalities.
Paivio, A. (1986). Mental representations: A dual Cognitive Psychology, 9, 293325.
coding approach (pp. 239257). New York: Rusted, J. (1988). Orthographic effects for Chi-
Oxford University Press. nese-English bilinguals in a pictureword in-
Paivio, A. (1991). Mental representation in bilin- terference task. Current Psychology: Research
guals. In A. G. Reynolds (Ed.), Bilingualism, and Reviews, 7, 207220.
multiculturalism, and second language learn- Saegert, J., Hamayan, E., & Ahmar, H. (1975).
ing (pp. 113126). Hillsdale, NJ: Erlbaum. Memory for language of input in polyglots.
Paivio, A., Clark, J. M., & Lambert, W. E. (1988). Journal of Experimental Psychology: Human
Bilingual dual-coding theory and semantic Learning and Memory, 1, 607613.
Semantic Representation 267
Saegert, J., Kazarian, S., & Young, R. K. (1973). Steffens, M. C., & Erdfelder, E. (1998). Determi-
Part/whole transfer with bilinguals. American nants of positive and negative generation
Journal of Psychology, 86, 537546. effects in free recall. Quarterly Journal of
Saegert, J., Obermeyer, J., & Kazarian, S. (1973). Experimental Psychology, 51A, 705733.
Organizational factors in free recall of Thomas, M. S. C., & Allport, A. (2000). Language
bilingually mixed lists. Journal of Experimen- switching costs in bilingual visual word rec-
tal Psychology, 97, 397399. ognition. Journal of Memory and Language,
Sanchez-Casas, R. M., Davis, C. W., Garca-Albea, 43, 4466.
J. E. (1992). Bilingual lexical processing: Ex- Tulving, E. (1972). Episodic and semantic memory.
ploring the cognate/noncognate distinction. In E. Tulving & W. Donaldson (Eds.), Orga-
European Journal of Cognitive Psychology, 4, nization of memory. New York: Academic
293310. Press.
Scarborough, D. L., Gerard, L., & Cortese, C. Tulving, E., & Colotla, V. A. (1970). Free recall
(1984). Independence of lexical access in of trilingual lists. Cognitive Psychology,
bilingual word recognition. Journal of Verbal 1, 8698.
Learning and Verbal Behavior, 23, 8499. Tzelgov, J., & Eben-Ezra, S. (1992). Components
Schrauf, R. W., & Rubin, D. C. (1998). Bilingual of the between-language semantic priming
autobiographical memory in older adult im- effect. European Journal of Cognitive
migrants: A test of cognitive explanations of Psychology, 4, 253272.
the reminiscence bump and the linguistic en- Tzelgov, J., Henik, A., & Leiser, D. (1990).
coding of memories. Journal of Memory and Controlling Stroop interference: Evidence
Language, 39, 437457. from a between-language task. Journal of
Schrauf, R. W., & Rubin, D. C. (2000). Internal Experimental Psychology: Learning, Memory,
languages of retrieval: The bilingual encoding and Cognition, 16, 760771.
of memories for the personal past. Memory & Vaid, J. (1988). Bilingual memory representation:
Cognition, 28, 616623. A further test of dual coding theory. Canadian
Schwanenflugel, P. J., & Rey, M. (1986). Journal of Psychology, 42, 8490.
Interlingual semantic facilitation: Evidence for Watkins, M. J., & Peynircioglu, Z. F. (1983). On
a common representational system in the the nature of word recall: Evidence for
bilingual lexicon. Journal of Memory and linguistic specicity. Journal of Verbal
Language, 25, 605618. Learning and Verbal Behavior, 22, 385394.
Seger, C. A., Rabin, L. A., Desmond, J. E., & Weinreich, U. (1953). Languages in contact. The
Gabrieli, J. D. E. (1999). Verb generation Hague, The Netherlands: Mouton.
priming involves conceptual implicit memory. Wickens, D. D., Born, D. G., & Allen, C. K.
Brain and Cognition, 41, 150177. (1963). Proactive inhibition and item similar-
Shanon, B. (1982). Bilingual identification and ity in short-term memory. Journal of Verbal
classification of words and drawings in two Learning and Verbal Behavior, 2, 440445.
languages. Quarterly Journal of Experimental Williams, J. N. (1994). The relationship between
Psychology, 34A, 135152. word meanings in the first and second lan-
Shaw, J. S., Garcia, L. A., & Robles, B. E. (1997). guage: Evidence for a common, but restricted,
Cross-language postevent misinformation semantic code. European Journal of Cognitive
effects across languages in Spanish-English Psychology, 6, 195220.
bilinguals. Journal of Applied Psychology, 82, Winograd, E., Cohen, C., & Barresi, J. (1976).
889899. Memory for concrete and abstract words in
Slamecka, N. J., & Katsaiti, L. R. (1987). The bilingual speakers. Memory & Cognition, 4,
generation effect as an artifact of selective 323329.
displaced rehearsal. Journal of Memory and Young, R. K., & Navar, M. I. (1968). Retroactive
Language, 26, 589607. inhibition with bilinguals. Journal of Experi-
Smith, M. C. (1991). On the recruitment of se- mental Psychology, 77, 109115.
mantic information for word fragment com- Young, R. K., & Saegert, J. (1966). Transfer with
pletion: Evidence from bilingual priming. bilinguals. Psychonomic Science, 6, 161162.
Journal of Experimental Psychology: Learn- Young, R. K., & Webber, A. (1967). Positive
ing, Memory, and Cognition, 17, 234244. and negative transfer with bilinguals. Journal
Smith, M. C., & Kirsner, K. (1982). Language of Verbal Learning and Verbal Behavior, 6,
and orthography as irrelevant features in 874877.
colour-word and pictureword Stroop inter- Zeelenberg, R., & Pecher, D. (2003). Evidence for
ference. Quarterly Journal of Experimental long-term cross-language repetition priming in
Psychology: Human Experimental Psychol- conceptual implicit memory tasks. Journal of
ogy, 34A, 153170. Memory and Language, 46, 8094.
Cheryl Frenck-Mestre
13
Ambiguities and Anomalies
What Can Eye Movements and Event-Related Potentials Reveal About Second Language Sentence Processing?
Sentence Processing 269
play an important role in determining readers' initial choice of structure, according to some (MacDonald et al., 1994).

In the same vein, the role of the thematic constraints of the first noun phrase (NP) as well as contextual information have been investigated. Under experimental circumstances in which readers are given single thematically independent sentences, difficulty might indeed be predicted, as shown by behavioral measures or electrophysiological evidence, on the reading of the disambiguating region of sentences such as those illustrated in Examples 2 and 3. A quite different result might be expected, however, when readers are given contextual information, such as the focus particle "only" present in Example 2b but not in 2a, or when given more extended referential information when they read sentences in context. Just this effect has been obtained, in various online monolingual studies, showing that Example 2a is in fact harder to process than 2b (Crain & Steedman, 1985; Ni et al., 1996; but see Clifton et al., 2000, for an opposing view). The same result has been obtained when the context is provided by a sentence rather than by a single word (Altmann et al., 1992).

Furthermore, numerous online studies have now provided evidence that semantic and/or thematic information can reduce a reader's likelihood to be led up the garden path. The examples depicted in Sentences 3a and 3b illustrate this. Readers of English experience less difficulty with sentences such as 3b than with those such as 3a because of the semantic information present in the sentence onset (McRae, Ferretti, & Amyote, 1997; Taraban & McClelland, 1988).

These monolingual studies thus strongly question the hypothesis that readers initially perform a serial, strictly syntactic parse of structures and propose, rather, lexically and/or referentially based models of parsing (see MacDonald et al., 1994; but see Clifton et al., 2000, and Pickering et al., 2000, for contrary viewpoints).

Bilingual Studies of Online Parsing: Evidence From Eye Movements

In the same vein that monolingual models of parsing can be tested via the processing of syntactic ambiguities, so can L2 research gain from this approach. Note first that the same factors that are prone to influence immediate syntactic processing in the native language will affect L2 processing, provided the reader has sufficient knowledge of the L2 (Dussias, 2001; Frenck-Mestre, 1997, 2002; Frenck-Mestre & Pynte, 1997; Hoover & Dwivedi, 1998; Juffs & Harrington, 1996; see also Fernandez, 1998, for offline L2 studies of syntactic ambiguity resolution). Nonetheless, just what choice the reader will make on encountering a syntactic ambiguity in the L2 can provide valuable information concerning specific L2 issues. This can be illustrated by Examples 4a and 4b.

4a. Le sous-marin détruit pendant la guerre a coulé en quelques secondes.
    "The submarine destroyed during the war sank in a few seconds."
4b. Le sous-marin détruit pendant la guerre un navire de la marine royale.
    "The submarine destroys during the war a ship from the royal navy."

Both of these sentences are syntactically legal in the French language. They are both structurally ambiguous and in the present case equally plausible. They differ, however, concerning readers' initial preference, manifested at the disambiguation point underlined in the examples. The structure presented in Example 4a, involving a reduced relative clause, in which the element following NP1 is not the main verb but a past participle form that is the verb of the reduced relative clause, is known to cause difficulty for readers in the absence of extrasyntactic cues, as discussed in the preceding section. Thus, French native speakers can be expected to experience greater difficulty at the disambiguation of Structure 4a than Structure 4b (i.e., on reading the verb of the main clause [a coulé] than the direct object NP [un navire]), as has indeed been shown with structures similar to these (cf. Pynte & Kennedy, 1993).

This is not, however, as immediately apparent for native English speakers, when reading in French, as outlined next. Although Example 4a may indeed pose some difficulty for native English speakers, Example 4b may pose just as great a problem. Whereas Example 4b is both permissible and plausible in French, it is generally considered an ungrammatical structure in English. Indeed, in English it is generally not permissible to separate the case assigner from the element receiving the case (Haegeman, 1994), that is, to displace the direct object NP from the case-assigning VP.¹ This is not the case for French, as direct object NPs may, but need not, be adjacent to the case-assigning VP.² This is elaborated below.
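The adjacency contrast just described can be made concrete with a small sketch. The code below is an illustrative toy, not a model proposed in the chapter: the constituent labels (`NP-subj`, `V`, `PP`, `NP-obj`), the function names, and the reduction of the adjacency constraint to a single boolean are all assumptions introduced for the example. It simply checks whether any constituent intervenes between the verb and its direct object NP, and whether a sequence would be licensed under a given setting.

```python
from typing import List, Tuple

# (label, text) pairs; the label inventory is hypothetical and chosen
# only for this illustration.
Constituent = Tuple[str, str]


def intervening_material(sentence: List[Constituent]) -> List[Constituent]:
    """Return the constituents standing between the verb and its object NP."""
    labels = [label for label, _ in sentence]
    if "V" not in labels or "NP-obj" not in labels:
        return []
    verb_pos = labels.index("V")
    object_pos = labels.index("NP-obj")
    if object_pos < verb_pos:
        # Object-before-verb orders fall outside this toy example.
        return []
    return sentence[verb_pos + 1:object_pos]


def is_licensed(sentence: List[Constituent], strict_adjacency: bool) -> bool:
    """Under strict adjacency nothing may separate V from its object NP."""
    if not strict_adjacency:
        return True
    return len(intervening_material(sentence)) == 0


# Example 4b: a prepositional phrase separates the verb from its object.
french_4b = [
    ("NP-subj", "Le sous-marin"),
    ("V", "détruit"),
    ("PP", "pendant la guerre"),
    ("NP-obj", "un navire de la marine royale"),
]

# An English order in which the object NP is adjacent to the verb.
english_adjacent = [
    ("NP-subj", "The submarine"),
    ("V", "destroyed"),
    ("NP-obj", "a boat"),
    ("PP", "during the war"),
]

print(is_licensed(french_4b, strict_adjacency=False))        # French-like setting
print(is_licensed(french_4b, strict_adjacency=True))         # English-like setting
print(is_licensed(english_adjacent, strict_adjacency=True))  # adjacent object
```

Under these assumptions, the word order of Example 4b passes the check only when the strict adjacency requirement is switched off, which is the asymmetry between the two languages at issue here.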
Structure 4a is permissible in both English and French and is both structurally equivalent and equally ambiguous in the two languages. Of the two structures (4a and 4b), Example 4a is the syntactically more complex. For Example 4b, although it is the syntactically simpler structure, it generally violates the constraints of the English language. Which of these considerations will prevail for native English speakers reading in French? Should they find it easier to process Sentence 4a than 4b despite the increased syntactic complexity of the former of these sentences? Might the structural ambiguity of Sentence 4a be blocked for beginning English-French bilinguals if they adopt the parameters of their native language when reading French? Otherwise stated, might they systematically adopt the reduced relative reading of the sentence when they encounter the prepositional phrase, thus rendering the declarative interpretation dispreferred?

Some evidence on this matter has been provided by White (1989a, 1989b, 1991). White argued that French learners of English as a second language should incorrectly assume that Sentence 4b is a licensed structure in English given the syntactic differences across French and English. In line with this, White found that French readers rated structures similar to that presented in Example 4b as grammatically acceptable and found it difficult to distinguish between this structure and the correct English one, for which the object NP is adjacent to the verb (such as "The submarine destroyed a boat during the war."). White's results thus clearly demonstrated the influence of the native language on L2 parsing.

How then might English learners of French be expected to process Structure 4b? White argued that French readers will have difficulty rejecting structures such as Structure 4b, for which there is an intervening element between the case-assigning VP and the object NP because of French having [−strict adjacency], by which it is permissible to displace the object NP. White (1989b, chapter 6) outlined her argument in terms of parameter setting, by which French learners of English must reset the values of this parameter to [+strict adjacency] to accommodate for the more restricted set of sentences possible in English.³ White accounts for her results in the framework of universal grammar and the inability of adult learners to access Universal Grammar and to properly reset parameters that were set by the properties of the speaker's native language. Indeed, for a French speaker to consider sentences such as 4b as ungrammatical in English, the speaker would have to revert from a superset grammar to a more restricted subset in which only one of the two grammatical possibilities is available. This is considered quite difficult.⁴

In light of this line of argumentation, English learners of French should, conversely, experience difficulty with Structure 4b for the mirrored reason. That is, if they apply the strict adjacency principle from English (and the subset principle), they should initially adopt a more restricted grammar of French and should experience difficulty in interpreting Structure 4b. If this principle is applied blindly, then English native speakers should experience greater difficulty with Structure 4b than 4a, even though the former is the simpler, syntactically speaking. To examine this question, the processing of Structures 4a and 4b was examined with novice English-French bilinguals via the recording of eye movements (Frenck-Mestre, 1998).

The results of this experiment clearly revealed that beginning English-French bilinguals do not treat the ambiguous structures illustrated by Examples 4a and 4b in the same manner that native French readers do. First, it is notable that American readers showed longer reading times compared to French readers at the prepositional phrase region following the first verb (in the examples, "during the war") for both sentence structures. This would be expected if the English-dominant bilingual readers projected a main clause structure but then immediately revised this hypothesis because of the absence of a noun phrase following the verb. This interpretation of the data was strengthened by the results obtained at the disambiguation point (underlined in Examples 4a and 4b).

The group of English-French bilinguals experienced considerably more difficulty, as manifested by longer reading times, when the disambiguating element forced a main clause interpretation of the sentence (Example 4b) than a reduced relative structure (Example 4a). Moreover, as compared to French readers, they demonstrated considerably longer reading times for the main clause reading. The data for the group of French readers did not in fact show a difference in processing time for the two structures during the first reading of the sentence, but only in the measure of total reading times (i.e., the summation of all fixations in a specified region of the sentence, including the first time the eye entered the region and all subsequent rereadings thereof).
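The two reading-time measures contrasted here can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the analysis procedure used in the study: the `Fixation` record, the region indices, and the durations are invented for the example, and "first reading" is operationalized in a simplified way as the fixations made on a region from the first time the eye enters it until the eye first leaves it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fixation:
    region: int       # index of the sentence region fixated (hypothetical coding)
    duration_ms: int  # fixation duration in milliseconds


def first_pass_time(fixations: List[Fixation], region: int) -> int:
    """Sum fixations on `region` from first entry until the eye first leaves it."""
    total = 0
    entered = False
    for fixation in fixations:
        if fixation.region == region:
            entered = True
            total += fixation.duration_ms
        elif entered:
            break  # the eye has left the region; later rereadings are excluded
    return total


def total_reading_time(fixations: List[Fixation], region: int) -> int:
    """Sum every fixation on `region`, including all subsequent rereadings."""
    return sum(f.duration_ms for f in fixations if f.region == region)


# Invented fixation sequence: the reader moves through regions 0, 1, and 2,
# returns to region 1, and then goes back to region 2.
fixations = [
    Fixation(0, 200),
    Fixation(1, 250),
    Fixation(1, 180),
    Fixation(2, 300),
    Fixation(1, 220),
    Fixation(2, 150),
]

print(first_pass_time(fixations, 1), total_reading_time(fixations, 1))
print(first_pass_time(fixations, 2), total_reading_time(fixations, 2))
```

Because total reading time also counts the fixations made on later returns to a region, an effect can appear in that measure while being absent from the first reading, which is the pattern reported for the French readers.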
Which Available Theory Best Accounts for the Data?

In view of certain current monolingual models of language processing, it would appear that our data pose some difficulty. In contrast to models that assume a heuristic parser, which will systematically adopt the least complex of two alternative structures, my colleagues and I found that our participants did not show an immediate preference for the syntactically less-complex structure of two alternatives. To the contrary, when our participants were confronted with structurally ambiguous sentences such as illustrated in 4a and 4b, we found that our bilingual readers showed evidence of difficulty with the syntactically less-complex structure when reading in their L2. Moreover, the results from our French monolingual control subjects reading in their native language did not show nearly as strong or as immediate effects of syntactic complexity as has been reported in previous monolingual studies.

Monolingual theories of parsing that posit that the frequency of structures (as opposed to the syntactic complexity thereof) is a crucial element in determining the difficulty of processing (cf. MacDonald, 1997; MacWhinney, 2001; Mitchell, 1989; Mitchell, Cuetos, Corley, & Brysbaert, 1995) may provide a better framework for understanding the pattern of results we obtained, both in the monolingual group of readers and in the relatively inexperienced bilingual group. Past exposure to and experience with a language and its properties is a key factor in this type of model, which provides quite a different theoretical stance on syntactic processing compared to syntax-first accounts.

Consider first the constraint satisfaction model forwarded by MacDonald et al. (1994). From the vantage point of this model, it may be considered quite logical that the reduced relative structures we studied did not produce strong effects in the monolingual group of subjects. Quite a number of the verbs that we selected were frequently used as adjectives (for example, instruit, maudit, distrait, among others). In line with MacDonald et al.'s prediction that the frequency of structures will directly affect processing difficulty, this would decrease readers' likelihood to treat this word as the main verb of the phrase and facilitate the processing of a reduced relative structure following a head such as "Le prêtre instruit." Otherwise stated, our French readers may have experienced little difficulty with the reduced relative sentences because of their previous processing of this structure for the particular verbs we chose.⁵

The results obtained in the bilingual group can be explained along similar lines. That is, given that the English language rarely admits the structure presented in Example 4b, in which the direct object complement is separated from its case-assigning verb, our American-French bilingual readers should have had little experience with this structure in their native language (English) and thus be less likely to project it when an alternative structure was available.

Another statistical model of parsing is that known as the linguistic tuning hypothesis and forwarded by Mitchell and colleagues (Mitchell, 1989; Mitchell et al., 1995). The model predicts, in similar fashion to MacDonald et al. (1994), that the amount of difficulty a reader will experience when parsing a structure will be directly related to the amount of prior experience that the reader has had with it. It has been forwarded as an explanatory model of cross-linguistic variation as concerns syntactic ambiguity resolution (cf. Cuetos et al., 1996; Frenck-Mestre & Pynte, 2000a, 2000b, for recent reviews). The model has direct bearing on the results found in our bilingual group of readers.

First, these readers should, in terms of the model, experience greater difficulty with structures such as 4b given that the native language of these bilinguals does not afford them much experience with this structure, and that they were relatively inexperienced in the French language. Second, a prediction can be made as concerns the performance of these bilingual subjects as they gain experience in French. As stated expressly in Cuetos et al. (1996): "The model predicts that parsing preferences will change if, during some period prior to testing, the reader or listener has been exposed to an unusual preponderance of one ambiguity resolution rather than another" (p. 175). From this, we could expect bilingual subjects to show changes in immediate parsing decisions when reading in their L2 if, indeed, they receive linguistic input that differs from that present in their native language (cf. Frenck-Mestre, 2002, for a discussion).

As concerns the ambiguity studied here, we could predict that, with greater experience, English readers of French would accept the [−strict adjacency] criterion in French and subsequently have less difficulty processing structures for which the object NP is displaced from its case-assigning VP. Although we presently only have preliminary data
on this question, the results from a small group of Hence, through parsing of structures specific to
more advanced English-French bilinguals suggest the L2, the adult learner may learn a new set of
that the above prediction holds. The trend in this parameter values along with the L2, which would
more advanced group was to behave like their become increasingly strong with L2 use. Our data
French counterparts when processing the ambigu- from English-dominant readers of French, who had
ous structures presented in Sentences 4a and 4b. relatively little experience parsing their L2 in com-
It is important to note, nonetheless, that the lin- parison to their L1, can easily be accounted for in
guistic tuning hypothesis has not proven capable such terms. Note, however, that the difference be-
of accounting for recent monolingual results (cf. tween the account proposed here by Fodor (1999)
Mitchell & Brysbaert, 1998). That is, when online and that proposed by frequency-based models (i.e.,
monolingual processing is examined for materials Cuetos et al., 1996; MacDonald et al., 1994;
modeled closely on corpora-based sentences, the on- MacWhinney, 1997) is not readily apparent.
line preferences are quite the opposite from those that Robertson and Sorace (1999) provided an in-
would be predicted by the statistical frequency of teresting discussion of what might be driving the
structures in the corpora. As such, although attrac- results obtained by White (1989a). They recast
tive, the model appears to be in need of further elab- Whites data in terms of optionality theory. Ro-
oration prior to acceptance as a general framework. bertson and Sorace made the parallel between the
Then again, we can ask whether the data we report results they obtained with German adult learners of
for our bilingual readers can be explained in terms of English concerning verb placement and those ob-
parameter setting, such as suggested by White tained by White for French learners of English re-
(1989a, 1989b). First, refer to a discussion by Fodor garding adverb placement. These authors suggested
(1999) concerning the hypothesis of a set-and- that rather than assuming an all-or-none mecha-
ready mechanism that would allow the human nism by which native language parameters are reset
parser to determine the correct parameters to describe to those of the L2, that optionality at the level of
the grammar of his or her language. In sum, Fodor competence persists in interlanguage grammars.
outlined the impossibility of any such automatic This suggestion stems from the observation that
mechanism based on supercially recognizable cues residual constructions from the native language
for natural language grammars. She suggested that grammar (in the case of Robertson & Sorace, V2
the only psychologically valid triggering mecha- constructions from German) are seldom systemat-
nism is part and parcel of parsing. ically observed. Great variations exist both among
Regarding the acquisition of L2 grammars by learners and within a learner concerning the fre-
adult learners specically, Fodor (1999) suggested quency and manner in which native language
a processing account that is not at odds with the constraints are applied. The authors found that the
frequency-based accounts outlined here. She pos- principles and parameters model, although able to
ited that the increased use or, conversely, disuse broadly explain the pattern of interference from the
of parameter values via parsing will have a direct native language on L2 processing, is inadequate to
impact on the activation levels of these values explain this variation. They appealed therefore to
(not at all unlike the model forwarded by Mac- the minimalist program (Chomsky, 1993) in the
Donald et al., 1994; McRae et al., 1997). The terms of the model set out by Eubank (1993/1994)
stronger L1 parameter values would be hard to of interlanguage grammars. This line of argumen-
overcome in an initial state, thus producing the tation might also be applied to the data for English-
type of interference observed by White and as French beginning bilinguals we obtained, although
many others and I have observed for various to truly adopt this type of model, individual vari-
structures (cf. Durgunoglu, 1997; Dussias, 2001; ation as well as group data need to be examined.
Fernandez, 1999; Frenck-Mestre, 2002; Mac- Last, in line with the idea that individual variation
Whinney, 1997, for reviews). When the adult need be accounted for, it might be noted that the
learner parses L2 sentences with the incorrect L1 conclusions I have drawn were based on group re-
grammar, she or he will eventually be led to dis- sults. It goes without saying that there is always con-
favor L1 values and apply those parameter values siderable variation between participants when any
of the L2 that enable a correct parse of the struc- measure of processing is recorded, whether online
ture, thereby increasing the activation levels of the reading times of various natures (self-paced reading,
latter and (perhaps, with a vast amount of expo- eye movements) or ofine preferences (question-
sure) decreasing those of the former. naires, sentence completion, etc.). It is the intention of
experimental psycholinguistics to go beyond this level of individual variation and, through inferential statistics, draw conclusions from group data whenever licensed to do so. This is not always assumed to be a natural choice.

Although much can be learned from the study of individual differences (cf. Segalowitz, 1997, for a review), and although it is a truism to state that adult learners of a language are a heterogeneous group, it is my intention to gain an understanding of the larger picture. Theories of sentence processing must indeed take into account changes that occur in parallel to readers' experience with a language, as highlighted by many (MacDonald et al., 1994; MacWhinney, 1997; Mitchell, 1994). Group data can provide as important information in this regard as individual variation, provided one either follows the progress of a particular group or, as we have attempted, takes a cross-sectional look at data from learners with more or less parsing experience in their L2. This has indeed proven useful in many prior studies of L2 sentence processing.

What Can Other Measures Tell Us About Second Language Sentence Processing? Evidence From Event-Related Potentials

The preceding discussion was of evidence primarily from studies that used eye movements to examine native and L2 sentence processing. Eye movements indeed provide a rich and multidimensional online record of the process(es) in which a reader is engaged. Initial reading of different parts of the sentence can be broken into first fixation (i.e., the amount of time spent from when the eyes initially land in a region until a new saccade is engaged) and gaze duration (i.e., all fixations in a region prior to the eyes exiting the region). Moreover, the aforementioned first-pass measures can be compared to later rereadings. Whereas the initial reading of an element of the sentence is often considered to reveal readers' first choice concerning lexical access and/or parsing, subsequent rereadings are more often equated with reanalysis and/or repair processes.

In addition to these reaction time measures, the pattern and frequency of regressive eye movements can be used to understand how the reader untangles difficult or unexpected structures. I have shown how different theories of parsing can be put to the test by recording readers' eye movement patterns and how comparisons can be made between bilingual readers' L2 processing and native language processing. There exists, however, another rich, multidimensional, online trace of syntactic processing, which is the recording of event-related scalp potentials during the visual (word-by-word) or auditory presentation of sentences. It is of interest to see whether ERP studies provide complementary information to that provided by eye movement studies. Therefore, a quick look at various ERP studies is given as they relate to sentence processing and bilinguals.

Semantic Anomalies: Variations in the N400 Component

In an early study, Meuter, Donald, and Ardal (1987) compared the ERP trace obtained in the first language (L1; English or French) and L2 (French or English) of two groups of fluent bilinguals while reading sentences. The authors chose to examine variations in the N400 component, as produced by the sentence-final word of semantically anomalous sentences compared to a semantically acceptable ending. The question of interest was whether the N400 effect would be obtained in the L2, and whether it would be similar in latency and amplitude to that obtained in the L1. Semantically anomalous sentence-final words produced an N400 effect in both the native language and L2. The authors reported a trend for the N400 effect to be smaller in the L2 than the native language. However, this held true at only one electrode site and for one group of bilinguals only. As such, no firm conclusions can be drawn from this preliminary study about differences in semantic integration processes in the native language and L2 (as indeed none were).

In a subsequent study, Ardal, Donald, Meuter, Muldrew, and Luce (1990) reexamined this question, both in another group of late bilinguals and in a group of early bilinguals (mean age of L2 acquisition was 7.3 years). The authors again found an N400 effect in the L2 as well as the native language, that is, a larger N400 to semantically anomalous than semantically acceptable sentence-final words. However, as in the previous study, the N400 effect obtained in the L2 differed from that found in the native language, having a slightly later onset in the L2 than in the L1. Moreover, the bilinguals showed a trend for a later onset of N400 in both languages compared to monolingual controls. No significant differences in the ERP record were found between the early and late bilingual groups. The tentative
conclusion from this study was that semantic integration processes are affected by proficiency in a language and will be reflected by the time course of electrophysiological measures of processing. One caveat is nonetheless in order; the native language of the bilinguals in this study varied considerably. As is well known, interactions between the bilinguals' native language and L2 are numerous. The conclusions from Ardal et al.'s (1990) study must thus be considered with some caution.

In several more recent studies, ERPs were again used to measure semantic integration processes during sentence processing in the L2 (Hahne, 2001; Hahne & Friederici, 2001; Sanders & Neville, 2003; Weber-Fox & Neville, 1996). These studies also looked at syntactic processing, which is discussed in this section. The general pattern of these studies seems to be that, depending on proficiency in the L2, the N400 effect (as classically produced by semantically anomalous sentence endings or medial words [e.g., "The volcano was eaten" or "The scientist criticized Max's event of the theorem"] in comparison to semantically acceptable sentences [e.g., "The bread was eaten" or "The scientist criticized Max's proof of the theorem"] or, more recently, by sentence-medial nonwords ["bokkers"] compared to real words ["bottles"]) will be equivalent in amplitude and latency in the L2 of bilinguals to that found in the L1 for native speakers. Note that for less-proficient L2 speakers, the amplitude of the N400 effect is often smaller, and its peak is delayed compared to that obtained in the native language (Hahne, 2001; Weber-Fox & Neville, 1996), thus rejoining the results reported in earlier studies.

In sum, those studies that have recorded ERPs (auditory and visual) to examine L1 and L2 sentence processing at the semantic level showed basically indistinguishable patterns for the two languages for proficient bilinguals and relatively minor differences between the L1 and L2 for less-proficient bilinguals.

Syntactic Anomalies

Variations in the N400 Component. The above-mentioned bilingual studies used the N400 component to index the immediate semantic integration of words in visually or auditorily presented sentences. Since the initial finding of Kutas and Hillyard (1980) in monolinguals, this is indeed the most often reported interpretation of variations of the N400. Note, however, that work by Osterhout and collaborators (McLaughlin, Osterhout, & Kim, 2004; Osterhout, McLaughlin, Kim, Greenwald, & Inoue, 2004) suggested that, in early stages of learning an L2 as an adult, the N400 may not be restricted to the detection of semantic anomalies. Osterhout's group has found that number agreement errors in the L2 (i.e., between the subject of the sentence and the subsequent verb) will initially produce a variation in the ERP trace that has all the characteristics of an N400 effect for young adults who have just begun to learn an L2. With more L2 experience, this effect diminishes, to be replaced by a more canonical syntactic marker in the ERP record. This highly interesting line of work should be followed by any researcher endeavoring to understand the development of L2 sentence processing in adults.

Variations in Early Anterior Negativity and P600. Two major ERP laboratories, one in the United States and one in Germany, have published a series of articles on L2 syntactic anomaly processing (Hahne, 2001; Hahne & Friederici, 2001; Sanders & Neville, 2003; Weber-Fox & Neville, 1996). These articles have all addressed the issue of the critical period hypothesis in one way or another, comparing either early and late bilinguals or the performance of late bilinguals on semantic and syntactic anomaly detection to that of native speakers.

In one of these studies (Weber-Fox & Neville, 1996), direct comparisons were made between five groups of Chinese-English bilinguals, ranging from early (as early as from infancy) to late acquirement of the L2, in relation to the processing of illegal structures in English. Illegalities were of various sorts, including phrase-structure violations and subjacency errors (as in "The scientist criticized Max's of proof the theorem"). The authors found a high level of performance on behavioral tasks for all groups of bilinguals (at least 85% correct). Several different time windows were considered in the electrophysiological trace.

A rather complex pattern of results emerged (Weber-Fox & Neville, 1996). At the earliest window (50–250 ms, or N125), the three groups of early bilinguals did not show a significantly larger response to illegal structures, whereas the two groups of late bilinguals and the monolingual control group did. However, whereas the monolingual participants showed a hemispheric asymmetry, with a larger response at left anterior sites, the late bilinguals showed a bilateral response that was nonetheless larger over the right hemisphere. The latency of the N125 was also delayed in the two groups of late bilinguals compared to monolinguals. As such,
the authors concluded that the early negativity found in the late bilingual groups was not the early anterior negativity associated with aspects of syntactic processing.

At a later window, often associated with N400 (300–450 or 300–500 ms after word onset), phrase structure violations produced an increased negativity in all bilingual groups as well as in the monolingual controls. Again, however, in the two groups of bilinguals who acquired their L2 after age 11 years, the typical signature of a greater left hemisphere effect was absent. Finally, in the time window associated with the P600 (i.e., 500–700 ms and 700–900 ms; cf. Osterhout & Holcomb, 1992; Osterhout, McKinnon, Bersick, & Corey, 1996), whereas the three groups of early bilinguals showed a response comparable to that found in monolinguals for this type of violation, the two groups of late bilinguals showed greater positivity to illegal structures only in the later time window (700–900 ms), and the amplitude of the P600 was smaller than that found for monolingual controls.

The authors (Weber-Fox & Neville, 1996) suggested that, at least for the types of syntactic anomalies they studied, only very early acquisition of an L2 enables bilinguals to acquire the skills necessary to detect and process them in nativelike fashion. Note, however, that none of the bilingual groups showed a typical early anterior negativity. Moreover, the two groups of late bilinguals did show increased N400 as well as P600 responses to illegal structures, even if the latter effect was delayed in comparison to the early bilinguals and monolinguals. As such, the differences across the bilingual groups were in amplitude and latency rather than in nature.

In a subsequent study (Sanders & Neville, 2003), the ERP trace to auditorily presented materials was compared for monolinguals and Japanese-English late bilinguals. For the bilinguals, no differences in the ERP trace were found between syntactic strings (i.e., basically "jabberwocky" sentences in which the syntactic class of elements in the sentence was maintained and that respected English syntax, but that were otherwise meaningless) and acoustic strings (which carried neither syntactic nor semantic information). Monolingual controls, however, produced differences across these conditions, both at specific positions in the sentences and across the entire sentence. As such, the authors again concluded that automatic grammatical processing is not acquired by those who learn their L2 later in life (whereas semantic processing is unaffected by age of acquisition).

Another pair of ERP studies (Hahne, 2001; Hahne & Friederici, 2001) suggested a similar yet perhaps more nuanced argument concerning L2 syntactic processing. In these studies, phrase structure violations were employed in German (e.g., "Das Eis wurde im gegessen," literal translation "The ice cream was in the eaten") to determine whether Russian-German and Japanese-German late bilinguals would be sensitive to this type of anomaly. Akin to the study reported by Weber-Fox and Neville (1996), the late bilinguals in these studies did show a difference in the ERP trace to legal and illegal structures, as evidenced by P600, but unlike Weber-Fox and Neville (1996), these studies did not find a delayed onset of the P600 in bilinguals.⁶

Two restrictions were nonetheless present. First, the difference in P600 as a function of sentence type was found only in advanced late bilinguals (mean formal learning 6 years; mean residency 5 years), not in less-experienced L2 users (mean formal learning and residency 2.5 years). Second, and most important for the authors' argument, although differences in the P600 were found in the advanced bilinguals as a function of sentence type, no differences were found in either of the late bilingual groups for an earlier left anterior negativity. Monolingual controls showed both effects. This pattern of results led the authors to suggest, in line with their previous monolingual work (Friederici, Hahne, & Mecklinger, 1996), that differences in automatic and more effortful syntactic processing can be indexed by these two components. The absence of an early effect in late bilinguals when encountering illegal structures in their L2 would suggest that they lack automatic processes present in native speakers.

Variations in P600. The ERP studies cited all examined L2 processing for anomalous structures, that is, sentences that contained either a semantic or syntactic anomaly in comparison to semantically/syntactically acceptable sentences or strings that contained only syntactic information compared to complete nonsense strings. Whether the ERP trace will reveal differences for ambiguity resolution is open to debate. In the one study of which I am aware that has used ERPs to examine the processing of ambiguous structures in the L2 (Kotz, 1991), it would appear that perhaps there are indeed differences between anomaly and ambiguity processing.

In the study reported by Kotz (1991), the materials were the same as those used by Osterhout (1990) and Osterhout and Holcomb (1992). The
materials played on verb subcategorization information, as illustrated by Examples 1 and 2:

1. The doctor agreed to see the patient had left the hospital.
2. The doctor implored to see the patient had left the hospital.

Sentence 1 carries an intransitive verb ("agree"), whereas Sentence 2 carries a transitive verb ("implore") that requires either a direct object or sentential complement. Given that both native and proficient nonnative readers readily use this type of subcategorization information (cf. Frenck-Mestre & Pynte, 1997), it is to be expected that at the preposition "to" the processing of Sentence 1 (in which "agree" is the main verb) will incur less difficulty than that of Sentence 2 (in which "implore" is the subordinate verb of a reduced relative clause followed by a main clause). This was borne out in the ERP data for both monolingual controls and highly proficient Spanish-English bilinguals (mean age of acquisition of English 5.3 years). Both groups showed a significantly larger P600 (time window 600–800 ms⁷) at the preposition "to" when reading sentences containing a transitive verb (as illustrated in Sentence 2) than when reading sentences containing an intransitive verb (as illustrated in Sentence 1). No interaction with group (monolingual vs. bilingual) was observed. Moreover, at the subordinate verb "had," the inverse effect was found: P600 was larger for sentences in which the first verb was intransitive and thus did not entail a subsequent main clause (Example 1) than for those with a transitive verb, in which "had" provided the main verb (Example 2). This obtained for both monolingual and bilingual readers.

The results of this ERP study showed, in line with the eye movement studies reported in this chapter, that proficient L2 readers produce similar results to those obtained for native readers, and that highly specific information, such as the type of construction most commonly associated with a particular verb class, is used by proficient bilinguals in their L2.

What to Conclude Concerning Second Language Syntactic Processing in Late Bilinguals?

In contrast to the ERP data reported for semantic anomaly detection in the L2, which has been found to be basically equivalent in the second and native language, data on syntactic anomaly detection showed discrepancies between native and L2 processing (Hahne, 2001; Hahne & Friederici, 2001; Sanders & Neville, 2003; Weber-Fox & Neville, 1996). Why this should be so when eye movement data have quite often shown that, provided sufficient proficiency, L2 syntactic processing obeys the same principles as native language processing (Dussias, 2001; Frenck-Mestre, 1997, 2002; Frenck-Mestre & Pynte, 1997; Hoover & Dwivedi, 1998; Juffs & Harrington, 1996) is a matter worth contemplating.

As a possible explanation for the differences in results across eye movement and ERP studies, I suggest that there are major differences in the scope of ERP studies on L2 syntactic processing and that of the eye movement studies presented in the first half of this chapter. First, bilingual ERP studies have by and large examined the processing of syntactic anomalies (with the exception of Kotz, 1991). Eye movement studies on syntactic processing, both mono- and bilingual, are dominated by the study of syntactic ambiguities. Second, the main thrust of these ERP studies has been to test the critical period hypothesis; either direct cross-longitudinal comparisons have been made between early and late bilinguals concerning syntactic anomaly detection or, within the late bilinguals, comparisons have been made with monolingual data concerning semantic and syntactic anomaly detection. In the eye movement studies discussed, only the performance of late bilinguals was under scrutiny. Both experienced and beginning bilinguals' data were examined, but nonetheless for late bilinguals (i.e., those who learned their L2 after age 12 years and in almost all cases in a scholastic setting). The debate still rages regarding whether these late bilinguals can ever obtain the same level of automatic processing as native speakers (cf. Birdsong, 1999, and chapter 6, this volume, for a discussion of the question), but such is beyond the scope of the present chapter.

Unlike the ERP studies of anomaly processing, the study of syntactic ambiguities has revealed quite coherent patterns across ERP and eye movement studies. Both measures produce highly similar patterns for native and proficient nonnative speakers. Moreover, there are parallels between the ERP data for anomaly processing and eye movement data on ambiguity resolution. In both, it has been reported that, with increasing L2 experience, late bilinguals' performance resembles that of native readers to a greater or lesser degree. Note, however, that the ERP literature on the processing
of syntactic anomalies unanimously found late bilinguals lacking when it comes to early processing decisions in these subjects.⁸ This is not the case in the eye movement literature, which clearly reports very detailed and immediate use of grammatical information during parsing in the L2 for highly proficient late bilinguals.

It is my contention that, to understand just where and why the data differ between eye movements and ERPs, direct comparisons must be made between the same subjects and, most important, for the same type of processing. Resolving a syntactic ambiguity may well entail reappraisal of a structure and repair processes; however, repair is indeed a possibility. This is not at all immediately apparent for illegal structures, such as have been used in the majority of ERP studies to date. What does the reader do when unable to resolve a phrase-structure violation? Whereas the eye movement studies reported showed considerable change in L2 syntactic ambiguity resolution with experience and near-native performance for the most highly skilled bilinguals, it is less evident that improvement would be found for the processing of illegal structures, such as those employed in the bilingual ERP studies. Future research on L2 syntactic parsing is thus faced with an interesting new avenue. Comparisons across techniques (eye movement, ERPs, and functional magnetic resonance) with comparable subject pools and linguistic materials should provide clear advances in the understanding of this most intriguing topic.

Notes

1. Note that this principle can be transgressed in two specific cases: when the direct object is a heavy NP, as illustrated by "The jury will reveal after lunch [the verdict over which they have been debating for almost three weeks.]" and when the direct object is a sentential complement, as illustrated by "The judge said on Monday [that he would refuse to reconsider the case.]".
2. In French, sentences such as those illustrated in parentheses are both equally licensed by the grammar ("Jean boit son café lentement" and "Jean boit lentement son café").
3. White (1991) also entertained the hypothesis that the effect observed in French readers of English may be caused by differences across English and French concerning verb raising, as suggested by Pollock (1989).
4. Fodor (1999), as well as Bley-Vroman (1991) and MacWhinney (1997), argued rather strongly against the arguments put forward by White (1989a, 1989b), who situated L2 processing within the greater theory of Universal Grammar.
5. It should be noted that the frequency of structures on its own is most likely not strong enough to reduce all ambiguity, as was clearly outlined by MacDonald and colleagues (MacDonald et al., 1994; MacDonald, 1994, 1997). Stronger constraints would be provided by the culmination of several factors (cf. McRae et al., 1997).
6. In the Hahne (2001) and Hahne and Friederici (2001) studies, the time window for the P600 ranged between 500 and 1,200 ms. Otherwise stated, it was not broken down into several windows, as was the case for the Weber-Fox and Neville (1996) study. Note, nonetheless, that visual inspection of the ERP data does not suggest an earlier onset of P600 in the monolingual control group.
7. An earlier time window, between 500 and 650 ms, revealed no differences as a function of structure for either the monolingual controls or bilingual readers.
8. It is well beyond the scope of this chapter to entertain the current disputes in the literature regarding the exact nature of the different ERP components observed during syntactic anomaly detection. Note, however, that there is presently no clear consensus on this issue (cf. Friederici et al., 1996; Osterhout & Hagoort, 1999; Osterhout et al., 1996).

References

Altmann, G. T. M., Garnham, A., & Dennis, Y. (1992). Avoiding the garden path: Eye movements in context. Journal of Memory and Language, 31, 685–712.
Ardal, S., Donald, M. W., Meuter, R., Muldrew, S., & Luce, M. (1990). Brain responses to semantic incongruity in bilinguals. Brain and Language, 39, 187–205.
Bever, T. G. (1970). The cognitive basis for linguistic structures. In J. R. Hayes (Ed.), Cognition and language development (pp. 277–360). New York: Wiley.
Bialystok, E. (1997). Why we need grammar: Confessions of a cognitive generalist. In L. Eubank, L. Selinker, & M. Sharwood Smith (Eds.), The current state of interlanguage (pp. 55–61). Amsterdam: Benjamins.
Birdsong, D. (1999). Second language acquisition and the critical period hypothesis. Mahwah, NJ: Erlbaum.
Bley-Vroman, R. (1991). Processing, constraints on acquisition, and the processing of ungrammatical sentences. In L. Eubank (Ed.), Point counterpoint: Universal grammar in the second language (pp. 191–197). Amsterdam: Benjamins.
Boland, J. E., & Boehm-Jernigan, H. (1998). Lexical constraints and prepositional phrase
Friederici, A. D., Hahne, A., & Mecklinger, A. (1996). Temporal structure of syntactic parsing: Early and late event-related brain potential effects. Journal of Experimental Psychology: Learning, Memory and Cognition, 5, 1219–1248.
Haegeman, L. (1994). Introduction to government and binding theory (2nd ed.). Cambridge, MA: Blackwell.
Hahne, A. (2001). What's different in second-language processing? Evidence from event-related brain potentials. Journal of Psycholinguistic Research, 30, 251–266.
Hahne, A., & Friederici, A. D. (2001). Processing a second language: Late learners' comprehension mechanisms as revealed by event-related brain potentials. Bilingualism: Language and Cognition, 4, 123–142.
Holmes, V. M., Stowe, L., & Cupples, L. (1989). Lexical expectations in parsing complement-verb sentences. Journal of Memory and Language, 28, 668–689.
Hoover, M. L., & Dwivedi, V. D. (1998). Syntactic processing by skilled bilinguals. Language Learning, 48, 1–29.
Juffs, A., & Harrington, M. (1996). Garden path sentences and error data in second language sentence processing. Language Learning, 46, 283–326.
Kimball, J. (1973). Seven principles of surface structure parsing in natural language. Cognition, 2, 15–47.
Kotz, S. A. (1991). Event-related brain potentials: A sensitive measurement of bilingual sentence comprehension? Unpublished master's thesis, Tufts University, Boston, MA.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.
MacDonald, M. (1994). Probabilistic constraints and syntactic ambiguity resolution. Language and Cognitive Processes, 9, 157–201.
MacDonald, M. (1997). Lexical representations and sentence processing. Language and Cognitive Processes, 12, 121–136.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). The lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676–703.
MacWhinney, B. (1997). Second language acquisition and the competition model. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 113–142). Mahwah, NJ: Erlbaum.
MacWhinney, B. (2001). The competition model: The input, the context and the brain. In P. Robinson (Ed.), Cognition and second language instruction. Cambridge: Cambridge University Press.
McLaughlin, J., Osterhout, L., & Kim, A. (2004). Neural correlates of second-language word learning: Minimal instruction produces rapid change. Nature Neuroscience, 7, 703–704.
McRae, K., Ferretti, T. R., & Amyote, L. (1997). Thematic roles as verb-specific concepts. Language and Cognitive Processes, 12, 137–176.
Meuter, R., Donald, M. W., & Ardal, S. (1987). A comparison of first and second-language ERPs in bilinguals. Current Trends in Event-Related Potential Research (EEG Suppl. 40), 412–416.
Mitchell, D. C. (1989). Verb-guidance and other lexical effects in parsing. Language and Cognitive Processes, 4, 123–154.
Mitchell, D. C. (1994). Sentence parsing. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 375–409). New York: Academic Press.
Mitchell, D. C., & Brysbaert, M. (1998). Challenges to recent theories of language differences in parsing: Evidence from Dutch. In D. Hillert (Ed.), Sentence processing: A cross-linguistic perspective (pp. 313–335). New York: Academic Press.
Mitchell, D. C., Cuetos, F., Corley, M. M. B., & Brysbaert, M. (1995). Exposure-based models of human parsing: Evidence for the use of coarse-grained (non-lexical) statistical records. Journal of Psycholinguistic Research, 24, 469–488.
Mitchell, D. C., & Holmes, V. M. (1985). The role of specific information about the verb in parsing sentences with local structural ambiguity. Journal of Memory and Language, 24, 542–559.
Ni, W., Crain, S., & Shankweiler, D. (1996). Sidestepping garden paths: Assessing the contributions of syntax, semantics and plausibility in resolving ambiguities. Language and Cognitive Processes, 11, 283–334.
Osterhout, L. (1990). Event-related brain potentials elicited during sentence comprehension. Unpublished doctoral dissertation, Tufts University, Boston, MA.
Osterhout, L., & Hagoort, P. (1999). A superficial resemblance does not necessarily mean you are part of the family: Counterarguments to Coulson, King, and Kutas (1998) in the P600/SPS-P300 debate. Language and Cognitive Processes, 14, 1–14.
Osterhout, L., & Holcomb, P. J. (1992). Event-related brain potentials elicited by syntactic anomaly. Journal of Memory and Language, 31, 785–806.
Osterhout, L., Holcomb, P. J., & Swinney, D. A. (1994). Brain potentials elicited by garden-path sentences: Evidence of the application of verb information during parsing. Journal of
Sentence Processing 281
285
286 Production and Control
production system, like lexical access, grammatical encoding, and the like, but I do not imply
by the use of this term that these processing components are necessarily operating in a strict
modular fashion.

With respect to the first topic, it appears that in large parts of monolingual language
production research, a clear specification of the format and the role of the conceptual input
has been avoided more or less systematically. This holds primarily for monolingual production
research aiming at uncovering the precise (temporal) details of the working of the respective
modules or processing components under investigation. Of course, there are exceptions to this
(e.g., Slobin, 1996). But at least most of the experimental research that tries to uncover the
details of the linguistic encoding processes tends to avoid the complexities of the conceptual
input.

To understand the potential reasons for this state of affairs, it is useful to realize that
language production research has always been in a tenuous position between leaving the input
and output sides relatively unconstrained on the one hand and on the other hand trying to have
as much (experimental) control over input and output as possible. The following quotations
exemplify two extreme positions on this topic. Butterworth (1980) stated: "It would be
extraordinarily optimistic to set up manipulations of the input and to expect to find
systematic outputs, unless the subject is so limited in what he is allowed to say that
generalisations to natural speech become almost impossible" (p. 2). In clear contrast to this
position, Rosenberg (1977) stressed the need for experimental approaches of the type used in
experimental cognitive psychology:

It should be clear by now that determinants and organisation of the speech production process
are not likely to be resolved unless we turn our attention toward the development of
manipulative research paradigms which (a) insure adequate control over the input, (b) limit
information processing demands, and (c) constrain the speaker's responding. (p. 196)

As Bock (1995, 1996) noted, a current practice in monolingual research has been to bypass the
difficult problem of an explicit characterization of the conceptual input to language
production by keeping the eliciting stimulus the same across experimental manipulations.
Typical examples of this approach are implicit priming paradigms, picture-word interference
experiments, and sentence completion tasks. These approaches try to avoid the difficult and
largely unresolved issue of specifying the precise characteristics of the preverbal input
representation to language production processes and the more general question about the
relation of language and thought. These approaches in monolingual production research have
provided important insights into the processing details of the language production process,
and they have proven a useful heuristic for experimental research on monolingual language
production.

However, the present chapters on bilingual production show that the topic of an adequate
characterization of the preverbal input is unavoidable when talking about bilingual language
production. For example, La Heij discusses the question whether all necessary information for
accessing the appropriate lexical entry should be assumed to be present at the conceptual
preverbal message level. If so, we would end up with what La Heij calls "complex access/simple
selection." The general approach by La Heij is a very attractive one, but it requires precise
specification of the contents and properties of the conceptual input. Costa assumes a somewhat
more complex and intelligent selection mechanism that only considers the actual target
language during the selection process. Here, too, we have to assume that some information from
the conceptual level plays a role.

Is the need for a sufficiently explicit specification of the conceptual input restricted to
bilingual language production, or does it also figure in monolingual production? Despite the
success of an experimental approach that keeps the conceptual input constant and thus tries to
bypass the problem of a more detailed specification of the conceptual input, it appears that
monolingual production research also will be forced to face this problem. Just to name two
problems (beyond those provided in the chapter by La Heij): How do speakers choose between
levels of specificity for referring to objects depending on the (presumed) expertise of their
interlocutors? And how precisely does context affect the choice between a more specific and a
less specific word? Just to give an example for the latter question, assume you want to refer
to a specific fish in a fishpond (e.g., a carp). At what level of processing does a speaker
filter out the basic level name fish and decide to use the more specific name carp? From a
broader perspective, this is not very different from filtering out the German word for carp,
Karpfen, when you want to refer to a carp in English and not in German. The obvious question
is whether this problem is going to be solved in the same way when you are dealing with one
language, your first language, or with two languages, your first and second languages.
Introduction to Part III 287
Put differently, are the mechanisms for filtering out the inappropriate name the same within a
language and between two languages? And, more important in the present context, are these
mechanisms located at the conceptual level or at a level of the linguistic formulation
processes? Thus, in my view, the present chapters strongly suggest that we will not be able to
avoid a more detailed characterization of the conceptual input as soon as we start to look at
situations that are a bit more complex and a bit more contextualized than those currently used
in most research on monolingual production.

Let us now turn to the second topic: control, switching, and automaticity. Although each of
these can be seen as a rather different issue, they have a common underlying basis that plays
a role in all chapters of the present section, but is most explicitly present in the last
three chapters. When reading the chapters, one is confronted with the question whether control
processes and related processes are to be located outside the respective processing module, or
whether they are part of the module itself. This issue surfaces much more clearly in bilingual
production research, but it is also of central importance for models of monolingual
production. As already stated above, one central question in this context is the following:
How much intelligent or complex processing is going on in the linguistic formulation modules
(like lexical access and selection, grammatical encoding, and phonological encoding) proper,
and how much should be located outside these modules at other levels like conceptual
preparation, task schemas, and the like? The proposals range from very simple, highly
automatized formulation modules functioning in a ballistic fashion once triggered, with very
simple local processing principles, to sophisticated procedures within each module, like
verification procedures, selective inhibition, tags and flags, and so on.

It is interesting to look at this issue from a historical perspective. In Levelt's (1989)
seminal book on language production, the different processing components were highly locally
operating devices working in an automatic and ballistic fashion. In the course of time, these
modules have become more and more intelligent, partly as a reaction to new empirical evidence
(compare, e.g., Levelt, 1989, with Levelt, Roelofs, & Meyer, 1999). As a consequence, we are
now talking about much more complex and intelligent within-module processes than we did 10
years ago. However, one might want to opt for a different solution, namely, keeping the
processing modules as local, automatic, and simple as possible while locating control
processes and related processes outside the processing modules proper. The present chapters
show that this question plays a much more central role in bilingual production than in
monolingual production. I hope that the present chapters will bring this topic back to the
attention of those primarily interested in monolingual production.

To summarize, the present chapters on bilingual production provide a lot of food for thought,
not only for those interested in bilingual language production, but also for those primarily
interested in monolingual production. Put differently, bilingual language production is not
only an interesting topic by itself, but also (re)introduces topics highly relevant for
monolingual production research.

References

Bock, J. K. (1995). Sentence production: From mind to mouth. In J. L. Miller & P. D. Eimas (Eds.), Handbook of perception and cognition. Volume 11: Speech, language, and communication (2nd ed., pp. 181-216). San Diego, CA: Academic Press.
Bock, J. K. (1996). Language production: Methods and methodologies. Psychonomic Bulletin and Review, 3, 395-421.
Butterworth, B. (1980). Introduction: A brief review of methods of studying language production. In B. Butterworth (Ed.), Language production (Vol. 1, pp. 1-17). London: Academic Press.
Fodor, J. A., Bever, T. G., & Garrett, M. F. (1974). The psychology of language. New York: Crowell.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1-75.
Rosenberg, S. (1977). Semantic constraints on sentence production: An experimental approach. In S. Rosenberg (Ed.), Sentence production: Developments in research and theory (pp. 195-228). Hillsdale, NJ: Erlbaum.
Slobin, D. I. (1996). From "thought and language" to "thinking for speaking." In J. Gumperz & S. Levinson (Eds.), Rethinking linguistic relativity (pp. 70-96). Cambridge, England: Cambridge University Press.
Wido La Heij
14
Selection Processes in Monolingual
and Bilingual Lexical Access
ABSTRACT How do bilinguals selectively retrieve words from either the first or second
language when both words express the same conceptual content? Formulated in this
way, this problem is very similar to the one faced by monolinguals when a preverbal
message does not uniquely specify a single lexical item (the convergence problem). In
this chapter, I argue that this convergence problem is nonexistent if one assumes that the
preverbal message contains all necessary information, including affective and pragmatic
features, to uniquely specify a single word. In monolinguals, these features may indicate
the intention to use slang, formal language, or a euphemism. In bilinguals, one of these
features may indicate that a first or second language word is required. If the preverbal
message indeed contains all necessary information, lexical selection can be a simple
process based on information that is locally available in the activation levels of words.
Like the model proposed by Poulisse and Bongaerts (1994), the resulting model of lexical
access can be characterized as "complex access, simple selection." Arguments are provided
against recent models of monolingual and bilingual lexical access in which lexical
selection is a complex, nonlocal process that needs (a) situational knowledge (about, for
instance, the language to be spoken) and (b) knowledge about the activated words (for
instance, whether they belong to the first or second language).
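
The idea of "complex access, simple selection" can be made concrete with a small sketch. This is only an illustration under stated assumptions: the toy lexicon, the feature labels (including the language cue written here as L1/L2), and the rule that a word's activation equals the number of message features it matches are all invented for the example and are not part of the model described in this chapter.

```python
# Toy sketch of "complex access, simple selection".
# Assumption: every lexicon entry, feature label, and the overlap-count
# activation rule below is hypothetical and chosen only for illustration.

LEXICON = {
    "dog":   {"DOG", "NEUTRAL", "L1"},
    "pooch": {"DOG", "SLANG", "L1"},
    "hond":  {"DOG", "NEUTRAL", "L2"},
    "cat":   {"CAT", "NEUTRAL", "L1"},
}

def activate(message, lexicon):
    """Complex access: a rich preverbal message activates each word in
    proportion to the number of message features the word matches."""
    return {word: len(features & message) for word, features in lexicon.items()}

def select(activations):
    """Simple selection: pick the most active word. Only the activation
    levels are consulted; no knowledge of language or situation is used."""
    return max(activations, key=activations.get)

# A preverbal message that already contains a pragmatic feature and a language cue.
message = {"DOG", "NEUTRAL", "L2"}
chosen = select(activate(message, LEXICON))
```

In this sketch all of the work is done when the message is built; the selection step never inspects which language a word belongs to.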
language (L1). First, I discuss the nature of the conceptual information that the speaker
wants to convey (the preverbal message) and the problem that this preverbal message may not
uniquely specify a single word (the convergence problem). Next, two selection processes are
discussed that are commonly assumed in language production models: the selection of the
conceptual information to be lexicalized (concept selection) and the selection of the response
word from a set of activated words (lexical selection). This discussion leads to a proposal of
how words are retrieved in the L1.

The final section discusses the implications of this proposal for bilingual lexical access. In
that section, I propose a simple modular account of bilingual lexical access that has two main
characteristics. First, the language in which the bilingual intends to speak is, in the form
of a "language cue," part of a complex preverbal message that contains all conceptual,
pragmatic, and affective characteristics of the word to be retrieved. Second, the actual
selection of a word is a relatively simple process mainly based on the activation levels of
lexical representations. In this view, lexical selection does not evaluate the appropriateness
of a word that is about to be selected or involve the selective activation or inhibition of
words that belong to a particular language. This view, which could be characterized as
"complex access, simple selection," is contrasted with recent proposals that could be
characterized as "simple access, complex selection."

A Word on Experimental Paradigms, Findings, and Terminology

Since the early 1990s, online research methods have become increasingly popular in language
production research. Of these methods, the picture-word interference task, a variant of the
color-word Stroop task, is most frequently used. In this task, a target picture and a context
word are presented, and the participant is required to name the picture while ignoring the
context word. Manipulating the syntactic (Schriefers, 1993), semantic (La Heij, 1988;
Underwood, 1976), and phonological (Posnansky & Rayner, 1977; Schriefers, Meyer, & Levelt,
1990) similarity between the context word and the name of the picture has been used to shed
light on many aspects of lexical access. Most relevant for the discussion in this chapter is
the semantic interference effect: the observation that pictures are named more slowly when
accompanied by a semantically related context word than when accompanied by an unrelated
context word. For example, the naming of the picture of a dog takes more time when the context
word cat is superimposed than when the word pen is superimposed. In the following sections, a
small number of experimental results are discussed that were obtained with this picture-word
interference paradigm.

In recent years, a discussion has started about what is retrieved on the basis of a preverbal
message, an abstract word representation (lemma) or a phonological word form (e.g., Caramazza,
1997; Caramazza & Miozzo, 1998; Harley, 1999; Roelofs, Meyer, & Levelt, 1998). In this
chapter, I use the theoretically neutral term lexical representation or word. Finally, I use
the following convention: Nonverbal representations (concepts) are presented in uppercase
letters (e.g., the picture activates the concept DOG). When a word denotes a stimulus, verbal
response, or lexical representation, italics are used (e.g., the concept activates the word
dog, the word dog was presented, and the speaker produced the word dog).

Models: The (Preverbal) Message

In this chapter, only those aspects of models of language production are discussed that are
relevant for the retrieval of a single word. Basically, all of these models distinguish two
systems, a conceptual system and a lexicon. The conceptual system contains world knowledge in
the form of nonverbal representations. In some models, a separate lexical-semantic system is
distinguished that contains the meaning of words. However, this difference is not well
articulated, and the labels conceptual system and semantic system are often used
interchangeably (see Francis, chapter 12, this volume). The second system is often referred to
as the "mental lexicon" or just "lexicon." The lexicon contains representations of words with
their syntactic, phonological, and (in some models) semantic characteristics.

At a very general level, the naming of an object with a bare noun is thought to comprise the
following steps (see Fig. 14.1 for a simple pictorial representation). First, the visual
processing of the object ultimately leads to the activation of a representation in the
conceptual system. When that happens, the object is recognized. That is, information about the
object, like function, smell,
Selection Processes in Lexical Access 291
taste, and so on, becomes available. Because other conceptual representations may be activated
(e.g., by other stimuli in the speaker's environment), the conceptual information that the
speaker wants to express has to be selected. This selection process is referred to as concept
selection. All current models assume that lexical access does not result in the activation of
a single lexical representation. Instead, it is assumed that many words become activated,
either because their meaning overlaps with the content of the preverbal message or because not
only the selected concept, but also all activated concepts activate their lexical
representations. A second selection process (lexical selection) is responsible for the
selection of one word from this set of candidates. Lexical selection ultimately results in the
availability of a phonological word form that can be articulated.

Figure 14.1 A simple model of lexical access. A picture of a dog is presented for naming. At
the conceptual level (below the dotted line), the corresponding conceptual representation is
selected (concept selection). At the lexical level (above the dotted line), a number of
semantically related word candidates become activated. Lexical selection determines which of
these words will be produced.

In addition, it is useful to note that Levelt (1989) proposed a monitoring process that uses
the word comprehension system to determine whether the meaning of the word about to be
articulated matches the content of the preverbal message. If no match is obtained, this may
lead to a covert repair. When the monitor detects a mismatch during or after articulation, an
overt repair may result.

The Verbal Message

Usually, speakers want to achieve something with their utterances. If they ask a question,
they like to hear it answered; if they provide information, they want it to be understood, and
if they are in the position to give an order, they want to see the appropriate action taken.
In other words, by producing a verbal message speakers intend to achieve a communicative goal
(Levelt, 1989).

Given this goal of a speaker, we may agree on the following, very pragmatic definition of
differences in meaning between verbal messages: If an addressee in a particular situation
reacts differently to verbal messages A and B, then, for that addressee in that situation, the
messages A and B differ in meaning. There are theories that seem to define meaning in such a
way that utterances A and B may have the same meaning but nevertheless induce different
reactions on the part of the addressee. As discussed here, these theories of meaning do not
seem a very useful starting point for understanding the behavior of speakers.

Because speakers want to achieve a communicative goal, it is evident that in preparing an
utterance, they have to anticipate as much as possible how a particular addressee in a
particular situation will react. To that end, the speaker should take many aspects into
account: the addressee's age, likes and dislikes, intelligence, language skills, sense of
humor, social position, occupation, and so on. In addition, the speaker should adjust the
utterance to the specific social context. Within sociolinguistics, some of these factors are
discussed under the headings style and register, which refer respectively to the language
required in a specific situation and to the language used within specific socioeconomic groups
(e.g., occupational groups and teenagers; see also Levelt, 1989).
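
The pragmatic definition of a difference in meaning given in this section is relative to an addressee and a situation, which a short sketch can make explicit. The reaction table, the addressees, the situation, and the messages below are all hypothetical and were invented for this illustration; the sketch only restates the definition and makes no claim about how reactions would be measured.

```python
# Hypothetical reaction table: (addressee, situation, message) -> observed reaction.
# Assumption: every entry is invented purely for illustration.
REACTIONS = {
    ("stranger", "street", "Could you tell me the time?"): "answers politely",
    ("stranger", "street", "What time is it, mate?"): "frowns, then answers",
    ("friend", "street", "Could you tell me the time?"): "answers",
    ("friend", "street", "What time is it, mate?"): "answers",
}

def differ_in_meaning(addressee, situation, message_a, message_b, reactions=REACTIONS):
    """Two messages differ in meaning, for this addressee in this situation,
    exactly when the recorded reactions to them differ."""
    reaction_a = reactions[(addressee, situation, message_a)]
    reaction_b = reactions[(addressee, situation, message_b)]
    return reaction_a != reaction_b

FORMAL = "Could you tell me the time?"
CASUAL = "What time is it, mate?"
for_stranger = differ_in_meaning("stranger", "street", FORMAL, CASUAL)
for_friend = differ_in_meaning("friend", "street", FORMAL, CASUAL)
```

The same pair of messages can thus differ in meaning for one addressee and not for another, which is why the speaker has to anticipate the reaction of a particular addressee.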
The following contextual factors that determine our choice of words come to mind. First, the
social context may require a speaker to use formal language (e.g., in a meeting you may
address a good friend and colleague as "Mr. Brown" instead of the usual "Jim") or to use slang
("I'm going bonkers between those jerks"). Second, known sensitivities of the addressee may
require the use of euphemisms (varying from "rest room" to "collateral damage") or the
avoidance of taboo words. Third, limited language skills of the addressee may make you refrain
from using low-frequency words (e.g., avoid eloquent when talking to a child or to a nonnative
speaker). Finally, knowledge about an addressee's familiarity with the conversation topic may
determine your choice of category level (to a layperson, you may decide to use the phrase
statistical analysis instead of the more precise t test).

Even in a laboratory situation, speakers could react to stimuli in very different ways. One
and the same stimulus may be called a line drawing, a picture of an animal, animal, hound,
dog, young dog, puppy, whelp, Labrador, or pooch (see also Levelt, Roelofs, & Meyer, 1999). It
is mainly because of a contextual factor (the instruction given) that in most experiments
participants produce formal, basic category, bare noun utterances (dog). Nevertheless, the
models discussed in this chapter often simplify matters by assuming that, in naming tasks, the
concept activated by a target stimulus (e.g., the concept DOG activated by the picture of a
dog) directly activates the corresponding response word in the mental lexicon (e.g., the word
dog). However, it is probably more realistic to look at the activation and selection of a
conceptual representation as a step in the construction of a preverbal message (Levelt, 1989),
a process in which contextual factors (including task instructions) also are taken into
consideration.

The Convergence Problem

The issue of how word meaning is represented is notoriously difficult, and it is quite
understandable that many students of language production have not made it the main focus of
their research efforts. One of the aspects of meaning representation that did receive
attention, mainly because of the work by Levelt (1989) and Roelofs (1992, 1996), is the
so-called convergence problem: the problem that the conceptual information in a preverbal
message may not uniquely specify one lexical item. Here, I discuss two situations in which
problems of this sort are argued to arise: the hyperonym problem and the word-to-phrase
synonymy problem.

At the basis of Levelt's (1989) and Roelofs's (1992, 1996) discussion of the convergence
problem is the feature theory of word meaning. This theory, as presented by Levelt and
Roelofs, proposes that the meaning of a word consists of a list of conceptual features,
comparable to what is found in a dictionary. For example, the concept MOTHER may be
represented by two conceptual features, PARENT and FEMALE. If the speaker wants to express a
message containing the concept MOTHER, the conceptual features PARENT and FEMALE will be
activated. The hyperonym problem refers to the fact that the activated feature PARENT is a
sufficient condition for the retrieval of the word parent. So, why not retrieve the hyperonym
parent instead of the hyponym mother? The word-to-phrase synonymy problem refers to the fact
that, in this situation, the system seems unable to decide between the word mother and the
phrase female parent.

Levelt (1989) considered the solution of the hyperonym problem a touchstone for theories of
lexical access. Indeed, many researchers (e.g., Bierwisch & Schreuder, 1992; Caramazza, 1997;
De Bot & Schreuder, 1993; Roelofs, 1992, 1996) made serious efforts to provide a solution.
Levelt formulated three principles, including a principle of specificity that should prevent
the system from producing hyperonyms instead of the intended hyponyms. On the basis of a
similar analysis of the convergence problem, Roelofs decided to abandon the feature theory of
word meaning altogether. Instead, he proposed a nondecompositional view of word meaning in
which lexical concepts like FEMALE, PARENT, and MOTHER are represented by separate nodes. In
his view, lemma retrieval starts with the activation of MOTHER and not with activation of the
features FEMALE and PARENT.

The proposal to reject a feature theory of word meaning seems in line with developments in the
area of concept representation, in which the idea that a concept can be defined by a list of
(necessary and sufficient) features is severely criticized. In addition to Wittgenstein's
(1953) theoretical point that some concepts (e.g., GAME) do not seem to have a unique defining
feature, there are several empirical arguments (see, e.g., Harley, 2001). For example, a
simple feature theory has difficulty in accounting for the typicality effect. Oranges and
olives are both fruit and should both possess the
feature FRUIT. Nevertheless, it takes much longer to verify the sentence "an olive is a fruit"
than to verify the sentence "an orange is a fruit."

Another criticism is that a simple feature model seems to assume that the meaning of a word is
fixed. Often, however, word meanings seem fuzzy. First, speakers may disagree about the
category membership of words. McCloskey and Glucksberg (1978) mentioned, for instance, that
half of their participants thought that stroke was a disease, and half thought that it was
not. Second, word meaning seems to depend on the nonverbal or verbal context (see, e.g.,
Barsalou, 1982). The word mother can have a purely technical meaning in one situation ("the
one who has given birth to" or "female parent"), but in many other situations mother is better
characterized as the one who loved and protected you during childhood, irrespective of a
biological relation.

Despite these concerns about a feature theory of word meaning, it is questionable whether
Levelt's (1989) and Roelofs's (1992) analysis of this theory is adequate and consequently
whether Roelofs's rejection of all feature theories on the basis of that analysis is
warranted. Let me first examine the word-to-phrase synonymy problem: Why not say female parent
instead of mother despite the fact that both utterances express the same conceptual content
(Roelofs, 1996, p. 309; see also Fodor, 1980)? Somewhat surprisingly, many psycholinguists
seem to take Levelt's and Roelofs's premise that mother and female parent express the same
conceptual content for granted. But what makes them so sure of that? The answer to this
question is as simple as it is disconcerting: They are sure of that because they first define
the conceptual content of MOTHER as FEMALE PARENT. If that definition is incorrect or
incomplete, the notorious word-to-phrase synonymy problem is like defining your cat as a dog
and then trying to understand why it does not bark.

Defining CAT as DOG does not make it a dog and defining MOTHER as FEMALE PARENT does not make
her (just) a female parent. Above, I suggested a simple way of testing whether two utterances
differ in meaning. Let us apply this test to the present example: Next time you visit your
mother, greet her with a cheerful "Hi, female parent!" and see what happens. Most probably,
this test will show that the premise that these two utterances express the same meaning is
incorrect. Similar tests of other word-phrase pairs will most probably show that there are
very few words that express the same meaning as a corresponding phrase. This does not have to
surprise us because there is nothing in the feature theory that forces us to assume that all
conceptual features of a word have linguistic counterparts (words that express them). For
example, it seems very difficult to convey the mildness that characterizes a euphemism in any
other way than by using the euphemism itself. Clearly, if there is no word-to-phrase synonymy
in natural languages, there is no word-to-phrase synonymy problem to be solved.

Levelt's (1989) hyperonym problem seems to be based on a similar logical flaw. Roelofs (1996)
presented this problem in the following way: "If the conceptual conditions for the application
of a word such as father are met, then those of its hyperonyms such as parent are
automatically satisfied as well" (p. 309). Again, the question is, How can we be so sure about
this premise? Surely, fathers belong to the category of parents, but it is not clear whether
this implies that "the set of features relevant to a word includes those relevant to its
hyperonyms" (p. 314). That is, apart from the definition, no arguments are provided that the
full meaning of a category name (the word) has to be a subset of the meaning of the names of
the category's exemplars. In fact, it is pretty obvious that this is not the case. Category
names have a generality or unspecificity that the names of exemplars of a category lack. For
example, it could be argued that one important feature of PARENT is GENDER NOT SPECIFIED or
GENDER IRRELEVANT. Clearly, that feature cannot be part of the meaning of mother. So, when the
conceptual conditions for mother are met, those for parent are not automatically satisfied.
Relevant in this context is that, in later developments of the feature theory (e.g., Smith &
Medin, 1981), the identification of a word as a member of a category was not based on a
complete overlap between the two sets of features but on the amount of overlap. These
probabilistic feature models can also account for the observation that word meanings seem
fuzzy.

In conclusion, there is little we can say with certainty about what comprises the meaning of a
word. Clearly, dictionary definitions and simple feature lists do not do any justice to its
complexity, fuzziness, and context dependency. Therefore, the only information we can rely on
is the speaker's and the addressee's behavior. If an addressee reacts differently to mother
than to female parent, this defines a difference in meaning. In formulating the hyperonym
problem, Levelt (1989) may have approached the issue of concept representation from the wrong
direction. Instead of looking at the
behavior of language users, he (and many others) started with a rather poor and unrealistic
feature list representation of word meaning and either wondered how our language production
apparatus repairs the resulting problems or decided to abandon feature theories altogether.

For the present purposes, the most important conclusion is that if it is assumed that the
meaning of a word includes subtle pragmatic and affective aspects, the underlying conceptual
representation is so rich and complex that it can only be expressed by using that particular
word. That is, natural languages probably do not contain word-to-phrase synonymy pairs or
hyperonyms as defined by Levelt (1989). Therefore, there is no convergence problem to be
solved.

wants to refer to the animal that tries to rip off his pants. In the first stage of lexical
access that is only based on core meaning, the words creature, beast, animal, dog, Doberman,
and somewhat more unlikely, pooch, all become active. Next, a selection process has to be
assumed that selects the most adequate word from this set. Evidently, this process has to be
quite complex. First, it should know about the specific characteristics of the situation and
has to consider questions like: Is formal language required? Will a slang word be understood?
Does one of these words convincingly express the ferocity of the attacker? Will the addressee
understand that I am referring to a dog when I use the word Doberman? Second, to allow for the
lexical selection process to make its selection, the activated words have to carry these
meaning components with them.
The Preverbal Message In the Lexical Selection section, the probability
of such an intelligent selection mechanism is
Possibly, psycholinguists who tried to nd solu- discussed in combination with the idea that
tions for the convergence problem implicitly as- word representations contain tags that, for in-
sumed that the meaning of a word can be stance, represent their pragmatic and affective
decomposed in a set of core elements and some characteristics. To anticipate the conclusion, this
paraphernalia that include, for instance, the prag- solution is highly unparsimonious given that all
matic and affective factors discussed in the previ- information needed to retrieve the appropriate
ous sections. Moreover, the assumption may have word is available in the conceptual system. The
been that these two sets of meaning elements play most likely and most parsimonious answer to
a different role in lexical access. The rst step, in the issues raised in this section is therefore that
which a set of lexical items is activated, may be the preverbal message contains all information
based on the core meaning, whereas ner selection necessary to select the appropriate word. This in-
from this set, on the basis of pragmatic and affec- formation includes cues as SLANG WORD AP-
tive factors, is postponed until later processing PROPRIATE, EUPHEMISM PREFERABLE, and
levels. With respect to such a view, two questions FIRST NAME ALLOWED.
can be raised. First, is it realistic to assume a hard
distinction between the core meaning of a word
and other aspects of its meaning? Second, how does Is Lexicalization Limited to
the more rened selection of words at a later pro- the Preverbal Message?
cessing level take place? I briey discuss these two
issues in turn. I proposed that lexical access is based on a complex
After what has been said about the complexity nonverbal representation that contains all neces-
and fuzziness of word meaning, it seems very un- sary information to arrive at the correct word. It
likely that nature has endowed us with a system seems reasonable to assume that this complex
that neatly distinguishes between two sets of representation is all that we use during the process
meaning components and uses them in different of lexical access. Indeed, in Levelts (1989) original
ways. I can refer to Wittgenstein (1953) again: blueprint of the speaker, the preverbal message
What is the core in the meaning of words like game was the only input to the formulator. Therefore, it
and art? Are the strong emotional connotations of may come as a surprise that current models of lexical
the word mother part of the core or just para- access (e.g., Levelt et al., 1999; Starreveld & La Heij,
phernalia, as female parent suggests? 1996) assume that during lexical access all activated
Despite these doubts, assume for a moment that concepts activate lexical representations. These
word meanings can be subdivided, and that only activated concepts may be part of the preverbal
the core meaning is used in the rst step of lexical message, but could also be activated by other, ir-
access. Examine the following example: A speaker relevant, objects in the speakers environment or by
spreading activation within the conceptual system. For example, if a speaker is asked to name the picture of a dog, spreading activation from the concept DOG to related concepts like CAT and HORSE will result in the activation of the words cat and horse at the lexical level (see Fig. 14.1 for an illustration).
The main reason for this somewhat counterintuitive assumption seems to be that it realizes in a simple way the activation of a cohort of semantically related words during lexical access. Evidence for the activation of a semantic word cohort comes from a number of observations: the occurrence of semantically related speech errors (sister instead of mother), the occurrence of blends of semantically related words (e.g., stummy, a blend of stomach and tummy; Fromkin, 1973), and the semantic interference effect in word production, discussed at the beginning of this chapter (see Glaser & Glaser, 1989; Roelofs, 1992; and Starreveld & La Heij, 1996, for details). Whereas such a cohort activation may result naturally from a feature-based conceptual representation, models in which holistic conceptual representations are assumed may need the spread of activation in combination with parallel lexical access to arrive at the same result.
Despite these considerations, the idea that during lexical access all activated concepts activate their lexical representations seems rather counterintuitive. Moreover, recent observations in our laboratory (Bloem & La Heij, 2003) provided strong evidence against this assumption. We used a language production task in which the to-be-named targets were accompanied by either context words or context pictures. As expected, context words induced semantic interference. However, the corresponding context pictures induced semantic facilitation, contrary to the predictions derived from computer implementations of the Levelt et al. (1999) and the Starreveld and La Heij (1996) models. In these models, context pictures automatically activate their names at the lexical level, which renders the situation very similar to the word-context condition. To account for semantic facilitation with picture context, we proposed two modifications of current models: (a) Lexicalization is confined to the preverbal message (as proposed by Levelt, 1989), and (b) this preverbal message activates, in addition to the sought-for word, a cohort of semantically related words (as originally proposed in Morton's 1969 and in Levelt's 1989 models of speech production). These assumptions are illustrated in Fig. 14.2.
Conclusions
The main point that I have tried to make in this section concerns the content of the preverbal message. Arguments were presented in favor of the view that the preverbal message contains all
Figure 14.2 The model in Fig. 14.1 modified so that lexicalization is confined to one preverbal message,
and the preverbal message contains cues that further specify the sought-for word.
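The arrangement in Fig. 14.2 can be made concrete with a toy example. The following Python sketch is only an illustration under stated assumptions: the mini-lexicon, the feature and cue labels, and the overlap-based activation rule are hypothetical and are not taken from any of the implemented models discussed in this chapter. It shows a preverbal message that includes a register cue activating a cohort of semantically related words, after which a simple selection step picks the most active word.

```python
# Hypothetical mini-lexicon: each word is described by a set of conceptual
# features and cues. All labels are invented for illustration only.
LEXICON = {
    "dog":   {"ANIMAL", "CANINE", "PET", "NEUTRAL_REGISTER"},
    "pooch": {"ANIMAL", "CANINE", "PET", "SLANG_WORD_APPROPRIATE"},
    "cat":   {"ANIMAL", "FELINE", "PET", "NEUTRAL_REGISTER"},
    "horse": {"ANIMAL", "EQUINE", "NEUTRAL_REGISTER"},
    "pen":   {"OBJECT", "WRITING_TOOL", "NEUTRAL_REGISTER"},
}


def activate(preverbal_message, lexicon=LEXICON):
    """Activation of a word = share of message elements that the word matches."""
    return {
        word: len(features & preverbal_message) / len(preverbal_message)
        for word, features in lexicon.items()
    }


def select(activations):
    """Simple, local selection: take the most highly activated word."""
    return max(activations, key=activations.get)


# The message carries the conceptual content plus a cue about register.
message = {"ANIMAL", "CANINE", "PET", "SLANG_WORD_APPROPRIATE"}
activations = activate(message)
chosen = select(activations)
```

With the cue SLANG_WORD_APPROPRIATE in the message, pooch ends up more active than dog, while cat and horse are only partially activated; replacing the cue with NEUTRAL_REGISTER favors dog instead. No checking mechanism is involved in this sketch: the outcome depends only on what the message contains.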
information necessary to retrieve the word that realizes the speaker's communicative goal. To that end, the speaker has to take into account a large amount of information about the context of the conversation and about the characteristics of the addressee. Given the complexity of the preverbal message, it seems unlikely that situations will arise in which this message specifies more than one lexical item. As a consequence, there is no convergence problem to be solved. In addition, I discussed evidence that only the preverbal message is capable of activating words at the lexical level. To account for incidental speech errors, blends, and semantic interference in the picture-word task, it has to be assumed that lexical access is not perfect. That is, in addition to the correct word, other words with meanings that partly overlap the contents of the preverbal message become activated.
Models: Selection and Control Processes
All computer implementations of the models in the previous sections of this chapter consist of processing levels containing nodes that represent conceptual/semantic and lexical representations. These nodes are connected, unidirectionally or bidirectionally, by links. Although the nature of the links differs, they all share the capability to pass activation from one node to another, resulting in an increase (excitatory connections) or decrease (inhibitory connections) in activation level of the receiving node. In the implemented models, the presentation of a stimulus picture is simulated by raising the activation level of the nodes either at an early visual level or at the conceptual level. From that moment, activation spreads through the model.
This spread of activation in itself, however, typically does not result in a response. To produce responses, the task instructions (e.g., "name the red picture and ignore the green picture") have to be implemented in the model. This is realized by selection or control processes. In most models, two types of selection or control processes can be distinguished. Under the assumption that all stimuli presented are identified in parallel, the first type of selection process determines what activated conceptual information has to be lexicalized (compare Allport's, 1987, and Van der Heijden's, 1996, notions of "selection for action"). In a task in which, for instance, two pictures are presented, one in red and one in green (e.g., Humphreys, Lloyd-Jones, & Fias, 1995), this process has to make sure that the conceptual representation of the red picture controls the response. I use the term concept selection for this process.
As discussed, during the process of lexicalization not only the intended word, but also semantically related words will become activated to some extent. Lexical selection is the process that selects one word from this set of candidates for further processing (phonological encoding or articulation). Concept selection and lexical selection are discussed in turn.
Concept Selection
Concept selection is discussed with the help of the connectionist Selective Attention Model (SLAM) of Phaf, Van der Heijden, and Hudson (1990). SLAM was developed to simulate tasks in which an aspect of one visual stimulus (its form, color, or position) had to be named and other stimuli in the visual field had to be ignored. So, although the model was not developed to provide an account of lexical access, it does simulate tasks that are very similar to the usual Stroop-like interference tasks used in language production research.
SLAM assumes three processing levels: (a) an early mapping level; (b) a feature level, which consists of three specialized modules: a form (or identity) module, a position module, and a color module; and (c) a response (word) level. Stimulus presentation is simulated by activating the corresponding representations at the mapping level. These representations combine two stimulus features, for example, red left and blue square. Activation spreads from the mapping level to the identity, position, and color modules at the feature level. Next, activated feature representations in these modules (e.g., RED, BLUE, LEFT, and CIRCLE) send activation to the corresponding names at the response level (e.g., red, blue, left, and circle).
When a stimulus is presented without instruction, the implemented SLAM model gives no response or a random response. A theoretical analysis of a typical instruction in a filtering task (e.g., "name the color of the figure on the left") revealed, in addition to the "name" instruction that is not further discussed, two essential components. First, the part of the instruction that says that the color aspect of the figure (not its position or form) has to be named is supposed to result in a preactivation of
Selection Processes in Lexical Access 297
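The node-and-link architecture described above, together with the idea that task instructions act by giving the target concept extra activation, can be sketched in a few lines of Python. This is a minimal illustration, not a reimplementation of SLAM or of any other model cited here: the node names, link weights, decay rate, number of update steps, and the size of the task-related boost are all assumptions chosen for readability.

```python
from collections import defaultdict

# Hypothetical network: (sender, receiver) -> link weight.
# Positive weights are excitatory, negative weights inhibitory.
LINKS = {
    ("DOG", "dog"): 0.8,    # concept -> word
    ("CAT", "cat"): 0.8,
    ("DOG", "CAT"): 0.3,    # spreading activation between related concepts
    ("CAT", "DOG"): 0.3,
    ("dog", "cat"): -0.2,   # inhibition between words (an assumption)
    ("cat", "dog"): -0.2,
}


def spread(activation, links=LINKS, steps=3, decay=0.5):
    """Propagate activation along the links for a fixed number of steps."""
    act = defaultdict(float, activation)
    for _ in range(steps):
        incoming = defaultdict(float)
        for (sender, receiver), weight in links.items():
            incoming[receiver] += weight * act[sender]
        nodes = set(act) | set(incoming)
        # Each node keeps part of its old activation and adds its net input;
        # activation is not allowed to drop below zero.
        act = defaultdict(
            float, {n: max(0.0, decay * act[n] + incoming[n]) for n in nodes}
        )
    return dict(act)


def present(stimuli, target=None, boost=0.5):
    """Activate the presented concepts; a task instruction boosts the target."""
    start = {concept: 1.0 for concept in stimuli}
    if target is not None:
        start[target] += boost
    return spread(start)


without_instruction = present(["DOG", "CAT"])
with_instruction = present(["DOG", "CAT"], target="DOG")
```

Without the boost, dog and cat end up equally active in this toy network, so activation alone gives no principled basis for a response; with the boost on the target concept, dog comes out ahead of cat at the word level.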
is thought to be a rather complex process that uses information about, for instance, the task instructions and the source of activation of words. Roelofs's (1992) original model of lexical access belongs to this class of models. As in Starreveld and La Heij's (1996) model, the conceptual representation of the target concept is given additional activation ("signalling activation"). However, Roelofs also assumed that when activation spreads along the links of the network, it leaves tags at each node reached, specifying the source of the activation. So, in a picture-word interference task, there are "picture tags" and "word tags" attached to nodes at the conceptual and lexical level. In addition, permitted response words in an experiment (or all words when no response set is predefined) receive a "flag." In the picture-word task in which the picture has to be named, the lexical selection process determines which word has both a picture tag and a flag. Although this combination uniquely specifies the correct response word, it is assumed that selection cannot take place before a difference threshold in activation has been reached.
In simple object-naming tasks, Roelofs's (1992) and Morton's (1969) lexical selection mechanisms probably perform equally well. That should be reason enough to prefer the simplest solution. But, do these selection processes perform equally well under all conditions? That is the topic of the next section.
Problems in Lexical Selection
In a task in which a number of objects are presented (e.g., two pictures), one of which has to be named aloud, the models discussed in the previous sections probably perform equally well. Concept selection results in an increase in activation of the target concept relative to other activated concepts. Spreading activation to the lexical level will activate a number of words, but ultimately leads to the selection of the correct name, even when selection is only based on an absolute activation threshold, as in Morton's (1969) logogen model. However, there are at least two findings that pose problems for simple threshold models.
First, these models run into problems when they have to account for naming performance in Stroop-like tasks. In these tasks, a nonverbal target is accompanied by an incorrect context word. All current models assume that words activate their lexical representations faster and stronger than nonverbal stimuli do. As a consequence, shortly after the presentation of, for example, the picture of a dog with the word cat superimposed, the incorrect word cat will be highly activated. How does the system prevent this word from selection and articulation? Within research on selective attention, this question proved to be one of the most difficult to answer (see, e.g., Keele, 1973; Morton, 1977; Van der Heijden, 1981). Two types of solutions can be distinguished: solutions that maintain the assumption that lexical selection is only based on activation levels and solutions that involve checking processes of some sort.
Probably the simplest solution was implemented in the connectionist model of lexical access proposed by Starreveld and La Heij (1996; see also Cohen et al., 1990). These authors adjusted the parameters in the model in such a way that, within the stimulus-onset asynchrony range examined, the mere presentation of a word does not suffice to make its lexical representation reach the difference threshold. That is, even when a single word is presented, task activation is necessary to make the word reach the difference threshold. This solution is elegant because of its parsimony, but not everyone will be convinced that task activation is necessary to select the lexical representation of a visually presented word.
In SLAM (Phaf et al., 1990), a somewhat different solution is chosen. At the lexical level, a node is assumed that is highly active at the start of a trial and that inhibits the activation of all other lexical representations for a short period after stimulus onset. This inhibition prevents the early selection of the distractor word. By the time the inhibition is released, the name of the target color has received enough activation to compete with the distractor word for selection. This solution produces the desired effect, but also has a strong ad hoc flavor and raises a number of questions. For instance, is the inhibitory node always active, also when it is only detrimental to performance, as in a simple word-reading task?
All other solutions for correct naming performance in Stroop-like tasks assume some sort of intelligent checking mechanism. Morton (cited in Van der Heijden, 1981) accounted for the small number of errors in the Stroop task in the following way: "A response is produced, from one or another source, and we then have to check the response against the stimulus to confirm that it is the one we want" (p. 126). This, however, is easier said than done. As emphasized by Van der Heijden (1981), words like check and verify have to be regarded
with great suspicion. One problem is that word representations and color representations are in a completely different format and will never lead to a match when directly compared. This comparison can only be performed by a mechanism that "knows" which word corresponds to which color. The mechanism that possesses that knowledge is our language comprehension system. So, Morton's solution may come down to a simple lexical selection mechanism based on activation thresholds, followed by a checking mechanism that seems identical to Levelt's (1989) monitor. That is, before articulation, the system determines whether the word that has become available is correct. If not, a new attempt is made.
Roelofs's (1992) original model goes one step further by assuming that the lexical selection process itself performs a check. In a picture-word interference task, for instance, this process is in some way informed about the instruction "name the picture and ignore the word" and searches for the word that was activated by the picture (the word that has a picture tag). Note that this system has to be highly flexible: To account for lexical selection in the picture-picture task reported by Humphreys et al. (1995), for instance, the assumption has to be that the lexical representations receive red picture tags and green picture tags, and that, on the basis of the instruction, the lexical selection mechanism searches for a word with a red picture tag. Levelt et al. (1999) proposed a somewhat different solution (called "binding by checking") in which, before lexical selection takes place, a process determines whether the conceptual representation that corresponds to a highly activated word possesses a tag indicating that it is the message concept.
Four arguments can be raised against these and similar proposals in which lexical selection is a relatively complex and flexible process that uses information from outside the lexicon to arrive at the correct word. A first argument is that the proposed solutions are complex but probably hardly ever needed. In most situations, concept selection suffices to ensure that the lexical representation that reaches the highest activation level is the sought-for word. As argued here, there may be more parsimonious solutions to the problem of preventing incorrect responses under rather unusual conditions, such as those in the Stroop task.
Second, lexical selection in Levelt et al.'s (1999) model seems to serve as yet another monitor that checks whether a word is really the intended one. As a consequence, the model performs such a check at three levels: before lexical selection, before articulation, and after articulation. Indeed, as noted by Santiago and MacKay (1999), "we need some principle in the theory that limits how many checking mechanisms check checking mechanisms" (p. 55).
Third, in Roelofs's (1992) and Levelt et al.'s (1999) models, lexical selection solves important problems, but its workings are not spelled out in much detail. For example, in Roelofs's original proposal, the lexical selection mechanism can be informed about the task instruction ("name the picture, ignore the word") and can use that information in its search for the word with the correct tag. In Levelt et al.'s proposal, production rules at the lexical level test for the presence of a tag at the conceptual level. Moreover, to account for occasional selection errors, it is argued that these selection mechanisms may suffer from lapses. It is unclear in these accounts what the nature is of the tags, how tagging is achieved, how tags are read (see also Li, 1998), and how the lapses should be interpreted. Given this lack of detail and the difficulty in deriving testable predictions from these accounts, it is not too surprising that concerns have been raised about their practically unconstrained explanatory power (Santiago & MacKay, 1999).
A final argument is that the lexical selection mechanisms proposed by Roelofs (1992) and Levelt et al. (1999) seem to violate Levelt's (1989) original and, in my view, very attractive conceptualization of lexical access. Levelt stressed its speed and proposed that the retrieval of words is in a certain sense automatic: "Formulating and articulating are underground processes . . . that are probably largely impenetrable to executive control even when one wishes otherwise" (p. 22). And: "the Grammatical Encoder needs only one kind of input: preverbal messages. . . . In order to do its work, it need not consult with other processing components [italics added]. The characteristic input is necessary and sufficient for the procedures to apply" (p. 15).¹
This idea of modularity stands in marked contrast to Levelt et al.'s (1999) proposal in which a production rule at the lexical level does consult other processing components: Before lexical selection takes place, it checks for the presence of a tag at the conceptual level. In Roelofs's (1992) original model, lexical selection seems, at least to some degree, under executive control. First, as discussed, the lexical selection mechanism is informed about the tag it should be looking for (e.g., a picture tag, word tag, or red picture tag). In addition, the erroneous selection of a word that does not
possess the correct tag is attributed to a lapse of attention. This raises the interesting question of whose lapse this is. If it is a lapse of the speaker, lexical selection is evidently under executive control, contrary to Levelt's (1989) proposal. If it is a lapse of attention of the lexical selection mechanism itself, Santiago and MacKay's (1999) homunculus concern seems justified.
I conclude that the fact that speakers are able to produce the correct response in Stroop-like tasks poses a problem for simple, activation-based conceptualizations of lexical selection. However, given the strong drawbacks of more complex, intelligent selection mechanisms that I discussed, it seems worthwhile to further investigate alternatives, including (a) the idea that a selection threshold can only be reached when there is the intention to produce a naming response (as in Starreveld & La Heij's [1996] model) and (b) the possible role of Levelt's (1989) monitor in preventing the production of erroneous responses (as in Morton's [1977] account).
The second finding that poses a problem for simple activation-based lexical selection models was reported by Glaser and Düngelhoff (1984). These authors asked their participants to name pictures (e.g., the picture of a dog) at a superordinate category level (e.g., animal). As distractor words, they used correct basic-level names (dog), semantically related basic-level names (e.g., cat) and unrelated words (e.g., pen). All current models of language production assume that the activated concept DOG will send activation to the lexical representation of the word dog, which should make it a stronger competitor in the selection of the correct response word animal than the unrelated distractor word pen. However, the results showed otherwise. In comparison to the unrelated context word PEN, the context word DOG facilitated the naming of the picture of a dog as animal.
Starreveld and La Heij's (1996) model is unable to account for this finding. Roelofs (1992) provided an account with the help of the response set mechanism discussed in the Lexical Selection section. The idea is that, in a hyperonym-naming task, only the lexical representations of the permitted category names receive a flag. Because the lexical representation of the context word DOG has no flag, it is simply ignored in the process of lexical selection. However, it does send activation to the concept DOG, and therefore its effect will be facilitatory.
Roelofs's (1992) suggestion that lexical representations can be neatly divided into two sets, one set that is considered by the lexical selection process and one set that is ignored, has been severely challenged. One of the predictions that can be derived from this response set hypothesis is that words that are not part of the response set should not induce semantic interference. Starreveld and La Heij (1999) drew attention to experimental results that contradicted this prediction, and Caramazza and Costa (2000, 2001; but see Roelofs, 2001) provided compelling experimental evidence that contradicted Roelofs's proposal.
In addition, at a theoretical level it could be argued that Roelofs's (1992) response set mechanism is too artificial and too discrete. The all-or-none principle (flag or no flag) may work with a response set of four, but it seems highly unlikely that when 10 to 20 target pictures are used, all of the picture names are neatly flagged. Data obtained by La Heij and Vermeij (1987) suggested a more realistic account of response set effects. In a picture-word interference task, they used the response set sizes two, four, and eight. Interference effects, defined as the difference between context words that were part of the response set and context words that were not, gradually decreased with increasing set size (17 ms, 9 ms, and 3 ms with set sizes of two, four, and eight, respectively). The authors took this finding as support for an activation level account of the response set effect: Words often repeated in an experiment get a higher baseline activation level than words not used as responses. So, instead of an all-or-none flagging solution, a gradual difference in activation level was argued to account for the response set effect (see Cohen et al., 1990, for a similar suggestion).
However, Roelofs's (1992) account of Glaser and Düngelhoff's (1984) facilitation effect completely hinged on a strict dichotomy between acceptable words and unacceptable words. That is, a gradual difference in activation level of lexical representations cannot account for Glaser and Düngelhoff's (1984) observation of the identical-facilitation effect in a category-naming task. So, if we accept that the flagging of words is not a realistic option, the conclusion has to be that current models of language production cannot account for this result. Clearly, this is an important target for future research in the area of language production (see also Vitkovitch & Tyrrell, 1999). I return to Glaser and Düngelhoff's finding in the next section.
In this section, I discussed the processes of concept selection and lexical selection and showed that models of language production strongly differ in their assumptions about lexical selection. I made a case for a simple, activation-based lexical
selection mechanism and provided arguments against complex lexical selection mechanisms that use information from outside the mental lexicon. If we stick to Levelt's (1989) original view that only one input, the preverbal message, is necessary and sufficient for retrieving a word in the mental lexicon, there is only one way to ensure that the correct word is retrieved: provide it with the correct input. This is exactly the approach taken in the SLAM model and, in a very simple way, in Starreveld and La Heij's (1996) model. As discussed, in SLAM the task instructions do not affect the workings of the lexical selection mechanism, but only control the input to the lexicalization process.
Models: Extension Toward Bilingualism
Response Language as Cue in the Preverbal Message
One of the central issues in research on bilingual lexical access can be phrased in the following way: How is it possible that an English-French bilingual systematically uses the word dog in one situation and chien in another situation, despite the fact that both words express the same conceptual content? Formulated this way, the issue is simply another example of the convergence problem discussed in the first section of the chapter. The solution that I proposed there can then be readily applied: Dog and chien do not express the same meaning. The meaning of dog has to contain a feature that it is an English word and the meaning of chien that it is a French word. That is, in a bilingual speaker the intention to speak in L1 or a second language (L2) is part of the preverbal message, just as the intention to use formal language, a slang word, a category name, or a euphemism.
This conclusion is completely in line with conclusions reached by a number of researchers in the area of bilingualism. De Bot (1992) concluded that Levelt's (1989) definition of registers (e.g., telegraphic speech and motherese) included an L2, and that information about which register to use is present in the preverbal message. De Bot and Schreuder (1993) and Paradis (1987) also assumed that there is no theoretical difference between the different registers used by a monolingual and the languages spoken by a bilingual. These and other authors (e.g., Green, 1993, 1998; Poulisse, 1997; Poulisse & Bongaerts, 1994) all assumed that language is indeed one of the cues that is used during lexical access. One question that arises then is whether the cue ensures that only words in the intended language become activated.
Coactivation of Words in the Nonresponse Language
As illustrated in Figs. 14.1 and 14.2, it is generally assumed that during lexical access not only the correct response word, but also semantically related words become somewhat activated. As discussed, this coactivation of semantic neighbors accounts for the observation of semantic errors, blends of semantically related words, and the semantic interference effect in the picture-word task. Given this assumption and the fact that translation equivalents often have an almost perfect match with respect to their semantic content, it seems reasonable to assume that during lexical access also words in the nonresponse language are activated to some extent.
There is also empirical evidence in favor of this view (see also Costa, chapter 15, this volume). First, bilinguals speaking in their L2 may accidentally insert words from their L1. In line with the above argument, Poulisse and Bongaerts (1994) related these performance switches to semantic substitutions in L1. Even blends of words from two languages have been observed (e.g., springling from English spring and German Frühling). Second, Colomé (2001) showed that making a phonological decision about the name of a target picture in L2 is affected by the phonological content of the name of the target picture in L1. Third, a number of researchers (Costa, Miozzo, & Caramazza, 1999; Hermans, Bongaerts, De Bot, & Schreuder, 1998) reported a semantic interference effect in picture-word interference tasks in which the pictures had to be named in one language and the context words were presented in the other language of the bilingual. The fact that naming the picture of a dog as dog takes longer when the Dutch word paard (horse) is superimposed than when the Dutch word stoel (chair) is superimposed can be taken as evidence that during the retrieval of the L1 word dog also the semantically related L2 word paard received some activation.
Lexical Selection in Bilinguals: Language Specific or Nonspecific
I summarized my view on lexical access in monolinguals and bilinguals as complex access, simple
selection." Access is complex in the sense that the preverbal message contains all the relevant information, including the intended language. During lexical access, not only the sought-for word, but also many semantically related words become activated, including words in the nonintended language. Lexical selection is a simple, local process that is only based on the activation levels of words.

There is only one model of bilingual access that takes this, or at least a very similar, position. Poulisse and Bongaerts (1994; see also Poulisse, 1997) presented a model in which the presence of a language cue (or language component) in the preverbal message suffices to produce words in the intended language: "Conceptual information and the language cue work together in activating lemmas of the appropriate meaning and language. In other words, language is one of the features used for selection purposes" (Poulisse, 1997, p. 216). Interestingly, as argued by the authors, this model also provides a satisfactory account of three phenomena pertinent to any model of bilingual access: the ability to separate languages, code switching (rapid switching between the two languages), and accidental intrusions from the nonintended language. The ability to separate languages follows from the use of a language cue at the conceptual level. For example, when the speaker intends to use L2, L2 words receive more activation than the corresponding L1 words. Code switching can be fast because there is no active inhibition of words in one of the two languages, and intrusions may either result from the failure to use the correct language cue or from incidental cases in which the word in the unintended language reaches a higher activation level than the intended word (e.g., because of priming effects).

All other models of bilingual lexical access assume that lexical selection cannot be that simple and propose a selection or control process that (a) completely restricts selection to words in one language (language-selective models; see, e.g., Costa et al., 1999) or (b) selectively activates or inhibits words in one of the languages (most language-nonselective models, e.g., Green, 1993; see Poulisse, 1997, for an overview). Within the latter type of models, inhibition may occur proactively (before lexical access) or reactively (after the activation of words in both languages; see Green, 1998). I discuss language-selective and language-nonselective models in turn.

An important argument in favor of language-selective models is the finding, reported by Costa et al. (1999; see also Hermans, 2000, Experiment 4.1), that picture naming in L2 is facilitated by the presence of the picture's name in L1 (in comparison with an unrelated word in L1). For example, for an English-Dutch bilingual, the naming of a picture of a dog in L2 (hond) is facilitated by the presence of the context word dog in comparison to the unrelated word pen. Costa et al. (1999) concluded from this translation-facilitation effect that the word dog, which should have been a very strong competitor for the correct word hond, is simply not taken into consideration by the lexical selection process (see also Costa, chapter 15, this volume). To allow for language-specific selection, Costa et al. assumed tags that indicate whether a word belongs to L1 or L2.

Interestingly, the translation-facilitation effect bears a clear similarity to Glaser and Düngelhoff's (1984) identical-facilitation effect in category naming discussed in the Problems in Lexical Selection section. Apparently, the distractor word dog facilitates both the naming of a picture of a dog as animal (Glaser & Düngelhoff) and as perro (the Spanish word for DOG; Costa et al., 1999). Also, the accounts are somewhat similar. Roelofs (1992) assumed that in naming the picture of a dog as animal, the distractor word dog can induce facilitation because it is completely ignored by the lexical selection mechanism. Likewise, Costa et al. assumed that in naming the picture of a dog as perro, the distractor word dog can induce facilitation for the same reason. In Roelofs's account, the distractor word dog can be ignored because it is not flagged. In Costa et al.'s account, the distractor word dog can be ignored because it does not have a Spanish word tag.

Caramazza and Costa (2000) rejected Roelofs's account on the basis of their finding that words that do not belong to the set of permitted responses in an experiment (words that do not possess a flag in Roelofs's model) did induce semantic interference. Perhaps somewhat ironically, the same argument can be used against Costa et al.'s language-selective model: Distractor words in the nonintended language do induce semantic interference, which suggests that they are taken into consideration during lexical selection. Costa et al.'s attempt to account for this semantic interference effect by assuming that words in the nonintended language induce this effect via their translation equivalents in the intended language is not entirely convincing.

A second argument against language-selective models is theoretical. As noted, there is a striking similarity between Costa et al.'s (1999) and Hermans's (2000) translation-facilitation effect and
Glaser and Düngelhoff's (1984) identical-facilitation effect in picture categorization. Also, the time courses of both effects are remarkably similar: Hermans (2000, Experiment 4.1) used stimulus-onset asynchrony values of 300 ms, 150 ms, and 0 ms and obtained facilitation effects of 51 ms, 40 ms, and 3 ms, respectively. The corresponding facilitation effects in Glaser and Düngelhoff's (1984) categorization task were 53 ms, 39 ms, and 16 ms, respectively. If Costa et al.'s (1999) idea of language-specific selection (with the help of language tags) is applied to Glaser and Düngelhoff's (1984) finding, it must be concluded that the process of lexical selection can ignore all words that belong to a certain categorization level (e.g., the basic level) with the help of categorization-level tags.

Do two types of tag suffice? Probably not. We can think of quite a number of word sets that might produce between-set facilitation. Imagine, for instance, a picture–word interference experiment in which pictures of famous people have to be named by their first name (e.g., George). It does not seem unlikely that naming will be faster when the picture is accompanied by the correct last name of the person (e.g., Bush) than when accompanied by an incorrect last name (e.g., Dylan), despite the fact that Bush could be viewed as a strong competitor in retrieving the correct response George. If this prediction were borne out, the logic followed above would force the assumption of first-name tags and last-name tags attached to people's names in the mental lexicon. In addition, a similar facilitation effect obtained in an experiment in which participants respond to pictures by the associated action (e.g., chair–sit, radio–listen) would force the assumption of part-of-speech tags.

I leave it to the reader to think of other sets of words that would show the same effect, but the message will be clear: This approach leads to a proliferation of tags at the lexical level. Clearly, this solution is highly unparsimonious. Many of the semantic features of words that are represented in the conceptual system have to be duplicated in the lexicon in the form of the isolated pieces of information called tags. In combination with the finding discussed above that words from the ignored nonresponse language induce semantic interference, the conclusion has to be that language-selective models are difficult to maintain. However, it should be stressed that all current models have difficulty in accounting for the facilitation effects reported by Glaser and Düngelhoff (1984) and Costa et al. (1999). Understanding these effects is a challenge for future research in language production.

With the exception of the model of Poulisse and Bongaerts (1994), all language-nonselective models assume that words in one of the two languages of the bilingual can be selectively inhibited or activated to some degree. I have argued that such a process is superfluous when it is assumed that the preverbal message contains a language cue. In the model depicted in Fig. 14.3, this assumption is
Figure 14.3 The model in Fig. 14.2 is extended for bilingual language production. The language cue is part
of the preverbal message, and words in both languages become activated. Because of the language cue, the
intended name will reach the highest activation level and will be selected.
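The selection principle summarized in this caption can be illustrated with a small sketch. It is not an implementation of the chapter's model: the feature sets, the "lang:" labels, the four-word lexicon, and the rule that activation equals the number of matched message features are all assumptions made for illustration. The point it shows is only that, once the language cue is one feature of the preverbal message, choosing the most activated word is enough to yield the word in the intended language, with no tags consulted and no inhibition applied at selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str
    features: frozenset  # hypothetical features, including the word's language


def activation(message: frozenset, word: Word) -> int:
    """Complex access: one unit of activation per matched message feature."""
    return len(message & word.features)


def select(message: frozenset, lexicon: list) -> Word:
    """Simple selection: return the most activated word and nothing more."""
    return max(lexicon, key=lambda word: activation(message, word))


# Hypothetical English-Dutch mini-lexicon (illustrative only).
LEXICON = [
    Word("dog", frozenset({"animal", "canine", "lang:en"})),
    Word("hond", frozenset({"animal", "canine", "lang:nl"})),
    Word("horse", frozenset({"animal", "equine", "lang:en"})),
    Word("paard", frozenset({"animal", "equine", "lang:nl"})),
]

# Preverbal message for a dog, with Dutch as the intended language.
message = frozenset({"animal", "canine", "lang:nl"})
selected = select(message, LEXICON)
print(selected.form)
```

In this toy setting the translation equivalent dog is still activated (it matches two of the three message features), which is the sense in which access is nonselective while selection stays simple.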
incorporated: The language cue that is added to the preverbal message ensures that words in the intended language reach a higher activation level than words in the nonintended language.

One argument that proponents of selective inhibition use is Meuter and Allport's (1999) paradoxical asymmetry effect in language switching. Meuter and Allport reported that it takes bilinguals more time to switch from L2 to L1 than to switch from L1 to L2, a finding that is often interpreted as reflecting the strong inhibition of L1 while speaking in L2. Costa (chapter 15, this volume) critically discussed this interpretation of Meuter and Allport's results and concluded that the available evidence does not support the presence of an inhibitory mechanism in proficient bilinguals. Here, I would like to point out that throughout their article Meuter and Allport seemed reluctant to choose between two possible interpretations of their findings: the inhibition of (words within) a language or the inhibition of a task set. In fact, in their discussion the latter interpretation was given more emphasis: "The negative priming arises from the active inhibition of one of two mutually competing tasks (or languages)" (p. 35). In terms of the model that I propose, the asymmetry effect in language switching may then reflect processes involved in the incorporation of the language cue in the preverbal message.

It should be noted that my arguments do not imply that words in one of the bilingual's languages cannot reach a higher activation level than words in the other language, even on a more permanent basis. We have seen that all models of lexical access assume that each attempt to retrieve a word results in the activation of a set of semantically related words. If we make the assumptions that (a) because of the language cue, semantically related words in the intended language become more strongly activated than semantically related words in the nonintended language; (b) words spread activation to associatively related words within the lexicon (see Alario, Segui, & Ferrand, 2000; La Heij, Dirkx, & Kramer, 1990); and (c) activated words return to a somewhat higher baseline level of activation (as in Morton's 1969 logogen model), then the repeated retrieval of words in one language will ultimately result in relatively high activation levels of the words within that language. Similarly, the nonuse of a language may lower the baseline levels of activation of words in that language, which will make them more difficult to retrieve.

In conclusion, the considerations presented in this section favor the view of bilingual lexical access that was advocated by Poulisse and Bongaerts (1994). To summarize that view, the preverbal message contains a language cue that ensures that the word in the intended language reaches the highest activation level; no additional activation or inhibition processes at the lexical level are needed. The only difference between this view and the one depicted in Fig. 14.3 is that Poulisse and Bongaerts assumed that words possess language tags. However, these tags were viewed as part of the word's meaning representation at the lemma level (in Levelt's 1989 original model that was taken as a starting point by Poulisse and Bongaerts, lemmas contained semantic information). If lexical representations do not contain semantic information, as assumed by Levelt et al. (1999), then a language cue at the conceptual level suffices.

Summary

In this chapter, I examined one aspect of language production: the retrieval of a single word on the basis of conceptual information. First, arguments were presented in favor of the view that to achieve a communicative goal, the preverbal message has to contain cues that define the affective and pragmatic characteristics of the sought-for word. Given the resulting complexity of the preverbal message, it was argued to be unlikely that convergence problems as discussed by Levelt (1989) and Roelofs (1992) arise. In addition, I discussed evidence that not all activated concepts, but only the preverbal message is lexicalized. To account for incidental speech errors, blends, and semantic interference in the picture–word task, it must be assumed that lexical access leads to the activation of a set of semantically related words in the mental lexicon. How a preverbal message is constructed and how it gives rise to the activation of a semantic cohort of words at the lexical level is an important issue for further research. Perhaps it is necessary to return to a decompositional (feature-based) view of conceptual representations (see Caramazza, 1997).

In most models of language production, two selection processes are assumed: a first process in which the relevant conceptual information is selected (concept selection) and a second process that selects one word from the set of activated lexical representations (lexical selection). I argued for a model of lexical access that could be characterized as "complex access, simple selection." That is, lexical selection is based on a complex preverbal message that contains all relevant information to arrive at
the correct word. Lexical selection can then be a simple process that selects one word from the set of activated words on the basis of the activation levels only. An intelligent lexical selection process that selects on the basis of (semantic) information instead of activation (e.g., with the help of various tags attached to words) was rejected on the basis of parsimony and of the assumed modularity of the process of lexical access.

The model that results bears striking similarities to Levelt's (1989) original proposal: Lexical access is automatic in the sense that it delivers a winner on the basis of the information in the preverbal message (and only on that information). Speakers cannot influence lexical access in any other way than selecting (or perhaps constructing) an adequate preverbal message. This model can be extended to the bilingual situation in a very straightforward way. Intended language is part of the preverbal message. The presence of this language cue ensures that the word in the intended language reaches a higher activation level than the translation equivalent in the nonintended language (Poulisse & Bongaerts, 1994). So, there is no need for a selective activation or selective inhibition of words of a particular language, in the same way as there is no need for selective activation or inhibition to account for the production of formal language, slang, taboo words, euphemisms, very high frequency words, and category names by monolinguals.

Finally, I would like to repeat the concerns that were raised by Van der Heijden (1981) and more recently by Santiago and MacKay (1999): We should be extremely reluctant in assuming underspecified control mechanisms that, in combination with convenient tags or flags at convenient places, solve major problems. If there is one thing that this chapter has shown, it is that these mechanisms often induce more problems than they seem to solve.

Acknowledgments

I would like to thank Lex van der Heijden, Patrick Hudson, Gerard Kempen, and Kees de Bot for helpful discussions and comments on an earlier version of this chapter.

Note

1. A nice illustration of this assumed automaticity is the "ugly sister" phenomenon that speakers in a tip-of-the-tongue state may experience (Reason & Lucas, 1984). The attempt to find the correct word sometimes leads to the activation of an incorrect word (the ugly sister) that is immediately rejected by the speaker. However, each new attempt to retrieve the correct word only leads to the reactivation of the ugly sister, which for that reason is also referred to as a blocking word. This phenomenon is exactly what may be expected if lexical selection is an automatic process only based on activation levels: Given a certain input (the preverbal message) and the current levels of activation of the lexical representations, the same output will be produced time and time again.

References

Alario, F. X., Segui, J., & Ferrand, L. (2000). Semantic and associative priming in picture naming. Quarterly Journal of Experimental Psychology, 53A, 741–764.
Allport, D. A. (1987). Selection for action: Some behavioral and neuropsychological considerations of attention and action. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action (pp. 395–419). Hillsdale, NJ: Erlbaum.
Barsalou, L. W. (1982). Context-independent and context-dependent information in concepts. Memory & Cognition, 10, 82–93.
Bierwisch, M., & Schreuder, R. (1992). From concepts to lexical items. Cognition, 42, 23–60.
Bloem, I., & La Heij, W. (2003). Semantic facilitation and semantic interference in word translation: Implications for models of lexical access in language production. Journal of Memory and Language, 48, 468–488.
Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology, 14, 177–208.
Caramazza, A., & Costa, A. (2000). The semantic interference effect in the picture–word interference paradigm: Does the response set matter? Cognition, 75, 51–64.
Caramazza, A., & Costa, A. (2001). Set size and repetition in the picture–word interference paradigm: Implications for models of naming. Cognition, 80, 291–298.
Caramazza, A., & Miozzo, M. (1998). More is not always better: A response to Roelofs, Meyer, and Levelt. Cognition, 69, 231–241.
Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychological Review, 97, 332–361.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: Language-specific or language-independent? Journal of Memory and Language, 45, 721–736.
Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the bilingual's two lexicons compete for selection? Journal of Memory and Language, 41, 365–397.
De Bot, K. (1992). A bilingual production model: Levelt's speaking model adapted. Applied Linguistics, 13, 1–24.
De Bot, K., & Schreuder, R. (1993). Word production and the bilingual lexicon. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 191–214). Amsterdam: Benjamin.
De Kamps, M., & Van der Velde, F. (2001). Using a recurrent network to bind form, color and position into a unified percept. Neurocomputing, 38–40, 523–528.
Dell, G. S. (1986). A spreading activation theory of retrieval in sentence production. Psychological Review, 93, 283–321.
Fodor, J. D. (1980). Semantics. Cambridge, MA: Harvard University Press.
Fromkin, V. A. (1973). Speech errors as linguistic evidence. The Hague, The Netherlands: Mouton.
Glaser, W. R., & Düngelhoff, F.-J. (1984). The time course of picture-word interference. Journal of Experimental Psychology: Human Perception and Performance, 10, 640–654.
Glaser, W. R., & Glaser, M. O. (1989). Context effects on Stroop-like word and picture processing. Journal of Experimental Psychology: General, 118, 13–42.
Green, D. W. (1993). Toward a model of L2 comprehension and production. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 249–277). Amsterdam: Benjamin.
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Harley, T. A. (1999). Will one stage and no feedback suffice in lexicalization? Behavioral and Brain Sciences, 22, 45.
Harley, T. A. (2001). The psychology of language. New York: Taylor & Francis.
Hermans, D. (2000). Word production in a foreign language. Unpublished doctoral dissertation, University of Nijmegen, The Netherlands.
Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1998). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213–229.
Humphreys, G. W., Lloyd-Jones, T. J., & Fias, W. (1995). Semantic interference effects on naming using a postcue procedure: Tapping the links between semantics and phonology with pictures and words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 961–980.
Keele, S. W. (1973). Attention and human performance. Pacific Palisades, CA: Goodyear.
La Heij, W. (1988). Components of Stroop-like interference in picture naming. Memory & Cognition, 16, 400–410.
La Heij, W., Dirkx, J., & Kramer, P. (1990). Categorical interference and associative priming in picture naming. British Journal of Psychology, 81, 511–525.
La Heij, W., & Vermeij, M. (1987). Reading versus naming: The effect of target set size on contextual interference and facilitation. Perception & Psychophysics, 41, 355–366.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–75.
Li, P. (1998). Mental control, language tags, and language nodes in bilingual lexical processing. Bilingualism: Language and Cognition, 1, 92–93.
McCloskey, M., & Glucksberg, S. (1978). Natural categories: Well-defined or fuzzy sets? Memory & Cognition, 6, 462–472.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical costs of language selection. Journal of Memory and Language, 40, 25–40.
Morton, J. (1969). The interaction of information in word recognition. Psychological Review, 76, 165–178.
Paradis, M. (1987). The assessment of bilingual aphasia. Hillsdale, NJ: Erlbaum.
Phaf, R. H., Van der Heijden, A. H. C., & Hudson, P. T. W. (1990). SLAM: A connectionist model for attention in visual selection tasks. Cognitive Psychology, 22, 273–341.
Posnansky, C. J., & Rayner, K. (1977). Visual-feature and response components in a picture–word interference task with beginning and skilled readers. Journal of Experimental Child Psychology, 24, 440–460.
Poulisse, N. (1997). Language production in bilinguals. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 201–224). Mahwah, NJ: Erlbaum.
Poulisse, N., & Bongaerts, T. (1994). First language use in second language production. Applied Linguistics, 15, 36–57.
Reason, J., & Lucas, D. (1984). Using cognitive diaries to investigate naturally occurring memory blocks. In J. E. Harris & P. E. Morris (Eds.), Everyday memory, actions
15
Lexical Access in Bilingual Production
ABSTRACT What is the impact of the lexical and sublexical representations of the language not in use on the bilingual's speech production? Do words from the nonresponse language interfere in language production? In this chapter, I address these issues by comparing two different views of speech production in bilingual speakers: the language-specific and the language-nonspecific views. I focus on how these two views make different claims regarding the extent to which activation flow and selection processes are restricted to one of the two languages of a bilingual. I also discuss the available empirical evidence supporting each hypothesis and propose a tentative explanation that reconciles the seemingly contrasting results. I argue that the available evidence suggests that activation flow is language nonspecific. More controversial are the results regarding the extent to which lexical selection is language specific or nonspecific.
issues is reviewed. Finally, I discuss the need of postulating inhibitory mechanisms to explain the performance of both highly proficient and low-proficient bilingual speakers.

Before addressing the issue of the language specificity of the activation flow and selection processes, it is important to define these two terms in a broader context of speech production. Speech production entails at least three different levels of representation (e.g., Caramazza, 1997; Dell, 1986; Levelt, 1989; Levelt, Roelofs, & Meyer, 1999). First, at the conceptual (or semantic) level the speaker decides which conceptual information to communicate (see Francis, chapter 12, this volume, for a discussion regarding the differences between the semantic and conceptual levels). Second, a lexical level represents lexical items (or words) along with their grammatical properties.1 Third, the phonological code of the words is represented. How do the speakers go through all these levels, choosing the concepts they want to express, the words corresponding to those concepts, and finally the phonemes corresponding to those words? This is a central question in speech production (see Fig. 15.1).

Although the types of representations at each level are quite different (e.g., concepts, words, and phonemes), there are two principles that seem to play a role in all of them: activation and selection mechanisms. Activation refers to the availability of representations at different levels of processing. When a given representation is more available for production, we say that its level of activation is high; when it is less available, we say that its level of activation is low. Speech production starts with the activation of conceptual representations. It is generally assumed that, during conceptual processing, not only the semantic representation of the intended concept but also those representations of semantically related concepts are activated to some degree. That is, when naming the picture of a dog, the target concept (e.g., dog) along with other related concepts (cat, bark, etc.) become activated. The activation of the semantic representations spreads to the lexical system, activating proportionally the corresponding lexical nodes or
Figure 15.1 Schematic representation of the monolingual system. The arrows represent the flow of activation, and the thickness of the circles indicates the level of activation of the representations.
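The proportional spreading of activation from concepts to lexical nodes that this figure depicts can be sketched as follows. The numeric activation levels, the one-to-one concept-to-word links, and the single weight parameter are assumptions chosen for illustration; the chapter itself specifies only that related concepts are activated to some degree and pass activation on proportionally.

```python
def spread(concepts: dict, links: dict, weight: float = 1.0) -> dict:
    """Pass each concept's activation on to its linked lexical nodes.

    Each lexical node receives activation proportional to the activation
    of the concept it is linked to (weight is a hypothetical scaling factor).
    """
    lexical = {}
    for concept, level in concepts.items():
        for word in links.get(concept, []):
            lexical[word] = lexical.get(word, 0.0) + weight * level
    return lexical


# Naming a picture of a dog: the target concept is most active, and
# semantically related concepts are active to a lesser (assumed) degree.
concepts = {"DOG": 1.0, "CAT": 0.4, "BARK": 0.3}
links = {"DOG": ["dog"], "CAT": ["cat"], "BARK": ["bark"]}

lexical = spread(concepts, links)
most_active = max(lexical, key=lexical.get)
print(most_active)
```

The result is a set of several activated word candidates rather than a single one, which is why a separate selection step is needed.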
words. Thus, activation flows from an activated semantic representation to its corresponding lexical node.

If the conceptual representations of several elements spread activation to their corresponding lexical representations, then the system encounters several word candidates for production (dog, cat, bark, etc.). At this point, a decision has to be made regarding which lexical node to choose among the activated ones for further processing. This decision mechanism is called lexical selection. The selection of the intended lexical node will make available its grammatical properties, which in turn will be used to construct the syntactic frame.

There are different views regarding how the lexical selection mechanism works (see La Heij, chapter 14, this volume, for an extended discussion of the different models), but all of them agree that this mechanism is at least sensitive to the level of activation of the intended lexical node. In fact, the dominant view in speech production assumes that the lexical selection mechanism is actually sensitive to the activation of the intended lexical node and to the other lexical nodes that may act as competitors.

Activation from the lexical level also spreads to the sublexical or phonological level. That is, when the lexical node is selected, the next step in the production of speech is the retrieval of its phonological makeup. The issues here regarding activation flow and selection processes are similar to those at the preceding level. First, is activation flow from the lexical level to the phonological level restricted to the selected lexical node, or is it the case that any activated lexical node spreads some proportional activation to its phonological elements? Second, if activation flow is not restricted to the selected lexical node, the question then is whether the selection of the phonological properties of the target lexical node is affected by the activation of those of other nontarget words.

Given the architecture sketched above, the question in the context of bilingual production is the extent to which activation flow (i.e., how information is passed from level to level) and selection processes (i.e., how the system decides which representation is prioritized for further processing) are restricted to only the language spoken by the bilingual (see Fig. 15.2). More precisely, the first issue refers to whether the lexical and sublexical representations of the nonresponse language are activated concurrently with the corresponding representations of the language intended for production. Assuming that the representations of the two languages of a bilingual are activated, the second issue is whether the selection processes are affected by the activation levels of representations that do not belong to the response language. Note that the first issue is independent of the second one. In principle, it is possible that activation flow spreads to the two languages of a bilingual (language-nonspecific flow of activation), and that the selection mechanism is not sensitive to the level of activation of representations that do not belong to the intended language (language-specific selection mechanism). This is an important distinction to be kept in mind when discussing the different possible models of bilingual speech production.

Two Words for One Concept

A fundamental question in the study of bilingual speech production concerns the consequences of having a conceptual representation linked to two different lexical items belonging to two different languages (e.g., De Groot, 1992; Gollan & Kroll, 2001; Kroll & Stewart, 1994; La Heij, Hooglander, Kerling, & Van der Velden, 1996; Van Hell & De Groot, 1998; see also Francis's chapter 12 in this volume for a discussion of the organization of the bilingual's memory representations). In other words, is there any effect of having two lexical nodes for almost every word that the speaker is producing? Unlike synonyms, translations are not interchangeable (e.g., communication will not be very much affected if the word sofa is produced instead of the word couch, but will be disrupted if the speaker says sillón [the Spanish name for couch] instead of couch). This is because in many circumstances a bilingual speaker needs to speak only in one language because the interlocutor may not know his or her other language (e.g., Grosjean, 1997, 1998, 2001), and therefore the production of a target's translation may have disastrous consequences for communication.

Thus, the central issue that needs to be addressed is not so much how a lexical node in a given language is selected (see La Heij's chapter 14 in this volume for a discussion of this problem; models of monolingual speech production address this issue as well) but rather how the existence of lexical representations in one language has an impact on the other language of a bilingual. Any approach to this issue requires some assumptions regarding activation flow and selection processes during speech production.
Bilingual Production 311
Figure 15.2 Schematic representation of the bilingual system. The squares represent the lexical nodes of the
language not in use (Spanish) and the circles the lexical nodes of the language in use (English). The arrows
represent the flow of activation, and the thickness of the circles indicates the level of activation of the
representations. The question marks represent the language-specific and nonspecific activation flow hy-
potheses. If there is activation flow to the two lexicons of a bilingual individual, then the connections
between the semantic representations and the lexical nodes of the language not in use (the squares) will be
functional (the language-nonspecific hypothesis); otherwise, they will not (the language-specific hypothe-
sis). The same applies to the connections between the lexical nodes and the phonological nodes.
Figure 15.3 Schematic representation of the bilingual system. The squares represent the lexical nodes of
the language not in use (Spanish) and the circles the lexical nodes of the language in use (English). The
arrows represent the flow of activation, and the thickness of the circles indicates the level of activation of
the representations. The rectangle represents a lexical selection mechanism. In part a, the selection
mechanism is language specific; that is, it only considers the activation levels of the lexical nodes belonging
to the response language during speech production, rendering any activation of the lexical nodes of the
nonresponse language (represented by the squares) irrelevant during the selection process. In part b, the
selection mechanism is language nonspecific and therefore considers the level of activation of all lexical
nodes irrespective of the language to which they belong.
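
The contrast between parts a and b of Figure 15.3 can be illustrated with a minimal sketch. It is not an implementation of any published model: the class name, the function names, and the activation values are hypothetical, and selection is simplified to picking the most activated candidate. The Spanish node is deliberately given the highest activation, an artificial case chosen only to show where the two mechanisms diverge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalNode:
    """A lexical node with a language tag and an illustrative activation level."""
    word: str
    language: str
    activation: float


def select_language_specific(nodes, response_language):
    """Part a: only nodes of the response language are considered."""
    candidates = [n for n in nodes if n.language == response_language]
    return max(candidates, key=lambda n: n.activation)


def select_language_nonspecific(nodes):
    """Part b: all nodes compete, irrespective of language."""
    return max(nodes, key=lambda n: n.activation)


# Hypothetical activation levels (assumed for illustration, not data).
nodes = [
    LexicalNode("dog", "English", 0.6),
    LexicalNode("cat", "English", 0.3),
    LexicalNode("perro", "Spanish", 0.8),
    LexicalNode("gato", "Spanish", 0.2),
]

specific_choice = select_language_specific(nodes, "English")
nonspecific_choice = select_language_nonspecific(nodes)

print(specific_choice.word)     # dog
print(nonspecific_choice.word)  # perro
```

Under the language-specific mechanism the activation of perro is simply ignored, whereas under the language-nonspecific mechanism some additional device (for example, stronger activation of the response language or inhibition of the nonresponse language) would be needed to keep perro from being selected.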
Schreuder, 1998). That is, words from the nonresponse language also act as competitors.

These two views make different predictions about the role of the nonresponse language during lexical selection. While the language-specific selection hypothesis posits that the existence of another language is irrelevant during lexical selection, the language-nonspecific selection hypothesis assumes that lexical nodes in the nonresponse language may interfere during lexical access. In the section Cross-Language Effects During Selection Processes, I describe some studies that have attempted to provide evidence in favor of one model or the other. But, before doing so, it should be noted that these two proposals are in many respects underspecified. Language-specific selection models beg a crucial and yet-unanswered question: How does the selection mechanism restrict its search to the lexical nodes of only one language? Likewise, language-nonspecific models must explain the mechanism that prevents words in the nonresponse language from eventual selection (see La Heij, chapter 14, this volume, for a related discussion).

One way to implement the language-specific selection mechanism can be found in the binding-by-checking mechanism proposed by Levelt, Roelofs,
and Meyer (1999). This mechanism ensures that the selected word matches the intended meaning of the speaker. Roelofs (1998) extended this mechanism to cases of bilingualism. According to Roelofs, the checking mechanism is sensitive to both the language the speaker wants to use and the language of the selected lexical node. If the language of the selected word does not match that of the intended language, the checking mechanism notes a mismatch, and the selected lexical node is discarded before further processing. In this way, the system ensures that only lexical nodes that belong to the intended language will eventually be produced. For an involuntary intrusion of a word from the nonresponse language to occur, two errors should be present: the selection of a word belonging to the wrong language and a failure in the checking mechanism in charge of binding the intended conceptual representation and the language in which it has to be produced with the proper lexical node (see La Heij, chapter 14, this volume, for a critical discussion of this position).

Regarding the language-nonspecific hypothesis, there have been two proposals of how to ensure selection in the intended language. The first assumes that the semantic system activates the lexical representations of the response language with more intensity than those of the nonresponse language (e.g., Poulisse, 1999). This differential amount of activation received by the two lexical systems guarantees that the lexical nodes with the higher level of activation correspond to those of the language in use. The second solution appeals to the existence of an inhibitory process acting on the lexical representations that belong to the nonresponse language. In other words, the activation levels of the lexical representations of the language not in use are suppressed. In this way, the lexical nodes of the response language would always be more activated than those of the nonresponse
language, ensuring that the word to be selected belongs to the intended language.

Selecting Sublexical Representations

Regarding the stage at which phonological encoding takes place, it is also commonly assumed that the selection of the proper segmental information is determined by the activation levels of the segments that form the target word (Costa & Caramazza, 2002; Meyer, 1996; Meyer & Schriefers, 1991; Roelofs, 2000; Starreveld, 2000). Assuming that the lexical nodes of the nonresponse language activate their corresponding phonological segments, the question is whether such activation interferes with the target's phonological encoding.

An answer to this question depends, in principle, on some assumptions about how the phonological repertoires of the two languages of a bilingual are represented. For example, a language-specific selection mechanism can only be implemented if the bilingual speaker possesses two separate phonological repertoires. In this scenario, it would be possible to postulate a retrieval mechanism sensitive only to the activation levels of one specific phonological repertoire (along the same lines as proposed for the selection of lexical nodes). By contrast, if there is a certain overlap between the phonological systems of the two languages of a bilingual, then the activation of the phonological information corresponding to the translation word would probably have an impact on the ease with which the phonological composition of the target word is retrieved.

Producing Words in One Language: Experimental Evidence

Activation Flow From the Semantic System to the Lexical and Sublexical Systems

The studies reviewed next have explored whether activation flows freely from the semantic system to the two lexical systems of a bilingual irrespective of the language spoken.

The first set of evidence suggesting the existence of language-nonspecific activation flow comes from the study of spontaneous slips of the tongue in bilingual speakers. Poulisse and Bongaerts (1994; see also Poulisse, 1999) analyzed the L2 production performance of 45 late Dutch-English bilinguals in different speech production tasks. Of special interest here are those slips of the tongue that ostensibly showed an effect of the first language (L1) system (the so-called L1-based slips). For the 15 relatively proficient bilinguals (speakers who had studied English for at least 7 years) included in the experiment, the number of times that an L1 word was produced involuntarily was very low. These speakers produced an average of 3,361 words in their L2 with only 16 L1 intrusions. However, the 15 low-proficient bilinguals, who were exposed to English for only 2 years, produced an average of 2,795 words, with a total of 246 L1 lexical intrusions. These observations suggested that (a) there is concurrent activation of the two languages of a bilingual, and (b) language proficiency correlates negatively with the probability of committing a faulty lexical selection that involves lexical items from the nonresponse language.

Hermans et al. (1998) conducted several picture-word interference experiments in which Dutch-English bilinguals were asked to name pictures in their L2 while ignoring the presentation of distractor words in L1 or L2. In the crucial condition, the distractor word was phonologically related to the target's translation. For example, if the speaker had to name a picture of a mountain in English (mountain), the distractor word phonologically related to the target's translation (berg) was berm. The authors argued that if activation from the semantic system flows to the target's translation in the nonresponse language (berg), lexical selection should be harder when the translation word receives extra activation from the distractor word (that is, when the distractor word is phonologically related, e.g., berm) than when it does not (when the distractor word is phonologically unrelated, e.g., kaars). The results supported this prediction in that naming latencies were slower in the former condition. This result was interpreted as supporting the ideas that (a) the activation flow is not language specific, and (b) the lexical selection mechanism considers the activation levels of words belonging to the response and nonresponse languages (see more later about this claim). That is, the semantic system activates both languages of a bilingual, and the selection mechanism is sensitive to the activation level of any lexical item regardless of the language it belongs to (see also Costa, Colome, Gomez, & Sebastian-Galles, 2003, for a different interpretation of these results).

Other studies that have addressed this issue have explored whether there is phonological activation
of the target's translation word. For example, Costa et al. (2000) asked whether the cognate status of translation words has an impact on the speed with which they are produced. Cognates are translations with similar phonological/orthographic form (e.g., guitarra/guitar). Noncognates are translations with dissimilar phonological/orthographic form (e.g., pandereta/tambourine). Costa et al. (2000) hypothesized that if during picture naming the phonological representation of the target's translation is activated, then the retrieval of the phonological properties of the target word would be easier for cognates than for noncognate words. This is because, for the former set of words, the phonological segments (or features) of the target word (e.g., guitarra) would receive activation from two sources, the target lexical representation (e.g., guitarra) and its translation in the nonresponse language (e.g., guitar). This situation is different for the noncognate words, for which the target lexical node and its translation would activate different phonological representations. The results confirmed this prediction: Naming latencies were faster for pictures with cognate than with noncognate names. This result was interpreted as supporting the notions that (a) activation flow from the semantic system to the lexical system is language nonspecific, and (b) lexical nodes from the nonresponse language spread activation to their phonological segments (see Kroll, Dijkstra, Janssen, & Schriefers, 2000, for a replication of the cognate effect in picture naming).

Another result indicating that lexical nodes from the nonresponse language activate their phonological properties was reported by Gollan and Acenas (2000), who explored the tip-of-the-tongue (TOT) phenomenon in bilingual speakers (see Gollan & Silverberg, 2001, for a related study). In this study, bilingual speakers tended to fall in TOT states less often with cognate words than with noncognate words. The authors argued that the cognate effect arises because the target's translation (in the case of the cognate words but not in the case of the noncognate words) is sending activation to the phonological elements of the target word. Thus, the phonological elements of a cognate word would be more available than those of a noncognate word. Under the assumption that TOT states arise as a consequence of a failure in retrieving the phonological elements of the target lexical node, then the probability of falling in a TOT state would be higher for noncognates than for cognates. This is because the availability of the phonemes would be higher in the latter than in the former case.

The cognate effects in naming latencies and in TOT rates, along with other convergent experimental evidence (Colome, 2001), suggest the existence of phonological activation of the target's translation. In other words, the available evidence suggests that activation flow in lexical access (from the semantic system to the lexical level and from the lexical level to the sublexical level) is language nonspecific.

Cross-Language Effects During Selection Processes

The parallel activation of both lexicons of a bilingual during speech production begs the question of how the activation of the representations in the language not in use affects the production process in the response language. As discussed next, such an issue is far from resolved.

Poulisse and Bongaerts (1994) and Poulisse (1999) showed that the retrieval of lexical items in L2 is affected by the existence of L1 words. These authors demonstrated the occurrences of L1 intrusions in L2 speech production, particularly in low-proficient bilinguals. Poulisse and Bongaerts (1994) interpreted these errors as demonstrating that the lexical selection mechanism considers the level of activation of all lexical nodes, irrespective of the language spoken, and that in some cases the mechanism derails and selects the target's translation rather than the intended lexical node. However, in a more recent study, Poulisse (1999) preferred a different interpretation of the phenomenon. She argued that these errors stem from a deviant behavior in the speech production system called multiple selection, in which two lexical nodes are selected for further processing. The unintentional L2 switches arise from two independent errors. First, the lexical selection mechanism erroneously selects two lexical nodes simultaneously (the target and its translation in the nonresponse language). As a consequence of this multiple selection, two word form representations become activated. At that point, another error occurs: Instead of retrieving the phonological representation of the target lexical node in the response language, the corresponding target's translation is retrieved and produced.2

Importantly, Levelt et al. (1999) suggested that the selection of two lexical nodes is the source of the slips of the tongue in which the phonological information of two lexical nodes is combined in one single production. In other words, these so-called blend
errors are considered to result from having available the phonological information of two previously selected lexical nodes. If multiple selection indeed underlies the L1 intrusions observed by Poulisse (1999), then a considerable number of blend errors across languages would be expected in her corpus, a prediction that was not supported by the data (four errors out of more than 100,000 words produced).

Regardless of the exact locus at which the L1 intrusions take place, it is important to determine whether the results reported by Poulisse (1999) are compelling enough to assume that lexical selection is language nonspecific. And, in fact, the very small number of L1 intrusions produced by the relatively late fluent bilinguals tested by Poulisse may support the notion that this type of bilingual speaker achieves lexical selection by means of a language-specific selection mechanism.

A second result that supports the notion that the activation of lexical nodes belonging to the nonresponse language affects lexical selection in the response language is that reported by Hermans et al. (1998). As discussed, in this study an increase in the level of activation of the target's translation produced by the presentation of a phonologically similar distractor word led to interference in the production of the picture's name. These results suggested that when speaking in L2, the lexical selection mechanism considers the level of activation of every lexical node, treating the nodes as possible candidates for production, therefore allowing cross-language interference to arise.

Lee and Williams (2001) investigated, by means of a semantic interference paradigm, whether words from the nonresponse language interfere during lexical selection. English-French bilinguals were asked to perform a naming-to-definition task mixed with a picture-naming task. In each trial, the bilinguals were presented with three definitions, and they were asked to produce the word corresponding to each in their L1 (English). After the definitions, two pictures appeared, one after the other, and the participants were asked to name them either in English or in French. Naming latencies were recorded for the second picture (the target picture).

Three critical manipulations were included in the experiment. First, the second definition of a triad was semantically related (e.g., The queen lives at Buckingham . . . [correct answer palace]) or unrelated (e.g., animal that travelers ride on in the desert . . . [correct answer camel]) to the target picture (e.g., castle). Second, the target picture (the second picture) was named in L1 (English) or in L2 (French). Third, the first picture was named either in L1 or in L2. Trials in which the first picture had to be named in French were considered switch trials; those in which the first picture had to be named in English were considered nonswitch trials.

The results showed a semantic interference effect. When the second definition was semantically related to the target picture name, naming latencies were slower than when it was unrelated. However, this semantic interference depended on whether the target picture was preceded by a switch trial. If the first picture introduced a language switch (it had to be named in French), then no semantic interference was present. However, if the first picture had to be named in English (no switch condition), then semantic interference effects appeared for the two languages. In other words, if there was a language switch just after the definitions, then no semantic interference was present.

Lee and Williams (2001) interpreted this result as revealing the existence of cross-language competition and inhibitory processes in bilingual language production. They argued that, when producing the name of a target picture in French (chateau [castle]), the English lexical node produced in response to the definition of a palace in English (palace) interferes with the target selection in French (chateau). Crucially, however, such competition disappears if there is a language switch immediately before the target picture. The authors claimed that the language switch leads to the inhibition of lexical nodes in the language in which the definitions have been answered (English), rendering any subsequent interference from English lexical nodes ineffective (e.g., palace would not interfere with chateau because the English lexical nodes would have been inhibited).3

There is, however, another set of results that favors the notion that lexical selection is language specific. Costa et al. (1999) reported a series of picture-word interference experiments in which balanced Catalan-Spanish bilinguals were asked to name pictures in their L1 while ignoring distractor words. Of critical interest is the condition in which the distractor word corresponded to the target's translation (e.g., picture: taula [table in Catalan], with the distractor mesa [table in Spanish]). In this condition, the target's translation (mesa) is supposed to be highly activated (even more than in the case in which the distractor word is phonologically related to the target's translation, as in Hermans et al., 1998) because it receives activation from two sources: the semantic representation of the target picture and the presentation of the distractor word. In such a scenario, if the target's translation is
considered a candidate for lexical selection, naming latencies should be slower when the distractor word is the target's translation (mesa) than when it is an unrelated word (perro [dog in Spanish]).

The results did not support this prediction. In fact, naming latencies were faster in the former than in the latter case. In other words, raising the activation levels of the target's translation did not slow the target's selection, but rather speeded it, suggesting that the lexical selection mechanism does not take into account the activation level of the lexical nodes that belong to the nonresponse language. Importantly, this result has been replicated with two other populations of bilinguals with different degrees of L2 proficiency and with L2 as the language of the response (Costa & Caramazza, 1999; Hermans, 2000).

Thus, there are two seemingly contradictory results. Although a distractor word that partially matches the target's translation (berm) interferes with the selection of the target word (mountain), a distractor word that matches fully the target's translation (berg) facilitates the target production (mountain). At this point, there is no satisfactory explanation for these contrasting results (but see Hermans, 2000, and Costa et al., 2003).

There is also some evidence coming from the TOT study conducted by Gollan and Acenas (2000), suggesting that lexical nodes from the nonresponse language do not interfere during lexical selection. In this study, the probability that a speaker falls in a TOT state depended on whether the speaker knew the target's translation. At first glance, it may be expected that if a TOT state arises in part because of the competition created by the existence of a translation word, target words for which the participant knew the translation would be more likely to produce TOT states. This is because the translation may act as a blocking word, complicating the retrieval of the correct word in the proper language. However, surprisingly, bilingual speakers fell in TOT states less often with words for which they knew the translations in the nonresponse language than for words for which they did not know the translation. The authors argued that this result is difficult to explain in terms of models that assume the existence of competing lexical representations across languages.

Integrating the Experimental Results

I have reviewed several studies aimed at finding whether the flow of activation and the selection processes are language specific or language nonspecific. The experimental results regarding the first issue point in the same direction, namely, that during lexical access in speech production the lexical nodes of the two languages of a bilingual are simultaneously activated (Hermans et al., 1998; Poulisse, 1999). Furthermore, there is convergent evidence suggesting that the activation of lexical nodes belonging to the nonresponse language also spreads to their phonological properties (Colome, 2001; Costa et al., 2000; Kroll, Dijkstra, Janssen, & Schriefers, 2000). Together, these results suggest that activation flow from the semantic system to the lexical and sublexical systems of a bilingual is language nonspecific.

The remaining issue to solve is whether the selection of the lexical and sublexical representations in the response language is affected by the activation of corresponding linguistic representations in the other language. Much less agreement exists regarding this issue. The results of several reaction time studies seemed to favor the notion that lexical selection is language nonspecific (Hermans et al., 1998; Lee & Williams, 2001). However, another set of studies has been interpreted as supporting the notion that lexical nodes belonging to the nonresponse language do not enter the competition process during lexical selection (Costa et al., 1999; Costa & Caramazza, 1999; Hermans, 2000). At this point, it is difficult to adjudicate between these two possibilities. In fact, it is possible that both of them capture the performance of different populations of bilingual speakers.

Regarding whether the activation of the sublexical representations of the language not in use affects the selection of the phonological representations in the language in use, the results seem more homogeneous. Evidence from different paradigms, such as phoneme detection (Colome, 2001), picture naming (Costa et al., 2000; Kroll et al., 2000), and TOT rates (Gollan & Acenas, 2000), suggests that the activation of the phonological properties of the target's translation affects the ease and speed with which the phonological properties of the target in the response language are retrieved. In this sense, phonological encoding seems to follow language-nonspecific processing.

It can be concluded that activation flow and phonological retrieval are two language-nonspecific processes. The extent to which lexical selection is language specific or not remains an unresolved issue.

However, regardless of which of the two hypotheses turns out to be correct, it may be recalled
that neither explains how exactly bilinguals finally select the words in the response language. In the framework of language-nonspecific models, it has been argued that lexical selection in the intended language is achieved by means of the active inhibition of the words in the nonresponse language. In the following section, I discuss the arguments that have been put forward in support of this view.

Does Lexical Selection in Bilinguals Entail Inhibitory Processes?

Some researchers have put forward the notion that lexical selection in the desired language is achieved by suppressing the activation of the lexical nodes that belong to the nontarget language (e.g., Green, 1986, 1998; Meuter & Allport, 1999). Postulating inhibitory mechanisms certainly increases the explanatory power of any given model of lexical access. Interestingly, however, none of the most relevant monolingual speech production models postulates inhibitory processes.

Green (1998) has put forward the most specific implementation of inhibitory control in the language production system. In this (Inhibitory Control) model, there are multiple levels of control. The level at which inhibition is postulated is the lexical level (or lemma level). Lexical nodes are marked with language tags that specify the language to which they belong. During lexical access, those words carrying the tag corresponding to the nontarget language are reactively inhibited, preventing their selection. In other words, the conceptual system activates the lexical nodes of the two languages of a bilingual, but those belonging to the nonresponse language are suppressed later.

There are three important features of the model worth discussing here. First, the inhibition applied to the lexical nodes of the nonresponse language is reactive in the sense that it is only functional after the lexical nodes have been activated. Also, this reactive mechanism assumes that more active lexical nodes will be more inhibited. Second, despite this suppression mechanism, the lexical nodes of the nonresponse language interfere during lexical selection in the response language. Third, there is discrete processing between lexical and sublexical levels, which implies that phonological activation is restricted to the selected lexical node.

The most compelling evidence supporting inhibitory control of the lexical systems of a bilingual comes from switching experiments. Meuter and Allport (1999) investigated the extent to which a language switch cost is dependent on the direction of the switch (see Meuter, chapter 17, this volume). Bilingual speakers of different languages were asked to name nine digits presented repeatedly in lists, and they were instructed to name a given digit in L1 or L2 depending on the color of the screen of a given trial. The authors measured digit-naming latencies for trials preceded by a same-language response (no-switch trial) or by a different language response (switch trial). Naming latencies for no-switch trials were faster than for switch trials (the switching cost). Interestingly, the magnitude of the switching cost was larger when participants were asked to switch from the less-dominant to the more dominant language than vice versa. That is, to switch from L2 to L1 was more costly than to switch from L1 to L2.

The authors (Meuter & Allport, 1999) interpreted this result as supporting the notion that when speaking one language, the nonresponse language is inhibited. They argued that the magnitude of the inhibition exerted in the lexical nodes is different for L1 than for L2; it is larger for L1. Therefore, after naming in L2, when in a subsequent trial a word in the dominant language (L1) has to be produced, the system needs more time to raise the activation level of its lexical nodes because they have just been strongly inhibited. When a switch in the opposite direction is needed (from L1 to L2), the switching cost is not so large because when speaking in the dominant language, there is no need to inhibit the less dominant language strongly. Therefore, it should be relatively easy to switch to L2. Further support of this differential strength of the suppression mechanism comes from the observation that the magnitude of the asymmetrical switching cost was negatively correlated with the participants' L2 level of proficiency.

These results are consistent with the notion that the reactive inhibition is proportional to the level of activation of the to-be-suppressed lexical nodes: The greater the activation of the lexical nodes of the nonresponse language, the greater the degree of inhibition required. The notion of reactive inhibition leads also to the following prediction: If the bilingual uses the nonresponse language often and is relatively fluent in that language, the bilingual would have to inhibit it greatly. As Green (1998) put it:

Competition between alternative responses should increase with fluency in context where both languages are active. Increased competition
should induce greater inhibition of unwanted competitors. . . . The competitor is more activated for proficient bilinguals and so requires a greater degree of inhibition. (p. 103)

A prediction that follows from this assumption is that when speaking in L1, the degree with which L2 needs to be inhibited correlates positively with its level of activation. Therefore, the amount of inhibition applied to L2 should be larger for proficient bilinguals than for nonproficient bilinguals. In this scenario, the switching cost when switching from L1 to L2 should be larger for proficient than for nonproficient bilinguals given that the former group has to inhibit L2 more strongly when speaking in L1. Note that this prediction is independent of the magnitude of the asymmetrical switching costs. Instead, it is related to how much inhibition is needed to suppress the activation levels of the words belonging to a nonresponse language. A closer look at the Meuter and Allport (1999) results fails to support such a prediction. In fact, an increase in proficiency level correlated with a reduction in the switching costs regardless of the direction of the switch.

Summarizing, the study of Meuter and Allport (1999) revealed three main results: (a) asymmetrical switching cost for less-proficient bilinguals, (b) the magnitude of the asymmetrical switching cost was reduced for more proficient bilinguals, and (c) no increase in switching cost for more proficient bilinguals. One way to reconcile these three results is to assume that reactive inhibitory processes in lexical selection are only functional when the L2 proficiency is low. In this view, the switching cost observed for more fluent bilinguals would not reflect any inhibitory process but just the time to change the task instruction (see Costa & Santesteban, 2004b; Costa, Santesteban, & Felhosi, 2003).

Another important feature of Green's model is the discrete processing between the lexical and sublexical levels of representation. Accordingly, the only phonological information that would be active corresponds to the lexical node that has been selected in the response language. This assumption is at odds with those results that show that the phonological representations of lexical nodes in the nonresponse language are activated during naming in the response language (Colomé, 2001; Costa et al., 2000; Kroll et al., 2000). In fact, the existence of phonological activation of the target's translation is problematic also for the discrete assumption in monolingual models. More important, these effects seem also inconsistent with the notion of inhibition of the lexical nodes of the nonresponse language. This is because suppressing the activation of lexical nodes would prevent the activation of their phonological properties. Thus, it is not immediately obvious how a model that assumes both discrete processing and reactive inhibition can account for the phonological activation of the nonselected and presumably inhibited target's translation (see Note 4).

However, if inhibitory control is not part of the bilingual speech production system, then how is lexical selection achieved in the intended language? There are several ways in which this can be effected (see, e.g., La Heij, chapter 14, this volume). For example, in Green's (1998) model, lexical selection could be achieved without inhibitory processes. In this model, lexical concepts (those concepts for which a word exists in a given language) are language specific. From this, it would follow that, for instance, a Spanish-English bilingual has two lexical concepts for a given object. Lexicalization starts with the activation of a lexical concept, which in turn activates its corresponding lexical node. The selection of a given lexical node is determined, among other things, by a checking procedure that inspects whether a lexical node corresponds to the intended lexical concept. Thus, in this model lexical concepts are language specific, lexical nodes carry language tags, and there is a checking mechanism that ensures that lexical concepts are linked to appropriate lexical nodes. According to this account, the successful selection of a lexical node in the intended language might be achieved without lexical inhibition. This is because the mechanism that actually guarantees successful selection in the proper language is the checking mechanism that ensures that a given lexical node corresponds to the intended language-specific lexical concept.

Let me illustrate this argument with an example. Assume that the speaker selects the L2 lexical concept DOG (and not the L1 lexical concept PERRO). The target conceptual representation (DOG) spreads activation to its lexical representation (dog). At the moment of selection, the checking mechanism makes sure that the selected lexical representation (dog) is linked to the intended concept (DOG). If such is the case, the retrieval of the phonological properties of the selected lexical node starts. If not, a new selection procedure starts. In other words, lexical nodes from the response language (within-language competitors, cat in the example) and nonresponse language (cross-language competitors, perro in the example) may be
discarded in the same way by the checking mechanism. That is, the same mechanism that prevents the selection of semantically close competitors would be in charge of preventing the selection of the target's translation. Thus, although Green's model assumes the existence of language inhibition, in principle it could account for bilingual lexical selection without such mechanism (see La Heij, chapter 14, this volume, for further discussion).

I reviewed some of the basic claims about the role of inhibitory mechanisms in bilingual speech production. I argued that some results support the notion that lexical access in speech production entails the suppression of the activated lexical nodes belonging to the nonresponse language. However, there are other results that do not seem consistent with such an idea, especially when the production performance of highly proficient bilinguals is addressed. Also, I argued that one of the current models of speech production, of which inhibition is a cornerstone, does not really require such a mechanism to explain how normal speech production proceeds in bilingual speakers. This does not mean to say that inhibitory processes play no role in the production of speech in bilingual speakers. It may very well be that with an increase of L2 proficiency, there is a shift from reliance on inhibitory processes toward language-specific selection mechanisms. Further research needs to be done to determine how inhibitory control is implemented in the speech production system, allowing at the same time the flow of activation to be language nonspecific.

Summary

In this chapter, I focused on two aspects of bilingual speech production: activation flow and selection processes. I contrasted two different views of bilingual lexical access, the language-specific and language-nonspecific views. I argued that there is empirical evidence consistent with the notion that activation flows from the semantic system to the two languages of a bilingual up to the phonological level in a language-nonspecific fashion.

The picture is more complex when the selection mechanisms are concerned. The experimental evidence is mixed in the sense that some results favor the language specificity of the lexical selection mechanism, and others favor the notion of nonspecific lexical selection. A possible way to reconcile the seemingly contradictory data is to assume that, in nonproficient bilinguals, the activation of the lexical nodes of the nonresponse language may affect production performance, but that bilinguals shift from language-nonspecific processing toward language-specific processing when they become more proficient bilinguals. Although this empirical generalization captures some of the current results, it is admittedly rather tentative and requires further research.

Acknowledgments

The work reported here was supported in part by the National Institutes of Health (DC 04542), by the Ramón y Cajal program, and the Spanish government (BSO2001-3492-C04-01), and the McDonnell grant Bridging Mind Brain and Behavior. I thank Antoni Rodríguez-Fornells, Salvador Soto-Faraco, Mikel Santesteban, and Alfonso Caramazza for their helpful comments to earlier versions of this chapter.

Notes

1. There is a debate regarding the functional architecture of the lexical system. According to some models (e.g., Levelt et al., 1999), lexical access would entail the retrieval of two different representations (the lemma and the lexeme). Other authors postulated the existence of only one level of lexical representation (e.g., Caramazza, 1997; Dell, 1986). Here, I adopt this second view, and I refer to the lexical representations with the term lexical nodes. Nevertheless, the arguments developed in this chapter are relatively independent of this debate.

2. The reason that led Poulisse to put forward such a new interpretation came from the adoption of the speech production architecture proposed by Levelt et al. (1999), in which frequency effects are located at the level of word form retrieval (Jescheniak & Levelt, 1994). Poulisse (1999) argued that the involuntary selection of L1 elements is to some extent caused by the different frequency values of the L1 and L2 lexical nodes. Thus, if such errors are sensitive to word frequency and this variable affects only the retrieval of word forms, it follows that the level at which such errors are occurring is the phonological level. However, recent results revealed that frequency also exerts its influence at the level at which lexical selection is achieved (Caramazza, Bi, Costa, & Miozzo, 2004; Dell, 1990; Caramazza, Costa, Miozzo, & Bi, 2001; Jescheniak, Meyer, & Levelt, 2003). Thus, the available experimental evidence is not compelling enough to assume that the involuntary L1 intrusions stem from the phonological level.

3. Although these results are interesting, they also show some inconsistencies that may prevent
drawing strong conclusions. Although it is true that naming latencies only revealed semantic interference effects when the target picture was preceded by an English response (nonswitch trials), the scenario is a bit different when paying attention to the error rates. Error rates were larger (in some cases even by a factor of more than 2) whenever the target word had been preceded by a semantically related definition regardless of whether there was a language switch or not. Thus, it is unclear whether lexical competition also occurred for the conditions in which the target picture was preceded by a language switch.

4. Green's model could in part accommodate these results, maintaining the notion of inhibition, but giving up some other assumptions and postulating new ones. First, it could be assumed that there is cascaded processing between the lexical and phonological levels of representation, in such a way that any lexical representation that receives activation spreads a proportional amount of it to the corresponding phonological representation. A further assumption that has to be made is that by the time inhibition reaches the nonresponse lexical nodes, the target's translation has already spread some activation to the nodes representing its phonological properties.

References

Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology, 14, 177–208.
Caramazza, A., Bi, Y., Costa, A., & Miozzo, M. (2004). What determines the speed of lexical access: homophone or specific-word frequency? A reply to Jescheniak et al. (2003). Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 278–282.
Caramazza, A., & Costa, A. (2000). The semantic interference effect in the picture-word interference paradigm: Does the response set matter? Cognition, 75, 51–64.
Caramazza, A., & Costa, A. (2001). Set size and repetitions are not at the base of the differential effects of semantically related distractors: Implications for models of lexical access. Cognition, 80, 291–298.
Caramazza, A., Costa, A., Miozzo, M., & Bi, Y. (2001). The specific-word frequency effect: Implications for the representation of homophones. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 1430–1450.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: Language-specific or language-independent? Journal of Memory and Language, 45, 721–736.
Costa, A., & Caramazza, A. (1999). Is lexical selection in bilingual speech production language-specific? Further evidence from Spanish-English and English-Spanish bilinguals. Bilingualism: Language and Cognition, 2, 231–244.
Costa, A., & Caramazza, A. (2002). The production of noun phrases in English and Spanish: Implications for the scope of phonological encoding in speech production. Journal of Memory and Language, 46, 178–198.
Costa, A., Caramazza, A., & Sebastián-Gallés, N. (2000). The cognate facilitation effect: Implications for models of lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1283–1296.
Costa, A., Colomé, À., Gómez, O., & Sebastián-Gallés, N. (2003). Another look at cross-language competition in bilingual speech production: Lexical and phonological factors. Bilingualism: Language and Cognition, 6, 167–179.
Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the bilingual's two lexicons compete for selection? Journal of Memory and Language, 41, 365–397.
Costa, A., & Santesteban, M. (2004a). Bilingual word perception and production: Two sides of the same coin? Trends in Cognitive Sciences, 8, 253.
Costa, A., & Santesteban, M. (2004b). Lexical access in bilingual speech production: Evidence from language switching in highly proficient bilinguals and L2 learners. Journal of Memory and Language, 50, 491–511.
Costa, A., Santesteban, M., & Felhosi, G. (2003, September). Do language-switching costs reveal different degrees of language activation? Paper presented at the 13th Conference of the European Society for Cognitive Psychology, Granada, Spain.
Cutting, J. C., & Ferreira, V. S. (1999). Semantic and phonological information flow in the production lexicon. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 318–344.
De Bot, K. (1992). A bilingual production model: Levelt's speaking model adapted. Applied Linguistics, 13, 1–24.
De Groot, A. M. B. (1992). Determinants of word translation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1001–1018.
Dell, G. S. (1986). A spreading activation theory of retrieval in sentence production. Psychological Review, 93, 283–321.
Dell, G. S. (1990). Effects of frequency and vocabulary type on phonological speech errors.
Language and Cognitive Processes, 5, 313–349.
Dewaele, J. M. (2001). Activation or inhibition? The interaction of L1, L2 and L3 on the language mode continuum. In J. Cenoz, B. Hufeisen, & U. Jessner (Eds.), Cross-linguistic influence in third language acquisition: Psycholinguistic perspectives (pp. 69–89). Oxford, U.K.: Oxford University Press.
French, R. M., & Jacquet, M. (2004). Understanding bilingual memory: Models and data. Trends in Cognitive Sciences, 8, 87–93.
Goldrick, M., & Rapp, B. (2002). A restricted interaction account (RIA) of spoken word production: The best of both worlds. Aphasiology, 16, 20–55.
Gollan, T. H., & Acenas, L. A. (2000, April). Tip-of-the-tongue incidence in Spanish-English and Tagalog-English bilinguals. Paper presented at the Third International Symposium on Bilingualism, Bristol, U.K.
Gollan, T. H., & Kroll, J. F. (2001). Bilingual lexical access. In B. Rapp (Ed.), The handbook of cognitive neuropsychology: What deficits reveal about the human mind (pp. 321–345). Philadelphia: Psychology Press.
Gollan, T. H., & Silverberg, N. B. (2001). Tip-of-the-tongue states in Hebrew-English bilinguals. Bilingualism: Language and Cognition, 4, 63–83.
Green, D. W. (1986). Control, activation and resource. Brain and Language, 27, 210–223.
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Griffin, Z. M., & Bock, J. K. (1998). Constraint, word frequency, and the relationship between lexical processing levels in spoken word production. Journal of Memory and Language, 38, 313–338.
Grosjean, F. (1997). Processing mixed language: Issues, findings and models. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 225–254). Mahwah, NJ: Erlbaum.
Grosjean, F. (1998). Transfer and language mode. Bilingualism: Language and Cognition, 1, 175–176.
Grosjean, F. (2001). The bilingual's language modes. In J. Nicol (Ed.), One mind, two languages: Bilingual language processing (pp. 1–22). Oxford, U.K.: Blackwell.
Harley, T. A. (1993). Phonological activation of semantic competitors during lexical access in speech production. Language and Cognitive Processes, 8, 291–309.
Hermans, D. (2000). Word production in a foreign language. Unpublished doctoral thesis, University of Nijmegen, The Netherlands.
Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1998). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213–230.
Humphreys, G. W., & Riddoch, M. J. (1988). Cascade processes in picture identification. Cognitive Neuropsychology, 5, 67–104.
Jescheniak, J. D., & Levelt, W. J. M. (1994). Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 824–843.
Jescheniak, J. D., Meyer, A. S., & Levelt, W. J. M. (2003). Specific-word frequency is not all that counts in speech production: Comments on Caramazza et al. (2001) and new experimental data. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 432–438.
Kroll, J. F., Dijkstra, A., Janssen, N., & Schriefers, H. (2000, November). Selecting the language in which to speak: Experiments on lexical access in bilingual production. Paper presented at the 41st Annual Meeting of the Psychonomic Society, New Orleans, LA.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
La Heij, W., Hooglander, A., Kerling, R., & Van der Velden, E. (1996). Nonverbal context effects in forward and backward translation: Evidence for concept mediation. Journal of Memory and Language, 35, 648–665.
Lee, M. W., & Williams, J. N. (2001). Lexical access in spoken word production by bilinguals: Evidence from the semantic competitor priming paradigm. Bilingualism: Language and Cognition, 4, 233–248.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (2001). Spoken word production: A theory of lexical access. Proceedings of the National Academy of Sciences, 98, 13464–13471.
Levelt, W. J. M., Roelofs, A., & Meyer, A. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–75.
Levelt, W. J. M., Schriefers, H., Vorberg, D., Meyer, A. S., Pechmann, T., & Havinga, J. (1991). The time course of lexical access in speech production: A study of picture naming. Psychological Review, 98, 122–142.
McNamara, J. (1967). The bilingual's linguistic performance: A psychological overview. Journal of Social Issues, 23, 59–77.
McNamara, J., & Kushnir, S. L. (1972). Linguistic independence of bilinguals: The input switch. Journal of Verbal Learning and Verbal Behavior, 10, 480–487.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical costs of language selection. Journal of Memory and Language, 40, 25–40.
Meyer, A. S. (1996). Lexical access in phrase and sentence production: Results from picture-word interference experiments. Journal of Memory and Language, 35, 477–496.
Meyer, A. S., & Schriefers, H. (1991). Phonological facilitation in picture-word interference experiments: Effects of stimulus onset asynchrony and types of interfering stimuli. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 1146–1160.
Peterson, R. R., & Savoy, P. (1998). Lexical selection and phonological encoding during language production: Evidence for cascaded processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 539–557.
Poulisse, N. (1999). Slips of the tongue: Speech errors in first and second language production. Amsterdam, The Netherlands: Benjamins.
Poulisse, N., & Bongaerts, T. (1994). First language use in second language production. Applied Linguistics, 15, 36–57.
Rapp, B., & Goldrick, M. (2000). Discreteness and interactivity in spoken word production. Psychological Review, 107, 460–499.
Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition, 42, 107–142.
Roelofs, A. (1998). Lemma selection without inhibition of languages in bilingual speakers. Bilingualism: Language and Cognition, 1, 94–95.
Roelofs, A. (2000). WEAVER and other computational models of lemma retrieval and word-form encoding. In L. Wheeldon (Ed.), Aspects of language production (pp. 71–114). Sussex, U.K.: Psychology Press.
Roelofs, A. (2001). Set size and repetition matter: Comment on Caramazza and Costa (2000). Cognition, 80, 283–290.
Schriefers, H., Meyer, A. S., & Levelt, W. J. M. (1990). Exploring the time-course of lexical access in production: Picture-word interference studies. Journal of Memory and Language, 29, 86–102.
Starreveld, P. A. (2000). On the interpretation of onsets of auditory context effects in word production. Journal of Memory and Language, 42, 497–525.
Van Hell, J. G., & De Groot, A. M. B. (1998). Conceptual representation in bilingual memory: Effects of concreteness and cognate status in word association. Bilingualism: Language and Cognition, 1, 193–211.
Carol Myers-Scotton
16
Supporting a Differential Access Hypothesis
Code Switching and Other Contact Data
ABSTRACT This chapter endorses the position of Clahsen (e.g., 1999), Jackendoff (2002), Pinker (1999), inter alia about differences in how words may be accessed in production (i.e., that some lexical words including regular morphology are constructed online while semiproductive or irregular elements are stored as units in the mental lexicon). However, it goes a step in another direction, to argue that not all elements underlying surface-level morphemes are accessed in the same way or at the same point in language production and that this difference is reflected in the distribution patterns of surface-level morphemes in naturally occurring data. Specifically, those elements underlying content morphemes and what are called early system morphemes are salient at the level of the mental lexicon. In contrast, those system morphemes that are structurally assigned (called late system morphemes) are not available to participate in lexical combinations until the level of the formulator. The formulator receives directions from lemmas in the mental lexicon about how to assemble surface-level morphemes, including those late system morphemes that are essential in building larger constituent structures. A Differential Access Hypothesis captures the distinction between morpheme activation, supported by evidence that links variation in data distributions to morpheme type. The evidence this chapter considers comes from language contact phenomena, especially code switching. This hypothesis makes claims related to the two-step retrieval hypothesis of Garrett (1975, 1993, inter alia). Two models relevant to contact data, the Matrix Language Frame (MLF) model and the 4-M model, frame the discussion.
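The distinction drawn in the abstract can be pictured as a small lookup from morpheme type to the production level at which the underlying element becomes available. The Python sketch below is only an illustration under stated assumptions: the class and function names (MorphemeType, AccessLevel, Morpheme, access_level, group_by_level) are hypothetical, and the three English example forms and their classifications are assumed for the sake of the example rather than taken from the abstract. It is not an implementation of the MLF or 4-M models.

```python
from dataclasses import dataclass
from enum import Enum


class MorphemeType(Enum):
    """The three morpheme types named in the abstract."""
    CONTENT = "content morpheme"
    EARLY_SYSTEM = "early system morpheme"
    LATE_SYSTEM = "late system morpheme"


class AccessLevel(Enum):
    """Production levels at which an underlying element becomes available."""
    MENTAL_LEXICON = "mental lexicon"
    FORMULATOR = "formulator"


@dataclass(frozen=True)
class Morpheme:
    form: str
    kind: MorphemeType


def access_level(morpheme: Morpheme) -> AccessLevel:
    """Map a morpheme type to its access level under the hypothesis.

    Content and early system morphemes are treated as salient at the
    mental lexicon; late system morphemes only at the formulator.
    """
    if morpheme.kind is MorphemeType.LATE_SYSTEM:
        return AccessLevel.FORMULATOR
    return AccessLevel.MENTAL_LEXICON


# Hypothetical examples: these classifications are assumptions made for
# illustration, not data from the chapter.
EXAMPLES = [
    Morpheme("dog", MorphemeType.CONTENT),
    Morpheme("the", MorphemeType.EARLY_SYSTEM),
    Morpheme("-s (subject-verb agreement)", MorphemeType.LATE_SYSTEM),
]


def group_by_level(morphemes):
    """Group morpheme forms by the level at which they are accessed."""
    groups = {level: [] for level in AccessLevel}
    for m in morphemes:
        groups[access_level(m)].append(m.form)
    return groups


if __name__ == "__main__":
    for level, forms in group_by_level(EXAMPLES).items():
        print(f"{level.value}: {forms}")
```

The sketch encodes only the two-way split in access point; it says nothing about how the formulator assembles constituents or how distribution patterns in code-switching data would be tested against the hypothesis.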
distinct implications, not only for theories of language production, but also for generative theories of language in general.

The claim is that these data provide a window on how language is organized at the level of the mental lexicon and how it is differentially accessed at the level of the formulator (under a model modified from Levelt, 1989). The model I assume is not fully spelled out here, and it is not compared in detail with other production models simply because discussing such models, as a whole, is not a primary goal for this chapter.

In the model underlying discussion here, production is set in motion well before the projection of surface structures. At the prelinguistic conceptual level, speakers begin to map onto language their intentions about communicating. Bear in mind that speakers are communicating not only referential information, but also information about how they view their own public faces and their relationship with their listeners. This means that they make a number of decisions, largely unconscious, that consider the sociolinguistic and psycholinguistic consequences of potential choices. For monolingual speakers especially, this means considering dialectal and stylistic choices. In fact, all speakers do this as part of weighing the pragmatic implications of how they choose to speak (i.e., how ways of speaking may be interpreted by others).

Bilingual speakers have even more to consider because the decision to produce bilingual speech, especially code switching, entails much more than just selecting a monolingual style/register. (Bilingual speech is defined most generally as utterances that include surface-level morphemes from two or more language varieties. Code switching is one type of bilingual speech and comes in several forms, but the only type of interest here includes morphemes from two varieties in the same clause.) Before embarking on bilingual speech, bilinguals take into account their own proficiency and that of listeners. They also must answer for themselves a number of questions germane to the linguistic choices they face.

The big question relevant to code switching is, "Will engaging in code switching result in sufficient pragmatic and social rewards to make it worth any costs?" An example of a cost is that, in some communities, the public view is that code switching is "bad language." (See Myers-Scotton, 1993, and Myers-Scotton & Bolonyai, 2001, for views on why speakers engage in code switching.)

If speakers do engage in code switching, the Matrix Language Frame (MLF) model discussed here argues that speakers select, again unconsciously, what I call a Matrix Language to provide morphosyntactic structure for their bilingual speech. At the same time, speakers consider which of the participating languages is better suited to express specific intentions. According to the MLF model, the way that this decision can be carried out depends on both universal constraints imposed by the grammatical structure of code switching within a clause and on typological features of the specific languages involved. That is, although bilinguals generally can express intentions in any of their languages, the way they do this is structurally constrained (cf. Jake & Myers-Scotton, 1997; Myers-Scotton, 2002a; Myers-Scotton & Jake, 2001).

With these many selections made, speaker intentions access abstract semantic/pragmatic feature bundles that are language specific, but they are not linguistic units. At this point, production becomes relatively simple, and so in some ways, it supports the view of production of La Heij (chapter 14, this volume) as "complex access, simple selection." However, he does not take into account the lemma matching (across participating languages) that Myers-Scotton and Jake (1995) saw as an essential part of selection in code switching when mixed constituents (containing morphemes from both languages) are produced. This is discussed in a later section. The model sketched here has obvious similarities to other models, especially those of Green (1986, 1998) and Poulisse (1997). Also, the approach taken here is compatible with a number of points made by Meuter (chapter 17, this volume), such as that the process of inhibition can operate both locally and globally.

The semantic-pragmatic feature bundles interface with language-specific lemmas in the mental lexicon. Lemmas support surface-level morphemes. Most specifically and in line with the work of Bock and Levelt (1994), features directly elect lemmas supporting what I call content morphemes (e.g., nouns and verbs). The information in other lemmas becomes salient in other ways, as will become clear. In this model, lemmas contain all the necessary information that will result in surface-level morphosyntactic structures, not just semantic information. The Abstract Level model developed by Myers-Scotton and Jake assumes that lemmas contain three levels of abstract grammatical information. First, the level of lexical-conceptual structure contains specifications for semantic and pragmatic features. Second, the level of predicate-argument structure refers to mappings of thematic
roles (e.g., agent and patient) onto syntactic structures and to specifications for subcategorizations of syntactic predicates (e.g., Can a verb be both transitive and intransitive? Compare devour vs. eat). The third level, morphological realization patterns, specifies the permissible surface-level configurations of morphemes (e.g., Is case expressed overtly?) as well as word order. (For more details on these levels and the Abstract Level model, see Myers-Scotton, 2002a; Myers-Scotton & Jake, 1995, 2001.) Directions sent to an articulator concerning phonetic surface-level forms are also necessary, of course, but are not discussed here.

Language Contact Studies

Within linguistics, studies of language contact have multiplied in the last 20 years, beginning with heavy interest in code switching (e.g., Pfaff, 1979; Poplack, 1980). The advent of the European Union has stimulated European interest in all forms of contact phenomena; there are many more conferences and workshops on contact languages in Europe than in North America, for example. Also, more articles and books on bilingualism have appeared, thanks as well to the global economy, with its accompanying rise of bilingualism, especially in languages of wider communication (e.g., not only English, but also languages such as Chinese in Southeast Asia). In addition, this burgeoning interest has meant more studies of long-standing bilingual communities, especially in the third world, as well as of the new bilingual communities created by the huge influx of immigrants to nations from Australia to Norway to Canada.

Yet, many of these studies have been purely descriptive. Some are best considered under the rubric of the sociology of language because their goal was to detail patterns of language use in a bilingual community (e.g., Kropp Dakuba, 1997; Zentella, 1997). Others who considered the social side of language were engaged in theory building; but these theories refer to the psychological and social motivations for producing bilingual speech, not its grammatical nature (e.g., Auer, 1998; Myers-Scotton, 1993).

Many did study the types of grammatical structures found in a bilingual corpus, but their findings may not be of special interest to psycholinguists for two reasons. First, as indicated above, many studies were classifications that were not directed toward supporting any specific hypotheses. Second, they did not necessarily focus on data within the bilingual clause, the crucial site at which juxtaposing of languages is a grammatical issue. Many of these were studies of code switching (e.g., Meechan & Poplack, 1995; Muysken, 2000) or mixed languages (e.g., Bakker & Mous, 1994; Matras, 2000). Some researchers have attempted to explain some code-switching data within the framework of generative syntactic theories intended to explain monolingual data (e.g., MacSwan, 1999, 2000; Ritchie & Bhatia, 1996). For a critique of such models, largely because they do not consider differing activations of the participating languages (related to the role of a Matrix Language), see Jake, Myers-Scotton, and Gross (2002) and Myers-Scotton (2002a, pp. 157–163).

Little theoretical attention has been directed to other contact phenomena, with the exception of some studies of attrition (e.g., Bolonyai, 2002) and creole formation (Bickerton, 1981, and the collection in DeGraff, 1999, as well as Myers-Scotton, 2001a). A few researchers who are primarily specialists in second language (L2) acquisition have extrapolated from their data to propose speech production models (e.g., Poulisse, 1997; Wei, 2000a, 2000b).

The Matrix Language: Embedded Language Opposition

Starting in 1993 (Myers-Scotton, 1997) and proceeding through publications with Jake (1995, 2000a, 2000b, 2001) and by myself (Myers-Scotton, 2002a), Jake and I have considered code switching within two grammatically oriented frameworks that translate into a model of production, although they are not themselves such a model. These are the MLF model and the newer 4-M (four types of morpheme) model. These models derive from looking at code switching as it actually is present in naturally occurring data. Of course, code switching is of interest to psycholinguists simply because it consists of morphemes from two languages in typically very fluent speech. But, there is a second, perhaps more important reason: There is nothing random about how these morphemes are organized in a clause once it is recognized that the participating languages have different roles and that different types of morpheme have different distribution patterns. That is, these orderly asymmetries have implications for any model of language and its organization in the mind.

The unit of analysis in these models is the bilingual CP (projection of complementizer), commonly
used in syntactic theories of various persuasions. The CP is the highest unit projected by lexical elements. It can be defined unambiguously as a complementizer followed by a clause consisting of a subject and predicate. The predicate can be realized as a verb phrase or a predicate adjective phrase. Examples of complementizers are "that" in "I think that I will leave" and "if" in "If it rains, I will leave." (Note that each of these sentences contains two CPs. In the first sentence, the second CP [introduced by "that"] is embedded in the first one.) The complementizer is sometimes replaced by a specifier (e.g., a topicalizer) or a null element (e.g., in so-called independent clauses in many languages, there is no overt complementizer). Also, CPs can contain other null elements, but they are still clauses (i.e., an utterance such as "What?" is a clause with many null elements). In this chapter, I will refer to a bilingual CP simply as a bilingual clause. I will also refer to code switching as if it always occurs only between two languages, but code switching with more than two languages is entirely possible and is frequent.
Of course, there can be larger bilingual units than the bilingual clause when bilinguals speak (e.g., the sentence, the conversational turn, etc.). However, it is only in the bilingual clause that the grammars of both languages are in contact and in which the basic hierarchical opposition of the MLF model between the Matrix Language and the Embedded Language makes any sense. Sentences, of course, may contain more than one clause, and in bilingual speech not all clauses in a sentence have grammatical frames from the same language. This is a reason not to use the sentence as a unit of analysis. (There is good empirical evidence to show that within any bilingual clause the source of the grammatical frame remains the same.)
A critical feature of the MLF model is to recognize that the more important structural role in the bilingual clause goes to only one language; that is, a single language supplies the morphosyntactic frame. The frame itself is called the Matrix Language, but so is the language supplying the frame. This differentiates it from the other participating language, which is called the Embedded Language.
The Matrix Language is structurally identified by the role it plays within code switching. It may also be the dominant language in the speakers' community, but that does not figure in identifying the Matrix Language. Often, the Matrix Language is the first language (L1) of the speakers, but this is not always so. Obviously, speakers must be very fluent in the Matrix Language because they use it to frame the bilingual clause. They are often very fluent as well in the Embedded Language; however, depending on the type of embeddings they make in the clause, their fluency can vary. Mixed constituents consist of morphemes from both languages. Sometimes, such a constituent is the entire bilingual clause (see Example 1), but sometimes the clause also includes monolingual constituents from either language (islands). To embed singly occurring content morphemes in mixed constituents requires less fluency than embedding whole phrasal constituents (Embedded Language Islands, as in Example 2).

Levels of Activation

The structural asymmetry between the participating languages implies that the Matrix Language has a higher level of activation than the Embedded Language. However, because the Embedded Language does supply its own elements, it is also always "on," but at a lower level. But, because of its dominant role in structuring the clause, it follows that the Matrix Language must always be "on," even when an Embedded Language element is introduced, even if its level of activation is lowered, as it almost must be when Embedded Language Islands are produced. Of course, bilingual word recognition tasks (e.g., see this volume's chapters 9, 17, and 22 by Dijkstra, Meuter, and Christoffels & De Groot, respectively, for related discussion) also provide implications about differing levels of activation in the bilingual's two languages.
Note that when the Embedded Language supplies full constituents, they are often adjuncts, such as prepositional phrases, and therefore are peripheral to the core thematic grid in the clause; however, the hypothesis is that, even though they are entirely in the Embedded Language, such islands must meet the frame requirements of the Matrix Language. Still, the activation of the Embedded Language may need to be much higher when islands are produced compared with singly occurring words; after all, islands are full constituents with inflections and other functional elements (see Myers-Scotton, 2002a, pp. 139-153 on Embedded Language Islands).

Examples of Code Switching

Example 1, audio-recorded in Oslo, Norway, shows a typical example of the type of bilingual patterning that is found in code switching involving
code switching. Further, evidence showed that they easily accept the morphological realization patterns of the Matrix Language frame, even though these may be quite different from frame-building requirements in the language they come from. For example, as in Example 1, an Embedded Language noun, which would not be case marked in the Embedded Language, receives an overt case suffix from the Matrix Language. For this to happen, it follows that not all of the information in an Embedded Language lemma supporting such a noun is activated.
Further, this state of affairs implies that the Matrix Language lemmas remain activated to send frame-building directions to the formulator throughout the bilingual clause. Note that if there is not a Matrix Language lemma to provide a close match for the Embedded Language lemma called, a solution is at hand. In addition to lemmas, the mental lexicon also includes language-specific generalized lexical knowledge that can make the match. This provision solves a number of potential problems. For example, concepts can be expressed in the Embedded Language without matches in the Matrix Language lexicon; or, brand new words for new concepts or objects can occur in either monolingual or bilingual speech. For such words to appear in code switching, they just must be incorporated in phrase structure in ways meeting the levels of predicate-argument structure and morphological realization patterns present in the Matrix Language's generalized lexical knowledge (cf. Myers-Scotton, 2002a, pp. 69, 130-131; Myers-Scotton & Jake, 1995). As a simple example, if nouns occur in an Embedded Language noun phrase with articles but nouns do not occur with articles in the Matrix Language, Embedded Language nouns can, and do, occur in code switching without any article.
Swahili/English example in Example 3. A reason for the difficulty in switching verbs is their major role in phrase structure: They assign thematic roles and set subcategorization frames for syntactic complements. Why verbs can be switched in some corpora and not others has largely eluded researchers to date (but, e.g., see Jake & Myers-Scotton, 1997; Myers-Scotton & Jake, 2001, for an argument why English verbs cannot receive Arabic inflections in Palestinian Arabic/English code switching).

The Uniform Structure Principle

Once the asymmetry between what the Matrix Language and the Embedded Language can supply to the bilingual clause is clear, it is obvious that code switching generally proceeds without obstacles across data sets. When there is an obstacle, compromise strategies involving little disruption in the bilingual clause suffice. I say "little disruption" because the clause that results obeys the very same basic constraints as other bilingual clauses. Further, this maintenance of uniformity is the same as that found in monolingual data. This generalization is captured in a simple, but far-reaching, Uniform Structure Principle:

A given constituent type in any language has a uniform abstract structure and the requirements of well-formedness for this constituent type must be observed whenever the constituent appears. In bilingual speech, the structures of the Matrix Language are always preferred, but some Embedded Language structures are allowed if certain conditions are met. (Myers-Scotton, 2002a, p. 8)
from any language with lemmas in the speaker's mental lexicon. In this sense, the formulator is not language specific, although the operations it performs at any one time necessarily are.

Abstract Constraints on Morphology

The main argument of this chapter is that such uniformity, specifically in contact phenomena, reflects more abstract constraints on how different types of morpheme are accessed in language production. That is, I argue that what occurs in naturally occurring code-switching data and other contact data (performance) indicates that recognizing a link between competence and production better explains the nature of language than focusing only on formal models of competence.

Jackendoff: What Is Constructed Online

Now, this particular argument is not part of Jackendoff's (2002) views in his recent, far-reaching claims about a theory of language. However, it is part of Jackendoff's overall claim that what happens in production is absolutely central in working out the instantiation of language in the mind (p. 152). In this regard, here is the question he poses: "What aspects of an utterance must be stored in long-term memory, and what aspects can be constructed online in working memory?" (p. 152). My argument goes a step further in posing another question: Within the set of aspects that can be constructed online in working memory, is it possible that not all the lexical elements that make up these words are salient at the same point in the production process? That is, does the abstract nature of morphological elements and their particular role in phrasal structures affect how and when they are accessed?
Jackendoff's comments about morphological elements largely have to do with how they are stored. After acknowledging that some elements (e.g., content words such as dog) must be stored in long-term memory, he argued that morphology is treated in two different ways. First, those inflectional elements involved in productive morphology are stored in a way similar to content words. (Jackendoff referred to the relevant morphological process affecting these affixes as regular morphology not in terms of rules that add affixes, but rather as free combinations of lexically stored parts; 2002, p. 180.)
But, the history of other inflectional elements is different for Jackendoff. Other inflection elements are items in a semi-regular pattern [that is] simply stored (2002, p. 187). That is, these forms (e.g., the irregular past-tense verb) in some sense are stored as completed forms, not constructed. They contrast with the products of productive morphology that are not stored, but rather involve procedures that build things of word size or smaller and are constructed online in working memory.
Jackendoff's general conclusion was that we should take very seriously the question of what is stored and what is computed online because it justifies a major reorganization of the theory of grammar (2002, p. 193).
In this chapter, my goal is to provide evidence for a related argument: Not all of those forms that Jackendoff called grammatical words are built in the lexicon in the same way; some are only built when information on grammatical relations that take account of hierarchical information outside their immediate phrase structure is available.

Clahsen: Combinatorial Operations?

Of course, Jackendoff's position is compatible with that expressed by others, such as Clahsen (1999) and Clahsen et al. (2001). Clahsen and his associates also argued for the dual structure of the language faculty, but based their claims on experimental findings. The question Clahsen (1999) asked is whether empirical findings are to be accounted for by combinatorial operations (such as rules of language) or by (access to) lexical entries (p. 991). In my terms, what he meant is this: Do irregular forms that contain an inflection have their own lemma, or are they part of the same lemma as that for the basic stem?
Clahsen (1999) focused on German noun plurals and participles and amassed evidence comparing various types of responses to regular versus irregular inflected entries. He concluded that, "Lexically restricted (irregular) inflection is not rule-based" (p. 994). Instead, the inflection is part of the lexical item itself. Thus, the argument is that irregular forms are not assembled via rules that join together a base form and an inflection.
Clahsen et al. (2001) reached similar conclusions. They investigated German adjectives and
German strong verbs. These are verbs that change their base form (e.g., bring-en [to bring]) to show tense/aspect (e.g., ge-bracht [brought] past participle), although they may also take regular tense/aspect inflections as well, as do German weak verbs. The authors had an unsurprising result in lexical decision experiments: Subjects showed shorter response times for high-frequency verb stems than low-frequency ones. But, what was interesting was that response time for the participle forms of strong verbs was related to the frequency of the participles themselves, not to the frequency of the verb stem of these participles. This finding indicated that participles of these verbs were considered as units on their own, not as part of the basic stem of the same verb.

Ullman's Declarative/Procedural Model

The declarative/procedural model of Ullman (2001) is also relevant to the argument about different morpheme types developed in this chapter. Under this model, there are two memory systems, the declarative memory system, which contains memorized words, and the procedural memory system, which is implicated in the learning of new motor and cognitive skills, including grammatical information.
Like some other models, Ullman's model posits that the lexicon and grammar are two separate computational systems, but he argued that there are not different components dedicated to each of these capacities. This is important because Ullman argued that differences in performance in late bilinguals in their L2, as compared with their L1, indicate that they process language differently and that the difference is in their use of these two memory systems. He argued that the processing of linguistic forms that are computed grammatically by procedural memory in L1 is expected to be dependent to a greater extent upon declarative memory in L2 (p. 109). Thus, his argument offers an explanation for why speakers in an L2 do not perform with the same grammatical accuracy that they can in their L1 and may explain why L2 learners have problems with acquiring certain types of grammatical knowledge.
Overall, he cited a wide array of evidence from L2 learning to aphasia to functional neuroimaging about the neural bases of L2 learning. Unfortunately, for the argument I develop here, Ullman was not very specific about what he included under the type of learning that the procedural memory system subserves. However, he did say that "[T]his system may be particularly important in the learning and computation of sequential and hierarchical structures (i.e., in grammatical structure building)" (p. 107).
As will become clear, I indicate that it is important to differentiate types of grammatical morpheme, reflecting the fact that they have different distributions across various types of data. However, especially those morphemes that are most critical in grammatical structure building are the ones for which I posit a different route to production. They are also the ones that seem to be hardest to acquire accurately in late L2 learning (cf. Myers-Scotton & Jake, 2000a; Wei, 2000a, 2000b).
The views of Clahsen, Jackendoff, and Ullman, as well as the results of other researchers, such as Marcus, Brinkman, Clahsen, Weise, and Pinker (1995) and Pinker (1999), lead to a particular view of the nature of lemmas in the mental lexicon as it relates to morpheme decomposition. This view is compatible with part of the argument I make here about the nature of different types of grammatical morphemes, how they are organized in the mental lexicon, and how they become salient in production. Data from contact phenomena add strong evidence substantiating the claim that there are lemmas supporting several types of elements. Both content morphemes and regular inflections are supported individually in the mental lexicon (by different lemmas), but there also are holistic lemmas for elements that Jackendoff referred to as semiproductive.

Code Switching Data as a Window on Combinations

In code switching, regular inflectional elements from one language can join with content morphemes from another language; this is strong evidence that regular inflections are supported as individual elements in the mental lexicon, as just suggested. At the same time, code switching also provides good evidence that Jackendoff's semiproductive elements are based on single units in the mental lexicon; they are not constructed on line. The evidence is that Embedded Language nonfinite verb forms, especially for the participles, from different languages always appear as holistic units in code switching. This is discussed in a later section. However, first the basics of the MLF model are presented more fully.
online with input from both languages, I cite two other examples here. Example 4 comes from Ewe, a language in the Akan cluster of language varieties in Ghana. Ewe can be identified as the Matrix Language based on its frame-building features in this clause. The English verb weed receives the Ewe inflection for habitual aspect, and the English noun garden receives the definite suffix from Ewe. The Ewe suffix for habitual aspect (-na) is the type of inflection that identifies Ewe as the Matrix Language under the System Morpheme Principle. This principle does not specify that inflections, such as the definite suffix on garden, must come from the language identified as the Matrix Language. However, recall that the Uniform Structure Principle gives preference to maintaining the structure of the Matrix Language. This explains why Ewe also supplies this suffix. Note that garden appears in a postpositional phrase headed by Ewe. This phrase follows Ewe order, not that of English, in support of the Morpheme Order Principle.

Example 4 (Ewe/English, Amuzu, 1998, p. 56, cited in Myers-Scotton, 2002a, p. 89)
wo ts-na wo fe asi-wo ts-na
3PL take-HAB 3PL POSS hand-PL take-HAB
weed-na garden-a me-e
weed-HAB garden-DEF in-FOC
They take [use] their hands to weed in the garden

Example 5 comes from Croatian, in contact with English in the speech of Croatian second-generation immigrants in Australia. Croatian is a morphologically rich language with inflections that contain more than one morpheme in one phonological unit. In this case, an English verb (pack, transcribed as /pak/ by Hlavac) is inflected with such a Croatian multimorpheme unit. It contains the subject-verb agreement suffix for both first person singular and present tense. The English noun container (transcribed as /kontejner/ by Hlavac) receives a suffix for masculine, singular, and accusative. These inflections (subject-verb agreement and accusative case) are the type of frame-building morphemes that, according to the System Morpheme Principle, must come from only one of the languages.

Example 5 (Croatian/English, Hlavac, 2000, p. 392, cited in Myers-Scotton, 2002a, p. 90)
. . . i tako [pak]-ujem one . . . [kontejner]-e i tako dalje . . .
and so pack-1S/PRES those container-M/PL/ACC and so on
And so [I] pack those . . . containers and so on . . .

Code-Switching Evidence for the Holistic Nature of Nonfinite Verb Forms

Depending on the language pair, nonfinite Embedded Language verbs serve a variety of functions in code switching. These verb forms are multimorphemic on their own; that is, they consist of a content morpheme and a so-called inflectional morpheme that fits the type of early system morpheme under the 4-M model. Such system morphemes are defined and exemplified in a following section. However, even though nonfinite verbs are multimorphemic, they always occur as holistic Embedded Language units, even if they receive Matrix Language inflections as well. (For example, in tuko confused [we are confused], a clause from Swahili/English code switching, the English past participle is intact, as it would be in a monolingual English clause.)
In contrast, in those language pairs in which the Embedded Language finite verb can receive Matrix Language inflections, the finite verb never occurs in mixed constituents with any Embedded Language inflections as it would have in monolingual data. (For example, recall u-na-change [you will change] in Example 3 from Swahili/English code switching; change does not appear with any English inflections.) This difference across finite and nonfinite verb forms is evidence that the nonfinite verbs are supported by single lemmas in the mental lexicon.
Recall the statement that, in many language pairs, Embedded Language verbs with agreement or tense/aspect affixes from the Matrix Language do not occur. Instead, in these language pairs, speakers produce the Matrix Language verb for "do" and inflect it with the relevant Matrix Language affixes. This "do" verb is then followed by a nonfinite (not marked for tense) form of the Embedded Language verb that carries the speaker's intended meaning. In all cases in the literature, no matter what the specific language pairs, the nonfinite verb
form appears as the single unit it is in monolingual data. The nonfinite form is typically the infinitive (stem + infinitival affix), but it also can be the present participle in some language pairs. In Example 6, this "do" construction is illustrated from Turkish/Dutch code switching, with the Dutch infinitive kijk-en (watch). There are no examples of Dutch finite verbs receiving Turkish inflections.

Example 6 (Turkish/Dutch, Backus, 1996, p. 238)
ja, maar toch, millet kijk-en yap-yor
yeah, but still, everybody watch-INF do-PROG/3S
Yeah, but still, everybody is watching you.

In other language pairs, when inflecting an Embedded Language verb stem with Matrix Language tense/aspect affixes seems blocked, a different compromise strategy is employed (cf. discussion in Jake & Myers-Scotton, 1997; Myers-Scotton, 2002a; Myers-Scotton & Jake, 1995, on congruence). For example, in Acholi/English code switching, a nonfinite Embedded Language verb form occurs very freely and serves several different functions, replacing a finite verb in some constructions. The English present participle is this ubiquitous form. (Acholi, the Matrix Language, is a Nilotic language spoken in Uganda.) Altogether, there are 48 examples in the relatively small corpus studied that show the English present participle functioning in three ways. In only one case does the participle function as a fully inflected verb (with Acholi tense/aspect inflections). Half of the participles (24/48) do receive a subject-verb agreement prefix and then function as part of a reduced relative clause or otherwise subordinate clause (e.g., gi-doing [they doing/they who do] as in Example 7). Another 29% (14/48) occur in Acholi infinitive verb positions (i.e., they are equivalent to infinitives), either with or without the Acholi infinitival prefix (as in ka terrorizing [to terrorize]). Finally, 21% (10/48) are gerunds or other types of nominals (as in the prepositional phrase labongo considering life [without considering life]) or in associative constructions (e.g., chances me surviving [chances of surviving] with Acholi me as of).
Some nonfinite verb forms (usually infinitives) also appear as holistic units under the phenomenon that I call double morphology. In such a case, an Embedded Language content morpheme (most often a noun) appears with the relevant Matrix Language affix (for plural on nouns, for infinitival marker on infinitive forms). What makes the form noteworthy is that the relevant Embedded Language affix also appears with its Embedded Language head. In examples in which French is the Embedded Language, French infinitives sometimes appear in their holistic (i.e., French) form, but with an infinitival inflection from the Matrix Language. For example, in Congo Swahili/French code switching, the infinitival form ku-re-nvyoy-er (to return) appears (Kamwangamalu, 1987, p. 172). Nouns showing double morphology are discussed in the section on early system morphemes.
In addition to the three types of constructions with nonfinite Embedded Language forms that appear holistically in code switching as verb forms, in many language pairs Embedded Language past participles function as predicate adjectives. Again, they always appear as a holistic form. In the Nairobi corpus studied for my work in 1993 (Myers-Scotton, 1997), there are eight examples. One is illustrated in Example 8.

Example 8 (Swahili/English code switching, Nairobi corpus, Myers-Scotton, 1988)
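
As a side check on the Acholi/English figures reported in this section (24, 14, and 10 of 48 English present participles), the shares can be recomputed in a few lines. This is only an illustrative sketch: the category labels are shorthand introduced here, not the author's terms, and it assumes the single fully inflected participle mentioned in the text is already counted within one of the three categories, since the three counts by themselves sum to 48.

```python
from fractions import Fraction

# Counts as reported in the text; labels are hypothetical shorthand.
TOTAL = 48
counts = {
    "agreement prefix, subordinate clause": 24,
    "infinitive position": 14,
    "gerund or other nominal": 10,
}

def shares(category_counts, total):
    """Return each category's share of the total, rounded to a whole percent."""
    return {
        label: round(100 * Fraction(n, total))
        for label, n in category_counts.items()
    }

percentages = shares(counts, TOTAL)
for label, pct in percentages.items():
    print(f"{label}: {counts[label]}/{TOTAL} = {pct}%")
```

The rounded shares agree with the "half," 29%, and 21% given in the prose.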
inflections, they produce inflected words online, even cross-linguistically. But, lemmas that support irregular forms or most nonfinite forms are not productive in this sense. The evidence from code switching that Embedded Language nonfinite verb forms appear as holistic forms is additional evidence that some words (which contain a so-called inflectional morpheme) are supported as full forms (by lemmas) in the mental lexicon; they are not constructed on line.

Basic Asymmetries in Contact Data

Now, having supported the argument that not all inflected words are accessed in the same way, I return to a related argument and the main goal of this chapter, to argue for differential routes of access from the mental lexicon. Systematic study of naturally occurring data since at least the early 1990s from diverse pairs of languages in contact supports two basic asymmetries. Much of this research has been on code switching, but the asymmetries are evident in other language contact phenomena as well. These asymmetries refer to the difference in the roles of content morphemes and what I refer to as system morphemes. The first asymmetry is this: There is a basic split between content morphemes (the lexicon) and the grammar in how they participate in structuring language. This split is graphically reflected in most contact phenomena. In this chapter, I illustrate that split with code-switching data discussed in terms of the MLF model. How this split is played out differs somewhat across contact phenomena, with variations on the split partly based not only on differences in the proficiency of speakers in the languages involved, but also on the effects of sociopolitical factors. However, I stress that the fact there is a split at all has more to do with the abstract aspects of the types of morpheme than with these external factors, including such psycholinguistic factors as frequency.
As a quick example, in the most prevalent of all contact phenomena, lexical borrowing, how does it happen that nearly 100% of borrowed forms are content words? Content morphemes are the prime candidates for borrowing for two reasons. First, content words signal speakers' intentions (intentions to convey meanings). Second, they are directly accessed at the level of the mental lexicon; this means they are more accessible immediately, and they are more salient than system morphemes (Bock & Levelt, 1994, referred to them as directly elected).
Further, similar asymmetries between content morphemes and grammatical elements in other contact phenomena become obvious. For example, content morphemes either are modified first (and what is modified most is their lexical-conceptual structure) or are replaced first when speakers show attrition in their L1 and more use of an L2 dominant in the community (cf. Myers-Scotton, 2002a; Schmitt, 2001). In addition, any examination of creoles shows that content morphemes from the superstrate language have quite a different role than superstrate grammatical elements (system morphemes) in shaping the creole (Myers-Scotton 2001b, 2002a). (Superstrate refers to the language variety spoken by the overseers/owners present at the time of creole formation.)
The second asymmetry is this: Different types of system morpheme have different patterns of distribution across many contact phenomena. The term system morpheme was employed under the MLF model because it captures generalizations not available under other designations. The model distinguishes content morphemes from system morphemes by this criterion: Content morphemes assign or receive thematic roles; system morphemes do not. Note that this criterion does not pick out the same morphemes as the designation functional element does (cf. Myers-Scotton 2002a; Myers-Scotton & Jake, 2000b). For example, pronouns in English are content morphemes.

Four Types of Morpheme in the 4-M Model

To motivate a hypothesis that explains differences in distribution, I need to present the basics of the 4-M model. This model differentiates four types of morpheme: content morphemes and three types of system morpheme. (The term morpheme is used in the model in two ways: It refers not only to the abstractions underlying surface-level morphemes, but also to the surface-level forms themselves.) In code switching, although Embedded Language content morphemes can occur relatively freely in a mixed constituent, Embedded Language system morphemes cannot. The MLF model captures this notion, but the 4-M model makes further refinements in the category of system morpheme, making possible finer-grained explanations of distributions.
The 4-M classification receives independent motivation from the phrase structure properties of
the four morpheme types. Content morphemes are and because they have two or more variants. It is
defined as those that assign/receive thematic roles not until a larger constituent is assembled that it is
and head their immediate maximal projections (e.g., clear which variant of their form is to be used. For
noun phrase). Prototypical content morphemes are example, in English, the past tense morpheme is an
nouns and verbs. Thematic roles refer basically to outsider. Whether past tense is encoded by -ed or
the semantic roles in any clause (e.g., agent or by the auxiliary verb do in its past tense form is not
patient). System morphemes are defined as those clear until the clause is assembled. With regular
inflections and functional elements that do not as- verbs in declarative statements, the form is -ed. But
sign or receive thematic roles. In English, for ex- in interrogative sentences, the verb do takes the
ample, some prepositions are content morphemes past tense inflection (as in Did you go there yes-
(e.g., for in I did it for Stella; for assigns the the- terday?) Subject-verb agreement is also an outsider
matic role of beneficiary to Stella). But, some are late system morpheme. Note how subject-verb
system morphemes like at in look at that dog. agreement has two variants in the present tense
Early system morphemes are different from late English; it is marked with -s for third-person sin-
system morphemes in several ways. First, they gular, but with a null element for the other persons
pattern with content morphemes in sharing the and numbers. In languages with overt case, such as
feature of conceptual activation. They can be German, case also is an outsider late system mor-
thought of as eshing out the meaning of content pheme. When elements signaling morphology are
morphemes. Second, they depend on their content multimorphemic (e.g., German determiners, which
morpheme heads in their immediate maximal pro- include morphemes for number, gender, and case),
jections for their form. (They are what Bock and the late system morpheme (case) seems to be most
Levelt, 1994, seemed to have had in mind by re- salient (cf. Myers-Scotton & Jake, 2000a, 2001, on
ferring to some words as indirectly elected.) Italian/Swiss German code switching).
However, under the 4-M model, the type of early A way to summarize the four types of mor-
system morpheme includes not only words (e.g., pheme is in terms of the abstract oppositions that
determiners), but also inections (e.g., derivational can separate them. Both content morphemes and
afxes and afxes marking plural). early system morphemes have the feature [ con-
In contrast with early system morphemes, the conceptually activated], but late system morphemes
two types of late system morpheme do not depend do not. Content morphemes are further differenti-
on their heads in syntactic structures. This differ- ated from all system morphemes because they have
ence is related to the Differential Hypothesis, the feature [ thematic role assigners/receivers].
which is developed in the next section. In fact, the Finally, outsider late system morphemes are dis-
reason they are called late is that they are hy- tinguished from bridge late system morphemes
pothesized to be projected later than either content based on phrase-building operations. Late outsid-
morphemes or early system morphemes. ers have the feature [ requires outside operations],
Within the category late system morpheme, but bridges do not. Across languages, the deni-
bridge late system morphemes are projected when tions under the 4-M model do not necessarily put
a grammatical conguration (the immediate maxi- the same lexical categories or types of afx into the
mal projection in which they occur) requires them. same morpheme type. However, the same deni-
Thus, of in collar of Bora is a bridge. Bridges are tions of morpheme type apply across all languages;
invariant, at least in all languages examined within therefore, any morphemes across languages that t
the terms of the 4-M model to date. They are called the same denition are the same type. (For exam-
bridges because their role is to join together ele- ple, not all the afxes that Hungarian grammarians
ments to produce a constituent that is well formed refer to as case markers are the same morpheme
in the relevant language. Thus, in French il in il type as what are called case markers in German.)
pleut chaque jour (It rains each day) is such
a bridge (and is different from its homonym, the
third person pronoun il, which refers to an object). The Differential Access
Outsider late system morphemes are called Hypothesis
outsiders because they depend for their form on
information from outside their immediate maximal The preceding discussion makes it clear that mor-
projection. That is, their form is coindexed with phemes can be classied under the 4-M model in
elements outside the maximal projection in which terms of the different roles that the morphemes
they occur. They differ from bridges in this way have in phrase structure. What is of more interest
here is that these four types have different distribution patterns in contact phenomena.
(Their differential distribution in monolingual data is also of interest, but not a subject
here.) That is, when two languages are present in a clause, the morpheme types do not observe
the same restrictions on occurrence. The point has already been made that not all types of
morpheme can come from both languages in mixed constituents in code switching. Data presented
in the following sections make details of this asymmetry clearer. Limited discussion of data
from other contact phenomena also points to asymmetries in how morpheme types can occur. That
these asymmetries exist implies differences in the morpheme types, not just in terms of their
roles in surface phrase structure, but also at some abstract level. The Differential Access
Hypothesis offers an explanation for these differences, referring to how the morphemes are
accessed in production. The Differential Access Hypothesis is the following:

    The different types of morpheme under the 4-M model are differentially accessed in the
    abstract levels of the production process. Specifically, content morphemes and early
    system morphemes are accessed at the level of the mental lexicon, but late system
    morphemes do not become salient until the level of the formulator. (Myers-Scotton,
    2002a, p. 78)

The hypothesis implies the following scenario for accessing late system morphemes: Lemmas
underlying content and early system morphemes send language-specific directions to the
formulator to build larger linguistic units. These instructions contain information about
assigning late system morphemes to these larger structures. That is, the information in the
lemmas supporting late system morphemes does not become salient until the content morphemes
that have directions about the syntactic roles (and morphological realizations) of late
system morphemes call them.

The Differential Access Hypothesis is similar to Garrett's views (e.g., 1975, 1993, inter
alia). He noted that major and minor grammatical category words behave quite differently
(1993, p. 81). He referred to open class elements as recruited by direct retrieval processes.
For him, the closed class elements are minor category elements that rarely appear in
exchanges and are recruited as parts of structural frames, most particularly planning frames
associated with phonological phrasing (p. 81). One difference seems to be that Garrett did
not differentiate types of system morpheme and their level of access in relation to content
morphemes (and, of course, in my view, not all closed class items are system morphemes).

Note that my hypothesis does not preclude the notion of simultaneous processing; structure at
one level does not have to be completed before work at another level begins. Further, the
terms "early" and "late" are used more or less metaphorically in the 4-M model. That is, when
I say that certain types of morpheme are not activated until later than other types, this
simply means their activation depends on salience of the earlier elements, making certain
directions and combinations available for procedures at the level of the formulator.

However, note that by separating early system morphemes from late system morphemes, this
hypothesis calls into question the implied notion that all constructions with regular
morphological elements undergo language production in the same way. True, all system
morphemes can be seen as combining with other lexical items to satisfy the variable pattern
involved. But when and where they are assembled is not the same. Thus, my view differs in an
important way from the views of Pinker, Clahsen, and others.

Further, in the terms of the 4-M model, my colleagues and I view the English participial
forms as necessarily consisting of a content morpheme and an early system morpheme, not a
late system morpheme. This difference is relevant to when these forms are accessed.

Even though the participial suffixes often appear on lists of English inflectional suffixes
in textbooks, they are more like derivational affixes than other inflectional suffixes in
that they change the meaning of the content morpheme that is their head (e.g., present
participles can function as gerunds; past participles can function as predicate adjectives,
etc.). True, past participles share the same phonetic form with past tense forms for many
verbs (e.g., stop, stopped, stopped), but this does not mean that the two verbs are
isomorphic in more than form.

Exemplifying Asymmetries in Code Switching With System Morphemes

As indicated, asymmetries in the distribution of system morphemes characterize all contact
phenomena, but most dramatically code switching. The MLF model attempts to capture this
asymmetry in the System Morpheme Principle. This principle
distinguishes among system morphemes by specifying that one type must come from the Matrix
Language in mixed constituents, those with grammatical relations external to their head
constituent (Myers-Scotton, 1997, p. 83). (Unfortunately, many researchers have interpreted
the principle as applying to all system morphemes; cf. Myers-Scotton, 2001b.) Because the 4-M
model explicitly divides system morphemes into three types, this division should make clearer
the limited scope of the System Morpheme Principle. The principle refers only to those
morphemes called outsider late system morphemes under the 4-M model. Frame building in the
bilingual clause depends on these morphemes because they indicate hierarchical relations
beyond those in immediate maximal projections. That is, the role of outsiders is to knit
together the clause. For this reason, it is no surprise that they must come from the language
from which the morphosyntactic frame is derived.

Outsider Late System Morphemes and the Matrix Language

Examples 9 and 10, as well as the examples cited previously, show that the System Morpheme
Principle makes the right predictions for outsider late system morphemes in code switching.
In Example 9, even though the verb for "telephone" comes from French, the subject-verb
agreement marker (the prefix na-) comes from Lingala (the Matrix Language). The reason that
this marker is an outsider morpheme is that its form (i.e., which person and number will it
refer to?) is not clear until the verb is put in the larger clause with the NP that contains
Ngai ("I").

Example 9 (Lingala/French, Meeuwis & Blommaert, 1998, p. 86; I added the glosses)

    Ngai  moto    na-telephoner.  na-telephon-aki    na  tongo
    1S    person  1S-telephone    1S-telephone-PAST  at  morning
    "I am the one who called. I called this morning."

Example 10 came from a Hungarian child who is being raised in the United States and who, at
the time of this recording, showed a good deal of code switching between Hungarian and
English. (Later, she showed increasing attrition of Hungarian as English became her dominant
language.) In this example, she integrated the English noun "caterpillar" into the Hungarian
frame by inflecting the noun with the accusative case suffix that such a direct object would
receive in Hungarian. Again, the case that the noun "caterpillar" will receive in this clause
is not known until the noun phrase containing it is combined with the verb, which assigns the
thematic role of patient to the noun and the case of accusative.

Example 10 (Hungarian/English code switching, Bolonyai, 1998, p. 33)

    el-enged-t-em       a    caterpillar-t
    PREVERB-let-PST-1S  the  caterpillar-ACC
    "I let the caterpillar go."

Exemplifying Special Distributions for Early System Morphemes

Various researchers have observed that occasionally some system morphemes are doubled in code
switching; that is, both the Matrix Language and the Embedded Language versions of the same
system morpheme occur with an Embedded Language head. The plural affix is doubled most
frequently (e.g., ma-ghost-s) in Swahili/English CS, with ma-, the prefix marking Swahili
noun Class 6 (a plural class). Such instances of double morphology were explained as the
result of mistiming (Myers-Scotton, 1997). Speakers access an Embedded Language noun that
they intend as a plural, and the Embedded Language affix for plural is accessed along with
the noun, even though what the morphosyntactic frame calls for is the Matrix Language plural
affix alone. Doubling of such morphemes is called double morphology.

The 4-M model and the Differential Access Hypothesis provide an explanation of why this
mistiming happens only with early system morphemes and motivate the following Early System
Morpheme Hypothesis: Only early system morphemes may be doubled in classic code switching.
The motivation is as follows: Early system morphemes have a very different relation with the
heads in their immediate maximal projection than other system morphemes. Like their heads,
early system morphemes are conceptually activated. Under the Differential Access Hypothesis,
they are salient at the same time as their content morpheme heads (in the mental lexicon).
Thus, they are available for any mistiming to occur. (Evidence that this
doubling is a type of error is that the doubling only occurs occasionally in code-switching
corpora.)

In contrast to early system morphemes, also according to the Differential Access Hypothesis,
the structurally assigned system morphemes (late system morphemes) are not available until
the level of the formulator. Thus, the fact that double morphology does not affect late
system morphemes offers support for this hypothesis.

Examples 11 and 12 show plural affixes from both the Matrix Language and the Embedded
Language. Example 11 includes two suffixes, one (-lar) from Turkish (the Matrix Language) and
the other (-en) from Dutch (the Embedded Language) on the Dutch noun for "Pole." Note as well
that the Dutch noun receives the Turkish suffix for dative case, in line with the System
Morpheme Principle.

Example 11 (Turkish/Dutch, Backus, 1992, p. 90)

cases when only outsider late system morphemes must come from the Matrix Language? For
example, why is there a Shona plural marker on the English noun "lesson" in Example 12?
Although there is nothing in the MLF model to require that early system morphemes come from
the Matrix Language, the Uniform Structure Principle (cited above) gives preference to
maintaining the same source of structural elements in any constituent. Under the System
Morpheme Principle, outsider morphemes in any mixed constituent in code switching must come
from the Matrix Language; thus, to maintain uniformity, the bias is for other system
morphemes to come from the Matrix Language as well. And the code switching literature largely
gives evidence of this bias. There are only a few reported examples of early system
morphemes and only one bridge system morpheme reported (Arabic djal [of] when French is the
Matrix Language).
seems to be a case of mistiming. (See Myers-Scotton, 2002a, pp. 129–131, for an explanation
of how Chichewa Class 9 agreement is spread from the English noun to the Chichewa modifier.)

Example 13 (Chichewa/English CS, Simango, 2000, p. 494)

    Ngoni,  ta-mu-send-er-a            mw-ana     apple-s   i-modzi
            IMPER-OBJ/3S-peel-APPL-FV  CL1-child  apple-PL  CL9-one
    "Ngoni, peel one apple for the child."

Evidence of Asymmetries in Other Contact Data

Naturally occurring data in various types of contact phenomena show how the asymmetry between
different types of system morpheme plays out. The most dramatic example may come from classic
code switching, in which all outsider late system morphemes (those indicating grammatical
relations across phrase structure in the bilingual clause) in mixed constituents must come
only from one language, the Matrix Language (the source of the morphosyntactic frame for the
bilingual clause). All indications are that this distribution holds across diverse
code-switching data sets, although few have been studied quantitatively, and singly occurring
exceptions may occur. One quantitative study (Myers-Scotton, 2002b) found that the Matrix
Language is the source of all late system morphemes in mixed constituents in all bilingual
clauses in the corpus (N = 229). Another study (Finlayson, Myers-Scotton, & Calteaux, 1998)
reported the same finding for all bilingual clauses (N = 124).

Further evidence about asymmetries concerning morpheme type comes from interlanguage (speech
produced by L2 learners). Beginners' accuracy is much lower on late system morphemes than on
other morphemes. For example, English third-person singular -s is less accurately produced by
low-level L2 learners than is noun plural -s (an early system morpheme) (Wei, 2000a, 2000b,
on Japanese and Chinese learners of English). This finding even applies to advanced L2
learners (Blazquez-Domingo, 2001, on English speakers studying Spanish). They are much less
accurate in producing a Spanish preposition that is an outsider morpheme than they are in
producing prepositions that are either content morphemes or early system morphemes.

Similar differences apply to other types of contact phenomena. For example, in cases of L1
attrition, speakers show different patterns of substitution and retention for late system
morphemes than they do for the early ones (Bolonyai, 1999, 2002, for Hungarian children
living in the United States who were taking on English as their dominant language). Other
attrition studies showed that, contrary to some beliefs, outsider morphemes (specifically
case markers) are very resistant to loss (Gross, 2000, on long-term German residents in the
United States and Hlavac, 2000, on second-generation Croatian speakers in Australia).

More evidence supporting the notion of differential access is available from diverse sources,
including monolingual data. For example, the link the 4-M model makes between early system
morphemes and content morphemes is borne out in lexical borrowing and in speech errors.
Recall that the two morpheme types share the feature of conceptual activation. Speakers of
one language sometimes borrow from another language an early system morpheme along with its
noun. There are a number of such borrowings from Arabic in various European languages (e.g.,
alcohol from Arabic [al kuhl], al [the] and kuhl [used to make absinthe]). Or, speakers
forming a creole sometimes assume an early morpheme preceding a noun is part of the noun
(e.g., lavyan in Mauritian Creole from French la viande [meat]). Also, in speech errors when
affixes are stranded, English plural affixes move with their content morpheme heads more
often than is predicted by chance (e.g., "I presume you could get light in poorer picture-s";
Stemberger, 1985, p. 162, cited in Myers-Scotton, 2002a, p. 83). In addition, the asymmetry
between the distribution of conceptually activated morphemes (content and early system
morphemes) and structurally assigned (late) system morphemes is also evident in the data on
speech by Broca's aphasics reanalyzed by Myers-Scotton and Jake (2000a).

Further, I (Myers-Scotton, 2002a) also looked at asymmetries in morpheme distribution in
other contact phenomena in relation to the 4-M model and the Differential Access Hypothesis.
In the next section, one contact phenomenon, creole development, is discussed in some detail
in the terms of the argument of this chapter.

Creole Development and Asymmetries Among Morpheme Types

In creole structure, the contributing languages play different roles as well. A composite of
the substrate
languages (i.e., the L1s of the slaves/workers developing the creole) is the likely source of
most of the abstract morphosyntactic base in any creole. The superstrate or lexifier language
(the variety spoken by overseers/owners in the creole scenario) also plays its part. It
supplies most of the content morphemes to fill this frame in two ways. First, superstrate
content morphemes express most of the intended referential messages. Second, of more
interest, these morphemes also appear in reconfigured forms as late system morphemes to meet
the requirements of the morphosyntactic frame (Myers-Scotton 2001a, 2002a).

Reconfiguring Content Morphemes as System Morphemes

Creoles develop when workers speaking different L1s are thrown together and need to
communicate with each other and their overseers. Attempting to learn the language of the
overseers (the superstrate) is often a favored option, but because the workers have limited
interactions with superstrate speakers, they have limited possibilities for learning the
superstrate language. That is, creole formation is related to L2 acquisition, but it does not
have the same structural outcome because the conditions of learning are different. Instead of
acquiring an L2 version of what might be called the target language, speakers acquire a new
language, the creole.

As just noted, workers develop a language (the creole) that largely has superstrate words.
The result is that a creole, such as Gullah, the data source exemplified here, appears to be
a version of English. In fact, it is quite different. One reason for the difference is that
the morphosyntactic frame for a creole seems to be largely drawn from a composite of the
languages of the workers who developed the creole, not from the superstrate language. A
second reason, relevant to the 4-M model, is that not all types of English morpheme appear in
the creole, and some appear in a reconfigured form.

Jake and I (Myers-Scotton & Jake, 2002) showed that differences in the distribution of
English morphemes are predictable based on the 4-M model and the Differential Access
Hypothesis. Our reasoning is this: Because content morphemes are conceptually activated
(i.e., they convey semantics and pragmatics), they are more accessible to the creole speakers
than late system morphemes. Late system morphemes signal grammatical relations, but their
uses are not so transparent as are those of content morphemes to the language learner with
limited access to hearing the target language used. Also, superstrate late system morphemes
do not necessarily meet the requirements of the morphosyntactic frame if we accept the view
that this frame comes largely from the substrate languages. Thus, we predicted that English
language system morphemes would not occur in a creole with English as its superstrate
language.

We (Myers-Scotton & Jake, 2002) tested this Creole System Morpheme Hypothesis by analyzing
four texts (108 lines) from Gullah (Turner, 1949/2002). Gullah is a creole spoken on the
coast and offshore islands of South Carolina; it has English as its superstrate language. The
study concentrated on instances of English regular verb inflections that are outsider late
system morphemes (third-person singular present tense -s or past tense -ed). Presumably, verb
forms with these morphemes would be assembled online (at the level of the formulator,
according to the Differential Access Hypothesis). Results showed that, of all verbal tokens
with opportunities for such regular inflections (N = 26), English inflection was missing in
100% of the cases. An example of a context for a third-person singular present tense suffix
(-s), but with no suffix, is [i ca fu men (he care[s] for men)]. There are a few examples of
irregular past tense verbs (e.g., I took) and six examples of irregular verbs showing either
subject-verb agreement (five examples of "is") or subject-verb agreement and past tense (one
example of "was"), but it is likely these are present in the mental lexicon as units and not
assembled online.

Also, creoles give other evidence, in addition to missing verb inflections, that superstrate
late system morphemes are not available in creole formation. In both Jamaican Creole (in
which one might expect existential "it" from English in weather clauses and similar clauses)
and in Haitian Creole (in which one would expect the French clitic pronoun "il" to serve this
existential role), these superstrate forms are missing. Examples (from Holm, 1988, p. 88) all
translate in English as "it's raining." In these cases, the superstrate existential pronoun
is a bridge late system morpheme. (In the Gullah texts analyzed by Myers-Scotton & Jake,
2002, the existential "it" was also absent in 27/27 opportunities.)

Example 14a

    ren a faal
    (Jamaican Creole)
    English: "Rain is falling/It's raining"
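
The predictions discussed in this section rest on the feature oppositions of the 4-M model
and on the Differential Access Hypothesis. As a minimal sketch only, and not part of the
chapter's own apparatus, those oppositions can be written as a small decision procedure in
Python. The class name `Morpheme`, the function names `classify` and `access_level`, and the
feature encodings given to the sample morphemes are illustrative assumptions; the four
sample classifications themselves follow the chapter's statements about English "for",
noun plural -s, "of", and third-person singular -s.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    """A morpheme described by the three binary oppositions of the 4-M model.

    The field names are hypothetical labels for the features named in the text.
    """
    form: str
    conceptually_activated: bool  # [+/- conceptually activated]
    thematic_role: bool           # [+/- thematic role assigner/receiver]
    outside_operations: bool      # [+/- requires outside operations]


def classify(m: Morpheme) -> str:
    """Return the 4-M morpheme type implied by the feature values."""
    if m.conceptually_activated:
        # Content and early system morphemes share conceptual activation;
        # only content morphemes assign or receive thematic roles.
        return "content" if m.thematic_role else "early system"
    if m.thematic_role:
        # The text describes no type that has a thematic role but lacks
        # conceptual activation, so this combination is rejected.
        raise ValueError(f"inconsistent features for {m.form!r}")
    # Late system morphemes: outsiders look outside their maximal projection.
    return "outsider late system" if m.outside_operations else "bridge late system"


def access_level(m: Morpheme) -> str:
    """Level of access under the Differential Access Hypothesis."""
    if classify(m) in ("content", "early system"):
        return "mental lexicon"
    return "formulator"


# Illustrative encodings (assumed feature values) of examples from the chapter.
SAMPLES = [
    Morpheme("for (I did it for Stella)", True, True, False),
    Morpheme("noun plural -s", True, False, False),
    Morpheme("of (collar of Bora)", False, False, False),
    Morpheme("third-person singular -s", False, False, True),
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{sample.form}: {classify(sample)}, accessed at the {access_level(sample)}")
```

On this encoding, the two morpheme types that the Gullah data show to be missing
(third-person singular -s as an outsider, existential "it" as a bridge) are exactly those
for which `access_level` returns the formulator rather than the mental lexicon.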
to the question, "What guides the extraction?" is UG, but his own version of UG. He said,
"Suppose that Universal Grammar consists of a collection of skeletal fragments of l-rules
[lexical rules] built into lexical memory" (p. 191).

What about my argument that a basic distinction between types of morpheme is universally
evident and in a wide variety of linguistic data? Is this a reason to assume that this
distinction is part of UG? Yes and no. I take a different perspective and look instead to
production, not UG, as the first line of explanation for these data. This enables me to
sidestep the issue of exactly which elements of linguistic structure in humans have a
universal basis.

That is, I choose to pay more attention to the production problem than the innateness
problem. I try to support two related points that are different sides of the same issue.
First, the following important production problem needs to be recognized. To arrive at the
surface level, production has to deal with linear input, but we know that language is
organized in a hierarchical, not linear, fashion. We need to know that noun A can be the
object of verb A without necessarily occurring next to it, or that noun B can be the
beneficiary of the action encoded in the verb, but not be next to the verb. How is this
hierarchical knowledge introduced into the linear stream?

My second point suggests a solution to this problem, namely, that this vital information
about hierarchies and relations among the content morphemes is conveyed from these elements
once they arrive at the level of the formulator. Structurally assigned system morphemes (the
late system morphemes) that only become available at this point perform these tasks.
Therefore, the timing of their salience and the basic conceptually activated versus
structurally assigned distinction of the 4-M model are features of the production system.
Further, making this distinction is more of an operation than it is a principle, so the issue
becomes whether UG includes operations. Jackendoff himself hinted at this idea: "A system of
grammatical relations and a system of morphological agreement makes a lot of sense as
refinements of a syntax-semantic mapping" (2002, p. 264). He also indicated such a system is
part of the interface system between phrasal syntax and meaning. Given that the asymmetries
in how morphemes are distributed show every indication of universality, how this interface
operates may well be part of UG, admittedly an enlarged sense of UG. Although evidence of
these asymmetries is available in monolingual data, the divisions are more obvious in
bilingual data. Contact data offer an especially transparent window on production and,
necessarily, competence.

Note

1. In examples from Bantu languages (e.g., Swahili, Shona), the morph gloss FV in verbs
stands for "final vowel." It carries no meaning on its own, but it is part of a meaningful
pattern of inflections. Also, words in Bantu languages have a consonant-vowel-consonant-vowel
(CVCV) pattern (i.e., they must end in vowels).

References

Amuzu, E. (1998). Aspects of grammatical structure in Ewe-English codeswitching. Unpublished
master's thesis, University of Oslo, Norway.
Auer, P. (1998). Introduction: Bilingual conversation revisited. In P. Auer (Ed.),
Code-switching in conversation (pp. 1–24). London: Routledge.
Backus, A. (1992). Patterns of language mixing: A study of Turkish-Dutch bilingualism.
Wiesbaden, Germany: Harrassowitz.
Backus, A. (1996). Two in one: Bilingual speech of Turkish immigrants in the Netherlands.
Tilburg, The Netherlands: Tilburg University Press.
Bakker, P., & Mous, M. (Eds.). (1994). Mixed languages: 15 case studies in language
intertwining. Amsterdam: Institute for Functional Research into Language and Language Use.
Bernsten, J., & Myers-Scotton, C. (1988). [Shona/English corpus]. Unpublished raw data.
Bickerton, D. (1981). Roots of language. Ann Arbor, MI: Karoma.
Blazquez-Domingo, R. (2001). Not all prepositions are equal: Differential accuracy by
advanced learners of Spanish. Journal of Spanish Applied Linguistics, 5, 163–192.
Bock, J. K., & Levelt, W. (1994). Language production: Grammatical encoding. In M. A.
Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 945–984). New York: Academic Press.
Bolonyai, A. (1998). In-between languages: Language shift/maintenance in childhood
bilingualism. International Journal of Bilingualism, 2, 21–43.
Bolonyai, A. (1999). The hidden dimensions of language contact: The case of Hungarian-English
bilingual children. Unpublished doctoral dissertation, University of South Carolina,
Columbia.
17
Language Selection in Bilinguals
Mechanisms and Processes
ABSTRACT One fruitful approach to the study of the processes underlying language
selection in bilinguals is the analysis of the costs associated with the act of switching
from one to the other language. This chapter reviews the experimental findings con-
cerning the cognitive processes that enable the configuration of a switch of language, as
well as those that allow a language, once selected, to be maintained. The effects of
relative proficiency and language context are evaluated, as is the role of monitoring,
and are discussed in the context of current models of bilingual language processing.
(I have ook, I have uh, a brother too [too]; Four related questions are addressed. First, what
Poulisse & Bongaerts, 1994, p. 13) or when the are the processes by which interference between
speaker is under some stress (Dornic, 1979, 1980; multiple languages is prevented? That is, how is a
Grosjean, 1982). Other instances of erroneous language de-selected? An important issue here is
language selectionas they occur unintentionally the extent to which, if at all, the de-selection of
and spontaneously in multilingual speakerswere a language is generalized to that entire language
described by Shanon (1991; see also Clyne, 1997, system. Is language selection global or local; that is,
for a discussion of triggered code switches). The do the de-selection or inhibition processes operate
sometimes unintentional nature of the phenome- on the language as a whole or only on those con-
non is suggestive of the cognitive processes and stituents of the concepts that are relevant?
mechanisms that drive language selection. Why Second, how might the selection processes be
does this type of interference occur when at other affected by relative prociency in the respective
times the bilingual appears perfectly capable of languages? A reasonable supposition might be that
keeping the two languages separate? greater prociency in a language facilitates perfor-
This chapter focuses on the research that has mance in that language generally.
looked at language selection and control experi- Third, what are the factors that trigger a language
mentally, primarily by studying basic speech pro- switch, and how effective are they? For example, do
duction skills such as naming. Typically, bilinguals language-specic information cues activate the en-
at varying levels of relative prociency in their two tire language system? Of particular interest here is
languages are put in an experimental situation that the type of information that may serve as a cue to
requires they switch from one to the other lan- determine language choice. External cues, such as
guage. The focus is on the controlled and willed occur when the other language is heard, may cause
selection of single responses in a bilingual setting the bilingual to becomeif only momentarily
and not on language switching as it occurs spon- linguistically disoriented and result in a switch of
taneously and (un)intentionally in code switching. language. Alternatively, such cues may motivate an
Concentrating on individual responses enables intentional switch.
the comparative analysis of response latencies un- The fourth question then becomes, once a lan-
der controlled conditions, allowing the evaluation guage switch has been accomplished successfully,
of the role of relative proficiency, language context, how is the selected language maintained? Here, the
and the interplay between the two languages. role of monitoring is discussed, as is its importance
Although these tasks (such as the simple naming in preventing the bilingual from falling back into
of pictures in alternate languages) may appear re- the language previously spoken and maintaining
strictive, there is no a priori reason to assume that the language of choice, particularly if this is the
the basic processes allowing the occurrence of a weaker language.
(presumably intentional) switch of language in
conversation (as motivated by a change of inter-
locutor or topic) might not be similar, or even The Selection and De-selection
identical, to a switch of language occurring in re- of Languages
sponse to task instructions in a naming task. Issues
of language selection in conversation and the rules It has long been known that switching between
underlying the manner in which different language languages takes a measurable amount of time
systems interact in fluent speech are discussed (e.g., Dalrymple-Alford, 1967; Kolers, 1966, 1968;
elsewhere in this book (see also Poulisse, 1997, for Macnamara, Krauthammer, & Bolgar, 1968;
a recent review). Reference is made to research Macnamara & Kushnir, 1971). By and large the
findings from the monolingual domain as well, in early experiments aimed at studying intentional
particular those relating to the cognitive processes language switching in production measured the
underlying an individual's ability to switch be- time taken to read (and comprehend) written
tween different tasks. This is not meant to imply monolingual and mixed passages (Kolers, 1966,
that speaking a Language A (LA) in favor of a 1968; Macnamara & Kushnir, 1971). For example,
Language B (LB) is akin to switching, for example, comparisons were made between passages consist-
between reading a word (Task A) or naming the ing of mixed sentences such as "His horse, fol-
color in which it is presented (Task B), but rather to lowed de deux bassets, faisait la terre resonner
recognize that the nature of the selection processes under its even tread" ("His horse, followed by two
may be similar. hounds, made the earth resound under its even
Language Selection in Bilinguals 351
tread"; Kolers, 1966, p. 359) and monolingual pas- Motivated by the results of an early bilingual
sages. The cost associated with a language switch Stroop study demonstrating the bilingual's inability
was placed somewhere between 0.2 and 0.5 s. These to ignore irrelevant and potentially interfering lan-
tangible costs were beautifully consistent with guage information at input (Preston, 1965, as
Clyne's (1980) observations that code switches in cited in Macnamara, 1967a), the idea of a language
spontaneous conversation were often preceded by switch was extended to a model encompassing two
some hesitation, suggesting a time cost as a result of switch mechanisms. To account for bilinguals'
(preparation for) a switch. ability to comprehend their two languages, an au-
The early research was marred by a number tomatic switch was postulated, operating at input.1
of methodological issues, not least of which was Regulating the selection of language in production
the underlying assumption of a fixed time cost, was a controlled switch at output (Macnamara,
unchanged by the direction of the switch from 1967a; Macnamara & Kushnir, 1971; see also
L1 to L2 or vice versa. (See also Grosjean, 1997, Caramazza, Yeni-Komshian, & Zurif, 1974; Dornic
and Paradis, 1980, for further criticisms, including & Laaksonen, 1990).
problems with the grammaticality of the code- The observed switch costs occurred, it was ar-
switched material used, as in the example given.) gued, because "the ease in making a correct re-
Typically, a measure of the switch cost was ob- sponse is exactly balanced by difficulty in inhibiting
tained by subtracting the overall response latencies a wrong one" [italics original] (Macnamara,
associated with naming/reading monolingual pas- 1967b, p. 734). In other words, one of the pro-
sages, or lists of words, from those associated with cesses underlying the language switch mechanism
the mixed-language presentation and dividing the was that of inhibition, thought to operate such that
difference over the number of language switches in responses in the dominant language (L1) were
the mixed presentation. It was therefore impossible harder to suppress, and consequently responses in
to determine whether (as would be intuitively pre- the weaker language (L2) were more difficult to
supposed) it was easiest to switch to the dominant produce on a switch of language. This difficulty
language. By using averaging procedures, the role was balanced out by the comparative ease of sup-
of the bilingual's relative proficiency in the two pressing responses in the weaker L2 to speak in the
languages was ignored also, as was the possible stronger L1.
effect of current language use on the relative ease of The logical inference is that it should be easier
switching. to switch to the dominant L1 than to the weaker
In spite (or perhaps, irrespective) of the limita- L2. Macnamara's (1967b) explanation derived
tions of the data, the switch cost thus calculated from a study requiring participants to generate
was striking enough to give rise to a number of (rather than name) words either in one language
theories attempting to explain its origin. One of the only or in alternating fashion between their two
most persistent theories held that, at a physiologi- languages (Irish and English) and was based only
cal level, there must be a localizable on differences in the number of words produced in
the different conditions. No response latencies were
automatic switch that allows each individual to obtained, thus preventing not only the analysis of
turn from one language to another. . . . When a the latencies associated with a language switch,
child or adult turns to an individual who speaks but also, more importantly, any comparisons of
only English, he speaks English, and turning to a those latencies with respect to the direction of the
man who speaks French and hearing a word of switch (from L1 to L2 or vice versa). Macnamara
French, the conditioning signal turns the switch therefore lacked the data (a fact he acknowledged) to
over [italics added] and only French words come verify his theory. Although its constituent assump-
to mind. (Penfield & Roberts, 1959, p. 253) verify his theory. Although its constituent assump-
suggest that this is not how a switch of language is
This first mention of a language switch mecha- effected.
nism focused on the production of speech, assumed a
clear separation between the languages, and assumed
that language selection was an externally (exoge- Selection and Control: Global
nously) driven process. I return to the notion of or Local Inhibitory Processes?
external cueing (e.g., as signaled by a change in in-
terlocutor) implied in this description in the section If the ease with which LA is suppressed or de-
on language selection and cueing of language choice. selected is not directly commensurate with the ease
352 Production and Control
with which the LB is produced, then what are the language actions (e.g., name a numeral or a picture
processes and mechanisms that allow the bilingual in LA rather than LB). As part of their specified
to switch languages, and at what level of selection goal, language information is coded also. Compe-
do these operate? There are various points at which tition for output from the lexico-semantic system
selection can occur and conceivably also degrees to occurs at the level of the language task schemas,
which selection can occur. either inhibiting or activating lemmas according to
On the one hand, it may be that only the possi- the task relevance of their associated language tags.
ble alternative (same and other-language) responses When the competition is between automatic or
are suppressed when selecting one language over routine behaviors (e.g., reading a word aloud, as
the other. There is evidence from bilingual word prompted by its language membership), the win-
recognition at least that, even on a monolingual ning schema is determined through contention
task, alternative lexical candidates in the other scheduling.
language are accessed also (e.g., Dijkstra & Van When the competition cannot be resolved easily
Heuven, 1998; Van Heuven et al., 1998; see also and selection is willful and deliberate (e.g., when
Kroll & Dijkstra, 2002, for a review). The data do a novel task is carried out, such as naming a picture
not allow the evaluation of whether the entire other in a language as cued by a color), contention
language system is activated, but do indicate that scheduling is controlled and monitored by the
all related representations in both lexica are acti- Supervisory Attentional System (SAS; Norman &
vated and available until fairly late in the selection Shallice, 1986; Shallice, 1988; Shallice & Burgess,
process (Dijkstra, Grainger, & Van Heuven, 1999; 1996). The word, in the correct language, is pro-
see also Dijkstra, Timmermans, & Schriefers, 2000). duced eventually through (a) inhibitory control
For spoken word recognition, similar findings have modulated by the SAS, inhibiting all lemmas with
been reported, suggesting that even in a monolingual inappropriate language tags; and (b) activation by
setting the irrelevant language is accessed (Spivey & the SAS of the relevant language task schema. Im-
Marian, 1999). portantly, lemma selection in LB, through the
On the other hand, it may be that an entire inhibition of items in LA, does not exclude those
language system is suppressed to enable error-free same items from competing in the selection pro-
use of the other language system appropriate under cess. The idea of a supervisory system that moni-
the circumstances. This latter possibility is one that tors language behavior is not a novel one. Obler
has been incorporated in a number of recent the- and Albert (1978) proposed, although not perhaps
ories. For example, in bilingual word recognition it in as much detail, a bilingual monitor system
has been suggested that language nodes control the thought to process, inter alia, linguistic and other
degree to which any given language is (more or cues from the environment and operating contin-
less) activated through excitatory connections with uously to activate the relevant language system.
all word nodes in that language (the Bilingual In- Implicit in the IC model is the notion that the
teractive Activation model; Dijkstra & Van Heu- language as a whole will be affected, because the
ven, 1998; Grainger, 1993; see also Dijkstra, language task schemas selectively activate or in-
chapter 9, and Thomas & Van Heuven, chapter 10, hibit lemmas according to the task requirements
this volume; see also Grosjean, 1997, and Li & (i.e., to produce L1 or L2) (see also the Bilingual
Farkas, 2002, for arguments against the need to Production model; De Bot, 1992; De Bot &
postulate language nodes). A similar proposal was Schreuder, 1993). However, Green (1998b) did
put forward by Green (1986, 1993, 1997, 1998a) suggest that the inhibitory effects are not neces-
to explain the processes underlying the bilingual's sarily global only, but potentially can be selective.
ability to control her two languages. To illustrate For example, translation equivalents and related
the way in which such control may be implemented concepts could be inhibited more strongly than
in the system, this model is briefly discussed here. other concepts (as has been shown in bilingual
Green's Inhibitory Control (IC) model (Green, word recognition).
1993, 1997, 1998a) holds that lemmas are tagged How does the interplay of inhibitory processes
for language-specific information, and these tags affect bilingual language selection in the produc-
are either inhibited or activated by language task tion of speech? To answer this question, a detailed
schemas. The language task schemas (reminiscent analysis of the latencies associated with switches
of the schema as proposed originally by Norman & of language is required. From this, it emerges that,
Shallice, 1986, for the control of actions; see also contrary to early speculations (Macnamara, 1967b),
Cooper & Shallice, 2001) are thought to control a switch of languages when producing a verbal
response is paradoxically slower when switching disparity: Selecting the stronger L1 was more ef-
back to the dominant L1 (Meuter, 1994; Meuter & fortful when the weaker L2 had been used immedi-
Allport, 1999). Bilingual participants named nu- ately before it. This is consistent with observations of
merals rapidly and unpredictably in either L1 or L2, task selection in other domains (Allport & Styles,
as signaled by a color cue. 1990; Allport, Styles, & Hsieh, 1994; Rogers &
As can be seen in Fig. 17.1, the switch costs, Monsell, 1995; see Monsell, 1996, for a compre-
although not immediately obvious to a listener, are hensive review). For example, when participants
nevertheless measurable and significant (mean cost alternated between two versions of the classic Stroop
for a switch from L2 to the dominant L1 = 143 ms; color-word task (Stroop, 1935; name the word vs.
mean cost for a switch from L1 to the weaker name the color the word is printed in), Allport
L2 = 85 ms). The switch cost is determined by et al. found larger switch costs when switching from
subtracting from the mean response latency asso- the weaker color-naming task to the dominant
ciated with (for example) the first switch to L1, the word-naming task. Similar observations have been
mean response latency on nonswitch trials in L1 made also in other bilingual tasks, such as cued
(immediately preceding the first occurrence of L2 picture naming (Kroll & Peck, 1998, as cited in Kroll
switches). The increased response latencies occur & Dijkstra, 2002) and even early bilingual speech
on a switch only and are markedly slower in L1 recognition (Bosch & Sebastian-Galles, 1997).
than in L2. When no switches are made, efficiency A critical look at the early research on bilingual
is once again greater in L1 (i.e., faster response adaptations of the Stroop task (Stroop, 1935) also
latencies). showed the same pattern. Meuter (1994) recal-
Across a series of studies, including numeral culated the data from a number of earlier studies
naming (Meuter & Allport, 1999), superordinate (Albert & Obler, 1978; Dyer, 1971; Kiyak, 1982;
naming (Meuter, 1994), switching languages in Lee, Wee, Tzeng, & Hung, 1992; Preston & Lam-
conversation (Meuter, 2001), and picture-word in- bert, 1969) using the neutral condition as a baseline
terference (Meuter, 1994), this reverse dominancy measure (cf. Jensen & Rohwer, 1966). In four of the
pattern on a switch of language emerged repeat- five studies (Albert & Obler, 1978, excepted), while
edly when there was a marked L1/L2 proficiency nonbalanced bilinguals did experience relatively
Figure 17.1 Mean response latency (in milliseconds) for both nonswitch and switch trials, indicated in
canonical sequence. Nonswitch response trials in L1 and L2 are given as a function of run length (i.e., the
number of successive responses in the same language), with run length classified as follows:
1 only, 2–3, or 4 or more successive responses in the same language. The possible sequence of responses is
indicated up to and including the second switch in a sequence. A maximum of four switches per sequence
was possible (range [0, 4]). (Adapted from Meuter, 1994; see also Meuter & Allport, 1999.) L1, first
language; L2, second language.
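The two ways of computing a switch cost discussed in this chapter can be contrasted in a short sketch. It is an illustration only: the trial records, the field names (language, is_switch, rt_ms), and all latency values are hypothetical and are not data from any of the studies cited. The directional measure is also simplified, in that it uses all nonswitch trials in the response language as the baseline rather than only those immediately preceding a switch.

```python
from statistics import mean

def switch_cost(trials, language):
    """Directional cost: mean latency on switch trials INTO `language`
    minus mean latency on nonswitch trials in that same language."""
    switch = [t["rt_ms"] for t in trials
              if t["language"] == language and t["is_switch"]]
    nonswitch = [t["rt_ms"] for t in trials
                 if t["language"] == language and not t["is_switch"]]
    return mean(switch) - mean(nonswitch)

def averaged_switch_cost(mixed_total_ms, mono_total_ms, n_switches):
    """Early averaging approach: total time difference between mixed and
    monolingual material, divided by the number of switches. The result
    is a single figure that hides the direction of the switch."""
    return (mixed_total_ms - mono_total_ms) / n_switches

# Hypothetical trials (invented latencies, for illustration only).
trials = [
    {"language": "L1", "is_switch": False, "rt_ms": 500},
    {"language": "L1", "is_switch": False, "rt_ms": 520},
    {"language": "L2", "is_switch": True,  "rt_ms": 640},
    {"language": "L2", "is_switch": False, "rt_ms": 560},
    {"language": "L1", "is_switch": True,  "rt_ms": 660},
    {"language": "L1", "is_switch": True,  "rt_ms": 640},
    {"language": "L2", "is_switch": False, "rt_ms": 540},
    {"language": "L2", "is_switch": True,  "rt_ms": 620},
]

cost_to_l1 = switch_cost(trials, "L1")   # 650 - 510 = 140
cost_to_l2 = switch_cost(trials, "L2")   # 630 - 550 = 80
pooled = averaged_switch_cost(12_000, 10_000, 8)  # 250.0
```

With these invented values the cost of switching into L1 exceeds the cost of switching into L2, which is the shape of the asymmetry reported for nonbalanced bilinguals. The pooled figure, by construction, cannot show such an asymmetry.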
greater interference from distractors in the dominant How may these paradoxical patterns be ex-
language, greater interference occurred also when plained? Meuter and Allport (1999) argued that the
responding in the dominant language (see Fig. 17.2, critical components in switching between lan-
top panel). This occurs irrespective of any corre- guages are, first, the establishment of a language set
spondence between the language of presentation and (e.g., to enable a response in L1, competing re-
response. Thus, although the language of the dis- sponses from L2 have to be inhibited) and, second,
tractor (in relation to the response language) the inertia this generates in the system. The out-
influences the speed of response, of crucial importance come of the inertia is the tendency to continue re-
also is the language in which the response is given. sponding in the same language. To produce a
Figure 17.2 Interference per item (in milliseconds) on bilingual color-word Stroop tasks, calculated for
five separate studies using a neutral baseline and plotted according to language of response. Language of
response is either the same as (S) or different from (D) the language of presentation of the target item.
Recalculated data are given for nonbalanced (top panel) as well as balanced participants (lower panel). (A)
Preston and Lambert's (1969) Experiment 3 with English-French bilinguals (top panel) and Experiment 1
(lower panel) with English-Spanish bilinguals. (B) Kiyak's (1982) study with Turkish-English bilinguals.
(C) Albert and Obler's (1978) study with Hebrew-English bilinguals (two nonbalanced groups and one
balanced group). (D) Dyer's (1971) study with Spanish-English bilinguals (combined data). (E) Lee et al.'s
(1992) study with Tamil-English bilinguals (lower panel gives combined data). L1, first language; L2,
second language.
response in the other language on the next trial, the sets of pictures were easier to name in one language
language set inertia (labeled task set inertia when than the other. Switch costs were minimal on those
switching between tasks in other domains; Allport switch trials following responses in the practiced
et al., 1994) has to be overcome. It is the language language, for which little or no inhibition of a
set inertia resulting from responding in the weaker competing response was required. In contrast,
L2 (which required strong suppression of the switch costs on switch trials following responses
competing stronger L1) that is the most difficult to in the unpracticed language, requiring strong inhi-
conquer. This interpretation is consistent with the bition of a competing response, were more than
idea that the language switch costs represent a four times as large (40 ms and 180 ms, respec-
conflict effect arising from the persistence of the tively). This pattern of results would obtain only if
language response set instituted on the preceding the suppression of a response competitor (in the
trial. The paradoxical pattern in the bilingual other language) resulted in the suppression of the
Stroop tasks similarly might be explained by as- associated language system as a whole. Importantly,
suming stronger (or earlier) recoding of the dis- the results provided further evidence to support
tractor item when the intended response language the idea that the critical factor in determining the
is the dominant L1. It follows that comparatively switch cost is the language set established for the
greater difficulty will be experienced when having preceding response, that is, the language set from
to suppress a resulting inappropriate response in L1 which a switch is made.
(Meuter, 1994). It may be seen as parsimonious to impose global
Within the IC model (Green, 1993, 1997, inhibition on elements belonging to one language
1998a), the asymmetrical language switch cost is LA to facilitate the use of the other language LB. It
explained as within-system inhibition. On a switch is commonly assumed that the two language sys-
of language from L2 to L1, the L1 language task tems, more so perhaps when in a bilingual set-
schema is selected in favor of the L2 language task ting, are active to varying degrees (Grainger &
schema that dominated language selection on the Dijkstra, 1992; Green, 1986, 1993, 1998a; Gros-
preceding response. Successful performance in jean, 1998). Evidence from code-switching in-
the weaker L2 on the previous trial was due to the stances also strongly points to the simultaneous
inhibition of all lemmas with L1 language tags. activation of the two languages (see Poulisse,
Because the inhibition of L1 is especially powerful 1997). However, if it is assumed that spreading
in nonbalanced bilinguals, the cost that arises activation is the process by which words in either
from its removal is commensurately large, thus language are accessed, the question arises regarding
accounting for the larger cost observed. the extent to which competing candidates are ac-
For a number of reasons, Meuter and Allport tivated, both within and across languages (see also
(1999) suggested that the suppression of a response De Bot, 1992). Consequently, it may well be the
in a particular language results in the suppression case that the suppression effects are more localized,
of that entire language system. First, responses did something the paradigms as they were implemented
not become faster with an increase in consecutive thus far might not have been able to reveal (see also
responses. Response latencies on switch trials also Green, 1998b).
remained unaffected by this manipulation (see To tease out the possible existence of localized
Meuter & Allport, 1999, Fig. 3, p. 32). In other inhibitory effects, Bajo and Green (1999) manipu-
words, longer periods of speaking LA do not cause lated the numerical distance in a numeral naming
LB to be subjected to increasing inhibition, which task. German-English bilinguals named numerals
would have been reflected in similarly increasing (ranging from 1 to 9) in either language, as cued by
switch costs. Second, a switch cost was observed color. The numerical distance between each suc-
only on the first response in the other language; the cessive pair of numerals was either large (e.g., 1–7,
next response was as fast as any others in the same a numerical distance of 5) or small (e.g., 1–3, a
language (see the nonswitch responses in Fig. 17.1). numerical distance of 1). Numerals were assumed
Third, experimentally induced L1-L2 proficiency to map onto analogue representations of their
differences resulted in identical effects in a cued magnitude and, once activated, a spread of acti-
picture-naming task (Loasby, 1998). Fluent bilin- vation to closely related numeral concepts (i.e., a
guals received selective practice in the naming of small numerical distance removed) would result.
line drawings, subsets of which were practiced in Moyer and Landauer (1967) first described this
L1, others in L2, and the remainder left un- effect when they discovered that the larger of a
practiced in either language, with the result that pair of simultaneously presented numerals took
measurably longer to name the greater the numer- more or less proficient in one language relative
ical distance between it and the smaller numeral. to the other. A logical inference from the findings
By systematically varying the numerical distance discussed in the previous sections is that, when a
between numeral pairs, Bajo and Green (1999) bilingual is equally proficient or practiced at the
were able to evaluate any inhibitory effect at the two languages, the resulting switch costs, in either
conceptual level, as well as determine the level at direction, likewise should be equal.
which control is effected in selection (i.e., internal A number of findings support this idea. In one
to the bilingual lexicon, at the lemma level, or study, a small group of carefully selected balanced
external to it). If global inhibition only operated, English-French bilinguals named the superordi-
then no differences in response latencies would nates of common nouns in one or the other language,
be expected as a function of numerical distance. as cued by color (e.g., sparrow–BIRD, carnation–
By contrast, if the inhibitory effects also operated FLEUR [FLOWER]; Meuter, 1994; Meuter &
locally, increases in numerical distance should Allport, 1999). They performed under two condi-
result in increases in response latencies on non- tions. In the language-compatible condition, changes
switch trials but concomitant decreases in response in color cues mapped directly onto the actual shifts
latencies on switch trials. The findings were con- in language of presentation and cued for a response in
sistent with the operation of local inhibitory ef- that language (e.g., oeillet [carnation]–FLEUR/
fects, indicative (at the conceptual level at least) sparrow–BIRD). In the language-noncompatible
of an internal locus of control at the lemma level, condition, changes in color cues instead signaled
not necessarily generalized to the entire language. for a response in the other language (e.g., oeillet–
External control was revealed clearly in the asym- FLOWER/sparrow–OISEAU [BIRD]). In both
metrical switch pattern obtained, larger when conditions, large costs of language switching were
switching to the dominant L1. observed when shifting language of response. More
To recapitulate, a switch of language is enabled importantly, however, there was no asymmetry in
through processes of inhibition, affected by the le- the switch costs. This is in sharp contrast to data
vel of proficiency in the to-be-suppressed language. obtained from nonbalanced bilinguals in the same
The inhibitory processes can operate both locally task: They showed the expected asymmetry, with
and globally, a point discussed in more detail in the larger costs when switching to the dominant L1 in
next section. The level of proficiency in a given response (Meuter, 1994).
language may differ depending on the particular The review of the bilingual Stroop data (Meuter,
task (e.g., naming versus comprehension) to be 1994) also revealed that the asymmetrical interfer-
carried out in that language. It is important, ence effect on the response language was reduced or
therefore, that any calculations aimed at deter- even absent for balanced bilinguals (Fig. 17.2,
mining relative proficiency and switch costs are lower panel). Furthermore, increased practice
carried out with reference to task-specific baselines. across 2,000 or so response trials (decreasing the
It is not only conceivable but highly probable that L1-L2 proficiency disparity on the task) signifi-
within-individual, task-specific differences exist cantly reduces the asymmetry in the language
that are reflected in differences in ease of proces- switch costs (Meuter & Allport, 1999). Recall also
sing. Such differences emerge also as a direct con- the study by Loasby (1998), in which asymmetries
sequence of experimental (contextual) demands were experimentally induced through additional
(see, for example, the practice effects on switch training on small subsets of pictures. By extension,
costs described by Meuter & Allport, 1999). one could train subsets to equal proficiency in both
languages and, it would follow, a reduced or
eliminated asymmetry would result.
Much research focusing on the representation of
The Role of Relative Proficiency concepts and words in the bilingual lexicon has
in Language Selection highlighted the extent to which its architecture and
the nature of the selection processes are molded
The pattern of asymmetrical language switch costs and driven by differences in L1-L2 proficiency (as
that is seen in nonbalanced bilinguals contradicts formulated, for example, in the Revised Hierar-
the intuitive belief that greater proficiency or flu- chical Model; Kroll & Stewart, 1994; for reviews,
ency in a language should be synonymous with see Kroll & De Groot, 1997; Kroll & Tokowicz,
greater efficiency. What is critical is not proficiency 2001). With increased proficiency, a concomitant
per se, but rather the extent to which a bilingual is increase in reliance on conceptual information, as
opposed to reliance on lexical information, has bilinguals were equally proficient in both languages
been observed, and often an asymmetrical pattern cannot be excluded.
in translation obtains, with backward translation The research reviewed thus far has covered par-
(from L2 to L1) effected faster than forward adigms that, by some, may be considered somewhat
translation (from L1 to L2). (See De Groot & Poot, articial and far removed from the language be-
1997, and La Heij, Hooglander, Kerling, & Van havior that bilinguals engage in when communicat-
der Velden, 1996, for contrasting findings.) ing in natural settings. The question invariably arises
Additional support for the importance of rela- whether the same observations hold true there.
tive proficiency comes from functional imaging Observations from code-switching data suggest, in-
studies (see also Abutalebi, Cappa, & Perani, directly, that they do (e.g., Clyne, 1980). Supporting
chapter 24, and Hull & Vaid, chapter 23, this evidence comes from a study in which nonbalanced
volume). Different cortical areas of activation have Spanish-English bilinguals related different personal
been found with low proficiency (i.e., nonbalanced) experiences, unpredictably in one or the other lan-
bilinguals (Dehaene et al., 1997; Perani et al., guage. Detailed analyses of the monologues (en-
1996). For example, using functional magnetic compassing both response latencies and word
resonance imaging, Dehaene et al. looked at counts) revealed that, even though L1 was the more
auditory story comprehension in late-acquisition proficient language overall (as evidenced by higher
(and low-proficiency) French-English bilinguals word counts), it took significantly longer to start
and found that, although L1 activation was con- speaking L1. Not surprisingly, word production in
fined to the left temporal lobe, some participants the first 5 s of speech also was markedly reduced.
showed additional activation in the left inferior This reflects the asymmetry also found in the naming
frontal gyrus and the anterior cingulate for L2. This tasks. The onset asymmetry disappeared about 10 s
possibly reflected the greater attentional demands into the story, re-establishing the normal dominance
made by the processing of a weaker L2. Similar pattern (Meuter, 2001). The bilinguals were classi-
findings were obtained by Perani et al. with low- fied further according to self-reported recent use
proficiency Italian-English bilinguals. (percentage use on day of testing) of the weaker L2.
In proficient (i.e., balanced) bilinguals, greater For the high-usage group (L2 use 70%), an unex-
overlap between cortical regions subserving pro- pected pattern emerged: The recent experience of
cessing in both languages has been observed (e.g., predominant L2 usage resulted in a dominance re-
Chee et al., 1999; Perani et al., 1998). One positron versal, such that L2 was now the stronger language.
emission tomographic study of auditory story It appears then, that relative proficiency is a
comprehension (Perani et al., 1998) compared two powerful determining factor in the ease with
groups of highly proficient bilinguals, early L2 ac- which bilinguals control and regulate their two
quirers and late (after the age of 10 years) L2 ac- languages. Moreover, the degree of proficiency can
quirers. Bilateral activation in the temporal poles be affected by recent experience, and its relativity
was observed for both L1 and L2, as well as in the applies to a language as a whole. Even training on
hippocampal structures and the lingual gyrus. In limited subsets produces generalized effects on re-
addition, again for both languages, there was left sponses (cf. Loasby, 1998). Although more local-
hemisphere activation (e.g., in the superior tem- ized effects of suppression and activation are
poral sulcus and the inferior parietal lobule). These possible (see, for example, Bajo & Green, 1999),
patterns were independent of age of L2 acquisition, the paradigms discussed here did not enable an
suggesting thatin language comprehension at evaluation of them. The conversation analysis did
leastprociency rather than age of acquisition show increasing engagement, on a switch, of the
determines the brain regions involved in the pro- dominant L1, but this may simply reect an in-
cessing of the two languages. creasingly, globally, active language system.
This nding seemingly contrasts with Kim, To uncover any localized effects, if they exist,
Relkin, Kyoung-Min, and Hirshs (1997) observa- production paradigms need to incorporate more
tions, in a silent production task, of activity in ne-tuned response latency measures. For example,
non-overlapping regions in the left hemisphere for a language-switching task based on superordinate
late bilinguals but marked overlap in Brocas area naming could build into it a measure of seman-
for early bilinguals. However, different language tic relatedness to evaluate the extent to which
skills were not evaluated, and no independent the lexico-semantic system is affected. In non-
measures of language prociency were reported. balanced bilinguals, an overall switch cost asym-
Therefore, the possibility that the early and late metry would be predicted, supporting the idea
of global suppression. The comparison of switch trials involving a semantic relationship (e.g.,
moineau [sparrow]–OISEAU/robin–BIRD), versus those that do not (e.g.,
moineau–OISEAU/church–BUILDING) might reveal greater switch costs on semantically related
switch trials. If so, this would demonstrate the operation also of local suppression, affecting
conceptually related items more. We remain caught in a dichotomy for now, when in fact the
global effect might be masking more subtle local processes, such as those observed in bilingual
word recognition (e.g., Dijkstra & Van Heuven, 1998; Van Heuven et al., 1998). What has yet to
be established is the extent to which these operate also in bilingual speech production.

There are two caveats to the foregoing discussion. First, Monsell et al. (1997; Monsell, Yeung,
& Azuma, 2000) suggested that the switch cost may not always be asymmetrical, but perhaps only
when one of the tasks is by far the dominant one. When the difference in relative dominancy
between two tasks is not great (i.e., large enough to result in asymmetrical interference
effects but too small to result in asymmetrical switch costs) the asymmetry disappears. Although
the asymmetry is observed in nonbalanced bilinguals (as a direct consequence of L1-L2
proficiency differences), it remains to be seen how differences in relative proficiency across
three languages will affect the switch costs in trilinguals. (See Costa & Santesteban, 2004, for
thought-provoking data on highly proficient early bilinguals switching to L3.)

Second, the reduction (or even disappearance) of the asymmetry in language switch cost
associated with comparable (balanced) levels of proficiency in the two languages does not imply
the disappearance of the language switch cost altogether. Typically, a measurable cost remains.
When a negligible cost is observed, it may be that task demands are such that even on
nonswitches a cost is experienced. An example of such task demands is given in the next section
(cf. Meuter & Shallice, 2001). Expectations of which language to speak, whether anticipated or
in response to some external cue (e.g., being spoken to in a particular language), also might
affect the ease with which a language is selected.

Language Selection and Cueing of Language Choice

A bilingual individual confronted with information in one or the other language cannot help but
process what is seen or heard. Anecdotal evidence suggests that bilinguals do use,
involuntarily, an inappropriate language when cued by something in the environment. They do so
also when they appear to cue themselves inadvertently, by producing a trigger word, as in the
following utterance: Ich habe viele LETTER geschre/geschrieben (literal translation: "I have
many letters written" [Dutch/German]; Clyne, 1997, p. 108). In this example, the word LETTER is
both German and Dutch. Its Dutch meaning, while also referring to something that can be written,
is confined to a letter of the alphabet and yet triggers, although swiftly corrected, a switch
into Dutch. Evidence from bilingual Stroop and picture-word interference tasks also demonstrated
that bilinguals cannot ignore the language that is irrelevant (and even a hindrance) to the task
(see MacLeod, 1991, for a review; see also Fig. 17.1), studies of negative priming have found
effects of unattended items (e.g., Fox, 1996), and it has been shown that the irrelevant
language is accessed even when in an exclusively monolingual task setting (e.g., Spivey &
Marian, 1999). The observation that bilinguals, when using LA, are affected by the other
language LB, even when it is ignored intentionally or not in active use, suggests that both
language systems are active to varying degrees (Grainger & Dijkstra, 1992; Green, 1986; Paradis,
1980), a factor also recognized in models of bilingual speech production (e.g., the Bilingual
Production Model; De Bot, 1992; De Bot & Schreuder, 1993). This raises the question of how
external language-related information is used to drive language selection.

What constitutes a valid language selection cue in a bilingual setting? In many of the studies
discussed in the previous sections, the tasks were cued in some way by a geometrical figure
(Macnamara et al., 1968), a color (e.g., Meuter & Allport, 1999), a position on a computer
screen (e.g., Rogers & Monsell, 1995), or a variable tone (Kroll & Peck, 1998, as cited in Kroll
& Dijkstra, 2002). Such cues are arbitrary and only attain meaning in the experimental context
through task instruction. For example, within the IC model (Green, 1998a), language task schemas
are formed on the basis of such cues, linking a particular figure, color, position, or tone cue
to the production of a specified language (e.g., name a picture in L1, not L2, in response to a
high tone). If presented some variable time prior to the presentation of a stimulus (such as a
picture), cues can also afford the opportunity to prepare for a response, to some extent at
least. However, such arbitrary cues are quite distinct
from the cueing validity that a word (whether spoken, heard, or read) directly affords for the
language to which it pertains (with the exception of ambiguous lexical items, such as the word
LETTER, earlier seen to trigger a switch from German to Dutch). An unambiguous cue, as provided
by a unique, language-specific word in LA, unambiguously and directly cues LA.

Two early studies demonstrated this effect in the monolingual domain. Jersild (1927) and Spector
and Biederman (1976) found that the time cost involved in changing between tasks was reduced
when cued unambiguously (e.g., when switching between giving the opposite to a written word and
subtracting 3 from a digit). In contrast, large costs were observed when switching between
adding and subtracting 3 from a digit, costs that were reduced when +3 or -3 next to each digit
signaled the operation (Spector & Biederman, 1976). Exactly how these reconfigurations are
executed and what the processes underlying the ability to switch between tasks might be has been
the subject of increasing research (Allport & Styles, 1990; Allport et al., 1994; Los, 1996;
Meiran, 1996, 2000; Meuter, 1994; Meuter & Allport, 1999; Meuter, Humphreys, & Rumiati, 2002;
Meuter & Shallice, 2001; Rogers & Monsell, 1995; Rogers, Sahakian, Hodges, Polkey, Kennard, &
Robbins, 1998; Rubenstein, Meyer, & Evans, 2001). The observed reduction in switch cost with the
presence of a cue suggests that some prior preparation can take place (Meiran, 1996, 2000). The
exact nature of the cue is important also. Greater reductions in switch costs were found on an
arithmetic switching task when cued by arithmetic symbols as opposed to color. Color cues in
turn were more effective than no cues at all (Emerson & Miyake, 2003; see also Baddeley,
Chincotta, & Adlam, 2001).

In the bilingual domain, Macnamara et al. (1968) found a reduction in language switch costs when
switches occurred in regular alternation, suggesting some preparation as a consequence of
predictability. This level of preparation likely is at a global level, preselecting the language
required on the next trial, with a cost remaining because the actual response can be selected
only on the appearance of the critical stimulus. An earlier study by Dalrymple-Alford (1967)
found that direct cueing of a language either was ineffective or else a switch of language could
not be triggered by information presented prior to the execution of a switch.

However, Meuter (1994) showed that, when bilinguals have to attend explicitly to the
language-specific aspects of the stimulus, that particular language is selected for response,
even when this is inappropriate. Recall that English-French bilinguals were instructed to label
each target word either in the language in which it was presented (the language-compatible
condition, e.g., sparrow–BIRD) or in the other language (the language-noncompatible condition,
e.g., sparrow–OISEAU). In the latter condition, each trial encompasses a within-trial switch of
language (e.g., sparrow–OISEAU/oeillet–FLOWER), and of particular interest is the pattern of
response obtained here. For nonbalanced bilinguals, response latencies on nonswitch and switch
trials in mixed lists were equally slow, resulting in a nonexistent switch cost. Furthermore,
both types of trials were markedly slower (by an average of 300 and 349 ms, respectively) than
nonswitch trials in monolingual lists. This pattern of results (i.e., no measurable switch costs
on mixed lists) contrasted with that obtained in the language-compatible condition: A switch
cost (measuring 262 ms) was obtained when responding in L1 only.

Consistent with Meiran and colleagues' (Meiran, 2000; Meiran, Chorev, & Sapir, 2000)
interpretation of the switch cost, in which one component identified is that of reconfiguration
of task set, it appears that attending to the language of presentation LA, even when the task
specifies that the response should not be given in LA, nonetheless results in its selection. LA,
thus activated, has to be suppressed to respond in the other language, LB.

Similar findings were obtained in a study comparing switch costs in bilingual picture naming and
translation (Kroll, Dietz, & Green, in preparation, as cited in Kroll & Dijkstra, 2002).
Bilinguals were asked to switch between languages in a picture-naming and a translation task. In
contrast to a picture, a to-be-translated word provides a cue to the response language (albeit
in the noncompatible sense described here) and thus may enable earlier language selection. If
this occurs, a reduction in (and perhaps even elimination of) switch costs would be expected. As
predicted, although asymmetrical switch costs were obtained on the picture-naming task, on the
translation task no significant switch costs were obtained. Although this is suggestive of the
effectiveness of the language cue provided by the stimulus, it is equally likely that in
translation, as in the naming of superordinates in the other language, the word first activates
its associated lexicon before the appropriate selection can be made from the lexical candidates
in the other language.
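As a rough illustration of the measures used in this section, the following Python sketch
computes a switch cost (mean latency on switch trials minus mean latency on nonswitch trials
within a mixed list), the slowing of mixed-list trials relative to a monolingual baseline, and
the asymmetry between switching into L1 and into L2. The function names and all latencies are
hypothetical and are not taken from any of the studies cited; the values were chosen only to show
that a zero switch cost can coexist with a large mixed-list slowing, as described for the
language-noncompatible condition.

```python
from statistics import mean


def switch_cost(switch_rts, nonswitch_rts):
    """Mean latency on switch trials minus mean latency on nonswitch trials (ms)."""
    return mean(switch_rts) - mean(nonswitch_rts)


def slowing_vs_baseline(mixed_rts, monolingual_rts):
    """How much slower mixed-list trials are than monolingual-list trials (ms)."""
    return mean(mixed_rts) - mean(monolingual_rts)


def cost_asymmetry(cost_into_l1, cost_into_l2):
    """Positive when switching into L1 is costlier than switching into L2 (ms)."""
    return cost_into_l1 - cost_into_l2


# Hypothetical latencies in milliseconds, invented for illustration only.
monolingual_nonswitch = [700, 720, 740]
mixed_nonswitch = [1010, 1020, 1030]
mixed_switch = [1000, 1020, 1040]

print(switch_cost(mixed_switch, mixed_nonswitch))
print(slowing_vs_baseline(mixed_nonswitch, monolingual_nonswitch))
print(cost_asymmetry(cost_into_l1=140, cost_into_l2=60))
```

In this invented data set the switch cost is absent because nonswitch trials in the mixed list
are already as slow as switch trials, which is the pattern the text attributes to a within-trial
switch of language.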
Each translation, then, represents an inherent switch between the language of presentation and
the language of response. It may be this aspect of the task that makes nonswitch and switch
trials appear equally fast. If this is true, it follows that, on an Italian-to-English
translation task, for example, increased response latencies would occur on trials in which the
critical word is a homographic noncognate (e.g., estate in Italian, meaning "summer") compared
to trials in which a language-specific word is to be translated (e.g., inverno in Italian,
meaning "winter").

There is, of course, something curious about both these tasks: that is, superordinate naming in
the other language and translation. In each task, the language of the stimulus, although a valid
task cue, is not the optimal language cue, and the bilingual has to attend explicitly to the
language of input (thus generating some internal conflict) to select the appropriate response
language. It follows that it should be possible to cue a bilingual more efficiently, in a
language-switching setting, by using language-specific cues that correctly and unambiguously cue
the associated response language. Also, consistent with the results just described, an
incongruent cue would signal (and perhaps initiate preparation for) a response in the
inappropriate language, thus resulting in a cost.

Two studies explored the effect of external cues and their validity on the bilingual's response
efficiency. The first study assumed that the identity of the interlocutor would be a powerful
cue to language selection, and merely attending to this feature might affect language selection
(Meuter & Powell, 1997). German-English bilinguals made response language decisions (i.e., "in
which language would you address this person?") about well-known (German- or English-speaking)
individuals whose pictures were interleaved with a bilingual numeral-naming task. Through its
associated language, each picture cued the response language required on the next trial
unpredictably, either congruently or incongruently. For example, a picture of Freud (German
speaker) congruently cued a subsequent trial color cued for response in German. Contrary to
predictions, a congruent cue to a switch of language did not reduce the switch cost, and an
incongruent cue in a nonswitch context did not result in a significantly increased cost.
However, the relevance of the picture to the task (also in terms of its predictive validity)
likely was not sufficient.

Accordingly, Meuter and Leisser (2002) incorporated words with language-specific orthography
(either English or Spanish) into the basic numeral-naming task because these words would
activate directly the associated lexicon. Monolingual and mixed blocks were used, the latter
containing both color-cued switches and nonswitches of response language. The numerals were
interleaved with neutral letter strings, as well as words for which the language identity either
did or did not match the response language required on the next trial (congruent and incongruent
cues to the language of response, respectively). Now, incongruent cues did increase responses on
nonswitch trials, suggesting some preparation had occurred by way of general activation of the
other language. No reduction in switch costs was observed when a switch of language was cued
congruently. The cues only appeared for 100 ms, perhaps not sufficient to achieve either
complete disengagement from the preceding language set or efficient use of the cue in
preparation for a response.

Meuter and Shallice (2001, Experiment 2) observed that, with longer, predictive (color) cue
presentations (up to 1000 ms), switch costs did decrease significantly and more so for the
weaker L2. However, although a valid cue may facilitate the preparation of a response, the
response proper cannot be configured until the stimulus is presented (see also Allport et al.,
1994; Los, 1996; Meiran, 2000). Significant reductions in switch costs were found when naming
Arabic versus Chinese numerals (Meuter & Tan, 2003). Here, the stimuli unambiguously specified
the response language, thus facilitating a switch of language.

The differing effects of cues on nonswitch and switch responses suggest that there may be some
fundamental differences in the reconfiguration of language sets depending on whether a switch of
response language is required. Support for this notion was found in a study that manipulated
both color cue length and switch ratio. Superimposed on the primary numeral-naming task was a
secondary vigilance task requiring the suppression of certain (prespecified) responses (cf.
Robertson, Manly, Andrade, Baddeley, & Yiend, 1997). In one experiment, Italian-English
bilinguals rapidly named numerals in either language, as cued by color, with one important
exception. All responses to the numeral 3 (three) were to be suppressed. (See Fig. 17.3 for
examples of trial and response sequences.) Not only did the additional task increase the
attentional load, but the suppression of a specific response, in a given language, also allowed
the precise evaluation of the effect this had on a subsequent response. Suppression trials
occurred both
Figure 17.3 Examples of two possible response sequences for the simultaneous appearance of color cue
and numeral. (Adapted from Meuter & Shallice, 2001.) (a) A sequence of trials in which the
to-be-suppressed numeral 3 (cued for response in English) appears on a nonswitch trial. The subsequent
trial is cued for response in the same language (English) and is therefore classified as a nonswitch
trial following a suppressed response on a nonswitch. (b) A sequence of trials in which the
to-be-suppressed numeral appears on a switch of language (from Italian to English). Accordingly, this
type of trial is referred to as a suppressed switch trial. The subsequent trial is cued for response in
the same language (English) and is therefore classified as a nonswitch trial following a suppressed
switch.
on nonswitches (e.g., an L1 response trial was followed by a suppression trial also cued for
response in L1; see Fig. 17.3[a]) and switches of response language (e.g., an L1 response was
followed by a suppression trial cued for response in L2; see Fig. 17.3[b]). Of primary interest
were those trials immediately following a suppression trial. On both nonswitch and switch trials
following a suppressed response, increases in response latencies were found amounting to costs
virtually identical to those observed on regular switch trials (see Fig. 17.4). Most striking
was the cost observed on nonswitches following a suppressed response on a switch of language.

An example of such an event is seen in Fig. 17.3(b). Here, the numeral 3 is cued for a switch in
response language from the preceding trial (from Italian to English). However, the numeral
itself signals that its response is to be withheld, thus resulting in a suppressed switch trial
(nota bene, only on appearance of the target stimulus could this decision be made). The
subsequent trial is cued for response in the same language as just suppressed, effectively a
nonswitch trial. The cost incurred on this trial significantly exceeded any other observed cost
(see Fig. 17.4; cf. Meuter & Shallice, 2001).

Two dramatic findings emerged from the incorporation of a vigilance component. First, the
striking observation of greater cost incurred after a suppressed switch than measured on an
actual switch of language suggests that there is something unique about the configuration of a
response set associated
Figure 17.4 Mean switch cost (in milliseconds) for both response languages, for the following trials: (1)
regular switches (SW); (2) nonswitches following a suppressed nonswitch (NSW after supp NSW); (3)
switches following a suppressed nonswitch (SW after supp NSW); and (4) nonswitches following a sup-
pressed switch (NSW after supp SW). Adapted from Meuter and Shallice (2001).
with a language switch. It appears that the (color) cue is effective in partially reconfiguring
the language set, to the extent that the other language is now activated, and some preparatory
groundwork has been laid for response (see also Meuter & Shallice, 2001, Experiment 2). This is
consistent with Meiran's (2000) postulation that one part of the observed cost in task switching
is reflected in the preparation component, effectively consisting of the opportunity to make use
of a cue.

Second, the observation that even a nonswitch trial can incur a switchlike cost suggests that
the suppression of a response in the same language effectively suppresses that language system.
However, these costs do not obtain in a predominantly monolingual context, suggesting that it is
the perceived need to use both languages that gives rise to inhibitory processes (Meuter &
Shallice, 2001).

Now, two notions need to be reconciled. On the one hand, in a switch context, cueing appears to
aid preparation through the global activation of the relevant language system, and response
suppression results in the global inhibition of the associated language system. On the other
hand, when there is no perceived need to use both languages, the suppression of a response in LA
does not entail the suppression of the entire language system. If global inhibition did occur,
then, irrespective of the bilingual's expectations regarding language use, the suppression of a
response in LA should result in the global suppression of LA.

The critical point here is the type of requirements imposed by the experimental (and by
extension the conversational) setting. When in an almost exclusively monolingual setting
(strikingly, even when this situation is confined to a block of trials embedded in a bilingual
experimental setting), the other language does not appear to affect any selection process. The
default value is set to generalized activation of the predominantly required language for that
block only. Effective strategic behavior on the task requires, on suppression of a response,
only minimal suppression of the associated language. When both languages need to be managed, the
expectation that a switch of language is likely may give rise to global inhibitory processes.

The reconfiguration of a language switch merits further exploration. Why is the suppression of a
response on a switch of language more effortful than that of any other response? What are the
component processes executed before the critical stimulus (including the language signal)
appears? It would seem that the switch is reconfigured, but preventing its execution, apart from
suppressing a response, results in the inhibition of not only the de-selected language but also
the response language required on the switch.

Another suggestion put forward is that switching between tasks may involve the operation of
inner speech directing the individual by activating the required task instructions. Although
inner speech is not an executive control process but one instead associated with the rehearsal
and short-term maintenance of phonological information (cf. Baddeley, 1986), some findings in
the monolingual task-switching domain suggest that articulatory
suppression (e.g., repeating a word over and over while carrying out a language-based task)
increases the cost associated with task switching (Baddeley et al., 2001; Emerson & Miyake,
2003). The overt articulation of the task instruction, on the other hand, reduces the switch
cost, suggesting that a self-generated reminder serves as an effective cue to the upcoming task
(Goschke, 2000).

It is conceivable that, at least for the tasks described thus far, for which arbitrary cues
signal the language of response, bilinguals make use of inner speech. If so, concurrent
articulatory suppression should increase the cost of switching between languages. However, if
the stimuli themselves endogenously cue the response language, inner speech may have no role in
facilitating a language switch, and articulatory suppression should not have any effect.

Maintaining the Language of Choice

The selection of a language, whether intentional or inadvertent (as cued by context), is only
one aspect of the speech act. What is critical also is the role that monitoring plays, not only
in the bilingual's ability to switch effectively between languages but also in the ability to
maintain a language for production once it is selected. It has been observed, for example, that
on occasion bilinguals may not immediately comprehend what is said to them if they were not
expecting to hear the language spoken to them (Taylor, 1976). This could be because there may be
a level of monitoring of input required (cf. Macnamara's, 1967a, "input switch"; Obler and
Albert's notion of monitoring, 1978) that does not always function efficiently. Alternatively, a
generalized higher level of activation may be associated with one language as compared to the
other in the bilingual's lexicon (e.g., Grainger & Dijkstra, 1992; Green, 1986, 1993),
preventing immediate recognition. Monitoring might be driven by the Supervisory Attentional
System (Green, 1986, 1998a; Norman & Shallice, 1986), but could be operationalized also as the
outcome of continual and updated computations following changes in relative activation within an
interactive activation network.

From the occurrence of inadvertent code switches (e.g., Clyne, 1997; Shanon, 1991), levels of
control can be inferred that normally operate to prevent other language intrusions. It is when
errors of language choice occur that a person becomes most aware of the control exerted on a
continual basis and the need to monitor the appropriateness of speech output. Unfortunately, the
errors are typically too few to be informative, and yet they often provide revealing insights
into the functioning of the intact cognitive system.

There are two approaches that have proven useful in accessing error data. One approach involves
the study of individuals whose ability to control their everyday behavior has been compromised
through neurological damage. This results in a number of errors, of which errors of task
selection are most informative. With respect to language switching, those patients who have
incurred damage in the frontal lobes are of particular interest. Another approach involves the
manipulation of the experimental situation such that higher error rates are elicited (e.g.,
increasing task difficulty by increasing the attentional load; Meuter & Shallice, 2001). I
discuss these two approaches in turn.

Although it is highly unlikely that there exists an isolable on-off language switch mechanism of
the type proposed by Penfield and Roberts (1959), the frontal lobes and associated areas could
form the possible underlying physiological basis for the bilingual's ability to switch between
languages. This suggestion is supported by evidence from various lines of research indicating
that the ability to control one's behavior, such as switching tasks successfully, depends on the
integrity of the frontal lobes (e.g., Perret, 1974; Sandston & Albert, 1987; Shallice, 1988).
Typically, patients with frontal lobe damage exhibit great difficulty in switching between
different categories or tasks (e.g., Milner, 1963) and often show a high number of perseverative
responses. On other tasks, such as the Stroop task, their performance shows a greater-than-usual
inability to suppress unwanted information (Perret, 1974; see also Burgess & Shallice, 1996). If
switching between languages is considered another instance of switching between task sets, then
it is likely that, given the role of the frontal lobes, language switching also may be subserved
by this area. Rogers et al. (1998) suggested that it is specifically the left frontal lobe that
is involved in the dynamic reconfiguration of established task sets. Accordingly, a bilingual
individual with a frontal lobe injury might experience an inordinate amount of difficulty in
switching between the two languages just as would occur with any other form of task switching.

Such a patient was tested recently (Meuter et al., 2002). An Urdu-English bilingual patient with
frontal lobe damage, F. K., showed similar response
latency patterns to neurologically intact bilinguals (controls) when naming numerals in Urdu and
English, with larger costs when switching into the dominant L1. However, unlike the controls, F.
K. made numerous errors on this task. In particular, F. K. had great difficulty, once a switch
to the weaker L2 was made, to maintain L2 as his response language. He often fell back into an
L1 response mode, even after a number of successful, consecutive responses in L2. Also,
successful switches into the weaker L2, while reliably faster than switches into the dominant
L1, were highly error-prone and frequently resulted in erroneous L1 responses. In other words,
F. K. exhibited a strong tendency to perseverate with responses in the dominant L1, even when
cued for responses in the weaker L2. The frontal lobe damage and impairment of control processes
of F. K. resulted in an inability to modulate inhibitory resources and thus regulate and monitor
his language behavior when both languages were required. In particular, F. K.'s monitoring
deficiencies targeted his ability both to inhibit his stronger L1 sufficiently and to maintain
this inhibition over time.

Another bilingual patient with frontal lobe damage was described whose language mixing behavior,
often inappropriate and uncontrollable even when confronted with clear external cues, appeared
primarily caused by left frontal lobe damage (Fabbro, Skrap, & Aglioti, 2000). No aphasic
symptoms were observed in either language. In each language setting (either Friuli or Italian),
the patient would produce at least 40% of his spontaneous utterances in the inappropriate
language and lapsed marginally more often into L1 when L2 was required than vice versa, a
pattern consistent with the type of monitoring difficulties exhibited by F. K.

Imaging techniques have provided further support for frontal lobe involvement. For example,
using functional magnetic resonance imaging, increased activation in the dorsolateral prefrontal
cortex was found in Spanish-English bilinguals when switching between languages in a
picture-naming task (Hernandez, Dapretto, Mazziotta, & Bookheimer, 2001; Hernandez, Martinez, &
Kohnert, 2000). However, Price, Green, and von Studnitz (1999), in a positron emission
tomographic study comparing translation and language switching (operationalized as silent
reading of alternately presented words in either L1 or L2), found increased activation in
Broca's area as well as bilaterally in the supramarginal gyri but no such increase in the
frontal lobes. The lack of frontal lobe involvement may be simply a consequence of task demands
because switches were clearly signaled by a change in language of input.

Other evidence suggests that, although the frontal lobes may be involved in holding the known
task set rules in mind, the remapping process may well take place elsewhere, in the parietal
regions (Meuter, Jackson, Roberts, & Jackson, 1999). Event-related potential responses were
measured bilaterally, and although a frontal component was found, it did not distinguish between
nonswitch and switch trials. In contrast, such discrimination was observed across the parietal
midline. Consistent with the monitoring difficulty found in F. K., this pattern provides further
support for frontal lobe involvement in task maintenance, while suggesting that other aspects of
language switching (such as remapping color cues to switch requirements) might be driven by the
parietal lobes.

The error pattern shown by F. K. suggests that the monitoring of language behavior is critical.
Might the role of monitoring be studied also in bilingual individuals without brain damage?
Meuter and Shallice's (2001) study attempted to do so by adding a vigilance component to the
normal bilingual numeral-naming task. In a series of experiments, the ratio of switch to
nonswitch trials in a sequence as well as the ratio of regular to suppressed trials were
manipulated (see Fig. 17.3). Increased response latencies occurred with an increase in both the
incidence of switch and of suppression trials. Not surprisingly, the added vigilance component
increased the error rate substantially. More importantly, more errors were made when L1 was the
required response language, both on nonswitch and switch responses.

At first glance, this error pattern appears to contradict that observed in patient F. K., who
made many more errors when L2, not L1, was the required response language. However, the
juxtaposition of these two patterns, increased errors in neurologically intact bilinguals when
required to speak L1 versus increased errors in a frontal lobe patient when required to speak
L2, quickly resolves the contradiction. Taken together, the error patterns suggest strongly that
in normal bilinguals the default setting, especially when in a bilingual context, is to inhibit
the stronger language (L1) more to allow greater efficiency in L2. That this ability to monitor
L1 versus L2 demands in a bilingual context is essential was demonstrated by its evident
impairment in the bilingual patient F. K. with frontal lobe damage as well as by Fabbro et al.'s
(2000) bilingual Friuli-Italian patient.
To recapitulate, findings from both neurologically damaged and intact bilinguals support the idea that the ability to switch between languages appropriately and then to maintain this selection for as long as the situation demands requires intact monitoring skills. The error patterns also indicate that relative language proficiency plays an important role in the monitoring process: The stronger, dominant language is more difficult both to inhibit and to monitor.

Summary

This chapter provides an overview of the current understanding of the processes that underlie the bilingual's ability to switch (whether consciously or inadvertently) language in production and highlights some of the factors that affect switching behavior. Language selection is determined by a number of factors, including relative proficiency, contextual cues, and monitoring ability. Although greater proficiency in a language generally is associated with better performance, the opposite is true when a switch of language is made: Switching from the weaker L2 to the dominant L1 is more demanding than switching in the opposite direction. The comparatively greater cost experienced when selecting the dominant language after having just spoken the weaker L2 is one that is carried by the first response only. In conversation, the cost is measurable in terms of time taken to initiate discourse, but superior fluency in L1 quickly reestablishes itself (Meuter, 2001). The asymmetry observed in the switch between languages, larger on a switch to the dominant language, can be accounted for by language set inertia (Meuter & Allport, 1999). Alternatively, inhibition within the system, arising from the need both to select a different language task schema and to inhibit active nontarget lemmas, might account for the asymmetry (Green, 1998a).

It is clear from both patient and normal data that language can be cued exogenously. However, the extent to which cueing is effective remains to be determined. In an experimental setting at least, language cues that are subtle and not explicitly related to task demands do not reliably support language selection, but explicit and unambiguous cues do. When subtle cues do afford some preparation, they appear to do so via a process of global activation. The usefulness of cues is most keenly felt when speaking L2: With increased preparation time, responses are more efficient. The stronger language does not benefit in the same way from advance warning (Meuter & Shallice, 2001). The notion of global activation is consistent with the evident need for response monitoring and maintenance of the response language. When speaking L2, perhaps more so in an L2 setting, there is a continual need to suppress the dominant language. Anecdotal evidence suggests that, even in highly proficient bilinguals, this level of control operates. Inadvertent (and inappropriate) slips of language occur under conditions of stress or tiredness.

Last, to facilitate the language selection process, the bilingual has recourse to some useful strategies. One default strategy identified is to inhibit the stronger L1 more, particularly when in a situation that demands the use of both languages. In practice, this strategy has the advantage of increasing the availability of the weaker L2 and thus facilitating its use. At the same time, the bilingual perceives proficiency in the dominant L1 as only marginally compromised, if at all.

The picture is by no means complete, and a number of issues demand further exploration. For example, the relative contribution of local versus global processes needs to be analyzed further. Carefully planned experiments incorporating orthogonal comparisons of practice/relative proficiency, numerical/semantic distance, and language context and/or task demands will allow the evaluation of the contribution of each of these factors as well as their interaction. Also, many bilinguals often have at least a working knowledge of one or more additional languages. Incorporating a third (or even fourth) language into the equation would provide one further means of testing and expanding the validity of the conclusions drawn in this chapter and raise new questions.

First, does the need to monitor and use three or more languages require the same de-selection processes? Some discoveries in the monolingual domain are of relevance here (Arbuthnott & Frank, 2000; Mayr & Keele, 2000; but see also Emerson & Miyake, 2003). Specifically, Task A was easier to select when it was suppressed a few trials before (when switching between three tasks, e.g., C B A) as opposed to selecting Task A when it was suppressed more recently (when alternating between two tasks: e.g., A B A). By extension, is it easier also to speak LA when both LB and LC are the languages used most recently? The answer to this question will depend, in part, on the relative proficiency in the three (or more) languages. (See Meuter & Binder, 2004, for preliminary data; Costa & Santesteban, 2004.)
Second, how are the selection processes affected by the need to control and manipulate more than two languages at different levels of proficiency?

Third, do cues generalize, or are they language specific? Anecdotal evidence suggests that, on occasion, a cue to switch languages (from LA to LB) may indeed, correctly, result in a switch of language, but one that, incorrectly, is made to yet another language, LC. Effective language selection, when selecting from multiple languages, is likely to impose greater processing demands and require more sophisticated monitoring ability, as well as more efficient strategies, to ensure smooth discourse when switching between languages.

Note

1. Although the input switch was described as automatic, this was not synonymous with instantaneous operation: A cost was thought to be associated with a switch both at input and output. The crucial distinction between the selection processes (switches) was based on whether willed purposeful action was required independent of any associated cost (Macnamara & Kushnir, 1971).

References

Albert, M. L., & Obler, L. K. (1978). The bilingual brain: Neuropsychological and neurolinguistic aspects of bilingualism. New York: Academic Press.
Allport, A., & Styles, E. A. (1990). Multiple executive functions, multiple resources? Experiments in shifting attentional control of tasks. Unpublished manuscript, Oxford University, Oxford, U.K.
Allport, A., Styles, E. A., & Hsieh, S. (1994). Shifting intentional set: Exploring the dynamic control of tasks. In C. Umilta & M. Moscovitch (Eds.), Attention and performance 15: Conscious and nonconscious information processing (pp. 421–452). Hillsdale, NJ: Erlbaum.
Arbuthnott, K., & Frank, J. (2000). Executive control in set switching: Residual switch costs and task set inhibition. Canadian Journal of Experimental Psychology, 54, 33–41.
Baddeley, A. D. (1986). Working memory. Oxford, U.K.: Oxford University Press.
Baddeley, A. D., Chincotta, D., & Adlam, A. (2001). Working memory and the control of action: Evidence from task switching. Journal of Experimental Psychology: General, 130, 641–657.
Bajo, A., & Green, D. (1999, April). Language switching and symbolic distance effects. Paper presented at the Second International Symposium of Bilingualism, Newcastle-upon-Tyne, U.K.
Bosch, L., & Sebastian-Galles, N. (1997). Native-language recognition abilities in 4-month old infants from monolingual and bilingual environments. Cognition, 65, 33–69.
Bowers, J. S., Mimouni, Z., & Arguin, M. (2000). Orthography plays a critical role in cognate priming: Evidence from French/English and Arabic/French cognates. Memory and Cognition, 28, 1289–1296.
Burgess, P. W., & Shallice, T. (1996). Response suppression, initiation and strategy use following frontal lobe lesions. Neuropsychologia, 34, 263–273.
Caramazza, A., Yeni-Komshian, G., & Zurif, E. B. (1974). Bilingual switching: The phonological level. Canadian Journal of Psychology, 28, 310–318.
Chee, M. W. L., Caplan, D., Soon, C. S., Sriram, N., Tan, E. W. L., Thiel, T., et al. (1999). Processing of visually presented sentences in Mandarin and English studied with fMRI. Neuron, 23, 127–137.
Clyne, M. G. (1980). Triggering and language processing. Canadian Journal of Psychology, 34, 400–406.
Clyne, M. (1997). Some of the things trilinguals do. International Journal of Bilingualism, 1, 95–116.
Cooper, R., & Shallice, T. (2001). Contention scheduling and the control of routine activities. Cognitive Neuropsychology, 17, 297–338.
Costa, A., & Santesteban, M. (2004). Lexical access in bilingual speech production: Evidence from language switching in highly proficient bilinguals and L2 learners. Journal of Memory and Language, 50, 491–511.
Dalrymple-Alford, E. C. (1967). Prestimulus language cueing and speed of identifying Arab and English words. Psychological Reports, 21, 27–28.
De Bot, K. (1992). A bilingual production model: Levelt's speaking model adapted. Applied Linguistics, 13, 1–24.
De Bot, K., & Schreuder, R. (1993). Word production and the bilingual lexicon. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 191–214). Amsterdam: Benjamins.
De Groot, A. M. B. (1993). Word-type effects in bilingual processing tasks: Support for a mixed-representational system. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 27–51). Amsterdam: Benjamins.
De Groot, A. M. B., & Nas, G. L. J. (1991). Lexical representation of cognates and noncognates in
Jensen, A. R., & Rohwer, W. D. (1966). The Stroop color-word test: A review. Acta Psychologica, 25, 36–93.
Jersild, A. T. (1927). Mental set and shift. Archives of Psychology, 89.
Kim, K. H. S., Relkin, N. R., Kyoung-Min, L., & Hirsch, J. (1997). Distinct cortical areas associated with native and second languages. Nature, 388, 171–174.
Kirsner, K., Smith, M. C., Lockhart, R. S., King, M. L., & Jain, M. (1984). The bilingual lexicon: Language specific units in an integrated network. Journal of Verbal Learning and Verbal Behavior, 23, 519–539.
Kiyak, H. A. (1982). Interlingual interference in naming color words. Journal of Cross Cultural Psychology, 13, 125–135.
Kolers, P. A. (1966). Reading and talking bilingually. American Journal of Psychology, 79, 357–376.
Kolers, P. A. (1968). Bilingualism and information processing. Scientific American, 218, 78–89.
Kroll, J. F., & De Groot, A. M. B. (1997). Lexical and conceptual memory in the bilingual: Mapping form to meaning in two languages. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 169–199). Mahwah, NJ: Erlbaum.
Kroll, J. F., & Dijkstra, T. (2002). The bilingual lexicon. In R. B. Kaplan (Ed.), Handbook of applied linguistics (pp. 301–324). Oxford, U.K.: Oxford University Press.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Kroll, J. F., & Tokowicz, N. (2001). The development of conceptual representation for words in a second language. In J. L. Nicol (Ed.), One mind, two languages: Bilingual language processing (pp. 49–71). Cambridge, MA: Blackwell.
La Heij, W., Hooglander, A., Kerling, R., & Van der Velden, E. (1996). Nonverbal context effects in forward and backward word translation: Evidence for concept mediation. Journal of Memory and Language, 35, 648–665.
Lee, W. L., Wee, G. C., Tzeng, O. J., & Hung, D. L. (1992). A study of interlingual and intralingual Stroop effect in three different scripts: Logographic, syllabary, and alphabet. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 427–442). Amsterdam: North-Holland/Elsevier Science.
Li, P., & Farkas, I. (2002). A self-organizing connectionist model of bilingual processing. In R. Heredia & J. Altarriba (Eds.), Bilingual sentence processing (pp. 59–85). Amsterdam: North-Holland/Elsevier Science.
Loasby, H. A. (1998). A study of the effects of language switching and priming in a picture naming task. Unpublished manuscript, University of Oxford, Oxford, U.K.
Los, S. (1996). On the origin of mixing costs: Exploring information processing in pure and mixed blocks of trials. Acta Psychologica, 94, 145–188.
Macnamara, J. (1967a). The bilingual's linguistic performance: A psychological overview. Journal of Social Issues, 23, 59–77.
Macnamara, J. (1967b). The linguistic independence of bilinguals. Journal of Verbal Learning and Verbal Behavior, 6, 729–736.
Macnamara, J., Krauthammer, M., & Bolgar, M. (1968). Language switching in bilinguals as a function of stimulus and response uncertainty. Journal of Experimental Psychology, 78, 208–215.
Macnamara, J., & Kushnir, S. L. (1971). Linguistic independence of bilinguals: The input switch. Journal of Verbal Learning and Verbal Behavior, 10, 480–487.
MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163–203.
Mayr, U., & Keele, S. W. (2000). Changing internal constraints on action: The role of backward inhibition. Journal of Experimental Psychology: General, 129, 4–26.
Meiran, N. (1996). Reconfiguration of processing mode prior to task performance. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1–20.
Meiran, N. (2000). Modelling cognitive control in task-switching. Psychological Research, 63, 234–249.
Meiran, N., Chorev, Z., & Sapir, A. (2000). Component processes in task switching. Cognitive Psychology, 41, 211–253.
Meuter, R. F. I. (1994). Language switching in naming tasks. Unpublished doctoral dissertation, University of Oxford, Oxford, U.K.
Meuter, R. F. I. (2001, April). Switch costs in bilingual discourse: An exploration of relativity in language proficiency. Poster session presented at the Third International Symposium on Bilingualism, University of the West of England, Bristol, U.K.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical costs in language selection. Journal of Memory and Language, 40, 25–40.
Meuter, R. F. I., & Binder, P. (2004, May). Language selection in trilingual speakers: L'embarras du choix. Paper presented
Rogers, R. D., & Monsell, S. (1995). The cost of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231.
Rogers, R. D., Sahakian, B. J., Hodges, J. R., Polkey, C. E., Kennard, C., & Robbins, T. W. (1998). Dissociating executive mechanisms of task control following frontal lobe damage and Parkinson's disease. Brain, 121, 815–842.
Rubenstein, J. S., Meyer, D. E., & Evans, J. E. (2001). Executive control of cognitive processes in task switching. Journal of Experimental Psychology: Human Perception and Performance, 27, 763–797.
Sandston, J., & Albert, M. L. (1987). Perseveration in behavioral neurology. Neurology, 37, 1736–1741.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge, U.K.: Cambridge University Press.
Shallice, T., & Burgess, P. (1996). The domain of supervisory processes and temporal organization of behavior. Philosophical Transactions of the Royal Society London B, 351, 1405–1412.
Shanon, B. (1991). Faulty language selection in polyglots. Language and Cognitive Processes, 6, 339–350.
Smith, M. C. (1997). How do bilinguals access lexical information? In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 145–168). Mahwah, NJ: Erlbaum.
Spector, A., & Biederman, I. (1976). Mental set and mental shift revisited. American Journal of Psychology, 89, 669–679.
Spivey, M. J., & Marian, V. (1999). Cross talk between native and second languages: Partial activation of an irrelevant lexicon. Psychological Science, 10, 281–284.
Stroop, J. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662.
Taylor, I. (1976). Introduction to psycholinguistics. New York: Holt, Rinehart & Winston.
Van Heuven, W. J. B., Dijkstra, A., & Grainger, J. (1998). Orthographic neighborhood effects in bilingual word recognition. Journal of Memory and Language, 39, 458–483.
Norman Segalowitz
Jan Hulstijn
18
Automaticity in Bilingualism
and Second Language Learning
ABSTRACT In this chapter, we examine automaticity in light of the role it might play in second language acquisition and in bilingual functioning. We review various theoretical and operational definitions of automaticity, considering their respective strengths, limitations, and challenges they present to researchers studying automaticity in the context of bilingualism. Studies are reviewed regarding automaticity in grammar acquisition and in lexical access and the connection between automaticity and attention in second language acquisition. The implications of automaticity for second language instruction are also discussed. It is argued that automaticity needs to be carefully defined operationally and always viewed in the larger context of how the control system operates in the acquisition and performance of complex skills.
it. It is in this sense that letter recognition is said to be automatic. In contrast, the recognition of a letter in the Hebrew alphabet by an L1 speaker of English who is only a novice reader of Hebrew might require considerable consciously directed effort, applied slowly over an interval much longer than it takes that same person to recognize a letter of the English alphabet. Thus, the relatively rapid, effortless, and ballistic (unstoppable) activities underlying fluent letter recognition are said to be automatic, standing in contrast to slower, effortful activities that can be interrupted or influenced by other ongoing internal processes (e.g., distractions, competing thoughts).

The characteristics of automaticity mentioned (its rapidity, effortlessness, unconscious nature, and ballistic nature) have each been separately operationalized in various ways in experimental research; some examples are reviewed next. In thinking about these examples, it is important to keep in mind that, in principle, these characteristics do not necessarily always have to bundle together (Bargh, 1992; Neumann, 1984; N. S. Segalowitz, 2003; Tzelgov, 1999). For example, Paap and Ogden (1981) presented evidence showing that fluent letter recognition may be automatic in the sense of being obligatory but nevertheless can consume resources. This illustrates one way in which automaticity does not refer to a unitary construct. It would be an error, therefore, to assume without first doing the requisite empirical research that extensive practice leading to expertise will unfailingly result in performance that has all the characteristics typically associated with automaticity.

This distinction between automatic and attention-based processing pervades the cognitive psychological literature on skill acquisition (Ackerman, 1988, 1989; Anderson, 1983; Anderson & Lebiere, 1998; LaBerge & Samuels, 1974; Levelt, 1989; Logan, 1988; Proctor & Dutta, 1995) and is central to many treatments of L2 acquisition (DeKeyser, 2001; N. C. Ellis, 2002; Hulstijn & Hulstijn, 1984; Johnson, 1996; McLaughlin & Heredia, 1996; McLaughlin, Rossman, & McLeod, 1983; and N. S. Segalowitz, 1997, 2003). As will become evident from discussion in this chapter, the idea of automaticity is itself evolving, especially as researchers devise different ways to operationalize what they mean by it.

Broadly speaking, two general theoretical approaches have been followed in attempts to understand the place of automatization during skill development. One approach is typified by Anderson's ACT* (ACT-Star: Adaptive Control of Thought) model of skill acquisition (Anderson, 1983; Anderson & Lebiere, 1998). This approach holds that, in the early phases of skill acquisition, performance largely relies on mechanisms that are under conscious control, often involving declarative knowledge (Anderson, 1983). As the learner gains practice, sequenced components of the new skill that are repeated become routinized or chunked, rendering them very fast and efficient and unavailable to conscious awareness. The declarative knowledge is said to become proceduralized, and the change is sometimes compared by analogy to the compilation of a computer subroutine that involves converting instructions encoded in a high-level interpreted language into lower-level machine language.

An alternative approach is Logan's (1988) instance theory of automatic processing. Logan proposed that initially performance of a to-be-mastered skill is based on a set of algorithms for executing the desired action. Each time the rule is carried out, there is a new memory trace formed corresponding to the action executed. On subsequent occasions, there is a race between an algorithmic process that constructs the appropriate response and a retrieval process that searches memory for the information needed to perform the action. With increasing practice, more and more representations of the response are stored in memory, so eventually retrieval is accomplished faster than is execution of the algorithm. Logan's theory thus holds that automatization in skill acquisition involves a shift from rule-based to memory-based performance. Logan's theory is able to account very well for the power law (Newell & Rosenbloom, 1981) property of skilled performance, which refers to the frequent observation that response latency decreases as a function of the number of practice instances raised to some power (Logan, 1992).

Theoretical Perspectives and Empirical Studies on Automaticity in Bilingualism

Empirical studies addressing questions about automaticity and bilingualism can be viewed from various perspectives. We review studies that examined (a) automaticity as a characteristic of proficiency, (b) automaticity as a factor in grammar rule acquisition, (c) the relation between automaticity and attention, and (d) bilingualism as a testing ground for learning more about automaticity.
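As a rough illustration of the power-law property of skilled performance mentioned above in connection with Logan's theory, the following Python sketch assumes the simplest form of the relation, RT = b * N^(-c), where N is the number of practice instances. The function names, the starting latency b (1,000 ms), and the exponent c (0.5) are hypothetical values chosen only for illustration; they are not estimates reported by Logan (1988, 1992) or Newell and Rosenbloom (1981). Under this assumption, log latency is a linear function of log practice, so the exponent can be recovered as the slope of a log-log regression.

```python
import math


def power_law_latency(n_instances, b=1000.0, c=0.5):
    """Latency (ms) after n practice instances under RT = b * N**(-c).

    b and c are illustrative assumptions, not empirical estimates.
    """
    return b * n_instances ** (-c)


def log_log_slope(ns, latencies):
    """Least-squares slope of log(latency) regressed on log(practice)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(rt) for rt in latencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical practice schedule: each step doubles the number of instances.
practice = [1, 2, 4, 8, 16, 32, 64]
latencies = [power_law_latency(n) for n in practice]
slope = log_log_slope(practice, latencies)

for n, rt in zip(practice, latencies):
    print(f"{n:>3} instances: {rt:7.1f} ms")
print(f"estimated exponent: {-slope:.3f}")
```

With these assumed values, each doubling of practice multiplies latency by the same factor (about 0.71), so the absolute gains are large early in practice and shrink later, which is the diminishing-returns pattern that a power function describes.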
the upcoming target. The target was either a word naming an exemplar from the prime category (e.g., APPLE), an exemplar from another category (e.g., TABLE), or a nonword. The subject had to judge the word/nonword status (lexical decision) of the target. In some conditions, the participants were trained to expect a prime word like FRUIT to be followed by a semantically unrelated target word such as TABLE.

Like Neely, Favreau and Segalowitz (1983) found that, once participants were suitably trained, they showed appropriate facilitation and inhibition effects in L1. For example, with a long interval (1,150 ms) between prime and target, a prime like FRUIT facilitated lexical decision to an expected but semantically unrelated target like TABLE or CHAIR, relative to the neutral prime condition. In contrast, a prime like FRUIT inhibited responses to an unexpected yet semantically related target like APPLE that was occasionally presented on surprise trials. On the other hand, when the prime–target interval was short (200 ms), lexical decision on these surprise trials was facilitated, indicating that the subject could not suppress the activation of semantically related concepts (APPLE, ORANGE, BANANA, etc., by FRUIT) even though instructions and training indicated that such targets were not predicted by the prime. In this way, the experiment demonstrated the ballistic nature of word meaning activation.

Favreau and Segalowitz (1983) found that in L2 only the stronger group showed this form of automaticity. Interestingly, the evidence also indicated that the weaker bilinguals did not process stimuli more slowly but only less automatically. This research illustrates the important point that subtle cognitive processing differences can exist between groups of relatively highly skilled L2 users (all bilinguals in this study were able to read mature texts to full comprehension), in which for some people (e.g., the stronger bilinguals) certain underlying processes operated in a ballistic fashion; for others, they did not.

Segalowitz and Segalowitz later proposed a somewhat different approach to the study of automaticity (N. S. Segalowitz & Segalowitz, 1993; S. J. Segalowitz, Segalowitz, & Wood, 1998). Practice and experience with a language typically lead to faster processing, which is commonly reflected in various ways, including faster lexical decision times, faster rates of speaking and reading, and better ability to process rapid speech. N. S. Segalowitz (2000; N. S. Segalowitz & Segalowitz, 1993) pointed out, however, that if the construct of automatic processing is to have explanatory value, then, to avoid circularity, the term automatic should be more than a synonym for fast.

As a consequence, there is a need to distinguish operationally between the following two situations, each involving a contrast between fast and slow performance. The first is Situation A, in which the faster performance is simply caused by a difference in the run-time speeds of the processes underlying performance and not some difference in the selection of which processes are involved or in the way processes interact with each other. In this case, there is no need to invoke the idea of automatic processing, defined now to mean more than fast processing to avoid circularity, to explain the difference in performance.

In contrast, there is Situation B, in which faster performance is caused by more than just a difference in the speed of underlying processes. Here, the difference may lie in the way underlying processes are organized, such as when L2 visual word recognition proceeds directly from the printed stimulus to meaning activation without first passing through a stage of phonological recoding or translation into L1. Or, instead, the difference might lie in the internal organization of a given process without necessarily involving the elimination of one or more stages of processing. Such differences could lead, for example, to more ballistic processing, more parallel processing, and so on, resulting in significantly faster and more efficient performance.

N. S. Segalowitz and Segalowitz (1993) described the fast–slow contrast of Situation A as a case reflecting simple speed-up and the fast–slow contrast in Situation B as a case for which the difference can more appropriately be attributed to automaticity. They proposed that when attempting to determine if automaticity underlies a given case of fast responding, an attempt should be made to reject the null hypothesis that the performance could be caused by merely generalized speed-up.

N. S. Segalowitz and S. J. Segalowitz (1993; S. J. Segalowitz et al., 1998) proposed a way to test, and therefore potentially reject, the speed-up null hypothesis. They argued that, when faster processing is caused only by generalized speed-up of the processes underlying performance, the standard deviation of the reaction time should drop proportionally to the reduction in the reaction time. This idea can be understood at an intuitive level by considering the following metaphor. Suppose a videotaped recording of a person making a cup of tea on 50 different occasions is viewed. Each component of
the action (putting the water on to boil, pouring the hot water into a cup, inserting the tea bag, and so on) will take a particular length of time. A mean execution time and a standard deviation for this mean can be calculated across the 50 repetitions both for the global action of making tea and for each component of this event. Suppose now a new videotape is created by rerecording the original at twice the normal speed. On the new tape, the entire event will appear to be executed in half the time with half the original standard deviation overall; moreover, the mean duration of each component and the standard deviation associated with each component will also be reduced by exactly half. This situation corresponds to what N. S. Segalowitz and Segalowitz (1993) argued to be the null case of generalized speed-up; performance becomes faster because the underlying component processes are executed more quickly and for no other reason. (Of course, this account makes a number of simplifying assumptions about the brain, including that the component processes are organized serially only. They probably are not. However, the scenario described would apply to both the nonoverlapping aspects of the underlying components and to those that are organized serially, which together determine the total time of execution.)

Suppose now we are shown still another videotape in which the mean time for the global action of making tea is again half the original mean time, but the standard deviation for the 50 repetitions is far less than half the original standard deviation. This tape cannot have been produced simply by rerecording the original at twice the normal speed. Instead, there must have been some change in the way the activity of making tea had been carried out, such that some of the slower and more variable components of the action sequence had been dropped or replaced by faster, less-variable components. In other words, there must have been a change that involved more than simple speed-up, namely, some form of restructuring of the underlying processes.

According to this approach, if it is believed that practice and experience have produced some cognitive change other than generalized speed-up (restructuring, more ballistic processing, reduced reliance on decision processes, and so on), then one should try to reject the null hypothesis represented by generalized speed-up. N. S. Segalowitz and Segalowitz (1993) proposed that if faster performance reflects more change than is accounted for by speed-up, then the standard deviation should change by a greater proportion than that seen in the reaction time. Put another way, if the coefficients of variability (the ratio of the standard deviation to the mean reaction time for each individual) remain the same while reaction times become faster (that is, both standard deviation and reaction time change by the same proportion), then there will be no grounds for rejecting the speed-up null hypothesis, and there will not be a significant correlation between reaction time and coefficient of variability across subjects. If, on the other hand, the coefficient of variability is significantly reduced as reaction time becomes faster, then the null hypothesis can be rejected, and a claim can be made that there has been a change, which N. S. Segalowitz and Segalowitz (1993) called automatization, that must reflect a different recruitment or organization of underlying mechanisms. In this case, as the reaction time reduces, so does the coefficient of variability, and there will be a significant correlation between the two. (See Wingfield, Goodglass, & Lindfield, 1997, for a different approach to dissociating speed of processing from automaticity.)

N. S. Segalowitz and Segalowitz (1993) collected lexical decision data from adults who varied in ability in L2 English or L2 French (S. J. Segalowitz et al., 1998). The results were consistent with their approach for distinguishing automaticity from speed-up. They found that coefficient of variability varied with reaction time in those conditions for which faster responding was logically expected to reflect a change involving more than just speed-up. They also found that coefficient of variability did not vary when faster responding was expected to reflect only speed-up (see also N. S. Segalowitz, Poulsen, & Segalowitz, 1999). These results are interesting for two reasons. Methodologically, they demonstrate how to move beyond merely speculating that an observed case of increased performance speed reflects a higher level of automaticity; it is now possible to assess the degree to which this performance is not solely attributable to generalized speed-up. On a theoretical level, this research demonstrated that higher levels of L2 proficiency, unlike lower levels of L2 proficiency, are associated with more than just differences in processing speed.

Caution must be taken, of course, when using the coefficient of variability analysis just described. Failure to reject the generalized speed-up hypothesis carries with it the usual caveats concerning failure to reject the null hypothesis; it is always wise, therefore, to have convergent evidence to support a generalized speed-up account to conclude from
failure to reject that speed-up is what actually occurred. Also, the method of analysis proposed by N. S. Segalowitz and Segalowitz (1993) does not address the many interesting questions that could be asked regarding the kind of change that has taken place when analysis supports a claim for automatization; it only allows concluding that something other than generalized speed-up occurred.

Further research is always required to pinpoint the exact nature of the change; however, analysis of the coefficient of variability may again be useful in that follow-up research. For example, suppose the results of a study indicated that L2 word recognition became faster after some particular form of training, and that a generalized speed-up explanation can be rejected by the coefficient of variability analysis. Follow-up research using a design permitting a coefficient of variability analysis could be useful for looking into whether performance improved because, say, perception of orthographic redundancies (knowledge of spelling pattern frequencies) or phonological recoding had become more automatic.

Automaticity and Grammar Rule Acquisition

Perhaps one of the most hotly debated issues in the field of foreign or L2 learning concerns the learning and subsequent use of explicit grammar rules. Currently, there are three main theoretical positions on this issue, commonly referred to as the strong interface, weak interface, and no interface positions (R. Ellis, 1993; Larsen-Freeman & Long, 1991, p. 324). Adherents of the strong interface position claim that explicit, declarative knowledge can be transformed or converted into implicit knowledge through practice, as proposed in Anderson's skill acquisition theory (Anderson, 1983; Anderson & Lebiere, 1998). According to the weak interface position, explicit, declarative knowledge may somehow, in a way not yet properly understood, facilitate the acquisition of implicit, procedural knowledge. The no interface position denies a causal role of explicit knowledge in the acquisition of implicit knowledge. In the area of language pedagogy, Krashen (1981) is perhaps the best-known proponent of the no interface position. For a discussion of the theoretical issues involved in the three positions, see the work of R. Ellis (2000), Hulstijn (2002), and Paradis (1994).

Little empirical research has been conducted to test claims made on the basis of these three positions in relation to issues of automatization.

Robinson and Ha (1993) and Robinson (1997) investigated the learning of the so-called dative alternation rule of English by adult speakers of Japanese, Korean, and French (Robinson & Ha, 1993) and Japanese (Robinson, 1997) in a single learning session lasting not longer than 30 min. (Dative alternation refers to the fact that, for some monosyllabic verbs in English, the indirect object form can alternate with the direct object form, as in "She gave the book to the boy" and "She gave the boy the book," whereas some bisyllabic verbs only allow the indirect form, as in "She donated the painting to the museum.") In these studies, automaticity was defined as reaction time patterns conforming to the power law. Participants in the 1993 study were presented with the dative alternation rule. Subsequently, in the training phase, they were shown 36 sentences, one at a time. They had to indicate whether the sentence did or did not conform to the rule just presented. Feedback was given on the correctness of each response. There were 8 sentences in the training set. One sentence was presented eight times, one sentence seven times, one six times, and so on, and the resulting 36 presentations were given in random order. In a subsequent transfer test, participants performed the same task, this time with 32 sentences, 8 of which were identical to the ones used in the training set. Reaction times of responses to old sentences, which had been presented in the previous training phase, were faster than those to new sentences. However, no evidence was found for the hypothesis, based on Logan's instance theory (1988), that reaction times would be faster for sentences presented more often in the training phase than for sentences presented less often.

In interpreting the complex findings of this study, it is important to bear in mind that it was concerned with the application of a rule, explained in advance, in a metalinguistic task (grammaticality judgment) rather than in a functional listening, reading, speaking, or writing task, and that the training phase comprised only 36 trials. We concur with DeKeyser's (2001) interpretation that neither rule application nor instance retrieval was at work, but a similarity-based item retrieval process (pp. 142-143). The pattern of results in the 1997 study, which adopted a more complex design and addressed other issues in addition to Logan's instance theory, was similar to that of the 1993 study regarding instance learning. Again, no gradual improvement as a function of number of previous item presentations was found. In summary, the two Robinson studies did not provide evidence for automatization as operationally defined.
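
As a rough illustration of the two operational definitions of automatization that appear in the studies reviewed so far, the following Python sketch computes the coefficient of variability (the standard deviation of the reaction times divided by their mean) and fits a power function to reaction times by least squares on log-log axes. It is not taken from any of the cited studies: the reaction times are invented, the function names are hypothetical, and the power function is assumed to have the simple form RT = a * N^(-b) with no asymptote.

```python
import math
import statistics


def coefficient_of_variability(rts):
    """Standard deviation of the reaction times divided by their mean."""
    return statistics.stdev(rts) / statistics.mean(rts)


def fit_power_law(trial_numbers, rts):
    """Fit RT = a * N**(-b) by least squares on log-log axes; return (a, b).

    Assumes a pure power function with no asymptote (a simplification).
    """
    xs = [math.log(n) for n in trial_numbers]
    ys = [math.log(rt) for rt in rts]
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


# Invented reaction times in milliseconds (illustration only).
baseline = [620.0, 700.0, 810.0, 940.0, 1130.0]

# Generalized speed-up: every response is exactly twice as fast, so the
# mean and the standard deviation are both halved.
speed_up = [rt / 2 for rt in baseline]

# Restructuring: the mean is also halved, but the variability shrinks by
# far more than half.
restructured = [400.0, 410.0, 420.0, 430.0, 440.0]

cv_baseline = coefficient_of_variability(baseline)
cv_speed_up = coefficient_of_variability(speed_up)
cv_restructured = coefficient_of_variability(restructured)

# Invented practice curve that follows a power function exactly.
trials = list(range(1, 9))
practice_rts = [900.0 * n ** -0.4 for n in trials]
a_hat, b_hat = fit_power_law(trials, practice_rts)
```

With these invented numbers, halving every reaction time leaves the coefficient of variability unchanged (the speed-up null case), whereas the restructured set has the same halved mean but a much smaller coefficient. The fitted exponent describes how steeply latency falls with the number of presentations; a flat exponent across presentation counts would correspond to the absence of gradual improvement reported in the Robinson studies.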
Healy and her coworkers (Bourne, Healy, Parker, & Rickard, 1999; Healy, Barshi, Crutcher, et al., 1998) investigated the acquisition of easy and difficult rules by adult native speakers of English. The easy rule required pronunciation of the article "the" as "thuh" or "thee" when preceding nouns beginning with a consonant or a vowel, respectively. The difficult rule required judging the order of letters in meaningless three-letter sequences, such as the invalid LMV and the valid PRQ sequences. PRQ is valid because it can be rearranged to correspond to a sequential string in the alphabet (PQR), whereas LMV cannot. Participants in both experiments were presented with well- and ill-formed stimuli. They judged the stimuli's well-formedness and received feedback on the correctness of their responses. Participants also reported whether their responses were based on a guess, on a rule, on memory of the instance, or on other strategies. In both experiments, response accuracy rose to around 95%, and latencies dropped over the course of 30 learning blocks. Healy et al. (1998) reported that:

    Although all subjects [in the difficult-rule experiment] guessed initially, many subjects soon discovered and started using the rule. However, by block 6, rule use began to give way to an instance strategy so that by the end of 30 blocks of practice, subjects exhibited the instance-based strategy almost exclusively. (p. 26)

In the easy-rule experiment, 40% rule use was reported initially, suggesting that some participants, not surprisingly, were familiar with the thuh/thee rule from the start. In Block 30, participants reported using the rule 65% of the time. An interesting finding was that, in the case of the easy rule, rule use resulted in faster response latencies than did use of the instance strategy, whereas the reverse pattern was obtained in the case of the difficult rule. In interpreting these results, one has to bear in mind, as in the case of the Robinson studies reported above, that participants were engaged in a metacognitive judgment task rather than in a speech production task requiring the application of the rules.

N. C. Ellis and Schmidt (1997) and DeKeyser (1997) investigated how adult, literate native speakers of English acquired some rules of grammar of an artificial language in a computer-controlled laboratory setting. These experiments were limited to the written mode for input and output; listening and speaking were not involved.

In all three studies, the participants had to learn patterns in an artificial language that were analogous to grammatical rules. Participants in the first experiment reported in the study of N. C. Ellis and Schmidt (1997) had to learn plural forms in an artificial language (e.g., "bupoon" for the plural of the artificial word "poon," meaning plane), some of which conformed to frequency criteria that made them regular plurals, whereas others did not and hence were irregular. Participants studied the artificial language names given to 20 picture stimuli in 15 one-hour sessions spanning up to 15 days. Participants in the second experiment were shown meaningless artificial language sentences for a period of 75 min. Participants in DeKeyser's study (1997) were shown artificial language sentences with pictures illustrating their meaning in 22 sessions of an hour or less, spread over an 11-week period. The exposure-learning regimes in these studies differed somewhat, but they had in common that both accuracy and reaction times of participants' responses were measured during the learning. All three studies showed an increase in response accuracy for stimuli conforming to the appropriate grammatical patterns and a concomitant decrease in latency over the course of trials and sessions following a power law of learning. The authors interpreted these results as evidence for automatization of grammar learning. The main focus of these studies, however, was on the issues of implicit versus explicit learning and top-down learning by rule versus bottom-up learning by association and analogy.

N. C. Ellis and Schmidt (1997) argued that the findings of their studies can be accounted for by a simple associative learning mechanism even in the case of the acquisition of regular rule-governed forms. DeKeyser (1997) found that performance in both comprehension practice and production practice followed the same power function learning curve, but that acquisition was skill specific, showing little transfer from comprehension to production and vice versa. DeKeyser argued that the learning of L2 rules proceeds in much the same way as learning in other cognitive domains and can be accounted for by Anderson's model of skill acquisition, according to which declarative knowledge, with practice, turns into procedural knowledge.

One of the crucial issues in the debate between proponents of the strong, weak, and no interface positions is concerned with the meaning of the expression "turn into" ("transform" is used as a synonym in this debate) when it is claimed by some and denied by others that explicit knowledge can turn
into (or transform into) implicit knowledge. Does this mean that explicit knowledge undergoes a metamorphosis such that, eventually, explicit knowledge has ceased to exist and that, in its place, implicit knowledge has arisen?

Such a view implicitly rests on the idea that first there is an area in the brain where explicit knowledge resides, and furthermore that, during the process of proceduralization, implicit knowledge is formed, settling itself in the same area, forcing explicit knowledge to dissolve. However, such a strong view of transformation is not supported by brain research. Brain research suggests that declarative knowledge resides in the medial temporal lobe, including the hippocampus, whereas implicit knowledge is distributed over the neocortex (Paradis, 1994; Squire & Knowlton, 2000; Ullman, 2001). Viewed from this neurophysiological perspective, the strong interface position in the L2 acquisition field should be taken to mean that explicit knowledge forms a prerequisite for implicit knowledge to come into existence rather than to mean that explicit knowledge transforms into implicit knowledge.

The evidence from the studies reviewed in this section is consistent with Willingham's (1998) position that, already in the initial phases of learning, implicit knowledge is spontaneously formed, and that explicit processes are simply not used any longer in later phases. The practical relevance of the interface issue remains great: Of course, language teachers and language learners alike want to know to what extent knowledge of grammar rules may foster or hinder the attainment of fluency in language use. In terms of theoretical explanations, however, the interface issue is likely to form part of the much broader neurocognitive issue of explicit and implicit cognition. Empirical evidence may come not only from behavioral data (such as response time and response variability, presented elsewhere in this chapter) but also from neurophysiological data (such as event-related potentials and neuroimaging).

Automaticity and Attention in Second Language Proficiency

The research reviewed so far attempted to integrate the concept of automaticity with theories of L2 proficiency and L2 grammar learning. N. S. Segalowitz (1997, 2000) proposed, however, that automaticity addresses just one component of a larger set of issues underlying cognitive fluency or processing efficiency that is responsible for the linguistic fluency or proficiency (the rapidity, fluidity, and accuracy) observed in a bilingual individual. Besides automaticity, there is a complementary, nonautomatic aspect involving attention-based processes that are also required for fluent language use. These operate closely with more automatic processes to determine the overall underlying efficiency of L2 functioning.

Such attention-based processes include focusing on (directing awareness to) the language itself while learning it, such as the noticing and focus-on-form skills that may be necessary for successful learning (Doughty & Williams, 1998; Lightbown & Spada, 1990; Robinson, 1995; Schmidt, 2001). Selective attention is also involved in fluency, for instance in the ability to focus on the speech stream as a channel of communication under noisy conditions or to focus selectively on phonological cues carrying sociolinguistic messages or on cues to turn taking and the like, as pointed out by Levelt (1989) (see also Eviatar, 1998, and Fischler, 1998, for more on selective attention and language). Finally, there is the attention-directing function of language itself, in which language is used to shape the way a listener or reader builds a mental representation of the message conveyed. This attention-directing function is believed by cognitive linguists to be central to the communicative purpose of language (Langacker, 1987; Talmy, 1996, 2000).

N. S. Segalowitz and Frenkiel-Fishman (in press) found that L2 skills reflecting attention-directing functions of language were significantly related to levels of automaticity of single-word recognition as indexed by the coefficient of variability measure described here. This study involved an attention-shifting task adapted from Rogers and Monsell's (1995) alternating runs paradigm. The stimuli were time adverbials and conjunctions, both good examples of words that serve to direct a person's attention in particular ways while building a mental representation of a message's meaning. Time adverbials direct the listener/reader on how elements of a mental representation should be foregrounded or backgrounded with respect to time. Conjunctions convey the need to form particular links between elements of a mental representation.

Participants were given two tasks (N. S. Segalowitz & Frenkiel-Fishman, in press). In one they had to judge the meaning of a target word belonging to the time adverbial stimulus set. The other task required them to judge a conjunction. For example, in the time adverbial task, subjects
judged whether a word ("soon," "later," etc.) referred to a moment in time relatively close to or relatively far from the present moment (as an illustration, compare the meanings of "I'll do it soon" versus "I'll do it later"). In the conjunction task, subjects judged whether a word ("because," "despite," etc.) normally indicates the presence or absence of a causal link between the clauses it conjoins (e.g., compare "John passed the exam because he studied all night" versus "John passed the exam despite partying all night").

In the N. S. Segalowitz and Frenkiel-Fishman (in press) experiment, on each trial either a time adverbial or a conjunction appeared in one of four spatial locations on a screen. This location indicated which task (time adverbial or conjunction judgment) was to be performed. As in the work of Rogers and Monsell (1995), the tasks alternated in a predictable manner according to the sequence . . . adverbial, adverbial, conjunction, conjunction . . . , thereby requiring a repeat of a given task and a switch to the alternate task on every second trial. This design provided a measure of the switch cost, that is, the cost in response time to switch from one task to the other, compared to repeating a task. Participants performed the experiment in separate L2 and L1 blocks, thus providing a measure of switch cost in each language.

In a separate part of the study (N. S. Segalowitz & Frenkiel-Fishman, in press), subjects' ability to process L2 word meaning was indexed in terms of the coefficient of variability of latency (as discussed in the section Studies of Automaticity in Second Language Proficiency) in a classification task in which nouns were judged as referring to living or nonliving objects. Here also, L1 measures were used as baseline. The results indicated that the switch cost in L2 was significantly correlated with the coefficient of variability of reaction time in the classification task after taking into account performance on the same tasks in L1. The results were interpreted as indicating that attention-focusing skill is related to proficiency as indexed by switch cost and coefficient of variability of reaction time, respectively.

Although it is beyond the scope of this chapter to discuss further the role of attention-based processes in L2 functioning and in fluency acquisition (see Schmidt, 2001), it is important to keep in mind that automaticity cannot really be talked about without also talking about attention. Automaticity is recognized only by virtue of its contrast to nonautomatic or less-automatic (attention-based) modes of processing.

Moreover, there are automatic modes of processing that are fully integrated within nonautomatic modes, and it is impossible to fully tease them apart. To illustrate, consider the relatively simple case of reading a sentence in L2. One has to process letters, words, and syntactic patterns and integrate all this into the ongoing construction of a representation of the meaning of the sentence and of the larger text. Reading will, of course, be fluent to the extent that many of the mechanisms involved are ballistic and do not consume resources better used for other purposes. However, such a need for automaticity can be identified at all levels of processing, from relatively low-level letter recognition to aspects of relatively high-level attention focusing (see also Tzelgov, Henik, & Leiser, 1990, for a similar point).

This tight relationship between automatic and attention-based mechanisms raises the following interesting question that has yet to be addressed empirically: Do the attention-based and automatic components of proficiency develop independently? If so, can such development account for individual differences in learning success in a given learning context (e.g., a classroom, study abroad, immersion, etc.)? If not, there are at least three alternatives to consider: (a) Do attention-based language skills require a threshold level of automatic processing before they can develop? (b) Does the acquisition of automatic processing abilities require some critical level of supporting attention-based mechanisms in place? (c) Should automatic and attention-based processing be conceived as mutually dependent? These questions have important practical value in addition to theoretical interest because the answers may point in particular directions regarding the most effective way to organize L2 learning experiences.

Studies Using Bilinguals to Investigate Automaticity

The studies reviewed in the previous sections directly addressed questions about the role played by automaticity in bilingualism. Next, we review several related examples of research that made use of the automatic and nonautomatic characteristics of bilingualism to study automaticity itself and related constructs in addition to contributing directly to an understanding of bilingualism as such.

One interesting study in this category is Meuter and Allport's (1999) study of attention. Meuter and Allport were interested in the processes responsible for the shift cost or slowed response time
observed when subjects have to perform tasks in two different languages. In their study, bilinguals named numerals shown on the screen using L1 or L2 in a paradigm in which the language of response was cued by color.

Meuter and Allport (1999) found that the cost associated with switching to L1 (that is, the slowing of the L1 response observed after having just responded in L2 compared to having just responded in L1) was greater than the cost associated with switching to L2. This effect is paradoxical because normally it would be expected to be easier to switch to the stronger L1 than to switch to the weaker L2. In fact, however, the authors had predicted this paradoxical effect from their theory of the nature of switch costs. They believe that the cost observed on a given switch trial reflects the need to overcome inhibition activated on the immediately preceding trial. Thus, on a switch trial involving an L2 response, the bilingual has to suppress or inhibit the automatic activation of the competing stimulus name in L1 to respond correctly in L2. If, however, the switch trial requires an L1 response, the bilingual has to do two things: cancel the inhibition to responding in L1 that was activated on the previous trial and overcome any persisting inhibition from that trial.

Meuter and Allport's results were consistent with the idea that there is automatic activation of L1 representations in L1 naming tasks, whereas there is little or no automatic activation of L2 representations in an L2 naming task; this L1 activation may be difficult to overcome when competition between the languages is important. Presumably, their paradigm could be adapted to quantify this automatic activation when it is useful to measure an individual's degree of balance between L1 and L2 in terms of automatic processing and attention flexibility. (See also chapter 17, by Meuter, this volume.)

Bialystok (2001) reported a series of highly original studies on the possible cognitive benefits associated with early bilingualism (also, see Bialystok's chapter 20 in this volume). She investigated what happens when both languages are automatically activated and are always in competition because the individual is growing up bilingual. She compared bilingual children learning their two languages at the same time with monolingual children and found that, in certain nonlinguistic domains, children with strong L2 abilities outperformed monolingual children. The results were consistent with the following idea: Because almost every waking moment involves dealing with language-based cognitive demands, a bilingual with two (or more) equally strong languages at his or her command continually has to inhibit competition from the currently not-to-be-used language(s), competition that arises from the automatic activation of language representations elicited by ongoing thoughts and by stimuli in the environment. For the young bilingual child, this may constitute intensive training of frontal inhibitory systems, training that normally does not occur to the same degree for monolinguals. If correct, this view would then suggest that the automatic activation of language-based representations can, in a bilingual child, have far-reaching consequences by providing sustained training of inhibitory systems that are required even for nonlinguistic cognitive activity (such as those documented by Bialystok). This idea merits further investigation, especially through studies using more direct measures of the automatic nature of language activation and suppression.

Automaticity has also been studied in bilinguals with a view to understanding the nature of lexical access. Tzelgov, Henik, Sneg, and Baruch (1996), for example, exploited certain automatic aspects of reading in bilinguals to understand further the nature of lexical access in skilled readers. Some theories of skilled reading hold that readers access meaning from print automatically in a process that is mediated by preassembled phonological representations (Van Orden, 1987) developed during earlier phases of skill acquisition. Other theories suggest that automatization in reading skill acquisition involves a shift from dependence on assembled phonological representations to direct access of meaning from visual input (Waters, Seidenberg, & Bruck, 1984). The mediated access approach characterizes automatic processing as making use of activity-specific precompiled productions, as proposed by Anderson (1983) in his process-based ACT* approach.

In contrast, the direct access approach characterizes automatic processing as a memory-based, single-step retrieval process, similar to Logan's (1988) memory-based instance theory of automaticity (i.e., a shift from algorithmic to instance retrieval). Tzelgov et al. examined bilingual readers in a Hebrew-English version of the Stroop paradigm (Stroop, 1935). They used cross-script homophones, such as Hebrew color words written in the Latin (English) alphabet (e.g., "adom"), and English color words written in Hebrew letters that, when sounded out, sound like English words. Consider now the case in which "adom" is written in green ink, and the correct response is therefore
"green." According to the mediated access approach, if the subject is a Hebrew speaker and a skilled reader of English, he or she will automatically access via a phonological route the concept of red because /adom/ in Hebrew means red. According to the direct access approach, however, the phonologically based link between /adom/ and the concept red will be bypassed.

In a series of experiments with Hebrew-English bilinguals, Tzelgov et al. studied whether the automatic processing underlying skilled L2 reading made use of the phonological route (and hence precompiled productions) or the direct route (and hence instance retrieval). They reported finding a strong cross-script Stroop effect, particularly when the stimulus was a transliteration of a color name in Hebrew, the subjects' L1 ("adom" activating red). The results thus supported the first model described above, namely, that unintentional automatic processing in reading involves precompiled phonological productions and not retrieval of stored instances.

Tzelgov et al. (1996) argued, on the basis of these results and others they obtained, that there is evidence for two different, coexisting forms of automaticity, one involving activity-specific precompiled productions and the other the development of a database for memory retrieval in the execution of the skill in question. The results were also interpreted as support for the asymmetric model of bilingual memory proposed by Kroll (e.g., Kroll & Stewart, 1994) because the Stroop effects were themselves asymmetrical as a function of which language was L1 (see Kroll & Tokowicz, chapter 26, and Dijkstra, chapter 9, this volume, for related discussion).

In sum, it can be seen from these studies that bilingualism can provide a particularly useful situation for studying cognitive mechanisms not only as they relate to L2 processes, but also as they relate to basic, more general cognitive issues such as automaticity.

Instructional Implications

Questions about what role, if any, automaticity plays in L2 acquisition and proficiency will naturally have implications for how to optimize language instruction. Here, the central instructional question is the following: Once language learners have been exposed to new linguistic information, what must they do to be able to achieve later automatic access to that information? As pointed out, the functional, communicative use of language involves the simultaneous manipulation of many linguistic elements at different levels, ranging from the higher levels of content and discourse organization to the lower levels of processing speech sounds and letters (in oral and written communication, respectively). Given the fact that humans have a limited capacity for information processing, it is obvious that language users cannot pay attention to all information at all linguistic levels simultaneously to the same high degree. In most communicative situations, the processing of information at the higher levels (that is, information concerning the content and the course of the communication) consumes much of this limited capacity.

VanPatten (1990), for example, reported a study indicating that, in the early stages of L2 acquisition, learners find it difficult to focus both on message content and various aspects of form (verb form, grammatical functors). Because of the novelty of most communicative acts, the processing of information at the higher levels can hardly be automatized. What can be automatized to a large extent, however, is the processing of information at the intermediate levels of the retrieval of words to express personal thoughts; processing at the lower levels of the planning of the morphosyntactic, phonological, and phonetic aspects of the utterance; and the execution of the planned part of an utterance with the aid of speech organs accompanied by appropriate gestures.

This is what happens during the many years of L1 acquisition and what has turned most adults into fluent speakers of their native tongue. It is therefore important, in the case of L2 instruction, to devise tasks that do not require the allocation of much attention to the higher levels of information, allowing learners to pay attention to information at particular lower levels standing in need of automatization.

The basic principle underlying such tasks is repetition (N. C. Ellis, 2002), but as we discuss in the following sections, simple repetition as such cannot be the whole answer. To the extent learning is promoted through repetition (e.g., in the case of the automatization of the word-by-word understanding of speech), learners should listen to materials that do not contain very many unfamiliar words, preferably several times. Similarly, in the case of reading, learners should be given linguistically easy (but yet authentic or quasi-authentic) texts to allow them to increase their reading speed. Teachers and learners alike should make a
principled distinction between two types of listening and reading activities: exposure to materials containing new linguistic elements for the purpose of acquiring new knowledge and exposure to materials containing familiar elements for the purpose of automatization.

Research on the Training of Lexical Access

Hulstijn (2001, pp. 283–286) discussed various pedagogical approaches designed, based on the ideas presented in the preceding section, to enhance automatic word recognition. Empirical research on the impact of training for automaticity on subsequent reading and writing skills has only just begun (e.g., Schoonen et al., 2003; Van Gelderen et al., 2004). In a study involving 281 high school students in Grade 8 in the Netherlands, Van Gelderen et al. and Schoonen et al. investigated the relative contribution of three sources of linguistic cognition on reading and writing in Dutch as an L1 and English as an L2 (after approximately 250 hours of instruction). The dependent variables were reading and writing both in L1 (Dutch) and L2 (English). The predictor variables fell into three categories: (a) knowledge of language, measured with tests of receptive vocabulary, grammar, and spelling, in both L1 and L2; (b) speed of access to knowledge of language, measured with computer-based tests of word recognition, lexical retrieval, sentence verification, and sentence building, in both L1 and L2; and (c) metacognitive knowledge, assessed with a questionnaire pertaining to knowledge of text characteristics and strategies of reading and writing in L1 and L2. In analyses using structural equation modeling, significant correlations were found between speed measures and measures for reading and writing skills.

Stronger correlations were found (Van Gelderen et al., 2004) between predictor variables and reading and writing in the case of L2 than in the case of L1. However, no variance in L1 and L2 reading or writing performance was uniquely accounted for by the speed measures when the knowledge of language and metacognitive knowledge measures were also entered into the regression analysis. One plausible interpretation of these findings is that most of these low-intermediate L2 learners could already access their L2 knowledge sufficiently fast to allow processing of semantic and pragmatic information at the text level.

Finally, we review a study that attempted to increase L2 automaticity through explicit training using the latency coefficient of variability as an operational definition of automaticity. Akamatsu (2001) trained 46 Japanese university students, who had at least 6 years of prior instruction in English, in seven weekly sessions to recognize 150 English words quickly. In each session, students had to draw separator lines as quickly and as accurately as possible between words that had been printed with no interword spaces. Before and after training, students took a computer-controlled word recognition test. This test comprised 50 nonwords and 50 high-frequency and 50 low-frequency words that had been part of the training set. Both accuracy and reaction time on correct trials improved significantly from pretest to posttest. More interesting, individuals' latency coefficient of variability and reaction time were highly and significantly correlated in the processing of low-frequency words, but not of high-frequency words, both before and after training. The author speculated that students in this study had already passed the automatization phase for high-frequency words. Training of these words had only resulted in speed-up, whereas training with the low-frequency words had produced a qualitative change, reflecting automatization. The results of this study (and those of N. S. Segalowitz, Watson, & Segalowitz, 1995, reviewed in the section "Automaticity and Communicative Approaches to Teaching") support the idea that training activities of a relatively short duration can bring about a qualitative change in the processing of lexical information, indicating a gain in efficiency that reflects more than simple speed-up.

Clearly, research has only begun on the important practical question of how to enhance automatic processing to promote L2 proficiency. Such research is in its infancy because researchers have only just started to identify the learning issues involved and have only recently developed practical performance measures for operationalizing automaticity. There are, of course, ways of bringing L2 materials to the learner and of creating repetition conditions in a manner that could promote automaticity beyond those reviewed earlier. For example, some authors have pointed out that, for any task aimed at helping learners gain fluency in oral production, it is essential to provide learners ample time to plan ahead (Robinson, 2001b). In a review of the literature on factors affecting cognitive complexity of L2 production tasks, Skehan and Foster (2001) claimed that there is "considerable agreement that complexity and fluency are enhanced by pre-task planning" (p. 201). They also pointed out that one of the things speakers do when
they are given time to plan their oral production well ahead of execution is to bring into working memory elements from long-term memory perceived to be relevant to the task at hand.

Automaticity and Communicative Approaches to Teaching

A fundamental question remains, however, about how to best promote automatization through repetition in real learning situations outside the laboratory. In answering this question, we have to take into account the extensive and essentially negative experience with so-called pattern drills of the audiolingual method in the 1960s and 1970s. This method was used to help L2 learners improve their production skills in language laboratories through the use of equipment to listen to audio recordings and by making recordings of their own speech (Rivers, 1967).

One of the main reasons why many of these drills failed to bring about the desired effect on spontaneous language use is that they required learners to focus on grammatical forms almost exclusively. Many drills did not force learners to process information at the higher levels of discourse. These methods gave way to what are called communicative language teaching (CLT) methods that stress the importance of meaningful communication as part of the learning process. Unfortunately, most CLT methods do not provide sufficient repetition to promote automatization. This is because the openness of typical CLT communication activities cannot guarantee there will be the necessary opportunities for repeating and rehearing language input; efforts by teachers to supplement communicative activities with special repetition exercises are largely unsuccessful for the same reasons earlier audiolingual drill methods were (Johnson, 1996, especially pp. 171–172).

Thus, teachers are faced with the following dilemma: Typical methods that provide the repetition necessary for automaticity to develop ultimately fail to promote learning because of the highly decontextualized nature of the repeated material; at the same time, typical communicative methods that provide opportunities to fully contextualize learning through meaningful communication fail to provide the repetition necessary for automatization. Can this dilemma be overcome?

Gatbonton has addressed this problem by proposing an analysis of L2 learning that focuses on repetition leading to automaticity within highly contextualized learning (Gatbonton & Segalowitz, 1988; in press) and by providing sets of systematically constructed materials for practical applications based on this approach (Gatbonton, 1994). First, Gatbonton agreed with others that a fundamental step in early L2 learning is the automatization of formulaic utterances: chunks of language that are routinized even in the speech of native speakers (N. C. Ellis, 1997, 2002; Pawley & Syder, 1983; Wray, 2002; see chapters in Schmitt, 2004). Second, she advocated selecting the utterances to be automatized from among those expressions and utterance frames useful for a variety of communicative purposes. Third, she proposed ways to create activities systematically that are genuinely communicative (i.e., for which the communication meets a genuinely felt psychological need for information) and the activity is inherently repetitive (i.e., the activities involve, in a way that feels natural, the need to report information to many people, one by one). Thus, Gatbonton advocated repetition to promote automaticity of basic communicative utterances within a context that requires the learner to coordinate these learning activities with the control of attention, decision making, and other higher level aspects of language processing. She called this process creative automatization to reflect the idea that the learner achieves automatization through repetition of acts involving the creation of communicatively valuable utterances.

N. S. Segalowitz et al. (1995) provided some preliminary experimental support for this creative automatization proposal in a study using a single case design. In this study, a Greek-speaking psychology student who spoke English as an L2 participated in a psychology tutorial over a 3-week period in which a single article from a psychology journal was analyzed from several different perspectives. Throughout the 3-week period the student performed lexical decision tasks involving a large number of words, including keywords from the studied article and control words matched for frequency but not appearing in the article. The results showed that a measure analogous to the coefficient of variability of the lexical decision reaction time (coefficient of variability could not be used directly because this was a single-subject study) improved significantly for the words contained in the studied article but not for the control words. This result is consistent with the idea that natural and communicatively meaningful activities inherently repetitive can improve automaticity of lexical access. What is not known is how enduring such improved automaticity of lexical access is.
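
As an illustration of why the latency coefficient of variability (standard deviation of latency divided by mean latency) can separate simple speed-up from a qualitative change, the following is a minimal sketch and not the analysis used in the studies reviewed. The latencies are invented, and the function name, variable names, and the 500 ms floor are hypothetical. It assumes the reasoning associated with Segalowitz and Segalowitz (1993): a proportional speed-up leaves the coefficient unchanged, whereas a change that removes slow, variable processing lowers it.

```python
from statistics import mean, stdev


def coefficient_of_variability(latencies_ms):
    """Standard deviation of the latencies divided by their mean."""
    return stdev(latencies_ms) / mean(latencies_ms)


# Hypothetical correct-trial latencies (ms) for one learner before training.
pretest = [620, 710, 840, 590, 930, 760, 680, 1010]

# Scenario 1, simple speed-up: every latency shrinks by the same 20%.
speed_up = [0.8 * rt for rt in pretest]

# Scenario 2, restructuring: latency above an assumed 500 ms floor is cut
# to a quarter, so the slowest and most variable trials shrink the most.
restructured = [500 + 0.25 * (rt - 500) for rt in pretest]

for label, data in [("pretest", pretest),
                    ("speed-up", speed_up),
                    ("restructured", restructured)]:
    print(f"{label:>12}: mean = {mean(data):6.1f} ms, "
          f"CV = {coefficient_of_variability(data):.3f}")
```

In both scenarios mean latency falls, but only in the second does the coefficient fall as well, which is the pattern the studies above treat as evidence of more than speed-up.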
The durability of such gains would be an important question for future research to address.

Summary and Future Directions

We conclude this review by addressing two questions. First, is it possible to have a general theory of automaticity that will apply in a useful way to phenomena of bilingualism? Second, what future directions ought L2 acquisition research on automaticity take?

A General Theory of Automaticity

We have seen that automaticity figures prominently in most accounts of L2 acquisition and proficiency development, just as it does in most accounts of skill acquisition. Nevertheless, the usefulness of studying automaticity cannot be taken for granted (see, e.g., Pashler's reservations [1998, pp. 357–382]). A major stumbling block to a general theory of automaticity is that the term has either been used in a very broad sense, without clear operational definition, or else has been defined narrowly but in different ways by different authors (e.g., in terms of ballistic processing; as a shift from serial to parallel processing; as restructuring resulting in a significant change in latency coefficient of variability; as latency patterns reflecting the power law). These are exactly the same problems that confront L2 acquisition researchers. They are attempting to distinguish between explicit and implicit learning processes, to understand when awareness is and is not useful in learning, to find ways of determining when language functioning is proceeding in an autonomous versus a monitored manner, and to understand the conditions under which autonomous processing might be acquired and enhanced (N. C. Ellis, 1994; Hulstijn, 2002; Robinson, 2001a). Thus, both cognitive psychologists interested in automaticity in general and L2 acquisition researchers interested specifically in how languages are learned face the common challenge of having to tease apart a complex of deeply intertwined issues. Is progress being made on this, or are we moving around in circles?

We think there is reason for optimism. One interesting example of potential progress in this area was provided by LaBerge (1997, 2000b) in his triangular circuit theory of attention. This theory identifies particular neural circuits as underlying so-called attention-based and automatic phenomena. These involve neuron clusters in the posterior and anterior cortex for the perception of objects, their attributes, and the organization and execution of action plans; excitatory neurons in the thalamic nuclei that, by virtue of their wide cortical distribution, can selectively enhance cortical activity; and frontal cortex circuitry responsible for control. The linking of these sites forms what LaBerge referred to as a triangular circuit of attention. Awareness of an object is said to occur when an attentional circuit for that object becomes linked to an attentional circuit related to the self, such as a self-attended representation of a person's spatial or temporal location in relation to the attended object. LaBerge separated automatic processing from attention-based processing in terms of the presence or absence of activity in these triangular circuits.

This theory has generated considerable discussion. For example, Tzelgov (1999), basing his work on LaBerge's theory, proposed that automaticity be used to refer to cases when there is activation of a triangular circuit not involving a self-attended circuit (see also LaBerge, 2000a, for commentary on this). By explicitly proposing neural correlates of attention-based and automatic phenomena, LaBerge raised the bar in the way we talk about automaticity. It is hoped that in time the multiple criteria that have up to now complicated discussion about automaticity will become more precisely defined and distinguishable from one another in terms of underlying neural mechanisms, whether in terms of LaBerge's theory or some other neurobiological approach to attention.

Even prior to the emergence of neural theories of automaticity, there has been a growing consensus that the common element in most automatic phenomena is ballisticity, the unstoppable execution of a process once triggered (Bargh, 1992; Favreau & Segalowitz, 1983; Neumann, 1984; Tzelgov, 1999). Although clearly still a work in progress, it appears that it may become possible to provide an account that integrates the neural and behavioral evidence for ballistic processing, thereby allowing more rigorous specification of the relation between automatic and other closely related phenomena.

Future Second Language Research on Automaticity

The developments identified in this chapter should make it possible to address basic questions in L2 acquisition in ways not before possible, using
neurophysiological measures (such as event-related potential and neuroimaging) as well as behavioral measures (such as reaction times of responses elicited in a variety of single and dual tasks). Can we monitor the degree of automatic (ballistic) processing in L2 learners at different stages of acquisition? Can we do so for specific aspects of L2 cognition?

The current storage-versus-computation debate in linguistics (cf. Nooteboom, Weerman, & Wijnen, 2002) concerning the division of labor between the lexicon (containing chunks of ready-made, stored linguistic information) and the grammar (containing procedures for computing or parsing remaining linguistic information, in language production and reception, respectively) may be highly relevant for the questions of (a) which linguistic phenomena are amenable to automatization and (b) to what extent knowledge of grammar rules can foster or hinder automatization. Perhaps the success of L2 acquisition that results in increasingly fluent behavior resides, at least partly, in greater availability of ever-larger, preassembled linguistic units and the reduced need to compute information.

As we point out in this chapter, there is reason to believe that it is especially at the intermediate levels of syntactic, morphological, and phonological encoding/decoding, as well as at the lower levels of articulation and perception of acoustic or orthographic signals, that component processes can become automatic to a large extent. Nevertheless, under certain circumstances, the language user can consciously monitor the outcome of these processes and, for instance, decide to repair an error.

A further question that remains to be studied concerns the relationship between the ability to mobilize attentional resources (e.g., noticing) and L2 acquisition. Is noticing a cognitive prerequisite for attaining fluency? If so, for which linguistic phenomena and at which levels of processing might this be the case? How do neurobiological mechanisms of attention and automatic processing determine the cognitive efficiency that underlies high levels of language proficiency? What are the most effective ways to promote proficiency in terms of changing the way attention and automatic processes operate? Can neuroimaging techniques be used to monitor such change (see especially chapters 23 by Hull & Vaid and 24 by Abutalebi, Cappa, & Perani in this volume)?

The acquisition of, and functioning in, an L2 provide paradigmatic examples of the challenges facing cognitive scientists interested in how people acquire the ability to perform complex skills. The availability of the new techniques and the new ways of conceptualizing the issues reviewed here promise to bring important insights to this area.

Acknowledgments

We thank Laura Collin, Elizabeth Gatbonton, Randall Halter, Patsy Lightbown, and Irene O'Brien for helpful comments on earlier versions of this chapter. Support for this chapter came from a grant to Norman Segalowitz from the Natural Sciences and Engineering Research Council of Canada.

References

Ackerman, P. L. (1988). Determinants of individual differences during skill acquisition: Cognitive abilities and information processing. Journal of Experimental Psychology: General, 117, 288–318.
Ackerman, P. L. (1989). Individual differences and skill acquisition. In P. L. Ackerman, R. J. Sternberg, & R. Glaser (Eds.), Learning and individual differences: Advances in theory and research (pp. 165–217). New York: Freeman.
Akamatsu, N. (2001, February). Effects of training in word recognition on automatization of word-recognition processing of EFL learners. Paper presented at the 2001 annual conference of the American Association for Applied Linguistics, St. Louis, MO.
Anderson, J. R. (1983). The architecture of cognition. Mahwah, NJ: Erlbaum.
Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Erlbaum.
Bargh, J. A. (1992). The ecology of automaticity: Toward establishing the conditions needed to produce automatic processing effects. American Journal of Psychology, 105, 181–199.
Bialystok, E. (2001). Bilingualism in development: Language, literacy, and cognition. New York: Cambridge University Press.
Bourne, L. E., Jr., Healy, A. F., Parker, J. T., & Rickard, T. C. (1999). The strategic basis of performance in binary classification tasks: Strategy choices and strategy transitions. Journal of Memory and Language, 41, 223–252.
De Bot, K. (1992). A bilingual production model: Levelt's speaking model adapted. Applied Linguistics, 13, 1–24.
DeKeyser, R. M. (1997). Beyond explicit rule learning: Automatizing second language morphosyntax. Studies in Second Language Acquisition, 19, 195–221.
DeKeyser, R. M. (2001). Automaticity and automatization. In P. Robinson (Ed.),
Cognition and second language instruction (pp. 125–151). Cambridge, MA: Cambridge University Press.
Doughty, C., & Williams, J. (Eds.). (1998). Focus on form in classroom second language acquisition. Cambridge, U.K.: Cambridge University Press.
Ellis, N. C. (Ed.). (1994). Implicit and explicit learning of languages. New York: Academic Press.
Ellis, N. C. (1997). Vocabulary acquisition: Word structure, collocation, word-class, and meaning. In N. Schmitt & M. McCarthy (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 122–139). Cambridge, U.K.: Cambridge University Press.
Ellis, N. C. (2002). Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition, 24, 143–188.
Ellis, N. C., & Schmidt, R. (1997). Morphology and longer distance dependencies: Laboratory research illuminating the A in SLA. Studies in Second Language Acquisition, 19, 145–171.
Ellis, R. (1993). The structural syllabus and second language acquisition. TESOL Quarterly, 27, 91–113.
Ellis, R. (2000, September). The representation and measurement of L2 explicit knowledge. Paper presented at the conference on Language in the Mind, National University of Singapore.
Eviatar, Z. (1998). Attention as a psychological entity and its effects on language and communication. In B. Stemmer & H. A. Whitaker (Eds.), Handbook of neurolinguistics (pp. 275–287). New York: Academic Press.
Favreau, M., & Segalowitz, N. S. (1983). Automatic and controlled processes in the first- and second-language reading of fluent bilinguals. Memory & Cognition, 11, 565–574.
Fischler, I. (1998). Attention and language. In R. Parasuraman (Ed.), The attentive brain (pp. 381–399). Cambridge, MA: MIT Press.
Gatbonton, E. (1994). Bridge to fluency: Speaking. Scarborough, Ontario: Prentice Hall Canada.
Gatbonton, E., & Segalowitz, N. S. (1988). Creative automatization: Principles for promoting fluency within a communicative framework. TESOL Quarterly, 22, 473–492.
Gatbonton, E., & Segalowitz, N. S. (in press). Rethinking communicative language teaching: A focus on access to fluency. Canadian Modern Language Review.
Healy, A. F., Barshi, I., Crutcher, R. J., et al. (1998). Toward the improvement of training in foreign languages. In A. F. Healy & L. E. Bourne, Jr. (Eds.), Foreign language learning (pp. 3–53). Mahwah, NJ: Erlbaum.
Hulstijn, J. H. (2001). Intentional and incidental second-language vocabulary learning: A reappraisal of elaboration, rehearsal and automaticity. In P. Robinson (Ed.), Cognition and second language instruction (pp. 258–286). Cambridge, U.K.: Cambridge University Press.
Hulstijn, J. H. (2002). Towards a unified account of the representation, processing, and acquisition of second-language knowledge. Second Language Research, 18, 193–223.
Hulstijn, J. H., & Hulstijn, W. (1984). Grammatical errors as a function of processing constraints and explicit knowledge. Language Learning, 34, 23–43.
Johnson, K. (1996). Language teaching and skill learning. Oxford, U.K.: Blackwell.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall.
Krashen, S. D. (1981). Second language acquisition and second language learning. Oxford, U.K.: Pergamon Press.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connection between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
LaBerge, D. (1997). Attention, awareness, and the triangular circuit. Consciousness and Cognition, 6, 149–181.
LaBerge, D. (2000a). Clarifying the triangular circuit theory of attention and its relations to awareness: Replies to seven commentaries. Psyche, 6. Retrieved February 20, 2002, from http://psyche.cs.monash.edu.au/v6/psyche-6-06-laberge.html
LaBerge, D. (2000b). Networks of attention. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 711–724). Cambridge, MA: MIT Press.
LaBerge, D., & Samuels, J. (1974). Toward a theory of automatic information processing in reading. Cognitive Psychology, 6, 293–323.
Langacker, R. W. (1987). Foundations of cognitive grammar, Vol. 1: Theoretical prerequisites. Stanford, CA: Stanford University Press.
Larsen-Freeman, D., & Long, M. H. (1991). An introduction to second language acquisition research. London: Longman.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (1999). Producing spoken language: A blueprint of the speaker. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 83–122). Oxford, U.K.: Oxford University Press.
Lightbown, P. M., & Spada, N. (1990). Focus-on-form and corrective feedback in communicative language teaching: Effects on
(2003). First language and second language writing: The role of linguistic knowledge, speed of processing, and metacognitive knowledge. Language Learning, 53, 165–202.
Segalowitz, N. S. (1997). Individual differences in second language acquisition. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 85–112). Mahwah, NJ: Erlbaum.
Segalowitz, N. S. (2000). Automaticity and attentional skill in fluent performance. In H. Riggenbach (Ed.), Perspectives on fluency (pp. 200–219). Ann Arbor: University of Michigan Press.
Segalowitz, N. S. (2003). Automaticity and second languages. In C. Doughty & M. Long (Eds.), The handbook of second language acquisition (pp. 382–408). Oxford, U.K.: Blackwell.
Segalowitz, N. S., & Frenkiel-Fishman, S. (in press). Attention control and ability level in a complex cognitive skill: Attention shifting and second language proficiency. Memory & Cognition.
Segalowitz, N. S., Poulsen, C., & Segalowitz, S. J. (1999). RT coefficient of variation is differentially sensitive to executive control involvement in an attention switching task. Brain and Cognition, 38, 255–258.
Segalowitz, N. S., Watson, V., & Segalowitz, S. J. (1995). Vocabulary skill: Single case assessment of automaticity of word recognition in a timed lexical decision task. Second Language Research, 11, 121–136.
Segalowitz, N. S., & Segalowitz, S. J. (1993). Skilled performance, practice, and the differentiation of speed-up from automatization effects: Evidence from second language word recognition. Applied Psycholinguistics, 14, 369–385.
Segalowitz, S. J., Segalowitz, N. S., & Wood, A. G. (1998). Assessing the development of automaticity in second language word recognition. Applied Psycholinguistics, 19, 53–67.
Skehan, P., & Foster, P. (2001). Cognition and tasks. In P. Robinson (Ed.), Cognition and second language instruction (pp. 183–205). Cambridge, U.K.: Cambridge University Press.
Squire, L. R., & Knowlton, B. J. (2000). The medial temporal lobe, the hippocampus, and the memory systems of the brain. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 765–779). Cambridge, MA: MIT Press.
Stroop, J. R. (1935). Studies of interference in serial and verbal reactions. Journal of Experimental Psychology, 18, 643–662.
Talmy, L. (1996). The windowing of attention. In M. Shibatani & S. A. Thompson (Eds.), Grammatical constructions (pp. 235–287). Oxford, U.K.: Oxford University Press.
Talmy, L. (2000). Toward a cognitive semantics (Vols. 1 & 2). Cambridge, MA: MIT Press.
Tzelgov, J. (1999). Automaticity and processing without awareness. Psyche, 5. Retrieved February 20, 2002, from http://psyche.cs.monash.edu.au/v5/psyche-5-05-tzelgov.html
Tzelgov, J., Henik, A., & Leiser, D. (1990). Controlling Stroop interference: Evidence from a bilingual task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 760–771.
Tzelgov, J., Henik, A., Sneg, R., & Baruch, O. (1996). Unintentional word reading via the phonological route: The Stroop effect with cross-script homophones. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 336–349.
Ullman, M. T. (2001). The neural basis of lexicon and grammar in first and second language: The declarative/procedural model. Bilingualism: Language and Cognition, 4, 105–122.
Van Gelderen, A., Schoonen, R., De Glopper, K., Hulstijn, J., Simis, A., Snellings, P., et al. (2004). Linguistic knowledge, processing speed and metacognitive knowledge in first- and second-language reading comprehension: A componential analysis. Journal of Educational Psychology, 96, 19–30.
Van Orden, G. C. (1987). A ROWS is a ROSE: Spelling, sound and reading. Memory & Cognition, 15, 181–198.
VanPatten, B. (1990). Attending to form and content in the input: An experiment in consciousness. Studies in Second Language Acquisition, 12, 287–301.
Waters, G., Seidenberg, M. S., & Bruck, M. (1984). Children's and adults' use of spelling-sound information in three reading tasks. Memory & Cognition, 12, 293–305.
White, L. (1989). Universal grammar and second language acquisition (Vol. 9). Amsterdam: Benjamins.
White, L. (1996). Universal grammar and second language acquisition: Current trends and new directions. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 85–120). San Diego, CA: Academic Press.
Willingham, D. B. (1998). A neuropsychological theory of motor skill learning. Psychological Review, 105, 558–584.
Wingfield, A., Goodglass, H., & Lindfield, K. C. (1997). Separating speed from automaticity in a patient with focal brain atrophy. Psychological Science, 8, 247–249.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge, U.K.: Cambridge University Press.
Erica B. Michael
Tamar H. Gollan
19
Being and Becoming Bilingual
Individual Differences and Consequences for Language Production
mechanism of suppression (or inhibition; we use these terms interchangeably). A number of studies have drawn a connection between working memory and suppression, and we suggest that this link provides a pathway for understanding why working memory capacity is correlated with a variety of measures of bilingual performance. Specifically, we argue that working memory may serve to resolve competing activation across the two languages not only by maintaining task-relevant information, but also by suppressing interfering activation.
We begin by presenting two influential models of bilingualism and argue that both suggest that suppression skills should play a prominent role in determining a learner's rate of acquisition and ultimate L2 proficiency. In addition, before turning to the working memory literature we discuss cognitive processing differences between bilinguals and monolinguals (see also Bialystok, chapter 20, this volume). We highlight this comparison on the assumption that differences between bilinguals and monolinguals are caused at least partly by the greater cognitive demands imposed by bilingualism relative to knowledge and use of just a single language. Critically, we suggest that differences between bilinguals and monolinguals will reflect aspects of cognitive processing that are helpful in achieving and maintaining proficient bilingualism and are therefore also likely to be related to individual differences in L2 performance.
Achieving proficiency in an L2 requires mastery of many aspects of the language, including phonology, vocabulary, and syntax. We limit the current discussion primarily to lexical processing, examining early stages of vocabulary acquisition in adult L2 learners as well as later processes involved in single-word translation, production, and comprehension. Although ultimately a bilingual must be able to use and interpret words in a larger context, single-word processing forms the basis for these skills and has been the topic of the majority of psycholinguistic research on bilingualism to date.
Models of Bilingualism
Two models of bilingual language processing, the Inhibitory Control (IC) Model (Green, 1998) and the Revised Hierarchical Model (RHM; Kroll & Stewart, 1994), describe a variety of ways to think about individual differences in bilingual processing at the lexical level. These models do not make explicit predictions about the nature of such individual differences in bilingual processing, but each offers suggestions regarding which aspects of L2 lexical processing are likely to be most heavily influenced by individual differences in suppression ability.
The Inhibitory Control Model
Considerable experimental evidence suggests that both of a bilingual's languages are always active to some degree (e.g., Colomé, 2001; Hermans, Bongaerts, De Bot, & Schreuder, 1998; Van Hell & Dijkstra, 2002), thus requiring the bilingual to allocate mental resources to control the relative level of activation of each language. As such, an obvious mechanism to consider as potentially quite important for L2 processing is suppression. Although to our knowledge individual differences in suppression ability and L2 processing have not been directly studied to date, several investigators have begun to explore the relationship between suppression and bilingualism. One relatively recent model of bilingualism that assigns a critical role to suppression is Green's (1998) IC Model, which focuses on mechanisms for the control of bilingual performance at the single-word level.
According to the IC Model, bilingual language processing requires multiple levels of control. One type of control allows an individual to execute the target task rather than another of the many possible tasks afforded by the environmental stimuli. Green (1998) called this type of control the task schema; it is not unique to bilinguals, and it relies on suppression to allow the individual to inhibit competitor tasks in favor of the intended task. Unlike monolingual processing, bilingual processing requires suppression at the language level as well. In picture naming, for example, the bilingual not only must choose which task to perform (e.g., naming the picture instead of determining its category membership), but also must select a language in which to name.
The IC Model proposes that each lexical item has a language tag, denoting it as either L1 or L2, and word selection is based on which language is more active at any given time. According to the IC Model, inhibition plays a key role in many language tasks, as demonstrated in the following procedural description of picture naming: First, pictures activate concepts, which in turn activate associated lexical items, likely including both L1 and L2 words. Suppression must then occur via language tags. If a bilingual is naming pictures in L2, each picture may activate both L1 and L2 words. Based on the intention to name in L2, the
Individual Differences in Bilingualism 391
task schema "name in L2" allows the individual to suppress all words with L1 tags.
Suppression of L1 words is predicted to be more difficult than suppression of L2 words because L1 typically has a higher resting level of activation than L2. Support for this hypothesis comes from studies of language switching, which generally show a larger cost for switching from L2 to L1 than vice versa (e.g., Meuter & Allport, 1999; see also Meuter, chapter 17, this volume). According to the IC Model, this phenomenon occurs because the relative difficulty of suppressing L1 leads to subsequent relative difficulty in reactivating that language when required by the task switch.
The IC Model suggests several loci at which individual differences in suppression ability might have consequences for bilingual language processing. In particular, individuals who are unusually good at suppressing irrelevant information might show a larger-than-average cost of switching from a task requiring production in L2 to a task requiring production in L1. In addition, L1-to-L2 translation may be especially affected by individual differences in cognitive skills because this task requires the ability to suppress word forms in L1. During L1-to-L2 translation, L1 words will be highly active for two reasons: (a) as mentioned, the IC Model proposes a higher resting level of activation of L1 compared to L2, and (b) the stimuli themselves (L1 words) provide external cues that continuously boost the activation of the L1 lexicon.
The Revised Hierarchical Model
Another prominent model of bilingual language processing leads to similar predictions, although it proposes that the suppression mechanism plays a less-direct role in leading to proficient bilingualism. In the RHM (Kroll & Stewart, 1994; see Fig. 26.6 in Kroll & Tokowicz, chapter 26, this volume), the increased difficulty of L1-to-L2 (forward) translation as compared to L2-to-L1 (backward) translation is thought to occur primarily because retrieving an L2 word from a concept is especially hard.
In early stages of L2 acquisition, adults appear to have difficulty accessing the meaning of L2 words. Learners thus tend to adopt a strategy of associating L2 words to their L1 translations, forming strong lexical links from L2 to L1. Because L1 is the dominant language, strong bidirectional links also exist between L1 words and concepts (at all stages of L2 proficiency). In contrast, lexical links from L1 to L2 and bidirectional links between L2 words and concepts are initially weak and only become stronger with increasing proficiency. Because L1 words activate their associated concepts so strongly, forward translation begins with an L1 word strongly activating a concept and only subsequently does the relatively weak concept-L2 link provide access to the appropriate translation equivalent. Backward translation, on the other hand, can be accomplished with relative ease using the strong lexical associations from L2 to L1.
Past research supports the claim that L2 learners follow a course in which they initially associate the L2 to the L1 at a lexical level and only later acquire the ability to conceptually mediate L2 words. Talamas, Kroll, and Dufour (1999) showed that less-proficient bilinguals made more errors of lexical form (such as confusing the Spanish word hombre, which means "man," with hambre, which means "hunger"), whereas more proficient bilinguals made more errors of meaning (such as confusing hombre with mujer, which means "woman"). These data suggest that with increasing proficiency, L2 words activate semantics more directly. In addition, many studies have suggested that early L2 learners perform backward translation with relatively little activation of concepts relative to forward translation. Backward translation is typically much faster than the reverse and is often unaffected by conceptual manipulations such as presenting stimuli in semantically categorized versus mixed lists (Kroll & Stewart, 1994).
A number of studies have revealed contradictory evidence surrounding the proposal that only forward translation is conceptually mediated. For example, De Groot and Poot (1997) and La Heij, Hooglander, Kerling, and Van der Velden (1996) presented evidence suggesting that both directions of translation are largely conceptually mediated. La Heij et al. did, however, agree with Kroll and Stewart that concept activation is easier for L1 words than for L2 words. Thus, many researchers agree that one aspect of L2 learning that is especially difficult is establishing associations between L2 words and concepts. Because this task is thought to be so difficult, it is likely to be a process heavily influenced by individual differences in cognitive skills. In particular, Michael (1998) hypothesized that inhibition of the L1 lexicon is an important component of L2 vocabulary acquisition because it may allow for the development of direct connections between L2 words and concepts. Research examining the precise aspects of L2 lexical processing that are prone to individual differences may ultimately help resolve the controversy about
the role of conceptual mediation in backward translation.
To summarize, the IC Model predicts that bilinguals at all stages of proficiency must be able to suppress the activation of L1 words to produce words in L2. In comparison, the RHM suggests that suppression should be especially important for less-proficient bilinguals, partly because of the difficulty of establishing direct connections between concepts and L2 word forms. The two models do not necessarily contradict each other, but instead highlight different roles for suppression at various stages of bilingual processing.
When bilinguals are relatively successful at suppressing the nontarget language, they would be expected to perform language tasks similarly to their monolingual counterparts. The models discussed above, however, illustrate how the presence of the L2 may change the configuration of the entire language system, including processing of L1. In the next section, we consider differences between bilinguals and monolinguals that, under one interpretation, support the view that bilinguals can never completely suppress the nontarget language and thus are never functionally monolingual.
Bilingual Versus Monolingual Language Processing
Research comparing bilinguals and monolinguals may elucidate some of the necessary cognitive mechanisms for achieving and maintaining proficient bilingualism. The literature on bilingualism is replete with evidence demonstrating that bilinguals perform differently from monolinguals on a broad range of tasks, including both language-based and non-language-based tasks (see also Bialystok, chapter 20, this volume). For example, bilinguals may be slower than monolinguals to identify words in lexical decision (Ransdell & Fischler, 1987). On the other hand, bilinguals may be less affected than monolinguals by concreteness in text recall, perhaps because bilinguals have language-specific retrieval cues that monolinguals do not have (Ransdell & Fischler, 1989). Thus, bilingualism appears to be associated with both cognitive advantages and disadvantages, depending on the type of task used. In contrast to the many observed differences between bilinguals and monolinguals, other studies document a lack of difference between groups (also on a variety of cognitive tasks; e.g., Rosselli et al., 2002). When differences are not found, it may be that bilingualism has no effect on that particular task, or that the task is simply not sensitive enough to detect subtle differences that do exist. Understanding these differences and lack of differences will constrain models of language processing (both mono- and bilingual) by requiring them to specify their underlying cognitive mechanisms.
Perhaps the most intuitively compelling account of differences between bilinguals and monolinguals is the notion of cross-language interference. One of the most obvious things that makes bilinguals different from monolinguals is that, for each concept to be expressed, two very closely matching lexical representations are available (i.e., translation equivalents). If these representations compete for selection, suppression would be quite useful for managing the added interference. Converging justification for predicting that individual differences in suppression ability should be related to bilingual performance comes from studies demonstrating that within-language synonyms (e.g., sofa and couch, the closest thing to translation equivalents in the monolingual cognitive system) remain active quite late in the process of lexical selection in monolingual language production (Jescheniak & Schriefers, 1998; Peterson & Savoy, 1998). However, a serious challenge for the cross-language interference account comes from a number of experimental findings suggesting that coactive translation equivalents sometimes lead to facilitation (i.e., the opposite of interference). Thus, in addition to the intuitively appealing cross-language interference account, it is important to consider other mechanisms that could account for differences between bilinguals and monolinguals.
A second quite compelling and obvious difference between bilinguals and monolinguals is that, at the level of individual words, bilinguals need to learn and then efficiently retrieve roughly twice as many items as monolinguals. By virtue of speaking two languages, bilinguals necessarily spend less time using words particular to either language. Bilinguals thus may be less able to activate lexical representations specific to each language or may have weaker links in the lexical system relative to monolinguals (Gollan & Acenas, 2004). Vocabulary knowledge and single-word retrieval are quite sensitive to individual differences in cognitive ability and to impairment to the cognitive system. For example, vocabulary measures are highly correlated with verbal IQ, and the inability to name objects (or pictures) is one of the most commonly reported cognitive complaints after even very mild
brain damage (Lezak, 1995). Hence, it is reasonable to expect that vocabulary and word retrieval skills should also be sensitive to group differences in experience with language (i.e., mono- vs. bilingualism). Even if translation equivalents do not interfere with (and may even facilitate) one another, it would still be very surprising if the double burden (needing to learn roughly twice as many words) did not lead to both group differences between bilinguals and monolinguals and individual differences within bilinguals.
What follows is a brief review of comparisons between bilinguals and monolinguals in picture naming, proper name retrieval, picture classification, and the verbal fluency task, followed by a discussion of the cognitive mechanisms that these differences suggest are critical for achieving and maintaining proficient bilingualism. We consider two explanations of the differences observed between bilinguals and monolinguals: the cross-language interference account and the weaker links account. The former account most obviously implicates a direct role for suppression in bilingualism, a topic discussed in detail in the "Individual Differences" section of the chapter. At the end of the section, we consider whether bilingualism leads to differences in suppression skill.
Tip-of-the-Tongue States and Proper Name Retrieval
Research (Gollan & Acenas, 2004; Gollan, Bonanni, & Montoya, in press; Gollan & Silverberg, 2001) suggests that there are robust differences between bilinguals and monolinguals when they are asked to produce very-low-frequency words. Such words are commonly the targets of tip-of-the-tongue states (TOTs). During a TOT, speakers fail to retrieve a word or name that they are sure they know. TOTs are often accompanied by a feeling of imminent recall and the ability to report partial information about the target word form (e.g., the first phoneme; R. Brown & McNeill, 1966).
In several studies comparing bilinguals to age- and education-matched monolinguals, bilinguals (including Hebrew-English, Spanish-English, and Tagalog-English bilinguals) consistently had more TOTs even though they demonstrated equal ability to report the first phoneme of the target word and, in the majority of cases, equal rates of spontaneous TOT resolution. The group difference in number of TOTs remained significant in a variety of conditions, including when the task required retrieval of words in one language only, when bilinguals retrieved words in their dominant language only, and when the targets were matched for familiarity across groups.
The increased TOT rate in bilingual adults is reminiscent of studies showing that bilingual children have smaller receptive and productive vocabularies in each language relative to their age-matched monolingual counterparts. Most researchers agree that the bilingual vocabulary disadvantage is not found in bilingual adults (for a review, see Hamers & Blanc, 2000); however, the TOT findings suggest that the disadvantage is in fact present if sufficiently sensitive measures are used to detect it.
To date, two manipulations have eliminated the group difference in TOT rate. First, bilinguals had the same number of TOTs as monolinguals when the targets were cognates, which are words that are similar in both form and meaning across languages (e.g., the English word rhinoceros and the Spanish word rinoceronte, as compared to the noncognate translation equivalents tweezers for the Spanish pinzas). The group similarity in TOT rates for cognates only held true, however, if bilinguals could produce the word in both languages (i.e., if they translated the target word into their other language correctly; Gollan & Acenas, 2004). Otherwise, cognate targets were just as likely to elicit TOTs in bilinguals as their noncognate counterparts. Importantly, in this study, monolinguals showed no significant cognate facilitation effects on TOT rate, thereby confirming that bilingualism per se (not some characteristic of the cognate materials) produced the reduction in bilingual TOT rate. This finding suggests that, when the materials allow bilinguals to make use of their experience with word forms in both languages, the difference between groups goes away.
A second, and in some ways surprising, variable that eliminated the increased TOT rate in bilinguals was the use of proper name targets (Gollan, Bonanni, et al., in press). In fact, in one study the effect was reversed, and bilinguals had significantly fewer TOTs than monolinguals. In a diary study in which participants kept TOT diaries for 4 weeks, bilinguals reported fewer TOTs for proper names relative to their monolingual peers even though (after adjusting for the proportion of time using L1 and L2) bilinguals had more TOTs than monolinguals in L1 and more TOTs in L2 than in L1. In a second experiment of laboratory-induced TOTs (Gollan, Bonanni, et al., in press), bilinguals and monolinguals had the same number of TOTs
for famous names and personal names in which participants attempted to retrieve the names of all the teachers they had from grade school through high school. Importantly, the same bilinguals reported a higher TOT rate relative to the monolinguals in a different condition that required participants to name pictures of objects (thereby replicating the previously reported increased TOT rate). Considered together, these studies showed no consistent bilingual advantage for proper names, but unlike the object-naming findings, neither study produced a bilingual disadvantage in retrieving proper name targets.
The absence of a bilingual disadvantage in these studies is surprising because proper names are the most commonly reported (and often most frustrating and embarrassing) type of TOT target (Cohen & Burke, 1993), and proper name retrieval is notoriously difficult (Burke, Locantore, & Austin, in press). Interestingly, older adults have an increased rate of TOTs relative to younger adults, and the age difference is most pronounced for proper name targets (Burke, MacKay, Worthley, & Wade, 1991). This interaction between age and word type suggests that task difficulty influences TOT rate in other comparisons of different participant groups. If bilingual disadvantages (e.g., the increased TOT rate for object names) resulted from a processing overload specific to the language system, then the difference between bilinguals and monolinguals would be expected to be even more robust in the context of a task that is typically most challenging.
Although generally difficult tasks should be especially likely to produce individual differences, these studies indicate that differences between bilinguals and monolinguals also depend on factors quite specific to bilingualism (e.g., that proper names are typically shared across languages). In summary, both the cognate and the proper name findings clearly indicate that bilinguals do exhibit some relative fluency deficits in comparison with monolinguals, but these deficits are quite limited in scope.
Picture Naming Versus Picture Classification
The TOT state is a dramatic type of retrieval failure that occurs relatively infrequently, approximately once or twice a week (A. Brown, 1991). Other evidence for processing costs that may be related to bilingualism comes from response times during successful picture naming. Gollan, Montoya, Fennema-Notestine, and Morris (in press) demonstrated that Spanish-English bilinguals (even those who reported that English was their strongest language or that they were equally proficient in English and Spanish) were slower to name pictures relative to monolinguals. The same bilinguals, however, were able to classify the same pictures (with counterbalanced presentation of individual pictures between subjects) as human made versus natural equally quickly relative to their monolingual controls. Importantly, both tasks (i.e., picture naming and picture classification) demonstrated robust repetition effects such that repeated pictures were named and classified more quickly on the second and third presentations. These data indicate that both tasks were sensitive to experimental manipulations, and thus the lack of differences between groups could not be attributed to insensitivity of picture classification to all experimental manipulations.
The presence of robust differences in picture naming in the absence of any differences in picture classification provides evidence that the locus of the bilinguals' relative disadvantage arises after semantic processing during the retrieval of language-specific lexical representations. Moreover, because bilinguals were not at a disadvantage on all tasks, the finding strengthens the claim that the differences between groups in picture naming were caused by bilingualism, not by some other correlated factor, such as culture, that might have also differed between the two groups and affected cognitive processing more generally.
The absence of differences in picture classification does not, however, rule out the possibility that bilinguals and monolinguals carried out the task in different ways; there may be two (or more) different but equally efficient ways of classifying pictures. Furthermore, the absence of differences between groups does not rule out the possibility that differences exist that could not be detected by the task used. These findings, however, do clearly suggest that bilingual disadvantages are not general, and that differences between bilinguals and monolinguals may be more likely to occur in tasks that require a language-specific response.
Category and Phonemic Fluency
The verbal fluency task is another language task that involves intact lexical retrieval and, like picture naming, produces robust differences between
bilinguals and monolinguals. In verbal fluency tasks, participants are typically given 60 s to generate as many words as they can that belong to either a given semantic category (e.g., animals, fruits, and vegetables) or a given letter category (e.g., words that begin with the letter F, A, or S; Borkowski, Benton, & Spreen, 1967). In a study comparing Spanish-English bilinguals to English-speaking monolinguals, Gollan, Montoya, and Werner (2002) found that bilinguals (even those who reported their English was as proficient as that of native speakers) produced fewer correct exemplars on 9 of 12 semantic categories and 6 of 10 letter categories tested. Unexpectedly, there was also an interaction between category type and participant type; the bilingual-monolingual difference was larger and more consistent in semantic categories than in letter categories (see Rosselli et al., 2000, for similar findings in older adult bilinguals).
Work in progress by the second author (in collaboration with Victor S. Ferreira) suggests that the group differences may have been reduced on letter fluency because of cognate production (because cognates are easier for bilinguals to produce; see Costa, Caramazza, & Sebastián-Gallés, 2000; Gollan & Acenas, 2004). Relative to their own overall production rates, bilinguals produced more cognates than monolinguals in letter but not semantic categories. These data are shown in Fig. 19.1. Using monolinguals' rate of cognate production as a baseline for how often cognates should occur in letter and semantic fluency in the absence of cross-language facilitation, these data suggest that cross-language facilitation is stronger in the letter fluency task, thereby reducing the fluency difference between groups. This effect may have occurred because letter categories are inherently larger than semantic categories and therefore also contain more cognates. However, other explanations (involving cross-language interference) are possible for the interaction between participant type and category type.
Accounts of Differences Between Bilinguals and Monolinguals
Cross-Language Interference. As noted, one account of the relative bilingual disadvantages in language production tasks is that bilinguals suffer from direct interference between competing lexical representations across languages. Gollan et al. (2002) suggested that the interference account can explain the finding that the bilingual impairment is larger for semantic fluency than for letter fluency if it is also assumed that cross-language interference is stronger for semantically related words (i.e., translation equivalents) than for words starting with the same letter. This assumption seems reasonable given that translation equivalents would be expected to be coactive during a semantically driven task because of the great overlap in meaning.
According to one instantiation of the interference account, during production in the Animals category, semantically related lexical nodes (e.g., the Spanish word for dog, which is perro, and related words such as the English word cat) compete for selection. Because these lexical representations also compete for selection during natural language production, they would be connected by inhibitory links that would be especially strong for translation equivalents that are closest in meaning (see Cutting & Ferreira, 1999, for a discussion of inhibitory links between coactive lexical representations in language production).
In contrast, form-related lexical nodes (e.g., doll and the Spanish word dogal, which means "noose") would only be connected by relatively weak inhibitory links because competition between form-related competitors either does not arise during natural language production (Levelt, Roelofs, & Meyer, 1999) or arises only to the extent that activation from the phonological level (in this case, the phoneme /d/) is allowed to flow back up to the lexical level (e.g., Dell, 1986). In such a model, the impact of bilingualism (or of having twice as many lexical representations active) would be greater for semantic fluency than for letter fluency because, in a semantic fluency task, bilingualism activates lexical representations that strongly interfere with the selection of target lexical nodes.
Interestingly, Rosen and Engle (1997) demonstrated that under a cognitive load, individuals with higher working memory span produced more correct exemplars in semantic fluency and produced fewer perseverations (repeating an exemplar without realizing it) compared to individuals with lower working memory span. These findings suggest a link between suppression mechanisms and semantic fluency and thus seem to provide converging support for the notion that cross-language interference may create a particular problem during semantic fluency. It would be interesting to see if span differences in bilinguals had smaller effects on letter fluency than on semantic fluency, as would be expected if cross-language interference is reduced in letter fluency relative to semantic fluency.
Figure 19.1 The top panel demonstrates bilinguals' greater proportion of cognate production in the letter fluency task relative to monolinguals. The bottom panel shows the raw numbers used to calculate the proportions shown in the top panel. The bottom panel also shows the bilingual group's verbal fluency disadvantage except for the number of cognates produced during letter categories, in which bilinguals actually produced slightly more responses on average (i.e., 5.0 vs. 4.7). These findings suggest that cognate facilitation effects occur during a relatively free language production task (i.e., during category generation), and that cross-language facilitation may explain why the verbal fluency difference between bilinguals and monolinguals was larger on semantic categories. Adapted from Gollan et al. (2002).
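The relation between the two panels of Figure 19.1 can be sketched as a simple ratio. The sketch below assumes that each proportion is a group's mean number of cognates divided by its mean number of correct responses in that category type; the caption does not state the denominator, so this is an assumption. Only the two mean cognate counts (5.0 and 4.7) come from the caption. The totals and the function name are hypothetical placeholders and are not data from Gollan et al. (2002).

```python
def cognate_proportion(cognates: float, total: float) -> float:
    """Return the share of a group's correct responses that are cognates.

    Assumes the proportion is cognates / total correct responses; the
    figure caption does not spell out the denominator.
    """
    if total <= 0:
        raise ValueError("total responses must be positive")
    return cognates / total


# Letter-category means. The cognate counts are the two values quoted in
# the caption; the totals are HYPOTHETICAL, chosen only so that bilinguals
# produce fewer responses overall, as the caption describes.
letter_fluency = {
    "bilingual": {"cognates": 5.0, "total": 12.0},
    "monolingual": {"cognates": 4.7, "total": 15.0},
}

proportions = {
    group: cognate_proportion(**counts)
    for group, counts in letter_fluency.items()
}

for group, value in proportions.items():
    print(f"{group}: {value:.2f} of letter-fluency responses are cognates")
```

Under these placeholder totals, a group that produces fewer words overall but a similar number of cognates ends up with the larger cognate proportion, which is the pattern the caption attributes to the letter categories.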
To explain bilinguals' greater TOT rate and slower picture-naming times relative to monolinguals, the cross-language interference account would assume that translation equivalents compete for selection. This view is consistent with the finding that bilinguals do not have more TOTs than monolinguals for proper name targets (Gollan, Bonanni, et al., in press), possibly because proper name targets simply do not have translation equivalent competitors (i.e., Tamar is Tamar both in Hebrew and in English) or because the cross-language translations of proper names are often cognates that also eliminate the difference between groups (e.g., Dvora and Debora). Thus, it appears that the interference account can explain many of the above-discussed findings (but see Gollan & Acenas, 2004).
The greatest challenge to the interference account comes from data suggesting that the activation of translation equivalents actually facilitates
production. Such cross-language facilitation has thus the connections within the lexical system
been reported not only for tasks that explicitly from semantic to phonological representations are
activate both languages (Costa & Caramazza, weaker (see also Gollan et al., 2002; Gollan &
1999; Costa, Miozzo, & Caramazza, 1999), but Acenas, 2004; Gollan, Montoya, et al., in press).
even for tasks in which the activation of transla- The relative weakness arises because links in the
tion-equivalent lexical representations is implicit lexical system are sensitive to frequency and re-
(Gollan & Acenas, 2004; Gollan, Montoya, et al., cency of use (Burke et al., 1991), and links partic-
in press). In the latter studies, target words that ular to each language are used less often. The
bilinguals were able to translate produced fewer weaker links account assumes a very indirect effect
TOTs (Gollan & Acenas) and faster picture nam- of bilingualism on TOTs; use of words in a non-
ing times (Gollan, Montoya, et al.) than target dominant language results in relatively reduced use
words that bilinguals were not able to translate. of words in the dominant language in comparison
Monolinguals' responses to the same stimuli were with monolinguals. It can explain why bilinguals'
significantly less affected by the target words' increased TOT rate is eliminated when the targets
translatability, indicating that translatable words are cognates by assuming that translation equiva-
were not simply easier to produce. lents facilitate production at the level of phonology
Although translation facilitation effects seem to (as argued by Costa et al., 2000), and similarly that
rule out cross-language interference as an account lexical representations of proper names are unaf-
of differences between bilinguals and monolin- fected because they do not differ across languages
guals, theoretical and intuitive arguments make us (or because they are cognates). The weaker links
reluctant to abandon cross-language interference account can also use cognate production to explain
entirely until an exact mechanism of the reported why semantic uency is relatively more affected by
translation facilitation effects is identied. Also, bilingualism than is letter uency (see Fig. 19.1).
it is not entirely unreasonable to assume that Finally, although strictly speaking the weaker links
interference arises under some conditions and fa- account does not predict translation facilitation
cilitation under others. Moreover, although the effects, it certainly has an easier time accommo-
translation facilitation effects reported by Gollan dating this nding than the interference account
et al. primarily involved English-dominant partici- (because the latter specically predicts the opposite
pants failing to show interference effects from of facilitation).
Spanish (the less-dominant, and therefore less- The weaker links hypothesis predicts that bi-
active, language), it is important to note that a linguals who speak twice as much as monolinguals
subset of the participants in each experiment should not be different from monolinguals on
demonstrated the same facilitation effects even language-processing tasks. Consistent with this
though English was their less-dominant language. prediction, 30% or even 70% of individual bilin-
In other words, for these participants the dominant guals (using 1 or 2 standard deviation cutoffs, re-
language not only failed to interfere, it actually spectively) reported a number of TOTs that fell
facilitated production in the less-dominant lan- within the range of average monolingual perfor-
guage. One possibility is that translation equiva- mance (Gollan & Acenas, 2004; Gollan & Silver-
lents cause interference during relatively early berg, 2001). However, without direct evidence to
stages of L2 acquisition and later come to serve as support the claim that these bilinguals spoke twice
retrieval cues; this hypothesis is consistent with the as much as monolinguals, other reasons must be
RHMs above-outlined prediction that suppression considered. Although highly speculative at this
should play a greater role in achieving procient point, an alternative possibility is that some bilin-
bilingualism than in maintaining it. (See Costa, guals may have unusually strong phonological
chapter 15, this volume, and La Heij, chapter 14, processing skills and therefore may need fewer
this volume, for further discussion of translation exposures to word forms to become procient in
facilitation results.) retrieving them. In fact, some studies reported
stronger phonological processing skills in both bi-
Weaker Links As an alternative to the interference lingual children and adults (e.g., Campbell & Sais,
account, Gollan and Silverberg (2001) suggested 1995; Papagno & Vallar, 1995). Moreover, pho-
that the reason bilinguals have more TOTs, name nological short-term memory (STM) seems to be
pictures more slowly, and have reduced category related to vocabulary acquisition in both L1 and L2
and letter uency is that they use words particular to (Atkins & Baddeley, 1998; Baddeley, Gathercole,
either language less often relative to monolinguals, & Papagno, 1998; Cheung, 1996; De Groot &
398 Production and Control
Van Hell, chapter 1, this volume; Gathercole & Baddeley, 1989; Papagno, Valentine, & Baddeley, 1991; Service, 1992; Service & Kohonen, 1995). That said, however, it would be a mistake to look for improved phonological processing or STM in bilinguals in general; if these advantages existed, then bilinguals (as a group) should not have had more TOTs than monolinguals.
Can Bilingualism Improve Suppression Skills?
Discussions of weaker links and cross-language interference imply that bilingualism imposes drastic processing costs. It is important to note, however, that bilinguals must be viewed under a magnifying glass before such costs can be seen, and in most cases the advantages of bilingualism (whether cognitive, practical, or cultural) far outweigh the costs. To illustrate, most people would prefer occasionally having more difficulty retrieving low-frequency words (i.e., falling into TOTs for words like carburetor more often than their monolingual friends, as discussed above; see also Ecke, 1996) to being completely unable to communicate in any language other than their first. Moreover, current research on bilingualism is testing whether bilingualism confers some previously unidentified processing advantages.
On the assumptions that bilinguals must manage cross-language interference and that increased practice controlling interference leads to improved functioning in cognitive control, it would not be surprising if bilinguals were better than monolinguals at carrying out tasks requiring control mechanisms. Importantly, this prediction is based on the assumptions that bilinguals need to manage the activation of both languages, that cross-language interference arises at least at some level of processing, and perhaps most important, that the cognitive mechanism of suppression is subject to practice effects. One confirmation of the predicted bilingual advantage in suppression skills was reported by Ransdell, Arecco, and Levy (2001), who showed that, on a test of writing quality and fluency, monolingual performance suffered in the presence of unattended irrelevant speech, but bilingual performance did not.
The generalizability of these findings, however, may be rather limited. For example, bilinguals do not show better ability to prevent Stroop-like interference (e.g., Sebova & Arochova, 1986). In a typical Stroop task, participants see a list of words written in different colors and must inhibit the automatic response of reading the words and instead name the color of the ink in which each word is printed (Stroop, 1935). On the critical conflict trials, the words are color names that are different from the color of the ink. For example, if the word "red" is written in blue ink, the participant must say "blue." In the section "Working Memory and Suppression," we discuss a finding demonstrating that individuals with higher working memory span were better able to prevent Stroop interference than individuals with lower working memory span, but only when the number of conflict trials was high (Long & Prat, 2002). It may be the case that if a bilingual Stroop advantage exists, it also can only be observed under particular circumstances.
For example, in a different, but Stroop-like task, Bialystok (2002) reported that older bilinguals were better than older monolinguals in saying "up" when presented with an arrow that pointed down (and vice versa). A number of studies have shown that the factors related to bilingualism do modulate the Stroop effect (for a review, see Smith, 1997). Specifically, Stroop interference occurs across languages (e.g., when the words are written in one language and color naming is done in the other language) but only with a certain level of proficiency. Further, at very high levels of proficiency, bilinguals become more adept at suppressing the cross-language Stroop effect. In future research, these factors may help to reveal why a bilingual Stroop advantage has not yet been consistently observed (see also Bialystok, chapter 20, this volume).
Future studies comparing bilinguals to monolinguals will no doubt continue to inform research on the cognitive mechanisms that lead to proficient bilingualism. An important consideration in these studies is that selection criteria for both monolinguals and bilinguals will undoubtedly affect the outcome. For example, in the United States monolinguals are quite common. In contrast, in most other parts of the world monolinguals are unusual. In such places, people who remain monolingual may be individuals with cognitive weaknesses for the types of skills important to becoming a proficient bilingual. Similarly, the choice of task will also be important because most tasks tap into multiple cognitive mechanisms, and bilinguals may differ from monolinguals in more than one way. Therefore, interpreting experiments comparing groups will first require breaking down each task into individual components and then generating different predictions regarding how each
component should (or should not) be affected by bilingualism.
Individual Differences in Language Processing
The studies reviewed above clearly suggest that bilingualism affects cognitive processing in a number of significant and yet interestingly limited ways. The underlying mechanisms by which bilingualism affects the cognitive system are presently under investigation. We turn now to a discussion of some of the empirical evidence examining individual differences in L1 and L2 processing. These studies provide additional evidence regarding the aspects of bilingual language processing that are most cognitively demanding. The bulk of recent studies on individual differences in language processing have focused on working memory. Two considerations make these studies relevant for discussions of bilingualism. First, several studies suggest that suppression is a central component of working memory (a position discussed in the "Working Memory and Suppression" section); second, at least some studies show a very direct relationship between working memory measures and bilingual tasks (especially translation).
Working Memory and First Language Processing
Working memory is generally thought of as a system that temporarily stores and processes information during the performance of cognitive tasks. Tests of simple STM capacity typically require only storage and later retrieval of information (e.g., a list of digits or words; for a discussion of several different ways the term STM is used, see Delis, Kaplan, Kramer, & Ober, 2000). In contrast, in the context of experimental (rather than clinical) research, tests of working memory capacity usually include additional processing demands (e.g., remembering a sequence of digits and then producing them in reverse order). In more recent measures of working memory capacity, participants are asked to store and later retrieve lists of single words while simultaneously carrying out an additional cognitive task such as reading sentences or solving mathematical operations. These measures are often called reading span and operation span, respectively. In a typical operation span task, a participant is presented with math problems in sets ranging in size from two to six. After each mathematical operation, a word is presented for memorization. The participant must solve each equation as it is presented and at the end of the set must recall all of the words from that set. (In a reading span task, the participant would read sentences instead of solving math problems.) The participant's working memory span is then determined by either counting the total number of words correctly recalled or assessing the largest set size at which performance was consistently successful. In many studies measuring working memory span, participants are labeled as higher or lower working memory span on the basis of a median split or quartile analysis of span scores. Importantly, the designation of an individual as having higher or lower working memory span is relative only to the participants in that particular study and is not an absolute judgment of that individual's working memory capacity. For the purposes of this chapter, the term span on its own should be read as working memory span.
Studies using both reading span and operation span show them to be equally correlated with measures of verbal ability, suggesting that individual differences in working memory capacity are independent of task-specific (e.g., language vs. math) processing skill (Engle, Cantor, & Carullo, 1992). The inclusion of some sort of processing component, however, is critical for the predictive power of working memory span tasks. Considerable support exists (summarized in a meta-analysis by Daneman & Merikle, 1996) for the claim that language processing is more highly correlated with measures of dual-task working memory span than with simple storage measures such as word or digit span. This claim does not, however, rule out the possibility that simple STM plays some role in language processing (as noted in the section comparing bilinguals to monolinguals, in which we briefly presented evidence suggesting that phonological STM may be important for vocabulary learning in both L1 and L2).
A now-large body of research supports the idea that working memory is necessary to form and remember an integrated representation of a text. For example, studies have shown that working memory capacity is strongly related to the ability to comprehend written and spoken stories, interpret the intended meaning of an ambiguous word, use context to infer the meaning of an unfamiliar word, and produce category exemplars, synonyms, and sentences from given words (Daneman & Carpenter, 1980, 1983; Daneman & Green, 1986; Gernsbacher & Faust, 1991; Just & Carpenter, 1992; King & Just, 1991; Rosen & Engle, 1997).
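The two scoring rules for span tasks described earlier in this section (total words recalled, and largest set size with consistently successful recall), together with the median split used to label participants, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (one pair of presented and recalled word lists per set), the two-thirds criterion for "consistently successful," and the handling of scores that fall exactly on the median are all hypothetical choices and are not taken from any of the studies cited here.

```python
from statistics import median


def total_words_recalled(trials):
    """Span score as the total number of presented words correctly recalled.

    `trials` is an assumed layout: a list of (presented, recalled) pairs,
    one pair of word lists per set.
    """
    return sum(len(set(presented) & set(recalled)) for presented, recalled in trials)


def largest_consistent_set_size(trials, criterion=2 / 3):
    """Span score as the largest set size recalled perfectly often enough.

    A set counts as successful when every presented word was recalled.
    The `criterion` proportion is a hypothetical threshold for illustration.
    """
    outcomes = {}
    for presented, recalled in trials:
        success = set(presented) <= set(recalled)
        outcomes.setdefault(len(presented), []).append(success)
    passed = [size for size, results in outcomes.items()
              if sum(results) / len(results) >= criterion]
    return max(passed, default=0)


def median_split(scores):
    """Label each participant relative to the sample median of span scores.

    Scores equal to the median are labeled "lower" here; that tie rule is
    an arbitrary assumption.
    """
    cutoff = median(scores.values())
    return {pid: ("higher" if score > cutoff else "lower")
            for pid, score in scores.items()}


example_trials = [
    (["cat", "dog"], ["dog", "cat"]),
    (["sun", "map"], ["sun"]),
    (["pen", "cup", "hat"], ["pen", "cup", "hat"]),
]
print(total_words_recalled(example_trials))
print(largest_consistent_set_size(example_trials))
print(median_split({"p1": 10, "p2": 20, "p3": 30, "p4": 40}))
```

Because the cutoff is recomputed from whatever sample is passed in, the same raw score can be labeled "higher" in one sample and "lower" in another, which is the sense in which the designation is relative to the participants in a particular study.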
In a landmark study by Daneman and Carpenter (1980), participants read a set of passages and answered both comprehension and pronominal reference questions; reading span was highly correlated with performance on both types of question. In a later study, Daneman and Carpenter (1983) assessed the abilities of higher and lower span individuals to detect and recover from semantic ambiguities in passages containing ambiguous words such as "bat." In each passage, the ambiguity was resolved in a manner that was either consistent or inconsistent with the prior context. All participants spent more time reading the disambiguating word in the inconsistent cases than in the consistent cases, but higher span participants were more likely than lower span participants to resolve the ambiguity correctly. The authors suggested that working memory aids in ambiguity resolution because it allows readers to return to previous information and integrate it with the current representation of the text.
Just and Carpenter (1992) further addressed the issue of ambiguity resolution by recording reading times for garden path sentences. Prior to the ambiguity, disambiguating information was presented in the form of an animate or inanimate noun. For example, the sentence "The evidence examined by the lawyer shocked the jury" should be easier to resolve than the sentence "The defendant examined by the lawyer shocked the jury." Participants with lower working memory span were led down the garden path despite prior disambiguating information, but higher span individuals were successfully able to integrate the pragmatic and syntactic information to avoid being misled.
Working memory is important not only for higher level tasks such as text comprehension, but also for lower level verbal tasks. For example, Daneman and Green (1986) proposed that working memory plays a role in vocabulary acquisition and verbal fluency (two skills that seem obviously important for gaining proficiency in an L2 and, as discussed above, seem to be influenced by bilingualism). In one experiment, participants read a passage containing a novel word and attempted to infer its meaning. The ability to acquire new vocabulary was highly correlated with working memory span. The authors argued that working memory facilitates recall of previous information, thereby better allowing higher span individuals to infer the meanings of unfamiliar words from context. In a second experiment, Daneman and Green showed that working memory span was also correlated with the ability to produce sentences and synonyms from given words.
As mentioned, Rosen and Engle (1997) examined the relationship between working memory span and retrieval of category exemplars. Participants were asked to generate aloud as many animal names as they could in 15 min (the same task used by Gollan et al., 2002, except that in that study participants were given only 1 min to generate exemplars). Individuals with higher working memory span generated more animal names than individuals with lower working memory span. A second experiment used the same paradigm, but included a concurrent digit-tracking task. Under this condition of increased load, individuals with lower working memory span were more likely to perseverate (i.e., to repeat items they had already named) than were individuals with higher working memory span.
Rosen and Engle (1997) suggested that, under increased load, the lower span participants were unable to suppress previously retrieved responses effectively, which contributed to their lower overall retrieval rate (but see Mayr & Kliegl, 2000, who argued that recognition, not suppression, prevents perseverations in semantic fluency). The authors argued that the higher and lower span participants differed not in their category knowledge, but in their ability to control the search process effectively. Given that the ability to retrieve lexical items from memory is an important and potentially challenging skill for bilinguals, Rosen and Engle's hypothesis about the role of suppression in working memory and in word retrieval highlights an important way in which working memory capacity and L2 processing ability may be related. We return to this discussion in a later section.
Working Memory and Second Language Processing
Research on working memory in L1 has shown that as the language task is made more difficult, the effect of span increases (Miyake, Carpenter, & Just, 1994). Given that acquiring an L2 as an adult is usually a more effortful and deliberate process than acquiring a first spoken language as a child (DeKeyser & Larson-Hall, chapter 5, this volume), it seems reasonable to predict that if working memory capacity influences L1 processing, it will have an even stronger impact on L2 processing. Furthermore, the studies discussed in this section support the idea that working memory may be involved in different or additional aspects of L2 processing as compared to L1 processing because L2 acquisition and processing require the integration of
a new set of phonemes, labels, and rules into an already-existing linguistic and conceptual system. Finally, the studies in this section indicate that the role of working memory in L2 processing may vary as a function of the relative dominance of L1 over L2 because the particular processes that occur during L2 learning change as an individual gains proficiency (see discussion of RHM).
Miyake (1998) suggested that the vague (but often-discussed) concept of language aptitude has working memory capacity as its core. He argued that language comprehension and production consist of processing sequences of symbols over time and thus share many component processes with working memory span tasks. According to Miyake, these types of processes are especially salient in the domain of syntax. Learners initially attempt to use their L1 grammatical structures to interpret and produce L2 sentences (MacWhinney, 1997) and must gradually learn new syntactic distinctions and adjust the relative strengths of various syntactic cues to levels that are more appropriate for the particular language being learned. Overcoming reliance on L1 and adjusting these cue strengths likely make substantial demands on cognitive resources, and in support of this argument, Miyake demonstrated a correlation between working memory span and the acquisition of linguistic cues in Japanese speakers learning English as an L2.
Harrington and Sawyer (1992) also studied native Japanese speakers learning English, focusing on L2 grammar and reading comprehension as measured by the grammar and reading subsections of the Test of English as a Foreign Language (TOEFL). Results showed a significant correlation between working memory span and performance on both subsections of the TOEFL. Participants completed the working memory span task in their L2, thus a possible confound was that the observed correlation was mediated by L2 lexical development and grammatical knowledge. When these factors were statistically partialed out, though, a small effect of working memory span remained, suggesting that working memory does indeed play a role in the acquisition of L2 grammar and reading. It is important to note that participants also completed simple word list memory and forward digit span tasks, and performance on these measures did not correlate with L2 grammar and reading.
The studies described thus far included bilinguals and L2 learners who acquired their L2 in classrooms, immersion environments, or other naturally occurring settings. It is also possible to examine the relationship between working memory capacity and rudimentary L2 learning that takes place in the laboratory. In the work of Kempe, Brooks, and Kharkhurin (1999), adult native English speakers who had never studied Russian were trained on pairs of Russian color adjectives and nouns. After several training sessions, the learners were given each noun and asked to produce the appropriate color adjective. The nouns varied in their gender and transparency, leading to some adjective endings that could be learned by rule and others that required rote memorization. Higher span learners were better than lower span learners at generalizing rules to new items, but differences in working memory span did not predict performance on rote memorization of the endings. In a similar study involving learning of the genitive and dative endings for Russian nouns, working memory span again predicted rule generalization performance, but in this experiment span also predicted the ability to recall the nouns themselves (Kempe & Brooks, 2000). Although the results of these two studies are somewhat contradictory, the line of research suggests that working memory is important in the acquisition of inflectional morphology. In addition, the recall results from Kempe and Brooks suggest that working memory may play a role in vocabulary acquisition, an aspect of L2 processing that is central to the next three studies we discuss.
In a study of proficient English-Spanish and Spanish-English bilinguals, Michael, Tokowicz, and Kroll (2003) compared the magnitude of span differences in translation and L2 picture naming, two single-word production tasks expected to differ in their demands on suppression mechanisms. Both tasks may require suppression of the nontarget language to allow production in the target language, but in translation, as mentioned in our discussion of the IC Model, the nontarget language may have an especially high level of activation because the task explicitly activates both languages, and the language of presentation itself must be suppressed. Michael et al. therefore predicted larger span differences in translation than in L2 picture naming. As predicted, higher span participants performed both directions of translation more quickly and accurately than their lower span counterparts. In contrast, span was not related to L2 picture-naming performance. This pattern of results suggests that high working memory capacity is particularly beneficial for bilingual tasks that necessitate the activation of both languages.
Michael, Dijkstra, and Kroll (2002) and Kroll, Michael, Tokowicz, and Dufour (2002) evaluated
individual differences in L2 lexical processing at two different stages of proficiency. In Michael et al.'s (2002) study, highly proficient Dutch-English bilinguals completed a reading span task, word naming (reading aloud) in L1 and L2, and translation in both the forward and backward directions. Because L1 and L2 word naming are fairly automatic tasks for proficient bilinguals, it was predicted that there would be no significant difference between higher and lower span bilinguals on these tasks. Indeed, the results indicated that span had absolutely no effect on word-naming latencies. In contrast to the predictions for word naming, both the IC Model and the RHM predict that translation, particularly forward translation (in which L1 must be suppressed), should place especially large demands on suppression. These predictions were partially confirmed. Higher span participants were significantly faster than lower span participants in both forward and backward translation.
These results (Michael et al., 2002) suggest that working memory is critical for controlling the activation of two languages, particularly in a task (like translation) that forces both languages to be active. Interestingly, as predicted, the higher span participants were significantly more accurate than the lower span participants for forward translation, but for backward translation there was no accuracy difference between the two groups. The reason for the lack of interaction between span group and direction of translation in the reaction time (RT) data remains unclear, but as we discuss next, this pattern appears to be somewhat consistent across studies.
Experiment 2 of Kroll et al.'s (2002) study examined native English speakers who had been studying either French or Spanish for only a short period of time. Participants completed the same set of tasks as in Michael et al.'s (2002) study and similarly were classified as having either higher or lower working memory span. As in the Michael et al. study, there was no relationship between working memory span and word-naming performance. The pattern of results for the translation data, however, demonstrated a complex interaction between span and cognate status, as shown in Fig. 19.2. The results for noncognates were very similar to those observed by Michael et al. Higher span participants were significantly faster than lower span participants to translate in both directions. For cognates, however, the higher span L2 learners in Kroll et al.'s study actually translated more slowly than their lower span counterparts. In contrast, among the highly proficient bilinguals studied by Michael et al., the high-span advantage in speed of translation persisted across both noncognates and cognates.
Figure 19.2 Mean response latency for translation (correct responses only) as a function of cognate status and the direction of translation for early second language learners (adapted from Kroll et al., 2002). For cognates, lower span learners were faster than higher span learners in both directions of translation. For noncognates, the higher span learners were faster. L1, first language; L2, second language.
According to Kroll et al. (2002), this remarkable and counterintuitive finding that higher span learners translated cognates more slowly than
lower span learners may indicate that the higher span learners were attempting to rely less on word form cues and more on conceptual mediation, thereby reducing the cognate facilitation effect and instead producing an added burden of suppressing the tendency to link L2 word forms directly to L1 word forms. Given that the RHM associates concept mediation with increased proficiency, this finding can be taken to indicate that the higher span participants were actually at a more advanced stage of L2 learning. Although the translation performance of the higher span participants appeared to suffer an initial cost in processing time as a result of early attempts at concept mediation, it seems likely that the concept mediation strategy would be beneficial for the longer-term goals associated with attaining L2 proficiency.
Taken together, these studies suggest that there are complex interactions among proficiency level, form similarity across languages (i.e., cognate status), and working memory capacity. At this point, it is difficult to ascertain precisely how these factors interact, but these relationships will no doubt be a topic of investigation in future studies.
Another important topic for future research is the role of immersion experience in L2 learning because the L2 learning environment may interact with working memory capacity in predicting L2 performance. Specifically, preliminary evidence showed that on a comprehension task, L2 learners with some immersion experience showed much smaller effects of span than individuals who were exposed to their L2 only in a classroom (Sunderman, Persaud, & Kroll, 2004). In other words, lower span participants who had immersion experience performed just as well as their higher span immersion peers, whereas lower span participants without immersion experience were generally slower and less accurate at comprehension than their higher span counterparts. These results suggest that immersion experience may provide learners with external resources that allow them to compensate for having relatively low working memory capacity or a poor ability to suppress irrelevant information.
Working Memory and Suppression
The studies reviewed in the section "Working Memory and Second Language Processing" present a promising new approach to understanding individual differences in achieving and maintaining bilingualism and establish a growing body of support for the proposed link between working memory capacity and bilingual performance. In this section, we consider whether suppression may mediate the relationship between working memory and bilingualism. We propose that performance on working memory span tasks is correlated with performance on bilingual tasks because both types of tasks may require suppression. Thus far, we have argued that the relationship between bilingualism and suppression is complex; from the standpoint of the IC Model and the RHM, we discussed a number of possible roles for suppression in bilingual processing, and we presented support (albeit limited) for the claim that bilingualism improves suppression skills. We now turn our attention to the final piece of the puzzle, the link between working memory span and suppression ability.
Conway and Engle (1994) were among the first to propose that suppression is an important component of working memory, and that individuals with higher working memory capacity may thus be better at suppressing irrelevant information than individuals with lower working memory capacity. In the last several years, a large amount of empirical evidence from a wide variety of tasks has provided support for the idea that performance on working memory measures involves the ability to maintain activation of multiple representations and inhibit irrelevant information simultaneously. Studies of the relationship between working memory and suppression have generally taken one of two approaches. The first approach is to ask whether the ability to suppress is disrupted by an experimentally imposed working memory load, which would indicate shared resources for the suppression and working memory systems. The second approach is to compare suppression abilities in individuals with higher and lower working memory span.
Engle, Conway, Tuholski, and Shisler (1995) examined the effect of a memory load on negative priming. In a standard priming paradigm, processing of a given stimulus on one trial leads to faster processing of an identical or related stimulus on a subsequent trial. Negative priming refers to the phenomenon that ignoring a stimulus on one trial leads to slower processing of an identical or related stimulus on a subsequent trial. Engle and colleagues reasoned that if suppression is a controlled, effortful process requiring resources similar to those involved in working memory, then tying up those resources with a memory load should reduce participants' ability to suppress and thus also reduce negative priming. Indeed, Engle et al. observed that as the memory load increased, the amount of negative priming decreased, and in fact with the largest load, negative priming gave way to
facilitation. In a follow-up study, Conway, Tuholski, Shisler, and Engle (1999) provided further support for the idea that negative priming is dependent on working memory, demonstrating that individuals with lower working memory span showed less negative priming than individuals with higher working memory span.

In a similar approach, Conway, Cowan, and Bunting (2001) used the cocktail party phenomenon to examine the relationship between working memory capacity and inhibition. The cocktail party phenomenon refers to the finding that people's attention is often captured by a stimulus with high personal relevance, such as their own name, even when they are actively attending to other input. Using a selective listening procedure in which participants were required to shadow a message presented to their right ear and ignore the words presented to their left ear, Conway et al. found that lower span participants were significantly more likely than higher span participants to report hearing their name in the unattended channel, suggesting that the higher span participants had more resources available to inhibit the distracting information successfully.

As mentioned, the Stroop task is a commonly studied suppression paradigm that has revealed both individual and group differences in cognitive processing under certain circumstances. As we indicated, Long and Prat (2002) found span differences in the magnitude of the Stroop effect, but only in some conditions. When only a small proportion of the trials were conflict trials, all participants experienced a substantial amount of Stroop interference. When conflict trials were relatively frequent, however, higher span participants were able to reduce the amount of Stroop interference greatly, whereas lower span participants were not. This pattern of results is important because it shows that even in a situation that generally leads to interference, individuals with higher working memory span seem to have a greater ability to control that interference, at least when it is particularly useful to do so.

The suppression tasks discussed thus far all involved language processing to some extent. Kane, Bleckley, Conway, and Engle (2001) studied the relationship between working memory and suppression using two visual-orienting tasks that did not require a verbal response. In the prosaccade task, participants had to identify a target letter that appeared in a cued location. In the antisaccade task, the target letter appeared opposite the cued location, requiring the participant to inhibit the reflexive response to look toward the cue. Kane and colleagues found that higher and lower span participants performed similarly on the prosaccade task, but the antisaccade task elicited significant differences between the higher and lower span participants: Higher span participants were both faster and more accurate at identifying target letters than were lower span participants. In addition, Mitchell, Macrae, and Gilchrist (2002) pointed out that there are two components to the antisaccade task: suppressing the reflexive response to look toward the cued location and generating a voluntary eye movement in the direction opposite the cue. They localized the effect of the dual task to the suppression component by demonstrating that a cognitive load interfered equally with a no saccade condition, which required only suppression.

The studies described in this section suggest that working memory capacity and suppression ability are directly related. Both the IC Model and the RHM provide mechanisms for linking suppression to bilingual processing. Although no one to our knowledge has directly tested these claims, the monolingual work described may help explain some of the observed patterns of relationship between working memory span and L2 processing. If working memory span is in part a measure of suppression ability, then the span differences observed in L2 processing may reflect differences in suppression ability that have critical consequences for bilingual language processing.

Summary

We have reviewed experimental evidence and theoretical reasons for considering working memory and suppression as important factors in bilingual language processing. Although the evidence is preliminary and mixed at this point, the reasons for considering these mechanisms are compelling and promise to inform both models of bilingualism as well as models of cognitive processing more generally. As models of suppression, working memory, and bilingualism continue to become more detailed, it will be increasingly possible to predict both how bilingualism may affect these cognitive skills and how these cognitive skills may affect bilingualism.

Acknowledgments

The writing of this chapter was supported in part by a National Research Service Award (NIMH HD41307-02) to Erica B. Michael and by a Career
Development Award (NIDCD DC00191) to Tamar H. Gollan.

References

Atkins, P. W. B., & Baddeley, A. D. (1998). Working memory and distributed vocabulary learning. Applied Psycholinguistics, 19, 537–552.
Baddeley, A., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105, 158–173.
Bialystok, E. (2002, April). Bilingualism: Defense against the decline of executive functions? Poster presented at the Cognitive Aging Conference, Atlanta, GA.
Borkowski, J. G., Benton, A. L., & Spreen, O. (1967). Word fluency and brain damage. Neuropsychologia, 5, 135–140.
Brown, A. (1991). A review of the tip-of-the-tongue experience. Psychological Bulletin, 109, 204–223.
Brown, R., & McNeill, D. (1966). The tip of the tongue phenomenon. The Journal of Verbal Learning and Verbal Behavior, 5, 325–337.
Burke, D. M., Locantore, J. K., & Austin, A. A. (2004). Cherry pit primes Brad Pitt: Homophone priming effects on young and older adults' production of proper names. Psychological Science, 15, 164–170.
Burke, D. M., MacKay, D. G., Worthley, J. S., & Wade, E. (1991). On the tip of the tongue: What causes word finding failures in young and older adults? Journal of Memory and Language, 30, 542–579.
Campbell, R., & Sais, E. (1995). Accelerated metalinguistic (phonological) awareness in bilingual children. British Journal of Developmental Psychology, 13, 61–68.
Cheung, H. (1996). Nonword span as a unique predictor of second-language vocabulary learning. Developmental Psychology, 32, 867–873.
Cohen, G., & Burke, D. M. (1993). Memory for proper names: A review. Memory, 1, 249–263.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: Language-specific or language-independent? Journal of Memory and Language, 45, 721–736.
Conway, A. R. A., Cowan, N., & Bunting, M. F. (2001). The cocktail party phenomenon revisited: The importance of working memory capacity. Psychonomic Bulletin & Review, 8, 331–335.
Conway, A. R. A., & Engle, R. W. (1994). Working memory and retrieval: A resource-dependent inhibition model. Journal of Experimental Psychology: General, 123, 354–373.
Conway, A. R. A., Tuholski, S. W., Shisler, R. J., & Engle, R. W. (1999). The effect of memory load on negative priming: An individual differences investigation. Memory & Cognition, 27, 1042–1050.
Costa, A., & Caramazza, A. (1999). Is lexical selection in bilingual speech production language-specific? Further evidence from Spanish-English and English-Spanish bilinguals. Bilingualism: Language and Cognition, 2, 231–244.
Costa, A., Caramazza, A., & Sebastián-Gallés, N. (2000). The cognate facilitation effect: Implications for models of lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1283–1296.
Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the bilingual's two lexicons compete for selection? Journal of Memory and Language, 41, 365–397.
Cutting, J. C., & Ferreira, V. S. (1999). Semantic and phonological information flow in the production lexicon. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 318–344.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.
Daneman, M., & Carpenter, P. A. (1983). Individual differences in integrating information between and within sentences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 561–584.
Daneman, M., & Green, I. (1986). Individual differences in comprehending and producing words in context. Journal of Memory and Language, 25, 1–18.
Daneman, M., & Merikle, P. M. (1996). Working memory and language comprehension: A meta-analysis. Psychonomic Bulletin & Review, 3, 422–433.
De Groot, A. M. B., & Poot, R. (1997). Word translation at three levels of proficiency in a second language: The ubiquitous involvement of conceptual memory. Language Learning, 47, 215–264.
Delis, D., Kaplan, E., Kramer, J., & Ober, B. (2000). California Verbal Learning Test-II. San Antonio, TX: Psychological Corporation.
Dell, G. S. (1986). A spreading-activation model of retrieval in sentence production. Psychological Review, 93, 283–321.
Ecke, P. (1996). Cross-language studies of lexical retrieval: Tip-of-the-tongue states in first and foreign languages. Dissertation Abstracts International Section A: Humanities and Social Sciences, 57(4-A).
Engle, R. W., Cantor, J., & Carullo, J. J. (1992). Individual differences in working memory and comprehension: A test of four hypotheses. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 972–992.
Engle, R. W., Conway, A. R. A., Tuholski, S. W., & Shisler, R. J. (1995). A resource account of inhibition. Psychological Science, 6, 122–125.
Gathercole, S. E., & Baddeley, A. D. (1989). Evaluation of the role of phonological STM in the development of vocabulary in children: A longitudinal study. Journal of Memory and Language, 28, 200–213.
Gernsbacher, M. A., & Faust, M. E. (1991). The mechanism of suppression: A component of general comprehension skill. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 245–262.
Gollan, T. H., & Acenas, L. A. (2004). What is a TOT? Cognate and translation effects on tip-of-the-tongue states in Spanish-English and Tagalog-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 246–269.
Gollan, T. H., Bonanni, M. P., & Montoya, R. I. (in press). Proper names get stuck on bilingual and monolingual speakers' tip-of-the-tongue equally often. Neuropsychology.
Gollan, T. H., Montoya, R. I., Fennema-Notestine, C., & Morris, S. K. (in press). Bilingualism affects picture naming but not picture classification. Memory & Cognition.
Gollan, T. H., Montoya, R., & Werner, G. (2002). Semantic and letter fluency in Spanish-English bilinguals. Neuropsychology, 16, 562–576.
Gollan, T. H., & Silverberg, N. B. (2001). Tip-of-the-tongue states in Hebrew-English bilinguals. Bilingualism: Language and Cognition, 4, 63–83.
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Hamers, J. F., & Blanc, M. H. A. (2000). Bilinguality and bilingualism (2nd ed.). Cambridge, U.K.: Cambridge University Press.
Harrington, M., & Sawyer, M. (1992). L2 working memory capacity and L2 reading skill. Studies in Second Language Acquisition, 14, 25–38.
Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1998). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213–229.
Jescheniak, J. D., & Schriefers, K. I. (1998). Discrete serial versus cascading processing in lexical access in speech production: Further evidence from the coactivation of near-synonyms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1256–1274.
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99, 122–149.
Kane, M. J., Bleckley, M. K., Conway, A. R. A., & Engle, R. W. (2001). A controlled-attention view of working-memory capacity. Journal of Experimental Psychology: General, 130, 169–183.
Kempe, V., & Brooks, P. J. (2000, November). Learning complex morphological paradigms. Poster presented at the annual meeting of the Psychonomic Society, New Orleans, LA.
Kempe, V., Brooks, P. J., & Kharkhurin, A. (1999, November). Multiple determinants of individual differences in language learning. Poster presented at the annual meeting of the Psychonomic Society, Los Angeles, CA.
King, J., & Just, M. A. (1991). Individual difference in syntactic processing: The role of working memory. Journal of Memory and Language, 30, 580–602.
Kroll, J. F., Michael, E., Tokowicz, N., & Dufour, R. (2002). The development of lexical fluency in a second language. Second Language Research, 18, 137–171.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
La Heij, W., Hooglander, A., Kerling, R., & Van der Velden, E. (1996). Nonverbal context effects in forward and backward word translation: Evidence for concept mediation. Journal of Memory and Language, 35, 648–665.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–75.
Lezak, M. D. (1995). Neuropsychological assessment (3rd ed.). New York: Oxford University Press.
Long, D. L., & Prat, C. S. (2002). Working memory and Stroop interference: An individual differences investigation. Memory & Cognition, 30, 294–301.
MacWhinney, B. (1997). Second language acquisition and the Competition Model. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 113–142). Mahwah, NJ: Erlbaum.
Mayr, U., & Kliegl, R. (2000). Complex semantic processing in old age: Does it stay or does it go? Psychology and Aging, 15, 29–34.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical
costs of language selection. Journal of Memory and Language, 40, 25–40.
Michael, E. B. (1998). The consequences of individual differences in cognitive abilities for bilingual language processing. Unpublished doctoral dissertation, The Pennsylvania State University, University Park.
Michael, E. B., Dijkstra, T., & Kroll, J. F. (2002, November). Individual differences in the degree of language nonselectivity in fluent bilinguals. Poster presented at the annual meeting of the Psychonomic Society, Kansas City, KS.
Michael, E. B., Tokowicz, N., & Kroll, J. F. (2003, April). Modulating access to L2 words: The role of individual differences and language immersion experience. Paper presented at the Fourth International Symposium on Bilingualism, Tempe, AZ.
Mitchell, J. P., Macrae, C. N., & Gilchrist, I. D. (2002). Working memory and the suppression of reflexive saccades. Journal of Cognitive Neuroscience, 14, 95–103.
Miyake, A. (1998). Individual differences in second language proficiency: The role of working memory. In A. F. Healy & L. E. Bourne, Jr. (Eds.), Foreign language learning: Psycholinguistic studies on training and retention (pp. 339–364). Mahwah, NJ: Erlbaum.
Miyake, A., Carpenter, P. A., & Just, M. A. (1994). A capacity approach to syntactic comprehension disorders: Making normal adults perform like aphasic patients. Cognitive Neuropsychology, 11, 671–717.
Papagno, C., Valentine, T., & Baddeley, A. (1991). Phonological short-term memory and foreign-language vocabulary learning. Journal of Memory and Language, 30, 331–347.
Papagno, C., & Vallar, G. (1995). Verbal short-term memory and vocabulary learning in polyglots. The Quarterly Journal of Experimental Psychology, 48A, 98–107.
Peterson, R. R., & Savoy, P. (1998). Lexical selection and phonological encoding during language production: Evidence for cascaded processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 539–557.
Ransdell, S., Arecco, M. R., & Levy, C. M. (2001). Bilingual long-term working memory: The effects of working memory loads on writing quality and fluency. Applied Psycholinguistics, 22, 113–128.
Ransdell, S. E., & Fischler, I. (1987). Memory in a monolingual mode: When are bilinguals at a disadvantage? Journal of Memory and Language, 26, 392–405.
Ransdell, S. E., & Fischler, I. (1989). Effects of concreteness and task context on recall of prose among bilingual and monolingual speakers. Journal of Memory and Language, 28, 278–291.
Rosen, V. M., & Engle, R. W. (1997). The role of working memory capacity in retrieval. Journal of Experimental Psychology: General, 126, 211–227.
Rosselli, M., Ardila, A., Araujo, K., Weekes, V. A., Caracciolo, V., Padilla, M., et al. (2000). Verbal fluency and repetition skills in healthy older Spanish-English bilinguals. Applied Neuropsychology, 7, 17–24.
Rosselli, M., Ardila, A., Santisi, M. N., Arecco, M. R., Salvatierra, J., Conde, A., et al. (2002). Stroop-effect in Spanish-English bilinguals. Journal of the International Neuropsychological Society, 8, 819–827.
Sebova, E., & Arochova, O. (1986). An attempt at a modification of the Stroop test for preschool age children. Studia Psychologica, 28, 179–182.
Segalowitz, N. (1997). Individual differences in second language acquisition. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 85–112). Mahwah, NJ: Erlbaum.
Service, E. (1992). Phonology, working memory, and foreign-language learning. Quarterly Journal of Experimental Psychology. A, Human Experimental Psychology, 45A, 21–50.
Service, E., & Kohonen, V. (1995). Is the relation between phonological memory and foreign language learning accounted for by vocabulary acquisition? Applied Psycholinguistics, 16, 155–172.
Smith, M. C. (1997). How do bilinguals access lexical information? In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 145–168). Mahwah, NJ: Erlbaum.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662.
Sunderman, G., Persaud, A., & Kroll, J. F. (2004). When language learning is not a matter of talent alone: The effects of cognitive abilities and study abroad experiences on language processing. Manuscript in preparation, The Pennsylvania State University, University Park.
Talamas, A., Kroll, J. F., & Dufour, R. (1999). Form related errors in second language learning: A preliminary stage in the acquisition of L2 vocabulary. Bilingualism: Language and Cognition, 2, 45–58.
Van Hell, J. G., & Dijkstra, T. (2002). Foreign language knowledge can influence native language performance in exclusively native contexts. Psychonomic Bulletin & Review, 9, 780–789.
PART IV
Introduction to Part IV
Aspects and Implications of Bilingualism
different results in late second language (L2) speakers (commensurate with the extent of their lacunae in L2 implicit linguistic competence). Comparisons between neuroimaging studies that are based on the processing of single words and those that use sentence processing, because they are based on altogether different types of language components of a fundamentally different nature, subserved as they are by different memory systems that rely on different anatomical structures located in different parts of the brain, may only serve to show how the two processes differ from each other.

The effects of single-word processing can be expected to be similar in L1 and L2 by virtue of the declarative-memory support of words in both languages. In contrast, the effects of text processing (sentences and short stories) may differ between L1 and L2 in proportion to bilinguals' greater reliance on pragmatics (and to a lesser extent on some aspects of rote-learned metalinguistic knowledge). Note that both native and L2 speakers use the same cerebral mechanisms in processing verbal communication, but in different proportions (determined by the degree of automatization of L2, extent of metalinguistic knowledge, and degree of reliance on pragmatics).

Also, single words (presented in isolation from any sentential context) do not tap pragmatic aspects of language or the procedural memory system that subserves the language system (implicit linguistic competence; i.e., phonology, morphology, syntax, and semantics other than conscious lexical semantics). Experiments using single words therefore cannot, in my opinion, address questions about the representation of the language system, including questions of the participation of the right hemisphere through pragmatic aspects.

Single words, however, can be expected to activate the right hemisphere to the extent that the conceptual features corresponding to their lexical meanings are connected to right hemisphere sensorimotor and affective representations (the latter through the privileged pathways to subcortical limbic structures). This is true of both uni- and bilinguals.

It is also important not to generalize to language representation or processing from results of experiments that use tasks other than normal language-processing tasks, such as switching on demand in response to a color or a sound clue; deciding whether a word is concrete or abstract, whether it rhymes with another word or not, whether it is a word of L1 or of L2, or whether its meaning is or is not related to another word; memorizing unrelated word pairs; producing on demand words that begin with a certain letter or a three-letter string or that belong to a certain semantic category; deriving a verb from a noun; or counting the number of L2 words in a list of L1 words. In each case, these laboratory tasks, by comparison with appropriate controls, may perhaps allow certain components of processing to be identified. How those components combine in real time may not, however, reflect the events that characterize ordinary language processing outside the laboratory.

Declarative/Procedural Memory

The declarative/procedural (implicit/explicit) distinction has pervasive implications in every neuropsychological domain of bilingualism research and cannot be ignored. It has an impact not only on the interpretation of the results obtained with single words versus those obtained with the rest of language structure (a combination of phonology, morphosyntax, and semantics; i.e., tasks involving sentences and short stories as stimuli), but also on the identification of the cerebral mechanisms used by early and late bilinguals, proficient and nonproficient speakers of an L2.

It is therefore not surprising that it should also be relevant to simultaneous translation (see Christoffels & De Groot, chapter 22, this volume). The declarative knowledge of Patient A. D. (Paradis, Goldblum, & Abidi, 1982) was not affected by her aphasia, as demonstrated by her ability to remember correctly (and state in her available language) which objects she was able to name (and which she could not name) the day before, yet without the ability to recover their names. This might be an explanation for her ability to translate into a language that was not available for spontaneous use: She was probably performing a conscious metalinguistic task (not involving her implicit competence used to produce language automatically). Meaning-based interpreting depends on implicit linguistic competence, whereas transcoding depends on metalinguistic knowledge. The patient must have been transcoding. For a discussion of the cognitive processes involved in simultaneous interpretation, see Christoffels and De Groot's chapter 22 in this volume.

Neuroimaging Evidence and Theory

Neuroimaging studies have so far by and large failed to connect to theoretical debates in the behavioral
and clinical literature. The investigation by Price, Green, and von Studnitz (1999) was the first bilingual study actually to test a hypothesis rather than to go on a fishing expedition (i.e., "let's poke here and see what happens," a practice that then leaves researchers in a quandary when it comes to interpreting the data, many of them unexpected). As a result, neuroimaging studies have not provided more answers than previous behavioral and clinical studies. They have at best confirmed some of the clinical findings and have produced a number of unreplicated, contradictory, often-uninterpretable results. For example, it is not known whether the regions showing activation in unexpected areas such as the inferior temporal gyrus and the temporal pole (not to mention right-hemisphere areas) reflect activation or inhibition, and given that there are more inhibitory neurons than activating ones (Chertkow & Murtha, 1997), chances are that some do reflect inhibition. It is also not known whether the observed activation results from some task-related or general problem-solving function.

Paradis (2004) listed over 20 severe problems inherent with neuroimaging techniques that, together with the large number of reported interindividual variables among homogeneous groups of speakers and the number of contradictory findings, should give us serious cause for concern about the reliability and validity of functional neuroimaging techniques applied to cognitive processes in bilingual populations. Current neuroimaging techniques, although extremely useful in detecting infarcts and tumors, are not suited for detecting, let alone measuring, processes that typically last a few milliseconds and occur microns apart within a number of circumscribed areas possibly distributed over large distances cortically and subcortically. The task is made even more difficult considering that some of the processes are not only microanatomical but also biochemical (involving neurotransmitters, hormones, etc.) and may rely on electrical activity (such as firing patterns and Hebb-type cell assemblies). Neuroimaging studies may show that a particular area is active when one or the other of a speaker's languages or both are processed. They do not indicate whether the languages share the same neuronal circuits. At best, they indicate that a gross anatomical area is involved, but say nothing about the microanatomical sites within this region.

Lateralization

It is not so much because the bilingual laterality literature is contradictory that it is considered not worthy of serious attention, but because of the lack of validity of its basic premise, namely, that the percentage of ear, half visual field, or tapping advantage reflects the degree of cerebral lateralization of the language system (Paradis, 2003).

Bilingual laterality studies continue to refer to the age of acquisition variable as pre- and postpuberty (Evans, Workman, Mayer, & Crowley, 2002; Hull & Vaid, chapter 23, this volume). This notion of puberty as a cutoff point of the critical period goes back to Lenneberg's (1967) conjecture that language lateralization is a gradual process that would end at puberty. It was quickly pointed out that the data on which Lenneberg had based his hypothesis did not support it (Krashen, 1973). It has since become apparent that (a) language (qua implicit linguistic competence) lateralization is not gradual, from bilateral to left lateralized, but rather is left lateralized from the start, and (b) that the age for the ability to acquire language like a native speaker is much earlier than the onset of puberty, namely, before 5 years of age (see chapters 5 [DeKeyser & Larson-Hall] and 6 [Birdsong] in this volume for a full discussion of age effects on L2 acquisition).

It should not be surprising that there is right hemisphere involvement in text (stimulus sentence or short story) processing. Semantic and pragmatic features should not be confounded if we are to make sense of event-related potential and neuroimaging experimental findings. Semantics (the meaning of a sentence derived from the lexical meaning of its words and its grammatical structure) needs to be distinguished from pragmatics (the meaning of an utterance taking various contexts into account, including general knowledge). Every utterance (e.g., a sentence presented as part of a short story or uttered in a natural context) necessarily derives part of its meaning from pragmatics and to that extent can be expected to activate areas of the right hemisphere. To the extent that implicit competence grammatical devices are not available (because they are not provided by the grammar of a particular language, because they have not been internalized by an L2 learner, or because they have been impaired by pathology), speakers will compensate by relying on pragmatics to fill the gap.

At this point, the usefulness of a comparison between experimental laterality studies (problematic because they are based on the invalid premise that the degree of ear or visual half-field advantage is an index of degree of language laterality) and neuroimaging studies (plagued with a lack of consensus on the appropriate parameters to interpret
what is observed, such as activation level settings, statistics, etc.) seems extremely limited.

Language and Thought

Contrary to Macnamara's (1970) assumption (see Pavlenko, chapter 21, this volume), bilingual individuals do not need to translate to themselves in L2 what they have heard or said in L1 (or vice versa) to communicate with themselves. Each language is understood directly (Paradis, 2004), just as it is by unilingual native speakers. What this means is that bilinguals are able to organize their mental representations in accordance with the meaning of each language. This ability to adopt two perspectives might account for the results obtained by young bilinguals on intelligence tests. Far from being handicapped, bilingual persons are reported to possess greater cognitive flexibility, which could explain their superior performance not only on metalinguistic tasks and verbal intelligence tests, but also on divergent thinking tasks, on concept formation and general reasoning tests, and in the discovery of underlying rules in the resolution of a problem. The common finding in the disparate cognitive domains investigated by Bialystok (chapter 20, this volume) is that bilingual children are more advanced than unilinguals in solving problems requiring the inhibition of misleading information.

Bilingual persons who speak both languages like natives need not find themselves in any predicament: They may function cognitively differentially without any ill consequences. In fact, their bilingualism is likely to enrich their general mental representations. To the extent that they have native competence in both languages, they are able to organize their representations now in accordance with the patterns of L1, now in accordance with those of L2, thus having more ways of sorting out the same data of experience.

From a neuropsychological viewpoint, bilingual individuals may encode particular concepts in different ways (see Pavlenko, chapter 21, this volume); that is, the conceptual components corresponding to a word and its translation equivalent may sometimes differ considerably, but the cerebral mechanisms that subserve the conceptual representations and the lexical representations are the same, irrespective of the degree of concept–word overlap between languages. (See Francis, chapter 12, this volume, for further discussion of the representation of concepts in the bilingual's two languages.)

The research that is reviewed in the chapters that follow examines distinctions between representation and control (e.g., Bialystok, chapter 20, in the case of bilingual children and Green, chapter 25, in the case of bilingual individuals with aphasia), in the nature of representations and how they are accessed in the two languages, and in the neurocognitive basis of bilingualism. Many of these investigations into the behavioral and neural bases of bilingualism are at an early stage of development. My comments suggest that we be cautious, particularly in generalizing the results of the neuroimaging studies published to date and in interpreting the available evidence on laterality. Despite this caution, there appears to be promise in the convergence of approaches represented by the chapters in this section.

References

Albert, M. L., & Obler, L. K. (1978). The bilingual brain. New York: Academic Press.
Chee, M. W. L., Tan, E. W. L., & Thiel, T. (1999). Mandarin and English single word processing studied with functional magnetic resonance imaging. Journal of Neuroscience, 19, 3050–3056.
Chertkow, H., & Murtha, S. (1997). PET activation and language. Clinical Neuroscience, 4, 78–86.
Evans, J., Workman, L., Mayer, P., & Crowley, P. (2002). Differential bilingual laterality: Mythical monster found in Wales. Brain and Language, 83, 291–299.
Kim, K. H., Relkin, N. R., Lee, K.-M., & Hirsch, J. (1997). Distinct cortical areas associated with native and second languages. Nature, 388, 171–174.
Krashen, S. (1973). Lateralization, language learning and the critical period: Some new evidence. Language Learning, 23, 63–74.
Lenneberg, E. H. (1967). Biological foundations of language. New York: Wiley.
Macnamara, J. (1970). Bilingualism and thought. Monograph Series on Languages and Linguistics, 23, 25–45.
Marian, V., Spivey, M., & Hirsch, J. (2003). Shared and separate systems in bilingual language processing: Converging evidence from eyetracking and brain imaging. Brain and Language, 86, 70–82.
Paradis, M. (2000). The neurolinguistics of bilingualism in the next decades. Brain and Language, 71, 178–180.
Paradis, M. (2003). The bilingual Loch Ness monster raises its nonasymmetric head again – or, why bother with such cumbersome
20
Consequences of Bilingualism
for Cognitive Development
among those alternatives (e.g., as synonyms) does not invoke the activation of entire systems of
meaning as the alternative names from different languages are likely to do. From the beginning,
therefore, bilingualism has consequences. What is not inevitable, however, is that one of these
consequences is to influence the quality or manner of cognitive development.

Early research on the cognitive consequences of bilingualism paid virtually no attention to such
issues as the nature of bilingual populations tested, their facility in the language of testing,
or the interpretation of the tests used. As an apparent default, cognitive ability was taken to
be determined by performance on IQ tests, at best a questionable measure of intelligence (see
Gould, 1981). For example, Saer (1923) used the Stanford-Binet test and compared bilingual Welsh
children with monolingual English children and reported the inferiority and mental confusion of
the bilinguals. Darcy (1963) reviewed many subsequent studies of this type and pointed to their
common finding that bilinguals consistently scored lower on verbal tests and were often
disadvantaged on performance tests as well. Although Darcy cautioned that multiple factors
should be considered, a more salubrious account of this research was offered by Hakuta (1986),
who attributed the inferior results of the bilinguals in comparison to their new native-speaking
peers to conducting the tests in a language they were only beginning to learn.

The antidote to the pessimistic research was almost as extreme in its claims. In a watershed
study, Peal and Lambert (1962) tested a carefully selected group of French-English bilingual
children and hypothesized that the linguistic abilities of the bilinguals would be superior to
those of the monolinguals but that the nonverbal skills would be the same. Even the expectation
of an absence of a bilingual deficit was a radical departure from the existing studies. Not only
was the linguistic advantage confirmed in their results, but also an unexpected advantage in
some of the nonverbal cognitive measures involving symbolic reorganization was found. Their
conclusion was that bilingualism endowed children with enhanced mental flexibility, and this
flexibility was evident across all domains of thought.

Subsequent research has supported this notion. Ricciardelli (1992), for example, found that few
tests in a large battery of cognitive and metalinguistic measures were solved better by
bilinguals, but those that were included tests of creativity and flexible thought. In addition,
balanced bilinguals have been found to perform better on concept formation tasks (Bain, 1974),
divergent thinking and creativity (Torrance, Wu, Gowan, & Aliotti, 1970), and field independence
and Piagetian conservation (Duncan & De Avila, 1979). In a particularly well-designed study,
Ben-Zeev (1977) reported bilingual advantages on both verbal and nonverbal measures in spite of
a significant bilingual disadvantage in vocabulary. Her explanation was that the mutual
interference between languages forces bilinguals to adopt strategies that accelerate cognitive
development. Although she did not develop the idea further, it is broadly consistent with the
explanation proposed elsewhere (Bialystok, 2001) and below.

Researchers such as Hakuta, Ferdman, and Diaz (1987), MacNab (1979), and Reynolds (1991)
challenged the reliability of many of those studies reporting felicitous cognitive consequences
for bilingualism and argued that the data were not yet conclusive. MacNab (1979) was the most
critical, but conceded that bilinguals consistently outperformed monolinguals in generating
original uses for objects, an ability compatible with the claim of Peal and Lambert (1962) for
an increase in flexibility of thought. Reynolds's (1991) reservation depended in part on his
requirement that evidence for bilingual superiority should be presented in the context of an
explanation for why such effects occur.

The purpose of the present review is to describe some selected cognitive processes and evaluate
the evidence for bilingual influences on their development and to interpret those effects within
an explanatory framework. Peal and Lambert's (1962) idea that bilingualism would foster
flexibility of thought has persisted, often accompanied by supporting evidence. Their
explanation was that the experience of having two ways to describe the world gave bilinguals the
basis for understanding that many things could be seen in two ways, leading to a more flexible
approach to perception and interpretation. I return to this idea at the end of the chapter.

The majority of the more recent literature has focused on the consequences of bilingualism for
the development of children's linguistic and metalinguistic concepts. It is entirely plausible
that learning two languages in childhood could alter the course of these developments, but
documenting those abilities has revealed unexpected complexity. Bilingualism is often (but not
consistently) found to promote more rapid development of metalinguistic concepts. In contrast,
oral language proficiency, particularly in terms of early vocabulary development, is usually
delayed for bilingual children.
Reading and the acquisition of literacy is less well studied, but the existing evidence gives
little reason to believe that bilingualism itself has a significant impact on the manner or ease
with which children learn to read. The effects of bilingualism on all these language-related
developments are discussed elsewhere (e.g., Bialystok, 2001, 2002) and are not reviewed here.
This chapter examines only the nonverbal cognitive consequences of becoming bilingual in
childhood.

The possibility that bilingualism can affect nonverbal cognitive development is steeped in an
assumption, namely, that linguistic and nonlinguistic knowledge share resources in a
domain-general representational system and can influence each other. In some theoretical
conceptions of language, language representations and processes are isolated from other
cognitive systems (e.g., Pinker, 1994). Although it may be possible in these views to understand
that bilingualism would influence linguistic and metalinguistic development, it is difficult to
imagine that the effect of constructing two languages would extend beyond that domain.
Therefore, even to pose the possibility that bilingualism influences nonverbal cognitive growth
requires accepting that linguistic and nonlinguistic functioning converge on some essential
cognitive mechanism. Such cognitive models typically incorporate an executive function, one that
includes the limitations of working memory and representational processes and is limited by a
central resource responsible for selective attention, inhibition, and planning (e.g., Norman &
Shallice, 1986). If bilingualism alters something essential about nonverbal cognitive
development, then it might well be through its impact on such a generalized executive function.

In the following discussion, three areas of cognitive development are examined to determine if
they are acquired differently, or on a different timescale, by bilingual children. The three
areas are concepts of quantity and arithmetic ability, hierarchical classification in a task
switch paradigm, and theory of mind. Following this discussion, the common pattern from the
three developmental areas is discussed, and a possible explanation for developmental differences
between monolingual and bilingual children is proposed.

Quantitative Concepts and Abilities

Quantitative concepts and mathematical abilities were investigated early by researchers
interested in determining whether bilingualism had an impact on development. Macnamara (1966,
1967) raised the possibility that bilingualism might interfere with children's competence in
these areas. Based on the research available at the time, he concluded that there was no
evidence that bilingualism handicapped children's computational ability for mechanical
arithmetic, but that it did impair children's ability to solve mathematical word problems. His
own large-scale study of English-speaking children in Irish language schools confirmed this
pattern. He attributed the deficit to what he considered the inevitable language handicap that
followed from bilingualism but did not discount the logical possibility that bilingualism itself
was to blame. A simpler explanation, however, is compelling: Children's competence in Irish was
inadequate to the task. The culprit was not bilingualism but rather the use of a language for a
complex educational purpose that exceeded the children's proficiency in that language. Although
bilingualism frequently compromises children's proficiency in one of the languages, the deficit
is neither inevitable nor pervasive. Therefore, bilingualism itself may not have been a factor
in the performance of the children in that study.

Macnamara concluded that the mechanical abilities to carry out arithmetic operations were
equivalent in monolinguals and bilinguals, but others have presented a different view. Some
researchers have reported weak but consistent evidence that adult bilinguals take longer to
solve mental arithmetic problems than monolinguals, particularly in their weaker language
(Magiste, 1980; Marsh & Maki, 1976; McClain & Huang, 1982). Geary, Cormier, Goggin, Estrada, and
Lunn (1993) speculated that this difference arose because these mechanical problems were solved
verbally by mediating the operations in one of the languages, so they developed a task that
bypassed the possibility of verbal mediation. They presented arithmetic problems with a
solution, and participants only needed to judge whether the solution was correct. If verbal
mediation were required, participants would conduct these computations in their stronger
language, eliminating the burden of the weak language effect. With the language component of the
task removed, they found no overall differences in reaction times to solve these problems. In a
more detailed follow-up study, they divided the reaction time between time spent encoding and
retrieving and time spent computing the operations. Here, they found no group difference in
encoding but a significant monolingual advantage in the computing. Their interpretation was that
both groups had the same automated
access to the stored arithmetic facts, but that monolinguals could perform computations on these
facts more rapidly than bilinguals. They interpreted this as indicating working memory
differences between the groups that favored monolinguals.

Frenck-Mestre and Vaid (1993) reported that bilinguals verified simple arithmetic problems most
quickly and accurately when the problems were presented as digits, slower when presented in word
format in their first language, and slower again in their second language. They pointed to other
studies that indicated that number processing itself is not slower in a second language and so
concluded that the explanation for their data was that it is arithmetic ability that is
compromised for the bilinguals in their second language. This result may reflect the same
difference reported by Geary et al. (1993) regarding the computation aspect of solving these
problems in a weak language. Frenck-Mestre and Vaid concluded that arithmetic is sensitive to
the language in which it is learned, and that the ability to carry out arithmetic operations is
impaired in a second language. However, their bilingual participants were late language learners
who had weaker proficiency in their second than in their first language, so it is still possible
that the effect was signaling a weakness in language competence.

In an interesting study, Spelke and Tsivkin (2001) trained bilinguals to perform new arithmetic
operations in each of their languages and then tested them in both languages. For computations
involving accurate access to large numbers, performance was better in the language in which that
problem was trained, suggesting that the coding of that information was specific to the
language. This effect even generalized to numerical information about time and space, indicating
a general encoding process for quantities in which language is part of the representation. These
results extended the work of Frenck-Mestre and Vaid (1993) regarding the language specificity of
these operations. In addition, there was a main effect of language in which participants always
performed better in their first language, replicating earlier work on this problem.

The differences reported in these studies can also be found in the simplest numerical procedure,
namely, counting. In a small-scale study in our laboratory, we compared the speed with which
bilingual adults could count forward and backward in their two languages. The participants were
highly fluent speakers of English and Portuguese. They were first asked to recall a list of
words in each language to ensure some rough equivalence on a verbal task. Then, they were timed
as they counted in both directions in both languages. The relevant measure was the ratio of the
time required to count backward over the time required to count forward in the same language.
Backward counting would inevitably be slower, and the greater effort required would increase its
difference from forward counting. Furthermore, by computing this time as a ratio of the time
needed to count forward in each language, possible differences in the time required simply to
recite the number sequence in the two languages were eliminated. The results showed no
difference between languages on the verbal task but a significant increase in time required to
count backward in their weaker language.

These studies indicated that bilingual adults generally take longer to solve mathematical
problems than monolingual adults do, particularly when the problems are posed in their weak
language. The studies also confirmed, however, that language itself has a role to play in these
mathematical operations. Therefore, it is still conceivable that bilingual children who are
initially learning these skills may be compromised in their acquisition, and that the deficit
may be greater if instruction takes place in the weaker language. This, in fact, was the point
that Macnamara was arguing in his early studies on this issue. Therefore, research with children
is required to establish whether bilingualism has an impact on the development of mathematical
abilities.

Secada (1991) studied Hispanic children solving word problems in both English and Spanish. There
were two main findings. First, children could solve the problems equally well in both languages.
Second, children who were more balanced in their language abilities for the two languages
demonstrated higher overall achievement in the problem-solving tasks. He concluded that the
problem-solving ability of the bilingual children was equivalent to that of their monolingual
peers. Although his study did not include an explicit comparison with monolingual children
solving the same problems, it showed that lower levels of language proficiency did not interfere
with the ability of these children to solve the problems in their weaker language. Similarly,
Morales, Shute, and Pellegrino (1985) hypothesized that if language proficiency were not an
issue, then bilingual children should perform just as well as monolingual children on
problem-solving tasks. In their study, there were no differences between monolingual and
bilingual groups when math problems were presented to each in the dominant language.

This conclusion is different from the one reached by Mestre (1988). He claimed that bilinguals
with mathematical skills comparable to monolinguals tended to solve math word problems
incorrectly because of language deficiency. His argument was based on studies with bilingual
children who were studying in English but for whom English was their weaker language, a situation
similar to that in which Macnamara (1966) predicted grave results for bilinguals. Mestre
identified the diverse forms of language proficiency that are required to solve these problems,
such as literacy, vocabulary, and syntactic knowledge, and argued that all of them are
compromised for bilingual children. These results are different from those reported by Secada,
but the children in Mestre's study were not as fully bilingual. In Secada's study, the children
were in bilingual education programs with most of their instruction conducted in English, and
English was the dominant language for most of the children at the time of the study. In Mestre's
study, the children lacked some minimal level of competence in the weaker language to proceed
through the process of understanding and solving mathematical word problems.

Comparisons in terms of first and second or stronger and weaker languages help to interpret the
results when comparing monolinguals and bilinguals performing arithmetic tasks, but the language
itself also contributes importantly to the explanation. In a series of studies examining both
children and adults who were Welsh-English bilinguals or English or Welsh monolinguals, Ellis
(1992) showed that the longer word names for numbers in Welsh increased working memory demands
and reduced the availability of working memory for calculation. This effect of increased time
needed to perform in Welsh was independent of the participants' level of bilingualism.

A general result from all these studies is that solving mathematical problems in a weak language
is more difficult for bilinguals than it is either for monolinguals or for bilinguals in their
strong language. The effect is expressed as longer reaction times in adults and increased errors
in children. Some studies have shown that adult bilinguals produce increased reaction time when
solving these problems in both their languages, so there may also be some costs involved in
having two systems to manipulate. But, the main finding is that weakness in language proficiency
can affect the ability to carry out problem solving in other domains and interfere with
children's ability to master these problems. This is entirely reasonable, but it may not speak
to bilingualism so much as to the necessity for having sufficient language skills to carry out
basic cognitive activities in any domain. Studies examining bilingual children in their stronger
language generally show no deficit in acquiring mathematical concepts or solving mathematical
problems. These results show that bilingualism does not alter children's ability to construct
the necessary mental representations for mathematics relative to monolinguals, but that problems
framed in a verbal context that exceeds their linguistic sophistication impose a barrier to
accessing those representations and interfere with performance. In that sense, language
limitations weaken children's ability to learn concepts and to solve mathematical problems
relative to monolinguals.

Prior to the time when arithmetic operations can be carried out, children must establish the
concept of invariant quantity as a system of relational meanings. These concepts include
understanding various aspects of the number system and its operations, including rules for
correspondence and rules for counting. This knowledge develops gradually as children piece
together the system and learn the symbolic and notational indicators of that system. The primary
principle that children must internalize is cardinality, the idea that numbers have quantitative
significance (Fuson, 1988; Gelman & Gallistel, 1978; Wynn, 1992). If this concept is learned
differently by bilinguals and monolinguals, then that could set the stage on which further
disparities in mathematical ability could be built.

This possibility was tested in a study of children's understanding of cardinality using two
problems (Bialystok & Codd, 1997). In the towers task, we showed children piles of Lego blocks
and piles of Duplo blocks. The Duplo blocks are identical to the Lego blocks except they are
twice as large on each dimension. We told children that each block was an apartment that one
family could live in, even though some apartments were big and some were small. We were going to
build apartment buildings out of the blocks, and they had to count the apartments (blocks) and
tell us which building had more families living in it. Children were shown pairs of towers and
were reminded each time to count the blocks. The relevant trials were those that compared a Lego
tower and a Duplo tower, but in which the higher Duplo tower had fewer blocks. Height was a
compelling, although misleading, cue, and children needed to ignore the height and report that
the tower that resulted in a higher number when counting was the tower with more blocks.
Children found this
difficult, but the bilingual children performed significantly better than the monolinguals in
their ability to resist focusing on the height of the tower and attend only to the counting
operation.

The second problem was the sharing task. Children were shown two identical dolls and a set of
candies that they were asked to divide equally between them. When the candies had been divided
and the child agreed that both dolls had the same number of candies, they were asked to count
the candies in the pile of the first doll and then to say, without counting, how many candies
were in the second doll's pile. Like the towers task, the problem required counting a small set
of items and making a statement about quantity based on the counting procedure. The difference
was that the towers task contained misleading information that appeared to give them the answer,
but the sharing task did not. The sharing task was difficult, but both groups performed to the
same level. Although these were the same children solving similar problems, the bilingual
advantage was found only for the towers task.

Both the towers task and the sharing task are based on the cardinal principle that the last
number counted indicates the quantity of the set. The difference between the problems is that
the towers task assesses this principle in the context of misleading information specifically
designed to distract the child by presenting a plausible but incorrect alternative to the
cardinal principle. Bilingual children were better able than monolinguals to focus on the
counting operation and not attend to the irrelevant height.

In both these domains, bilingual children (and adults) were equivalent to monolinguals on direct
assessments of mathematical ability. For problem solving, bilinguals were sometimes hampered by
inadequate linguistic competence and performed less well or less efficiently, especially when
tested in their weaker language. For children learning basic arithmetic concepts, however,
bilinguals performed better than monolinguals when the problem was presented in a misleading
context. In this case, the bilingual children demonstrated superiority in their ability to focus
attention and ignore misleading cues. These attentional abilities translated into superior
performance on a test of basic quantitative concepts.

Task Switching and Concept Formation

A surprising but consistent deficit in young children's performance has been shown on a task
that requires children to follow a simple rule to sort a set of cards and then reverse that rule
to sort the same cards in a different way. In a series of studies, Zelazo and his colleagues
(Frye, Zelazo, & Palfai, 1995; Jacques, Zelazo, Kirkham, & Semcesen, 1999; Zelazo & Frye, 1997;
Zelazo, Frye, & Rapus, 1996) have demonstrated children's failure to reverse a rule that has
been established for a particular set. In the task, children are shown a container consisting of
two sorting compartments, each indicated by a target stimulus, for example, a red square and a
blue circle. They are then given a set of cards containing instances of shape-color combinations
that reverse the pairings, in this case, blue squares and red circles. Children are first told
to sort by one dimension, for example, color, and place all the blue squares in the compartment
indicated by the blue circle and all the red circles in the compartment indicated by the red
square. Children can perform this classification essentially without errors. When they have
completed that phase, they are asked to re-sort the same cards by the opposite dimension, shape.
In this case, the blue squares must be placed in the box indicated by the red square and the red
circles must be placed in the box indicated by the blue circle. The finding is that preschool
children persist in sorting by the first dimension (color), continuing to place the blue squares
with the blue circle, even though they are reminded of the new rule on each trial. Bilingual
children, however, adapt to the new rule and solve this problem earlier than monolinguals
(Bialystok, 1999; Bialystok & Martin, 2004).

There are different possibilities for why children perseverate on the first set of rules. The
explanation proposed by Zelazo and Frye (1997) is called the cognitive complexity and control
theory. They argue that children cannot solve the problem until they acquire sufficiently
complex rule systems and reflective awareness of those rules. According to this interpretation,
the task requires children to construct complex embedded representations of rules in which
instructions concerning specific dimensions are embedded under a more general representation
that classifies the stimuli. The ability to switch the sorting criterion depends on representing
the relation between the dimensions in terms of the higher order rule that unifies the specific
lower order rules. Young children are unable to do this, and because they represent only the
individual rules, they fail the task. By 5 years old, children have the ability to represent a
hierarchical structure and can pass the task, seeing the cards as, for example, simultaneously a
red thing and a round thing.

There is no doubt that the representational demands of this task are difficult. Children must
appreciate the dual nature of the sorting task and recognize that either dimension can be used
as a classification criterion. This explanation places much of the burden on the development of
adequate representations of the problem. However, the task also imposes high demands on
children's ability to control selective attention: Children must inhibit attention to a
perceptual dimension that was previously valid and refocus on a different aspect of the same
stimulus display.

Our explanation for the difficulty presented by the problem and for the reason for the bilingual
advantage comes from the need to selectively attend to and recode specific display features.
Children code the target stimuli according to the first rule system, in this case, the red thing
and the blue thing. When the second rule system is explained, those descriptions become obsolete
and must be revised, recoding the targets as the square thing and the round thing. Having
already represented the targets in one way, however, it is difficult for children now to think
of the items as a square thing and a round thing. This reinterpretation of the targets requires
inhibition of their original values, and that is difficult because the colors remain
perceptually present even though they are now irrelevant.

Two studies provided converging support for this interpretation of the primary source of
difficulty in this task. Typically, the experimenter names each card when passing it to the
child to be sorted, but children persist in sorting it according to the obsolete dimension.
Kirkham, Cruess, and Diamond (2003) revised the procedure by requiring the child to name each
card before placing it into the sorting box. The modification produced significantly better
performance, presumably by redirecting children's attention to the new relevant feature.
Furthermore, instructing children to place the cards in the container face up instead of face
down as in the standard version made the task more difficult as it increased children's
distraction to the obsolete feature. Similarly, Towse, Redbond, Houston-Price, and Cook (2000)
presented a test card to children who had made postswitch errors and asked them to name the
card. More than half of these children described the card by naming the preswitch dimension;
they continued to see the card as a blue thing even though they had just been taught the shape
game. Both these studies indicate that children persist in mentally encoding the cards according
to the description relevant in the preswitch phase. Correct performance in the postswitch phase
requires that they inhibit those descriptions so they can reinterpret the card in terms of the
postswitch feature.

The conclusion from these studies is that the primary difficulty children face in the postswitch
phase of the card sort task is in ignoring the continued presence of the cue that indicated the
rule for the preswitch sorting and reinterpreting that target stimulus in a new way. If the
obsolete feature from that target stimulus is removed, children easily reassign the values and
sort correctly on the postswitch phase (Bialystok & Martin, 2004, Study 2). In the standard
version, however, the problem is difficult because of its demands on control of attention, and
bilingual children consistently solve this problem earlier than comparable monolinguals.

Theory of Mind

The final example of a cognitive achievement that may be differentially developed in monolingual
and bilingual children is one that has been intensively investigated in the past several years.
Researchers have been interested in the emergence of children's understanding of theory of mind,
the knowledge that beliefs, attitudes, and perceptions are constructed by individual minds that
have a particular (literal or metaphorical) point of view (e.g., review in Wellman, 1990). The
breadth and pervasiveness of this understanding across cognitive domains makes its development
central to children's intellectual growth.

Explanations for children's success on theory of mind tasks at the age of about 4 years have
varied. One view, called the theory theory, considers that theory of mind is a holistic
construct that exists independent of other cognitive achievements and emerges with maturation
(Astington, 1993; Perner, 1991). Other explanations take a more processing view by considering
the memory and executive functioning demands built into these tasks and demonstrate a
correlation between success on these executive tasks and theory of mind problems (Carlson &
Moses, 2001; Carlson, Moses, & Hix, 1998; Hala & Russell, 2001; Hughes, 1998). In a reversal of
that position, Perner, Stummer, and Lang (1998) argued that it is competence with theory of mind
that brings children to higher levels of executive functioning, thereby reversing the direction
of putative causality.

In the standard paradigms for assessing theory of mind, children are given information about a
situation or an object, the information is then
modified, and the child is required to predict whether another person, not present when the amendments were described, would know the updated information. In situation-based tasks, a toy is hidden in a location and then moved; in false contents tasks, a container that is assumed to hold one kind of item actually holds another; in appearance-reality tasks, an object that looks to be one thing turns out to be a different kind of thing. The question asked of the child is whether another child who was not shown the truth about the location, contents, or identity would know what the correct values were. Children who fail the theory of mind task respond by saying that the novice child would have full access to the information that the experimental child had and be able to answer the questions properly.

Although the modularized view of these abilities is compelling and consistent with much evidence, the tasks nonetheless incorporate complex processing demands. If bilingual children were precocious in the development of at least one of these component processes, then it is possible that they would solve theory of mind tasks earlier than monolinguals. The kinds of tasks for which bilingual children have shown an advantage are those that include misleading information, a situation characteristic as well of these theory of mind tasks. The tasks are based on conflict between two states (real and altered, appearance and reality), and the child must understand which possible configuration will provide the correct answer. The difficulty is that the original state of the display, namely, the appearance of the object or the initial hiding place, remains visible during the questioning, potentially misleading the child into the original response. Therefore, children must resist basing their answers on these previously correct and now-obsolete cues.

Senman and I (Bialystok & Senman, 2004) examined this possibility in a study in progress using appearance-reality tasks. Four items that appeared to be one thing but were found on inspection to be something else were shown to monolingual and bilingual children who were 4 years old. The four items were an object that looked like a rock but was actually a sponge, a crayon box that had Legos inside instead of crayons, a plastic whale that was really a pen, and a plastic snowman that opened up and was really a book. Following the standard procedure for these tasks, the experimenter showed the child each item and discussed what it looked like. When children agreed on the appearance of each object, the experimenter revealed the actual identity (or contents) of the item.

The testing phase consisted of three questions. The first two questions are called the appearance questions because they are based on the original expectation of the object from first looking at it: "What did you think this was when you first saw it?" and "What will Tigger (a stuffed toy participating in the initial interaction but hidden during the revelation) think it is when we bring him back?" The third question is called the reality question because it assesses the actual identity of the item that is not revealed by its outward appearance: "What is it really?" There was no difference between responses to the two appearance questions, so they were combined into a single score for appearance questions that was compared to performance on the reality question.

Monolingual and bilingual children exhibited different patterns for the two questions. Both groups performed the same on the appearance question, but the bilingual children outperformed the monolinguals on the reality question. The answers to the appearance question are supported by the continued presence of the objects during questioning. The reality question, in contrast, requires children to go beyond the appearance of the display and state the actual identity or function of the object. Because the appearance conflicts with the correct answer to the reality question, the solution requires children to ignore that appearance actively and to state what it is in spite of that misleading perceptual exterior.

On the theory of mind task used in this study, bilingual children outperformed monolinguals on questions that place high demands on the ability to control attention and inhibit misleading perceptual information. Consistently, this is the kind of process that bilingual children master earlier than their monolingual peers.

Bilingualism: What's the Difference?

In the three examples of cognitive performance described, there is no overall advantage that comes to children who are bilingual. They do not display mathematical precocity and are compromised on certain mathematical computations and problems presented in their weaker language, they do not demonstrate superior skill in monitoring and updating classification problems, and they are not consistently more advanced than monolinguals in establishing the basic concepts for theory of mind. However, in all three domains, problems in which
conflicting information, especially perceptual information, interferes with the correct solution and requires attention and effort to evaluate and ultimately ignore one of the options are solved better by bilinguals.

This ability to inhibit attention to misleading information constitutes a significant processing advantage, but other aspects of cognitive development are impaired for bilingual children. One prime area of consistent bilingual disadvantage is in receptive vocabulary. Bilingual children generally score lower than respective monolinguals in each of their languages. This result has been replicated in almost every study that has compared monolingual and bilingual children in the preschool and sometimes early school years (review in Bialystok, 2001). It is this weak competence in the language of schooling that led Macnamara (1966) to caution that bilingual children were disadvantaged both educationally and cognitively, and it was undoubtedly this compromised verbal proficiency that was responsible for his conclusion that bilingualism impaired children's ability to solve mathematical word problems.

However, as subsequent research showed, bilingual and monolingual children who were equated for language ability solved mathematical problems to exactly the same level of competence. In many domains, therefore, bilingual children develop cognitive skills in the same manner and on the same schedule as do monolinguals. Although this may not seem to be newsworthy, early proclamations of the debilitating effect of bilingualism on children's development are safely eradicated by the declaration that bilingualism might instead have no effect at all on children's development.

What is significant about the bilingual advantage in resolving conflicting information is its persistence across verbal and nonverbal domains of problem solving. This selectivity of attention is an aspect of executive functioning that develops gradually through childhood. Tipper and his colleagues (Tipper, Bourque, Anderson, & Brehaut, 1989; Tipper & McLaren, 1990) argued that attention is comprised of independent and independently developing components. Three of these components are inhibition, selection, and habituation. Two of them, selection and habituation, are as well formed in childhood and function for children essentially the same as they do for adults. In contrast, inhibition develops slowly, changing children's performance as it emerges and imposing a measure of selectivity on their behavior. Other researchers also have documented the development of inhibition in young children and connected it to important changes in problem solving (Dagenbach & Carr, 1994; Dempster, 1992; Diamond, 2002; Diamond & Taylor, 1996; Harnishfeger & Bjorklund, 1993). Inhibition is the essential factor in distinguishing the performance of the bilingual children, so it may be that bilingualism exerts its effect primarily on the inhibition component of attention.

Inhibition and control of attention are carried out in the frontal lobes (Stuss, 1992). Patients with damage to the frontal lobes experience difficulty in tasks that require switching attention (e.g., Wisconsin Card Sorting Test) and selecting relevant features in the presence of distracting information (e.g., Tower of London) (Burgess & Shallice, 1996; Kimberg, D'Esposito, & Farah, 1997; Luria, 1966; Perrett, 1974). Even automated tasks, like Stroop tests, are difficult for these patients because they have inadequate control over their attention to the irrelevant features of the Stroop stimuli, normally the color word. This performance profile is the reverse of that obtained with bilingual children: What is difficult for frontal patients develops early for bilingual children.

Cognitive control of attention declines in healthy older adults with normal aging. Hasher and Zacks (1988) elaborated a model of attention that includes both the excitatory mechanisms that are triggered by environmental stimuli and the inhibitory mechanisms that are required to suppress the activation of extraneous information. Without adequate inhibition, working memory becomes cluttered with irrelevant information, which decreases the efficiency of cognitive processing (Hasher, Zacks, & May, 1999). Dempster (1992) proposed a similar description but described the rise and fall of these inhibitory processes over the entire lifespan rather than just their decline with aging. The consequence of aging in these views is that older adults have less control over the contents of working memory than do younger adults, a situation that is functionally similar to the difference between monolingual and bilingual children solving problems based on selective attention.

Duncan (1996) used selective attention and inhibitory control to integrate research from several areas of cognitive processing. He demonstrated that the effects of frontal lobe lesions, differences in intelligence (defined by g, the measure of general intelligence proposed by Spearman, 1927), and divided attention are evidence of the same processes that distinguish between active or passive control of attention. These processes are situated in
the frontal lobes, making the frontal structures the seat of highly generalized forms of intelligence. This analysis supports the association between the processes that are enhanced for bilingual children and the processes that are damaged through frontal lobe injury and decline with normal aging. Moreover, these processes are central to general concepts of intelligence as measured by standardized tests. This line of reasoning that includes intelligence, or g, in the equation potentially carries profound implications for claims about the effect of bilingualism on intelligence, but such conclusions are vastly premature because no relevant or detailed research examining the logical steps in this argument exists. But bilingualism clearly alters specific cognitive processes that are part of the underpinnings to this broader view of intelligence. Why would bilingualism have this effect?

Current research on the organization of two languages in the mind of adult bilinguals shows convincingly that both languages remain active during language processing in either language. This view is in contrast to earlier models that posited a switch that activated only the relevant language (Macnamara & Kushnir, 1971). Evidence for shared processing comes from both psycholinguistic and neuroimaging studies. Psycholinguistic models differ on whether the word level of representation for the two languages is separate (Brauer, 1998; Durgunoglu & Roediger, 1987; Van Hell & de Groot, 1998) or common (Chen & Ng, 1989; Francis, 1999a; Grainger, 1993; Guttentag, Haith, Goodman, & Hauch, 1984; Hermans, Bongaerts, de Bot, & Schreuder, 1998) but agree that these lexical representations are connected through a common conceptual system (review in Smith, 1997). Some of the contradiction between the positions on how words are represented is resolved when proficiency levels are included in the analysis (Francis, 1999b; Kroll & De Groot, 1997; Kroll & Stewart, 1994). Higher levels of proficiency in the second language produce lexical-semantic (conceptual) configurations that more closely resemble those constructed in the first language, whereas second languages with low proficiency levels require mediation of the first language. In fact, below some threshold of proficiency, it becomes debatable whether the individuals are bilinguals or second language learners. The research described above examining cognitive consequences of bilingualism considers only bilinguals who are reasonably proficient in both languages, thereby assuming some approach to balanced proficiency. This is the situation, then, for bilinguals in studies that have demonstrated shared representations that are mutually active during language processing in either language.

Neuroimaging studies of language processing in bilinguals provide a unique perspective on this issue by attempting to identify the regions of cortical activation. Studies by Chee, Tan, and Thiel (1999) and Illes et al. (1999) using functional magnetic resonance imaging (fMRI), Klein and colleagues (Klein, Milner, Zatorre, Zhao, & Nikelski, 1999; Klein, Zatorre, Milner, Meyer, & Evans, 1995) using positron emission tomography, and Pouratian et al. (2000) using intraoperative optical imaging of intrinsic signals found no disparity in the activated regions when performing tasks in either the first or second language (although Pouratian et al. did also find some areas unique to each language in a naming task). Conversely, studies by Kim, Relkin, Lee, and Hirsch (1997) and Dehaene et al. (1997) using fMRI found some evidence of separate activation when using each of the languages, at least for some bilinguals on some kinds of tasks. Again, part of the conflict can be attributed to the level of proficiency in the second language (e.g., Perani et al., 1998). As in the behavioral studies, high proficiency in both languages was associated with more complete overlap in the processing regions.

If two languages are mutually active (psycholinguistic evidence) and share common representational regions (neuroimaging evidence), then a mechanism is required to keep them functionally distinct. Without procedures for separating the languages, any use of one language would evoke unwanted intrusions from the other. Green (1998) addressed this question with a model based on inhibitory control, an executive system for activating or inhibiting linguistic representations (lemmas). The model has three components: a hierarchy of language task schemas, lexical representations, and a selection mechanism based on inhibition. A regulatory system, modeled after Shallice's (1988) supervisory attentional system, controls levels of activation by regulating the language task schemas. This makes the model responsive to the demands of each individual situation. The task schemas determine output by controlling the activation levels of the competing responses from the two languages and inhibiting the lemmas that belong to the language incorrect for that situation. The basic notion is that each of a bilingual's two languages can be described on a continuum of activation in a specific context (cf., Grosjean, 1997; Paradis, 1997) and not through a binary switch as earlier models had posited.
The central mechanism of this model is inhibition of competing lexical representations. Green (1998) cited evidence showing that positron emission tomographic studies of translation indicated increased activity in the anterior cingulate, an area activated during Stroop tasks and associated with the inhibition of prepotent responses (Posner & DiGirolamo, 2000), whereas comparable scans of performance while reading (but not translating) did not invoke activity in this area. This pattern was confirmed in a study by Price, Green, and von Studnitz (1999), who showed separate brain activation patterns for switching and translating, with translating again activating the anterior cingulate.

Green's explanation depends on accepting inhibition as the primary mechanism for negotiating the language used in specific contexts, and independent evidence has supported the plausibility of this interpretation. Juncos-Rabadan (1994) and Juncos-Rabadan and Iglesias (1994) showed that language deterioration in the elderly is attributable to declines in attentional abilities, and that bilinguals suffer loss in attentional processing on both their languages. They attributed the changes to problems with inhibition mechanisms and demonstrated these processing changes occurred equally with both languages in bilinguals.

Studies by Hernandez and colleagues, also examining aging and bilingualism, provided further evidence for the role of inhibitory control mechanisms in language processing for bilinguals (Hernandez, Dapretto, Mazziotta, & Bookheimer, 2001; Hernandez & Kohnert, 1999; Hernandez, Martinez, & Kohnert, 2000). They presented older and younger Spanish-English bilinguals with a switching task in which the participant was required to name simple line drawings in one or the other language. A cue preceding each trial indicated the language in which the response was required. The interesting results came from mixed block presentations in which the two languages were combined into a single block, requiring rapid monitoring and switching between languages. These conditions were more difficult for the older bilinguals than the younger ones, as evidenced by a significant increase in the reaction time. More interesting, however, is that an fMRI study of a small number of (young) bilingual individuals performing this task showed that switching between languages was accompanied by activation in the dorsolateral prefrontal cortex, an area involved in task switching and control of attention. Finally, a study by Rodriguez-Fornells, Rotte, Heinze, Nosselt, and Munte (2002) used fMRI to locate the ability of bilinguals to prevent interference from the other language through an inhibition mechanism in regions of the frontal lobes.

If the inhibitory control model of Green is correct, then bilingualism by its very nature results in greater use of inhibitory control because it is invoked every time language is used. Bilingual children therefore experience extensive practice of this executive function in the first few years of life, at least once both languages are known to a sufficient level of proficiency to offer viable processing systems. If this practice in inhibiting linguistic processing carries over to processing in disparate cognitive domains, then bilinguals should be more able than monolinguals to perform tasks that require the inhibition of irrelevant information (see Meuter, chapter 17, and Michael & Gollan, chapter 19, this volume, for related discussion concerning adult bilingual performance).

The prefrontal cortex is the last brain area to mature in development, a possible reason that many of the tasks that require switching attention or ignoring conflicting information are difficult for young children to solve. The bilingual experience of negotiating two language representations, switching attention between them on a constant basis, and selecting subtle features of linguistic input to guide performance in choosing the correct response language may accelerate the development of the responsible cortical areas. Thus, bilingualism may provide the occasion for a more rapid development of an essential cortical center, and the consequence of that development influences a wide range of cognitive activities.

This explanation is based on the assumption that cortical organization is plastic and that it can be altered with experience. Both presumptions are supported in neuroscience research. Studies by Recanzone, Merzenich, Jenkins, Grajski, and Dinse (1992) comparing finger sensitivity in monkeys that did or did not receive a stimulating learning experience and by Ebert, Pantev, Wienbruch, Rockstroth, and Taub (1995) comparing finger sensitivity in violin players and nonmusicians reported cortical reorganization and enhancement in the representation area responsible for those fingers. In both cases, an environmental experience that offered massive practice in an activity resulted in a reorganization of a significant cortical region.

In rehabilitation research, Taub (2001) has been successful in reestablishing motor control in areas paralyzed through stroke. Patients who lose control over some area, for example, an arm, have the spared arm immobilized and are trained to use the
paralyzed arm through massive practice, an experience that results in the transfer of motor control for that arm to an undamaged cortical region. Bilingualism may provide another example of this kind of reorganizational process. The environmental experience of using two languages from childhood provides massive practice in the attention and inhibition centers of the prefrontal cortex and promotes their development.

The Bilingual Impact

Speculations about the manner in which bilingualism may influence cognitive functioning are rarely couched in terms of detailed processes like control of attention and inhibition. Instead, the descriptions are pitched at the level of overall intelligence, claiming enhancements (e.g., Peal & Lambert, 1962) or deficits (e.g., Saer, 1923), but broad in their implications. How can the processing descriptions proposed here be reconciled with the claims made by these more global views?

In an early description of intelligence, Cattell (1963) distinguished between fluid and crystallized forms. Fluid intelligence declines with aging and is correlated with a range of frontal tasks (Kray & Lindenberger, 2000; Salthouse, Fristoe, McGuthry, & Hambrick, 1998). In contrast, crystallized intelligence remains relatively stable across the lifespan, if anything increasing with the accretion of knowledge and experience, and does not correlate with those tasks that demand online processing and attention. In Duncan's (1996) model, described in the section "Bilingualism: What's the Difference?", he posits a relation between g and performance in a variety of frontal tasks, but it is possible that his equation could be made more precise by considering only fluid intelligence rather than the commonality across all forms of intellectual assessment. If bilingualism has an impact on a general form of intelligence, then based on the performance on specific tasks, it is likely that the impact is confined to fluid intelligence, those aspects of performance most dependent on executive control. There is, of course, no evidence that bilingualism does affect intelligence. The claim here is more simply that the specific cognitive processes that do appear to be enhanced by bilingualism would most likely have an impact on only one aspect of general intelligence, namely, fluid intelligence.

The most general aspect of cognition that Peal and Lambert (1962) identified as the locus of bilingual influence was creative thinking and flexibility of thought, a conclusion shared by others as well (cf. MacNab, 1979). The usual explanation for this advantage is that having two linguistic systems and two names for things endows bilinguals with the capacity to see things from different perspectives, in both aspects, and switch between these designations. For example, creativity tasks, such as requiring the participant to generate unusual uses for common objects, require individuals to suppress the usual use or appearance of these objects, freeing themselves to entertain alternatives. The nonverbal tests in which Peal and Lambert's (1962) bilinguals excelled all required a degree of manipulation as opposed to more straightforward concept formation or computation. These measures are aspects of fluid intelligence. Moreover, they frequently require the ability to ignore misleading information, such as the usual use of a common object, to attend to a subtler feature and propose a novel function. If indeed bilinguals perform these tasks better than monolinguals, it would be attributable to precisely the same processes that ensured their advantage in the other executive function tasks described throughout. In that sense, creativity may indeed be an indirect beneficiary of bilingualism, at least in the way it is assessed on psychological tests.

Bilingualism changes something fundamental about the way cognitive processes are shaped by young children. How extensive these changes are in either cognitive space or developmental time are questions that are still under investigation. Even if these advantages prove to be more transient or more fragile than some of the more optimistic data suggest they are, their role in discarding old fears that bilingualism confuses children and retards their intellectual growth has been a worthy outcome.

Acknowledgment

The preparation of this chapter and the research reported in it were supported by a grant from the Natural Sciences and Engineering Research Council of Canada.

References

Astington, J. W. (1993). The child's discovery of the mind. Cambridge, MA: Harvard University Press.
Bain, B. C. (1974). Bilingualism and cognition: Toward a general theory. In S. T. Carey (Ed.), Bilingualism, biculturalism, and education: Proceedings from the conference at College
Universitaire Saint Jean (pp. 119–128). Edmonton, Canada: University of Alberta.
Ben-Zeev, S. (1977). The influence of bilingualism on cognitive strategy and cognitive development. Child Development, 48, 1009–1018.
Bialystok, E. (1999). Cognitive complexity and attentional control in the bilingual mind. Child Development, 70, 636–644.
Bialystok, E. (2001). Bilingualism in development: Language, literacy, and cognition. New York: Cambridge University Press.
Bialystok, E. (2002). Acquisition of literacy in bilingual children: A framework for research. Language Learning, 52, 159–199.
Bialystok, E., & Codd, J. (1997). Cardinal limits: Evidence from language awareness and bilingualism for developing concepts of number. Cognitive Development, 12, 85–106.
Bialystok, E., & Martin, M. (2004). Attention and inhibition in bilingual children: Evidence from the dimensional change card sort task. Developmental Science, 7, 325–339.
Bialystok, E., & Senman, L. (2004). Executive processes in appearance-reality tasks: The role of inhibition of attention and symbolic representation. Child Development, 75, 562–579.
Brauer, M. (1998). Stroop interference in bilinguals: The role of similarity between the two languages. In A. F. Healy & L. E. Bourne, Jr. (Eds.), Foreign language learning: Psycholinguistic studies on training and retention (pp. 317–337). Mahwah, NJ: Erlbaum.
Burgess, P. W., & Shallice, T. (1996). Response suppression, initiation and strategy use following frontal lobe lesions. Neuropsychologia, 34, 263–272.
Carlson, S. M., & Moses, L. J. (2001). Individual differences in inhibitory control and children's theory of mind. Child Development, 72, 1032–1053.
Carlson, S. M., Moses, L. J., & Hix, H. R. (1998). The role of inhibitory control in young children's difficulties with deception and false belief. Child Development, 69, 672–691.
Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical experiment. Journal of Educational Psychology, 54, 1–22.
Chee, M. W. L., Tan, E. W. L., & Thiel, T. (1999). Mandarin and English single word processing studied with functional magnetic resonance imaging. Journal of Neuroscience, 19, 3050–3056.
Chen, H. C., & Ng, N. L. (1989). Semantic facilitation and translation priming effects in Chinese-English bilinguals. Memory and Cognition, 18, 279–288.
Dagenbach, D., & Carr, T. (1994). Inhibitory processes in attention, memory, and language. New York: Academic Press.
Darcy, N. T. (1963). Bilingualism and the measurement of intelligence: A review of a decade of research. Journal of Genetic Psychology, 103, 259–282.
Dehaene, S., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., Perani, D., et al. (1997). Anatomical variability in the cortical representation of first and second language. NeuroReport, 8, 3809–3815.
Dempster, F. N. (1992). The rise and fall of the inhibitory mechanism: Toward a unified theory of cognitive development and aging. Developmental Review, 12, 45–75.
Diamond, A. (2002). Normal development of prefrontal cortex from birth to young adulthood: Cognitive functions, anatomy, and biochemistry. In D. T. Stuss & R. T. Knight (Eds.), Principles of frontal lobe function (pp. 466–503). London: Oxford University Press.
Diamond, A., & Taylor, C. (1996). Development of an aspect of executive control: Development of the abilities to remember what I said and to do as I say, not as I do. Developmental Psychology, 29, 315–334.
Duncan, J. (1996). Attention, intelligence, and the frontal lobes. In M. Gazzaniga (Ed.), The cognitive neurosciences (pp. 721–733). Cambridge, MA: MIT Press.
Duncan, S. E., & De Avila, E. A. (1979). Bilingualism and cognition: Some recent findings. Journal of the National Association for Bilingual Education, 4, 15–50.
Durgunoglu, A. Y., & Roediger, H. L. (1987). Test differences in accessing bilingual memory. Journal of Memory and Language, 26, 377–391.
Ebert, T., Pantev, C., Wienbruch, C., Rockstroth, B., & Taub, E. (1995). Increased cortical representation of the fingers of the left hand in string players. Science, 270, 305–306.
Ellis, N. C. (1992). Linguistic relativity revisited: The bilingual word-length effect in working memory during counting, remembering numbers, and mental calculation. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 137–155). Amsterdam: North-Holland.
Francis, W. S. (1999a). Analogical transfer of problem solutions within and between languages in Spanish-English bilinguals. Journal of Memory and Language, 40, 301–329.
Francis, W. S. (1999b). Cognitive integration of language and memory in bilinguals: Semantic representation. Psychological Bulletin, 125, 193–222.
Frenck-Mestre, C., & Vaid, J. (1993). Activation of number facts in bilinguals. Memory and Cognition, 21, 809–818.
Frye, D., Zelazo, P. D., & Palfai, T. (1995). Theory of mind and rule based reasoning. Cognitive Development, 10, 483–527.
430 Aspects and Implications of Bilingualism
Fuson, K. C. (1988). Childrens counting and performance: Interaction of theory and appli-
concepts of number. New York: Springer- cation (pp. 653675). Cambridge, MA: MIT
Verlag. Press.
Geary, D. C., Cormier, P., Goggin, J. P., Estrada, Hermans, D., Bongaerts, T., De Bot K., &
P., & Lunn, M. C. E. (1993). Mental Schreuder, R. (1998). Producing words in a
arithmetic: A componential analysis of foreign language: Can speakers prevent inter-
speed-of-processing across monolingual, weak ference from their rst language? Bilingualism:
bilingual, and strong bilingual adults. Inter- Language and Cognition, 1, 213229.
national Journal of Psychology, 28, 185201. Hernandez, A. E., Dapretto, M., Mazziotta, J., &
Gelman, R., & Gallistel, C. R. (1978). The childs Bookheimer, S. (2001). Language switching
understanding of number. Cambridge, MA: and language representation in Spanish-
Harvard University Press. English bilinguals: An fMRI study.
Gould, S. J. (1981). The mismeasure of man. New NeuroImage, 14, 510520.
York: Norton. Hernandez, A. E., & Kohnert, K. J. (1999). Aging
Grainger, J. (1993). Visual word recognition in and language switching in bilinguals. Aging,
bilinguals. In R. Schreuder & B. Weltens Neuropsychology, and Cognition, 6, 6983.
(Eds.), The bilingual lexicon (pp. 1125). Hernandez, A. E., Martinez, A., & Kohnert, K.
Amsterdam: Benjamins. (2000). In search of the language switch: An
Green, D. W. (1998). Mental control of the bilin- fMRI study of picture naming in Spanish-English
gual lexico-semantic system. Bilingualism: bilinguals. Brain and Language, 73, 421431.
Language and Cognition, 1, 6781. Hughes, C. (1998). Executive function in pre-
Grosjean, F. (1997). Processing mixed languages: schoolers: Links with theory of mind and
Issues, ndings, and models. In A. M. B. de verbal ability. British Journal of Develop-
Groot & J. F. Kroll (Eds.), Tutorials in bilin- mental Psychology, 16, 233253.
gualism: Psycholinguistic perspectives (pp. Illes, J., Francis, W. S., Desmond, J. E., Gabrieli, J.
225254). Mahwah, NJ: Erlbaum. D. E., Glover, G. H., Poldrack, R., et al.
Guttentag, R. E., Haith, M. M., Goodman, G. S., (1999). Convergent cortical representation of
& Hauch, J. (1984). Semantic processing of semantic processing in bilinguals. Brain and
unattended words in bilinguals: A test of the language, 70, 347363.
input switch mechanism. Journal of Verbal Jacques, S., Zelazo, P. D., Kirkham, N. Z., &
Learning and Verbal Behavior, 23, 178188. Semcesen, T. K. (1999). Rule selection versus
Hakuta, K. (1986). Mirror of language: The debate rule execution in preschoolers: An error-
on bilingualism. New York: Basic Books. detection approach. Developmental Psychol-
Hakuta, K., Ferdman, B. M., & Diaz, R. (1987). ogy, 35, 770780.
Bilingualism and cognitive development: three Juncos-Rabadan, O. (1994). The assessment of
perspectives. In S. Rosenberg (Ed.), Advances bilingualism in normal aging with the Bilin-
in applied psycholinguistics: Reading, writing, gual Aphasia Test. Journal of Neurolinguis-
and language learning (Vol. 2, pp. 284319). tics, 8, 6773.
New York: Cambridge University Press. Juncos-Rabadan, O., & Iglesias, F. J. (1994). De-
Hala, S., & Russell, J. (2001). Executive control cline in the elderlys language: Evidence from
with strategic deception: A window on early cross-linguistic data. Journal of Neurolinguis-
cognitive development? Journal of Experi- tics, 8, 183190.
mental Child Psychology, 80, 112141. Kim, K. H. S., Relkin, N., Lee, K., & Hirsch, J.
Harnishfeger, K. K., & Bjorklund, D. F. (1993). (1997). Distinct cortical areas associated with
The ontogeny of inhibition mechanisms: A native and second languages. Nature, 388,
renewed approach to cognitive development. 171174.
In R. Pasnak & M. Howe (Eds.), Emerging Kimberg, D. Y., DEsposito, M., & Farah, M. J.
themes in cognitive development (Vol. 1, pp. (1997). Effects of bromocriptine on human
2849). New York: Springer-Verlag. subjects depend on working memory capacity.
Hasher, L., & Zacks, R. T. (1988). Working Neuroreport, 8, 35813585.
memory, comprehension, and aging: A review Kirkham, N., Cruess, L., & Diamond, A. (2003).
and a new view. In G. H. Bower (Ed.), The Helping children apply their knowledge to
psychology of learning and motivation (Vol. their behavior on a dimension-switching task.
22, pp. 193225). San Diego, CA: Academic Developmental Science, 6, 449476.
Press. Klein, D., Milner, B., Zatorre, R. J., Zhao, V., &
Hasher, L., Zacks, R. T., & May, C. P. (1999). Nikelski, J. (1999). Cerebral organization in
Inhibitory control, circadian arousal, and age. bilinguals: A PET study of Chinese-English
In D. Gopher & A. Koriat (Eds.), Attention verb generation. NeuroReport, 10,
and performance, 17: Cognitive regulation of 28412846.
Consequences of Bilingualism 431
Klein, D., Zatorre, R. J., Milner, B., Meyer, E., & D. Shapiro (Eds.), Consciousness and self-
Evans, A. C. (1995). The neural substrates of regulation (Vol. 4, pp. 118). New York:
bilingual language processing: evidence from Plenum Press.
positron emission tomography. In M. Paradis Paradis, M. (1997). The cognitive neuropsychology
(Ed.), Aspects of bilingual aphasia (pp. 2336). of bilingualism. In A. M. B. de Groot & J. F.
Oxford, U.K.: Pergamon. Kroll (Eds.), Tutorials in bilingualism: Psy-
Kray, J., & Lindenberger, U. (2000). Adult age cholinguistic perspectives (pp. 331354).
differences in task switching. Psychology and Mahwah, NJ: Erlbaum.
Aging, 15, 126147. Peal, E., & Lambert, W. (1962). The relation of
Kroll, J. F., & De Groot, A. M. B. (1997). Lexical bilingualism to intelligence. Psychological
and conceptual memory in the bilingual: Monographs, 76(546), 123.
Mapping form to meaning in two languages. Perani, D., Paulesu, E., Galles, N. S., Dupoux, E.,
In A. M. B. de Groot & J. F. Kroll (Eds.), Dehaene, S., Bettinardi, V., et al. (1998). The
Tutorials in bilingualism (pp. 169199). bilingual brain: Prociency and age of acqui-
Mahwah, NJ: Erlbaum. sition of the second language. Brain, 121,
Kroll, J. F., & Stewart, E. (1994). Category inter- 18411852.
ference in translation and picture naming: Perner, J. (1991). Understanding the representa-
Evidence for asymmetric connections between tional mind. Cambridge, MA: MIT Press.
bilingual memory representations. Journal of Perner, J., Stummer, S., & Lang, B. (1998). Exec-
Memory and Language, 33, 149174. utive functions and theory of mind: Cognitive
Luria, A. R. (1966). Higher cortical functions in complexity or functional dependence? In P. D.
man. London: Tavistock. Zelazo, J. W. Astington, & D. R. Olson (Eds.),
MacNab, G. L. (1979). Cognition and bilingual- Developing theories of intention: Social un-
ism: A reanalysis of studies. Linguistics, 17, derstanding and self-control (pp. 133152).
231255. Mahwah, NJ: Erlbaum.
Macnamara, J. (1966). Bilingualism and primary Perrett, E. (1974). The left frontal lobe of man and
education. Edinburgh, U.K.: Edinburgh Uni- the suppression of habitual responses in verbal
versity Press. categorical behavior. Neuropsychologia, 12,
Macnamara, J. (1967). The effect of instruction in 323330.
a weaker language. Journal of Social Issues, Pinker, S. (1994). The language instinct. New
23, 121135. York: Morrow.
Macnamara, J., & Kushnir, S. (1971). Linguistic Posner, M. I., & DiGirolamo, G. J. (2000). Exec-
independence of bilinguals: The input switch. utive attention: Conict, target detection, and
Journal of Verbal Learning and Verbal cognitive control. In R. Parasuraman (Ed.),
Behavior, 10, 480487. The attentive brain (pp. 401423). Cam-
Magiste, E. (1980). Arithmetical calculations in bridge, MA: MIT Press.
monolinguals and bilinguals. Psychological Pouratian, N., Bookheimer, S. Y., OFarrell, A. M.,
Research, 42, 363373. Sicotte, N. L., Cannestra, A. F., Becker, D.,
Marsh, L. G., & Maki, R. H. (1976). Efciency of et al. (2000). Optical imaging of bilingual
arithmetic operations in bilinguals as a func- cortical representations. Journal of Neurosur-
tion of language. Memory and Cognition, 4, gery, 93, 676681.
459464. Price, C. J., Green, D. W., & von Studnitz, R.
McClain, L., & Huang, J. Y. S. (1982). Speed of (1999). A functional imaging study of trans-
simple arithmetic in bilinguals. Memory and lation and language switching. Brain, 122,
Cognition, 10, 591596. 22212235.
Mestre, J. P. (1988). The role of language com- Recanzone, G. H., Merzenich, M. M., Jenkins, W.
prehension in mathematics and problem solv- M., Grajski, K. A., & Dinse, H. R. (1992).
ing. In R. R. Cocking & J. P. Mestre (Eds.), Topographic reorganization of the hand rep-
Linguistic and cultural inuences on learning resentation in cortical area 3b of owl monkeys
mathematics (pp. 201220). Hillsdale, NJ: trained in a frequency-discrimination task.
Erlbaum. Journal of Neurophysiology, 67, 10311056.
Morales, R. V., Shute, V. J., & Pellegrino, J. W. Reynolds, A. G. (1991). The cognitive conse-
(1985). Developmental differences in under- quences of bilingualism. In A. G. Reynolds
standing and solving simple mathematics word (Ed.), Bilingualism, multiculturalism, and
problems. Cognition and Instruction, second language learning: The McGill con-
2, 4157. ference in honour of Wallace E. Lambert (pp.
Norman, D. A., & Shallice, T. (1986). Attention in 145182). Hillsdale, NJ: Erlbaum.
action: Willed and automatic control of be- Ricciardelli, L. A. (1992). Bilingualism and cogni-
havior. In R. J. Davidson, G. E. Schwartz, & tive development in relation to threshold
432 Aspects and Implications of Bilingualism
theory. Journal of Psycholinguistic Research, Tipper, S. P., Bourque, T. A., Anderson, S. H., &
21, 301316. Brehaut, J. C. (1989). Mechanisms of atten-
Rodriguez-Fornells, A., Rotte, M., Heinze, H.-J., tion: A developmental study. Journal of
Nosselt, T. M., & Munte, T. F. (2002). Brain Experimental Child Psychology, 48, 353378.
potential and functional MRI evidence for Tipper, S. P., & McLaren, J. (1990). Evidence for
how to handle two languages with one brain. efcient visual selectivity in children. In J. T.
Nature, 415, 10261029. Enns (Ed.), The development of attention:
Saer, D. J. (1923). The effects of bilingualism on Research and theory (pp. 197210). New
intelligence. British Journal of Psychology, York: North-Holland.
14, 2538. Torrance, E. P., Wu, J. J., Gowan, J. C., & Aliotti,
Salthouse, T. A., Fristoe, N. M., McGuthry, K. E., N. C. (1970). Creative functioning of
& Hambrick, D. Z. (1998). Relation of task monolingual and bilingual children in Singa-
switching to speed, age, and uid intelligence. pore. Journal of Educational Psychology, 61,
Psychology and Aging, 13, 445461. 7275.
Secada, W. G. (1991). Degree of bilingualism and Towse, J. N., Redbond, J., Houston-Price, C. M.
arithmetic problem solving in Hispanic first T., & Cook, S. (2000). Understanding the
graders. The Elementary School Journal, 92, dimensional change card sort: Perspectives
213-231. from task success and failure. Cognitive
Shallice, T. (1988). From neuropsychology to Development, 15, 347365.
mental structure. Cambridge, U.K.: Cam- Van Hell, J. G., & De Groot, A. M. B. (1998).
bridge University Press. Conceptual representation in bilingual mem-
Smith, M. C. (1997). How do bilinguals access ory: Effects of concreteness and cognate status
lexical information? In A. M. B. de Groot & in word association. Bilingualism: Language
J. F. Kroll (Eds.), Tutorials in bilingualism (pp. and Cognition, 1, 193211.
145168). Mahwah, NJ: Erlbaum. Wellman, H. M. (1990). Childs theory of mind.
Spearman, C. (1927). The abilities of man. New Cambridge, MA: MIT Press.
York: Macmillan. Wynn, K. (1992). Childrens acquisition of
Spelke, E. S., & Tsivkin, S. (2001). Language and the number words and the counting system.
number: A bilingual training study. Cognition, Cognitive Psychology, 24, 220251.
78, 4588. Zelazo, P. D., & Frye, D. (1997). Cognitive
Stuss, D. T. (1992). Biological and psychological complexity and control: A theory of the de-
development of executive functions. Brain and velopment of deliberate reasoning and inten-
Cognition, 20, 823. tional action. In M. Stamenov (Ed.), Language
Taub, E. (2001, April). Adult brain plasticity: structure, discourse, and the access to
Unlearning paralysis. Paper presented at the consciousness (pp. 113153). Amsterdam:
symposium Beyond the Myth: Experience Benjamins.
Expectant and Experience Dependent Zelazo, P. D., Frye, D., & Rapus, T. (1996). An
Brain Plasticity at the biennial meeting of age-related dissociation between knowing
the Society for Research in Child Develop- rules and using them. Cognitive Development,
ment, Minneapolis, MN. 11, 3763.
Aneta Pavlenko
21
Bilingualism and Thought
ABSTRACT This chapter discusses the implications of recent theoretical and empirical investigations in linguistic relativity for the study of bilingualism. It starts with a discussion of new developments in the study of the Sapir-Whorf hypothesis and then offers a framework for the study of bilingualism and thought from a neo-Whorfian perspective. Subsequently, it outlines nine areas in which current empirical inquiry either illuminates thought processes of adult bi- and multilingual individuals or offers productive directions for future studies of bilingualism and thought. The chapter ends with a discussion of ways in which research with bilingual individuals can offer unique contributions to the study of linguistic relativity and to the understanding of the interaction between language and thought.
same reality: They not only differ from each other, but also consist of multiple discourses associated with various contexts.

This approach was found to be productive in the study of language and space, previously explored exclusively through the lenses of structural relativity. Pederson (1995) found that rural and urban speakers of Tamil in South India differed systematically in their verbal and nonverbal performance on three problem-solving spatial tasks (a memory task, a route completion task, and inferencing) because of differences in habitual linguistic encoding of spatial information: Absolute references were more typical of Tamils from rural settings, and relative references were more typical for those from urban settings. His study suggests that even in areas such as spatial terminology, usually seen as uniform within a particular language, differences in discursive practices may result in differences in verbal and nonverbal performance.

Undoubtedly, different neo-Whorfian scholars espouse distinct views of language and of the relationship between language and thought. What is important for the present chapter is the shared acknowledgment that speakers' construction of the world may be influenced by the structural patterns of their languages, as well as by their discourses, and that it may be changed through participation in alternative discourses, such as schooling, or through additional language learning. This perspective builds on Whorf's (1956) original assumption that second language (L2) learning, just like socialization into new discourses, may result in assimilation of new perspectives and conceptual restructuring.

The goal of neo-Whorfian inquiry is to examine the influence of language, conceived of either as structures or as discourses, on thought. Thought is typically defined in two ways: Some scholars focus on the contents of thought, that is, speakers' conceptualizations of the world; others examine the processes of thinking, such as attending, remembering, or reasoning (Lucy, 1992b). More often than not, the two foci are combined, and the scholarly inquiry examines ways in which differences in linguistic encodings correspond to different conceptualizations and lead to differences in cognitive processes. Consequently, in what follows I see concepts as mental representations that affect individuals' immediate perception, attention, and recall and allow members of specific language and culture groups to conduct identification, comprehension, inferencing, and categorization along similar lines (Pavlenko, 1999).

This approach to conceptual representation recognizes that concepts are based on linguistic and perceptual bases and distinguishes between language-based (or language-related) concepts and concepts not immediately linked to language, for which language users may have a mental representation but no specific linguistic means of encoding. The latter possibility was also recognized by Whorf (1956), who emphasized his interest in "linguistic thinking" or "thought insofar as it is linguistic" (pp. 67–68).

Language-based concepts in turn are subdivided into lexicalized and grammaticized concepts. Lexicalized concepts entail lexical encodings of natural objects, artifacts, substances, events, or actions, and grammaticized concepts entail morphosyntactically encoded notions, such as number, gender, tense, or aspect (Slobin, 2001). Bruner (1996), Chafe (2000), Hill and Mannheim (1992), and Lucy (1992a, 1992b, 1996, 1997a) also argue for an expansion of the scope of the study of mental representations from lexicalized and grammaticized concepts to narrative structures, discourses, and discursive indexing of identities.

In addition to defining what one understands as language and thought, it is crucial to define what is considered as evidence of influence of language on thought. In the present chapter, I adopt Lucy's (1992a, 1996, 1997a, 2000) view that, to avoid showing the influence of language on language, one needs to consider (whenever possible) evidence from both verbal and nonverbal behaviors. Nonverbal behaviors refer to those elicited through classification, categorization, sorting, matching, memory, and role-playing tasks; verbal behaviors include elicitation, inferencing, and picture description, as well as interviews, storytelling, and other conversational activities (of particular importance here is the speakers' selection of aspects of reality for subsequent description and memorization). In this view, the influence of language on thought will be seen as the case where the particular language interpretation guides or supports cognitive activity and hence the beliefs and behaviors dependent on it (Lucy, 1997b, p. 295).

Some scholars also argue that early psychological studies of linguistic relativity oversimplified the Sapir-Whorf hypothesis to make it fit experimental paradigms (Lee, 1997, p. 454) and, as a result, effectively side-stepped looking at what people mean by what they say, and what they do, interactionally, with words (Edwards, 1997, p. 22). Consequently, underscoring Whorf's original interest in habitual thought, neo-Whorfians
aim to combine experimental research with the study of thought in context, that is, in daily activities and practices, at the intersection of linguistics, psychology, and anthropology (Edwards, 1997; Hunt & Agnoli, 1991; Lucy, 1992b, 1996, 1997a).

In sum, neo-Whorfians acknowledge that different language levels may affect distinct cognitive processes and activities to varying degrees or not at all. Contemporary investigations of linguistic relativity, conducted both in experimental and naturalistic contexts, aim at uncovering ways in which cross-linguistic differences in lexical and morphosyntactic categories, as well as in discourses, correspond to different conceptual representations of objects, actions, events, time, or space and lead to differences in thought processes.

Bilingualism and Linguistic Relativity

Over the years, bilingualism rarely entered into debates about language and thought: Current collections of work on linguistic relativity are devoted exclusively to explorations in monolingual contexts (for an exception, see a chapter by Gomez-Imbert in Gumperz & Levinson, 1996). This monolingual bias does not, however, come from Whorf (1956), one of the first to champion the importance of multilingual awareness and to argue that "to restrict thinking to the patterns merely of English, and especially to those patterns which represent the acme of plainness in English, is to lose a power of thought which, once lost, can never be regained" (p. 244).

Whorf's writings clearly show his belief that additional language learning has the power of transforming or enhancing the speaker's worldview. It is, therefore, ironic, that later on his work was misinterpreted as an argument for linguistic determinism, a view according to which the language one speaks determines one's view of the world once and forever. Clearly, Whorf, an avid language learner committed to comparative linguistics, did not and could not entertain such a possibility; rather, he argued for the benefits of linguistic pluralism (Fishman, 1980). His early supporters expressed a similar interest in implications of linguistic relativity for L2 learning and use (J. Carroll, 1963) as well as an awareness that bilingualism of their research participants may have an impact on their findings (J. Carroll, 1963; J. Carroll & Casagrande, 1958). At times, they even expressed a belief that "whoever learns a new language becomes a new person" (Rossi-Landi, 1973, p. 33).

Eventually, however, the phenomena of bilingualism and translation were co-opted to refute linguistic relativity in a way succinctly summarized by Stubbs (1997): "But languages are not incompatible. We can translate between them. And bilinguals speak different languages, but they do not perceive the world differently when they switch from one language to another" (p. 359). In the field of bilingualism, this thesis was espoused by Macnamara (1970, 1991), who repeatedly argued that if the Whorfian hypothesis were true, bilinguals would be doomed, having to conform to one of the three patterns: (a) think in Language A when speaking either A or B, that is, employ the semantic framework appropriate to Language A; as a result, the speakers' attempts to understand Language B or to make themselves understood would be quite futile (Macnamara, 1991, p. 48); (b) think in a hybrid manner, appropriate to neither language, that is, employ a hybrid semantic system and risk understanding no one and being understood by no one (Macnamara, 1991, p. 48); (c) have two semantic systems, appropriate to their two languages. The third possibility, according to Macnamara (1970), means that bilinguals will think differently depending on which language is used and consequently will have difficulties (a) communicating with themselves and (b) translating into one language what was said in another. In a later paper, Macnamara (1991) took a less radical view and suggested that in the third case bilinguals would be able to translate and to communicate with speakers of either language. Yet, he claimed that these implications ran afoul of the guiding principles of natural language semantics (whatever can be expressed in one language, can be translated into another) and quipped that if linguistic relativity on the scale proposed by Whorf were true, then Whorf's own learning of Hopi and Navaho would be extremely mysterious (Macnamara, 1991, p. 49), if not impossible.

Not surprisingly, other scholars in bilingualism, many of them bi- and multilingual themselves, tried to counter Macnamara's (1970, 1991) and other similar arguments. Paradis (1979), in his reply to Macnamara (1970), argued that the first two options and difficulties with translation are indeed the case, and that none of the three cases described could be used to refute the Whorfian hypothesis ad absurdo. In fact, Macnamara's first option closely describes the phenomenon of first language (L1) transfer, well established in the field of second
language acquisition (SLA) and indeed known to impede intercultural communication. His second option is reminiscent of a language contact situation in which speakers of a contact variety may develop new linguistic repertoires and new conceptualizations distinct from those employed by members of their L1 and L2 communities. And the third option well describes bicultural bilinguals who adjust their linguistic and conceptual repertoires depending on the interlocutor.

Interestingly, some bicultural bilinguals do indeed experience difficulties in translating from one language to another (cf. Todorov, 1994). These difficulties are often commented on by bilingual writers who view translation as an approximation at best (for an in-depth discussion of the work of bilingual writers, see Beaujour, 1989; Kellman, 2000; Pavlenko, 1998). Some of these individuals, particularly those who had learned a second language later in life, see themselves as living in two different and often incompatible worlds; others view L2 socialization as a means of an intense personal transformation (Beaujour, 1989; E. Hoffman, 1989; Kellman, 2000; Pavlenko, 1998; Wierzbicka, 1985). What emerges from these testimonies is a far more nuanced picture of linguistic effects than could ever be imagined within a monolingual perspective. This picture deserves further examination, if only because it directly contradicts facile statements about bilinguals not seeing the world differently through the lenses of their two languages. Consider, for instance, a statement by the well-known linguist Anna Wierzbicka (1985):

It is not impossible (though very difficult) to leave the experiential world of one's native language for that of another language, or stretching the metaphor to the limit, to inhabit two different worlds at once. But when one switches from one language to another it is not just the form that changes but also the content. (p. 187)

In fact, it is quite possible that bilinguals are the only ones to experience directly the effects of linguistic relativity, and to fully understand these effects, we need to pay more attention to linguistic transitions. Yet, many researchers continue to see bilingualism as a challenge for the Sapir-Whorf hypothesis and bilinguals as undesirable and messy subjects who should be excluded from experimental research to eliminate intervening variables. Clearly, initial empirical studies, such as Lucy's (1992a), had to be carried out with monolingual speakers to establish baseline cross-linguistic differences. What is unfortunate is that once such differences have been established for particular languages or concepts, further research was rarely if ever conducted with bilingual speakers.

Several reasons explain this lack of attention to bilingualism. To begin, many linguists and psychologists, particularly in North America, are still reluctant to acknowledge that more than half of the world's population is bi- and multilingual (Romaine, 1995); thus, if we are to grapple with the Sapir-Whorf hypothesis or any other cognitive theory, we have to understand how it plays out with multilingual speakers. The research with bilingual subjects is further compromised by the lack of understanding of bilingualism in mainstream psychology. Some researchers treat bilingualism as a monolithic phenomenon and thus do not pay much attention to linguistic trajectories of their study participants; others consider it possible to use bilingual subjects as if they were monolingual, either completely discounting their bilingualism (Berlin & Kay, 1969, p. 12) or assuming that because the subjects had learned the L2 postpuberty, it would not affect their L1 (cf. Munnich, Landau, & Dosher, 2001).

These researchers are clearly unaware of two facts. First, the critical period is no longer a given in the field of SLA (Birdsong, 1999; Ioup, Boustagui, El Tigi, & Mosel, 1994), and even if it were, it had been posited (and explored) regarding phonological and syntactic but not conceptual competence. Furthermore, research has demonstrated that regardless of the age of acquisition, L2 learners' L1 competence in a variety of domains, including conceptual representation, is subject to L2 influence (Cook, 2003; Pavlenko, 2000).

Several scholars have pointed to the pervasive monolingual bias of explorations in cognitive psychology and linguistics. Hunt and Agnoli (1991) expressed concern over ways in which the scholarly community had ignored experiences of bilingual individuals, who may perceive their two worlds as untranslatable and incommensurable. Green (1998) cautioned against approaching all bilinguals in the same way because they may have different levels of expertise and different competences in their two languages. Ochs (1993) and Lee (1997) advocated a view of L2 socialization as enculturation into new ways of thinking and speaking.

Building on these proposals, I suggest that research on linguistic relativity can and should incorporate bilingualism as a test case rather than as an argument against the Whorfian hypothesis. The
context-sensitive view advocated here sees bilinguals as members of multiple discursive communities with linguistic repertoires that are not necessarily identical to those of monolingual speakers. Consequently, individual bilingualism is seen (a) as a conglomerate of linguistic and social trajectories, whereby differences in age and history of language acquisition, as well as in language proficiency, may lead to distinct effects of language on thought; (b) as a dynamic process whereby L2 socialization is viewed as a productive site of possible cognitive transformations and enrichment, in accordance with Whorf's (1956) original arguments. This perspective allows me to offer a framework (see also Pavlenko, 1999, 2000, 2002a) that incorporates seven possible relationships between language and thought in individual bi- and multilinguals:

1. Coexistence of L1 and L2 conceptual domains is directly implied by the Sapir-Whorf hypothesis and suggests that bicultural bilinguals using different languages may draw on distinct conceptual representations and index distinct discursive identities.
2. L1-based conceptual transfer refers to the L1-based conceptual system guiding L2 language learning and use, at least in the beginning and intermediate stages of L2 acquisition.
3. Internalization of new concepts entails adoption of L2 words (and underlying concepts) into the L1 of immigrant bilinguals and learners in language contact situations who perceive the need to emphasize distinctions nonexistent in the L1 or to refer to new objects and notions specific to the L2 community.
4. Shift from L1 to L2 conceptual domain refers to a shift of category prototypes or boundaries in the process of L2 socialization.
5. Convergence of L1 and L2 conceptual domains entails creation of a unitary concept, domain, or system distinct from both the L1 and L2 based, which may occur in simultaneous bilingualism or arise as a result of language contact.
6. Restructuring of a conceptual domain refers to a case where a shift is not complete but certain elements may be deleted from or incorporated in a concept or a conceptual domain.
7. Attrition of previously learned concepts involves a loss of previously learned concepts, classification schemas, categorical distinctions, or narrative conventions, evidenced in deviation from L1-based categorization patterns.

I will now review the evidence for these and other possible effects from the studies of linguistic relativity. Despite the fact that neo-Whorfian theorizing made requirements for convincing evidence more rigorous and the terms of debate more complex, several studies forged exciting new directions in the study of language and thought. I discuss this research in terms of cross-linguistic differences in nine basic concepts, which allow us to talk about our surroundings and experiences: color, objects and substances, number, space, motion, time, emotions, and personhood. I also discuss the findings in the inquiry on discursive relativity and autobiographical memory, paying particular attention to work that either illuminates bilinguals' thought processes or offers new directions for research in bilingualism.

Color

The domain of color reference has been at the center of debates on linguistic relativity for more than 50 years. This interest stems from the fact that different languages treat the notion of color differently by encoding varying numbers of colors in different ways (e.g., nominally, verbally, adjectivally) and making different semantic distinctions between hues. For instance, classic Greek did not distinguish between the colors English speakers call blue and black; contemporary Russian and Italian offer, respectively, two and four terms corresponding to the English blue (Hunt & Agnoli, 1991). Some languages, such as Fon (Benin) or Ngbaka-mabo (Central Africa), do not even conceptualize color as a dimension independent of other parameters of colored objects (Dubois, 1997; Lucy, 1997b).

Initial color studies offered some evidence that color codability (i.e., availability of a verbal label) makes colors more distinct and therefore more memorable (cf. Brown & Lenneberg, 1954; J. Carroll & Casagrande, 1958). In contrast, later studies argued that color perception is subject to universal, physiologically based constraints, and that it is perceptual salience, not language, that may cause differences in memory (Berlin & Kay, 1969; Heider, 1972). The split between proponents and opponents of universal constraints on color cognition is still characteristic of the field (cf. Hardin & Maffi, 1997). At the same time, the field has come
Bilingualism and Thought 439
closer to acknowledging both biological and cultural/linguistic influences on color cognition.

The proponents of relativity acknowledge the physiological basis of color vision but argue that earlier studies were compromised because of the lack of attention to linguistic status of color terms and because of their reliance on focal colors, on the basic color terms of American English, on the Western concept of color, on bilingual informants (in Berlin & Kay, 1969), and on methodologies at odds with the researchers' own objectives (Dubois, 1997; Hardin & Banaji, 1993; Hunt & Agnoli, 1991; Lucy, 1992b, 1997b; Saunders & van Brakel, 1997a, 1997b). In turn, the supporters of universal constraints on color cognition agree that such influences may be moderated by language (cf. Davies & Corbett, 1997; Davies, Sowden, Jerrett, Jerrett, & Corbett, 1998). Studies show that, in some contexts, perception of and memory for colors may be influenced by their codability in the speaker's language, as seen on sorting, categorization, and memory tasks (Davidoff, Davies, & Roberson, 1999; Davies & Corbett, 1997; Davies et al., 1998; Kay & Kempton, 1984; Lucy, 1997b). For instance, speakers of Setswana (a Bantu language spoken in Botswana), a language that has a single term botala for blue and green, were more likely than speakers of English and Russian to group the two colors together (Davies & Corbett, 1997).

To date, only a few studies have addressed bilinguals' color concepts. Ervin-Tripp (1961/1973) demonstrated that Navaho-English bilinguals' color categories differ from those of monolingual speakers of English and Navaho and form one underlying system. In turn, bilingual speakers of Kwak'wala (spoken on Vancouver Island) and English differentiate between yellow and green when speaking English but in Kwak'wala stick to the composite term lhenxa (yellow-with-green) (Saunders & van Brakel, 1997a). Saunders and van Brakel (1997b) note that several informants in Kay and Berlin's subsequent research (Kay, Berlin, Maffi, & Merrifield, 1997; Kay, Berlin, & Merrifield, 1991) appeal to L2 loans from English and Spanish when discussing colors.

The L2 influence was also found in a study by Caskey-Sirmons and Hickerson (1977) that examined color boundaries of native speakers of Korean, Japanese, Hindi, Cantonese, and Mandarin who had learned English as adults. The researchers found that the boundaries for nonoverlapping color terms had shifted in the process of L2 socialization and were no longer comparable to the areas mapped by monolingual speakers of these languages. For instance, in Hindi there is no word for gray. Not surprisingly, in the achromatic series monolingual Hindi speakers did not map the gray area. In contrast, three of five Hindi-English bilinguals did map such an area, showing sensitivity to the new distinction acquired in English.

In sum, it appears that, in the case of divergent color systems, bilinguals' conceptual representations and consequently patterns of verbal and nonverbal categorization may differ from those of monolingual speakers. These representations may be unified or language dependent and may incorporate new concepts and distinctions internalized in the process of L2 socialization.

Objects and Substances

The second prominent area of research involves linguistic and conceptual differences in representation of objects and substances. This line of inquiry derives from cross-linguistic differences in number marking. The majority of European languages are known as noun class languages and mark most nouns for number. These languages encode a count/mass distinction morphosyntactically; that is, they include the notion of unit or form as a part of a basic meaning of a noun, directing attention to number. Other languages, such as Yucatec, Japanese, or Mandarin, are known as classifier languages and lack a morphosyntactic count/mass distinction. In these languages, nouns commonly refer to substances, rather than objects, and must be accompanied by a numeral classifier that provides information about material properties of the referent (Foley, 1997; Lucy, 1992a). Because classifier languages provide no syntactic support for the object/substance distinction, they offer a natural arena in which to investigate cognitive behaviors of both children and adults.

Studies have established that children learning English show preference for shape-based classification of various objects as early as 2 years of age; similar preferences are shown by English-speaking adults. In contrast, Yucatec- and Japanese-speaking children and adults show preference for material-based classification on verbal and nonverbal tasks, with Yucatec adults exhibiting it also in their everyday activities (Gentner & Boroditsky, 2001; Imai, 2000; Imai & Gentner, 1997; Lucy, 1992a; Lucy & Gaskins, 2001). Zhang and Schmitt (1998) also investigated effects that particular types of classifiers have on conceptualization and categorization
440 Aspects and Implications of Bilingualism
of objects. Speakers of Mandarin in their studies perceived objects that share a classifier as more similar to each other than did speakers of English; in recall tasks, they were more likely than speakers of English to recall classifier-sharing objects in clusters.

To date, I know of no studies that address shape- versus material-based object categorization preferences of bilingual subjects to see whether, for instance, the learning of English modifies categorization preferences of Japanese speakers or vice versa. Other interesting questions in this area arise regarding childhood bilinguals: When does language start influencing categorization preferences in different domains? How do children reconcile incompatible patterns? Lucy and Gaskins (2001) suggest that in the area of object categorization such influence occurs in later childhood; work on motion patterns and spatial cognition shows that in these areas the influence starts early on (Bowerman & Choi, 2001; Gentner & Boroditsky, 2001; Gopnik, 2001).

In addition to number marking, languages may differ in ways they encode even such everyday objects as shoes and boots or cups and glasses. For instance, both English and Russian have translation equivalents of cups/chashki and glasses/stakany, but objects that English speakers consider to be paper cups are seen as stakanchiki (small glasses) in Russian, a language in which glassness is defined through shape and the absence of handles rather than through material. As a result, speakers of languages that encode objects differently perform differently on sorting and categorization tasks (Kronenfeld, 1996; Malt, Sloman, & Gennari, 2003; Malt, Sloman, Gennari, Shi, & Wang, 1999).

A few studies also throw light on bilinguals' performance. Graham and Belnap (1986) showed that intermediate and advanced Spanish learners of English who had resided in the United States less than a year exhibited L1-based categorization patterns in cases where boundary differences in English did not correspond to those in Spanish (e.g., in the case of chair, stool, and bench vs. silla and banco). Malt and Sloman (2003) asked three groups of L2 users of English to name common household objects in English. The stimuli consisted of 60 pictures of storage containers (bottles, jars, etc.) and 60 pictures of housewares (dishes, plates, bowls, etc.). The researchers reported that even the most advanced speakers in their study, ones who had been in the United States for 8 or more years and had 10 or more years of formal English instruction, exhibited some discrepancies from monolingual naming patterns, especially when it came to the housewares.

Together, these findings point to a pervasive influence of L1-based categorization patterns and to difficulties in acquiring full conceptual representations in the L2. Future studies of object categorization will need to pay closer attention to similarities and differences in L1 and L2 categorization patterns and consider the possibility of L2 influence on L1, as well as the interaction between three or more languages with distinct patterns.

Number and Numeric Systems

The third line of inquiry also draws on differences in number marking, as well as on those in number encoding. As discussed, languages differ significantly in grammatical number marking: Classifier languages, such as Indonesian or Japanese, lack the category altogether; noun class languages, such as English, allow their speakers to differentiate one basket from two or more baskets; and some languages, like Yimas, differentiate among one impram (basket), two impraml (baskets), and more than two impramat (baskets) (Foley, 1997). Lucy's (1992a) work showed that, because objects are marked for number in English but not in Yucatec, speakers of the two languages differ systematically in memory for objects.

Languages also differ in number encoding, using a variety of systems. Most languages have a base number and number names that are often a contraction of smaller units. English, for instance, is a base 10 language in which 21 could be expressed as two tens and one. Although the base 10 system has now taken over most languages, numerical encoding remains highly variable, with base 2 used in some aboriginal languages in Australia and base 20 in Eskimo and Yoruba (Dehaene, 1997). The most transparent reflection of the decimal structure is found in the grammar of Asian languages with roots in ancient Chinese (Chinese, Japanese, and Korean among them), in which number names are fully congruent with the base 10 numeration system. When speakers of these languages learn numeracy, all they have to learn are the digits from 0 to 9 and the notion of place value; then they can generate numbers without any further memorization (e.g., 17 is represented as seventeen in English but as ten-seven in Korean or Japanese). In contrast, children learning English or French have to learn by rote not only the numerals from 0 to 10, but also those from 11 to 19, and the tens numbers
from 20 to 90. What are the cognitive consequences of these linguistic differences?

In a series of studies, Miura and associates (Miura, 1987; Miura, Kim, Chang, & Okamoto, 1988; Miura & Okamoto, 1989) compared the cognitive representation of number in American, Chinese, Japanese, and Korean first graders by asking the children to construct various numbers with two types of rods, short ones that represented 1 unit, and longer ones that represented 10 units. They found that Chinese, Japanese, and Korean children preferred to use a combination of 10-unit and 1-unit rods, while American children were more likely to represent numbers through a collection of 1-unit rods. The researchers explained the difference through the fact that the notion of place value is an inherent component of linguistic encoding of number in Asian languages, but needs to be understood and internalized by English-speaking children. They also found that more Asian children than American children were able to construct each number in two ways, which suggests greater flexibility of mental number manipulation.

In turn, Miller and Stigler (1987) showed that Chinese children between 4 and 6 years of age outperformed English-speaking American children of the same age on abstract counting and on counting sets of objects varying in size and arrangement; Chinese children could also count higher than their American peers. The teens created a particular stumbling block for American children; they were also more likely to skip numbers and were the only ones to produce nonstandard numbers such as forty-twelve.

Together, these studies suggest that number encoding in Asian languages facilitates understanding of basic mathematical concepts such as place value, numerical relations such as part-whole, and the mental manipulation of number quantities required for numerical reasoning. At the same time, it is also possible that dramatic differences between populations are enhanced by social and cultural factors (cf. Towse & Saxton, 1997). Furthermore, the early differences in understanding of the place value concept, mental flexibility, or counting skill may be strictly developmental; it remains to be determined what role, if any, they play in later mathematical performance.

Little is known at this point about implications of grammatical number marking differences for bilinguals' verbal and nonverbal performance, even though numerous studies have addressed mathematical performance of bilingual children and adults (for a discussion, see Bialystok, chapter 20, this volume). These studies have established that some areas of numerical cognition are language-independent (Spelke & Tsivkin, 2001), that there is an advantage in calculation speed for the preferred language (Noel & Fias, 1998), that the preferred language is not necessarily the first one but may be the language of schooling (Vaid & Menon, 2000) or training (Spelke & Tsivkin, 2001), and that L1 dominance for mental computation may decrease with the length of residence in the L2 context (Tamamaki, 1993). These studies, however, focused on bilingualism per se, rather than on the effects of having two diverging numeric systems. Future studies could examine numeracy development in bilingual children who are learning two distinct numerical systems and see, for instance, whether there is transfer of skills and concepts, such as place value, from one language into another.

Space

The fourth area in which both lexicosemantic and morphosyntactic differences may be important involves conceptualization of space and memory for spatial arrangements. Cross-linguistic differences in conceptualization of space are commonly discussed in terms of three frames of reference. An absolute frame uses information external to both the speech participants and the figure-ground scene, such as north, south, east, or west; this frame is commonly used by the speakers of Tzeltal, a Mayan language spoken in Mexico. An intrinsic frame uses the features of the object in question as the point of departure, and the relative or deictic frame is based on projections from the human body, such as in front (of me) or to the left. The latter frames are commonly used by speakers of English or Dutch to describe small layouts for which absolute systems are not appropriate. Studies have shown that different speech communities may favor different reference frames. As a result, members of these speech communities differ systematically in their performance on verbal and nonverbal problem-solving, memory, role-playing, and description elicitation tasks, with Tzeltal speakers, for instance, favoring an absolute frame of reference for tabletop arrangements, and Dutch speakers opting for the relative one (Bowerman, 1996a, 1996b; M. Carroll, 1993, 1997; Choi & Bowerman, 1991; Levinson, 1996, 1997, 2003; Pederson, Danziger, Wilkins, Levinson, Kita, & Senft, 1998).
speakers of English). A series of empirical studies by Jarvis (1994, 2000) demonstrated that beginning and intermediate learners of English who described collisions appealed to L1 transfer in their use of motion verbs and produced strikingly different descriptions. In future studies, it is important to use a combination of verbal and nonverbal tasks to examine how motion is represented in bilinguals who speak a satellite-framed and a verb-framed language.

Time

Yet another concept intrinsically linked to both space and motion is time. Explorations of cross-linguistic differences in encoding and conceptualization of time are rooted in Whorf's (1956) original arguments about the lack of the time concept in Hopi. Several critics, most notably Gipper (1976) and Malotki (1983), argued against Whorf, pointing out that Hopi has a rich and extended temporal system. At the same time, both Gipper (1976) and Malotki (1983) admitted that, although their work rejects the notion of Hopi as a timeless language, it supports the idea that the Hopi sense of time and the role time plays in their lives and culture do not correspond to Western notions. Gipper (1976) described the Hopi time experience as cyclic rather than linear, and Malotki (1983) emphasized that for a good many Hopi who are living on their ancestral land and are clinging to what is left of their ancient traditions, time is basically an organic experience which unfolds in harmony with the cyclic rhythms of their social, agricultural, or religious events (p. 633). Lucy (1996) pointed out that Malotki (1983) and others, who look for a concept of time in Hopi, completely miss Whorf's crucial point about distinct structuration of the time words in English and Hopi grammars. In other words, the issue is the difference between conceptualizations of time rather than the lack or existence of an abstract time concept in Hopi.

The debate about the concept of time was so heated that not until recently did scholars dare to approach the issue again from a Whorfian perspective. To date, Boroditsky's (2001) study is the only one explicitly engaged with bilingual subjects. The researcher shows that English and Mandarin use different spatiotemporal metaphors when talking about time: English favors horizontal metaphors (e.g., ahead of time, behind schedule, looking forward); Mandarin typically describes time as vertical, using spatial morphemes shàng (up) and xià (down) (notably, each language has a handful of the opposite metaphors as well). In her study, Boroditsky (2001) compared performance of native speakers of English and Mandarin-English bilinguals on a series of psycholinguistic tasks, all conducted in English. The subjects were first exposed to visual stimuli that served as either horizontal or vertical spatial primes. Then, they were asked to answer a true/false question about time, with half of the questions using a horizontal metaphor (March comes before April) and half using purely temporal terms (March comes earlier than April). She found that both English-speaking and Mandarin-speaking subjects answered the before/after questions faster after horizontal primes than after vertical primes. They did differ, however, on the purely temporal questions: English speakers answered them faster after horizontal primes and bilinguals after vertical primes. These differences were taken to signify differences in the temporal thought in the two speech communities, English and Mandarin.

Boroditsky (2001) also examined the effects of age of acquisition and length of exposure on reaction time. She found that age of acquisition, but not the overall length of exposure, was a reliable predictor of patterns of response: The later in life the participants learned English, the more likely they were to show the vertical bias in their responses. It is unfortunate, however, that the researcher did not examine the effects of exposure to the L2 context, which are likely to differ from the effects of overall length of exposure (i.e., participants who had studied English for 10 years, 5 of them in the United States, may be much more competent and acculturated than those who studied English for 15 years, with only 1 or 2 of them in an English-speaking context). In future studies, it would be important to pay more attention to this variable and to conceptualizations of time on which speakers draw in daily language use and thus in habitual thought.

Emotions

The next area of investigation, emotion terms and discourses, has produced a wealth of studies that explored cross-linguistic differences and their implications for how emotions are constructed, and experienced, in different cultures (Athanasiadou
that are activated in interactions in the language in question and facilitate comprehension, recognition, and recall.

In turn, Heyman and Diesendruck (2002) explored how the distinction between the verb to be and its Spanish counterparts ser and estar influences the reasoning of Spanish-English bilingual children about human psychological characteristics. Ser commonly refers to permanent characteristics and properties; estar refers to temporary states and properties. The study showed that bilingual children had formed distinct conceptual representations of these verbs: They treated ser and to be as more likely to convey the stability of psychological characteristics than estar. In view of the difficulties experienced by native speakers of English in internalizing conceptual distinctions between ser and estar, this and similar contrasts (e.g., English to know versus Spanish saber/conocer or French savoir/connaître) could be productively explored in future research with bilinguals at different proficiency levels.

Future studies could also explore how bi- and multilinguals at different proficiency and cultural competence levels conceptualize and perform selves in relation to other persons. For instance, cultural competence in Japanese involves the ability to evaluate one's own status with regard to that of one's interlocutor(s) and to mark the differences linguistically in an appropriate manner without appearing either rude or exaggeratedly polite. Cultural competence in French or Russian involves the ability to differentiate appropriately between the informal and formal you (tu/vous or ty/vy). A nativelike conceptual representation of these lexicalized and grammaticized concepts would involve not only the knowledge of and about such distinctions, but also the knowledge of links between these categories and linguistic practices, namely, in which contexts particular personal pronouns, honorifics, forms of address, or caste terms are likely to be used.

Discourse

The next line of inquiry focuses on discourses, showing that members of different speech communities may rely on different interpretive stances, frames, and scripts to decide on the tellability of events and to reconstruct worlds in stories (Berman & Slobin, 1994; Chafe, 1980, 2000; Liebes & Katz, 1990; McCabe & Bliss, 2003; Sherzer, 1987; Slobin, 1996, 2000; Tannen, 1980). Ervin-Tripp's (1954/1973, 1964/1973, 1967/1973) pioneering explorations have shown that bicultural bilinguals often draw on different cultural themes when responding to visual prompts in their respective languages. In a somewhat different format, her work has been followed up by Koven (1998), who examined ways in which simultaneous Portuguese-French bilinguals talked about the same personal experience in their two languages. She found that these children of Portuguese immigrants drew on different linguistic repertoires when telling their stories: In Portuguese, they resorted to colloquial discourses they had learned from their peasant parents and relatives; in French, they drew on discourses of urban youth. As a result, the stories in French exhibited a more critical stance and indexed the storytellers as tough Parisian youths, while the stories in Portuguese took a less empowered stance, linked to the speakers' rural and immigrant origins. Together, these studies point to the possibility of bilingual speakers indexing different identities in their two or more languages through the use of distinct linguistic repertoires.

Cross-linguistic studies of storytelling also suggest that different speech communities may rely on different narrative conventions and structures, the latter seen by Bruner (1996) as evidence of narrative thought. Western stories typically have a problem resolution part, while in some other cultures, the conflict is created but not necessarily resolved in the story; this in turn influences comprehensibility by interlocutors raised in different narrative traditions (Holmes, 1997; McCabe & Bliss, 2003; Mistry, 1993). Moreover, while most European languages favor temporal (and often chronological) narrative sequencing, stories told in the American Indian language Kuna focus much more on location, direction, and ways in which actions are performed, so that Western listeners and readers have difficulty following these narratives in translation (Sherzer, 1987). Here, future studies could build on previous inquiry identifying speech communities in which narratives are constructed differently and examining ways in which bi- or multilingual speakers construct stories about the same event in the languages in question.

Autobiographical Memory

The last line of inquiry to be discussed is investigation of bilingual autobiographic memory. Several studies, most notably the work of Schrauf and
associates, suggest that bilinguals tend to retrieve memories in the same language in which they were encoded or at least to report them more vividly and in more detail if reporting in the language of the event (Javier, Barroso, & Munoz, 1993; Marian & Neisser, 2000; Schrauf, 2000; Schrauf & Rubin, 1998, 2003). The stories told in the language of the event are more elaborate, detailed, and emotional; they include more idea and thought units and evoke a higher level of imagery and emotional texture (Javier et al., 1993). At the same time, it is clear that most memories, like any other inner speech activities, can be translated according to the needs of the context, even though some aspects may be transformed or deleted in translation (cf. Pavlenko, 1998).

Interesting evidence regarding such transformations comes from memoirs of bilingual writers (Pavlenko, 1998, 2001; Todorov, 1994). These personal testimonies suggest that autobiographical tellings in the writers' two or more languages are often quite distinct and incompatible because the languages and the discourses associated with them shape the stories in distinct ways. This intriguing intersection between narrative conventions and autobiographic memory awaits further exploration with bilingual participants. Future inquiry will allow us to assess the impact of cross-linguistic differences in narrative structure and conventions on verbal recalls of events that took place in distinct linguistic contexts.

The studies of autobiographic memory also suggest that the metaphor of two (or more) different worlds is not simply a poetic affordance but an apt description of the lives of bicultural bilinguals. As Schrauf and Rubin (2003) state regarding bilingual immigrants, these bilinguals are people with dual sociocultural worlds or associational networks that consist of an innumerable concatenation of forgotten, half-remembered, and vividly remembered contexts in which [they] came to communicative and cultural competence, learning where and when and how to be unconsciously native (p. 134).

Interaction Between Languages and Thought in Bilingual Individuals

In sum, while recognizing that concepts not encoded in a particular language may nevertheless be imagined by its speakers, research has convincingly demonstrated that lexically and morphosyntactically encoded concepts sensitize speakers of a particular language to specific distinctions and ensure the ease and uniformity of everyday processes of encoding and decoding. In this, salient mental representations facilitate recall, categorization, and comprehension along the lines of habitual modes of thought and may complicate communication with members of other speech communities. We can also see that outcomes of the few studies with bilingual subjects are quite different from those with monolinguals. While the studies with monolingual participants show systematic intergroup differences (or lack thereof) in verbal and nonverbal performances, bilinguals may exhibit the seven, and possibly more, different performance patterns outlined in the beginning of the chapter.

To begin, some bilinguals draw on distinct conceptual representations when speaking their respective languages (Saunders & van Brakel, 1997a), experience different imagery related to the L1- and L2-based concepts (Slobin, 2000), draw on distinct discourses and linguistic repertoires, and index distinct discursive identities in their two languages (Ervin-Tripp, 1954/1973, 1964/1973, 1967/1973; Koven, 1998; Pavlenko, 1998, 2001). These verbal behaviors suggest coexistence of L1- and L2-based conceptual domains. Strong evidence has accumulated in support of the second pattern, L1-based conceptual transfer experienced by beginning and intermediate L2 learners (Becker & Carroll, 1997; Boroditsky, 2001; Graham & Belnap, 1986; Jarvis, 1994, 2000; Jarvis & Odlin, 2000; Pavlenko & Jarvis, 2002). The third pattern, internalization of new concepts, is well documented in the study of conceptually driven lexical borrowing, loan translation, and code switching in immigrant bilingualism (Pavlenko, 2002a; Romaine, 1995). Limited evidence is also available for processes such as shift (Caskey-Sirmons & Hickerson, 1977), convergence (Ervin-Tripp, 1961/1973), restructuring (Caskey-Sirmons & Hickerson, 1977; Pavlenko, 2002b), and attrition (Pavlenko, 2002b) of language-based concepts in the process of L2 socialization (for an in-depth discussion, see Pavlenko, 2002a).

Different types of bilinguals behave differently in experimental and natural contexts. Simultaneous bicultural bilinguals may develop representations different from those of sequential or late bilinguals; among late bilinguals, foreign language users and speakers with minimal exposure to the target language may differ from L2 users socialized into the target language community. Overall, language-influenced conceptual changes appear to be affected
by eight factors (for an in-depth discussion, see Pavlenko, 1999, 2000). Individual factors include (a) the speakers' language learning histories; (b) their language dominance and proficiency; (c) the degree of biculturalism and acculturation; and (d) expertise in the domain in question. Interactional factors include (e) the context of language interaction and (f) the linguistic status of the interlocutor (i.e., familiarity with the speakers' languages). Linguistic and psycholinguistic factors include (g) the degree of relatedness between the mental representations in the languages in question (concept comparability) and (h) the degree to which the concept of one language could be expressed in the other language and the means with which it is expressed (type of encoding).

Together, investigations of bilingual performance on verbal and nonverbal tasks and in natural contexts show that conceptual representations may be transformed in adulthood in the process of L2 socialization. These findings have important implications for the study of the bilingual mental lexicon. To date, most studies in this area have engaged with the lexical level of processing and representation. Conceptual processing, if included, was tested through naming and recognition tasks. The present discussion suggests that conceptual representations of bilingual individuals are complex and dynamic phenomena, and that to create a full picture of how specific concepts are represented in the memory of particular bilingual individuals or groups of individuals, a variety of verbal and nonverbal tasks will need to be used, including but not limited to naming, categorization, matching, inferencing, memory tasks, role playing, elicited storytelling, and most importantly, the study of habitual thought (i.e., spontaneous behavior in naturalistic contexts). Further research in this area has enormous potential for discovery of new effects of language on cognition that would be distinct from what we see in cross-linguistic explorations with monolingual speakers.

While I have outlined some directions for future inquiry in the respective sections, three more general comments need to be made. First, to date there are only a few studies that explore cross-linguistic differences in conceptual representation in bilingual individuals. None of these studies offers a rigorous combination of verbal and nonverbal tasks with extensive investigation of habitual modes of thought. In the future, it would be preferable to conduct studies that combine different types of evidence and explore effects in different kinds of bilingual individuals. Second, while at least a few studies were conducted with bilingual individuals, none were conducted with other types of multilinguals; this lacuna has yet to be filled. Finally, because recent inquiry suggests that sign language use may enhance individuals' face memory (Arnold & Mills, 2001), future inquiry also needs to consider possible cross-modal linguistic effects on the thought processes of hearing and deaf sign language users.

Conclusion

As Duranti (1997, p. 60) points out, the fact that our notions of language and worldview have changed means that some of the assumptions on which Sapir and Whorf's work was based are no longer taken for granted, and that the range of the phenomena investigated under the rubric of linguistic relativity has been modified and expanded. This chapter proposed a number of ways in which research on linguistic relativity could benefit from including bilingual subjects and, conversely, has shown how the study of the bilingual lexicon, memory, and cognition could gain from new directions offered in neo-Whorfian inquiry. Current empirical and phenomenological studies with bilingual subjects strongly suggest that languages may indeed create different worlds for their speakers, and that participation in discursive practices of a new target language community may transform these worlds. Together, these studies convincingly demonstrate that bilingualism could be extremely beneficial for enriching the speakers' linguistic repertoires and offering them alternative conceptualizations crucial for flexible and critical thinking. No one understood this better than Benjamin Lee Whorf, who more than 60 years ago argued that those who envision a future world speaking only one tongue, whether English, German, Russian, or any other, hold a misguided ideal and would do the evolution of the human mind the greatest disservice (1941/1956, p. 244).

Acknowledgments

This chapter has benefited tremendously from the generous and thoughtful comments on earlier drafts offered by David Green, Scott Jarvis, Michele Koven, Terry Odlin, Sanna Reynolds, and Bob Schrauf and by the editors of the volume, Judy Kroll and Annette de Groot. All remaining errors or inaccuracies are strictly my own.
colors and odors. Behavioral and Brain Sciences, 20, 188.
Duranti, A. (1994). From grammar to politics: Linguistic anthropology in a Western Samoan village. Berkeley: University of California Press.
Duranti, A. (1997). Linguistic anthropology. Cambridge, U.K.: Cambridge University Press.
Edwards, D. (1997). Discourse and cognition. London, England: Sage.
Enfield, N., & Wierzbicka, A. (Eds.). (2002). The body in description of emotion: Cross-linguistic studies. Special issue. Pragmatics and Cognition, 10, 1–25.
Ervin, S. (1973). Identification and bilingualism. In A. Dil (Ed.), Language acquisition and communicative choice. Essays by Susan M. Ervin-Tripp (pp. 1–14). Stanford, CA: Stanford University Press. (Original work published 1954)
Ervin-Tripp, S. (1973). Semantic shift in bilinguals. In A. Dil (Ed.), Language acquisition and communicative choice. Essays by Susan M. Ervin-Tripp (pp. 33–44). Stanford, CA: Stanford University Press. (Original work published 1961)
Ervin-Tripp, S. (1973). Language and TAT content in bilinguals. In A. Dil (Ed.), Language acquisition and communicative choice. Essays by Susan M. Ervin-Tripp (pp. 45–61). Stanford, CA: Stanford University Press. (Original work published 1964)
Ervin-Tripp, S. (1973). An issei learns English. In A. Dil (Ed.), Language acquisition and communicative choice. Essays by Susan M. Ervin-Tripp (pp. 62–77). Stanford, CA: Stanford University Press. (Original work published 1967)
Fishman, J. (1980). The Whorfian hypothesis: Varieties of valuation, confirmation and disconfirmation: I. International Journal of the Sociology of Language, 26, 25–40.
Foley, W. (1997). Anthropological linguistics. Oxford, U.K.: Blackwell.
Gentner, D., & Boroditsky, L. (2001). Individuation, relativity, and early word learning. In M. Bowerman & S. Levinson (Eds.), Language acquisition and conceptual development (pp. 215–256). Cambridge, U.K.: Cambridge University Press.
Gentner, D., & Goldin-Meadow, S. (Eds.). (2003). Language in mind: Advances in the study of language and thought. Cambridge, MA: MIT Press.
Gipper, H. (1976). Is there a linguistic relativity principle? In R. Pinxten (Ed.), Universalism versus relativism in language and thought. Proceedings of a colloquium on the Sapir-Whorf hypothesis (pp. 217–228). The Hague, The Netherlands: Mouton.
Gopnik, A. (2001). Theories, language, and culture: Whorf without wincing. In M. Bowerman & S. Levinson (Eds.), Language acquisition and conceptual development (pp. 45–69). Cambridge, U.K.: Cambridge University Press.
Graham, R., & Belnap, K. (1986). The acquisition of lexical boundaries in English by native speakers of Spanish. International Review of Applied Linguistics in Language Teaching, 24, 275–286.
Green, D. W. (1998). Bilingualism and thought. Psychologica Belgica, 38, 251–276.
Gumperz, J., & Levinson, S. (Eds.). (1996). Rethinking linguistic relativity. Cambridge, U.K.: Cambridge University Press.
Hardin, C., & Banaji, M. (1993). The influence of language on thought. Social Cognition, 11, 277–308.
Hardin, C., & Maffi, L. (Eds.). (1997). Color categories in thought and language. Cambridge, U.K.: Cambridge University Press.
Harkins, J., & Wierzbicka, A. (Eds.). (2001). Emotions in crosslinguistic perspective. Berlin, Germany: Mouton de Gruyter.
Harré, R. (Ed.). (1986). The social construction of emotions. Oxford, U.K.: Blackwell.
Heelas, P. (1986). Emotion talk across cultures. In R. Harré (Ed.), The social construction of emotions (pp. 234–266). Oxford, U.K.: Blackwell.
Heider, E. (1972). Universals in color naming and memory. Journal of Experimental Psychology, 93, 10–20.
Heyman, G., & Diesendruck, G. (2002). The Spanish ser/estar distinction in bilingual children's reasoning about human psychological characteristics. Developmental Psychology, 38, 407–417.
Hill, J., & Mannheim, B. (1992). Language and world view. Annual Review of Anthropology, 21, 381–406.
Hoffman, C., Lau, I., & Johnson, D. (1986). The linguistic relativity of person cognition: An English-Chinese comparison. Journal of Personality and Social Psychology, 51, 1097–1105.
Hoffman, E. (1989). Lost in translation. A life in a new language. New York: Penguin Books.
Hollan, D. (1992). Cross-cultural differences in the self. Journal of Anthropological Research, 48, 283–300.
Holmes, J. (1997). Struggling beyond Labov and Waletzky. Journal of Narrative and Life History, 7, 91–96.
Hunt, E., & Agnoli, F. (1991). The Whorfian hypothesis: A cognitive psychology perspective. Psychological Review, 98, 377–389.
Imai, M. (2000). Universal ontological knowledge and a bias toward language-specific categories
Macnamara, J. (1991). Linguistic relativity revisited. In R. Cooper & B. Spolsky (Eds.), The influence of language on culture and thought: Essays in honor of Joshua A. Fishman's 65th birthday (pp. 45–60). Berlin, Germany: Mouton de Gruyter.
Malotki, E. (1983). Hopi time. A linguistic analysis of the temporal concepts in the Hopi language. Berlin, Germany: Mouton de Gruyter.
Malt, B., & Sloman, S. (2003). Linguistic diversity and object naming by non-native speakers of English. Bilingualism: Language and Cognition, 6, 47–67.
Malt, B., Sloman, S., & Gennari, S. (2003). Universality and language specificity in object naming. Journal of Memory and Language, 49, 20–42.
Malt, B., Sloman, S., Gennari, S., Shi, M., & Wang, Y. (1999). Knowing versus naming: Similarity and the linguistic categorization of artifacts. Journal of Memory and Language, 40, 230–262.
Marian, V., & Neisser, U. (2000). Language-dependent recall of autobiographical memories. Journal of Experimental Psychology: General, 129, 361–368.
Markus, H., & Kitayama, S. (1991). Culture and the self: Implications for cognition, emotion, and motivation. Psychological Review, 98, 224–253.
Markus, H., & Kitayama, S. (1994). The cultural construction of self and emotion: Implications for social behavior. In S. Kitayama & H. Markus (Eds.), Emotion and culture: Empirical studies of mutual influence (pp. 89–130). Washington, DC: American Psychological Association.
McCabe, A., & Bliss, L. (2003). Patterns of narrative discourse: A multicultural, life-span approach. Boston: Allyn & Bacon.
Miller, K., & Stigler, J. (1987). Counting in Chinese: Cultural variation in a basic cognitive skill. Cognitive Development, 2, 279–305.
Mistry, J. (1993). Cultural context in the development of children's narratives. In J. Altarriba (Ed.), Cognition and culture: A cross-cultural approach to psychology (pp. 207–228). Amsterdam: Elsevier Science.
Miura, I. (1987). Mathematics achievement as a function of language. Journal of Educational Psychology, 79, 79–82.
Miura, I., Kim, C., Chang, C., & Okamoto, Y. (1988). Effects of language characteristics on children's cognitive representation of number: Cross-national comparisons. Child Development, 59, 1445–1450.
Miura, I., & Okamoto, Y. (1989). Comparisons of U.S. and Japanese first graders' cognitive representation of number and understanding of place value. Journal of Educational Psychology, 81, 109–113.
Mühlhäusler, P., & Harré, R. (1990). Pronouns and people: The linguistic construction of social and personal identity. Oxford, U.K.: Blackwell.
Munnich, E., Landau, B., & Dosher, B. (2001). Spatial language and spatial representation: A cross-linguistic comparison. Cognition, 81, 171–207.
Niemeier, S., & Dirven, R. (Eds.). (2000). Evidence for linguistic relativity. Amsterdam: Benjamins.
Noël, M., & Fias, W. (1998). Bilingualism and numeric cognition. Psychologica Belgica, 38, 231–250.
Nuyts, J., & Pederson, E. (Eds.). (1997). Language and conceptualization. Cambridge, U.K.: Cambridge University Press.
Ochs, E. (1993). Constructing social identity: A language socialization perspective. Research on Language and Social Interaction, 26, 287–306.
Paradis, M. (1979). Language and thought in bilinguals. In W. McCormack & H. Izzo (Eds.), The Sixth LACUS Forum (pp. 420–431). Columbia, SC: Hornbeam Press.
Pavlenko, A. (1998). Second language learning by adults: Testimonies of bilingual writers. Issues in Applied Linguistics, 9, 3–19.
Pavlenko, A. (1999). New approaches to concepts in bilingual memory. Bilingualism: Language and Cognition, 2, 209–230.
Pavlenko, A. (2000). L2 influence on L1 in late bilingualism. Issues in Applied Linguistics, 11, 175–205.
Pavlenko, A. (2001). In the world of the tradition I was unimagined: Negotiation of identities in cross-cultural autobiographies. The International Journal of Bilingualism, 5, 317–344.
Pavlenko, A. (2002a). Conceptual change in bilingual memory: A neo-Whorfian approach. In F. Fabbro (Ed.), Advances in the neurolinguistics of bilingualism (pp. 69–94). Udine, Italy: Udine University Press.
Pavlenko, A. (2002b). Bilingualism and emotions. Multilingua, 21, 45–78.
Pavlenko, A. (2002c). Emotions and the body in Russian and English. Pragmatics and Cognition, 10, 201–236.
Pavlenko, A. (2003). Eyewitness memory in late bilinguals: Evidence for discursive relativity. The International Journal of Bilingualism, 7(3), 257–281.
Pavlenko, A., & Jarvis, S. (2002). Bidirectional transfer. Applied Linguistics, 23, 190–214.
Pederson, E. (1995). Language as context, language as means: Spatial cognition and habitual language use. Cognitive Linguistics, 6, 33–62.
Pederson, E., Danziger, E., Wilkins, D., Levinson, S., Kita, S., & Senft, G. (1998). Semantic typology and spatial conceptualization. Language, 74, 557–589.
Potter, J., & Wetherell, M. (1987). Discourse and social psychology: Beyond attitudes and behavior. London: Sage.
Pütz, M., & Verspoor, M. (Eds.). (2000). Explorations in linguistic relativity. Amsterdam: Benjamins.
Rintell, E. (1984). But how did you feel about that? The learner's perception of emotion in speech. Applied Linguistics, 5, 255–264.
Romaine, S. (1995). Bilingualism (2nd ed.). Oxford, U.K.: Blackwell.
Rosaldo, M. (1980). Knowledge and passion: Ilongot notions of self and social life. Cambridge, U.K.: Cambridge University Press.
Rossi-Landi, F. (1973). Ideologies of linguistic relativity. The Hague, The Netherlands: Mouton.
Rumsey, A. (1990). Wording, meaning, and linguistic ideology. American Anthropologist, 92, 346–361.
Sapir, E. (1921). Language: An introduction to the study of speech. New York: Harcourt, Brace, & Company.
Sapir, E. (1929). The status of linguistics as a science. Language, 5, 207–214.
Saunders, B., & Van Brakel, J. (1997a). Are there nontrivial constraints on color categorization? Behavioral and Brain Sciences, 20, 167–179.
Saunders, B., & Van Brakel, J. (1997b). Color: An exosomatic organ? Behavioral and Brain Sciences, 20, 212–220.
Schrauf, R. (2000). Bilingual autobiographical memory: Experimental studies and clinical cases. Culture and Psychology, 6, 387–417.
Schrauf, R., & Rubin, D. (1998). Bilingual autobiographical memory in older adult immigrants: A test of cognitive explanations of the reminiscence bump and the linguistic encoding of memories. Journal of Memory and Language, 39, 437–457.
Schrauf, R., & Rubin, D. (2003). On the bilingual's two sets of memories. In R. Fivush & C. Haden (Eds.), Autobiographical memory and the construction of a narrative self: Developmental and cultural perspectives (pp. 121–145). Mahwah, NJ: Erlbaum.
Sherzer, J. (1987). A discourse-centered approach to language and culture. American Anthropologist, 89, 295–309.
Shweder, R., & Bourne, E. (1984). Does the concept of the person vary cross-culturally? In R. Shweder & R. LeVine (Eds.), Culture theory: Essays on mind, self, and emotion (pp. 158–199). Cambridge, MA: Harvard University Press.
Slobin, D. (1996). From thought and language to thinking for speaking. In J. Gumperz & S. Levinson (Eds.), Rethinking linguistic relativity (pp. 70–96). Cambridge, U.K.: Cambridge University Press.
Slobin, D. (2000). Verbalized events: A dynamic approach to linguistic relativity and determinism. In S. Niemeier & R. Dirven (Eds.), Evidence for linguistic relativity (pp. 107–138). Amsterdam: Benjamins.
Slobin, D. (2001). Form-function relations: How do children find out what they are? In M. Bowerman & S. Levinson (Eds.), Language acquisition and conceptual development (pp. 406–449). Cambridge, U.K.: Cambridge University Press.
Spelke, E., & Tsivkin, S. (2001). Initial knowledge and conceptual change: Space and number. In M. Bowerman & S. Levinson (Eds.), Language acquisition and conceptual development (pp. 70–97). Cambridge, U.K.: Cambridge University Press.
Stubbs, M. (1997). Language and the mediation of experience: Linguistic representation and cognitive orientation. In F. Coulmas (Ed.), The handbook of sociolinguistics (pp. 358–373). Oxford, U.K.: Blackwell.
Talmy, L. (1991). Path to realization: A typology of event conflation. Proceedings of the 17th Annual Meeting of the Berkeley Linguistics Society, 480–519.
Tamamaki, K. (1993). Language dominance in bilinguals' arithmetic operations according to their language use. Language Learning, 43, 239–262.
Tannen, D. (1980). A comparative analysis of oral narrative strategies: Greek and American English. In W. Chafe (Ed.), The Pear Stories: Cognitive, cultural, and linguistic aspects of narrative production (pp. 51–87). Norwood, NJ: Ablex.
Todorov, T. (1994). Dialogism and schizophrenia. In A. Arteaga (Ed.), An other tongue. Nation and ethnicity in the linguistic borderlands (pp. 203–214). Durham, NC: Duke University Press.
Towse, J., & Saxton, M. (1997). Linguistic influences on children's number concepts: Methodological and theoretical considerations. Journal of Experimental Child Psychology, 66, 362–375.
Vaid, J., & Menon, R. (2000). Correlates of bilinguals' preferred language for mental computations. Spanish Applied Linguistics, 4, 325–342.
Whorf, B. (1956). Language, thought, and reality (J. B. Carroll, Ed.). New York: Wiley.
Wierzbicka, A. (1985). The double life of a bilingual. In R. Sussex & J. Zubrzycki (Eds.), Polish people and culture in Australia (pp. 187–223). Canberra: Australian National University.
Wierzbicka, A. (1999). Emotions across languages and cultures: Diversity and universals. Cambridge, U.K./Paris: Cambridge University Press/Editions de la Maison des Sciences de l'Homme.
Zhang, S., & Schmitt, B. (1998). Language-dependent classification: The mental representation of classifiers in cognition, memory, and ad evaluations. Journal of Experimental Psychology: Applied, 4, 375–385.
Ingrid K. Christoffels
Annette M. B. de Groot
22
Simultaneous Interpreting
A Cognitive Perspective
ABSTRACT Simultaneous interpreting (SI) is one of the most complex language tasks
imaginable. During SI, one has to listen to and comprehend the input utterance in one
language, keep it in working memory until it has been recoded and can be produced in
the other language, and produce the translation of an earlier part of the input, all of this
at the same time. Thus, language comprehension and production take place simulta-
neously in different languages. In this chapter, we discuss SI from a cognitive per-
spective. The unique characteristics of this task and comparisons with other, similar,
tasks illustrate the demanding nature of SI. Several factors influence SI performance,
including the listening conditions and the language combination involved. We discuss
some processing aspects of SI, such as the control of languages and language recoding.
We ask whether experience in interpreting is related to some special capabilities and
discuss possible cognitive subskills of SI, such as exceptional memory skills. Finally, we
discuss the implications of SI for theories of language production.
two languages must be activated and controlled simultaneously (Grosjean, 1997), and theories of speech perception that assign articulation a crucial role in comprehension (e.g., Liberman & Mattingly, 1985) should be reconciled with the fact that in SI production and comprehension are performed simultaneously.

The Experimental Study of Simultaneous Interpreting

In trying to understand SI, researchers have generally taken three different approaches. The first approach concerns the detailed study of the output of the interpreting process under varying circumstances. The second approach is to regard SI as a complex task and as such to compare it with other tasks to gain more insight about the relevant processing components. For example, interpreting is often compared with shadowing, which involves the immediate verbatim repetition of what is heard. Interpreting and shadowing are similar in that both tasks involve simultaneous listening and speaking, but they are different in that shadowing does not require the input to be transformed.

The third approach regards SI as a complex skill and compares experienced professional interpreters with students learning SI or with untrained but proficient bilinguals. The hypothesis underlying this approach is that interpreters may possess specific task-relevant subskills. Superior processing in particular cognitive subskills would suggest that the interpreting experience itself may boost these skills, or that interpreters are self-selected on the specific abilities required for performing the task adequately.

Research on interpreting has its own methodological problems (e.g., Massaro & Shlesinger, 1997). A critical issue is that professional interpreters do not abound, so an adequate sample for any given study cannot always be obtained, especially if a specific language combination is required. Many studies are therefore prone to a lack of statistical power, making it hard to draw general conclusions from the data. Other methodological problems concern a lack of ecological validity of the experimental setting and the stimulus materials (e.g., Gile, 2000; but see Frauenfelder & Schriefers, 1997).

In the remainder of this chapter, we first discuss a number of essential characteristics and processing aspects of SI that together illustrate its cognitive complexity. We then examine a set of factors that are known to influence interpreting performance. Next, we review research that compares SI with similar tasks. Finally, we consider SI as a manifestation of expertise and address some issues that need to be resolved before SI can be modeled. But, before beginning our review of SI research, we describe briefly the different forms of interpreting and compare interpreting with translating to show that cognitively they should be regarded as distinct tasks.

Forms of Interpreting

In professional practice, two kinds of interpreting are common: simultaneous interpreting and consecutive interpreting. The main difference between these two forms of interpreting is the timing between input and output. In consecutive interpreting, an interpreter starts to interpret when the speaker stops speaking, either in breaks in the source speech (discontinuous interpreting) or after the entire speech is finished (continuous interpreting) (see also Gerver, 1976). The consecutive interpreter usually takes notes while the source speech is delivered. SI contrasts with consecutive interpreting in that the interpreter is required to listen and speak at the same time instead of alternating between listening and speaking. As a consequence, the cognitive demands of SI and consecutive interpreting are likely to be different. Consecutive interpreting puts large demands on long-term memory because it requires reciting a message into another language on the basis of memory and a few notes, whereas in SI constraints in online information processing are likely to constitute the main challenge to acceptable performance.

Mixtures of text-to-text translation and interpreting also exist. For example, in so-called sight interpreting, the interpreter produces a verbal translation of a written text (Moser-Mercer, 1995). SI from or into sign language is especially interesting because the two languages involved are in a different modality.

Interpreting Versus Translating

In many respects, translating and interpreting are very similar tasks. Both are modes of bilingually mediated communication for a third party (see also Neubert, 1997). These forms of language use are unique in the sense that interpreters and translators are not supposed to contribute to the content of the message that they have to transfer. In addition to monitoring what they say or write, as normal
speakers or writers would do, interpreters and translators have to match the content of what they say or write to the content of a source text.

The typical differences between translating and interpreting concern the modes of input and output. These are the visual and written mode in the case of translating and the auditory and verbal mode in the case of interpreting. There are other obvious differences between the two (see Gile, 1995; Padilla & Martin, 1992), some of which are likely to influence the language comprehension process. In SI, the input rate is determined by the speaker of the source text. The rate will usually be comparable to that in normal speech, that is, about 100 to 200 words per min. Speech is transient; any information missed is irretrievable. The clarity of input in interpreting can vary widely because of the variability of the speakers or because of variability of the quality of technical equipment and environmental circumstances. In translating, the source text is static and permanently available. It can be consulted and reread at a rate that suits the translator.

Regarding language production, there is a noticeable difference in the amount of output produced by interpreters and translators within a given time span. Interpreters usually work in pairs, taking turns approximately every 30 min. The speed of delivery is speaking rate. This amounts to up to approximately 4,000 words on average in a 30-min turn. Translators usually produce that amount of translated text in an entire day.

More important, there is only one "go" to produce a good interpretation, whereas iterative improvement of the target text is an essential component of the translation process (Gile, 1995; Moser-Mercer, Künzli, & Korac, 1998). When translating, there is also an opportunity to use dictionaries and to consult experts and colleagues. In contrast, interpreters have to acquire the relevant knowledge in advance. Moreover, unlike translating, interpreting always takes place directly in front of an audience. An advantageous consequence of this is that interpreters usually share the communicative context with the source speaker and the listeners. Also, just as the audience, the interpreter has access to extralinguistic information to aid comprehension (e.g., nonverbal communication and slides). In contrast, in translating, the translated text is typically the only source of information available to its readers.

A translated text is generally of a higher quality than an interpreted text, a fact that relates, in addition to cognitive demand differences between the tasks, to differences in the goals that need to be achieved in the two tasks. The readers of a translation expect a well-written text; therefore, the linguistic acceptability requirements are very high in translating. For interpreters, it is especially important to deliver clear target language, but the stylistic demands are those of ordinary speech, which implies that grammatically less-well-formed utterances are acceptable. A final noteworthy difference is that an interpreted text is usually shorter than the original source text, whereas a translated text is usually longer (Chernov, 1994; Padilla & Martin, 1992). The latter difference implies that interpreting involves a loss of information.

Characteristics of Simultaneous Interpreting

The Simultaneity of Comprehension and Production

One of the most salient features of SI is that two streams of speech have to be processed simultaneously: The input has to be understood, and the output has to be produced. Note that this implies that interpreters have a split conceptual attention (MacWhinney, 1997, and chapter 3, this volume). One conceptual focus is directed to understanding the input; the other focus is on conceptualizing and producing an earlier part of the message. Past research suggests that interpreters exploit natural pauses and hesitations in the source speech to reduce simultaneous processing to a minimum (Barik, 1973; Goldman-Eisler, 1972, 1980).

In an analysis of the temporal characteristics of source and target delivery patterns, Barik (1973) confirmed that interpreters proportionally speak more during pauses in the input than would be expected if the input and output patterns were independent (but see Gerver, 1976). When taking this into account, about 70% of the time that interpreters are speaking, they are simultaneously listening to input (Chernov, 1994). In other words, most of the time interpreters have to cope with simultaneous comprehension and production of language (see also Goldman-Eisler, 1972).

The Lag Between Source and Target Message

The production of the target message usually lags behind that of the source message by a few seconds.
Before being able to produce an adequate interpreting output, a certain amount of input has to be available. This lag, the so-called ear-voice span, is measured as the number of words or seconds between the input and the corresponding output. Average lags reported for interpreting are longer than for shadowing. For interpreting, the average lag varies between 4 and 5.7 words (Gerver, 1976; Goldman-Eisler, 1972; Treisman, 1965), whereas for shadowing it varies between 2 and 3 words (Gerver, 1976; Treisman, 1965). Consistent with Barik's study (1973), which reported lags of 2 to 3 s for interpreting, in our laboratory we observed average lags of about 2 s for interpreting and 1 s for shadowing. We estimated this to be equivalent to about 5 words for interpreting and between 2 and 3 words for shadowing (Christoffels & De Groot, 2004).

The ear-voice span is likely to be influenced by a number of factors, such as the language of input (Goldman-Eisler, 1972). Even so, the reported average span across the various studies seems consistent. In fact, as discussed in the section on determinants of interpreting output, some input manipulations do not influence the ear-voice span. The span appears to result from an interplay between two contrasting factors. The first is that there is an advantage in waiting as long as possible before starting to produce the translation. The longer the actual production is delayed, the more information about the intended meaning of the input is available (see also Barik, 1975; Kade & Claus, 1971) and the lower the chance of misinterpretation because ambiguities may be resolved.

In support of this, in a study on sign language interpreting Cokely (1986) observed that the number of errors was negatively correlated with time lag. Furthermore, Barik (1975) suggested that specific difficulties observed in SI with function words (e.g., to, for, as) are caused by misinterpretation because of too short an ear-voice span. Because these words are highly ambiguous without sufficient context, they may lead to interpreting errors when translated before the intended meaning is fully resolved.

In contrast, there is also an advantage in keeping the lag as short as possible because a short lag taxes memory less than a long lag. With a long lag,

To conclude, there appears to be an optimal ear-voice span that is a compromise between the length of the stretches of input required for full understanding and the limits of working memory. The result of these opposing demands settles on an average lag of four to five words (see also Anderson, 1994; Goldman-Eisler, 1980).

The Unit of Interpreting

Closely related to the issue of an optimal ear-voice span is the question of what constitutes the unit (chunk) from which SI output is built. The interpreting unit is probably larger than a single word because the span consists of several words on average. Moreover, literal word-by-word translation would render an unintelligible interpretation, if only because languages often differ in word order, and single words do not always have an exact translation equivalent. So, rather than translating each incoming word separately, interpreting usually involves rephrasing at a higher level (Goldman-Eisler, 1980; Schweda-Nicholson, 1987).

In an analysis of a large number of translation chunks, Goldman-Eisler (1972) found that for about 92% of these chunks the ear-voice span consisted of at least a complete noun phrase plus verb phrase, from which she concluded that the verb phrase is an especially crucial part of the input chunk. Apparently, grammatical information is needed before interpreting is possible, and the clause may be the favored unit in interpreting. This is also indicated by the tendency of interpreters to postpone the translation when the verb is uttered late in the input clause. Furthermore, Goldman-Eisler found that in 90% of the cases interpreters started to translate before a natural pause in the source speech occurred, which suggested that interpreters do not merely mirror the input chunking of the speaker but impose their own segmentation of the text. Nevertheless, Barik (1975) found that the more the speaker of the source text paused at grammatical junctions, the better the performance. The usefulness of such input parsing again converges with the idea of the clause as the unit of processing in interpreting.

In an eye-tracking study involving sight interpreting of ambiguous phrases presented in context,
the interpreter runs the risk of loss of information McDonald and Carpenter (1981) reported that
from working memory, with the effect of losing the during the rst pass, parsing was very similar to
thread of the input speech. Barik (1975) reported parsing in ordinary reading. The interpretation was
that the longer the interpreter lagged behind, the typically produced during a time-consuming sec-
greater the likelihood that source text content was ond pass of a chunk, when phrases were reread.
omitted. They concluded that parsing or chunking in (sight)
458 Aspects and Implications of Bilingualism
translation is initially very similar to the analogous processes in reading comprehension.

It thus seems that a good candidate for the preferred unit of interpreting is the clause. Interpreting strategies, which may also influence the size of the chunking unit, are discussed in a later section.

Processing Aspects of Simultaneous Interpreting

Control of Languages

It is a basic requirement of SI to produce pure target language, that is, language that does not contain any language switches. Yet, the nature of the task demands that both languages are simultaneously activated while performing the task. Therefore, control of languages is crucial to SI. To explain how languages are kept separate and interference of the nontarget language is prevented in common speech of bilinguals, a number of theories propose a mechanism of external global inhibition or deactivation of activity in the nontarget language system or global activation of the target language (De Bot & Schreuder, 1993; Dijkstra & Van Heuven, 1998; D. W. Green, 1986, 1998, and chapter 25, this volume; Grosjean, 1997; Paradis, 1994). Experiments on language switching provide evidence for a general inhibitory control mechanism (e.g., Meuter & Allport, 1999). That the control of languages may be especially important in SI was suggested by the results of a positron emission tomographic (PET) study by Price, Green, and Von Studnitz (1999). These authors reported that word translation in comparison to reading in the first language (L1) and second language (L2) increased the activity of the areas in the brain believed to control action.

The inhibitory control model proposed by D. W. Green (1986, 1998) addresses the issue of language control most directly. In this model, the bilingual avoids speaking in the unintended language by suppressing activity in the nontarget language system. So-called language task schemas compete to determine the output. Top-down control is achieved by an executive system that boosts the activation of the target task schema (and suppresses activation of the competing task schemas). Translation is given as an example task in which an alternative schema must be suppressed. According to D. W. Green (1998, and chapter 25, this volume), presentation of a word in the L1 that has to be translated in the L2 will, in addition to a translation schema, also trigger the naming-in-L1 task. To translate from L1 to L2, an L1 production schema must be inhibited, and the schema for L2 must be activated; that schema in turn inhibits lemmas that are tagged to belong to L1. If word translation is a task that already involves high levels of control, then the control demands imposed by SI on the cognitive system must be very high indeed.

SI may be problematic for any activation-inhibition account because, unlike common language production by bilinguals, SI requires activation of both languages simultaneously. A number of authors have considered ways in which SI might be integrated into existing theoretical frameworks and the implications for language selection and control.

In a framework toward a neurolinguistic theory of SI, Paradis (1994, 2000) proposed the subset and the activation threshold hypotheses (see also Paradis, 1997). The subset hypothesis states that all the elements of one language are strongly associated into a subset that behaves like a separate network that can be separately activated or inhibited. The activation threshold hypothesis holds that an item is selected when its activation exceeds that of its competitors, which are simultaneously inhibited (their activation thresholds are raised). More impulses are required to self-activate a trace (in production) voluntarily than to have it activated by external stimuli (in comprehension). When a bilingual speaks in one language only, the activation threshold of the nonselected language is raised sufficiently to prevent interference during production (cf. the notion of global inhibition). Paradis (1994, 2000) suggested that in SI the threshold of the source language is higher than the threshold of the target language because production requires more activation than comprehension. It is not clear whether such an activation pattern allows for the production of target language only, without interference from the source language, or what the consequences are of higher activation of the target language for comprehension of the source language.

De Bot (2000) discussed a bilingual version of Levelt's model for language production in relation to interpreting (see De Bot & Schreuder, 1993).¹ Like Paradis, De Bot assumed that language-specific subsets develop, that spreading activation is the main mechanism of selection of elements and rules, and that languages can be separately activated as a whole. In the bilingual counterpart of Levelt's model, all linguistic elements are labeled
Simultaneous Interpreting 459
for language. At the conceptual level (the preverbal message), it is specified what the language of a particular output chunk should be. To prevent the selection of source language elements, De Bot (2000) suggested that in SI the target language cue has a high value, so that only elements from that particular language are selected.

Finally, Grosjean (1997) attempted to integrate SI within the theoretical concept of the language mode continuum, which entails that bilinguals may find themselves on a continuum with the extreme points of being in a completely monolingual mode (complete deactivation of the other language) or in a completely bilingual mode (both languages are activated, and language switches can occur). To allow for SI, Grosjean added input and output components to the continuum and suggested that the activation of these two components, rather than the level of activation of each language, varies. At the input side, both languages are activated to allow for comprehension of input and monitoring of output. At the output side, the source language output mechanism is inhibited (in the monolingual mode). Grosjean acknowledged that, even with the addition of these two components to his model, unanswered questions remain, such as how the interpreter is able to switch occasionally from target to source language while for production the source language should be strongly inhibited.

To our knowledge, no past studies have examined language control in SI. Nevertheless, it should be clear that the control of languages is an important aspect of processing in SI. In the final part of this chapter, this issue, and specifically the issue of selectively producing target language in SI, is discussed further.

Language Recoding

What exactly happens when the source language is recoded in the target language? Theoretically, two interpreting strategies have been distinguished: a meaning-based strategy and a transcoding strategy (e.g., Anderson, 1994; Fabbro & Gran, 1994; Fabbro, Gran, Basso, & Bava, 1990; Isham, 1994; Isham & Lane, 1994; Massaro & Shlesinger, 1997). These strategies have also been referred to as vertical and horizontal translation (De Groot, 1997, 2000) or Strategy I and Strategy II, respectively (Paradis, 1994).

Meaning-based interpreting is conceptually mediated interpretation. The interpreter is thought to retain the meaning of chunks of information and to recode the meaning of these chunks in the target language (Fabbro & Gran, 1994). In other words, according to this strategy, interpreting involves full comprehension of the source language in a way similar to common comprehension of speech. From the representation of the inferred meaning, production takes place in the target language.

The transcoding strategy involves the literal transposition of words or multiword units. The interpreter supposedly translates the smallest possible meaningful units of the source language that have an equivalent in the target language. Transcoding is often called a word-based or word-for-word strategy (e.g., Fabbro et al., 1990), but if this strategy strictly involved replacing single words by their translation equivalents, its role has to be limited because the resulting interpretation would be unintelligible. Paradis (1994) proposed that transcoding can take place at different levels of the language system (phonology, morphology, syntax, and semantics) by automatic application of rules. One linguistic element is directly replaced by its structural equivalent in the target language. Figure 22.1 depicts the two alternative strategies. They are usually not considered mutually exclusive; both strategies can be available to the experienced interpreter.

The important difference between these two strategies is that, in transcoding, small translation units are transposed into the other language without necessarily first being fully comprehended and integrated into the discourse representation, whereas the meaning-based strategy clearly involves full comprehension, including grasping the pragmatic intention of the input, after which the constructed meaning is produced in the target language.

According to Paradis (1994), translation-specific systems subserve the transcoding strategy. Connections between equivalent items in the two languages may function independently of those that subserve each of the separate languages: Patients showing paradoxical translation after brain damage were able to translate into a language that was not available for spontaneous production, but comprehension of both languages was normal at all times (Paradis, Goldblum, & Abidi, 1984, in Paradis, 1994; see also D. W. Green, chapter 25, this volume). According to Paradis, this shows that there are four neurofunctionally independent systems: one underlying L1, one underlying L2, and two translation-specific systems involving connections between the two languages, both from L1 to L2 and vice versa. The meaning-based strategy does not appeal to these systems. Meaning-based interpreting depends on implicit linguistic competence,
Figure 22.1 Two alternative interpreting strategies (based on Paradis, 1994). The light arrows depict the
meaning-based strategy. The source language (SL) utterance is fully comprehended and represented at a
nonverbal conceptual level before its meaning is produced as an utterance in the target language (TL). The
dark arrows depict the transcoding strategy, according to which particular parts of the utterance (e.g., a
certain word or grammatical construction) are directly transcoded into their equivalent in the target language.
acquired incidentally and used automatically, whereas transcoding depends on metalinguistic knowledge that is learned consciously and that is available to conscious recall.

Transcoding, or more specifically, word-based interpreting, is often regarded as an inferior interpreting procedure and is associated with unacceptable output (e.g., Shreve & Diamond, 1997). It is supposedly used relatively often by inexperienced interpreters, in the case of difficult source text (e.g., highly technical text), or under stress (Fabbro & Gran, 1994). In contrast, Paradis (1994) argued that beginning interpreters often employ the meaning-based strategy, whereas skilled interpreters may use transcoding because the rules underlying transcoding presumably have to be learned.

Transcoding at the lexical level does not necessarily imply that words are translated via direct lexical links between the form representations of the corresponding source language and target language words, as in the word association model for word translation (Potter, So, Von Eckardt, & Feldman, 1984). Translation of individual words can be semantically mediated, and there is evidence that even at an early stage of learning an L2, this is indeed what happens (De Groot, 2002; but see Kroll & Stewart, 1994). If the semantic level is distinguished from a conceptual level of representation, with the former storing the lexical meaning of words and the latter containing multimodal, nonlinguistic representation structures (Pavlenko, 1999; see also Francis, chapter 12, this volume), transcoding at the word level can be regarded as
other hand, while the input is being transcoded into matched output, it is likely that this input is simultaneously processed further, up to full comprehension, resulting in a level of comprehension that matches comprehension resulting from pure meaning-based interpreting.

Self-Monitoring

Speakers are assumed to monitor their own speech, and the self-monitoring system involved is thought to employ the comprehension system (Levelt, 1989). However, in SI the comprehension system is already occupied with understanding the source text (Frauenfelder & Schriefers, 1997). This raises the question of how monitoring in SI comes about. That interpreters indeed monitor whether the produced translation is correct has been suggested by several authors (Gerver, 1976; Isham, 2000; Lonsdale, 1997) and is evident from the self-corrections that we have observed in our own data and that were reported by others (e.g., Gerver, 1976).

Most of the theoretical accounts of SI discussed in previous sections have incorporated some form of output monitoring. In both Gerver's (1976) and Moser's (1978) model, the monitoring of output is performed by comparing the meanings of the source message (retained in the input buffer) and the target message before production takes place. In Paradis's account (1994), it occurs after production has taken place. Paradis himself noted that the comparison between the meaning of the source and target messages is not specified in his model and that there is no consideration of what happens when the output is not satisfactory.

The issue of output monitoring in SI is particularly interesting because apparently three speech streams in two languages reside simultaneously in the language system: the comprehension of input, the production of output, and the monitoring of output. Especially for the comprehension system, the situation is complicated because it needs to handle source language input and target language output simultaneously. How these speech streams can all co-occur and how they are kept separate from one another are questions that still have to be resolved.

Memory Processes

SI poses a great burden on working memory because interpreters simultaneously have to store information and perform all sorts of mental operations to comprehend, translate, and produce speech. In addition, because interpreters monitor their output, it may be necessary to keep some sort of representation of the input phrase available until after production in the target language.

One of the best-known models of working memory is that of Baddeley and colleagues (see e.g., Baddeley & Logie, 1999; Gathercole & Baddeley, 1993). This multiple-component model consists of a central executive and two slave systems, specialized for the temporary storage of phonologically based material and of visuospatial material. These subsidiary systems are called the phonological loop and the visuospatial sketchpad, respectively. A fourth component has been proposed, the episodic buffer, which is a limited-capacity store capable of integrating information from different sources in a multidimensional code (Baddeley, 2000). The central executive is seen as a mechanism controlling processes in working memory, including the coordination of the subsidiary systems, the manipulation of material held in these systems, and the control of encoding and retrieval strategies.

The phonological loop is specialized in maintaining verbally coded information and is therefore the most relevant slave system for SI. It consists of two parts: the phonological store and the subvocal rehearsal process. The phonological store retains material in a phonological code, which decays over time. The subvocal rehearsal process serves to refresh the decaying representations in the store. Short-term recall for lists of words is disrupted when participants continuously articulate irrelevant syllables during the presentation of these words, a technique called articulatory suppression (e.g., Baddeley, Lewis, & Vallar, 1984). Articulatory suppression also leads to reduced recall of auditorily presented short discourse (Christoffels, 2004). The requirement to maintain information during speech production may be an important aspect of the task difficulty of SI because producing speech during SI resembles articulatory suppression. In fact, one may expect reduced recall because of the disruption of the rehearsal process in all tasks in which comprehension and verbal production are involved simultaneously (see also Darò & Fabbro, 1994; Isham, 2000).

After interpreting, text recall is indeed worse than after listening to it (e.g., Christoffels, 2004; Darò & Fabbro, 1994; Gerver, 1974b; Isham, 1994). Two possible causes for the reduced recall after SI can be deduced from the articulatory loop model. First, production of the target speech may prevent subvocal rehearsal. Second, apart from the
incoming source language, the interpreter's own voice enters the phonological store, possibly causing interference.

Isham (2000) found that verbatim recall after articulatory suppression was worse than recall after both common listening and dichotic listening (listening to two speech streams, one of them presented to each ear). He concluded that reduced recall after SI is mainly caused by the actual production of speech and not by the fact that two speech streams enter the phonological store simultaneously. Another reason for the reduced recall after SI may be the higher cognitive demands of simultaneous comprehension and production.

The pattern of results found when comparing recall following interpreting with recall following other, similar tasks is not consistent, however: Recall after interpreting was found to be better than after shadowing (Gerver, 1974b), but digit span performance was found to be worse in an interpreting condition than in any of the remaining conditions, including shadowing (Darò & Fabbro, 1994). Note, however, that shadowing involved verbal repetition of digits presented 1 s apart; these circumstances may actually support recall. Finally, no differences in recall whatsoever were obtained between conditions of SI, shadowing, articulatory suppression (Christoffels, 2004), or paraphrasing (Christoffels & De Groot, 2004). (Paraphrasing in this context involved rephrasing the meaning of a sentence in the same language but in different words or using an alternative grammatical construction; see Moser, 1978.) These inconsistent results are likely to be caused by differences in the relevance of long- and short-term memory in recall performance across these studies.

In conclusion, the relevant studies disagree on whether interpreting and shadowing lead to different memory performance, but clearly memory performance after interpreting is worse than after just listening to a text. Interference from articulatory activity during interpreting forms at least a partial explanation for this differential memory performance. This explanation is supported by the better sentence recall of sign language interpreters in comparison to spoken language interpreters (Isham, 1994).

Working memory is important in ordinary language processing (see Gathercole & Baddeley, 1993). It remains to be seen whether working memory has a role in interpreting beyond its role in ordinary language processing. That such is the case is suggested by studies that indicated that professional interpreters possess outstanding memory skills (see the section on cognitive skills). Apart from the phonological loop, the central executive and the episodic buffer are bound to be important. They are presumably involved in the activation of relevant information in long-term memory, the suppression of irrelevant information, the integration of information, and the coordination of the different processes during SI (see also Bajo, 2002).

Determinants of Interpreting Output

Listening Conditions: Input Rate, Information Density, and Sound Quality of Input

Input rate influences the rate at which information has to be processed. Consequently, it also influences interpreting performance. It is not always the case, however, that the faster the input rate is, the harder interpreting becomes. Slow, monotonous delivery of the source message can be as stressful as a speeded presentation (Gerver, 1976). According to Gerver, rates between 100 and 120 words per min are comfortable for the interpreter. When comparing the effect of increasing the input rate in shadowing and interpreting (from 95 up to 164 words per min), he found that the proportion of correctly shadowed text decreased only at the two highest rates, whereas in SI, performance decreased further with each increase in input rate. Moreover, shadowers maintained a steady ear-voice span of 2 to 3 words at all input rates and increased their articulation rates as input rate increased. In contrast, the interpreters' span increased from 5 to 8.5 words, and their output rate remained the same, indicating that they paused more and spoke less the higher the input rate (Gerver, 1969, in Gerver, 1976).

Shadowing performance is more accurate than SI performance, both for bilinguals not trained in SI (Treisman, 1965) and for SI professionals (Gerver, 1974a). Treisman investigated the effect of information density rather than input rate on accuracy of performance. Interpreting suffered more than shadowing from increasing information density. No effect of information density on the ear-voice span was found. The last result, however, was based on six participants only, so this null effect may be due to a lack of statistical power.

Gerver manipulated the amount of noise in the input and found that this manipulation had a larger
effect on the number of errors in interpreting than in shadowing. The ear-voice span again remained constant irrespective of the amount of noise. Both findings suggest that interpreters sacrifice accuracy to keep a constant ear-voice span (Gerver, 1976). Alternatively, the participants may already have performed at their maximum lag in the relatively easy conditions and were therefore unable to increase their ear-voice span any further when the amount of noise or the information density increased (see the discussion on the lag between source and target language).

To summarize, these findings indicated that interpreting is more difficult and more sensitive to factors influencing task difficulty than shadowing. Furthermore, they showed that not all factors that increase task difficulty also affect the ear-voice span.

Translation Direction and Language Combination

A recurring question concerns the role of the direction of translation in interpreting. It is often claimed that interpreting is easier into than from one's native language, which is typically the interpreter's dominant language (see Barik, 1975; Gerver, 1976; Gile, 1997; Treisman, 1965). In word translation studies, such a directional effect has been observed by some authors, who have shown that translating from L1 into L2 is slower and more prone to errors than translating from L2 into L1 (e.g., Kroll, Michael, Tokowicz, & Dufour, 2002; Kroll & Stewart, 1994), but others have reported null effects or even the opposite effect (e.g., De Groot & Poot, 1997; La Heij, Hooglander, Kerling, & Van der Velden, 1996; see for discussion, Kroll & De Groot, 1997).

In interpreting studies, there is little experimental evidence in support of any directional effect. Rinne et al. (2000) compared, using PET, interpreting from and into the native language, among other things. They found more extensive activation during translation into L2, possibly reflecting differences in difficulty between the two translation directions. Treisman (1965) found that both French-dominant and English-dominant bilinguals (without interpreting experience) were better when interpreting from English into French than when interpreting in the reverse direction. In a study on allocation of attention and text type, Darò, Lambert, and Fabbro (1996) found no effect of translation direction. Finally, Barik (1973, 1994) provided a detailed analysis of translation direction data of three professional interpreters and three inexperienced participants. For the professionals, the numbers of errors and omissions were the same for the two directions. Interestingly, the participants without experience in SI performed better when interpreting from L1 into L2 than vice versa. To conclude, so far no consistent effect of translation direction has been obtained.

It is possible that the particular language combination involved influences the difficulty of interpreting: The more the two languages involved deviate from one another on the lexical, morphological, syntactic, semantic, and pragmatic levels, the more difficult SI is likely to be. For example, Barik (1975) observed that syntactic differences between source and target language might cause problems. If, for instance, certain grammatical constructions specific to a (source) language are transferred into the target language, awkward or ungrammatical target language may result. Note that such an influence of the source language on the target language may indicate a role for the transcoding strategy in SI discussed in the section on language recoding.

Goldman-Eisler (1972) found a longer ear-voice span for interpreting from German to English than from English to French or French to English. The author attributed this finding to the fact that, in German but not in English and French, the verb frequently follows the object (subject-object-verb order). Because the minimal translation unit is likely to be a clause (as discussed earlier), when interpreting from German into English the interpreter may have to wait for the verb in the input, causing lengthening of the ear-voice span. Similar problems may arise when interpreting from languages with occasional verb-subject-object order, such as Arabic (Gile, 1997; MacWhinney, 1997, and chapter 3, this volume). It seems, then, that language combinations differ in the extent to which they pose demands on working memory. As a consequence, they may differ in the ease with which an interpretation can be produced.

The effort model of SI (Gile, 1995, 1997) provides a capacity account of why effects of language combination may arise. This model discusses SI in terms of a limited capacity system. Three basic concurrent, conscious, and deliberate efforts are presented: the listening and analysis effort, the production effort, and the memory effort. Each effort represents all the different processes involved in comprehension, production, and memory, respectively. Moreover, a separate coordination effort is postulated. At any point in time, the three
basic efforts are processing different speech segments. The total capacity requirement is the sum of all four efforts. It varies depending on the specific information segments that are processed and therefore fluctuates in accordance with the incoming speech flow. As a consequence, errors may even occur with relatively easy source segments because of a sequential failure originating from an upstream difficulty in the source message.

For example, when capacity needed to produce a difficult chunk is not immediately available, this causes an increased memory load because incoming input has to be stored until production is possible. The additional capacity required for memory may diminish capacity for comprehension, which in turn may lead to problems in the comprehension of the next speech segment. Specific difficulties with certain language combinations can be expected for similar reasons. For example, syntactic differences between source and target language that force an interpreter to wait before formulating the target utterance tend to increase the load on the memory effort.

To summarize, the sparse experimental data suggest that, of the two variables discussed in this section, that is, translation direction and language combination, the latter may be the more important determinant of interpreting performance.

Source Text Characteristics

Redundancy and the Possibility of Anticipation

The characteristics of the source text, especially the degree to which it is redundant, are likely to have an effect on interpreting performance. Chernov (1994) stated that, given the large processing load involved, SI of nonredundant speech (e.g., poetry or legal papers) should be impossible. He assumed that speech redundancy normally enables the anticipation of subsequent input.

Other authors have acknowledged the importance of anticipation in SI as well (e.g., De Bot, 2000; Moser-Mercer, 1997). In Moser's model of SI (1978), a decision point is included that allows for anticipation. On a decision that prediction of input is possible, current input is discarded. That

not await the entire sentence before starting to interpret (Moser-Mercer, Frauenfelder, Casado, & Künzli, 2000).

If at discourse level a text is highly structured according to a familiar schema, this may help predict what comes next. In a pilot study, Adamowicz (1989) presented SI students with a prepared, structured text and a spontaneous text. Adamowicz argued that the prepared text was more predictable than a spontaneous text, and that the difference in predictability between the two text types should influence the ear-voice span because anticipation allows for a shorter lag between speaker and interpreter in the case of prepared text. This prediction was substantiated by the data. Note, however, that Adamowicz's line of argument and her data are contrary to the commonly held belief that interpreting is only feasible in the case of spontaneous speech because it is more redundant, has a lower information density, and contains more hesitations than a prepared text (e.g., Anderson, 1994; Chernov, 1994; Gile, 1997).

Finally, the context of a source text and prior knowledge of the topic may make the text more predictable, help activate relevant registers in memory, and help select the most salient units of meaning from memory (see De Bot, 2000). Anderson (1994) tested two factors that interpreters traditionally believe to be sources of contextual information that are important for interpreting: the amount of text-relevant knowledge the interpreter has prior to the interpreting event and the presence of visual information while interpreting (e.g., the speaker). She found no difference in quality of SI when professional interpreters received a complete text of the speech beforehand, a summary of the speech, or no information other than its title. Anderson also obtained no difference between conditions with and without visual information of the speaker on video. Similarly, Jesse, Vrignaud, Cohen, and Massaro (2001) found no superior SI performance when presenting visual information on speech lip movements together with auditory speech. Clearly, further research is needed to establish what role these types of contextual information play in SI, if any.
interpreters indeed anticipate subsequent input is
evidenced by the fact that they sometimes produce Manipulation of Texts Daro` et al. (1996) studied,
a translation of a part of the source text that has among other things, the role of text difculty in SI.
not yet been produced by the speaker (e.g., Besien, They found that the number of errors was larger
1999; Gernsbacher & Shlesinger, 1997). In fact, a for the difcult texts, which were more syntacti-
certain amount of anticipation is always involved cally complex and contained more low-frequency
in interpreting because the interpreter usually does words than the easy texts.
Barik (1975) observed difficulties not only for function words and grammatical structures that differ between source and target language, but also for some relatively common, notably abstract words. He suggested these words might be problematic because they may have different translation equivalents depending on the context. It would be interesting to determine whether these observations hold up experimentally and whether factors known to influence single-word translation (e.g., word frequency and word concreteness) affect SI performance as well.

Van Hell (1998) found that, for single-word translation in a highly constrained sentence context, the effects of word concreteness and cognate status were attenuated as compared to these effects on word translation in isolation (the variable cognate status is a measure of the orthographic and phonological overlap between the words in a pair of translation equivalents; compare the noncognate word pair bike and its Dutch equivalent fiets with the cognate word pair cat and its equivalent kat).

Incidentally, an effect of word manipulations such as cognate status would point at the use of the transcoding strategy in SI because, according to the meaning-based interpreting strategy, the interpreting output is produced from relatively large chunks of input coded in a nonverbal conceptual form. It should therefore not matter whether word equivalents in source and target language are cognates.

Shlesinger (2000b) examined the effect of some of these word-type manipulations on interpreting. She embedded different types of strings containing adjective modifiers (e.g., delicate, immature, fractured, vulnerable ego) in six text segments and looked, among other things, at the effect of the length of the input strings and whether they contained true or false cognates. False cognates, or interlingual homographs, are orthographically similar (or identical) words that belong to two different languages but that do not share meaning across these languages (for example, the English word slim means clever in Dutch). Suppressing a false cognate presumably requires effort; the interpreter must assess whether a cognate orthographic form involves a true or a false cognate and must then access the appropriate target language replacement (Gernsbacher & Shlesinger, 1997). Therefore, the presence of false cognates was expected to influence performance. However, Shlesinger found better performance for short than for long words in the input strings (i.e., a word length effect), but no effect of false cognates was found. This null effect was qualified by another finding: Only a surprisingly small part of the manipulated strings was actually interpreted (only one of four modifiers), reducing the chance of a false-cognate effect to materialize. It is likely that the modifiers may have been regarded as redundant information that can be easily skipped, whether automatically or deliberately (Shlesinger, 2000b).

To summarize, text type and text difficulty are likely to influence SI, and there is some evidence that corroborates this suggestion. Although it is not clear which text characteristics play the largest role in SI, an important variable may be whether parts of the input can be easily anticipated. Specific word properties, like word length, may influence interpreting output as well.

Simultaneous Interpretation Versus Similar Tasks

Mental Load and Stress

Several studies have considered the role of mental load and stress in interpreting in comparison to other, similar tasks. A number of these studies used the finger-tapping version of a verbal-manual interference paradigm. Finger tapping is interrupted by the processing demands of another (cognitive) task, and this interference is larger the more demanding this other task is, thus indicating the cognitive load that is involved. A. Green, Sweda-Nicholson, Vaid, White, and Steiner (1990) found that interference on tapping was larger for interpreting (and paraphrasing) than for shadowing, indicating that the former is a cognitively more demanding task.

The finger-tapping paradigm has also been used to infer lateralization of language. Concerning SI, the question posed in this type of research was whether interpreters, bilinguals, and monolinguals showed different lateralization patterns in L1 and L2 (see, e.g., Corina & Vaid, 1994; Fabbro et al., 1990; A. Green et al., 1990). Results have not been consistent across the different studies, but lately the differences in lateralization data have been taken to indicate larger involvement of pragmatic strategies to compensate for low L2 proficiency rather than differential brain representation of language processes (Fabbro, 2001; Fabbro & Gran, 1997; Paradis, 2000).

Hyönä, Tommola, and Alaja (1995) took pupil dilation as a measure of processing load. Students of interpreting listened to, shadowed, and interpreted
an auditorily presented text. In shadowing, the pupil diameter was larger than in listening, but interpreting yielded an even larger average pupil diameter than shadowing, again suggesting that processing load is largest in interpreting.

Studies using other physiological measures also indicated that mental load during SI is high, and that coping with the difficulties of SI induces stress in interpreters. Klonowicz (1990) found an elevated heart rate for both shadowing and interpreting in comparison to just listening, suggesting an equally large mental effort on these tasks. In a second study, Klonowicz (1994) studied the development of systolic blood pressure, diastolic blood pressure, and heart rate during four successive turns in interpreting. At the beginning of each turn, systolic and diastolic blood pressures increased immediately. During the turn, systolic blood pressure dropped to normal levels, whereas diastolic blood pressure remained elevated. Heart rate only normalized in the first two turns, after which it also remained elevated. According to Klonowicz (1994), these results point to systematically increased arousal in SI that mimics the arousal leading to the development of essential hypertension.

Moser-Mercer et al. (1998) investigated the effect of prolonged interpreting turns (i.e., longer than 30 min) on both the quality of output and psychological and physiological stress experienced by the interpreters. Rather interesting trends occurred, similar to findings for air traffic control, which is known to be an extremely demanding task (Zeier, 1997). After an initial rise of the level of stress hormones, it decreased with further time on task. The decrease may be caused by decreased motivation to perform well. Mental overload caused by increased time on task appears to change the interpreters' attitude to the job: Less effort is expended and carelessness may set in. This interpretation corresponds to the finding that the number of serious meaning errors increases during the second 30 min on task, even though the interpreters were apparently not aware of this performance drop (see also Zeier, 1997). To summarize, these studies indicate that SI involves a high mental load and can induce physiological stress.

Sources of Difficulty in Simultaneous Interpreting

In the studies described in the previous sections, SI and shadowing were often contrasted. Worse performance in SI, larger pupil dilation, longer ear-voice span, and relatively large effects of information density and noise on SI indicate that interpreting is more sensitive than shadowing to factors that increase task difficulty. The combined results of these studies suggest that interpreting is a more demanding and more complex task than shadowing is. Using PET, Rinne et al. (2000) also contrasted SI and shadowing. The brain areas that were selectively activated in SI (i.e., after subtraction of the areas that were activated in shadowing) were those that are typically associated with lexical retrieval, working memory, and semantic processing. This suggests that these cognitive processes play a larger role in interpreting than in shadowing.

Shadowing and interpreting share one source of task difficulty in SI, namely, the simultaneity of comprehension and production. The tasks differ in that interpreting, not shadowing, involves the recoding of source into target language, which may account for the observed differences between the two tasks. Recoding may consist of two subcomponents: First, in SI the message has to be reformulated. Second, SI involves the simultaneous activation of two languages (e.g., Anderson, 1994; De Groot, 1997). It is possible that not all of these task (sub)components contribute equally to task difficulty.

Anderson (1994) compared performance on shadowing, interpreting, and paraphrasing. In contrast to shadowing, in both paraphrasing and interpreting reformulation is required, but only in interpreting two languages are involved. By exploiting these task characteristics, it is possible to disentangle the subcomponent of reformulating a message from the subcomponent of doing so in another language. Twelve professional interpreters performed poorer in interpreting than in shadowing on two quality measures, but interpreting differed from paraphrasing only according to one of the two quality measures. The ear-voice span was smaller in shadowing than in interpreting and paraphrasing, but it did not differ between the last two tasks. In other words, Anderson replicated the difference between shadowing and interpreting, but the results did not clearly indicate that the involvement of two languages instead of just one is an important additional subcomponent in SI on top of the reformulation subcomponent.

In a study mentioned earlier, we attempted to disentangle all three proposed sources of cognitive complexity in SI by comparing the shadowing of sentences with paraphrasing and interpreting them (Christoffels & De Groot, 2004). Bilinguals without interpreting experience performed these tasks
simultaneously and in a delayed condition, that is, immediately after presentation of each sentence. By including this condition, the effect of simultaneity of comprehension and production as a source of difficulty in SI could be tested. The quality of the shadowing output was better in the delayed than in the simultaneous condition, but the difference was small, suggesting that simultaneity of input and output on its own adds somewhat to the complexity of SI but is not a major source of complexity. Also, the difference in output quality between the three tasks in the delayed condition was small, suggesting that having to rephrase a sentence per se, even into a different language, may also not be a major source of difficulty on its own. However, in the simultaneous condition, interpreting and paraphrasing performance were notably poorer than in the delayed condition, whereas shadowing performance was much more similar in these two conditions. These findings showed that especially the combined requirements of simultaneity and rephrasing have a detrimental effect on the quality of performance in SI.

There was no difference between paraphrasing and interpreting in the quality of performance, which may suggest that the additional demand of activating two languages on top of reformulation is not substantial. However, the ear-voice span was significantly larger in paraphrasing than in interpreting. The paraphrasing task has been considered as "unilingual interpreting" or "intralanguage translating" (Anderson, 1994; Malakoff & Hakuta, 1991). For this reason, the task is often used as an exercise or assessment task in the training of interpreters (Moser-Mercer, 1994), and interpreting in bilinguals has been compared directly to paraphrasing by monolinguals (Green et al., 1990). In support of this view, interpreters sometimes accidentally translate into the same language (Anderson, 1994; De Bot, 2000).

However, the larger ear-voice span for paraphrasing than for interpreting suggests that paraphrasing is more demanding than interpreting. The reason may be that the vocabulary demands in paraphrasing are likely to be larger than in interpreting because the latter only may require a basic vocabulary in both languages, whereas paraphrasing requires a large vocabulary in the one language concerned (Malakoff & Hakuta, 1991). Moreover, changing the grammatical structure, as is typically required in paraphrasing, may be more demanding than finding a grammatical equivalent of an input segment in the output language, as required in interpreting.

A final, perhaps critical, difficulty in paraphrasing may be that, despite the fact that the input message is already properly formulated, an alternative wording has to be found. In paraphrasing, it may therefore be necessary to inhibit the original sentence form and to monitor output rigidly to avoid literal repetition. All in all, there is reason to believe that paraphrasing may involve higher demands than interpreting.

In conclusion, it seems that both requirements (of simultaneity of comprehension and production and of reformulation) contribute to the complexity of SI, but that especially the combination of these two components taxes the limited mental resources.

Novices Versus Experts

Are Interpreters Special?

Is there anything that distinguishes experienced interpreters from novices? If so, are the differences qualitative or quantitative, and are they caused by a difference in talent or training? Neubert (1997) claimed that untrained or natural translation is distinctly different from professional translation and interpreting. Harris and Sherwood (1978), however, argued that translation in general is an innate skill. According to them, translation is coextensive with bilingualism, and therefore all bilinguals are able to translate (see also Malakoff, 1992; Malakoff & Hakuta, 1991).

Dillinger (1994) compared professional interpreters and balanced bilinguals on comprehension during interpreting, as measured by a wealth of different variables. He found only small quantitative differences and no qualitative differences between the two groups and argued that interpreting is not a special, acquired skill but the application of an existing skill that accompanies bilingualism naturally. Of course, it is still an open question whether any differences may be found for language production.

Studies in which only nonprofessional interpreters participate are sometimes criticized for not being informative about professional interpreting (e.g., Setton, 1999; see also Gile, 1991, 1994). But research with professionals also can have potential drawbacks. As Shlesinger (2000a) pointed out, it may be difficult to distinguish between idiosyncratic strategies applied by the experienced interpreter and other, more general cognitive processes involved in the process. When novices perform the
SI task, presumably no such strategies have developed yet. It is therefore both theoretically and methodologically important to learn whether interpreting in trained professionals and untrained bilinguals involves similar processes or is fundamentally different.

Cognitive Subskills

By comparing novices and professionals on tasks that are supposed to tap into possibly relevant subskills, we can gain more insight into what cognitive subskills are in fact important for SI. In the next section, we discuss memory skills, verbal fluency, basic language processes, and other subskills in relation to SI.

Memory Skills. A number of studies indicated that interpreting is associated with efficient working memory skills. Padilla, Bajo, Cañas, and Padilla (1995) compared experienced interpreters with student interpreters and noninterpreters on a standard digit span test and a reading span test, which is thought to tap into both the processing and storage aspects of working memory (Daneman & Carpenter, 1980). They found that the average performance of the interpreters was higher than that of the other two participant groups (see also Bajo, Padilla, & Padilla, 2000). In our laboratory, we found that, for unbalanced bilinguals, interpreting performance was significantly correlated with both the digit span and the reading span in the two languages concerned, although only marginally so for L1 (Christoffels, De Groot, & Waldorp, 2003), indicating a relation between SI performance and working memory capacity in this group. Moreover, memory performance in L1 and L2 of professional interpreters was superior to that of bilinguals who had no SI experience but were similarly proficient in L2 (Christoffels, De Groot, & Kroll, 2003).

Padilla et al. (1995) compared recall of words in conditions with and without articulatory suppression during presentation. For the articulatory suppression condition, a significant group effect was obtained. This was caused by a decrement in the recall scores of all groups except the experienced interpreters, who apparently were resistant to the effect of articulatory suppression (see also Bajo et al., 2000). This finding suggests that the ability to cope with concurrent articulation is important in SI. This conclusion is also supported by the association that occurs between retention under conditions of articulatory suppression on the one hand and SI performance in bilinguals without previous SI experience on the other hand (Christoffels, 2003).

In contrast, Chincotta and Underwood (1998) did not find a difference in digit span between English-Finnish interpreters and Finnish students majoring in English, neither in a condition with articulatory suppression nor in one without such suppression. However, consistent with earlier findings, differences in memory processes between the two groups were suggested by the finding that the standard language effect in the digit span task (a larger digit span in the language in which one can articulate faster) disappeared for the students in an articulatory suppression condition, whereas for the interpreters it persisted.

Finally, Bajo (2002) reported that word recall in interpreters, participants with a similarly large reading span, and noninterpreters alike was disrupted by divided attention manipulations that tapped into the visual spatial sketchpad and the central executive components of working memory. The finding that the interpreters did not outperform other groups on these working memory tasks suggests that the ability to cope with simultaneity of verbalization and recall in SI may not reflect a general ability of the executive to coordinate multiple tasks and processes, but instead involves a specific skill to coordinate the verbal processes implicated in SI.

To summarize, findings of superior or qualitatively different performance on several verbal memory tasks for professional interpreters than for other groups of participants suggest the importance of efficient working memory skills for SI.

Verbal Fluency. Fabbro and Darò (1995) observed greater resistance to the detrimental effects of delayed auditory feedback in students of SI than in monolingual controls. In a delayed auditory feedback condition, the speaker's own voice is amplified and delayed for a few hundred milliseconds, a situation that in general causes speech disruption. The student interpreters showed less speech disruption than the controls. Fabbro and Darò suggested that the students were more resistant to the interfering effects of delayed auditory feedback because they had developed a high general verbal fluency as well as an ability to pay less attention to their own verbal output.

Moser-Mercer et al. (2000) reported a number of pilot studies comparing five students of interpreting with five experienced interpreters, all native speakers of French. In line with the results of
Fabbro and Darò (1995), they obtained a smaller detrimental effect of delayed auditory feedback for the professionals than for the students on reading a French text but not on reading an English text. No differences were found between professionals and students on tasks involving semantics, free association, spelling, morphology, and phonology.

Finally, in a shadowing task, the interpreters' ear-voice span was similar to that of the students in their native French language, whereas the students were faster in shadowing in English. Moreover, in both languages, the interpreters made more errors than the students did. Moser-Mercer et al. (2000) explained these remarkable results by suggesting that professionals are used to processing larger chunks of input than those required in shadowing, which might make it harder for them to respect the instruction of immediate repetition imposed by the shadowing task. If this explanation holds, then we should be cautious in using the shadowing task in studies that test interpreters (see also Frauenfelder & Schriefers, 1997).

To summarize, none of the differences between professionals and students observed by Moser-Mercer et al. (2000) clearly supports the idea that professionals have special verbal fluency skills. Perhaps the two groups compared in this study performed similarly because the students were already enrolled in an SI training program and were therefore possibly (self-) selected on verbal fluency skills. The additional interpreting experience of the professionals may not exert a notable effect on some of the subskills involved in SI. However, given the small sample size, we cannot draw any firm conclusions from the results of this study.

Basic Language Processes. Efficient language processing may be especially important for SI. The more the language processes that are involved in SI are automated, the more processing capacity will be available for other relevant processes and the faster the outcome of these processes will be available for further processing. For example, the ability to access and retrieve words quickly may be an important subskill. Bajo et al. (2000) presented a categorization task to four groups of participants: interpreters, interpreting students, bilinguals, and monolinguals. On each trial, the participants had to decide whether a word was a member of the category to which another word referred. Especially for atypical exemplars of categories, the interpreters were faster than all other groups, indicating faster semantic access. In a lexical decision task, no difference was found between the groups on the words, but on the nonwords the interpreters were faster than the bilingual participants. The relevance of quick lexical access is also indicated by the positive correlation between interpreting performance on the one hand and word naming and word translation in the two languages involved (English and Dutch) on the other hand, a result that we obtained for unbalanced bilinguals untrained in SI (Christoffels et al., 2003). However, when comparing the performance of interpreters and other highly proficient bilinguals (teachers of L2) on these same tasks, we obtained no differences between groups. This finding suggests that efficient lexical retrieval may not be uniquely related to SI, but to high L2 proficiency instead (Christoffels, De Groot, & Kroll, 2004).

Finally, in a dichotic listening task, Fabbro, Gran, and Gran (1991) compared students of interpreting with professionals in how well they detected errors in translations of sentences. The participants simultaneously received the source sentence to one ear and its translation to the other ear. Professional and student interpreters did not differ from one another in recognizing correct translations. However, an interesting difference between the two groups was that the students recognized more syntactic errors than the professionals, whereas the professionals recognized more semantic errors. This suggests that the groups differed in the level at which they processed the input.

To summarize, although it is not altogether clear which language subprocesses are most critical for skilled SI performance, there is some evidence to suggest that interpreters are relatively efficient in processing meaning.

Other Subskills. A number of other potentially relevant subskills of SI are worth mentioning. Gernsbacher and Shlesinger (1997) pointed out that people differ in how efficiently they can suppress interfering information, such as the inappropriate meanings of homonyms, recently processed (but currently inappropriate) syntactic form, and the literal interpretation of metaphors. They suggested that, in interpreting, resources required for suppression are diminished because the system is already involved in simultaneous comprehension and production. Because, nevertheless, interfering information will have to be suppressed, the ability to do so effectively is likely to be another important subskill of interpreting.

Similarly, Tijus (1997) argued that the most important subskill of SI is to be able to detect inconsistencies resulting from incorrect assignment
of meaning to polysemous phrases and to resolve them immediately. Detecting and quickly resolving such inconsistencies requires a large memory capacity for input processing (Tijus, 1997), which again points to the relevance of efficient memory processes for interpreting.

Training or Selection?

It is not clear whether the differences found between interpreters and other groups of participants concern qualitative or quantitative differences in underlying processes. Another relevant question that needs to be answered is whether the skills required for SI have developed as a consequence of training and experience in SI or whether successful interpreters chose a career in SI because they possess certain talents that make them well suited to the task.

Bajo et al. (2000) presented evidence suggesting that training in interpreting can improve performance on basic language skills. They compared students of interpreting who received a year of training with an untrained control group on three tasks: comprehension, categorization, and lexical decision. Both groups were tested twice, once at the beginning of that year and once at the end. The student interpreters, but not the controls, showed improved performance on the second test.

The most likely answer to the question of what causes differences between novices and experienced interpreters is that both certain language and memory abilities are required for a high performance level, and that certain skills develop with practice. It is, therefore, of great practical interest to find out which aspects of SI can and should be learned on the one hand and what determines aptitude and which tests can predict aptitude on the other hand (Moser-Mercer, 1994).

Gerver, Longley, Long, and Lambert (1984) addressed the latter issues. They developed a set of psychometric tests to select trainees for a course in simultaneous and consecutive interpreting. At the beginning of this course, they administered tests based on text materials (recall, cloze, and error detection); linguistic subskills (synonym generation, sentence paraphrasing, and comprehension); and a nonlinguistic speed stress test. The tests correlated with final examination ratings, and students passing the course had a higher score on all tests than the students who failed, albeit the difference was not significant for each of the tests. The text-based tests were more predictive for passing the course than the linguistic subskills and speed tests, suggesting that especially rather general verbal abilities and the processing of text are predictive for SI and consecutive interpreting. Prediction of pass/fail rates was better on the basis of these tests than on the existing selection procedures, showing that aptitude testing can be useful in practice (see Hoffman, 1997, for a discussion of interpreting regarded as a skill from the perspective of the psychology of expertise, and see Arjona-Tseng, 1994; Lambert, 1991; and Moser-Mercer, 1994, for discussions of aptitude tests used in training programs).

Relevant Issues and Concluding Remarks

In this chapter, we presented an overview of experimental research into SI from a cognitive perspective. In the final part of this chapter, we briefly review a number of the most important issues that need to be addressed in developing a complete model of SI.

The Locus of Recoding

An important issue to resolve is where and how in the system actual recoding of language (translation) takes place. Two alternative theoretical views on this issue were discussed: meaning-based interpreting and transcoding. Although little direct experimental data exist to support either of these two recoding strategies, there is some evidence to suggest that, in addition to meaning-based interpreting, transcoding also takes place. This issue of how translation takes place has to be taken into account by models of bilingual processing. For example, if transcoding occurs, it may take place at a number of different levels in the bilingual system: phonological, morphological, syntactic, and semantic (Paradis, 1994). This implies the existence of direct links between representations of the linguistic elements of one language and the corresponding representations in the other language. The existence of such links constrains current models of bilingual memory.

Resource-Consuming Subcomponents of Simultaneous Interpreting

A further question is which subcomponents of the full interpreting task appeal to the limited mental resources of the interpreter and how these
472 Aspects and Implications of Bilingualism
resources are allocated. In fact, it is not yet clear which subcomponents should be distinguished in SI in the first place and whether they share resources. Both Gerver (1976) and Gile (1997) assumed that resources are limited and shared between the various components in their models. As a consequence, the monitoring of output, for instance, might suffer if the listening conditions are suboptimal. It is as yet unclear whether language recoding, the switch of language itself, should be regarded as a separate resource-consuming processing step in SI in addition to the steps required for comprehending and producing language, or whether instead the nonverbal meaning is derived from the source language and the target message is subsequently simply produced from this meaning representation (Anderson, 1994; De Groot, 1997; Isham & Lane, 1994). If only meaning-based translation holds, and not transcoding, it may not be necessary to assume an additional translation stage.

Representation, Selection, Access, and Control

An issue that has received little attention so far is how the language system(s) are represented and specifically whether language comprehension and production are subserved by one and the same system or by two functionally independent systems instead. Yet, to model SI it is necessary to make choices regarding the basic architecture of the language system(s). Considering monolingual language processing, we may ask which parts, representations, or processes are shared between the language comprehension and production systems. Kempen (1999), for example, assumed that grammatical encoding and decoding are performed by the same system, an assumption that may be difficult to reconcile with the simultaneity of comprehension and production in SI, and Frauenfelder and Schriefers (1997) and De Bot (2000) suggested that comprehension and production processes may share the lexical and grammatical knowledge systems (but see Harley, 2001).

With respect to bilingual language processing, common questions are how the two languages are represented in the bilingual mind and how lexical access to bilingual memory comes about. Most of the relevant research on bilingual memory representation focuses on the lexicon and converges on the conclusion that word forms are represented in language-specific memory stores, whereas word meanings are stored in memory representations that are shared between the two languages (for reviews, see De Groot, 2002; Kroll & Dijkstra, 2002). The research on access to bilingual memory mainly supports the idea that lexical access is nonselective, that is, that both during comprehension and during production, words from both languages are initially activated (e.g., Colomé, 2001; Dijkstra, Van Jaarsveld, & Ten Brinke, 1998; Hermans, Bongaerts, De Bot, & Schreuder, 1998; Jared & Kroll, 2001; Van Heuven, Dijkstra, & Grainger, 1998, but see Costa, chapter 15, this volume, for language-specific selection).

As mentioned (see "The Control of Languages"), in a framework in which control of languages is exercised by global inhibition of the nontarget language, presumably two languages must be active simultaneously in SI. The ensuing question is how it is possible that during SI only the target language is produced.

Figure 22.2 illustrates two alternative proposals that allow target language production in SI within a framework of global inhibition of the nontarget language; in addition, it illustrates a third proposal that does not assume global inhibition of a language. To simplify matters, only lexical activation is considered. According to all three solutions, lexical items belonging to the source language must be separated from those of the target language. The items of different languages may form independent subsets, or they are somehow labeled for language (e.g., using language tags or by connections to language nodes; De Bot, 2000; Dijkstra & Van Heuven, 1998; D. W. Green, 1986, 1998; Poulisse, 1997).

The important difference between the first two alternatives (Figs. 22.2[a] and 22.2[b]) is whether separate input and output lexicons exist. If the parsimonious solution is chosen, with just one lexicon for both comprehension and production (Fig. 22.2[a]), the problem is to explain why source language elements are not being selected for production even though both languages are activated. One possibility is that, irrespective of activation in the lexicon, the source language elements are not considered for selection at all (e.g., Costa, Miozzo, & Caramazza, 1999) (see Fig. 22.2[a]). In other words, this alternative assumes language-specific selection. Indeed, Costa (chapter 15, this volume) argues that in highly proficient bilinguals (such as interpreters), lexical selection may be language specific. The mechanism for such filtering of language is as yet unclear. Perhaps only items with a target language label can be selected when certain language schemas are adopted.
Figure 22.2 The conceptual and semantic levels of representation are separated. Meaning-based translation is illustrated by the route from the language comprehension system via the conceptual level of representation to the language production system. Transcoding at the lexical level takes the shortcut from the source language lexicon via the semantic level to the target language lexicon.
(a) The lexicon is integrated for input and output. Both source language and target language lexicons are highly activated (gray in the figure), but selection of source lexicon items for production is not possible.
(b) The input and output lexicons are separated. The input lexicons for both languages are activated (gray) to allow for comprehension of the source language and monitoring the produced output. There is (almost) no activation of the source language in the output system, so production only takes place in the target language. Selection of lexical items may be language nonspecific and based solely on the level of activation; source language items are hardly activated and therefore not selected.
(c) The input and output lexicons are separated. There is no global activation/inhibition of languages, but a subset of appropriate items is activated instead (gray). Language is one of the elements contained by the conceptual message that determines what lexical items are activated. Selection is language nonspecific and based on the level of activation; the intended item in the target language is selected because it was activated more than semantically related items in both languages.
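
The three selection schemes in Figure 22.2 can be made concrete with a toy computation. The following Python sketch is illustrative only and is not a model proposed in the chapter: the class `LexicalItem`, the function names, the activation values, and the parameters `inhibition` and `cue` are all hypothetical, and the sketch collapses the input and output lexicons into a single list of activation values. It shows only how each scheme could keep a highly active source-language word from being produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalItem:
    word: str
    language: str      # "source" or "target"
    activation: float  # hypothetical activation level, arbitrary units


def select_unrestricted(items):
    """No language control: the most active item wins, whatever its language."""
    return max(items, key=lambda item: item.activation)


def select_language_specific(items, target="target"):
    """Scheme (a): only items labeled for the target language are candidates."""
    candidates = [item for item in items if item.language == target]
    return max(candidates, key=lambda item: item.activation)


def select_with_output_inhibition(items, source="source", inhibition=0.9):
    """Scheme (b): source-language output is globally inhibited; selection
    itself ignores language and looks only at the resulting activation."""
    def effective(item):
        if item.language == source:
            return item.activation * (1.0 - inhibition)
        return item.activation
    return max(items, key=effective)


def select_with_language_cue(items, target="target", cue=0.5):
    """Scheme (c): no global inhibition; a language cue in the conceptual
    message adds activation to target-language items only."""
    def effective(item):
        return item.activation + (cue if item.language == target else 0.0)
    return max(items, key=effective)


# Hypothetical state while interpreting Dutch "hond" into English.
lexicon = [
    LexicalItem("hond", "source", 0.9),  # just heard, so highly active
    LexicalItem("dog", "target", 0.7),   # intended translation
    LexicalItem("cat", "target", 0.4),   # semantically related competitor
]

for select in (select_unrestricted, select_language_specific,
               select_with_output_inhibition, select_with_language_cue):
    print(select.__name__, "->", select(lexicon).word)
```

With these made-up values, unrestricted selection picks the source word "hond", which corresponds to the inadvertent selection that an integrated lexicon without further control would risk, whereas each of the three schemes picks "dog". The schemes differ only in where the language constraint enters: in the candidate set (a), in a language-wide scaling of activation (b), or in item-specific activation from the message (c).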
SI performance may also be explained in terms of an inhibition account by assuming separate input and output lexicons that can be separately activated or inhibited (see Fig. 22.2[b]). According to this scheme, the output lexicon for the source language is strongly inhibited in SI, so that usually only target language elements will be selected. On the input side, both languages are active, but not to the same degree, to allow for comprehension of the input and monitoring of the produced output (see also Grosjean, 1997).

Finally, a third option is not to assume that global activation or inhibition of language systems controls language output, but that only specific activation of the relevant elements in the lexicon occurs. Language is one of the properties embedded in the conceptual message that selectively activates a number of relevant semantically related lexical elements in both languages. However, because of this language cue, the appropriate element in the target language will receive the most activation and will therefore be selected.

Such a proposal, based on a model by Poulisse and Bongaerts (1994, in Poulisse, 1997), is discussed in detail by La Heij (chapter 14, this volume). This option is presented in Fig. 22.2(c) in a model that assumes (functionally) separate input and output lexicons. If integrated input and output lexicons were assumed instead, the elements of the source language that received a lot of activation by the input might be inadvertently selected for production. Whatever the solution to be chosen, any model of SI, but also models of common bilingual language processing, should ultimately be able to explain the language control that is exercised during SI.

The selection of topics that we addressed in this chapter has been dictated primarily by the available research. It is clear that SI is an extremely complex task, and that many of its intricacies are yet to be resolved. The fact that SI, despite its complexity, is at all possible may help to constrain models of (bilingual) language processing because it requires these models to account for simultaneous language comprehension and production, for the simultaneous use and control of two languages, for translation processes, and for monitoring in SI. Although SI is complex, we hope to have demonstrated that there are ways to study it successfully. This fact, combined with the recognition that no account of the bilingual mind and bilingual language processing can be complete without the inclusion of a satisfactory explanation of SI performance, may challenge other researchers to take up the study of SI as well.

Acknowledgments

The Netherlands Organization for Scientific Research is gratefully acknowledged for funding this project. This chapter was written while I. K. Christoffels was supported by a grant from this
Simultaneous Interpreting 475
organization awarded to A. M. B. de Groot. We thank Judith Kroll, Susanne Borgwaldt, and Lourens Waldorp for their valuable comments on earlier versions of this chapter.

Note

1. In Levelt's model (Levelt, 1989; Levelt, Roelofs, & Meyer, 1999), three subcomponents are proposed. The first component, the conceptualizer, formulates the intended message in a preverbal, nonlinguistic form. This preverbal message contains all the information required for the second component, the formulator, to convert the message into a speech plan by applying grammatical and phonological rules and selecting the appropriate lexical items. Lexical items consist of two parts, the lemma (representing syntax) and the lexeme (representing morphophonological form). The third component, the articulator, subsequently converts the speech plan into sounds.

References

Adamowicz, A. (1989). The role of anticipation in discourse: Text processing in simultaneous interpreting. Polish Psychological Bulletin, 20, 153–160.
Anderson, L. (1994). Simultaneous interpretation: Contextual and translational aspects. In S. Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 101–120). Amsterdam: Benjamins.
Arjona-Tseng, E. (1994). A psychometric approach to the selection of translation and interpreting students in Taiwan. In S. Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 69–86). Amsterdam: Benjamins.
Baddeley, A. (2000). The episodic buffer: A new component of working memory. Trends in Cognitive Sciences, 4, 417–423.
Baddeley, A. D., Lewis, V., & Vallar, G. (1984). Exploring the articulatory loop. Quarterly Journal of Experimental Psychology, 36A, 233–252.
Baddeley, A. D., & Logie, R. H. (1999). Working memory: The multiple-component model. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 28–61). Cambridge, U.K.: Cambridge University Press.
Bajo, M. T. (2002, March). Working memory in translation and language interpretation. Paper presented at the Workshop on Processing and Storage of Linguistic Information in Bilinguals, Amsterdam.
Bajo, M. T., Padilla, F., & Padilla, P. (2000). Comprehension processes in simultaneous interpreting. In A. Chesterman, N. Gallardo San Salvador, & Y. Gambier (Eds.), Translation in context (pp. 127–142). Amsterdam: Benjamins.
Barik, H. C. (1973). Simultaneous interpretation: Temporal and quantitative data. Language and Speech, 16, 237–270.
Barik, H. C. (1975). Simultaneous interpretation: Qualitative and linguistic data. Language and Speech, 18, 272–297.
Barik, H. C. (1994). A description of the various types of omissions, additions and errors of translation encountered in simultaneous interpretation. In S. Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 121–137). Amsterdam: Benjamins.
Besien, V. (1999). Anticipation in simultaneous interpretation. Meta, 14, 250–259.
Chernov, G. V. (1994). Message redundancy and message anticipation in simultaneous interpretation. In S. Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 139–153). Amsterdam: Benjamins.
Chincotta, D., & Underwood, G. (1998). Simultaneous interpreters and the effect of concurrent articulation on immediate memory. Interpreting, 3, 1–20.
Christoffels, I. K. (2003). Listening while talking: The retention of prose under articulatory suppression in relation to simultaneous interpreting. Manuscript submitted for publication, University of Amsterdam, The Netherlands.
Christoffels, I. K. (2004). Cognitive studies in simultaneous interpreting. Unpublished doctoral dissertation, University of Amsterdam, The Netherlands.
Christoffels, I. K., & De Groot, A. M. B. (2004). Components of simultaneous interpreting: A comparison with shadowing and paraphrasing. Bilingualism: Language and Cognition, 7, 1–14.
Christoffels, I. K., De Groot, A. M. B., & Kroll, J. F. (2004). Memory and language skills in simultaneous interpreting: Expertise and language proficiency. Manuscript in preparation.
Christoffels, I. K., De Groot, A. M. B., & Waldorp, L. J. (2003). Basic skills in a complex task: A graphical model relating memory and lexical retrieval to simultaneous interpreting. Bilingualism: Language and Cognition, 6, 201–211.
Cokely, D. (1986). The effects of lag time on interpreters' errors. Sign Language Studies, 53, 341–375.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: Language-specific or
Journal of Experimental Psychology, 26, 337–341.
Gerver, D. (1976). Empirical studies of simultaneous interpretation: A review and a model. In R. W. Brislin (Ed.), Translation: Applications and research (pp. 165–207). New York: Gardner Press.
Gerver, D., Longley, P., Long, J., & Lambert, S. (1984). Selecting trainee conference interpreters: A preliminary study. Journal of Occupational Psychology, 57, 17–31.
Gile, D. (1991). Methodological aspects of interpretation (and translation) research. Target, 3, 153–174.
Gile, D. (1994). Methodological aspects of interpretation and translation research. In S. Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 39–56). Amsterdam: Benjamins.
Gile, D. (1995). Basic concepts and models for interpreter and translator training. Amsterdam: Benjamins.
Gile, D. (1997). Conference interpreting as a cognitive management problem. In J. H. Danks, G. M. Shreve, S. B. Fountain, & M. K. McBeath (Eds.), Cognitive processes in translation and interpreting (pp. 196–214). Thousand Oaks, CA: Sage.
Gile, D. (2000). Issues in interdisciplinary research into conference interpreting. In B. Englund Dimitrova & K. Hyltenstam (Eds.), Language processing and simultaneous interpreting (pp. 89–106). Amsterdam: Benjamins.
Goldman-Eisler, F. (1972). Segmentation of input in simultaneous translation. Journal of Psycholinguistic Research, 1, 127–140.
Goldman-Eisler, F. (1980). Psychological mechanisms of speech production as studied through the analysis of simultaneous translations. In B. Butterworth (Ed.), Language production. Vol. 1: Speech and talk (pp. 143–153). London: Academic Press.
Green, A., Schweda-Nicholson, N., Vaid, J., White, N., & Steiner, R. (1990). Hemispheric involvement in shadowing vs. interpretation: A time-sharing study of simultaneous interpreters with matched bilingual and monolingual controls. Brain and Language, 39, 107–133.
Green, D. W. (1986). Control, activation, and resource: A framework and a model for the control of speech in bilinguals. Brain and Language, 27, 210–223.
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Grosjean, F. (1997). The bilingual individual. Interpreting, 2, 163–187.
Harley, T. A. (2001). The psychology of language: From data to theory (2nd ed.). Hove, U.K.: Psychology Press.
Harris, B., & Sherwood, B. (1978). Translating as an innate skill. In D. Gerver & H. W. Sinaiko (Eds.), Language interpretation and communication (pp. 155–170). New York: Plenum Press.
Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1998). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213–229.
Hoffman, R. R. (1997). The cognitive psychology of expertise and the domain of interpreting. Interpreting, 2, 189–230.
Hyönä, J., Tommola, J., & Alaja, A.-M. (1995). Pupil dilation as a measure of processing load in simultaneous interpretation and other language tasks. Quarterly Journal of Experimental Psychology, 48A, 598–612.
Isham, W. P. (1994). Memory for sentence form after simultaneous interpretation: Evidence both for and against deverbalization. In S. Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 191–211). Amsterdam: Benjamins.
Isham, W. P. (2000). Phonological interference in interpreters of spoken-languages: An issue of storage or process? In B. Englund Dimitrova & K. Hyltenstam (Eds.), Language processing and simultaneous interpreting (pp. 133–149). Amsterdam: Benjamins.
Isham, W. P., & Lane, H. (1994). A common conceptual code in bilinguals: Evidence from simultaneous interpreting. Sign Language Studies, 85, 291–316.
Jared, D., & Kroll, J. F. (2001). Do bilinguals activate phonological representations in one or both of their languages when naming words? Journal of Memory and Language, 44, 2–31.
Jesse, A., Vrignaud, N., Cohen, M., & Massaro, D. W. (2001). The processing of information from multiple sources in simultaneous interpreting. Interpreting, 5, 95–115.
Kade, O., & Claus, C. (1971). Some methodological aspects of simultaneous interpreting. Babel, 17, 12–16.
Kempen, G. (1999). Human grammatical coding. Unpublished manuscript, Leiden University, The Netherlands.
Klonowicz, T. (1990). A psychophysiological assessment of simultaneous interpreting: The interaction of individual differences and mental workload. Polish Psychological Bulletin, 21, 37–48.
Klonowicz, T. (1994). Putting one's heart into simultaneous interpretation. In S. Lambert &
B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 213–224). Amsterdam: Benjamins.
Kroll, J. F., & De Groot, A. M. B. (1997). Lexical and conceptual memory in the bilingual: Mapping form to meaning in two languages. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism (pp. 169–199). Mahwah, NJ: Erlbaum.
Kroll, J. F., & Dijkstra, A. (2002). The bilingual lexicon. In R. Kaplan (Ed.), Handbook of applied linguistics (pp. 301–321). New York: Oxford University Press.
Kroll, J. F., Michael, E., Tokowicz, N., & Dufour, R. (2002). The development of lexical fluency in a second language. Second Language Research, 18, 137–171.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
La Heij, W., Hooglander, A., Kerling, R., & Van der Velden, E. (1996). Nonverbal context effects in forward and backward word translation: Evidence for concept mediation. Journal of Memory and Language, 35, 648–665.
Lambert, S. (1991). Aptitude testing for simultaneous interpretation at the University of Ottawa. Meta, 36, 586–594.
Lambert, S. (1992). Shadowing. Meta, 37, 263–273.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioural and Brain Sciences, 22, 1–75.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 74, 431–461.
Lonsdale, D. (1997). Modeling cognition in SI: Methodological issues. Interpreting, 2, 91–117.
MacWhinney, B. (1997). Simultaneous interpretation and the competition model. In J. H. Danks, G. M. Shreve, S. B. Fountain, & M. K. McBeath (Eds.), Cognitive processes in translation and interpreting (pp. 215–232). Thousand Oaks, CA: Sage.
Malakoff, M., & Hakuta, K. (1991). Translation skill and metalinguistic awareness in bilinguals. In E. Bialystok (Ed.), Language processing and language awareness (pp. 141–166). New York: Oxford University Press.
Malakoff, M. E. (1992). Translation ability: A natural bilingual and metalinguistic skill. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 515–529). Amsterdam: North Holland.
Massaro, D. W., & Shlesinger, M. (1997). Information processing and a computational approach to the study of simultaneous interpretation. Interpreting, 2, 13–53.
McDonald, J. L., & Carpenter, P. A. (1981). Simultaneous translation: Idiom interpretation and parsing heuristics. Journal of Verbal Learning and Verbal Behaviour, 20, 231–247.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical costs of language selection. Journal of Memory and Language, 40, 25–40.
Moser, B. (1978). Simultaneous interpretation: A hypothetical model and its practical application. In D. Gerver & H. W. Sinaiko (Eds.), Language interpretation and communication (pp. 353–368). New York: Plenum.
Moser-Mercer, B. (1994). Aptitude testing for conference interpreting: Why, when and how. In S. Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous interpretation (pp. 57–68). Amsterdam: Benjamins.
Moser-Mercer, B. (1995). Sight translation and human information processing. In A. Neubert & G. M. Shreve (Eds.), Basic issues in translation studies: Proceedings of the Fifth International Conference (Vol. 2, pp. 159–166). Kent, OH: Kent State University.
Moser-Mercer, B. (1997). Beyond curiosity: Can interpreting research meet the challenge? In J. H. Danks, G. M. Shreve, S. B. Fountain, & M. K. McBeath (Eds.), Cognitive processes in translation and interpreting (pp. 176–195). Thousand Oaks, CA: Sage.
Moser-Mercer, B., Frauenfelder, U. H., Casado, B., & Künzli, A. (2000). Searching to define expertise in interpreting. In B. Englund Dimitrova & K. Hyltenstam (Eds.), Language processing and simultaneous interpreting (pp. 107–131). Amsterdam: Benjamins.
Moser-Mercer, B., Künzli, A., & Korac, M. (1998). Prolonged turns in interpreting: Effect on quality, physiological and psychological stress (pilot study). Interpreting, 3, 47–64.
Moser-Mercer, B., Lambert, S., Darò, V., & Williams, D. (1997). Skill components in simultaneous interpreting. In Y. Gambier, D. Gile, & C. Taylor (Eds.), Conference interpreting: Current trends in research (pp. 133–148). Amsterdam: Benjamins.
Neubert, A. (1997). Postulates for a theory of translation. In J. H. Danks, G. M. Shreve, S. B. Fountain, & M. K. McBeath (Eds.), Cognitive processes in translation and interpreting (pp. 1–24). Thousand Oaks, CA: Sage.
Padilla, P., Bajo, M. T., Cañas, J. J., & Padilla, F. (1995). Cognitive processes of memory in simultaneous interpretation. In J. Tommola
23
Clearing the Cobwebs From the Study of the Bilingual Brain
Converging Evidence From Laterality and Electrophysiological Research
Laterality and the Bilingual Brain 481
accessed, and produced, the neuropsychological study of the brain offers an additional level at which to test, and eventually perhaps to constrain, cognitive models of bilingual lexical memory or language processing (e.g., Green, 1998; Kroll & Dijkstra, 2002). For example, neurobehavioral investigations could be used to address current debates about whether words in the bilingual's languages are accessed selectively or nonselectively.

For cognitive neuroscientists, bilingualism is of interest insofar as it allows for a study of the influence of early experience on brain plasticity for language. In contrast to monolinguals, who do not show much variation in age of first exposure to language or in language mastery, bilinguals differ considerably on these dimensions and thereby offer cognitive neuroscientists a way of studying neural correlates of early versus delayed exposure to language and degree of language competence attained. In short, the study of bilinguals allows for a test of the neural basis, if any, of the notion of a critical or sensitive period for language acquisition (Johnson & Newport, 1989; Newport, Bavelier, & Neville, 2001). Studies of the effects of early sensory experience (e.g., Burton, Snyder, Conturo, Akbudak, Ollinger, & Raichle, 2002; Neville, Coffey, Lawson, Fischer, Emmorey, & Bellugi, 1997) have shown variability in the degree of brain plasticity for different functions. It may be reasonable to expect variations in the degree of plasticity for different language functions as well (Newport et al., 2001).

Clinical Sources of Evidence

Our current understanding of neural bases of language functioning in bilinguals is based on accumulated evidence and hypotheses derived from a variety of sources, clinical and normative. Clinical sources are largely in the form of case reports of language loss or recovery in bilingual (or polyglot) aphasics following unilateral brain injury (see Fabbro, 2001; Green & Price, 2001). Paradis (2001) documented a variety of patterns of language recovery or impairment, including parallel and nonparallel patterns. However, as the vast majority of the early clinical literature and a good proportion of even recent studies consisted of single or selected cases rather than unselected group samples, the actual incidence of parallel versus nonparallel patterns of language impairment or recovery cannot be reliably determined because of possible sampling bias, that is, the overinclusion of atypical cases that were included precisely because they were unusual.

A similar problem clouds the interpretation of another aspect of the clinical evidence, namely, crossed aphasia, which has a direct bearing on the question of whether bi- or multilingualism alters the canonical pattern of left hemispheric (LH) dominance for language. Although some researchers have argued for a greater role for the RH in bilingual language functioning relative to that in monolinguals, primarily on the basis of a larger incidence of crossed aphasia among polyglots relative to the low estimate observed in monolinguals, here again the overinclusion of possibly unrepresentative cases may have contributed to this effect. There do exist a few recent group studies of unselected cases; however, these have yielded mixed results with respect to the question of a higher incidence of crossed aphasia in polyglots versus single-language users (see Vaid, 2002, for further discussion of this issue).

Quite apart from the problem of sampling bias, clinical studies also suffer from extreme variability in how the patients' languages were assessed postmorbidly (and they typically have precluded premorbid assessment). Although the work of Michel Paradis and his associates in particular has sought to ameliorate this situation through the use of an extensive array of standardized measures across a variety of language pairs, there still remains a paucity of comparative observations using these standardized measures.

Finally, what is particularly problematic in interpreting studies derived from clinical populations is the issue of compensatory function: Whenever there is injury to the brain, there is the possibility of subsequent reorganization of brain function. Thus, even if clinical data were to point reliably to greater RH participation in (certain subgroups of) bilinguals, it would be difficult to disentangle postmorbid reorganization of brain functions from the situation characterizing brain functioning prior to the brain injury. Ultimately, to fully understand language organization in the healthy brain, studies involving brain-intact individuals must be considered.

Experimental Sources of Evidence

There are, in fact, quite a number of studies of language organization in brain-intact bilinguals. These may be classified into three types: studies
involving lateralized presentation of visual or auditory input, studies involving event-related potential (ERP) recordings from the LH and RH during language tasks, and those involving functional imaging using hemodynamic measures (see also Kochunov et al., 2003, for an anatomic study of brain shape differences between bilingual groups). Although laterality studies are the most numerous, electrophysiological and imaging studies of bilinguals are on the rise.

In terms of the issues they have addressed, neuroimaging studies with bilinguals have largely focused on whether there are overlapping or distinct cortical (or, in some cases, subcortical) regions activated during first language (L1) versus second language (L2) processing or during the processing of a proficient versus a less-proficient language (see Abutalebi, Cappa, & Perani, chapter 24, this volume, for a review). Evidence on this question is mixed, with some studies suggesting an overlapping representation and others a distinct representation, particularly in less-fluent bilinguals. Consequently, only tentative conclusions about the nature of individual differences in bilingual cortical activation patterns may be drawn from the evidence to date, especially given that very few of the studies actually compared different bilingual subgroups, such as proficient versus nonproficient users of a given language.

Furthermore, only a handful of the imaging studies have addressed the question of whether and how the two languages of bilinguals may differ at the interhemispheric level. Indeed, of the 40-plus bilingual imaging studies that have appeared, we were able to identify only 6 that specifically measured and reported quantitative data regarding functional asymmetry of the bilinguals' languages. We summarize these briefly in the section "Electrophysiology and Imaging Research With Bilinguals."

Researchers using imaging technologies often make an implicit (and sometimes explicit) claim that imaging as a source of evidence about neural functioning is somehow more direct and informative than other sources because imaging captures the workings of the brain in vivo. This claim is accompanied by a tendency on the part of some neuroimaging researchers, as well as others who cite imaging findings, to discredit and dismiss other sources of evidence, such as the bilingual aphasia literature or the behavioral laterality literature, as unreliable, indirect, and ultimately uninformative.¹ It should be pointed out, however, that neuroima-
themselves been subjected to criticism on a number of methodological and interpretive grounds (see Paradis, 1999; Vaid & Hull, 2002, for further discussion).

Toward Convergence

We suggest that, instead of dismissing entire bodies of literature, a more fruitful approach would be one that recognizes that each methodological perspective has something to offer, even as it may have specific limitations, and that therefore each body of literature should be given due consideration. Specifically, we wish to argue that the behavioral laterality literature, rather than being viewed as unreliable, may be a particularly informative source of evidence and should also be acknowledged for its historical and heuristic contribution to theorizing about brain/language relationships.

As the earliest experimental source of evidence on brain functioning in brain-intact individuals, laterality studies with bilinguals provided the first empirical challenge to the view that the LH is always dominant for language. This body of work also provided the earliest hints that the RH may contribute to language processing in bilinguals to a greater degree than had previously been considered. There has been a resurgence of interest within mainstream cognitive neuroscience research (by which is usually implicitly meant research on monolinguals) on the RH's contribution to language processing (e.g., Chiarello, 2003; Federmeier & Kutas, 1999). It would be appropriate to consider how this work interfaces with what has already been studied with bilinguals in this regard.

We feel that revisiting the bilingual laterality literature would be instructive on several counts. First, it would serve to place neurobehavioral investigations of language processing and the brain in a historical context. Indeed, what is sometimes forgotten is that hypotheses about proficiency or age-related effects on bilingual neural organization discussed in current electrophysiological and neuroimaging studies have their intellectual roots in hypotheses initially formulated and tested in a behavioral laterality context (e.g., Vaid & Genesee, 1980). Second, an examination of bilingual laterality studies would serve as a cautionary reminder of the methodological and interpretive challenges that confront any empirical investigation of indi-
ging studies of language in general, and of bilin- vidual differences in brain functioning. Third, and
gual language processing in particular, have most important, such an examination would make
it possible for converging findings to be uncovered across different methods or research techniques. To date, there has been no systematic attempt to seek convergence between laterality findings from behavioral methodologies and those from neurobehavioral ones. This, then, was one aim of the present overview. To draw connections between the behavioral laterality literature and other sources of normative evidence, it is necessary to establish just what the laterality literature has found. To this end, an additional aim was to present the outcome of a meta-analytic review of bilingual laterality studies.

In what follows, we first discuss the behavioral literature at large and suggest why it has resisted summary. We next present five hypotheses that have been examined in the bilingual laterality literature. Then, we briefly describe a meta-analytic review of studies within this literature in which bilinguals and monolinguals were directly compared. Next, we consider how the findings from the meta-analysis of individual differences in language lateralization fit with analogous findings from recent event-related brain potential studies of bilinguals. We conclude with general remarks about the importance of converging evidence and an indication of some variables that remain to be teased apart in future experimental research on brain lateralization of language in bilinguals.

Bilingual Laterality Studies: An Overview

The sheer size and complexity of the bilingual laterality literature have hindered previous attempts to render the findings coherent. Although modest in comparison to the number of laterality studies conducted with monolinguals, the total number of laterality studies with bilinguals of which we are aware is currently around 150. The studies differ from each other in a number of respects: languages or language pairings of the bilingual participants, language acquisition histories, contexts of language use, testing paradigms, response measures, language components studied, and tasks used. For example, dichotic listening studies typically use word-level stimuli and measure response accuracy, whereas visual half-field studies use words, word pairs, or sentences and measure response latency to speeded judgments on the basis of visual, phonetic, semantic, or syntactic dimensions of the stimuli. Dual-task studies in turn measure interference in motoric performance (finger tapping) while participants concurrently perform a variety of verbal comprehension or production tasks.

The behavioral laterality studies also vary in their degree of methodological rigor. Many of the early studies did not systematically screen bilinguals on proficiency or other relevant parameters (e.g., L2 acquisition age). Others did not match stimuli across the two languages in terms of frequency, length, or other germane criteria. Still others did not use appropriate statistical analyses (see Obler, Zatorre, Galloway, & Vaid, 1982). In the face of these sources of variability, the challenge for students of this (as any experimental) literature has been to attend carefully to methodological aspects of individual studies to evaluate their internal validity.

In light of the methodological differences in the literature, it is perhaps not surprising that studies have also differed in their outcomes, with some reporting greater RH involvement in one or both languages of bilinguals as compared to that in monolinguals; others report no differences, and still others report greater LH involvement. This multiplicity of findings has prompted some scholars to conclude that the bilingual laterality literature is contradictory and hence not worthy of serious attention (Paradis, 2003; Sussman & Simon, 1988). Nevertheless, others have viewed the complexity of this literature as an inevitable reflection of the multifaceted nature of the topic and various approaches to it (Vaid, 2002).

Indeed, several integrative reviews (e.g., Ullman, 2001; Vaid, 1983, 2002; Zatorre, 1989) have tried to make some sense of the literature and relate bilingual laterality findings to more general theoretical formulations about language and about the brain. Although many of these reviews sought to be comprehensive and thorough, qualitative reviews are open to the possibility of bias in terms of which studies get included and how they are discussed. To get a more accurate picture, all relevant studies should be considered, and the effects of possible variables moderating the outcomes should be systematically assessed.

What we are advocating here is a quantitative rather than a narrative review. Meta-analysis, which provides a quantitative assessment of a body of work, would seem to be particularly useful in answering whether or how bilingualism affects brain lateralization for language. The results of a meta-analytic investigation of laterality studies comparing bilinguals with monolinguals could in turn inform questions of broader theoretical interest in the cerebral lateralization literature at large, that is,
the role played by the RH in mediating linguistic function and the extent of variation in cerebral lateralization for language in neurologically healthy monolingual participants. In other words, meta-analysis offers an opportunity not only to summarize patterns of language lateralization in bilinguals, but also to arrive at a clearer understanding of language lateralization in monolinguals. Before discussing the results from a meta-analysis of lateralization in bilinguals versus monolinguals (Hull & Vaid, 2003b), we review existing hypotheses about language lateralization in bilinguals.

Hypotheses About Language Lateralization in Bilinguals

Although it is generally accepted that the left cerebral hemisphere is the dominant hemisphere for language, and particularly for grammar, a number of studies have questioned the degree to which this pattern is altered by gender, handedness, and various experiential factors, such as bilinguality. Indeed, several different hypotheses have been proposed with respect to how bilingualism, as a source of differences in language experience, might influence brain functioning (see Genesee, 1982; Vaid & Hall, 1991). Some of these hypotheses developed in response to suggestions first made by early neurologists reporting cases of selective loss and recovery of language in polyglots; others developed from psychological investigations of individual differences in cognitive functioning in relation to language acquisition history.

In the following list, the first two hypotheses focus on comparisons between bilinguals and monolinguals and the remainder address differences between bilingual subgroups.

1. The L2 hypothesis. This hypothesis predicts that the RH is more involved in the processing of the L2 relative to the L1 of bilinguals. As such, when comparing bilinguals (particularly on the L2) with monolinguals, more LH involvement in the latter relative to the former may be expected.
2. Balanced bilingual hypothesis. This hypothesis posits that the incidence of nativelike proficiency in two (or more) languages alters hemispheric involvement such that RH activation for both languages of fluent bilinguals is predicted to be greater than that of monolinguals.
3. Stage of L2 acquisition hypothesis. The stage of L2 acquisition hypothesis (Obler, 1981) predicts that initial stages of L2 (and perhaps also L1) acquisition are associated with greater RH involvement than are later stages of mastery in the language. Thus, the hypothesis predicts more RH involvement among beginning learners of an L2 relative to advanced learners, with fluent bilinguals presumably showing a shift to growing LH involvement.
4. Manner of L2 acquisition hypothesis. The manner of L2 acquisition hypothesis (see Galloway & Krashen, 1980) was initially proposed as an extension of the stage hypothesis as it applies mainly to late L2 acquisition. According to the manner hypothesis, if the L2 was acquired primarily in an informal manner (e.g., in a natural context and with emphasis on actual communication and daily interaction with other interlocutors), there should be greater RH involvement compared to an L2 acquired in a formal setting. In the latter case, a predominant emphasis on the rules of grammar or spelling or language usage restricted to visual rather than combined visual and auditory modes should be associated with greater LH involvement. Thus, the manner hypothesis leads to the interesting prediction that, if the mode of L2 acquisition and use is predominantly a formal one, there may be greater LH involvement in the L2 than in the L1.
5. Age of L2 acquisition hypothesis. The age of L2 acquisition hypothesis (Genesee, Hamers, Lambert, Mononen, Seitz, & Starck, 1978) posits that the earlier the acquisition of an L2 relative to the L1, the closer the pattern of lateralization should be for the two languages. Thus, early bilinguals should show no differential lateralization patterns in their two languages, whereas late bilinguals should show more discrepancy. In the original framing of the hypothesis (Vaid & Genesee, 1980), it was assumed that early bilinguals would behave like monolinguals (both groups showing LH dominance), whereas late bilinguals would show a different pattern from that of early bilinguals, the precise nature of which would in turn reflect other variables such as proficiency and manner of acquisition. Given that neither the cerebral cortex nor the corpus callosum are
fully developed at least through the age of 5 years (see Joseph, 1982), the age hypothesis allows for the possibility that languages acquired after the brain is relatively mature may be functionally organized in distinct ways from those learned earlier.

Toward a Demystification of the Laterality Literature: Our Meta-analysis

An early meta-analysis of the bilingual laterality literature (Vaid & Hall, 1991) examined a total of 59 studies that had appeared as of December 1989. All five hypotheses summarized above were tested. The findings showed support for the existence of differences in lateralization between monolinguals and certain bilingual subgroups. One particularly salient outcome of the Vaid and Hall (1991) meta-analysis was that, contrary to the expectation that hemispheric involvement in early bilinguals would resemble that of monolinguals, early bilinguals instead showed bilateral involvement for both languages and significantly differed in lateralization from late bilinguals. Trends were also observed for differential laterality related to the experimental paradigm employed (e.g., tachistoscopic viewing showed less left lateralization than dichotic listening); other trends failed to reach significance owing, perhaps, to the small cell sizes in many cases. Finally, although there were hints of task-related effects, many of the studies had not been designed to consider task-related processing effects; thus, task was not coded as a moderator variable in the Vaid and Hall meta-analysis.

Our meta-analysis of the bilingual laterality literature (Hull & Vaid, 2003b) included studies that appeared up to December 2001. This meta-analysis specifically focused on studies assessing lateralization in bilinguals versus monolinguals. In addition, unlike the previous meta-analysis, our meta-analysis kept the languages spoken by monolinguals and bilinguals in any given study constant; thus, bilinguals' data on their L1 only (in the case of late bilinguals) and on both languages (in the case of early bilinguals) were compared with those of monolingual speakers of that language. Finally, we coded for the effects of task-related processing demands in hemispheric functioning.

As noted, our meta-analysis (Hull & Vaid, 2003b) included only those behavioral language lateralization studies of brain-intact individuals that provided quantitative data concerning direct comparisons of monolinguals and bilinguals on the same L1. As a result, no neuroimaging studies qualified for this meta-analysis. Excluded as well were all studies that failed to use identical lateralized presentation conditions for bilinguals and monolinguals (e.g., stimulus exposure differed between groups). The stringent criteria resulted in a total of 23 studies, all of which directly compared linguistic function in monolinguals with that of bilinguals, and many also evaluated hemispheric specialization for language within bilingual subgroups.

The primary goal of our meta-analysis (Hull & Vaid, 2003b) was to establish whether and under what experimental conditions systematic differences might exist in the lateralization of language between monolinguals and bilinguals. Thus, the results of the meta-analysis have direct bearing on the balanced bilingual and the L2 hypotheses. An additional goal was to determine, for those studies in the sample for which within-bilingual comparisons were also possible, whether L2 fluency or age of L2 acquisition moderated language lateralization. The outcomes of the analysis thus have bearing on the age and the stage hypotheses as well.2 (It should be remembered that data from the late bilinguals in this meta-analysis were limited to performance in the L1 only.)

The main experimental variable in our meta-analysis (Hull & Vaid, 2003b) was that of group, that is, bilinguals versus monolinguals. This variable was operationalized as follows: bilinguals, persons who were able to communicate in at least two languages (regardless of fluency level); monolinguals, persons with very little or no exposure to a language other than the native one. In addition, bilinguals were classified according to age of onset of L2 exposure and by L2 proficiency. Early bilinguals were defined as bilinguals whose exposure to the L2 occurred by age 6 years; late bilinguals were defined as bilinguals whose exposure to the L2 occurred after age 6 years.

In addition to the moderator variables of L2 acquisition age and L2 fluency, three other moderator variables were examined (Hull & Vaid, 2003b): participant gender, testing paradigm, and task demands. Criteria for task demands were difficult to establish, but what we finally selected were the following: visual demands, when the verbal material was presented to visual half-fields or visual aspects of the verbal stimuli were highlighted (e.g., orthographic judgments); auditory, when the verbal material was typically presented dichotically
or auditory aspects of the stimuli were highlighted (e.g., rhyme judgments); semantic, when participants were to process the meaning of the verbal input (e.g., synonym or semantic category judgments, paraphrasing, or translation); surface, which referred to tasks tapping syntactic properties of words and sentences (e.g., part of speech judgments); and global, which referred to general verbal tasks that did not isolate an individual sensory modality or language component (e.g., reading/listening to a story).

Meta-analytic Outcomes

The main outcomes of this meta-analysis were as follows (for further details, see Hull & Vaid, 2003b):

• Monolinguals as a group were LH dominant overall when collapsed across paradigms and tasks. However, monolinguals showed bilateral involvement on tachistoscopic viewing paradigms and bilateral involvement on tasks with visual demands.3
• Early bilinguals showed bilateral activation (all were fluent bilinguals).
• Late fluent bilinguals showed LH dominance overall and did not differ from monolinguals.
• Collapsed across acquisition age, bilinguals showed bilateral involvement for tachistoscopic viewing, dichotic listening, and verbal-manual interference testing paradigms; they showed LH dominance for tasks with auditory demands and bilateral involvement for tasks with visual and global demands.
• Bilingual and monolingual women did not differ; they were relatively more LH lateralized than bilingual men, but less LH lateralized than monolingual men.
• Bilingual men showed bilateral hemispheric involvement; monolingual men showed LH dominance. This group difference was significant.

Discussion of Meta-analytic Outcomes

Overall, our findings (Hull & Vaid, 2003b) clearly showed that monolinguals and (early) bilinguals do indeed differ in hemispheric specialization for the processing of the native language. Furthermore, a difference was obtained between early and late bilinguals. Specifically, early bilinguals showed evidence for bilateral hemispheric involvement compared to monolinguals and late bilinguals, who showed LH dominance overall. The consistency of differential lateralization in language users with distinct language acquisition histories lends support to the notion of some form of a sensitive period for language acquisition. What is not clear is whether this sensitive period reflects strictly neurological (e.g., brain maturational) parameters or other parameters (e.g., situational or cognitive changes in language experience and use over a lifetime).

In addition to the findings with respect to language acquisition history, interesting interactions of gender with group were also revealed. In particular, bilaterality was more pronounced in bilingual than in monolingual men, but despite similar trends for women, differences in laterality failed to reach significance between monolingual and bilingual women. There may be speculation as to why women do not show marked differences in lateralization as a function of the number of languages learned. Is it because women simply make greater use of both hemispheres during language tasks in general? This possibility was suggested by an additional finding in the meta-analysis: Monolingual and bilingual women showed greater RH involvement than monolingual and bilingual men during dichotic listening tasks. Another possibility is that women as a group are more sensitive to emotional or pragmatic aspects of language than are men (see Voyer, 1996) and thus may recruit the RH more than men when processing spoken language. Both these possibilities need to be explored more directly in future research.

Our meta-analytic findings (Hull & Vaid, 2003b) offer a descriptive summary of language lateralization effects across a variety of linguistic tasks and language acquisition histories and as such can be used to inform models of the functional organization of language in the bilingual brain and to suggest areas for further research in bilingualism. For example, given that early bilinguals evidenced greater RH involvement during language processing than either late bilinguals or monolinguals, further research may be directed at what this difference might reflect in terms of language-processing mechanisms.

One view has been that the RH is particularly associated with pragmatic or metalinguistic strategies (Paradis, 2000); an increased reliance on RH-mediated strategies for language processing could thus be interpreted as compensation for relatively weaker linguistic skills. However, our results
showing greater RH involvement only in bilinguals who acquired proficiency in both languages early in life argue against a compensation explanation.

An alternative possibility is that young children growing up in a multilingual environment may need to employ a holistic monitoring strategy to identify the appropriate language to use in a specific situation (e.g., with school friends vs. with grandparents). Such a strategy could conceivably stimulate increased RH activation, in line with the suggestion that the RH is preferentially involved with holistic or Gestalt-like processing (e.g., Fabbro, Gran, Basso, & Bava, 1990).

Still other interpretations of early bilinguals' apparent bilateral hemispheric involvement in language could be proposed and tested in the context of models of the bilingual mental lexicon and bilingual conceptual organization, most of which so far have drawn their evidence primarily from studies of late bilinguals (see, e.g., De Groot, 1993; Kroll, 1993; Kroll & De Groot, 1997).

Another matter of theoretical interest concerns how the patterns of hemispheric activity revealed by our meta-analysis (Hull & Vaid, 2003b) fit with what is known about the involvement of the two hemispheres in areas of cognitive functioning outside of language use. A particularly compelling comparison can be made between our results and an integrative perspective that relates neural differences to differences in declarative and procedural memory systems (Paradis, 1994; Ullman, 2001). According to the declarative/procedural model developed by Ullman (2001) on the basis of an examination of lesion, neuroimaging, and electrophysiological studies, lexical knowledge makes use of declarative memory, which is thought to be housed in temporal lobe structures. Declarative memory has been implicated in the explicit learning and use of facts and of event knowledge. In contrast, grammatical knowledge is thought to be subserved by implicit, procedural memory thought to be housed in left frontal and basal ganglia structures. Procedural memory has been implicated in the development of motor and cognitive skills in the L1. With respect to L2s acquired subsequent to an L1, Ullman proposed that linguistic forms with grammatical computation that is thought to depend on procedural systems in L1 are more dependent on declarative and lexical memory in L2, either by memorization or construction by explicit rules. As such, Ullman argued that linguistic forms in L2 should be more reliant on declarative than procedural memory than those in L1. This reliance should be greater in earlier stages of L2 acquisition than in later L2 acquisition. With increasing practice in the L2, there should presumably be an improvement in the processing of grammatical rules by procedural memory; this would be reflected in an increasing reliance on the LH for L2 processing. Consistent with Ullman's prediction, fluent late bilinguals in our meta-analysis showed LH dominance for language processing, whereas nonfluent late bilinguals showed significantly greater RH involvement, although it must be noted that our nonfluent sample was relatively small (comprised of seven comparison groups).

Bilingual studies concerning cognitive processes other than language are limited, but there is at least one recent study that compared monolinguals and bilinguals on a face discrimination task (Hausmann, Durmusoglu, Yazgan, & Gunturkun, 2004). The experiment showed that early fluent bilinguals displayed a reduced RH advantage on face discrimination tasks relative to monolinguals. The authors proposed that the neuronal space available for nonlinguistic functions in bilinguals is crowded by the demand for additional cortical space needed to process the two languages, and thus less space is available for nonlinguistic functions. Whereas caution must be used in generalizing from the results of a single study, the results of Hausmann et al. provided at least some support for the notion that early multiple language experience may affect not only the organization of language, but also that of nonlinguistic functions.

Taken together, the lateralization differences revealed by the meta-analysis (Hull & Vaid, 2003b) and the studies discussed in this section make clear that particular differences in language acquisition history (e.g., early vs. late L2 acquisition) may give rise to functional differences in language processing. In a broader sense, the present meta-analytic results also explicate a range of language skills, which any person can develop, that have a measurable impact on the functional organization of the brain for language and perhaps even for other aspects of human cognition.

An important caveat that must be kept in mind is that our (Hull & Vaid, 2003b) meta-analysis covered only a subset of the extant bilingual laterality literature, focusing as it did only on those studies in which bilinguals and monolinguals were directly compared and in which monolinguals' language matched bilinguals' L1. Thus, a good proportion of the overall bilingual laterality literature had to be excluded. Fortunately, the results of a comprehensive meta-analysis that considered the full set of bilingual laterality studies, including those
without monolingual controls (see Hull, 2003; Hull & Vaid, 2003a), corroborated the bilingual group differences noted in our meta-analysis.

A critical interpretive issue that our (Hull & Vaid, 2003b) meta-analysis brought to light is the importance of distinguishing proficiency effects from age of L2 acquisition effects when designing studies or interpreting conclusions. Because early bilinguals are generally highly proficient in both languages whereas late bilinguals tend to vary in their L2 proficiency, care must be taken to avoid interpreting proficiency effects as age effects or vice versa. In our analysis, we separately examined high versus moderately proficient late bilinguals (all early bilinguals were highly proficient) and found that most of the bilingual within-group variance was explained in terms of early versus late L2 acquisition, whereas degree of L2 fluency explained only about one third of the variance.

Our (Hull & Vaid, 2003b) review also indicated a need for caution when considering the use of language tasks involving global demands because groups tested with such tasks yielded by far the greatest levels of unexplained variance, even after the application of all combinations of the moderators (e.g., L2 acquisition age, L2 fluency, participant sex). Thus, tasks that tap global demands, such as listening to a story, appear to be inadequate for eliciting consistent lateralization patterns across participants. Given the lack of moderating influence associated with global task demands, it is suggested that future efforts aimed at finding reliable relationships between specific language tasks and hemisphericity will be more fruitful if they make use of behavioral tasks that study specific linguistic subcomponents (e.g., semantic, syntactic, phonetic, orthographic, or pragmatic) in isolation.

In general, the explanatory value of the laterality literature would be enhanced by inclusion of greater numbers of specific kinds of controlled studies. The majority of bilinguals included in our (Hull & Vaid, 2003b) meta-analysis were fluent in the L2; therefore, the results are relatively less informative on the question of effects of degree of L2 competence. More studies are needed that compare fluent bilinguals versus those with differing levels of mastery of the L2. Studies examining proficiency effects should ideally use a common standard for judging L2 fluency (Grosjean, 1998). More studies are also needed that systematically include sex as a variable and that examine late bilinguals varying in their manner of L2 acquisition. Finally, more studies that specifically look at task-related processing effects are needed. Although the meta-analysis found support for task effects, in many cases the cell sizes were too small to permit useful conclusions about the interaction of group and task effects.

Electrophysiological and Imaging Research with Bilinguals: Lessons From Behavioral Laterality Research

The findings from our (Hull & Vaid, 2003b) meta-analysis underscore the importance of including monolingual participants in any study that seeks to address whether the state of being bilingual uniquely affects cerebral representation of language. The meta-analysis showed that, when bilinguals and monolinguals are directly compared in a given experiment, something in the processing of language in bilinguals appears to be different from that in monolinguals, and that something likewise differentiates language processing in early bilinguals from individuals who acquire an L2 later in life. Moreover, the meta-analysis indicated that language processing in monolinguals may reflect more recruitment of the RH than has been expected; furthermore, it was found that RH involvement is especially evident on certain tasks and in certain subgroups (e.g., women more so than men for auditory tasks).

What we cannot conclude from this literature are answers to questions about intrahemispheric differences in the neural regions subserving various components of language or the time course of neural processing of language. Such questions are, of course, readily addressed by positron emission tomographic, functional magnetic resonance imaging (fMRI), and event-related brain potential studies (see Kutas, 1997; Hagoort, Brown, & Osterhout, 1999; and Price, 1998, for reviews). An additional question that imaging techniques can perhaps better address is how differences in cerebral representation for language interact with those for other cognitive functions, such as memory or executive control (Jackson, Swainson, Cunnington, & Jackson, 2001; Price, Green, & von Studnitz, 1999).

Although it is not our intention here to discuss in great detail the imaging literature concerning bilinguals, particularly because most such studies do not speak to lateralization differences, it may be worthwhile to describe the findings of those few that did provide some information about hemispheric
differences. One fMRI study that compared fluent English-Spanish and Spanish-English bilinguals who had acquired their L2 at around 12 years of age showed spatially overlapping frontal lobe activation patterns on a semantic processing task (Illes et al., 1999). An effect size analysis of frontal lobe laterality in these findings revealed that left frontal activation exceeded right frontal activation for both languages of the late bilinguals and more so for the L2 relative to the L1 (although the latter trend did not reach significance). Conversely, an fMRI study with moderately fluent French-English bilinguals who acquired the L2 during late childhood showed increased RH participation during L2 (relative to L1) processing on a global language comprehension task; laterality effect size analysis revealed activation differences between the two languages were significant (Dehaene et al., 1997). An additional finding of interest derives from the laterality effect size analysis of the fMRI results of a word stem completion (i.e., production) task that involved fluent Chinese-English bilinguals with late L2 acquisition (Chee, Tan, & Thiel, 1999). In this case, the analysis showed a nonsignificant trend for greater left frontal activation during L1 production relative to L2, a finding in direct contrast with that of the semantic comprehension tasks of Illes et al. (1999). Interestingly, comparisons of laterality effect sizes for frontal lobe activation patterns of early and late bilinguals in the Chee et al. study showed a nonsignificant trend for increased RH participation on both languages of early bilinguals relative to participants with late L2 acquisition.

Imaging studies that evaluate overall laterality patterns across a variety of linguistic tasks performed by participant groups that vary in language acquisition history and the specific languages learned could provide a new tool for investigating the conditions under which cerebral lateralization differences may emerge. However, to date, imaging studies as a whole have not contributed much to this question. As noted, of the approximately 40 bilingual functional imaging studies that have appeared thus far, only 6 specifically measured and reported data regarding the overall laterality of native and nonnative languages in bilinguals.

Bilingual Event-Related Potential and Laterality Studies: Converging Evidence

The number of ERP studies with bilinguals or L2 learners was at the time of this writing around 36. Although a separate review of ERP evidence is clearly needed, it is beyond the scope of the present overview. Instead, we draw certain connections between findings observed in the laterality literature, as revealed in our (Hull & Vaid, 2003b) meta-analysis, and those noted in selected ERP studies in which relevant comparisons were undertaken.

The ERP technique is used to identify averaged evoked brain potentials or components elicited by specific sensory, motor, or cognitive events (Kutas, 1997). ERP component signatures are comprised of a series of peaks (positive voltage) and valleys (negative voltage) described in terms of their polarity, peak onset, and peak amplitude and their scalp distribution. During a given cognitive task, the systematic variations in the amplitude of electrical activity in the brain (i.e., signature forms) and deviations in the latencies of their onset are termed ERP effects (Hahne & Friederici, 2001). These allow researchers to infer the degree and timing of electrical activity in the brain during a specific cognitive activity (for a review of language-related effects, see Hagoort et al., 1999; Kutas & Van Petten, 1994). The most studied of the language-related components, the N400 effect, is thought to reflect an unexpected semantic event; the N280 and P600 components in turn have been hypothesized to correlate with automatic and controlled aspects of syntactic processing, respectively (see Hahne & Friederici, 1999).

Given the high temporal resolution of the ERP technique, studies using this method allow fine-grained analysis of the temporal unfolding of events, whether these are events that normally last milliseconds, seconds, or minutes. As such, ERP studies provide an ideal opportunity to study linguistic processing in real time. Moreover, unlike traditional behavioral online methodologies, which use reaction time measures for some ancillary task on which judgments are required (e.g., lexical decision), ERP techniques do not require participants to engage in some other task but provide a more unobtrusive measure of participants' responses while they are simply reading or listening to words or sentences for comprehension.

Role of Timing of Language Exposure

Recent electrophysiological research suggests the importance of early sensory and language experience in influencing how neural subsystems subserving language will develop and function. One claim currently under study is that delayed acquisition of
an L2 (e.g., L2 acquired after puberty) is associated with deficits in grammatical processing and in the processing of certain phonological tasks, whereas semantic processing and the processing of other phonological judgments are hypothesized as unaffected by age of L2 acquisition (Ullman, 2001; Weber-Fox & Neville, 2001). Next, we briefly summarize findings of bilingual ERP studies relevant to this claim and discuss them in relation to the laterality evidence.

Semantic Processing

The typical ERP study of semantic processing employs a sentence comprehension task in which a context-anomalous versus a context-congruent word is presented, usually at the end of the sentence. Meuter, Donald, and Ardal (1987) used such a procedure to study semantic processing in fluent bilinguals, tested separately in each language, across left and right frontal and parietal sites. They compared the N400 effect elicited during L1 versus L2 sentence presentation for groups differing in the temporal order in which French and English were learned (i.e., acquisition order) and in their age of L2 acquisition (i.e., early vs. late).

Meuter et al. (1987) found that fluent bilingual groups exhibited bilateral N400 effects that were similar across L1s regardless of whether the L1s were the same or different languages. This result accorded with the meta-analytic findings for early and late fluent bilinguals performing language tasks that involved semantic demands (Hull & Vaid, 2003b). Specifically, we found that fluent bilinguals as a group (i.e., including those with early and late L2 acquisition) consistently showed a bilateral pattern of hemispheric involvement across different L1s during tasks that tapped semantic features of language, such as the sentence comprehension tasks used by Meuter et al. Furthermore, Meuter et al. noted that their French-English bilinguals (i.e., the late L2 group) tended to show a larger N400 effect for left parietal sites when reading incongruent sentence endings and more so in the L2 than in the L1. This finding may also have some support from the meta-analytic findings in that, when the set of fluent bilinguals was partitioned by age of L2 acquisition, the subset of late bilinguals was relatively more LH dominant for language as compared to early bilinguals, who showed more bilateral hemispheric involvement overall.

Another ERP study of semantic processing (Ardal, Donald, Meuter, Muldrew, & Luce, 1990) also found evidence consistent with outcomes of our meta-analysis (Hull & Vaid, 2003b). In this study, ERP recordings were compared in monolinguals and late bilinguals while they read semantically anomalous sentences in either English or French. Although much of the bilingual data were collapsed across language acquisition order, comparisons of the results when bilinguals read sentences only in the L1 revealed considerable overlap between the distributions of monolinguals and late bilinguals (Ardal et al., 1990, p. 199). Ardal et al. further reported that, regardless of acquisition order, bilinguals with a high degree of L2 proficiency displayed patterns of brain activity that correlated with those of L1 use. Our meta-analysis also found very similar patterns of (LH) brain involvement for monolinguals and late bilinguals overall on the L1.

Weber-Fox and Neville (1996, 2001) also found that late bilinguals and monolinguals showed comparable ERPs, in this case to open class words on tasks involving semantic anomaly detection and on tasks that involved the reading of correct sentences (Neville, Mills, & Lawson, 1992). These ERP results were generally consistent with the meta-analytic (Hull & Vaid, 2003b) laterality finding that, on semantic processing tasks, late fluent bilinguals and monolinguals showed similar patterns of lateralization.

Syntactic Processing

Semantic investigations have found that overall ERP patterns during L2 use increasingly overlap with those of the native language and/or of monolinguals as proficiency in the L2 increases and diverge more with decreased proficiency (Hahne & Friederici, 2001; Hahne, Jescheniak, & Friederici, 2001; Weber-Fox & Neville, 2001). Other work suggests that syntactic processing may be more vulnerable to proficiency and age of exposure effects.

Hahne and Friederici (2001) conducted a study of sentence comprehension in Japanese-German bilinguals who had acquired the L2 after puberty. The participants listened to German sentences with or without syntactic and/or semantic violations and judged them for linguistic integrity. The authors reported overlapping bilateral patterns of electrical activity for monolinguals and bilinguals in the N400 effect, although in certain cases the degree of bilateral activation was greater in the bilingual group. However, the bilinguals' ERP components for syntactic violations were quite different from those of monolinguals. Specifically, syntactic violations elicited less LH activation in late bilinguals
relative to monolinguals both at early processing stages (left anterior component, or LAN) and later ones (P600). Whereas the laterality results were based on a relatively small data set evaluating syntactic processing as such, the meta-analytic results indicated that bilinguals as a group (i.e., collapsed across early and late L2 acquisition age) showed relatively less LH lateralization than did monolinguals. More behavioral laterality studies are needed to determine whether this pattern will be supported in a larger data set.

To test whether there is electrophysiological support for a sensitive period view that late L2 users process language, particularly grammar, in a different way from native speakers, Friederici, Steinhauer, and Pfeifer (2002) trained adult participants on a miniature artificial language complete with its own vocabulary and grammatical rules distinct from those in the participants' native language. One group of participants was trained in both the lexicon and the syntax of the L2 and was termed the fluent group, whereas a second set, designated nonfluent, was trained only in the lexicon. After training, both groups were exposed to syntax violations in the (artificial) L2.

In native speakers of a single language, syntactic violations elicit a biphasic response consisting of an early negativity (which is interpreted to reflect the interruption of an automatic parsing process) and a late positivity (which is thought to reflect structural reanalysis and repair processes) (Hahne & Friederici, 1999). In previous studies of detection of syntactic anomalies involving phrase structure violations (e.g., Hahne & Friederici, 2001; Weber-Fox & Neville, 1996), late L2 learners did not show the early negativity but showed a reduced late positivity (P600) in the RH, suggesting the use of compensatory or alternative conceptual-semantic processes.

For the artificial grammar stimuli, Friederici et al. (2002) found that the fluent group exhibited brain activity patterns that correspond precisely to the biphasic ERP pattern that is commonly thought to reflect automatic syntax parsing in healthy native speakers of natural languages (p. 531). From these findings, Friederici et al. concluded that there was no support for the notion that late acquisition of an L2 dictates the recruitment of neural substrates for language that are in addition to or distinct from those of monolinguals. Quite the reverse, they interpreted their findings as evidence for an overlap in neural activity for late L2 learners and native speakers.

Similarly, in a recent study by Weber-Fox and Neville (2001) that extended a previous one and used mostly the same participants (Weber-Fox & Neville, 1996), English-speaking monolinguals and Chinese-English users varying in their age of L2 exposure showed no group differences in the processing of closed class words. Instead, the N280 component was largest over left anterior electrode sites for all groups. This finding was in opposition to a prediction (supported in a previous study with American Sign Language [ASL]-English users; see Neville et al., 1992) that ERPs elicited by closed class words would be more sensitive to delays in L2 immersion relative to ERPs elicited by open class words and semantic anomalies. The divergence between the performance of the ASL (Neville et al., 1992) and the Chinese learners of English (Weber-Fox & Neville, 2001) was interpreted to reflect differences in English grammar proficiency rather than L1 differences. To account for a discrepancy in results for syntactic judgments between the earlier Chinese-English study (1996) and the later one, Weber-Fox and Neville (2001) proposed that the decreased asymmetry found previously could have been in part related to differences in the specific language functions tested, i.e., detection of grammatical anomalies vs. reading appropriately used closed-class words (p. 1350). In other words, tasks involving syntactic anomaly detection (which typically rely on anomalous uses of closed class words) may have called on different kinds of processes than tasks involving correctly used closed class words.

Although our (Hull & Vaid, 2003b) meta-analysis had an insufficient number of data points to permit reliable conclusions about laterality differences in syntactic processing between language groups (there were only three comparison sets), we did find that late (fluent) L2 learners exhibited lateralization patterns for language processing that were identical to those of monolinguals, as was found by Friederici et al. (2002) and Weber-Fox and Neville (2001).

Notably, Friederici et al. (2002) found that the nonfluent late bilinguals in their sample failed to show patterns of brain activity similar to those of native speakers. Similarly, Neville et al. (1992) noted that longer peak latencies in deaf ASL-English users were associated with less-proficient knowledge of English grammar, as measured by a standardized test. Whereas the meta-analysis was unable to provide reliable data on the laterality patterns of nonfluent (late) L2 learners because of the small sample size (seven comparison groups, in this case), the data we did have showed a trend toward less LH lateralization for nonfluent L2 users
compared with monolinguals. Taken together, our findings and those of Friederici et al. (2002) with respect to L2 proficiency suggest that it may have a moderating effect on the organization of an L2 for those who acquire the L2 later in life.

Finally, with respect to our meta-analytic finding of differential organization for language in early L2 acquirers relative to late L2 learners, supportive evidence was found in an ERP study conducted by Neville et al. (1997). The study recorded ERPs from proficient early and late bilinguals during a sentence-reading task. The results indicated that the patterns of neural activity in early bilinguals, at least as related to the syntactic processing of sentences, involved more symmetrical participation of the two hemispheres relative to proficient late bilinguals. The finding prompted Neville et al. to conclude marked effects of age of acquisition of language (p. 305) because this relatively large RH involvement found in early bilinguals was not evident in the late L2 learners. That is, late bilinguals in this study showed relatively more LH dominance than early bilinguals, a finding that corroborated our own results concerning the directionality of differential lateralization of language in early and late bilinguals.

Conclusion

The earliest source of evidence on the neural mediation of language in brain-intact bilingual users, namely, cerebral laterality research, has been supplemented by two additional sources of evidence: hemodynamic neuroimaging and electrophysiological research on language. These more recent sources offer increased spatial and temporal precision in the mapping of language phenomena and as such show great promise in deepening and refining our understanding of neural correlates of language processing. Nevertheless, we believe that the newer methodologies stand to benefit from greater consideration of the potential relevance in terms of the hypotheses posed, the variables addressed, and the outcomes obtained from the earlier laterality literature.

The laterality literature provided the earliest articulation of experimental hypotheses of neural underpinnings of cognitive differences between monolinguals and bilinguals and among different bilingual subgroups. It is promising that the trends observed in our meta-analytic findings in the laterality literature (Hull & Vaid, 2003b) are generally supported by the findings of ERP studies with bilinguals and late L2 learners. Specifically, both sources of evidence suggest that semantic processing may be relatively less altered than syntactic processing by differences in language experience (e.g., monolingual vs. delayed L2 exposure in bilinguals). The ERP evidence further shows that syntactic processing may be vulnerable to delayed L2 exposure, particularly when the individuals are not proficient in the L2 (Friederici et al., 2002). The evidence from ERP and laterality studies also converges in suggesting that acquisition of an L2 very early in life (i.e., by the age of 6 years) appears to influence cerebral organization of language in ways that are distinct from those of late L2 learners (Neville et al., 1997). Specifically, early bilinguals tend to exhibit bilateral patterns of brain activation, whereas late bilinguals show LH dominance for language overall. The laterality data also point to RH participation in language processing even in monolinguals, at least for certain tasks (see Federmeier & Kutas, 2002, for relevant ERP data on the RH's contribution).

One lesson from the laterality literature is the importance of including relevant comparison groups in the research design. Two critical ones we identified were monolinguals and early bilinguals. In an earlier meta-analysis of the language laterality literature with bilinguals (Vaid & Hall, 1991), only 11 of the 59 studies that were included had monolingual counterparts, the assumption in the literature at that time apparently was that monolinguals' performance indices need not be specifically studied as they were presumed to show the canonical pattern of LH dominance. Yet, as revealed in the more recent meta-analysis that considered 23 monolingual comparison groups (Hull & Vaid, 2003b), this assumption is unwarranted given that monolinguals also may show bilateral hemispheric involvement, at least on certain tasks and procedures. Thus, the main question implicitly raised when individual differences in laterality first came under study (i.e., is there more RH involvement in bilinguals relative to monolinguals?) must be replaced with more fine-grained investigations that acknowledge the role of the RH in both bilingual and monolingual language processing, and that look for relative differences in RH involvement across groups and in interaction with stimulus and task conditions.

So far, only a small subset of the electrophysiological and neuroimaging studies with bilinguals has included monolingual or early bilingual controls. This should be redressed in future research using such techniques. In addition, L2 fluency and
age of acquisition of the L2 have tended to be confounded in recent neuroimaging and ERP research. These variables also require disentangling in future research.

Finally, what is particularly intriguing from the earliest laterality studies to the most recent ERP ones is the finding that early acquisition of bilingualism appears to result in bilateral involvement for language processing. More research needs to be done to explore how early simultaneous exposure to two languages may alter metalinguistic processing strategies, either at the executive control level (Bialystok, 2001) or at the level of early functional differentiation of linguistic structures (see Genesee, 2003).

Although studies on the bilingual brain using imaging technologies have revived interest in long-standing questions about hemispheric differences associated with bilinguality, the research is still in its infancy. It is our belief that progress in understanding the complexities and the specificities of the bilingual brain using these newer technologies will require that all available sources of evidence be consulted when formulating hypotheses and interpreting findings. Stronger links need to be forged, we feel, with the cognitive literature on bilingualism (see De Bruijn, Dijkstra, Chwilla, & Schriefers, 2001; Gollan & Kroll, 2001), with bilingual aphasia research (Green & Price, 2001; Paradis, 2001), and last, but not least, with the bilingual laterality literature.

Acknowledgments

This research was supported by a Texas A&M University Academic Excellence Award to the first author and by a Texas A&M University Honors Teacher/Scholar Award to the second author. Portions of this research were presented at the Third International Symposium on Bilingualism held in Bristol, United Kingdom, in 2001 and at the annual meeting of the Southwest Cognition Conference held in Dallas, Texas, in 2001. We thank Renata Meuter and Michel Paradis for comments on the chapter and Judy Kroll and Annette de Groot for their thorough editing.

Notes

1. Segalowitz (1986) provided a concise review of relevant findings in the aphasia literature, as well as useful discussion of reliability and validity issues with respect to behavioral measures of brain lateralization for language.
2. In light of the fact that only one study involved in our meta-analysis included data on the manner of L2 acquisition, we were unable to meta-analytically evaluate the manner hypothesis.
3. Semantic demands also indicated bilateral activation in monolinguals, but these data were drawn from only seven comparison groups; all other moderators and interactions not reported were also too small to provide reliable data.

References

Albert, M., & Obler, L. K. (1978). Experimental neuropsychology. In M. Albert & L. K. Obler (Eds.), The bilingual brain: Neurolinguistic aspects of bilingualism (pp. 158–201). New York: Academic Press.
Ardal, S., Donald, M., Meuter, R., Muldrew, S., & Luce, M. (1990). Brain responses to semantic incongruity in bilinguals. Brain and Language, 39, 187–205.
Bialystok, E. (2001). Bilingualism in development: Language, literacy and cognition. New York: Cambridge University Press.
Burton, H., Snyder, A., Conturo, T., Akbudak, E., Ollinger, J., & Raichle, M. (2002). Adaptive changes in early and late blind: A fMRI study of Braille reading. Journal of Neurophysiology, 87, 589–607.
Chee, M., Tan, E., & Thiel, T. (1999). Mandarin and English single word processing studied with functional magnetic resonance imaging. Journal of Cognitive Neuroscience, 19, 3050–3056.
Chiarello, C. (2003). Parallel systems for processing language: Hemispheric complementarity in the normal brain. In M. Banich & M. Mack (Eds.), Mind, brain and language: Multidisciplinary perspectives (pp. 229–247). Mahwah, NJ: Erlbaum.
De Bruijn, E., Dijkstra, T., Chwilla, D., & Schriefers, H. (2001). Language context effects on interlingual homograph recognition: evidence from event-related potentials and response times in semantic priming. Bilingualism: Language and Cognition, 4, 155–168.
De Groot, A. M. B. (1993). Word type effects in bilingual processing tasks: Support for a mixed-representational system. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 27–51). Amsterdam: Benjamins.
Dehaene, S., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., Perani, D., et al. (1997). Anatomical variability in the cortical representation of first and second language. NeuroReport, 8, 3809–3815.
Fabbro, F. (2001). The bilingual brain: Bilingual aphasia. Brain and Language, 79, 201–210.
Fabbro, F., Gran, L., Basso, G., & Bava, A. (1990). Cerebral lateralization in simultaneous interpretation. Brain and Language, 39, 69–89.
Federmeier, K., & Kutas, M. (1999). Right words and left words: Electrophysiological evidence for hemispheric differences in meaning processing. Cognitive Brain Research, 8, 373–392.
Federmeier, K., & Kutas, M. (2002). Picture the difference: Electrophysiological investigations of picture processing in the cerebral hemispheres. Neuropsychologia, 40, 730–747.
Friederici, A., Steinhauer, K., & Pfeifer, E. (2002). Brain signatures of artificial language processing: Evidence challenging the critical period hypothesis. Proceedings of the National Academy of Sciences, 99, 529–534.
Galloway, L., & Krashen, S. (1980). Cerebral organization in bilingualism and second language. In R. Scarcella & S. Krashen (Eds.), Research in second language acquisition (pp. 74–80). Rowley, MA: Newbury.
Genesee, F. (1982). Experimental neuropsychological research on second language processing. TESOL Quarterly, 16, 315–321.
Genesee, F. (2003, April). Bilingual acquisition: Exploring the limits of the language faculty. Keynote address, Fourth International Symposium on Bilingualism, Arizona State University, Tempe.
Genesee, F., Hamers, J., Lambert, W. E., Mononen, L., Seitz, M., & Starck, R. (1978). Language processing strategies in bilinguals: A neuropsychological study. Brain and Language, 5, 1–12.
Gollan, T., & Kroll, J. F. (2001). Bilingual lexical access. In B. Rapp (Ed.), The handbook of cognitive neuropsychology (pp. 321–345). Philadelphia: Psychology Press.
Goral, M., Levy, E., & Obler, L. K. (2002). Neurolinguistic aspects of bilingualism. The International Journal of Bilingualism, 6, 411–440.
Green, D. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Green, D., & Price, C. (2001). Functional imaging in the study of recovery patterns in bilingual aphasia. Bilingualism: Language and Cognition, 4, 191–201.
Grosjean, F. (1998). Studying bilinguals: Methodological and conceptual issues. Bilingualism: Language and Cognition, 1, 131–149.
Grosjean, F. (2000). The bilingual's language modes. In J. Nicol (Ed.), One mind, two languages: Bilingual language processing (pp. 1–22). Oxford, U.K.: Blackwell.
Hagoort, P., Brown, C., & Osterhout, L. (1999). The neurocognition of syntactic processing. In C. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 273–315). Oxford, U.K.: Oxford University Press.
Hahne, A., & Friederici, A. (1999). Electrophysiological evidence for two steps in syntactic analysis: Early automatic and late controlled processes. Journal of Cognitive Neuroscience, 11, 194–205.
Hahne, A., & Friederici, A. (2001). Processing a second language: Late learners' comprehension mechanisms as revealed by event-related potentials. Bilingualism: Language and Cognition, 4, 123–141.
Hahne, A., Jescheniak, J., & Friederici, A. (2001, November). Processing a foreign language: Sentence comprehension mechanisms as revealed by ERPs. Poster presented at the annual meeting of the Psychonomic Society, Orlando, FL.
Hausmann, M., Durmusoglu, G., Yazgan, Y., & Gunturkun, O. (2004). Evidence for reduced hemispheric asymmetries in non-verbal functions in bilinguals. Journal of Neurolinguistics, 17, 285–299.
Hull, R. (2003). How does bilingualism matter? A meta-analytic tale of two hemispheres. Unpublished doctoral dissertation, Texas A&M University, College Station.
Hull, R., & Vaid, J. (2003a, May). A (continuing) tale of two hemispheres. Paper presented at the Fourth International Symposium on Bilingualism, Arizona State University, Tempe.
Hull, R., & Vaid, J. (2003b). What is right? A meta-analysis of bilingual vs. monolingual language lateralization. Manuscript submitted for publication.
Illes, J., Francis, W., Desmond, J., Gabrieli, J., Glover, G., Poldrack, R., et al. (1999). Convergent cortical representation of semantic processing in bilinguals. Brain and Language, 70, 347–363.
Jackson, G., Swainson, R., Cunnington, R., & Jackson, S. (2001). ERP correlates of executive control during repeated language switching. Bilingualism: Language and Cognition, 4, 169–178.
Johnson, J., & Newport, E. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–99.
Joseph, R. (1982). The neuropsychology of development: Hemispheric laterality, limbic language, and the origin of thought. Journal of Clinical Psychology, 44, 3–34.
Klein, D., Zatorre, R., Milner, B., & Zhao, V. (2001). A cross-linguistic PET study of tone perception in Mandarin Chinese and English speakers. NeuroImage, 13, 646–653.
Kochunov, P., Fox, P., Lancaster, J., Tan, L.H., Amunts, K., Zilles, K., et al. (2003). Localized morphological brain differences between
English-speaking Caucasians and Chinese-speaking Asians. Developmental Neuroscience, 14, 14.
Kroll, J. F. (1993). Accessing conceptual representation for words in a second language. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 53–81). Amsterdam: Benjamins.
Kroll, J. F., & De Groot, A. M. B. (1997). Lexical and conceptual memory in the bilingual: Mapping form to meaning in two languages. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism (pp. 169–199). Mahwah, NJ: Erlbaum.
Kroll, J. F., & Dijkstra, T. (2002). The bilingual lexicon. In R. Kaplan (Ed.), Handbook of applied linguistics (pp. 301–321). Oxford, U.K.: Oxford University Press.
Kutas, M. (1997). Views on how the electrical activity that the brain generates reflects the functions of different language structures. Psychophysiology, 34, 383–398.
Kutas, M., & Van Petten, C. (1994). Psycholinguistics electrified: Event-related brain potential investigations. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 83–143). San Diego, CA: Academic Press.
Liu, Y., & Perfetti, C. A. (2003). The time course of brain activity in reading English and Chinese: An ERP study of Chinese bilinguals. Human Brain Mapping, 18, 167–175.
Meuter, R., Donald, M., & Ardal, S. (1987). A comparison of first- and second-language ERPs in bilinguals. Current Trends in Event-Related Potential Research, (S40), 412–416.
Nakada, T., Fujii, Y., & Kwee, I. (2001). Brain strategies for reading in the second language are determined by the first language. Neuroscience Research, 40, 351–358.
Neville, H., Coffey, S., Lawson, D., Fischer, A., Emmorey, K., & Bellugi, U. (1997). Neural systems mediating American Sign Language: Effects of sensory experience and age of acquisition. Brain and Language, 57, 285–308.
Neville, H., Mills, D., & Lawson, D. (1992). Fractionating language: Different neural subsystems with different sensitive periods. Cerebral Cortex, 2, 244–258.
Newport, E., Bavelier, D., & Neville, H. (2001). Critical thinking about critical periods: Perspectives on a critical period for language acquisition. In E. Dupoux (Ed.), Language, brain and cognitive development: Essays in
Obler, L. K., Zatorre, R., Galloway, L., & Vaid, J. (1982). Cerebral lateralization in bilinguals: Methodological issues. Brain and Language, 15, 40–54.
Paradis, M. (1994). Neurolinguistic aspects of implicit and explicit memory: Implications for bilingualism. In N. Ellis (Ed.), Implicit and explicit learning of first and second languages (pp. 393–419). San Diego, CA: Academic Press.
Paradis, M. (1999). Neuroimaging studies of the bilingual brain: Some words of caution. Paper presented at 25th Lacus Forum, University of Alberta, Edmonton.
Paradis, M. (2000). The cerebral division of labor in verbal communication. Brain and Cognition, 43, 13.
Paradis, M. (2001). Bilingual and polyglot aphasia. In R. S. Berndt (Ed.), Handbook of neuropsychology (pp. 69–91). Oxford, U.K.: Elsevier Science.
Paradis, M. (2003). The bilingual Loch Ness monster raises its nonasymmetric head again–or, Why bother with such cumbersome notions as validity and reliability? Comments on Evans et al. (2002). Brain and Language, 87, 441–448.
Price, C. (1998). The functional anatomy of word comprehension and production. Trends in Cognitive Science, 2, 281–288.
Price, C., Green, D., & von Studnitz, R. (1999). A functional imaging study of translation and language switching. Brain, 122, 2221–2236.
Segalowitz, S. (1986). Validity and reliability of noninvasive lateralization measures. Child Neuropsychology, 1, 191–208.
Sussman, H., & Simon, T. (1988). The effects of gender, handedness, L1/L2 and baseline tapping rate on language lateralization: An assessment of the time-sharing paradigm. Journal of Clinical and Experimental Psychology, 10, 69.
Ullman, M. (2001). The neural basis of lexicon and grammar in first and second language: The declarative/procedural model. Bilingualism: Language and Cognition, 4, 105–122.
Vaid, J. (1983). Bilingualism and brain lateralization. In S. Segalowitz (Ed.), Language functions and brain organization (pp. 315–339). New York: Academic Press.
Vaid, J. (2002). Bilingualism. In V. S. Ramachandran (Ed.), Encyclopedia of the human brain
honor of Jacques Mehler (pp. 481502). (Vol. 1, pp. 417434). San Diego, CA:
Cambridge, MA: MIT Press. Academic Press.
Obler, L. K. (1981). Right hemisphere participa- Vaid, J., & Genesee, F. (1980). Neuropsychologi-
tion in second language acquisition. In cal approaches to bilingualism. Canadian
K. Diller (Ed.), Individual differences and Journal of Psychology, 34, 417445.
universals in language learning aptitude Vaid, J., & Hall, D. G. (1991). Neuropsychological
(pp. 5364). Rowley, MA: Newbury. perspectives on bilingualism: Right, left and
496 Aspects and Implications of Bilingualism
center. In A. Reynolds (Ed.), Bilingualism, multiculturalism, and second language learning: The McGill conference in honor of Wallace E. Lambert (pp. 81–112). Hillsdale, NJ: Erlbaum.
Vaid, J., & Hull, R. (2002). Re-envisioning the bilingual brain using functional neuroimaging: Methodological and interpretive issues. In F. Fabbro (Ed.), Advances in the neurolinguistics of bilingualism: A festschrift for Michel Paradis (pp. 315–355). Udine, Italy: Udine University Press.
Voyer, D. (1996). On the magnitude of laterality effects and sex differences in functional lateralities. Laterality, 1, 51–83.
Weber-Fox, C., & Neville, H. (1996). Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers. Journal of Cognitive Neuroscience, 8, 231–256.
Weber-Fox, C., & Neville, H. (2001). Sensitive periods differentiate processing of open- and closed-class words: An ERP study of bilinguals. Journal of Speech, Language, and Hearing Research, 44, 1338–1353.
Zatorre, R. (1989). On the representation of multiple languages in the brain: Old problems and new directions. Brain and Language, 36, 127–147.
Jubin Abutalebi
Stefano F. Cappa
Daniela Perani
24
What Can Functional Neuroimaging
Tell Us About the Bilingual Brain?
ABSTRACT Over the past decade, functional neuroimaging technologies such as positron emission tomography and functional magnetic resonance imaging have enabled neuroscientists to examine the spatial and temporal mechanisms of cognitive functioning and to probe online the close relationship between brain and mind. The advent of these noninvasive neuroimaging techniques opened a new era in the investigation of language organization in healthy individuals. The main focus of the present chapter is to provide an overview of the most relevant results that have so far been achieved in the exploration of the cerebral basis of bilingualism using functional neuroimaging techniques and to discuss which conclusions may be drawn from these studies. In particular, this chapter focuses on the potential role of a number of variables suggested to play a role in the shaping of language representations in the bilingual brain. Consistent results indicate that attained second language (L2) proficiency and perhaps language exposure are more important than the age of L2 acquisition as a determinant of the cerebral representation of languages in bilinguals/polyglots. Indeed, increasing L2 proficiency appears to be associated at the neural level with the engagement of the same network subserving the first language (L1) within the dedicated language areas, but it has also been shown that age of L2 acquisition may specifically affect the cortical representation of grammatical processing.
or rather the degree of L2 proficiency? These are just a few of the many questions that can be raised in this respect. The last three issues are of particular interest to neuroscientists and may be well addressed with the advances of functional neuroimaging techniques.

The main focus of the present chapter is to provide an overview of the most relevant results that have so far been achieved in the field of the cerebral basis of bilingualism using functional neuroimaging techniques and to discuss which conclusions may be drawn from these studies. We start with a brief introduction to the history of mapping language in the human brain; this discussion is followed by an elementary overview of PET and fMRI techniques and of their contribution to the field of neurolinguistics. We then consider those neuroimaging studies that have specifically addressed the cerebral organization of multiple languages. In particular, this chapter focuses on the potential role of a number of variables that have been suggested to play a role in the shaping of language representations in the bilingual brain.

The Brain and Language Relationship

Since the mid-18th century, brain scientists have proposed that several different parts of the brain are involved with language. Indeed, in the 19th century there was a rapid expansion of knowledge because of the systematic investigation of the effects of localized brain damage on language processing (the anatomoclinical method). This marked the beginning of an era of attempts to localize mental functions within the brain. Although earlier authors had appreciated, for example, that the substance of the brain, as opposed to the ventricles, had specific functions, the main localization theories began with the phrenologists. Gall speculated (1815, as cited in Leischner, 1987, p. 133) that the human brain was composed of many organs in which various human faculties resided. In those theories, an essential duality of the brain was assumed; that is, both sides were considered equipotential.

The duality theories were soon superseded by more discrete localization theories, closely related to the observations of the French surgeon Paul Broca (1861, 1865), who pointed out that there was an area in the brain especially devoted to speech. The story of Broca's achievements has been well recorded, and the subsequent designation of the foot of the third frontal convolution of the dominant hemisphere as Broca's area is widely known. However, it was Wernicke, in his monograph (1874) on aphasia, who attempted to create a comprehensive model based on anatomoclinical localization principles.

Within the next dozen or so years, many different cerebral centers for various functions were defined, comprising centers for writing, reading, calculating, and so on. In general, these implicated the left side of the brain. Moreover, these discoveries gave scientists the first glimpses of the distributed nature of language function in the brain. The brain seemed to have no single location where language is created or stored. Instead, it looked as if different parts of the brain control different aspects of speech and language.

Interest in aphasia in bilinguals developed concurrently with the discovery of these various language centers and reflected the numerous controversies about the representation of language in the brain. In particular, it was observed that, if a bilingual subject was affected by an aphasia-producing left hemispheric lesion, both languages were not always affected to the same degree. Moreover, the recovery of language, which could follow, was not always parallel for both languages. Many different language recovery patterns have been described (for a classification, see Paradis, 1983). To account for patterns of recovery in bilingual aphasia that could be labeled as differential, selective, successive, and antagonistic, neurologists invoked differential cerebral localization for each language.

For example, Scoresby-Jackson (1867) postulated that the foot of the third frontal convolution (Broca's area) should be a sort of language organ only for native languages, whereas the remaining part of the convolution might be responsible for L2s. He gave this explanation to account for an aphasic patient who selectively lost the use of his L2 after brain damage.

Pitres (1895) strongly argued against this view of different cerebral localization for different languages. Pitres founded his criticism on Charcot's theory, which assumed the existence of four independent speech centers (articulatory, auditory, graphic, and reading). Pitres pointed out that recognizing a separate center for each language would mean admitting the existence of four centers for each language. The impairment of one language would then presuppose the existence of four lesion foci, which is unlikely.

From that time, the debate over a hypothetical differential localization of multiple languages in the
Functional Neuroimaging of the Bilingual Brain 499
same brain has invigorated discussion (for review, see Paradis, 1998; Fabbro, 1999). Some authors argued against an anatomic segregation for multiple languages within the language areas (Penfield, 1965). The majority of researchers were inclined to consider various kinds of differential representation, including distinct neuroanatomic localization. Segalowitz (1983) argued that it would be surprising if bilingualism had no effect on brain organization, and that there are numerous reasons to believe that cerebral representation of language is not entirely the same in polyglots as in monolinguals. Others have proposed that bilinguals are somewhat less lateralized than monolingual speakers, with the right hemisphere prevalently subserving one of the languages of the bilingual (Albert & Obler, 1978).

The aphasiological panorama was enriched in the late 1970s by studies of language representation in bilinguals that used electrical cortical stimulation, which temporarily inactivates a brain region (Ojemann & Whitaker, 1978). With these techniques, Ojemann and Whitaker mapped naming sites in the lateral cortex of the dominant cerebral hemisphere in bilingual epileptic patients chosen for neurosurgical treatment. In all patients studied, each language involved some common sites of naming interference and some specific areas in which naming was interrupted only for one language. In the series of studies carried out by his group, Ojemann postulated that L2s should be organized in a somewhat different manner because their naming areas were generally larger than those for first languages (L1s).

It is noteworthy that both aphasiological studies and electrical cortical stimulation in bilinguals have provided evidence for heterogeneous patterns of localization. At the same time, they may have actually hampered the effort to define the general rules and the determinants of language organization in the bilingual brain. Indeed, although clinical studies have enhanced our knowledge about language recovery patterns in bilingual aphasics, they were not successful in defining the differential architecture of the bilingual brain and in identifying the variables that may be responsible for the heterogeneous patterns of language localization.

More than a century ago, Pitres (1895) theorized that greater exposure to a given language prior to disease onset may be a crucial factor for the differential recovery of languages, and Calvin and Ojemann (1994) asked whether the larger naming sites for the L2 were caused by decreased knowledge or rather by later acquisition. In psycholinguistics, it is well known that several factors may influence the bilingual's performance, among them the age of L2 acquisition, the degree of proficiency in each language, the modality of language learning, and the differential exposure to languages. In general, it was quite difficult to address these issues in clinical studies, mainly because of the well-known limitations of the anatomoclinical method.

Functional neuroimaging offers a number of advantages over patient studies, and lesion-based neuropsychology in general, for understanding the functional organization of the bilingual brain. First, aphasiological studies deal with experiments of nature in which it is, of course, impossible to control for the linguistic variables that may affect language representation. Second, aphasiological study may demonstrate whether a certain brain region is necessary for a given language component, but not usually the broader system of which that region may form a part. Third, the kind of anatomic information that can be derived from clinical studies is limited, with lesions that often differ markedly in size and location across different patients. PET and fMRI allow more precise spatial characterization of the areas of activation during language tasks. Thus, the advent of noninvasive neuroimaging techniques, as well as the application of electrophysiological techniques such as event-related brain potentials and magnetoencephalography, makes it more feasible to address crucial questions related to the cerebral organization of multiple languages. With these techniques, we can focus on healthy bilingual subjects with well-defined language backgrounds, and by using well-designed paradigms, we can attempt to characterize the neural architecture of the bilingual brain.

Functional Neuroimaging Techniques and Their Application in Neurolinguistics

Functional neuroimaging technologies such as PET and fMRI have enabled neuroscientists to examine the spatial and temporal mechanisms of cognitive functioning and to probe online the close relationship between brain and mind. The application of these technologies to address appropriate research issues may enable us to localize the components of cognitive processing in the human brain and to image their orchestration as humans perform a variety
of cognitive tasks. If a cognitive process can be sustained for only a few seconds, the snapshot revealed by PET or fMRI can show which parts of the brain are active and to what degree (see Perani & Cappa, 1998, for a review).

It is generally accepted that regional cerebral blood flow (rCBF) reflects synaptic activity. Local increases in blood flow are necessary to replace the energy consumed by neurons. These changes in rCBF have been demonstrated to be closely related to changes in neural activity in both space and time. In functional neuroimaging studies, images of blood flow are collected in at least two different conditions (e.g., while generating words and while at rest). The perfusion data are then compared to find areas in which the experimental task is associated with increased cerebral blood flow in comparison with the control task. These areas of increased perfusion are typically referred to as activations.

PET measures blood flow employing radioactively labeled water, specifically hydrogen combined with oxygen (¹⁵O), a radioactive isotope of oxygen. The labeled water, which is administered into a vein in the arm, emits copious numbers of positrons as it decays. In just over a minute, the radioactive water accumulates in the brain, providing an image of blood flow. The fast decay of ¹⁵O and the small amounts permit many measurements of blood flow to be performed in a single session. Each picture serves as a snapshot that provides information about the momentary activity of the brain. Typically, images of blood flow are collected before a task is begun, thus providing a baseline condition (control task) to compare with those obtained when the brain is engaged in the experimental task. Subtracting blood flow measurements collected during the control task from those associated with the experimental task indicates the parts of the brain active during the latter.

Thus, PET allows assay of biological systems in vivo, providing information about brain function that is complementary to the anatomic information portrayed by structural imaging techniques, such as computed tomography and magnetic resonance imaging (MRI). Indeed, combining functional PET data with the high-resolution anatomic maps produced by MRI provides powerful data sets to investigate structure/function relationships in the brain.

Compared to PET, fMRI is a more recent noninvasive technique based on the measurement of MRI signal changes associated with alterations in local blood oxygenation levels. The fundamentals of fMRI are well established and are based on a phenomenon known as blood oxygenation level dependence (BOLD). In response to the activation, the rCBF increases to the relevant region, but for reasons that are still not well understood, the rCBF increases far more than the expected increase in oxygen demand (Ogawa, Lee, Kay, & Tank, 1990). The BOLD effect is particularly manifested in the venous compartment, which is only 60–70% saturated with oxygen at rest and hence has the capacity to get more oxygenated during the activation state, with a corresponding increase in MRI signal intensity. Using this totally noninvasive method, it is possible to localize functional brain activation with an accuracy of millimeters and a temporal resolution of about 3 s.

Besides these advantages in spatial and temporal resolution when compared to the PET technique, the fact that no radionuclides are used makes it feasible to repeat experiments several times on the same subject. Using fMRI, it is therefore possible to take advantage of more complex experimental designs.

However, fMRI has some limits. For instance, crucial structures of the brain (in particular, orbitofrontal and inferior temporal regions and the temporal pole) may not be visualized because of interference with the magnetic field. This is mainly because the air enclosed in adjacent structures (the middle ear and the mastoid bone) creates serious interference with the magnetic field, resulting in a loss of their visualization.

A large body of functional neuroimaging studies has been devoted to the investigation of language organization in the intact human brain. Briefly, imaging studies employing these techniques have not only largely confirmed the anatomic knowledge gained from neuropsychological lesion studies, but also opened a number of new perspectives in our understanding of the brain–language relationship. Indeed, most imaging studies underline the importance of classical language-related areas within the perisylvian cortex of the left hemisphere, such as Broca's area. However, functional neuroimaging studies have considerably enlarged and redefined the scope of its participation in language processing: The left frontal convexity is involved in many tasks, such as word generation (Martin, Wiggs, Ungerleider, & Haxby, 1996), semantic and phonemic fluency (Mummery, Patterson, Hodges, & Wise, 1996; Paulesu et al., 1997), semantic monitoring (Thompson-Schill, D'Esposito, Aguirre, & Farah, 1997), and verbal working memory (Smith, Jonides, & Koeppe, 1996). Moreover, language-related
activation has also been reported outside the classical language areas, such as in the inferior temporal gyrus and in the temporal pole, in the lingual and fusiform gyri (see reviews in Price, 1998, and Indefrey & Levelt, 2000). Furthermore, right hemispheric activation in mirror regions is observed during the performance of most language tasks. These language-related areas located outside the classical language zone appear to be specialized for specific components of language processing, such as lexical semantics. Notably, the functional role of the language-related areas as revealed by neuroimaging techniques appears to be characterized in terms of linguistically relevant systems, such as phonology, syntax, and lexical semantics, rather than in terms of activities, such as speaking, repeating, reading, and listening (Neville & Bavelier, 1998). For instance, a neuroimaging experiment of syntax error detection in monolinguals (Moro, Tettamanti, Perani, Donati, Cappa, & Fazio, 2001) detected the involvement of a selective deep component of Broca's area and a right inferior frontal region in addition to the left caudate nucleus and insula activated only during syntactic processing, indicating their role in syntactic computation. These findings provide original in vivo evidence that these brain structures in fact constitute an integrated neural network selectively engaged in morphological and syntactic computation.

Functional neuroimaging has also taught us that areas related to linguistic processing in the normal human brain appear not only more extended, but also less fixed than previously thought. For example, even when the task and experimental design are held constant, changes in language-related brain activation can be observed as a consequence of increased familiarity with the task. Striking evidence was provided by Petersen, Van Mier, Fiez, and Raichle (1998), who investigated the effects of practice on a verbal task using PET. The neural differences putatively related to processing differences between a high and a low practice performance of verb generation were highlighted by this study, in which decreasing brain activity in the left frontal convexity was reported following practice.

Visualizing the Bilingual Brain

As illustrated in the discussion here, some aphasiological investigations in bilinguals have provided evidence that a bilingual may selectively lose one of his or her languages while the other is spared, suggesting that multiple languages in the same brain may be somewhat differentially organized. The current research is focused on the degree of functional integration or separation of the languages in the polyglot brain. Several environmental factors have been considered to affect the neural organization of language, such as age of language acquisition and degree of proficiency attained in each of the spoken languages.

Regarding the first factor, a large body of literature suggests that linguistic abilities are sensitive to the age of exposure to language. People who learn a language at later ages, particularly after late infancy or puberty, do not generally achieve the same level of proficiency as young learners (Birdsong, 1999; Johnson & Newport, 1989). The causes of these age effects on language performance are controversial (see also Birdsong, chapter 6, this volume). Explanations range from the postulation of biologically based critical periods to differences between infant and adult learning contexts (Lenneberg, 1967; for extensive discussion about the critical period, see also DeKeyser & Larson-Hall, chapter 5, this volume). In particular, the phonological and morphosyntactical components seem particularly deficient when L2 is learned later in life, whereas the lexicon seems to be acquired with less difficulty after puberty. This fact may entail the hypothesis that the neural representation of an L2 differs as a function of its age of acquisition.

On the other hand, proficiency also appears to play an important role in L2 organization. Several psycholinguistic studies indicated that processing the L2 changes during the acquisition in late language learners. For instance, in early stages of language learning, lexical items of the L2 are processed through association with their translation equivalents in the L1, whereas in later learning stages (and with increased proficiency), processing of L2 words is more directly conceptually mediated (Kroll & Dufour, 1995; Kroll & Stewart, 1994). In other words, L1 and L2 lexical items are both thought to access a common semantic system directly as a bilingual becomes more proficient in the L2. Thus, it may be asked if the increasing proficiency of late learners also entails a reorganization of language areas in the bilingual brain. Similarly, it could be asked if a hypothetical segregation of language areas is only a function of different ages of L2 acquisition. These interesting issues have been addressed by functional neuroimaging studies in normal adult bilinguals.

Here, we review these investigations with the specific aim to detect the factors that may have a
major impact on the cerebral organization of two languages. The studies are here divided into two groups: those investigating language production (including word repetition) and those investigating language comprehension in bilinguals. As we described elsewhere (Abutalebi, Cappa, & Perani, 2001), this broad subdivision is only based on the experimental paradigms used for the imaging studies, which include a number of diverse behavioral tasks, ranging from sentence comprehension to lexical retrieval. Although some of these can be clearly considered to focus, respectively, on input processes or output processes (word generation), the distinction is not directly applicable to other language domains, such as word repetition and judgment tasks. Nonetheless, this atheoretical and to a certain degree arbitrary subdivision appears to have interesting implications for the interpretation of language-specific differences of brain activity patterns. A third section considers those studies investigating the neural basis of translation.

Language Production Studies in Bilinguals

Various functional neuroimaging studies investigated the neural correlates of language production in bilinguals (Table 24.1; Chee, Tan, & Thiel, 1999; Illes et al., 1999; Kim, Relkin, Lee, & Hirsch, 1997; Klein, Milner, Zatorre, Meyer, & Evans, 1995; Klein, Zatorre, Milner, Meyer, & Evans, 1994; Perani et al., 2003; Yetkin, Yetkin, Haughton, & Cox, 1996). These studies differ from a methodological point of view because several authors did not formally investigate the level of proficiency in the L2 but divided subjects only on the basis of their age of L2 acquisition (see Table 24.1 for details). A further important variable is that different experimental paradigms and modalities have been used to study language production in bilinguals.

The first two studies that opened this interesting field were carried out by Klein and coworkers (Klein et al., 1994, 1995). PET was used to elucidate whether production in an L2 involved the same neural substrates as production in the L1. The subjects of both studies were 12 Canadian late bilinguals with a high degree of proficiency as established by a screening language examination. In their first study, the authors used a word repetition task for L1 (English) and L2 (French) and reported that the pattern of brain activity was similar across the two languages. In particular, both languages commonly engaged brain activity in overlapping areas of the left frontal lobe. However, when subjects repeated words by using their L2, a selective activation was also found in the left putamen, a subcortical structure belonging to the basal ganglia. The authors suggested that the left putamen may be involved in articulation processes when producing an L2 learned late in life. This hypothesis may be supported by lesion studies of the so-called foreign accent syndrome (Blumstein, Alexander, Ryalls, Katz, & Dworetzky, 1987; Gurd, Bessel, Bladon, & Bamford, 1988), in which monolingual patients acquired a so-called foreign accent when speaking after left subcortical damage. We should, however, underline that, with the exception of the second study of Klein and coworkers, in which the same experimental group of highly proficient bilinguals was used (Klein et al., 1995), none of the successive studies demonstrated the selective activation of the left putamen for L2. Moreover, the kind of task used in Klein et al.'s experiment (Klein et al., 1994) allows us to draw only limited conclusions about the cerebral organization of bilinguals because lexical-semantic access is not necessarily involved during repetition tasks.

In their second experiment, the authors (Klein et al., 1995) used several word generation tasks: rhyme generation, based on phonological cues; synonym generation, requiring semantic search; and translation, requiring lexical access in the other language. Irrespective of task requirements (rhymes or synonyms) and language used, a considerable overlap of activation was observed in frontal areas (left dorsolateral frontal cortex, particularly Brodmann areas [Ba] 9, 45, 46, and 47). Within the activated system, the left inferotemporal regions (Ba 20/37) and the left superior parietal cortex (Ba 7) were always involved irrespective of language and task, the only exception being rhyme generation in L2. Because no evidence of a differential neural substrate subserving language processing was found, the authors concluded that a similar distributed network of brain areas is engaged irrespective of task requirement in language production in highly proficient bilinguals despite the late acquisition of L2 (subjects in this study were late L2 learners).

Contrasting results to the studies of Klein and coworkers were provided by Yetkin and colleagues (1996). In an fMRI experiment based on word generation (phonemic verbal fluency) in multilinguals, larger foci of brain activation were reported for the less-fluent languages. Fluent was defined as speaking the language currently and for at least 5 years, whereas nonfluent was used for languages
fMRI, functional magnetic resonance imaging; L1, L2, L3, first, second, third language, respectively; PET, positron emission tomography.

studied for 2 to 4 years and without regular use in everyday life. The experimental group was composed of heterogeneous subjects, fluent in at least two languages and nonfluent in a third language. The languages ranged from Indo-European (English, German, Russian, Norwegian, French, Spanish) to Ural-Altaic (Turkish, Japanese, and Chinese). Activations were primarily observed in the left prefrontal cortex (Yetkin et al., 1996), irrespective of language used, particularly in the inferior frontal, middle frontal, and precentral gyri. Additional foci of brain activation were reported, such as in the supplementary motor area and parietal lobe, but the precise localizations in terms of stereotactical coordinates of these activations were not specified. It is interesting that in all subjects the extent of focal brain activation was greater for a third language (L3) than for L2 and L1. Whereas
the average activation was less for L1 than for L2, the difference did not reach statistical significance. Unfortunately, these findings are difficult to interpret given the lack of control of important variables such as the age of language acquisition and proficiency, which cannot be equated with language fluency. Classifying a bilingual as fluent only on an empirical basis (subjects in the study were only asked how well they spoke each language) is far from a detailed psycholinguistic language proficiency evaluation. Moreover, the authors labeled English always as L1 despite the fact that the native language was Turkish in Subject 2 and Chinese in Subject 5.
The fMRI study performed by Kim and coworkers (1997) also used a very inhomogeneous group of bilinguals. They studied 12 bilinguals; of these, 6 had been exposed to L1 and L2 during early infancy; 6 began learning L2 after puberty. Again, the volunteers were bilinguals for widely different pairs of languages, ranging from Indo-European to languages from the Far East. In the experiment, they had to describe, using covert language production during fMRI scanning, what they had done at different times of the previous day. The brain activity in the left inferior frontal cortex (i.e., Broca's area) in this study was differentially activated for the two groups: There were overlapping activations for both languages in early learners; there were spatially segregated activations in the case of late learners. On the other hand, the regions activated by L1 and L2 within Wernicke's area overlapped in both groups of subjects, regardless of the age of L2 acquisition.
The authors' (Kim et al., 1997) conclusion was that age of acquisition is a major factor in the cortical organization of L2 processing. However, it must be underlined that the production of extended speech relies heavily on lexical-semantic and conceptual processing; in contrast, most of the linguistic processing limitations observed in bilinguals are related to phonological tasks or to morphosyntactic processing. The subtle differences in activation in Broca's area may reflect these differences at the phonological and syntactic level. A further major problem for the interpretation of this study, as mentioned, is that no formal assessment of language proficiency was conducted. Because there is a general negative correlation between age of acquisition and proficiency (Johnson & Newport, 1989), these two variables were confounded in this experiment.
This issue was addressed by Chee, Tan, and their group (1999) using fMRI. These authors found no difference within the left prefrontal cortex when comparing word generation in early bilinguals and late bilinguals when the degree of language proficiency was kept constant. They compared 15 early bilinguals (L2 acquisition before age 6 years) to 9 late bilinguals (L2 acquisition after age 12 years). All subjects were native speakers of Mandarin, with English as L2, and were studied when producing words cued by a word stem presented visually on a screen. Brain activity was mainly located in the left prefrontal cortex, along the inferior and middle frontal gyri (Ba 44/45 and Ba 9/46). The authors predicted that the processing of Mandarin would require neural resources distinct from English because Mandarin has an ideographic writing system. However, the pattern of brain activation in response to Mandarin words was similar to that observed for English, and this was true for both early and late bilinguals with high proficiency.
The discrepancy between the studies of Kim and colleagues (1997; extended language production) and Chee, Tan, and colleagues (1999; word stem completion) might be related to the subjects' different level of proficiency in each language. As mentioned, in Kim's study the differential activation in Broca's area for the L2 in late bilinguals could have been caused by inferior proficiency in the L2. On the other hand, Chee, Tan, et al.'s subjects came from Singapore, which has a really integrated bilingual society in which bilingual speakers can be expected to be highly proficient in each language. In other words, these studies leave open the possibility that language proficiency, rather than age of acquisition, may be the crucial factor in determining the neural organization of language processing in bilinguals, as highlighted by the study of Perani and coworkers (1998) in language comprehension tasks (discussed separately in this section).
Along similar lines, the fMRI study of Illes and coworkers (1999) also included only subjects with a controlled degree of language proficiency. All were English-Spanish bilinguals recruited from Stanford University (Stanford, CA) and performed two kinds of task: semantic decisions about visually presented words (concrete or abstract) and nonsemantic decisions (upper- or lowercase type). This study confirmed previous findings (Chee, Tan, et al., 1999; Perani et al., 1998): When the degree of proficiency in bilinguals is very high, a common neural network is activated independent of age of acquisition. Indeed, no differences were found when directly comparing both languages. The main
activation foci were found in the left inferior frontal gyrus (Ba 44, 45, 47), with some activation in corresponding areas of the right hemisphere in a few subjects. Interestingly, semantic judgments led to a more extensive pattern of brain activity within those areas than did nonsemantic judgments. Unfortunately, the brain regions scanned in this fMRI experiment were too limited to allow further conclusions. The scanned area extended from the stereotactical coordinates Z 10 to Z 46 and therefore did not include brain regions, such as the middle and inferior temporal gyri, that may be important for semantic judgment tasks (Perani et al., 1999; Price, 1998).
The role of proficiency was addressed also by De Bleser and coworkers (2003). The authors investigated with PET lexical retrieval by means of naming visually presented cognate and noncognate items in L1 (Dutch) and L2 (French) in a group of Belgian late bilinguals with good, but not native-like, proficiency in their L2. Comparisons of cognate naming in L1 and L2 and noncognate naming in L1 showed overlapping brain activation patterns in the left hemisphere. Conversely, naming of noncognates in L2 entailed an additional selective activation of left prefrontal areas along the left inferior frontal gyrus. The authors suggested a relation between activation in left prefrontal areas and effortful lexical retrieval, as may be the case of retrieval of noncognates in an L2 for which subjects have a lower proficiency.
A further factor that may be responsible for differential cerebral organization of languages in bilinguals was investigated by Perani et al. (2003) in a study that attempted to assess the effect of environmental exposure to one language. This was addressed by examining two groups of early bilinguals with a high degree of proficiency divided on the basis of their language dominance, referred to as the language acquired first in life (6 Spanish-born versus 5 Catalan-born individuals). All of these subjects were living in Catalonia (Spain), and Catalan was prevalent in their everyday language exposure, as assessed by detailed psycholinguistic investigations. This study showed first that the language acquired first in life, irrespective of language proficiency and age of L2 acquisition, may be an important factor for differences in the bilingual brain, resulting in some differences in brain activation even in early bilinguals. In particular, the L1 engaged fewer brain areas for the generation of words. One explanation may be that the generation of words in the L1 is a more automatic task and is reflected, at the cerebral level, by the engagement of fewer neural resources. This is in agreement with previous results (Raichle et al., 1994; Thompson-Schill et al., 1997; Thompson-Schill, D'Esposito, & Kan, 1999) in which a less-automatic cognitive task engages more cerebral resources, as is the case for the generation of words in the L2 in bilinguals.
Another finding from the study of Perani et al. (2003) concerned the role of differential exposure to a given language. More extensive brain activation in the left dorsolateral frontal cortex was found for the group of Catalans when generating words in Spanish when compared to the group of Spaniards generating words in Catalan. These findings suggest that an L2 associated with lower environmental exposure is in need of additional neural resources in comparison to L1 (i.e., Spanish language in Catalans). On the other hand, the group of Spaniards, well exposed to Catalan, had a reduced area of brain activation for word generation in L2. The authors hypothesized that the brain activations were related to exposure and practice. The brain might then eventually support the generation of words with less or more recruitment of cerebral structures.
Before attempting to draw some conclusions from the results of production studies, several limitations of the available evidence must be acknowledged. The majority of the production experiments in bilinguals were based on single-word processing, particularly in word generation (fluency) tasks. With the exception of Kim et al.'s (1997) study, all imaging investigations focused on tasks expected to involve only single-word lexical processing with no grammatical processing.
Fluency tasks are associated with the same pattern of brain activation found previously in monolinguals, namely, involvement of the left dorsolateral frontal cortex (Poline, Vandenberghe, Holmes, Friston, & Frackowiak, 1996). The generation of words according to a cue is a complex task that involves multiple cognitive processes, such as lexical search, lexical retrieval, and speech production. Anatomo-functional differences have been reported between fluency tasks, for example, between phonemic verbal fluency and semantic verbal fluency (Mummery et al., 1996; Paulesu et al., 1997). Functional studies of brain representation of different languages should carefully take these cognitive aspects into account.
Considering these limitations, which conclusions may be drawn? Are we now able to answer the initial question whether there are anatomically segregated brain areas subserving two (or multiple) languages in the human brain? And, if so, are there
general rules determining a spatial segregation of language areas for bilinguals?
From the reported results, we may outline the following conclusions: There are no differences in brain activity for very early bilinguals (we might assume that these subjects were highly proficient for both languages) and, similarly, no differences for late bilinguals if they are highly proficient in both languages (Chee, Tan, et al., 1999; Illes et al., 1999; Klein et al., 1995). Contrasting with this assumption is the study of Kim and coworkers (1997) in which spatially separated regions were activated within Broca's area for L1 and L2. However, as this study lacked any information about the degree of proficiency in L2 of the subjects, we do not know whether this differential cerebral organization was a consequence of the age of L2 acquisition or rather of reduced proficiency. This critique is also applicable to Yetkin et al.'s study (1996), even if they provided evidence that when a language is spoken less fluently, a larger cerebral activation can be observed in comparison with more fluent languages. We do not know, however, if this result must be ascribed to high/low proficiency or high/low exposure. In terms of proficiency, as addressed with psycholinguistic testing, a recent fMRI study carried out in a group of quadrilinguals (Briellmann et al., 2004) provided further evidence that larger foci of brain activity, mostly within the left prefrontal cortex, are related to the less-proficient languages (such as L3 or L4).
Overall, these findings appear to indicate that attained proficiency might be more important than age of acquisition as a determinant of the cerebral representation of languages in bilinguals/polyglots. Moreover, the results of the study by Perani et al. (2003) underline that differences in environmental exposure to a language may also account for functional modulation in the cerebral representation of languages, even when age of L2 acquisition and proficiency are kept constant.
Further investigations, taking into appropriate consideration at least these three important linguistic criteria (age of L2 acquisition, degree of language proficiency, and preferential exposure to a language), are necessary to draw stronger conclusions.

Language Comprehension Studies in Bilinguals

Many studies have investigated the brain correlates of language comprehension in bilinguals. These studies are listed in Table 24.2. Perani and colleagues carried out several PET studies in which they investigated the receptive sentence processing of late low-proficiency bilinguals (Perani et al., 1996), early high-proficiency bilinguals, and late high-proficiency bilinguals (Perani et al., 1998). In the first, Perani and coworkers (1996) studied with PET nine late acquisition bilinguals (Italian-English) who had low proficiency in their L2, English, which they had studied at school for at least 5 years. None of the subjects had spent more than 1 month in an English-speaking environment, and they therefore mastered L2 poorly. Partially different cerebral substrates were active for the L1 and L2 when compared to the baseline condition (attentive rest condition). Areas activated by the L1 comprised left perisylvian areas, including the angular gyrus (Ba 39), the superior and middle temporal gyri (Ba 21 and 22), the inferior frontal gyrus (Ba 45), and the temporal pole (Ba 38). Several homologous areas (Ba 21, 22, and 38) were also activated in the right hemisphere. In contrast, the set of active language areas was considerably reduced when applying the same analysis to the L2. Specifically, only the left and right superior and middle temporal areas remained active.
One of the crucial areas of differential activation was, rather unexpectedly, the temporal pole. Activation of this region has been seldom reported in the functional imaging studies on language and memory. However, some studies have shown that the anterior part of the temporal lobe is activated by tasks requiring listening or reading sentences or a continuous text (Bottini et al., 1994; Fletcher et al., 1995; Mazoyer et al., 1993; Perani et al., 1996) rather than unconnected verbal material. Perani and coworkers (1996) suggested that these regions might be involved in processes associated with the sentence or even the discourse level, such as integration with prior knowledge, inference, and anaphoric reference. In addition, the temporal poles might be recruited on the basis of increasing memory demands when the subjects are engaged in the natural task of listening to some simple narrative.
In the second experiment (Perani et al., 1998), the authors tested Italian native speakers who learned English after age 10 years, who had spent 1 to 6 years in an English-speaking country, and who currently used English in their daily activities. These late bilinguals were scanned during experimental conditions such as listening to Italian, English, or Japanese stories (unknown to all subjects) or attentive silence. In the second part of the same experiment, the authors examined early acquisition and high-proficiency Spanish and Catalan bilinguals. These early bilinguals were scanned while
Perani et al., 1996 | Passive listening to stories in L1, L2, and a third unknown language as studied by PET | Homogeneous group of 9 low-proficiency late bilinguals | Greater activations when processing the native language in comparison to L2
Dehaene et al., 1997 | fMRI single-subject study of listening to stories in L1 and L2 | Homogeneous group of 8 low-proficiency late bilinguals | Differential brain activation for late L2 learners (including the right hemisphere)
Perani et al., 1998 | Two PET studies of two groups of subjects listening to stories in L1 and L2 | Two homogeneous groups of bilinguals: 9 high-proficiency but late bilinguals and 12 high-proficiency but early bilinguals | Overlapping patterns of brain activity in all high-proficiency bilinguals, underlining the crucial role of proficiency
Chee, Caplan, et al., 1999 | fMRI investigation of visually presented sentence comprehension in L1 and L2 | Homogeneous group of 14 early bilinguals | Common patterns of brain activity for L1 and L2
Price et al., 1999 | Single-word comprehension in L1 and L2 studied by PET | Homogeneous group formed by 6 late bilinguals | Greater activity in the left temporal lobe for L1
Chee et al., 2001 | fMRI scanning while bilinguals perform semantic judgments | Two homogeneous groups of low-proficiency and high-proficiency bilinguals | Reduced brain activity in left prefrontal and parietal regions when subjects were highly proficient
Wartenburger et al., 2003 | fMRI investigation of grammatical and semantic judgment in bilinguals | Three controlled groups of bilinguals divided on the basis of age of L2 acquisition and proficiency | Age of acquisition dependency of grammar and proficiency dependency of semantic judgments
fMRI, functional magnetic resonance imaging; L1 and L2 indicate first and second language, respectively; PET, positron emission tomography.
listening to Spanish and Catalan stories. In both groups of bilinguals (early and late high-proficiency bilinguals), L1 and L2 yielded highly similar cerebral activation patterns. In fact, both groups showed brain activity located mainly in the left superior and middle temporal gyrus and in the left temporal pole.
The overlapping pattern of activation for L1 and L2 in Perani et al.'s 1998 study contrasted with the considerable differences in L1-L2 activations found in low-proficiency speakers (Perani et al., 1996). The combined results of these studies provided the first in vivo evidence for a different functional representation of L1 and L2 in comprehension when a crucial variable such as language proficiency is taken into account.
Dehaene et al. (1997) performed a similar experiment using fMRI in a comparable group of experimental subjects (eight late bilinguals, with French the L1 and English the L2) scanned while listening to short stories alternately in French and English. Listening to the stories in L1 engaged a set of left-sided brain areas, with additional similar, although much weaker, activation in the right hemisphere, whereas this pattern radically changed when subjects processed their L2. It is noteworthy that a single-subject analysis showed a quite disparate pattern of brain activity for L2, indicating large intersubject variability. Indeed, listening to L2 engaged a highly variable network of left and right temporal and frontal areas among the subjects, in some individuals restricted only to the right hemisphere. On the basis of these results, the authors confirmed that, although the processing of the L1 essentially relies on a dedicated left hemispheric cerebral network, the processing of an L2
acquired late in life and mastered with reduced proficiency may be differentially organized.
This series of experiments (Dehaene et al., 1997; Perani et al., 1996, 1998) provided evidence of functional modulation in the network that mediates language comprehension in the bilingual brain. The main result is that, although listening to stories in L1 and in L2 yields very different patterns of cortical activity in low-proficiency subjects, no major differences are present in highly proficient subjects, even with later L2 acquisition. The languages spoken by the low- and high-proficiency volunteers were identical and so was the procedure. Hence, we must conclude that the degree of mastery of L2 is responsible for the observed differences between the groups: Auditory language comprehension in proficient bilinguals who have learned L2 after the age of 10 years relies on a macroscopic network of areas that is similar for L1 and L2 groups.
It is noteworthy that these results were also confirmed by two further studies (Chee, Caplan, et al., 1999b; Price, Green, & von Studnitz, 1999). In the first, fMRI was used to investigate a very homogeneous group of 14 early bilinguals, using two orthographically and phonologically distant languages (English and Mandarin), while they evaluated sentence meaning. A comparable set of brain areas was activated for L1 and L2, among them the left inferior and middle frontal gyri, the left superior and middle temporal gyri, the left temporal pole, the anterior supplementary motor area, and, bilaterally, superior parietal regions and occipital regions. Thus, also with these two orthographically and phonologically distant languages, a strikingly overlapping brain activity pattern was present for both languages, as indicated by the direct contrasts (English vs. Mandarin and vice versa) that yielded no significant differences.
The study of Price et al. (1999), in which six late bilinguals were investigated using PET, provided results at the single-word level. The language areas in the left temporal lobe were more activated when processing the L1 compared to a less-known language. Indeed, comprehension of words in L1 yielded greater activation in the temporal pole than comprehension of the words in L2. This is in agreement with Perani et al.'s (1996) results in late bilinguals with a low degree of proficiency.
Chee, Hon, Lee, and Soon (2001) used a different task (semantic judgment) to evaluate with fMRI the effect of the language proficiency on cerebral language representation in bilinguals. Two different groups of Mandarin-English bilinguals who differed in language proficiency were studied. Higher language proficiency was associated with smaller activation foci within the left prefrontal and parietal areas, whereas lower proficiency was associated with a more extended network of activations, including foci in the right hemisphere. The results are apparently different from those of Perani et al. (1996, 1998). However, the nature of the task should be considered: passive listening to stories in the studies of Perani and coworkers and active judging of the semantic contents in the study of Chee and coworkers. The latter seems to require the engagement of additional neural resources when the language is mastered poorly.
In conclusion, in early bilinguals who received equal practice with their two languages from birth a single and common language system appears to be responsible for the processing of both languages (Chee, Caplan, et al., 1999; Perani et al., 1998). This system extends along a left-sided network comprising all the classical language areas. In the temporal lobe, these include the superior and middle temporal gyri, the angular gyrus, and the temporal pole, a structure that seems specifically engaged by sentence- and discourse-level processing. In the case of late bilinguals, the degree of language proficiency seems to be a critical factor in shaping the functional brain organization of languages because high-proficiency late bilinguals activated strikingly similar left hemispheric areas for L1 and L2 (Chee, Caplan, et al., 1999; Perani et al., 1998), whereas less-proficient subjects had different patterns of activation for their two languages (Chee et al., 2001; Dehaene et al., 1997; Perani et al., 1996; Price et al., 1999). In the case of comprehension of extended text (but not semantic judgment), the activation was more limited in the case of L2. This may reflect a less-consistent pattern of activation (as suggested by the results of Dehaene's study) or more limited processing, focusing on a superficial analysis of the less-proficient language. Also, in the case of comprehension, increasing language proficiency appears to be a crucial factor for language representation in bilinguals.
We should at this point underline that the paradigms employed so far with functional imaging in language studies do not allow a clear differentiation of the various language components (semantic, morphological, and syntactic) as traditionally defined within linguistic theory. For instance, there is an ongoing discussion whether there is a critical period in L2 acquisition (Johnson & Newport, 1989) and whether this period concerns only the
phonological and morphosyntactic domains of language processing. Using ERPs, Weber-Fox and Neville (1996) found that different aspects of language (i.e., semantics and syntax) are differentially affected by the age of L2 acquisition.
To address this issue, an fMRI study investigated the neural correlates of grammatical and semantic judgments in three groups of Italian-German bilinguals. The subjects acquired the L1 and L2 from birth (first group) or after the age of 6 years, but with different proficiency levels (second and third group) (Wartenburger, Heekeren, Abutalebi, Cappa, Villringer, & Perani, 2003). This study demonstrated that age of acquisition specifically affects the cortical representation of grammatical processes. Only in the case of the L2 acquired very early in life do overlapping neural substrates for L1 and L2 grammar result. In addition, in late bilinguals proficiency is the main determinant of the cerebral organization of both grammar and semantics. This is illustrated in Fig. 24.1, which reports the brain activity patterns in two of the three groups of late bilinguals with different levels of proficiency of Wartenburger et al.'s study during semantic judgment.
These findings are in agreement with the existence of a critical period for language acquisition
Figure 24.1 Brain templates of the cerebral activation patterns of subjects judging the semantic content
of sentences in their second language (L2) compared to that in their first language (L1). Two groups of
Italian-German bilinguals are displayed: late acquisition and high proficiency (LAHP, top row), and late
acquisition and low proficiency (LALP, bottom row). Brain activity patterns are displayed on the lateral
surfaces of both hemispheres. The activation patterns of the LAHP group entailed a similar neural sys-
tem and did not differ essentially in extension to that of an early acquisition and high proficiency Italian-
German group (data not shown). On the other hand, as shown in the figure, the group of LALP bilin-
guals significantly engaged more extended brain areas. These results underline the crucial role of
proficiency in the cerebral organization of the bilingual brain.
and suggest that grammatical processing, given its dependence on age of acquisition, is based on competence which should be neurologically wired-in (see Fig. 24.2).

Neuroimaging Studies of Translation and the Language Selection Mechanism

Three studies (Table 24.3) addressed the neural basis of the translation and language selection mechanisms with functional neuroimaging (Hernandez, Dapretto, Mazziotta, & Bookheimer, 2001; Price et al., 1999; Rodriguez-Fornells, Rotte, Heinze, Nosselt, & Munte, 2002).
Price and coworkers (1999) studied six subjects whose L1 was German and who became fluent in their L2 (English) late, after infancy. Subjects were studied with PET while they read or translated written words, one at a time, from L1 to L2 and vice versa. In distinct blocks, the words were presented only in German, only in English, or in alternation between the two languages. Notably, the regions most active during translation were located outside the classical language areas. Translating, when compared to reading, activated mainly the anterior cingulate and bilateral subcortical
Figure 24.2 Brain activity patterns of bilinguals judging the grammatical content of sentences in their sec-
ond language as compared to that in their first language. Three groups of Italian-German bilinguals are
displayed: early bilinguals; late but high-proficiency bilinguals; late but low-proficiency bilinguals. Brain
activity patterns are displayed on the lateral surfaces of the left hemisphere. Although early bilinguals en-
gaged for both languages the same identical neural structures, this did not apply for late bilinguals. Both
groups of late bilinguals engaged more extended brain areas for grammatical processing in their second
language (L2). These results underline the age of acquisition effect on the neural underpinnings of gram-
matical processing.
Table 24.3 Neuroimaging Studies Investigating Language Translation and Language Selection Mechanism
Price et al., 1999 | PET investigation of written word translation from L1 to L2 and vice versa | Homogeneous group of six late bilinguals | Activation of the anterior cingulate and bilateral subcortical structures while translating
Hernandez et al., 2001 | Language naming and switching investigated by fMRI | Homogeneous group of six early bilinguals more fluent in English than in Spanish | Overlapping brain areas when naming in either L1 or L2, increasing brain activity in the left frontal lobe when switching
Rodriguez-Fornells et al., 2002 | fMRI study of language selection between Spanish-Catalan visually presented words | Homogeneous group of seven high-proficiency early bilinguals compared to a group of seven monolinguals | Bilinguals, in comparison to monolinguals, showed a selective activation of a prefrontal area that may be implicated in inhibiting the nontarget language
fMRI, functional magnetic resonance imaging; L1 and L2 indicate first and second language, respectively; PET, positron emission tomography.
structures (the putamen and the head of the caudate nucleus). Price and colleagues attributed this to the need for greater coordination of mental operations for translation, during which the direct cerebral pathways for naming words must be inhibited in favor of less-automated circuits.
This hypothesis was also raised in neuropsychological lesion studies in bilinguals, indicating that damage to subcortical structures may interfere with the complex mechanism implicated in the selection of languages. Aglioti and coworkers described the case of a bilingual suffering left subcortical damage (capsulo-putaminal lesion) that inhibited language changes when speaking (Aglioti, Beltramello, Girardi, & Fabbro, 1996; Aglioti & Fabbro, 1993). Abutalebi, Miozzo, and Cappa (2000) also reported the case of a polyglot who was no longer able to speak in one language, showing pathological language mixing caused by a lesion located in the head of the caudate nucleus in the left hemisphere. It has thus been theorized that the bilingual's/polyglot's lexical representations may be selectively accessed under the control of neural routes involving a cortical-subcortical circuit in which the left basal ganglia may represent the supervisor of language output in bilinguals. A further interesting finding of Price et al.'s study (1999) was the activation of Broca's area and supramarginal gyrus during language switching. It is noteworthy that Poetzl (1925, 1930) and Leischner (1943) had suggested, on the basis of defective switching performance by patients with supramarginal lesions, a central role for this region in language switching.
The second study (Hernandez et al., 2001) was carried out with Spanish-English early and supposedly high-proficiency bilinguals who were more fluent in English as formally tested. To examine the neural correlates of language switching, subjects named objects in one language or switched between languages. When comparing the pattern of brain activity for each language, overlapping patterns resulted in the left dorsolateral prefrontal cortex (Ba 46 and 6) and Broca's area (Ba 44 and 45). It is noteworthy that the authors reported increasing activity in the dorsolateral prefrontal cortex for the switching condition relative to the nonswitching conditions, suggesting that the left dorsolateral prefrontal cortex is implicated in the mechanism of language switching and language selection. Unfortunately, the whole switching condition was pooled together so that we do not know whether there are differences when switching from L1 to L2 or rather from L2 to L1.
The intriguing issue of how bilinguals select languages was further addressed by the study of Rodriguez-Fornells et al. (2002). The main aim of their study was to elucidate how bilinguals inhibit the nontarget language (Catalan in the study) during lexical access of visually presented words in the target language (Spanish in the study). This was addressed by studying with ERPs and fMRI a group
of early bilinguals (Catalan-Spanish) reporting a high degree of proficiency in both languages. The results were compared with those of a group of Spanish monolinguals selecting visually presented real Spanish words intermixed with pseudowords. Interestingly, a selective activation of a left anterior prefrontal region (Ba 45 and 9) was reported only in the group of bilinguals, and the authors related it to the inhibition of the nontarget language.
A further intriguing finding of the study by Rodriguez-Fornells et al. (2002) was that ERPs showed a typical sensitivity to word frequency only for words in the target language and not for words in the nontarget language. It may be hypothesized that words from the nontarget language are not accessed through a direct lexical route but rather are discarded through a sublexical route.
In conclusion, these three neuroimaging investigations underline the role of left subcortical and dorsolateral prefrontal brain regions in the mechanism of language selection, supporting neuropsychological lesion findings in bilinguals (Abutalebi et al., 2000; Aglioti & Fabbro, 1993; Aglioti et al., 1996).
Conclusions
A number of important functional neuroimaging studies addressing the cerebral representation of bilingualism have been performed. In the present chapter, we reviewed these studies, emphasizing how several factors shown to be crucial in psycholinguistics may affect the neural basis of the bilingual language system. These factors are mainly the age of L2 acquisition, the degree of proficiency in each language, and the degree of usage of or exposure to each language. The available evidence suggests that proficiency is the most relevant factor. Between language production tasks in general and tasks of language comprehension, there are differences that appear to run in opposite directions: more extensive cerebral activations are associated with production in the less-proficient language and smaller activations with comprehension of the less-proficient language. It may be speculated that this puzzling result reflects the inherent differences between these aspects of linguistic processing. In the case of effortful tasks such as word generation, this difference may be attributed to the recruitment of additional resources.
On the other hand, in the case of sentence comprehension, the automatic nature of the processing may be reflected in a more limited elaboration of the linguistic material in the less-proficient language. Another possibility, suggested by the single-subject study of Dehaene et al. (1997), is the large intersubject variability in the activation pattern for comprehension of L2. It must be underlined that the neuroimaging data do not question the claim that age of acquisition is a major determinant of proficiency in L2. Many linguistic and neurophysiological studies have found that late learners are typically less proficient than early learners (Flege, Munro, & MacKay, 1995; Johnson & Newport, 1989; Weber-Fox & Neville, 1996). The role of age of acquisition seems to have crucial implications for particular domains of language, such as grammar, as shown by the study of Wartenburger and coworkers (2003).
The specific role of practice and exposure, in terms of frequency of usage, has to be investigated further and should not be confounded with proficiency (in terms of absolute level of fluency). The finding that language exposure may be an additional crucial factor for the neural representation of multiple languages (Perani et al., 2003) may provide important input both to educational fields, as in the case of L2 learning, and to language rehabilitation in bilingual aphasia.
In our opinion, the most important contribution of imaging studies of bilingualism to our understanding of language representation in the brain is the observation of aspects of invariance and plasticity. We can conclude from the available evidence that the patterns of brain activation associated with tasks that engage specific aspects of linguistic processing are remarkably consistent across different languages and different speakers. These relatively fixed patterns, however, are clearly modulated by a number of factors analytically addressed in this review. Proficiency, age of acquisition, and exposure can affect brain activity, interacting in a complex way with the levels of language representation and the modalities of language performance. Future studies are expected to disentangle the specificity and selectivity of these interactions. In general, the imaging study of multilingual subjects appears to be a promising model for the study of the interactions between a prewired neurobiological substrate and environmental, time-locked influences.
References
Abutalebi, J., Cappa, S. F., & Perani, D. (2001). The bilingual brain as revealed by functional neuroimaging. Bilingualism: Language and Cognition, 4, 179–190.
Abutalebi, J., Miozzo, A., & Cappa, S. F. (2000). Do subcortical structures control language selection in bilinguals? Evidence from pathological language mixing. Neurocase, 6, 101–106.
Aglioti, S., Beltramello, A., Girardi, F., & Fabbro, F. (1996). Neurolinguistic and follow-up study of an unusual pattern of recovery from bilingual subcortical aphasia. Brain, 119, 1551–1564.
Aglioti, S., & Fabbro, F. (1993). Paradoxical selective recovery in a bilingual aphasic following subcortical lesion. Neuroreport, 4, 1359–1362.
Albert, M. L., & Obler, L. K. (1978). The bilingual brain. New York: Academic Press.
Birdsong, D. (1999). Second language acquisition and the critical period hypothesis. Mahwah, NJ: Erlbaum.
Blumstein, S. E., Alexander, M. P., Ryalls, J. H., Katz, W., & Dworetzky, B. (1987). On the nature of foreign accent syndrome: A case study. Brain and Language, 31, 215–244.
Bottini, G., Corcoran, R., Sterzi, R., Paulesu, E., Schenone, P., Scarpa, P., et al. (1994). The role of the right hemisphere in the interpretation of figurative aspects of language. A positron emission tomography activation study. Brain, 117, 1231–1253.
Briellmann, R. S., Saling, M. M., Connell, A. B., Waites, A. B., Abbott, D. F., & Jackson, G. D. (2004). A high-field functional MRI study of quadrilingual subjects. Brain and Language, 89, 531–542.
Broca, P. (1861). Perte de la parole, ramollissement chronique et destruction partielle du lobe antérieur gauche du cerveau. Bulletin de la Société d'Anthropologie, 11, 235–237.
Broca, P. (1865). Sur le siège de la faculté du langage articulé. Bulletin de la Société d'Anthropologie, 6, 337–393.
Calvin, W. H., & Ojemann, G. A. (1994). Conversation with Neil's brain. New York: Addison-Wesley.
Chee, M. W. L., Caplan, D., Soon, C. S., Sriram, N., Tan, E. W. L., Thiel, T., et al. (1999). Processing of visually presented sentences in Mandarin and English studied with fMRI. Neuron, 23, 127–137.
Chee, M. W. L., Hon, N., Lee, H. L., & Soon, C. S. (2001). Relative language proficiency modulates BOLD signal change when bilinguals perform semantic judgments. Neuroimage, 13, 1155–1163.
Chee, M. W. L., Tan, E. W. L., & Thiel, T. (1999). Mandarin and English single word processing studied with functional magnetic resonance imaging. Journal of Neuroscience, 19, 3050–3056.
De Bleser, R., Dupont, P., Postler, J., Bormans, G., Speelman, D., Mortelmans, L., et al. (2003). The organisation of the bilingual lexicon: A PET study. Journal of Neurolinguistics, 16, 439–456.
De Groot, A. M. B., & Kroll, J. F. (Eds.). (1997). Tutorials in bilingualism: Psycholinguistic perspectives. Mahwah, NJ: Erlbaum.
Dehaene, S. D., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., Perani, D., et al. (1997). Anatomical variability in the cortical representation of first and second languages. Neuroreport, 8, 3809–3815.
Dufour, R., & Kroll, J. F. (1995). Matching words to concepts in two languages: A test of the concept mediation model of bilingual representation. Memory & Cognition, 23, 166–180.
Fabbro, F. (1999). The neurolinguistics of bilingualism. An introduction. Hove, U.K.: Psychology Press.
Flege, J. E., Munro, M. J., & MacKay, I. R. A. (1995). Effects of age of second-language learning on the production of English consonants. Speech Communication, 16, 1–26.
Fletcher, P. C., Happé, F., Frith, U., Baker, S. C., Dolan, R. J., Frackowiak, R. S. J., et al. (1995). Other minds in the brain: A functional imaging study of theory of mind in story comprehension. Cognition, 57, 109–128.
Gurd, J. M., Bessel, N. J., Bladon, R. A. W., & Bamford, J. M. (1988). Neuropsychologia, 26, 237–251.
Hernandez, A. E., Dapretto, M., Mazziotta, J., & Bookheimer, S. (2001). Language switching and language representation in Spanish-English bilinguals: An fMRI study. Neuroimage, 14, 510–520.
Illes, J., Francis, W. S., Desmond, J. E., Gabrieli, J. D. E., Glover, G. H., Poldrack, R., et al. (1999). Convergent cortical representation of semantic processing in bilinguals. Brain and Language, 70, 347–363.
Indefrey, P., & Levelt, W. J. M. (2000). The neural correlates of language production. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences. Cambridge, MA: MIT Press.
Johnson, J., & Newport, E. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–99.
Kim, K. H. S., Relkin, N. R., Lee, K. M., & Hirsch, J. (1997). Distinct cortical areas associated with native and second languages. Nature, 388, 171–174.
Klein, D., Milner, B., Zatorre, R., Meyer, E., & Evans, A. (1995). The neural substrates underlying word generation: A bilingual functional-imaging study. Proceedings of the National Academy of Sciences U.S.A., 92, 2899–2903.
Klein, D., Zatorre, R., Milner, B., Meyer, E., & Evans, A. (1994). Left putaminal activation when speaking a second language: Evidence from PET. Neuroreport, 5, 2295–2297.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Leischner, A. (1943). Die Aphasien der Taubstummen. Ein Beitrag zur Lehre der Asymbolie. Archiv fuer Psychiatrie und Nervenkrankheiten, 115, 469–548.
Leischner, A. (1987). Aphasien und Sprachentwicklungsstoerungen. Stuttgart, Germany: Thieme Verlag.
Lenneberg, E. H. (1967). Biological foundations of language. New York: Wiley.
Martin, A., Wiggs, C. L., Ungerleider, L. G., & Haxby, E. (1996). Neural correlates of category-specific knowledge. Nature, 379, 649–652.
Mazoyer, B. M., Tzourio, N., Frank, V., Syrota, A., Murayama, N., Levrier, O., et al. (1993). The cortical representation of speech. Journal of Cognitive Neuroscience, 5, 467–479.
Moro, A., Tettamanti, M., Perani, D., Donati, C., Cappa, S. F., & Fazio, F. (2001). Syntax and the brain: Disentangling grammar by selective anomalies. Neuroimage, 13, 110–118.
Mummery, C. J., Patterson, K., Hodges, J. R., & Wise, R. J. S. (1996). Generating a tiger as an animal name or a word beginning with T: Differences in brain activations. Proceedings of the Royal Society London B., 263, 989–995.
Neville, H. J., & Bavelier, D. (1998). Neural organization and plasticity of language. Current Opinion in Neurobiology, 8, 254–258.
Ogawa, S., Lee, T. M., Kay, A. R., & Tank, D. W. (1990). Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proceedings of the National Academy of Sciences U.S.A., 87, 9868–9872.
Ojemann, G. A., & Whitaker, H. A. (1978). The bilingual brain. Archives of Neurology, 35, 409–412.
Paradis, M. (1983). Readings on aphasia in bilinguals and polyglots. Montreal, Canada: Didier.
Paradis, M. (1998). Language and communication in multilinguals. In B. Stemmer & H. Whitaker (Eds.), Handbook of neurolinguistics (pp. 417–430). San Diego, CA: Academic Press.
Paulesu, E., Goldacre, B., Scifo, P., Cappa, S. F., Gilardi, M. C., Castiglioni, I., et al. (1997). Functional heterogeneity of left inferior frontal cortex as revealed by fMRI. NeuroReport, 8, 2011–2016.
Penfield, W. (1965). Conditioning the uncommitted cortex for language learning. Brain, 88, 787–798.
Perani, D., Abutalebi, J., Paulesu, E., Brambati, S., Scifo, P., Cappa, S. F., et al. (2003). The role of age of acquisition and language usage in early, high-proficient bilinguals: An fMRI study during verbal fluency. Human Brain Mapping, 19, 179–182.
Perani, D., & Cappa, S. F. (1998). Neuroimaging methods in neuropsychology. In G. Denes & L. Pizzamiglio (Eds.), Handbook of clinical and experimental neuropsychology (pp. 69–94). London: Psychology Press.
Perani, D., Cappa, S. F., Schnur, T., Tettamanti, M., Collina, S., Rosa, M. M., et al. (1999). The neural correlates of verb and noun processing: A PET study. Brain, 122, 2337–2344.
Perani, D., Dehaene, S., Grassi, F., Cohen, L., Cappa, S. F., Dupoux, E., et al. (1996). Brain processing of native and foreign languages. NeuroReport, 7, 2439–2444.
Perani, D., Paulesu, E., Sebastián-Gallés, N., Dupoux, E., Dehaene, S., Bettinardi, V., et al. (1998). The bilingual brain: Proficiency and age of acquisition of the second language. Brain, 121, 1841–1852.
Petersen, S. E., Van Mier, H., Fiez, J. A., & Raichle, M. E. (1998). The effects of practice on the functional anatomy of task performance. Proceedings of the National Academy of Sciences U.S.A., 95, 853–860.
Pitres, A. (1895). Étude sur l'aphasie chez les polyglottes. Revue de médecine, 15, 873–899.
Poetzl, O. (1925). Ueber die parietal bedingte Aphasie und ihren Einfluss auf das Sprechen mehrerer Sprachen. Zeitschrift fuer die gesamte Neurologie und Psychiatrie, 99, 100–124.
Poetzl, O. (1930). Aphasie und Mehrsprachigkeit. Zeitschrift fuer die gesamte Neurologie und Psychiatrie, 124, 145–162.
Poline, J. B., Vandenberghe, R., Holmes, A. P., Friston, K. J., & Frackowiak, R. S. J. (1996). Reproducibility of PET activation studies: Lessons from a multi centre European experiment. Neuroimage, 4, 34–54.
Price, C. J. (1998). The functional anatomy of word comprehension and production. Trends in Cognitive Sciences, 2, 281–288.
Price, C. J., Green, D., & von Studnitz, R. (1999). A functional imaging study of translation and language switching. Brain, 122, 2221–2236.
Raichle, M. E., Fiez, J. A., Videen, T. O., MacLeod, A. M., Pardo, J. V., Fox, P. T., et al. (1994). Practice related changes in human brain functional anatomy during nonmotor learning. Cerebral Cortex, 4, 8–26.
Rodriguez-Fornells, A., Rotte, M., Heinze, H. J., Nosselt, T., & Munte, T. F. (2002). Brain potential and functional MRI evidence for how to handle two languages with one brain. Nature, 415, 1026–1029.
Scoresby-Jackson, R. (1867). Case of aphasia with right hemiplegia. Edinburgh Medical Journal, 12, 696–706.
Segalowitz, S. J. (1983). Two sides of the brain. Englewood Cliffs, NJ: Prentice Hall.
Smith, E. E., Jonides, J., & Koeppe, R. A. (1996). Dissociating verbal and spatial working memory using PET. Cerebral Cortex, 6, 11–20.
Thompson-Schill, S. L., D'Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences U.S.A., 94, 14,792–14,797.
Thompson-Schill, S. L., D'Esposito, M., & Kan, I. P. (1999). Effects of repetition and competition on activity in left prefrontal cortex during word generation. Neuron, 23, 513–522.
Wartenburger, I., Heekeren, H. R., Abutalebi, J., Cappa, S. F., Villringer, A., & Perani, D. (2003). Early setting of grammatical processing in the bilingual brain. Neuron, 37, 159–170.
Weber-Fox, C. M., & Neville, H. J. (1996). Maturational constraints on functional specialization for language processing: ERP and behavioral evidence in bilingual speakers. Journal of Cognitive Neuroscience, 8, 231–256.
Wernicke, C. (1874). Der aphasische Symptomenkomplex. Breslau: Cohn & Weigert.
Yetkin, O., Yetkin, F. Z., Haughton, V. M., & Cox, R. W. (1996). Use of functional MR to map language in multilingual volunteers. American Journal of Neuroradiology, 17, 473–477.
David W. Green
25
The Neurocognition of Recovery Patterns
in Bilingual Aphasics
ABSTRACT We have yet to explain the variety of recovery patterns in bilingual aphasics
despite the practical and theoretical importance of doing so. I consider the reasons for
this state of affairs and identify what is needed to achieve a causal understanding.
Theoretically, I distinguish the issue of the representation of a linguistic system from its
control and explore the neuroanatomic bases of representation and control.
Methodologically, I argue for the importance of neuroimaging studies (positron emission
tomography, functional magnetic resonance imaging) to complement psycholinguistic
and neuropsychological data and propose a number of psycholinguistic and
neuroimaging studies aimed at clarifying the causal basis of recovery patterns. A final section
briefly considers the implications for rehabilitation.
these patterns and then consider their incidence and practical impact.
Paradis (1977) identified six basic recovery patterns. Languages can be affected equally, differentially, or selectively. Parallel recovery occurs when both languages are impaired and restored at the same rate; differential recovery occurs when languages recover differentially relative to their premorbid levels; selective recovery occurs when at least one language is not recovered at all. In blended recovery, patients mix their languages inappropriately.
There are different trajectories to a particular end state. Two or more languages may eventually recover, but the second language may only begin to recover when the first has (fully) recovered. This is termed successive recovery. A special case of selective recovery, antagonistic recovery, occurs when, as one language recovers, a second language becomes impaired.
Two further patterns, alternating antagonism and selective aphasia, may be considered variants of antagonistic and selective recovery, respectively (Paradis, 2001). In alternating antagonism, patients can access only one of their languages in spontaneous speech for alternating periods of time (Nilipour & Ashayeri, 1989; Paradis, Goldblum, & Abidi, 1982). In the case of selective aphasia (Paradis & Goldblum, 1989), in contrast to selective recovery, there are aphasic problems in one language with no obvious deficits in the other.
These basic patterns do not exhaust the set of possibilities. A language may be recovered in an antagonistic fashion, and a third never recover at all. Or, in the case of alternating antagonism, there may be a temporary inability to translate into the language that the patient can use spontaneously (Patient A. D., Paradis et al., 1982). There are also other rare, but important, cases involving a selective deficit, such as the loss of the ability to avoid switching between languages (Patient S. J., Fabbro, Skrap, & Aglioti, 2000; see Ansaldo & Joanette, 2002, for a review and interpretation of reported cases of pathological language switching and language mixing).
Incidence and Impact
The true incidence of the basic patterns of recovery is unknown. Fabbro (1999; see Paradis, 1977) estimated, on the basis of published cases, that the typical pattern of recovery is one in which both languages recover in parallel (40% of cases). Better recovery of the mother tongue (L1) occurs in 32% of the cases, and better recovery of the L2 occurs in 28% of the cases. In a review of cases reported between 1990 and 2000, which did include reports of unselected cases, Paradis (2001) found that the majority exhibited parallel recovery (81/132, 61%), 24 (18%) showed differential recovery, 12 (9%) showed a blended recovery pattern, 9 (7%) had selective recovery, and 6 (5%) had successive recovery, but he rightly cautioned against inferring population values from these figures.
The communicative impact of these different patterns of recovery varies. Parallel recovery allows an individual eventually to achieve premorbid levels of communication with family, peers, and the wider public. In contrast, other patterns, such as antagonistic or selective recovery, can create severe communication problems. A person may be unable to communicate linguistically with their immediate family and friends or be unable to work.
Why Don't We Have a Causal Account of Recovery Patterns?
Paradis (2001) argued that we have the theoretical tools to account for the patterns but do not understand what determines a particular type of recovery or, in the case of nonparallel recovery, what determines the language that is preferentially recovered. The argument in this section is that two key ingredients are lacking to answer these questions: critical data and explicit neurocognitive accounts that specify the relevant causal parameters. This argument is pursued by considering three aspects of the problem: the nature of language, individual differences in recovery processes, and the lesion deficit methodology.
Abstract Characterization of Language. Languages differ, and myriad language combinations are possible. Languages must be considered at a suitable level of abstractness to find unity among the diversity of outcomes. At an abstract level, there are four linguistic means for communicating experience (see, e.g., Tomasello, 1995): individual symbols (lexical items), markers on symbols (grammatical morphology), ordering patterns of symbols (word order), and prosodic variations of speech (e.g., stress, intonation, timing).
Languages differ in the weight they attach to these different means. In some languages, word order is basically free, and information on who did what to whom is conveyed by word endings or by prosody in tone languages. By contrast, in
English, such information is conveyed by word order, and this is relatively rigid. These different linguistic means require different processes. A given lesion can therefore give rise to different outcomes in different languages because one process is relatively more important in one language than another, so there can be more opportunities for errors of a certain type to reveal themselves in one language than another (e.g., Paradis, 2001). Damage to a device implementing that process will exert greater effect in one language than another and so underlie differential recovery in one instance and selective recovery in another.
Individual Differences in Recovery. Recent years have seen a number of developments that have improved the quality and validity of data. Standardized instruments for assessing language performance in different languages (e.g., the Bilingual Aphasia Test; see Paradis, 2001) are vital to establishing valid data sets. Further, records of unselected cases of bilingual aphasia (see Paradis, 2001, p. 71) help overcome any bias in published case reports toward the unusual. These developments are welcome, but they do not go far enough. Individuals differ in their ability to recover from damage.
A lesion at a given site and extent may yield different effects (e.g., parallel recovery vs. differential recovery) because of a more effective repair process in one individual compared to another. A number of factors are known to affect the likelihood of recovery from a focal lesion. These factors include age, premorbid IQ/education level, and the integrity of the frontal lobes (see, e.g., Robertson & Murre, 1999). It follows that neuropsychological assessments that focus only on language tasks may fail to detect dimensions critical to recovery.
Lesion Deficit Approach. The patterns of recovery provide evidence of potentially dissociable cognitive systems underlying different languages (e.g., Gollan & Kroll, 2001; Paradis, 2001). Consider, for instance, cases of the preferential recovery of one language over another. In some cases, the mother tongue is recovered better than a language acquired second. In other cases, the converse obtains. Case reports also indicate that the devices involved in translation from one language to another are cognitively distinct from those mediating picture naming or spontaneous speech production. A neurocognitive account must take the further step of identifying the neuroanatomical bases of these devices and systems. Until recently, the primary means for establishing the neuroanatomical representation of such devices was the lesion deficit approach.
This approach is important because it indicates the cortical regions necessary for performance of a linguistic task (e.g., speaking in L1), but it cannot establish whether a given deficit reflects damage to a specialized device at the site of the lesion or to a distributed network with connections that pass through the lesion site. More seriously, from the point of view of correlating lesion site and extent with recovery patterns, it cannot establish whether there is residual capacity in the damaged tissue. It also leaves open other possible mechanisms of recovery (e.g., the use of a duplicate but previously inhibited mechanism) or cognitive changes in the way a given task is performed. This lesion deficit approach needs to be complemented by one involving neuroimaging (see Price & Friston, 1999; Green & Price, 2001).
In the normal brain, positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) can identify the complete set of regions associated with one task relative to another and, critically, how one region interacts with another. However, such methods (described in the Exploring Recovery Patterns section) also have inherent limitations. One limitation is pertinent here: These methods (along with other physiological measures such as single- and multiunit electrophysiology or electroencephalography [EEG]) tell us about the activation or engagement of a system in the performance of a task but not about its necessity (e.g., Brown & Hagoort, 2000; Sarter, Bernston, & Cacioppo, 1996). A combination of neuropsychological assessment, data on lesion site, and neuroimaging data (together with other techniques such as transcranial magnetic stimulation) can help identify regions that are both necessary and sufficient for task performance. Critical to such an endeavor is a view of how the processes that mediate language use map onto the neuroanatomical substrate. The next section addresses this question.
A Neurocognitive Approach: Representation and Control
A neurocognitive approach must characterize the bilingual system at a cognitive level, state how the devices at this level map onto the neuroanatomical networks, and show how damage at the neuroanatomical level can give rise to the observed behavior (see Morton & Frith, 1995, for an insightful
account of causal modeling). It must also state how damaged networks and circuits recover function.
The first part of this section considers the cognitive devices comprising the bilingual system. I presuppose a distinction between thought and language (Clark, 1996; Johnson-Laird, 1983) but grant that in "thinking for speaking" (Slobin, 1996) bilinguals formulate their messages in terms of the concepts of the language (Black & Chiat, 2000; Green, 1998a; Levelt, Roelofs, & Meyer, 1999). The second part specifies how these devices and their properties may be implemented in the neural system.
Cognitive-Level Description
The devices used to perform different tasks refer to actual neurocomputational machines, so an adequate cognitive description needs to include parameters relevant to the working of real devices. The focus in what follows is on the types of device, but keep in mind that each device is not only dedicated to processing information of a certain type but is also both capacity constrained (that is, it processes inputs of a certain type at a limited rate) and resource constrained. It will fail to operate, for example, without the metabolic means to do so. Although the concepts of capacity and resource are distinct and, as discussed, can be identified with distinct neural properties, there is a relation between the two concepts. As resources decline, the functional capacity of the system will decrease, although the precise nature of the decline is an open question.
I distinguish between a device representing the meanings of words, their syntactic properties, and the word forms (the bilingual lexico-semantic system) and devices involved in controlling the outputs from that system. This contrast leads to the expectation that certain patterns of recovery may arise from problems in controlling the bilingual lexico-semantic system rather than from damage to it. Damage to different components of the control system may yield different outcomes. Alternatively, the same broad clinical outcome may arise for different reasons (e.g., damage to the lexico-semantic system or to components of the control system).
To appreciate the problem of control, consider the task of naming a picture in L1. To perform this task, individuals must avoid performing other tasks, such as free-associating to the picture or assessing its aesthetic qualities. Following Green (1998b), I say that individuals must activate a particular task schema that coordinates relevant devices (see also Monsell, 1996). In the case of picture naming, the schema pairs a picture name in L1 (say) with the output of a picture recognition device that has activated a set of lexical concepts. The schema for producing a name in L1 may be in competition with the other schemas, particularly with one to name the picture in L2. Top-down control is achieved in the normal case by a higher level, or executive, system that boosts the relative activation of the target schema (Shallice, 1988). On this account, language control is part of a system for the control of action in general (e.g., Green, 1986, 1998b; Meuter & Allport, 1999; Paradis, 1981), although such a claim does not preclude circuits specialized for the control of linguistic actions (e.g., Paradis, 2001). Problems in production can arise because of difficulty in ensuring that the intended schema is dominant.
For bilingual speakers to name the picture in the intended language, the lexical representations of words also need to be coded or tagged for language in some way (Albert & Obler, 1978) to allow their selection by the task schemas (Green, 1998b; see also De Bot & Schreuder, 1993; Dijkstra & Van Heuven, 1998, for various views on such tagging). Comparable tags (or units coding for language) are also part of the monolingual speaker's repertoire, allowing the selection of vocabulary suited for different registers. Disconnection, or noisy transmission, between schemas and tags or between the representation of lexical concepts and tags provides ways in which a selective pattern of recovery could arise.
How is selection actually achieved? Given a requirement to speak in only one language, selection could be achieved at a late stage by filtering or by inhibiting lemmas or lexical nodes that lack the requisite tag (Green, 1998b). Individuals, of course, must select appropriate words in other tasks. Neuroimaging studies indicate that such selection involves inhibitory processes both in the standard Stroop color-word task (Peterson, Skudlarski, Gatenby, Zhang, Anderson, & Gore, 1999) and, more significantly, in a picture-word interference task (De Zubicaray, McMahon, Eastburn, & Wilson, 2002). In the case of selecting between words in different languages, selection may be biased against the nontarget language by selectively deactivating entire language systems or parts of such systems (De Bot & Schreuder, 1993; Grosjean, 1998, 2001; Paradis, 1981, 2001; see also Rodriguez-Fornells, Rotte, Heinze, Nosselt, & Munte, 2002, for evidence of selection in visual word recognition) or by inhibiting such systems or
their components (Dijkstra & Van Heuven, 1998; Green, 1986). In pathological cases, such a mechanism
explains the temporary loss, or permanent inaccessibility, of a language (it is not destroyed but
inhibited), a conjecture first proposed by Pitres (1895/1983; see Paradis, 2001).

There is debate on the extent to which, or the conditions under which, naming in one language rather
than another involves selection among lexical candidates from both languages or from just one
language (e.g., Costa, Miozzo, & Caramazza, 1999; see also Costa, chapter 15, this volume). A detailed
analysis of the relevant normal data is outside the scope of this chapter, but current evidence
suggests that words in L1 invariably compete for selection when individuals are naming pictures in L2
(Hermans, Bongaerts, De Bot, & Schreuder, 1999). L2 names for pictures also compete for selection
when individuals are naming in L1, provided they are also required to name in L2 within the same
block of trials (Kroll & Peck, 1998, cited in Gollan & Kroll, 2001). Unfortunately, there appear to be
no experimental studies examining competition in grammatical encoding.

Competitive costs can be reduced by differentiating lexical concepts in the two languages.
MacWhinney (1997, p. 120) commented that individuals can limit competition by directly linking the L2
term to its concept rather than by linking it to the L1 lexical item (as in the lexical route in the
Revised Hierarchical Model of bilingual memory representations; see Kroll & De Groot, 1997). The
extent of such differentiation will presumably depend on the type of concept (e.g., whether it is
language specific). It may also depend on proficiency and usage. Fluency within a language may be
viewed as an outcome of tightening within-language links over between-language links. Over time, the
two systems achieve a quasi-independent status equivalent to the subset hypothesis (Paradis, 1981,
2001). Usage may also be important. For instance, when individuals are required to translate between
languages, selection may rely less on concept differentiation and more on selection via language
tags. Computational modeling of systems evolving under different circumstances would be helpful as a
means to explore these conjectures.

Neuroanatomical Description and the Convergence Hypothesis

An important question concerns the extent to which an L2 is processed differently from the L1. In the
normal brain, language functions in monolingual, right-handed individuals are typically represented
in a distributed left hemisphere network (Loring et al., 1990; Springer et al., 1999). In principle,
different languages might be represented in a different neuroanatomical substrate (e.g., in
homologous areas of the right hemisphere).

However, Rapport, Tan, and Whitaker (1983), in a study of right-handed polyglot aphasics prior to
surgery, found no evidence of the disruption of picture naming following intracarotid injection of
sodium amytal into the right hemisphere. In contrast, naming was massively disrupted following
injection into the left hemisphere. Further, in a study of 88 reported cases of right-handed bilingual
aphasics, Fabbro (1999, pp. 210-211) found that only 8% had a lesion to the right hemisphere. Taking
into account reporting biases, he concluded that the incidence of aphasia in bilinguals with right
hemisphere lesions is not in fact higher than that shown by monolingual aphasics. In sum, current
data indicate that although languages form distinct subsets (see Paradis, 1981, 2001), they are
represented in a common substrate.

I suppose, more specifically, the convergence hypothesis (see Green, 2003). According to this
hypothesis, as proficiency in L2 increases, the representation of L2 and its processing profile (i.e.,
event-related potential [ERP] and neuroimaging data) converge with those of native speakers of that
language (see also Abutalebi, Cappa, & Perani, chapter 24, this volume, for related discussion). That
is, any qualitative differences between native speakers of a language and L2 speakers of that
language disappear as proficiency increases (for rather different views see, e.g., Paradis, 1994, and
Ullman, 2001). Current ERP and neuroimaging data are consistent with this hypothesis (e.g.,
Abutalebi, Cappa, & Perani, 2001; Osterhout & McLaughlin, 2000; Weber-Fox & Neville, 2001; but see
also Hahne & Friederici, 2001, and Vaid & Hull, 2002, for a critical review of the neuroimaging data
and Kroll & Dussias, 2004, for an assessment of the ERP and psycholinguistic data on syntactic
processing in L2). Notice that the convergence hypothesis is a claim about neural representation and
processing profiles and not a claim about whether an L2 speaker of a language can simulate or pass as
a native speaker of that language.

Unfortunately, there is a dearth of longitudinal online psycholinguistic or functional imaging
studies specifically on L2 grammatical processing and encoding, so the robustness of this hypothesis
is open to question. For the present, the convergence hypothesis allows simplification of the problem of
mapping devices to neuroanatomical structures. I suppose an identity, at least at the broad anatomic
level, with the representations of monolingual speakers and differentiation at the microanatomic
level (Paradis, 1977, 2001).

In the following paragraphs, I propose a series of identifications (Green & Price, 2001). First, I
consider how the bilingual lexico-semantic system and its control system map onto neural structures.
Conceivably, a given device maps onto a specific neural mechanism in a restricted neuroanatomical
area, but this appears not to be the case with language. Second, I consider the neural
identifications of capacity and resource.

The Bilingual Lexico-Semantic System

Regions sustaining word production can be divided into those involved in articulation (including the
premotor cortex, the supplementary motor area, and the cerebellum) and those involved in retrieving
phonology. Phonological retrieval (Price, 2000) involves the left anterior insula and the left frontal
operculum (part of Broca's area). In the case of reading, the bilateral supramarginal gyri are
implicated in the mapping of orthography to phonology (see discussion of the claim that they are the
site of a language switch).

Both neuropsychological and neuroimaging studies indicate that there is a degree of specialization
within monolingual speakers for syntactic and semantic processes. Breedin and Saffran (1999) reported
a patient, D. M., who was good at detecting grammatical violations despite a pervasive loss of
semantic knowledge. ERP data from normal individuals indicated that there are distinct mechanisms
mediating at least postlexical syntactic and semantic processes (Hagoort, Brown, & Osterhout, 2000).
For instance, N400 (found 400 ms after an event) is sensitive to violations of semantic expectancy,
whereas P600 (found 600 ms after an event) is sensitive to syntactic violations.

ERP data cannot provide direct evidence of the neural sources of such effects. However, functional
imaging studies on grammatical processing and encoding in native speakers (Hagoort et al., 2000)
indicated a common syntactic component subserved by the left frontal area (a dorsal part of Broca's
area and adjacent parts of the middle frontal gyrus). Finally, research on semantic representation of
words identified regions in the temporo-parietal region, the left extrasylvian temporal cortex and
the left anterior inferior frontal cortex. A possible area associated with the integration of syntax
and semantics lies in the anterior temporal pole (e.g., Dronkers, Redfern, & Knight, 2000; Noppeney &
Price, 2004).

Neurocognitive Level of Control

The devices for controlling language tasks are likely to be implemented by control circuits
involving both frontal attentional and subcortical mechanisms (see, e.g., Price, Green, & Von
Studnitz, 1999). For instance, language task schemas may be mediated partly by subcortical neural
mechanisms (e.g., in the basal ganglia; see also Crosson, Novack, & Trenerry, 1988), with the level of
activation modulated both by external input and by frontal systems, including the anterior cingulate
(but cf. Carter et al., 2000). If production in one language rather than another requires suppression
of the schema for producing utterances in the nonselected language, it follows that there must be an
executive input when individuals are required to switch languages on a designated cue (see Jackson,
Swainson, Cunnington, & Jackson, 2001, for ERP evidence in a numeral naming task and Hernandez,
Dapretto, Mazziotta, & Bookheimer, 2001, for functional magnetic resonance imaging [fMRI] evidence in
a picture-naming task). It follows that damage to frontal structures should impair the ability either
to maintain a given language or to avoid switching between languages.

Fabbro, Skrap, and Aglioti (2000) reported the case of S. J. (a Friulian-Italian speaker) with a
lesion to the left prefrontal cortex and part of anterior cingulate. The speaker S. J. showed normal
comprehension in both Italian and Friulian and intact clausal processing in both languages. However,
S. J. was unable to avoid switching into Friulian even when addressing an Italian speaker S. J. knew
spoke no Friulian. Likewise, when required to speak Friulian only, S. J. would switch into Italian.
Switching can only be considered problematic when it arises inappropriately, as in the case of S. J.
(see Grosjean, 2001). I infer that this selective deficit in preventing a language switch, or in
maintaining a monolingual output, was a consequence of a lesion in the anterior cingulate that
precluded one language schema maintaining dominance over the other, although this outcome may also
partly reflect an inability to maintain the communicative goal of speaking in the target language.

The need to suppress an alternate schema should also arise in the case of translation. Presentation
of a word in L1, say, will also trigger naming. Translation in this sense is analogous to a Stroop
task in which a habitual response must be suppressed. To translate from L1 to L2, for instance,
individuals must inhibit an L1 production schema and activate
the schema for L2. This schema-level process can then modulate output from the lexico-semantic
system. Functional imaging studies of performance in Stroop-like tasks all showed increased
activation in the anterior cingulate, which may serve to modulate task schemas. Price et al. (1999;
see also Klein, Milner, Zatorre, Meyer, & Evans, 1995) confirmed such an increase for translation.

If subcortical mechanisms in the basal ganglia are also implicated in selecting the relevant action,
then translation should also increase activation in these regions. The study by Price et al. (1999)
confirmed increases in the relevant areas (the bilateral putamen and the head of caudate). Increases
also were observed in the areas associated with articulation (the supplementary motor area, a ventral
region of the left anterior insula and the cerebellum), consistent with the notion that during
translation responses associated with the input orthography must be inhibited.

Left subcortical lesions also lead to outcomes compatible with the present proposal. The individual
E. M. (Aglioti, Beltramello, Girardi, & Fabbro, 1996) suffered damage to the caudate nucleus and the
putamen and had difficulty maintaining her native Venetan but would constantly switch into Italian, a
language learned only at school and rarely spoken. Damage to the basal ganglia could have limited her
ability to activate the production schema in L1 in competition with that for L2. Such a difficulty
would also lead to problems in naming even in the absence of lexical deficits in L1. Lesions located
at the head of caudate nucleus in the left hemisphere can also elicit pathological language mixing in
which no one language dominates (Abutalebi, Miozzo, & Cappa, 2000). Such cases are consistent with
the idea that lexical representations are accessed under the control of frontal-basal ganglia
circuits (see also Abutalebi et al., 2001).

Lesions in other areas can result in bilinguals displaying good comprehension in both languages but
with an ability to speak in just one of them. Potzl (1925) supposed that the left parietal area
played a central role in language switching, and that damage to it prevented switching from one
language to another. However, there are patients with lesions in that area who showed no such
difficulties (see Paradis, 2001, p. 81).

The precise role of the parietal regions in language switching still needs to be determined. Price
et al. (1999) found increased activation during language switching not only in a region of Broca's
area (Brodmann area [Ba] 44) but also in the bilateral supramarginal gyri. The former region has been
associated with phonemic segmentation and the latter with mapping orthography to phonology (see
discussion of the claim that they are the site of a language switch), but because a given region may
subserve a number of functions, it is preferable to claim that both regions are activated in
phonological processing tasks. Jackson et al. (2001) reported a sustained increase in the size of an
ERP component (the late positive complex) over the parietal region in a numeral-naming task when
individuals had to switch from naming in one language to naming in another. One interpretation of
these two sets of data, consistent with the present proposal, is that the parietal region is involved
in implementing a change in stimulus-response mapping driven by the task schema.

Capacity and Resource

We can identify the capacity of a cognitive device with the average number of functionally intact
neural units in the relevant neural mechanism (see Shallice, 1988, p. 233). The capacity of the device
for retrieving the phonology of words, for instance, may relate to the number of functioning cells in
specific regions such as the posterior inferior frontal cortex (Broca's area). Alternatively, a better
index of capacity may be the interconnectivity of neural units. In contrast, the resources can be
identified with the metabolites, neurotransmitters, or neuromodulators needed to operate the neural
mechanism. These two identifications lead to the expectation that restricting the number of neural
units (e.g., via a stroke) reduces capacity and so may impair performance.

Likewise, reducing resource (e.g., through the loss of the cells producing a resource) to a given
neural mechanism in the absence of any change in its capacity may impair performance. One line of
support for this claim comes from unmedicated patients with Parkinson's disease. They have reduced
dopamine levels in the prefrontal cortex (caused by damage of the cells in the substantia nigra) and
show deficits in working memory (e.g., Levin, Labre, & Weiner, 1989). Conversely, healthy adults given
a dopamine agonist show working memory improvements (e.g., Muller, Von Cramon, & Pollmann, 1998).

Exploring Recovery Patterns

The key aim of this section is to consider how neuroimaging studies can be used to examine the
causal basis of different patterns of recovery. I first review different possible mechanisms of
recovery. I argue for the importance of Hebbian learning as a primary mechanism for the restitution
of function and point to how different mechanisms of recovery might be distinguished in terms of
their activation patterns. PET and fMRI allow assessment of such patterns. I describe these methods
briefly and discuss some of the methodological prerequisites for using these methods to study
patients. The final part outlines possible studies of patients with different patterns of recovery
with a view to determine their causal basis.

Mechanisms of Recovery

Recovery may be achieved via different mechanisms (e.g., Code, 2001; Papathanasiou & Whurr, 2000;
Rickard, 2000), yielding recovery based on normal cognitive processes or not. Individuals might
compensate for loss of function by developing a new strategy and so deploy different cognitive
processes involving different neural regions. In other cases, there may be restitution of function
within the same, or neighboring, neural networks using learning processes identical to those that led
to the formation of the network. Following Robertson and Murre (1999), I suppose that recovery from
brain damage involves a process of Hebbian learning (Hebb, 1949).

In Hebbian learning, two neurons or neuronal groups or circuits can reconnect if they are activated
at the same time. Spontaneous recovery can arise by random activation of one of the groups in the
case of well-connected networks with small lesions. Activation spreads through the network, and any
currently activated groups become reconnected. At the other extreme, neural self-repair is impossible
if circuits are too disconnected or lack neurones, and only compensation is possible. At some
intermediate point, restitution is possible given suitable input. Intuitively, a network less well
connected may be more sensitive to the precise nature of the inputs, and simulations discussed by
Robertson and Murre (1999, pp. 553-557) showed that when an intermediate number of connections is
lost, restitution does depend on the partially disconnected network receiving targeted (patterned)
stimulation that allows appropriate reconnection (see Harley, 1996, and Plaut, 1996, for existence
proofs of recovery of function in other types of networks).

Hebbian learning may be important in allowing function to be restored in areas surrounding the lesion
site. Restitution of function may also be achieved by creating patterns of connectivity in
neighboring neural networks (neural plasticity). Neuroimaging can differentiate these alternatives,
given the tasks are performed normally, by determining whether the regions activated are identical to
those in normal bilingual controls. When different regions are involved, neuroimaging will reveal
activity in different areas for patients relative to normal controls.

What further factors (specific or general) may affect recovery of function? Damage to an area can
sometimes suppress activity in a relatively remote undamaged area (diaschisis), thereby temporarily
impairing performance in tasks in which the functionality of that area is required. Restitution of
function occurs when diaschisis is reversed and yields normal activity in that region. Behaviorally,
recovery because of the reversal of diaschisis is also likely to occur earlier than recovery because
of compensation.

Damaged circuits may also fail to recover function because of suppression from undamaged circuits
that compete to control output. In this case, new lesions that reduce the activation of intact
networks can lead to enhanced functioning by allowing Hebbian learning to take place in the network
that was damaged initially (see Kapur, 1996, for a review of paradoxical facilitation).

More generally, recovery may also reflect attentional factors. Deficits in attentional control
(specifically, sustained attention) are strong predictors of recovery from brain damage. Attentional
control may be important not only because it is a factor in providing suitable input to the damaged
areas, but also because of its connection to the arousal system. Neurotransmitters (e.g.,
noradrenaline) associated with the arousal system are also strongly implicated in cortical
plasticity. Hence, the importance of wider cognitive, and pharmacological, assessments of bilingual
aphasics in the study of recovery patterns.

Neuroimaging

Hemodynamic methods (PET and fMRI) rely on close coupling between changes in the activation of a
population of neurons and change in blood supply. A hemodynamic effect arises only when there is a
change in the overall metabolic demand in a neuronal population. PET and fMRI track different
signals. PET measures the decay of a short-lived isotope, which accumulates in a neural region in
proportion to the amount of blood flowing through that region. The most typical fMRI method indexes
metabolic demand, and hence relative neural activity, by assessing the ratio of deoxyhemoglobin to
oxyhemoglobin in the blood.

Each method has advantages and disadvantages. Minimally, PET studies examine changes in patterns of
activation by contrasting conditions that differ in the cognitive operation of interest. Trials of a
certain type have to be blocked, and the number of observations is restricted because PET involves
the administration of ionizing radiation. There are no such constraints in the case of fMRI. However,
PET has the advantage that it is more or less equally sensitive to activity in all brain regions,
whereas fMRI is not. The magnetic signal is susceptible to factors other than blood oxygenation
levels, making it difficult to record from certain regions (e.g., the orbitofrontal region).

PET and fMRI offer important advantages for the study of recovery patterns. To explore recovery, we
need to be able to chart changes. At a minimum, performance needs to be assessed after the acute
phase and at some later time. Because both methods track changes in the whole brain, neural activity
can be measured in the absence of overt manual or vocal response (Price & Friston, 2002). In
consequence, processing can be assessed even when there is no ability to speak a language or even,
apparently, to understand it. There may also be normal effects in one region but abnormal effects in
another. So, for example, in listening to a story in a nonrecovered language, activity in the
auditory regions may be normal, but there may be abnormal effects in regions associated with
semantics. At a later point in recovery, both regions may show normal response.

Neuroimaging also allows consideration of how areas work together. In the normal case, there is a
functional integration of different areas. If control is normal, then the anterior cingulate will
modulate activity in the basal ganglia normally. This circuit will provide a normal modulatory
influence on the systems mediating word production. In contrast, diaschisis, for example, will yield
abnormal patterns of activation. Buchel, Frith, and Friston (2000) described methods for examining
effective connectivity ("the influence one neuronal system exerts on another," p. 339) using
structural equation modeling of the patterns of activation in different regions of interest.

Methodological Cautions

Functional imaging studies of patients require that the performance level of the patient and normal
controls is matched (Price & Friston, 1999). If the patient cannot perform the task, for instance, the
corresponding neuronal responses will not be elicited. Further, if the patient performs the task but
does so in a way different from normal bilinguals, the neuronal abnormality will covary with the
cognitive abnormality, and it is not possible to distinguish the cause of the neuronal abnormality
(see Green & Price, 2001, for elaboration within the bilingual context). Differences in the way a task
is performed may be detectable in the patterns of reaction time or error to stimuli of different
types or in individuals' verbal protocols of what they are doing (Rickard, 2000).

Differences in activation pattern may also reflect differences in the relative difficulty of the task
for the patient compared to a matched control even when the overt performance level is closely
matched. One check here is to examine changes in activation patterns with variations in task
difficulty. If restitution of function is occurring within normally activated regions, then as
relative difficulty decreases for the patient, there will be convergence with the patterns shown by
normal controls. In contrast, if patients and controls show different stable patterns over variations
in task difficulty, then it is reasonable to infer that restitution is an outcome of neural
plasticity (Rickard, 2000, p. 308).

Neuroimaging Bilingual Aphasics

Problems of control seem to offer a ready account of certain recovery patterns. The case of S. J.
(Fabbro et al., 2000), mentioned in the Patterns of Loss and Recovery section, offers a clear
instance. Lexical representations were intact, but there was a problem in ensuring that one language
schema continued to dominate another. I attribute this difficulty to the lesion in the area of the
anterior cingulate. Drugs (e.g., dopamine agonists) that modulate activity in the anterior cingulate,
and so alter the resources available to it, may improve the ability of patients like S. J. to speak
just one of their languages. It follows that the effective connectivity of regions associated with
language control and regions associated with word production should again be normal when the
individual is switching between languages.

Alternating antagonism also seems suited to a pure control explanation (Green, 1986). The subject
A. D. (Paradis et al., 1982) was a French-Arabic speaker with a lesion in the temporo-parietal region
of the left hemisphere who presented with a
specific form of alternating antagonism during the course of recovery. On one day, she was able to
speak French spontaneously but not Arabic. On the following day, she was able to speak Arabic
spontaneously but not French. However, on the day, for instance, when she was unable to speak Arabic
spontaneously but could speak French spontaneously, she was able to translate into Arabic, suggesting
that the lexical representations of that language were available for production. By contrast, on the
same day that she could speak French spontaneously, she could not translate into it. This pattern of
performance suggests that part of her problem lay in selecting between competing language task
schemas of a given type (e.g., translating into L1 vs. translating into L2) once one had become
dominant or perhaps in linking a nondominant schema to the relevant lexical concepts.

An exploration of the control problem might begin by considering the patient's ability to handle
conflict tasks. For instance, with standard Stroop stimuli, the patient must suppress the normal
reading response to name the hue in which a color word is printed. Compared to normal bilingual
controls, patients with alternating antagonism might show an abnormal pattern of correlated activity
in the anterior cingulate when required to process such stimuli. As in the case of S. J., if resource
constraints underlie the problem, impaired performance should improve with the administration of a
dopamine agonist.

These patterns of recovery are rare, so it is important to examine whether other, more common
patterns reflect problems with the control mechanism. In the next section, we consider how
neuroimaging studies may contribute to better understanding of four such recovery patterns.

Parallel Recovery

If control processes are intact, then the regions associated with control and those associated with
word production should modulate normally in both languages (i.e., there should be the normal pattern
of effective connectivity) during word production and conflict tasks. Recovery will then primarily
reflect restitution in the lexico-semantic system. Hebbian mechanisms provide more complete recovery
when the network connections are better preserved, so the extent of perilesional activation is likely
to be critical to the recovery of function (see Warburton, Price, Swinburn, & Wise, 1999, for evidence
of perilesional activation in monolingual aphasic patients). If this is so, the link between the
amount of preserved functional capacity in the perilesional tissue and different recovery patterns
can be explored. Perilesional activity might initially be greater for both L1 and L2 in patients
showing parallel recovery of both their languages compared to patients showing either selective or
antagonistic recovery.

Parallel recovery does not entail that both languages are recovered in the same manner. If there is
restitution of function in one case but compensation in the other, then only the former language will
show activation patterns during task performance (e.g., picture naming) indistinguishable from those
of normal bilingual controls.

Differential, Selective, and Antagonistic Patterns of Recovery

Viewed dynamically, a differential pattern may arise because use of one language rather than the
other during the initial phase of recovery leads to greater restitution of its network via Hebbian
learning. Selective recovery, in contrast, may arise because progressive use of just one language
consolidates its network and progressively isolates it from the other.

An alternative possibility is that both patterns reflect problems of control. For instance, there
could be damage to the mechanism that selects the intended language or a disconnection or disruption
of the link connecting the representation of the meanings of words and the units coding for language
(or alternatively between the units coding for language and the schema). Selective recovery, as
opposed to differential recovery, reflects greater difficulty in selecting one language over another.
A third possibility is that the lexico-semantic system of one language is marginally more impaired
than that of the other language, and this impairment leads to a problem in controlling that language.
This lack of control blocks its recovery via Hebbian mechanisms and isolates it from the other
language. One way of advancing research in this area is to determine the nature of the
representations accessible under these patterns of recovery.

Consider a selective pattern of recovery. Under a strong control hypothesis, there is access to
meaning but an inability to select lexical concepts in the nonrecovered language. If there is access
to the meaning of words but an inability to select lexical concepts in the nonrecovered language,
then individuals should still show semantic interference. Consider a Spanish-English bilingual with
selective recovery of Spanish performing the following task. Individuals are required to press one
key provided an arrow points in one direction and another key if the arrow points in the opposite
direction. Pairing the arrow with an incongruent direction word in
English (e.g., ← RIGHT) as contrasted with a row of Xs should slow reaction time, reflecting an increase in response conflict. Such a conflict might also be detectable in specific brain regions such as the anterior cingulate. In the recovered or better-recovered language, for which semantic access is preserved as assessed by this online task, neuronal activity in bilingual aphasics should pattern in the same way as in bilingual controls. In contrast, when there is evidence of semantic access but an inability to select the lexical concept for production, the regions associated with language control will activate abnormally, an example of differential diaschisis (cf. Price, Warburton, Moore, Frackowiak, & Friston, 2001).

If there is access to meaning, is there also access to word form? Consider the task of determining whether a predesignated phoneme (e.g., /t/) is present in the Spanish name of a pictured object such as a table (mesa in Spanish; Colomé, 2001; Hermans, 2000). Normal bilinguals are slower to reject the target phoneme if it is present in the translation equivalent compared to control trials in which it is not. If segments of the word form of the nonrecovered language are activated, then patients also should show this phoneme interference effect. Further, there will be increased activation in the anterior cingulate on incongruent trials compared to control trials. If, on the other hand, there is no access to the word form, then bilingual aphasics will react similarly to monolingual controls.

Antagonistic recovery may be construed as a special case of selective recovery. Why should the recovery of one language be impaired when a second language improves (e.g., Paradis & Goldblum, 1989)? A control explanation of the antagonistic pattern of recovery might be as follows. Initial language use is probabilistically determined and does not directly reflect the rate at which recovery will occur for the two languages. A small difference in the rates of recovery, reflecting perhaps different degrees of damage to the lexico-semantic system, will be sufficient to induce different end states in the course of language use. An initially less-dominant language schema becomes more and more dominant (via Hebbian learning), inhibiting use of the other language schemas and increasing the connectivity within the lexico-semantic system for the selected (and initially less-well-recovered) language. Such an account presumes that, after an initial phase in which competition between languages is weak, inhibiting the schemas for the better-recovered language becomes more difficult.

As in the case of selective recovery, what information is available for the less-recovered language can be assessed. The arrow task described in this section offers a partial test. At an early stage of recovery, the patient would be tested in the most-recovered language, and we would look for evidence for response conflict from the less-recovered language. At a later stage, the patient would be tested in the other language, and we would look for the reverse effects. Comparable studies are possible for examining access to word form.

Implications for Rehabilitation

Understanding recovery patterns has the practical goal of developing a principled basis for rehabilitation. The argument is that understanding the causal basis permits a targeted intervention. Rehabilitation in aphasia is a complex topic (see, e.g., Code, 2000), so my illustrations here are not intended as clinical prescriptions. For a pure control problem, as in the case of S. J. (Fabbro et al., 2000; discussed in the first paragraph of the Neuroimaging Bilingual Aphasics section), a pharmacological intervention might be appropriate. But, to the extent inappropriate switching stems from a failure to maintain the communicative goal (e.g., speak in L1), an appropriate intervention might seek to increase the person's capacity to sustain attention (see Robertson & Murre, 1999, for relevant techniques) and so maintain the communicative goal.

As indicated, different patterns of recovery may reflect the consequences of random stimulation to a damaged network. Such stimulation can lead to an adaptive outcome, at least for small lesions (e.g., parallel recovery of both languages), but also, for larger lesions, to maladaptive outcomes: A given circuit may become connected to a formerly distinct circuit. One relevant factor here may be the extent to which individuals are aware of their speech output (Robertson & Murre, 1999). Shuren, Hammond, Maher, Rothi, and Heilman (1995) reported the case of a jargon aphasic who was unaware of his errors in the normal course of events but recognized errors when listening to a tape recording of his own speech. This suggests that part of his problem was attentional; when the attentional load (involved in planning and in producing speech) was reduced, he was able to recognize problems.

Recognition of error is important because in its absence circuits involved in flawed production may become connected. In the case of bilingual
aphasics, pathological mixing may arise from an initial problem of control becoming entrenched. A possible intervention might then be to record, and to play back, the mixed speech and to create conditions, with short utterances at first, during which the same language is maintained. Training individuals to overcome interference (e.g., in the standard Stroop task) may also be helpful.

One possible cause of selective recovery is difficulty in binding lexical concepts to a language tag. Associating various kinds of language-specific contextual cues (e.g., music or scenes) to lexical items may help re-create the units coding for language and allow these to be linked to an initial set of lexical concepts.

It is reasonable to expect that treatments that work for monolingual aphasics may be helpful for bilingual/polyglot aphasics (Juncos-Rabadán, Pereiro, & Rodríguez, 2002), but there is one possibility potentially available for the bilingual aphasic that is used quite spontaneously. Bilingual aphasics with parallel recovery frequently self-cue and produce the correct word in the nontarget language to retrieve the intended word (e.g., Juncos-Rabadán et al., 2002; Roberts & Le Dorze, 1998). One important area for future research will be to explore the benefits of implicit techniques such as priming. Can priming in a language that is accessible affect access to representations in a language previously inaccessible? What are the conditions for such an effect to occur?

Summary

Paradis (2001) asked: What determines the particular type of recovery? In the case of nonparallel recovery, what selects a particular language for preferential recovery over another? My answer is a call to action. We need to gain a more complete neurocognitive picture of patients to construct an adequate causal theory. With this aim in mind, this chapter emphasized the usefulness of the contrast between representation and control.

In terms of the studies required, neuroimaging research on normal bilinguals, guided by adequate theory, is a critical prerequisite. Given the goal, we must develop tests that are maximally general; they should be readily convertible to different pairs of languages and be capable of being carried out at different stages in the recovery process.

Conjectures on the causal basis of recovery patterns will also be usefully complemented by simulation studies examining the conditions under which different patterns of recovery may arise. Such studies provide existence proofs and need to be constrained by data from neuroimaging studies to ensure their neurological plausibility.

Finally, the project to understand the patterns of recovery requires suitable databases. Reports of unselected cases of bilingual aphasics are rare. Ideally, we need to create researchable databases in which relevant data (lesion site, language background, performance on standardized tests, cognitive test performance, and functional imaging data) can be used to test and explore different models. In the short term, the most tractable way forward is through intensive studies of single cases combined with psycholinguistic and neuroimaging studies of normal bilinguals.

Acknowledgments

I am grateful to the editors and to Cathy Price for helpful comments and suggestions. I thank the Wellcome Trust (grant 074735) for support.

References

Abutalebi, J., Cappa, S. F., & Perani, D. (2001). The bilingual brain as revealed by functional imaging. Bilingualism: Language and Cognition, 4, 179–190.
Abutalebi, J., Miozzo, A., & Cappa, S. F. (2000). Do subcortical structures control language selection in bilinguals? Evidence from pathological language mixing. Neurocase, 6, 101–106.
Aglioti, S., Beltramello, A., Girardi, F., & Fabbro, F. (1996). Neurolinguistic follow-up study of an unusual pattern of recovery from bilingual subcortical aphasia. Brain, 119, 1551–1564.
Albert, M. L., & Obler, L. K. (1978). The bilingual brain: Neuropsychological and neurolinguistic aspects of bilingualism. New York: Academic Press.
Ansaldo, A. I., & Joanette, Y. (2002). Language mixing and language switching. In F. Fabbro (Ed.), Advances in the neurolinguistics of bilingualism (pp. 261–274). Udine, Italy: Forum.
Black, M., & Chiat, S. (2000). Putting thoughts into verbs: Developmental and acquired impairments. In W. Best, K. Bryan, & L. Maxim (Eds.), Semantic processing: Theory and practice (pp. 52–79). London: Whurr.
Breedin, S. D., & Saffran, E. M. (1999). Sentence processing in the face of semantic loss: A case study. Journal of Experimental Psychology: General, 128, 547–562.
Brown, C. M., & Hagoort, P. (2000). The cognitive neuroscience of language: Challenges and future directions. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 3–14). Oxford, U.K.: Oxford University Press.
Büchel, C., Frith, C., & Friston, K. (2000). Functional integration: Methods for assessing interactions amongst neuronal systems. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 337–355). Oxford, U.K.: Oxford University Press.
Carter, C. S., Macdonald, A. M., Botvinick, M., Ross, L. L., Stenger, A. V., Noll, D., et al. (2000). Parsing executive processes: Strategic vs. evaluative functions of the anterior cingulate cortex. Proceedings of the National Academy of Sciences, 97, 1944–1948.
Clark, H. H. (1996). Communities, commonalities and communication. In J. J. Gumperz & S. C. Levinson (Eds.), Rethinking linguistic relativity (pp. 324–355). Cambridge, U.K.: Cambridge University Press.
Code, C. (2000). Multifactorial processes in recovery from aphasia: Developing the foundations for a multi-level framework. Brain and Language, 77, 25–44.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: Language-specific or language independent? Journal of Memory and Language, 45, 721–736.
Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the bilingual's lexicons compete for selection? Journal of Memory and Language, 41, 365–397.
Crosson, B., Novack, T. A., & Trenerry, M. R. (1988). Subcortical language mechanisms: Windows on a new frontier. In H. A. Whitaker (Ed.), Phonological processes and brain mechanisms (pp. 24–58). New York: Springer.
De Bot, K., & Schreuder, R. (1993). Word production and the bilingual lexicon. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 191–214). Amsterdam: Benjamins.
De Zubicaray, G. I., McMahon, K. L., Eastburn, M. M., & Wilson, S. J. (2002). Orthographic/phonological facilitation of naming responses in the picture-word task: An event-related fMRI study using overt vocal responding. NeuroImage, 16, 1084–1093.
Dijkstra, A., & Van Heuven, W. J. B. (1998). The BIA model and bilingual word recognition. In J. Grainger & A. Jacobs (Eds.), Localist connectionist approaches to human cognition (pp. 189–225). Mahwah, NJ: Erlbaum.
Dronkers, N. F., Redfern, B. B., & Knight, R. T. (2000). The neural architecture of language disorders. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 949–960). Cambridge, MA: MIT Press.
Fabbro, F. (1999). The neurolinguistics of bilingualism: An introduction. Hove, U.K.: Psychology Press.
Fabbro, F., Skrap, M., & Aglioti, S. (2000). Pathological switching between languages following frontal lesions in a bilingual patient. Journal of Neurology, Neurosurgery and Psychiatry, 68, 650–652.
Gollan, T. H., & Kroll, J. F. (2001). Lexical access in bilinguals. In B. Rapp (Ed.), A handbook of cognitive neuropsychology: What deficits reveal about the human mind (pp. 321–345). New York: Psychology Press.
Green, D. W. (1986). Control, activation and resource. Brain and Language, 27, 210–223.
Green, D. W. (1998a). Bilingualism and thought. Psychologica Belgica, 38, 253–278.
Green, D. W. (1998b). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Green, D. W. (2003). The neural basis of the lexicon and the grammar in L2 acquisition. In R. van Hout, A. Hulk, F. Kuiken, & R. Towell (Eds.), The interface between syntax and the lexicon in second language acquisition (pp. 197–218). Amsterdam: Benjamins.
Green, D. W., & Price, C. (2001). Functional imaging in the study of recovery patterns in bilingual aphasics. Bilingualism: Language and Cognition, 4, 191–201.
Grosjean, F. (1998). Studying bilinguals: Methodological and conceptual issues. Bilingualism: Language and Cognition, 1, 131–149.
Grosjean, F. (2001). The bilingual's language modes. In J. Nicol (Ed.), One mind, two languages: Bilingual language processing (pp. 1–22). Oxford, U.K.: Blackwell.
Hagoort, P., Brown, C. M., & Osterhout, L. (2000). The neurocognition of syntactic processing. In C. M. Brown & P. Hagoort (Eds.), The neurocognition of language (pp. 273–316). Oxford, U.K.: Oxford University Press.
Hahne, A., & Friederici, A. D. (2001). Processing a second language: Late learners' comprehension mechanisms as revealed by event-related brain potentials. Bilingualism: Language and Cognition, 4, 123–141.
Harley, T. A. (1996). Connectionist model of the recovery of language functions following brain damage. Brain and Language, 52, 7–24.
Hebb, D. O. (1949). The organization of behaviour: A neuropsychological theory. New York: Wiley.
Hermans, D. (2000). Word production in a foreign language. Unpublished doctoral dissertation, University of Nijmegen, The Netherlands.
Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1999). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213–229.
Hernandez, A. E., Dapretto, M., Mazziotta, J., & Bookheimer, S. (2001). Language switching and language representation in Spanish-English bilinguals: An fMRI study. NeuroImage, 14, 510–520.
Jackson, G. M., Swainson, R., Cunnington, R., & Jackson, S. R. (2001). ERP correlates of executive control during repeated language switching. Bilingualism: Language and Cognition, 4, 169–178.
Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference and consciousness. Cambridge, U.K.: Cambridge University Press.
Juncos-Rabadán, O., Pereiro, A. X., & Rodríguez, M. J. (2002). Treatment of aphasia in bilingual subjects. In F. Fabbro (Ed.), Advances in the neurolinguistics of bilingualism (pp. 275–298). Udine, Italy: Forum.
Kapur, N. (1996). Paradoxical functional facilitation in brain-behaviour research. Brain, 119, 1775–1790.
Klein, D., Milner, B., Zatorre, R., Meyer, E., & Evans, A. (1995). The neural substrates underlying word generation: A bilingual functional-imaging study. Proceedings of the National Academy of Sciences U.S.A., 92, 2899–2903.
Kroll, J. F., & De Groot, A. M. B. (1997). Lexical and conceptual memory in bilinguals: Mapping form to meaning in two languages. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 169–199). Mahwah, NJ: Erlbaum.
Kroll, J. F., & Dussias, P. E. (2004). The comprehension of words and sentences in two languages. In T. K. Bhatia & W. C. Ritchie (Eds.), The handbook of bilingualism (pp. 169–200). Oxford, U.K.: Blackwell.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–75.
Levin, B. E., Labre, M. M., & Weiner, W. J. (1989). Cognitive impairments associated with early Parkinson's disease. Neurology, 39, 557–561.
Loring, D. W., Meador, K. J., Lee, G. P., Murro, A. M., Smith, J. R., Flanigin, H. F., et al. (1990). Cerebral language lateralization: Evidence from intracarotid amobarbital testing. Neuropsychologia, 28, 831–838.
MacWhinney, B. (1997). Second language acquisition and the competition model. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 113–142). Mahwah, NJ: Erlbaum.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical costs of language selection. Journal of Memory and Language, 40, 25–40.
Monsell, S. (1996). Control of mental processes. In V. Bruce (Ed.), Unsolved mysteries of the mind: Tutorial essays on cognition (pp. 93–148). Hove, U.K.: Taylor & Francis.
Morton, J., & Frith, U. (1995). Causal modelling: A structural approach to developmental psychopathology. In D. Cicchetti & D. J. Cohen (Eds.), Manual of developmental psychopathology (pp. 357–390). New York: Wiley.
Müller, U., Von Cramon, D. Y., & Pollmann, S. (1998). D1- versus D2-receptor modulation of visuo-spatial working memory in humans. Journal of Neuroscience, 18, 2720–2728.
Nilipour, R., & Ashayeri, H. (1989). Alternating antagonism between two languages with successive recovery of a third in a trilingual aphasic patient. Brain and Language, 36, 23–48.
Noppeney, U., & Price, C. J. (2004). An fMRI study of syntactic adaptation. Journal of Cognitive Neuroscience, 16, 702–713.
Osterhout, L., & McLaughlin, J. (2000, April). What brain activity can tell us about second-language learning. Paper presented at the 13th Annual CUNY Conference on Human Sentence Processing, San Diego, CA.
Papathanasiou, I., & Whurr, R. (2000). Recovery of function in aphasia. In I. Papathanasiou (Ed.), Acquired neurogenic disorders: A clinical perspective (pp. 28–48). London: Whurr.
Paradis, M. (1977). Bilingualism and aphasia. In H. Whitaker & H. A. Whitaker (Eds.), Studies in neurolinguistics (Vol. 3, pp. 65–121). New York: Academic Press.
Paradis, M. (1981). Neurolinguistic organization of the bilingual's two languages. In J. E. Copeland & P. W. Davis (Eds.), The Seventh LACUS Forum (pp. 486–494). Columbia, SC: Hornbeam Press.
Paradis, M. (1994). Neurolinguistic aspects of implicit and explicit memory: Implications for bilingualism and second language acquisition. In N. Ellis (Ed.), Implicit and explicit language learning (pp. 393–419). London: Academic Press.
Paradis, M. (1995). Epilogue. In M. Paradis (Ed.), Aspects of bilingual aphasia (pp. 211–223). New York: Pergamon/Elsevier Science.
Paradis, M. (2001). Bilingual and polyglot aphasia. In R. S. Berndt (Ed.), Handbook of neuropsychology, 2nd ed. Vol. 3: Language and aphasia (pp. 69–91). Amsterdam: Elsevier Science.
Paradis, M., & Goldblum, M. C. (1989). Selected crossed aphasia in a trilingual patient followed by reciprocal antagonism. Brain and Language, 36, 62–75.
Paradis, M., Goldblum, M. C., & Abidi, R. (1982). Alternate antagonism with paradoxical translation behaviour in two bilingual aphasic patients. Brain and Language, 15, 55–69.
Peterson, B. S., Skudlarski, P., Gatenby, J. C., Zhang, H., Anderson, A. W., & Gore, J. C. (1999). An fMRI study of Stroop word-color interference: Evidence for cingulate subregions subserving multiple distributed attentional systems. Biological Psychiatry, 45, 1237–1258.
Pitres, A. (1895). Étude sur l'aphasie chez les polyglottes. Revue de médecine, 15, 873–899. Translation in M. Paradis (Ed.), Readings on aphasia in bilinguals and polyglots (pp. 26–49). Montreal, Canada: Didier, 1983.
Plaut, D. C. (1996). Relearning after damage in connectionist networks: Towards a theory of rehabilitation. Brain and Language, 52, 25–82.
Pötzl, O. (1925). Über die parietal bedingte Aphasie und ihren Einfluss auf das Sprechen mehrerer Sprachen. Zeitschrift für die gesamte Neurologie und Psychiatrie, 12, 145–162.
Price, C. J. (2000). The anatomy of language: Contributions from functional neuroimaging. Journal of Anatomy, 197, 335–359.
Price, C. J., & Friston, K. J. (1999). Scanning patients with tasks they can perform. Human Brain Mapping, 8, 102–108.
Price, C. J., & Friston, K. J. (2002). Functional imaging studies of neuropsychological patients: Applications and limitations. Neurocase, 8, 345–354.
Price, C. J., Green, D., & Von Studnitz, R. (1999). A functional imaging study of translation and language switching. Brain, 122, 2221–2236.
Price, C. J., Warburton, E. A., Moore, C. J., Frackowiak, R. S. J., & Friston, K. J. (2001). Dynamic diaschisis: Anatomically remote and context-sensitive human brain lesions. Journal of Cognitive Neuroscience, 13, 419–429.
Rapport, R. L., Tan, C. T., & Whitaker, H. A. (1983). Language function and dysfunction among Chinese and English speaking polyglots: Cortical stimulation, Wada testing, and clinical studies. Brain and Language, 18, 342–366.
Rickard, T. C. (2000). Methodological issues in functional magnetic resonance imaging studies of plasticity following brain injury. In H. S. Levin & J. G. Grafman (Eds.), Cerebral reorganization of function after brain damage (pp. 304–317). Oxford, U.K.: Oxford University Press.
Roberts, P. M., & Le Dorze, G. (1998). Bilingual aphasia: Semantic organization, strategy use, and productivity in semantic verbal fluency. Brain and Language, 65, 287–312.
Robertson, I. H., & Murre, J. M. (1999). Rehabilitation of brain damage: Brain plasticity and principles of guided recovery. Psychological Bulletin, 125, 544–575.
Rodriguez-Fornells, A., Rotte, M., Heinze, H.-J., Nösselt, T., & Münte, T. F. (2002). Brain potential and functional MRI evidence for how to handle two languages with one brain. Nature, 415, 1026–1029.
Sarter, M., Berntson, G., & Cacioppo, J. (1996). Brain imaging and cognitive neuroscience: Towards strong inference in attributing function to structure. American Psychologist, 51, 13–21.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge, U.K.: Cambridge University Press.
Shuren, J. E., Hammond, C. S., Maher, L. M., Rothi, L. J. G., & Heilman, K. M. (1995). Attention and anosognosia: The case of a jargon aphasic patient with unawareness of language deficit. Neurology, 45, 376–378.
Slobin, D. I. (1996). From thought and language to thinking for speaking. In J. J. Gumperz & S. C. Levinson (Eds.), Rethinking linguistic relativity (pp. 177–202). Cambridge, U.K.: Cambridge University Press.
Springer, J. A., Binder, J. R., Hammeke, T. A., Thomas, A., Swanson, S. J., Bellgowan, P. S. F., et al. (1999). Language dominance in neurologically normal and epilepsy subjects: A functional MRI study. Brain, 122, 2033–2046.
Tomasello, M. (1995). Language is not an instinct. Cognitive Development, 10, 131–156.
Ullman, M. T. (2001). The neural basis of lexicon and grammar in first and second language: The declarative/procedural model. Bilingualism: Language and Cognition, 4, 105–122.
Vaid, J., & Hull, R. (2002). Re-envisioning the bilingual brain using functional neuroimaging: Methodological and interpretive issues. In F. Fabbro (Ed.), Advances in the neurolinguistics of bilingualism (pp. 315–355). Udine, Italy: Forum.
Warburton, E. A., Price, C. J., Swinburn, K., & Wise, R. J. S. (1999). Mechanisms of recovery from aphasia: Evidence from positron emission tomography studies. Journal of Neurology, Neurosurgery and Psychiatry, 66, 155–161.
Weber-Fox, C. M., & Neville, H. J. (2001). Sensitive periods differentiate processing for open and closed class words: An ERP study in bilinguals. Journal of Speech, Language, and Hearing Research, 44, 1338–1353.
Judith F. Kroll
Natasha Tokowicz
26
Models of Bilingual Representation and Processing: Looking Back and to the Future
associated with a selective view of language processing that suggested that there might be a language switch that effectively enabled one language and shut the other down as needed (e.g., Macnamara & Kushnir, 1971). As Van Heuven, Dijkstra, and Grainger (1998) pointed out, these questions can be viewed independently of one another. It is possible to have shared memory representations with selective access or separate representations with parallel and nonselective access. One question is about the representation or code; the other is about the process of accessing that information.

A third respect in which the early bilingual models failed to characterize cognitive activity adequately concerns their scope. Few distinctions were made to address the distinct demands of comprehension, production, or memory. Models were considered general, and predictions were tested for comprehension, production, and memory as if they were the same. Tests of particular models might therefore succeed or fail depending on the relation of the nature of evidence to the hypothesized mechanism.

Finally, although early research focused on different types of bilingualism (e.g., Weinreich, 1953), later models for the most part did not consider the consequences of the bilingual's learning history or the developmental changes associated with increasing skill in the second language (L2; see De Groot & Poot, 1997; Kroll & Stewart, 1994; MacWhinney, 1997; Mägiste, 1984; and Potter, So, Von Eckardt, & Feldman, 1984, for exceptions).

We can illustrate the problem in the early models using an example drawn from an often-cited article by Kirsner, Smith, Lockhart, King, and Jain (1984). Our example is not intended to single out these authors. To the contrary, this paper made an important contribution by demonstrating that it was possible to obtain cross-language semantic priming even when the two languages are mixed and orthographically distinct. Kirsner et al. contrasted five models of the bilingual lexicon using a scheme initially proposed by Meyer and Ruddy (1974). These models are shown in Fig. 26.1. In each of the five configurations, words in the bilingual's two languages have separate representations or share the same representation. The grouping of words vertically reflects their semantic relations, whereas the horizontal connections depict lexical connections across translation equivalents. In Model A, there are separate representations for words in each language and lexical connections only within, but not across, languages. Kirsner et al. rejected this extreme version of the separate model on the grounds that it would prevent translation. Model B maintains separate lexical nodes for words in each language but includes translation links across languages. Model D, like Models A and B, assumes separate lexical representations. However, now there are not only the translation links of Model B, but also cross-language connections to associated words. Model C is an extreme version of the integrated model, with shared lexical nodes and therefore shared semantic relations within and across languages. The final alternative, Model E, assumes shared conceptual representations but separate lexical representations for each language.

Model E has been taken by some (e.g., Kroll & Sholl, 1992; Potter et al., 1984) as a solution to the apparent controversy surrounding the issue of separate versus shared language representation. If only semantic, but not lexical, representations are shared, then tasks that reflect lexical-level processing will tend to support independence across the two languages, whereas tasks that reflect semantic processing will tend to support the common interdependent alternative.

The pattern of cross-language priming results obtained by Kirsner et al. (1984) allowed them to reject all but Models D and E, which they were unable to distinguish on the basis of the observed data. But, the point of this example for present purposes is that the family of models shown in Fig. 26.1 fails to address each of the issues identified above. Although lexical and semantic representations are distinguished, no information is specified concerning the form of those representations. The models do not identify orthographic and phonological aspects of lexical form or provide adequate detail that might allow the models to handle cases in which precise translation equivalents do not exist.

Likewise, no assumptions are made about the selectivity of lexical access, although the arrangement of separated versus integrated lexical representations would appear to suggest selective versus nonselective access. The priming focus of the Kirsner et al. (1984) article suggests that the models are intended to capture aspects of comprehension, but the arrangement in Model E was initially proposed by Potter et al. (1984) to account for bilingual performance in language production tasks. Thus, distinctions are not drawn between initial access from word to concept in comprehension and later lexicalization from concept to word in production.

Finally, the architecture and processing within the lexicon in this family of alternatives appears to
reflect an ultimate arrangement for proficient bilinguals that ignores their learning history, structural differences between their two languages, and their relative language dominance. How the relations between word and concept develop as L2 learners become more proficient in the L2 and how the proficient state of the lexicon may reflect dynamic changes in language use and activity are not addressed.

Figure 26.1 Five models of the bilingual lexicon. Adapted from Kirsner et al. (1984).

In response to the problems described, contemporary models have become more specialized, focusing on particular aspects of the codes that may be shared across languages; on the way in which lexical or grammatical information is accessed in bilinguals under specific task conditions such as reading, listening, speaking, or remembering; and on the processes that characterize the cognitive changes that enable L2 learners to become proficient bilinguals. These cognitive changes include the component processes that allow the development of skilled performance in the L2 (e.g., see Segalowitz & Hulstijn, chapter 18, this volume) as well as the control mechanisms that permit attention to be allocated appropriately to the desired language and language task (e.g., Bialystok, chapter 20, this volume; Green, 1986, 1998; Michael & Gollan, chapter 19, this volume). In this context, the issue of independence or interdependence of bilingual language systems has not been abandoned but instead recast to accommodate what we now know about language processing and the cognitive mechanisms that support it.

In this chapter, we provide an overview of models that have brought research on the representation and processing of two languages to its contemporary state. Our review is necessarily selective because of the length limitations of the present text and the availability of other recent reviews of related material in this volume (e.g., Costa, chapter 15; Dijkstra, chapter 9; La Heij, chapter 14; Sánchez-Casas & García-Albea, chapter 11; and Thomas & Van Heuven, chapter 10) and others (e.g., De Groot, 2002; Dijkstra & Van Heuven, 2002; Gollan & Kroll, 2001; Kroll & De
Groot, 1997; Kroll & Dijkstra, 2002; Kroll & Dussias, 2004; Kroll & Sunderman, 2003; Kroll &
Tokowicz, 2001). We focus specifically on models that address the issues raised in this section,
namely, which aspects of representation are shared and to what extent access to them is selective,
how cross-language interactions change in the face of different task demands, and how the course
of L2 acquisition affects the form of representations and connections across the two languages.

Furthermore, we restrict our discussion to models of lexical representation and to accounts of
on-line processing because a disproportionate number of studies have addressed the bilingual
lexicon. (For a review of models and evidence regarding syntactic processes, see Frenck-Mestre,
chapter 13, this volume; Kroll & Dussias, 2004; MacWhinney, 1997, and chapter 3, this volume;
Pienemann, Di Biase, Kawaguchi, & Håkansson, chapter 7, this volume). Questions about memory
retrieval outside the time frame of initial comprehension or production are also of interest but
beyond the scope of the present review. (For a review of research on bilingual memory, see
Durgunoğlu & Roediger, 1987; Francis, chapter 12, this volume; Marian & Neisser, 2000; Paivio,
1991; Schrauf & Rubin, 1998).

Finally, where there are available data, we consider the theoretical implications of recent
neuroimaging evidence for models of bilingual representation and processing (see Abutalebi,
Cappa, & Perani, chapter 24, this volume, for a comprehensive review of the imaging research).
Interestingly, cognitive neuroscience approaches to bilingualism have returned to the question of
whether the bilingual's two languages are represented in a separate or integrated memory system
by asking whether and where there is distinct neural activity associated with each language. In
some respects, the theoretical implications of this approach are potentially regressive with
respect to model development. However, the new imaging evidence has also served the important
function of reviving interest in how the timing and context of L2 learning has an impact on the
organization of the two languages in the brain. We briefly consider the implications of this new
approach in our review.

A Review of Models: Levels of Representation, Processing Tasks, and Development

We now turn to a review of specific models. Our review is organized into three sections. The
first considers models that illustrate the ways in which assumptions have been made about
different levels of representation. The second examines models proposed to address particular
language-processing tasks, such as comprehension or production. The final section focuses on
developmental issues and their consequences for proficient bilingual performance.

Levels of Representation

An advance in modeling the bilingual lexicon came from the recognition that different aspects of
the lexical code may distinctly constrain the form of cross-language interactions. The development
of computational models of word recognition in the monolingual domain (e.g., Grainger & Jacobs,
1996; McClelland & Rumelhart, 1981; Seidenberg & McClelland, 1989) generated a number of
alternative candidates that, with some modification, appeared to provide a reasonable extension
to the bilingual case. Grainger and Dijkstra (1992) and Dijkstra and Van Heuven (1998) first
proposed such an extension of the Interactive Activation Model, called the BIA or Bilingual
Interactive Activation model. The BIA model, a localist connectionist model, is at present the
bilingual model of word form that has been studied most extensively. We describe it only briefly
because two chapters in the present volume examine the model, its associated evidence, recent
extensions, and its relation to distributed models (Dijkstra, chapter 9; Thomas & Van Heuven,
chapter 10). We then examine the Distributed Feature Model (De Groot, 1992a), a model focused
specifically on semantic representation.

Word Form. The BIA model (see Fig. 10.1 in Thomas & Van Heuven, chapter 10, this volume) borrows
the basic architecture of McClelland and Rumelhart's (1981) Interactive Activation model, such
that processing is initiated by visual input from text and proceeds in a bottom-up manner from
letter features to letters to words. The BIA model assumes that the lexicon is integrated across
languages, and that lexical access is parallel and nonselective. Thus, during early stages of word
recognition, patterns of activation and inhibition within and across levels of representation are
hypothesized to be language blind. However, unlike models of word recognition for monolinguals,
the bilingual model requires that there be a basis on which words can eventually be correctly
selected in
the intended language. In the BIA model, the mechanism introduced to achieve language selection is
the layer of language nodes. The language nodes, as represented in the original BIA model, inhibit
words in the nontarget language. Their top-down influence is not thought to alter the early
activation of words in each of the two languages but to increase the later likelihood of selecting
a word from the intended language (see Dijkstra & Van Heuven, 2002, for a revision of this
mechanism in the BIA+ model).

Unlike earlier bilingual models, the BIA model proposes a precise mechanism for the way in which
orthographic forms are activated in two languages when a bilingual recognizes visually presented
words. For languages with orthographies that are similar, there will be parallel activation that
results in competition at the lexical and sublexical levels. This property of the model has been
investigated by exploiting the presence of words with a form that overlaps across languages. These
include cognates, translation pairs that share word form and meaning; interlingual homographs,
words that are similar in form in both languages but not translation equivalents; and orthographic
neighbors, words in each language with a lexical form that is only slightly different from the
target word. If lexical access is nonselective across languages, then the consequences of
cross-language activity should influence recognition performance. If lexical access is selective,
then the presence of other-language form relatives should be irrelevant, and processing should
proceed in the same way as for a monolingual reader.

Thus, according to the selective view, a word like room, which is an interlingual homograph in
English and Dutch (in which the word room means cream), would be processed in each language as if
it were an unambiguous word. In contrast, according to the nonselective view, both language senses
of the word would be active and compete, similar to the competition observed for lexically
ambiguous words within a language.

Over a large number of studies and using a range of experimental paradigms, convincing support for
the nonselective alternative has been reported (see Dijkstra, chapter 9, this volume, for details
and Thomas & Van Heuven, chapter 10, this volume, for a description of how the model has been
implemented). A particularly compelling aspect of the evidence on word recognition is that the
activity of the L2 influences the native language even when the task is performed in the native
language alone (e.g., Van Hell & Dijkstra, 2002).

Although the initial evidence for the BIA model focused on orthographic interactions across
languages, more recent evidence on phonology suggests that phonological codes are also active in
both languages during word recognition (e.g., Brysbaert, Van Dyck, & Van de Poel, 1999; Dijkstra,
Grainger, & Van Heuven, 1999; Jared & Kroll, 2001; Jared & Szucs, 2002; Marian & Spivey, 1999;
Schwartz, Kroll, & Diaz, 2003). For example, Schwartz et al. showed that the time for
English-Spanish bilinguals to name cognates was a function of the match between the orthographic
and phonological similarity of their forms in the two languages. These findings and others suggest
that it is not orthography alone but the interaction between orthography and phonology across the
bilingual's two languages that determines the course of visual word recognition.

The inclusion of cross-language phonological activity requires an extension to the BIA model that
was described by Dijkstra and Van Heuven (2002) in a model now called the BIA+ and by Van Heuven
(2000) (see the semantic, orthographic, and phonological interactive activation [SOPHIA] model in
Fig. 10.3, chapter 10, this volume). The new models introduce both lexical and sublexical
phonology to account for the observed patterns of orthographic and phonological interaction.
Finding cross-language phonological interactions also accommodates the observation that
nonselectivity is not restricted to languages with a visual form that is similar (e.g., see
Gollan, Forster, & Frost, 1997, for an example of priming between Hebrew and English).

The specificity of the BIA model allows clear predictions to be tested about the form of
cross-language interactions during visual word recognition. However, by accounting for only one
aspect of lexical form, the model fails to fully characterize word recognition in and out of
meaningful context. Just as the new extensions in the BIA+ and SOPHIA models include phonological
representations, they also now include semantic representations to address this issue.

Only a few studies have investigated the effects of semantic and syntactic context on
cross-language activation during lexical access (e.g., Altarriba, Kroll, Sholl, & Rayner, 1996;
Elston-Güttler, 2000; Schwartz, 2003; Van Hell, 1998). The answer to the question of whether
context can modulate the parallel activation of information about words in both languages and, if
so, at what level in the system will be critical to the development of the next generation of
models. The preliminary
evidence on this issue suggests that, at least under some circumstances, it is possible to
modulate the presence of cross-language interactions. Both Van Hell (1998) and Schwartz (2003)
found evidence that, in the context of a highly constrained sentence, cross-language activity was
reduced. However, for the same materials, in low-constraint sentences and out of context, there
was clear evidence for cross-language influences. An intriguing aspect of these results is that
the language of the context itself does not appear to determine selectivity. If lexical access was
fundamentally selective, then we might expect that the language of the sentence would be an
effective cue to selection. Yet, in each of these studies, the low-constraint sentence context did
not override nonselectivity. Like cross-language semantic priming studies in which the language of
the prime has been manipulated (e.g., De Bruijn, Dijkstra, Chwilla, & Schriefers, 2001), these
results suggest a limited role for language-specific cues per se, at least in comprehension.

Although most of the evidence for language nonselectivity has been based on behavioral measures, a
few studies have examined neural activity during bilingual word recognition. De Bruijn et al.
(2001) found similar N400 effects in an event-related potential (ERP) study for target words
semantically related to an interlingual homograph regardless of the language bias preceding the
homograph. In ERP research, the N400 has been taken as an index of lexical and semantic
processing. Finding similar results regardless of language context would appear to support the
nonselective account provided by the BIA model. In contrast, an ERP and functional magnetic
resonance imaging (fMRI) study by Rodriquez-Fornells, Rotte, Heinze, Nösselt, and Münte (2002)
claimed that selectivity was possible in a language decision task in which bilingual participants
could use phonological information to avoid cross-language interference (but see Grosjean, Li,
Münte, & Rodriquez-Fornells, 2003, for a critique of these results).

Marian, Spivey, and Hirsch (2003) also examined the parallel activation of the two languages by
comparing cross-language interactions in an eye movement paradigm and using fMRI measures. Their
results fall somewhere between the two alternatives, with some evidence for shared areas of brain
activation and other evidence for differences. They suggested that the presence of both
similarities and differences is consistent with a time course account in which the same cortical
mechanisms are likely to be activated for both languages during early stages of processing, but as
processing proceeds, patterns of activation may become more distinct. It will remain to be seen
how well the developing body of neuroimaging evidence fits with the nonselective account that
appears so compelling on the basis of the behavioral data.

Meaning. We turn now to the representation of meaning because a complete model of the bilingual
lexicon will require that the semantics be specified (see Francis, 1999, and chapter 12, this
volume, for a comprehensive review of research on semantic and conceptual representation in
bilinguals). Much of the research on language processing in bilinguals has assumed that the same
semantic representations are accessed for both languages (e.g., Costa, Miozzo, & Caramazza, 1999;
La Heij et al., 1990; Potter et al., 1984) because past research has appeared to support that
assumption. For example, in a bilingual variant of primed lexical decision, semantically related
words prime each other even when the prime and target appear in different languages (e.g.,
Altarriba, 1990; Chen & Ng, 1989; Keatley, Spinks, & De Gelder, 1994; Meyer & Ruddy, 1974;
Schwanenflugel & Rey, 1986; Tzelgov & Eben-Ezra, 1992). Although the pattern of cross-language
priming often reveals an asymmetry, with more priming from the first language (L1) to L2, the
presence of cross-language priming at all, especially under conditions that minimize the
likelihood of translation, suggests that some shared meaning is accessed across languages.
Similarly, in a Stroop-type picture-word production task, interference is observed when a
distractor word is semantically related to the picture's name regardless of the language in which
it appears (e.g., Costa et al., 1999; Hermans, 2000; Hermans, Bongaerts, De Bot, & Schreuder,
1998).

The reliance on out-of-context tasks in which participants translate concrete nouns or name
pictured objects may have contributed to the assumption of shared semantics because these
circumstances may be the most likely to evoke the same meaning in each of the bilingual's
languages. It also made it reasonable to ignore distinctions between semantic and conceptual
representations and the more general implications of lexical semantics that would be relevant in
ordinary sentence contexts. In contrast, research on linguistic relativity focuses on just those
circumstances that are most likely to evoke different meanings across languages, for example, when
there are no precise translation equivalents or when the linguistic or cultural context biases the
appropriate sense of
meaning (see Pavlenko, 1999, and chapter 21, this volume, for a discussion of linguistic
relativity).

To accommodate the possibility of both shared and separate semantics under different
circumstances, De Groot and colleagues (De Groot, 1992a, 1992b, 1995; De Groot, Dannenburg, & Van
Hell, 1994; Van Hell, 1998; Van Hell & De Groot, 1998) proposed a model of bilingual semantics
called the Distributed Feature Model (see Fig. 26.2). A key assumption in the model is that the
degree to which semantic representations are shared across languages is a consequence of the
word's lexical category. Representations for concrete nouns and cognates are assumed to be quite
similar across languages, whereas representations for abstract nouns and noncognates are assumed
to be more distinct. In some respects, the Distributed Feature Model uses word type to model
lexico-semantic representations in a manner that resembles the way in which older models used the
learning history of the bilingual to model compound versus coordinate representations (e.g.,
Lambert, 1969). In one case, attributes of the language representation determine the architecture
of the system. In the other, attributes of the language user constrain the nature of the
representations and their relations (see the discussion of compound vs. coordinate bilingualism in
a later section of this chapter).

Figure 26.2 The Distributed Feature Model. Adapted from Van Hell and De Groot (1998).

The Distributed Feature Model predicts that the degree of overlap across translation equivalents
will determine the time it takes speakers to translate from one language to the other or to
recognize whether two words are the correct translation equivalents of one another. To the extent
that translation requires semantic processing (De Groot et al., 1994; La Heij, Kerling, & Van der
Velden, 1996; but see Kroll & Stewart, 1994), then the speed of access to semantic representations
should influence performance. In a series of studies, De Groot and colleagues (De Groot, 1992a,
1992b, 1995; De Groot et al., 1994; Van Hell, 1998; Van Hell & De Groot, 1998) confirmed these
predictions. Using a range of tasks that included translation production, translation recognition,
lexical decision, and word association, these experiments showed that the time to recognize and
produce translation equivalents is shorter when the word pairs are concrete nouns or cognates. Van
Hell and De Groot also demonstrated that word associations were more similar across languages for
concrete words and cognates than for abstract words and noncognates.

The Distributed Feature Model assumes that the semantic system itself is shared across the
bilingual's two languages. The features that comprise the pool of semantic primitives are
hypothesized to be
available to either language (see Illes et al., 1999, for neuroimaging evidence that the same
areas of the brain are active when bilinguals make concreteness judgments about words in each of
their languages). How those features combine within a language then determines the similarity of
particular concepts. Earlier research on concreteness effects within a single language showed that
concrete words have higher contextual availability than abstract words (e.g., Schwanenflugel &
Shoben, 1983). Abstract words appear to depend more on provided context for their meaning than
concrete words. The Distributed Feature Model extends this idea to the bilingual case. If the
context in which words are processed differs across languages and cultures, then the meaning of
abstract words will depend more on the context in which their sense is instantiated than the
meaning of concrete words.

In a study with Dutch-English bilinguals who performed a semantic rating task on translation pairs
(i.e., how similar are these two words?), Tokowicz, Kroll, De Groot, and Van Hell (2002) found
that indeed concrete translation equivalents were more likely to share meaning than abstract
translation equivalents, although no distinction between cognate and noncognate translations was
found. A further difference reported by Tokowicz et al. was that words with more than one
translation equivalent were rated as less semantically similar to their translation equivalents
than words with only one translation equivalent. This finding suggests that the existence of
alternate translations influences how adequate each individual translation is considered. Indeed,
Tokowicz (2000) found that the time to produce translation equivalents in a translation production
task was slower for words considered to have multiple translation equivalents than for words with
only a single dominant translation. These results suggest that bilinguals can have (nearly)
identical concepts for some words and different concepts for other words. The consequence of
differences in the degree of overlap in meaning for the nature of cross-language processing will
depend on the nature of the task, the bilingual's level of proficiency, and the context of
acquisition (e.g., Dufour & Kroll, 1995; Finkbeiner, Forster, Nicol, & Nakamura, 2004; Jiang,
2000; Silverberg & Samuel, 2004).

Processing Tasks

Even with a model of bilingual representation that provides an adequate characterization of
lexical form and meaning, we would need to make additional assumptions about the ways in which the
goals associated with different tasks utilize those representations. The same lexicon may underlie
word recognition and word production, but the manner in which lexical processes are initiated and
the demands on processing resources associated with each task may differentially constrain
performance (see Kroll & Dijkstra, 2002, for a comparison of bilingual comprehension and
production). Here, we consider two models that address the issue of how the task determines the
nature of the information that is retrieved and the manner in which processing is controlled. One
is a model of lexical production to account for the way in which a bilingual initiates a spoken
word in response to the requirement to name a picture, translate a word, or speak a thought. The
other is a model of the control processes that are hypothesized to be recruited so that only the
intended language is selected for comprehension or production.

Production. Far less research on the bilingual lexicon has examined production relative to
comprehension. Although code switching has been studied extensively by linguists and
sociolinguists (see Myers-Scotton, chapter 16, this volume), it is only recently that the methods
developed by psycholinguistics to study speech production within the native language (e.g.,
Levelt, 1989; Levelt, Roelofs, & Meyer, 1999; Peterson & Savoy, 1998) have been extended to the
bilingual case (e.g., Colomé, 2001; Costa & Caramazza, 1999; Costa et al., 1999; De Bot &
Schreuder, 1993; Gollan & Silverberg, 2001; Hermans et al., 1998; Miller & Kroll, 2002). Our
purpose in the present discussion is not to provide a comprehensive review of this work (see
Costa, chapter 15, and La Heij, chapter 14, in this volume). Rather, we hope to illustrate how the
components of a model of lexical access in production will necessarily address some different
issues than a model of comprehension.

We adopt a model of spoken production proposed first by Poulisse and Bongaerts (1994) and later
extended by Hermans (2000) (see Fig. 26.3). The model characterizes the bilingual lexicon in the
face of the requirement to speak a word in one language or the other. Like models in the
monolingual domain (e.g., Levelt et al., 1999), the production model shown in Fig. 26.3 assumes
that there are three levels of representation engaged in translating an idea into a spoken word.
First, the idea must be represented conceptually. If the event that initiates speaking is a
pictured object, as shown in the model, then this first step will involve
recognizing the object and accessing its meaning. In this model, there is also a language cue
represented at the conceptual level. The language cue signals information about the language in
which the utterance is to be spoken. If lexical access in production is language selective, then
the intention to speak a word in one of the bilingual's languages rather than the other might
suffice to turn the remaining steps in the production process into the monolingual case. However,
experiments suggest that even when bilinguals know that they will be speaking only one of their
two languages, that knowledge is not sufficient to effectively switch off the activation of other
language alternatives (e.g., Colomé, 2001; Costa, Caramazza, & Sebastián-Gallés, 2000; Kroll,
Dijkstra, Janssen, & Schriefers, in preparation).

Figure 26.3 A model of bilingual language production. Adapted from Poulisse and Bongaerts (1994)
and Hermans (2000).

Following conceptual processing, a set of lemmas is hypothesized to be activated in each of the
bilingual's two languages. Although there is some debate about what information is stored at the
lemma level, particularly regarding grammatical category, there is general agreement that at this
level abstract lexical representations for words in each language are activated. Unlike the
assumptions of an integrated lexicon in comprehension models such as BIA or BIA+, the assumption
here is that the syntactic constraints specified at the lemma level will require that lemmas are
necessarily distinct for words in each of the bilingual's languages. At the final level depicted
in the model, the phonology of the spoken word is specified. Like the assumption of the
Distributed Feature Model for semantics, the production model assumes that
each language draws on a common pool of phonological features. Thus, although there may be
distinct aspects of the phonology associated with each language, the assumption is that the
phonological system itself is shared so that common phonological elements in each language will
activate the same or similar representations.

A key question about production concerns the sequencing of these representations prior to the
production of a spoken word or sentence. In research on monolingual production, this issue has
been at the center of a debate about the nature of lexical access and the interactions between
syntax and semantics (e.g., Dell, 1986; Levelt et al., 1999; Vigliocco & Hartsuiker, 2002). In the
bilingual domain, the issue is potentially even more complicated because the language of speaking
also has to be determined. If representations are activated for both languages in parallel, then
the question of whether they compete for selection is critical because proficient bilinguals will
have at least two words available for each concept. In the model shown in Fig. 26.3, there is
activation of candidates in both languages at the lemma level, but the assumption is that
selection occurs at this level (e.g., see Hermans et al., 1998) and that phonology is specified
only for the language the person intends to speak. The evidence suggests that there is in fact
activation to the level of the phonology for words in both languages (e.g., Colomé, 2001; Costa et
al., 2000; Kroll et al., in preparation). The theoretically difficult question is whether all
activated information competes for selection or whether the language cue can effectively guide the
selection process. In this volume, consult Costa (chapter 15), La Heij (chapter 14), and Meuter
(chapter 17) for an in-depth discussion of this particular issue.

The point for present purposes is that how the very same representations may be engaged will
differ depending on whether the task requires comprehension or production. In production, the
initiating event consists of conceptual activity. The corresponding sequence of processing from
concepts to words will necessarily engage feedback from semantics that may or may not be available
during comprehension, for which the sequence from words to concepts is more likely to be driven by
properties of the stimulus input rather than its meaning. In the production model illustrated
here, the language cue is hypothesized to be encoded as part of the conceptual representation of
the event that initiates production. In a comprehension model such as BIA or BIA+, the language
nodes are not activated until relatively late in the processing sequence. Given these differences,
it is perhaps surprising that the empirical evidence suggests that lexical access is language
nonselective for both comprehension and production. However, it is important not to conclude that
the apparent similarities arise from the same mechanism. Particularly in production, it should be
possible in theory to use context and available cues to bias selection. A number of studies
suggest that this may be the case (e.g., Bloem & La Heij, 2003; Miller & Kroll, 2002).

Control. Like many cognitive models, those that characterize the bilingual lexicon have been
designed without much concern for how the cognitive system actually manages to produce actions in
response to task goals. The problem is especially acute in light of the evidence for language
nonselectivity in both comprehension and production. If candidates in both of the bilingual's
languages are routinely available in comprehension and in production regardless of the intention
to use one language only, then a mechanism must be in place to modulate the resulting competition
and to control performance. A number of models have addressed the control problem by finding
solutions within the lexicon itself (e.g., the language nodes served this purpose in the original
BIA model) or by positing a mechanism that falls outside the lexicon but uses the output of the
lexical system to achieve proficient performance. Green (1998) proposed the Inhibitory Control
(IC) Model, shown in Fig. 26.4, to accomplish this goal. The model is described in some detail by
Meuter (chapter 17, this volume), so our review touches only on its central points.

Like other models of production, the IC model assumes that a conceptual representation is
generated at the onset of planning. That conceptual activity in turn activates both the
lexico-semantic system and the supervisory attentional system (SAS). The role of the SAS is to
control the activation of task schemas for particular language processing goals. Thus, the task
schema for naming a picture in the L1 would differ from the schema for naming a picture in L2 or
translating a word from L1 to L2. The IC model further assumes that lemmas are tagged for language
membership. A critical function of the task schemas is to activate lemmas in the intended language
and to inhibit lemmas in the unintended language. Because the level of competition created by this
process will require attentional resources, the degree of inhibitory control required for a
bilingual to perform a particular task will be related to the relative
activation of lemmas in each language. For example, if lexical candidates in the more dominant L1
are active when a bilingual is attempting to name a picture in L2, then the inhibitory processes
required to modulate the competition from L1 to L2 will be greater than those required when the
task is performed in the L1 itself.

An important source of evidence regarding inhibitory control comes from experiments on deliberate
language switching (see Meuter, chapter 17, this volume, for a review of this literature).
Research on language switching has shown that switch costs are greater when bilinguals are
required to switch into L1 relative to L2 (e.g., Meuter & Allport, 1999). The asymmetry in switch
costs appears at first counterintuitive because it might be thought that the more dominant L1 will
always be automatically available. However, from the perspective of the IC model, the result makes
a good deal of sense because switch costs will be greatest when the processing on the trial just
prior to the switch induces a great deal of competition and therefore requires significant
inhibition. It is precisely this mechanism that is hypothesized to occur when a bilingual performs
a task such as picture naming or number naming in the L2. If L1 lemmas compete prior to the
selection of the word to be produced in L2, then the inhibition of the L1 lemmas will produce a
cost when L1 is the target language to produce on the subsequent trial.

Figure 26.4 The Inhibitory Control Model. Adapted from Green (1998). SAS, supervisory attentional
system.

A complete understanding of how attentional and task processes influence performance will be an
important feature of the next generation of models of the bilingual lexicon (see Dijkstra & Van
Heuven, 2002, and Von Studnitz & Green, 2002, for an illustration of how this mechanism might work
in comprehension). As Meuter (chapter 17, this volume) notes, it is not entirely clear how to
think about the scope of the inhibitory mechanism proposed within the IC model. Identifying the
factors that determine the range of inhibition will be critical as it seems unreasonable to think
that the entire language is inhibited at once. Furthermore, the IC model, as depicted in Fig.
26.4, makes few assumptions about the architecture of the lexicon itself, an issue that we
considered in some detail in the previous sections of this chapter. It will be important to
understand how assumptions about representation and processing within the lexicon constrain, and
are constrained by, the attentional mechanisms that serve as an interface to the more general
cognitive system and to action.

An aspect of the IC model that has only recently been investigated concerns the implications of
inhibitory control for the achievement of L2 proficiency. If the mental juggling that appears to
be required by using two languages can only be effectively controlled by the allocation of
sufficient attentional resources, then individuals who already possess high memory capacity may be
advantaged
542 Aspects and Implications of Bilingualism
L2 learners. A number of studies investigating individual differences in memory and attentional
resources for L2 acquisition suggest that this may be the case (see Michael & Gollan, chapter 19,
this volume). Furthermore, in the case of highly skilled translators and interpreters, it seems
clear that they possess extraordinary cognitive resources that enable their remarkable performance
(see Christoffels & De Groot, chapter 22, this volume).

What is less clear is the direction of causality. We do not yet know whether individuals with
enhanced cognitive abilities are more likely to become skilled bilinguals or whether the process of
becoming bilingual has the positive consequences of enhancing cognitive skills more generally. The
evidence on young bilingual children (see Bialystok, chapter 20, this volume) provides compelling
support for the view that bilingualism itself may confer a set of cognitive benefits that extend
beyond language processing to executive control functions. However, little is known about whether
the advantages observed in early childhood endure into adulthood or whether late acquisition of an
L2 has similar consequences.

What is evident in this discussion is that models of bilingual processing and its control will
require an account that considers not only representation and control, but also developmental
aspects of language acquisition. In the final section of our review, we turn to models of bilingual
representation and processing that have focused on these developmental issues.

Developing Second Language Proficiency

Early Versus Late Acquisition: Compound Versus Coordinate Bilingualism

We now consider how the context in which L2 is acquired may affect semantic and lexical
representations and their interconnections. Although some researchers have viewed L2 acquisition
from the perspective of the critical period hypothesis (e.g., Lambert, 1972), our review considers
aspects of L2 acquisition that are not typically the focus of research on that topic (see Birdsong,
chapter 6, and DeKeyser & Larson-Hall, chapter 5, this volume, for current reviews of the critical
period hypothesis). Depending on the acquisition context of the bilingual, a distinction has been
made between compound and coordinate bilinguals. Generally defined, compound bilinguals are
individuals who learn two languages simultaneously, in the same context, whereas coordinate
bilinguals are individuals who learn their two languages in succession, in separate contexts.
However, it should be noted that there has been disagreement in past research concerning the
definitions of compound and coordinate bilingualism and whether this distinction is a continuum or
rather a dichotomy (Lambert, 1969).

One of the hypothesized differences between compound and coordinate bilinguals regards the
similarity of concepts referred to by L1 and L2 words. It was believed that compound bilinguals had
one set of concepts that could be referenced by either language, whereas coordinate bilinguals had
two sets of concepts, each set uniquely available to one of the two languages. This distinction has
been applied not only to semantic aspects of the language, but also to syntactic, phonological, and
cultural aspects (e.g., Lambert, Havelka, & Crosby, 1958).

To test the compound/coordinate distinction, Lambert (1961) used the semantic differential to
determine whether translation equivalents shared meaning in the two languages. In this task, the
bilingual rates a word relative to adjectives (e.g., how cold is a house?) and then does the same
for the translation equivalent. The theoretical prediction was that compound bilinguals would rate
translation equivalents more similarly than coordinate bilinguals. Lambert found that compound
bilinguals did give similar ratings to translation equivalents, but that the ratings of coordinate
bilinguals depended on their particular acquisition context: coordinates who learned L1 and L2 in
the same cultural context rated translations similarly, whereas coordinates who learned the two
languages in different cultural contexts rated translations dissimilarly. These findings showed
that the experience of the learner is important above and beyond the manner of learning. Although
the evidence on the semantic differential revealed differences between compound and coordinate
bilinguals, there were also questions whether these differences reflected the most critical aspects
of the learning context.

More recent research has examined the neural consequences of the language-learning context. Whereas
psycholinguistic research has moved away from the question of whether the two languages are stored
together or separately in favor of asking what circumstances elicit behavior that appears similar
or different, neuroimaging studies of bilingualism have returned to the question to ask whether the
same neural tissue is activated during processing of the two languages. In some instances, the
diffusion of activation as well as the activated areas are considered important.
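
The logic of the semantic differential comparison discussed earlier in this section can be made concrete with a small sketch. It is only an illustration under stated assumptions: the scale names, the 7-point ratings, and the French word used as the translation equivalent are all hypothetical, and Euclidean distance between rating profiles is assumed as the similarity measure because the text does not say which measure Lambert used.

```python
import math


def profile_distance(ratings_a, ratings_b):
    """Euclidean distance between two rating profiles on the same scales.

    A smaller distance means the two words were rated more similarly.
    """
    if ratings_a.keys() != ratings_b.keys():
        raise ValueError("both words must be rated on the same scales")
    return math.sqrt(sum((ratings_a[s] - ratings_b[s]) ** 2 for s in ratings_a))


# Hypothetical 7-point ratings for "house" and a translation equivalent.
# None of these numbers come from Lambert's data.
house = {"cold-warm": 5, "weak-strong": 4, "bad-good": 6}
maison_same_context = {"cold-warm": 5, "weak-strong": 4, "bad-good": 6}
maison_different_context = {"cold-warm": 2, "weak-strong": 5, "bad-good": 4}

# Identical profiles: the pattern predicted for shared concepts.
compound_distance = profile_distance(house, maison_same_context)

# Diverging profiles: the pattern predicted for separate concepts.
coordinate_distance = profile_distance(house, maison_different_context)

print(compound_distance < coordinate_distance)
```

On this reading, the compound/coordinate prediction amounts to a claim about which group shows the smaller average distance across word pairs.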
Models of Representation and Processing 543
A number of studies have reported that early bilinguals are more likely than late bilinguals to
activate the same brain regions when processing L1 and L2 (e.g., Kim, Relkin, Lee, & Hirsch, 1997).
However, because early/late bilingualism is often confounded with proficiency, Abutalebi et al.
(chapter 24, this volume) concluded that age of acquisition (AoA) is not as important as the degree
of proficiency attained for determining whether the same neural substrates serve the two languages:
When individuals are highly proficient in their two languages, the languages appear to use the same
neural networks, whereas distinct networks are used when bilinguals are not very proficient in L2,
at least when tested using language production tasks (e.g., Chee, Tan, & Thiel, 1999; Illes et al.,
1999; Klein, Milner, Zatorre, Meyer, & Evans, 1995; but see Perani et al., 2003, for evidence that
the language acquired first may be associated with reduced brain activation during lexical
retrieval tasks).

In contrast to the conclusion of neuroimaging studies suggesting that proficiency may be more
important than the context of acquisition, studies of brain laterality show that early bilinguals
have more bilateral hemispheric involvement for the native language than monolinguals and late
bilinguals, even when the late bilinguals are proficient in L2 (see Hull & Vaid, chapter 23, this
volume, for the results of a meta-analysis including a large number of bilingual laterality
studies). This finding suggests that learning an L2 early in life leads to a qualitative difference
in how language is processed by the brain above and beyond language proficiency. Note that although
the conclusions of the imaging and laterality studies may seem contradictory on the surface, they
are not necessarily at odds with one another because the neuroimaging evidence concerns similarity
between areas used by the two languages, whereas the laterality evidence concerns the areas used by
the brain to process L1.

Another factor that may be critical in determining the nature of meaning representations for words
in the two languages is age of acquisition (AoA). There is ample evidence suggesting that, even
when word frequency is controlled, words that are learned earlier in life have a processing
advantage over words that are learned later in life (e.g., Gerhand & Berry, 1998; Morrison & Ellis,
1995; Morrison, Ellis, & Chappell, 1997). Izura and Ellis (2002) reported a similar finding for the
effect of AoA in L2. Regardless of L1 AoA, L2 words that were learned early have a processing
advantage over L2 words learned later. This result suggests that the age at which a word is
acquired will influence the connections between that word and its corresponding meaning. It also
suggests that, for late L2 learning, the AoA of an L2 word does not simply inherit the AoA of its
translation equivalent. The implication for bilinguals who learn their languages later than early
childhood is that L2 words will not be as strongly connected to their meanings as L1 words.

Some support for that conclusion was reported in a study by Silverberg and Samuel (2004) in which
they compared semantic priming effects in bilinguals who differed in both the context of
acquisition and in their proficiency in L2. Only early bilinguals who were highly proficient in the
L2 produced significant semantic priming, whereas late bilinguals failed to show these effects
regardless of their L2 proficiency. The consequences of both factors, context of acquisition and
degree of L2 proficiency, will be important foci in future research on this topic.

Developing Lexical and Conceptual Representations in the Second Language

In the final portion of our review, we consider how the connections between words and their
meanings develop with increasing proficiency in the L2. Results from several studies in the
cognitive literature led to the conclusion that words were likely stored separately from concepts
in memory (e.g., Anderson & Bower, 1973; Potter, 1979; Snodgrass, 1980). One particularly important
finding was that it takes around 200–300 ms longer to name a picture than to read a word aloud
(e.g., Cattell, 1886; Fraisse, 1960; Potter & Faulconer, 1975). In a classic study, Potter et al.
(1984) used this empirical observation as a means to evaluate two models of bilingual memory
representation: the Word Association and Concept Mediation Models.

According to the Word Association Model (see Fig. 26.5[a]), an L1 word is directly associated to its
L2 equivalent. To gain access to concepts, L2 words must first activate their L1 equivalents. By
comparison, the Concept Mediation Model (see Fig. 26.5[b]) hypothesizes that words in each language
are directly associated to concepts, but that translation equivalents are not directly connected to
each other. The concepts in both models are proposed to be amodal, and it is further assumed that
pictures have direct access to the same concepts.

To test these models, Potter et al. (1984) compared the time it took bilinguals to translate words
from L1 to L2 and to name pictures in L2. The logic Potter et al. used assumed that picture naming
always requires conceptual processing. If translation from L1 to L2 resembles picture naming, then
it can be concluded that translation is also conceptually mediated. Of interest is that the two
models make different predictions about the relation between picture naming in L2 and translation
from L1 to L2. The Word Association Model predicts that L2 picture naming should take more time
than translation because two additional steps are necessary (concept retrieval and L1 word
retrieval). Potter et al. reasoned that these two extra steps are also responsible for the
difference in the amount of time it takes to name pictures and words in L1. Therefore, they
estimated the magnitude of the difference between picture naming and translation time predicted by
the Word Association Model to be in the range of 200–300 ms. In contrast, the Concept Mediation
Model predicts that the two tasks should take approximately the same amount of time because they
involve similar component processes.

Figure 26.5 The (a) Word Association and (b) Concept Mediation Models. Adapted from Potter et al.
(1984). L1, first language; L2, second language.

The results of a first experiment with highly proficient Chinese-English bilinguals showed that L2
picture naming took about the same amount of time to perform as L1-to-L2 translation and therefore
favored the Concept Mediation Model. The
surprising result was that, in a second experiment, a group of less-proficient English-French
bilinguals produced the same pattern, suggesting that they also conceptually mediated the L2.
Potter et al. (1984) concluded that the Concept Mediation Model more accurately characterized the
memory representations of both less- and more proficient bilinguals than the Word Association
Model.

The results of the Potter et al. (1984) study are counterintuitive because we might have expected
that the less-proficient bilinguals would be more likely to rely on translation equivalents than
the more proficient bilinguals. However, two aspects of the design may have inadvertently affected
the conclusions. First, the items used in the experiment with the less-proficient English-French
participants were intentionally selected to be well known by novices in the L2, and items that were
not known by half of the participants were removed from the analyses. As we later discuss, this
selection criterion may have biased the results in favor of the concept mediation pattern.

A second critical aspect of the Potter et al. (1984) study concerns the selection of the
less-proficient bilinguals. In this study, they were a group of highly motivated students about to
go to France on a study abroad program. Although the data showed clearly that this group was far
less proficient than the Chinese-English bilinguals to whom they were compared (e.g., they were
slower and more error prone), it is possible that they were beyond an initial stage of lexical
acquisition that is characterized by reliance on word-to-word associations across the two
languages.

To determine whether the Word Association Model characterizes L2 learners at the earliest stages of
acquisition, Kroll and Curley (1988) and Chen and Leung (1989) used a methodology similar to the
one used by Potter et al. (1984), but included participants who were of lower proficiency in L2
than Potter et al.'s less-proficient group. These studies showed that, for learners at early stages
of acquisition, translation from L1 to L2 was indeed performed more quickly than L2 picture naming,
confirming the prediction of the Word Association Model. Both studies also replicated the results
of the Potter et al. study for more proficient bilinguals. Therefore, these data suggest that there
is a transition from a stage of acquisition in which there is reliance on translation equivalents
between L1 and L2 to a stage in which direct concept mediation is possible.

Figure 26.6 The Revised Hierarchical Model. Adapted from Kroll and Stewart (1994). L1, first
language; L2, second language.

To account for this developmental sequence, Kroll and Stewart (1994) proposed the Revised
Hierarchical Model. The model (see Fig. 26.6) integrates the connections depicted in the Word
Association and Concept Mediation Models. Unlike the earlier models, the Revised Hierarchical Model
makes two critical assumptions about the strength of connections between words and concepts in
bilingual memory. The first is that L1 words are assumed to be more strongly connected to concepts
than are L2 words. The second is that L2 words are assumed to be more strongly connected to their
corresponding translation equivalents in L1 than the reverse. The resulting asymmetries are thought
to reflect the consequences of L2 acquisition in late learners who possess a fully developed
lexicon for words in L1 and their associated concepts. Like other claims about transfer from the L2
to L1 (e.g., MacWhinney, 1997), the Revised Hierarchical Model proposes that, during early stages
of L2 acquisition, the learner exploits the existing word-to-concept connections in L1 to access
meaning for new words in L2. Thus, a strong lexical connection from L2 to L1 will be established
during learning. Over time, there may be feedback that establishes L1-to-L2 connections at this
level, but they will be weaker than those for L2 to L1 because the learner does not need to use L2
in the same way. As learners become more proficient in L2, they will begin to develop the ability
to conceptually process L2 words directly, but the connections between words and concepts are
assumed to remain stronger for L1 than for L2 for all but the most balanced bilinguals.

One consequence of the asymmetries represented within the Revised Hierarchical Model is a predicted
asymmetry in translation performance, such that translation from L1 to L2, the forward direction of
translation, will be conceptually mediated, whereas translation from L2 to L1, the backward
direction of translation, can proceed directly via the lexical connections from L2 words to their
translation equivalents. Therefore, forward translation will take longer to perform than backward
translation and will be more likely to engage semantics. As L2 proficiency increases, the
connection from L2 words to concepts will strengthen, resulting in a decrease in the magnitude of
the translation asymmetry and a corresponding increase in the degree to which backward translation
is also conceptually mediated.

To test the hypothesis that only forward translation involves conceptual mediation, Kroll and
Stewart (1994) had relatively proficient Dutch-English bilinguals translate words from L1 to L2 and
L2 to L1. They manipulated the semantic context of the translation lists. One list was semantically
mixed, whereas the other list was semantically categorized (e.g., all fruits, all animals, etc.).
Only translation from L1 to L2 was affected by the semantic context in which translation was
performed. Translation from L1 to L2 was slower for words presented in semantically categorized
lists than for the same words presented in semantically mixed lists; translation from L2 to L1 was
unaffected by this manipulation. These findings provided initial support for the claim that only
L1-to-L2 translation necessarily involves concept mediation.

A number of studies have examined the developmental predictions of the Revised Hierarchical Model.
Talamas, Kroll, and Dufour (1999) had more and less-proficient bilinguals perform a translation
recognition task (De Groot, 1992b) in which a pair of words was presented and participants
indicated whether the two words were translations of each other. The critical items were
nontranslation foils that were related to one another by virtue of being form related (e.g.,
man–hambre [hunger] instead of man–hombre [man]) or meaning related (e.g., man–mujer [woman]
instead of man–hombre [man]) to the correct translation. The results showed that the
less-proficient bilinguals suffered more form than meaning interference, whereas the reverse was
true for the more proficient bilinguals. The results are thus consistent with a developmental shift
from form to meaning with increasing proficiency in the L2.

In a study similar to that of Talamas et al. (1999), Sunderman (2002) also used a translation
recognition task to investigate the development of L2 in a group of native English speakers
learning Spanish as adults. In that study, three types of "no" trials were compared: form related
(man–mano [hand]), meaning related (man–mujer [woman]), and form related to the translation
(man–hambre [hunger]). The results showed that all participants, regardless of proficiency, were
slower to reject word pairs that were form or meaning related relative to unrelated controls.
However, only the less-proficient participants were slower to respond to the foils that were form
related to the correct translation (e.g., man–hambre). Although the presence of semantic effects
for all groups failed to replicate the results of Talamas et al. (see also Altarriba & Mathis,
1997), the differential effect of the form-related translation foil suggests, as the Revised
Hierarchical Model predicts, that access to the translation equivalent may play a particularly
important role early in L2 learning (see MacWhinney, chapter 3, this volume, for related arguments
about the scope of transfer during L2 acquisition).
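
The translation asymmetry that the Revised Hierarchical Model predicts can be illustrated with a toy calculation. The model itself is stated qualitatively, so everything numeric here is an assumption made for illustration only: the link strengths, the rule that each link contributes a cost equal to the reciprocal of its strength, and the names `LinkStrengths`, `forward_cost`, and `backward_cost` are all hypothetical. The sketch keeps the model's two routing claims (forward translation passes through the concept, backward translation uses the lexical link from L2 to L1) and shows the asymmetry shrinking as the L2-to-concept link strengthens.

```python
from dataclasses import dataclass


@dataclass
class LinkStrengths:
    """Hypothetical connection strengths (arbitrary units, larger = stronger)."""
    l1_concept: float  # L1 word <-> concept
    l2_concept: float  # L2 word <-> concept
    l2_to_l1: float    # lexical link from L2 word to its L1 translation


def forward_cost(links):
    """L1 -> L2 translation, assumed to go L1 word -> concept -> L2 word."""
    return 1.0 / links.l1_concept + 1.0 / links.l2_concept


def backward_cost(links):
    """L2 -> L1 translation, assumed to use the direct lexical link."""
    return 1.0 / links.l2_to_l1


def translation_asymmetry(links):
    """Forward minus backward cost; positive means forward is slower."""
    return forward_cost(links) - backward_cost(links)


# Illustrative values only: the learner has a weak L2-to-concept link,
# and the proficient bilingual a much stronger one.
learner = LinkStrengths(l1_concept=1.0, l2_concept=0.25, l2_to_l1=0.5)
proficient = LinkStrengths(l1_concept=1.0, l2_concept=0.8, l2_to_l1=0.5)

for label, links in (("learner", learner), ("proficient", proficient)):
    print(label, round(translation_asymmetry(links), 2))
```

The sketch deliberately omits the part of the prediction in which backward translation itself becomes conceptually mediated at higher proficiency; adding that would require a further assumption about how the two routes combine.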
Additional support for the developmental predictions of the Revised Hierarchical Model comes from a
study by Kroll, Michael, Tokowicz, and Dufour (2002). In that study, learners in a summer intensive
language program and proficient bilinguals performed the same bilingual tasks. The results showed
that L1-to-L2 translation generally took longer to perform than L2-to-L1 translation, but that the
asymmetry was smaller for the more proficient group than for the learners, supporting the
prediction of the Revised Hierarchical Model that the two directions of translation become more
similar with increased L2 proficiency.

There have also been a number of studies in which results contrary to the predictions of the
Revised Hierarchical Model have been found. In one such study, De Groot and Poot (1997) tested
learners at three proficiency levels (low, average, and high) on a translation production task. The
concreteness or imageability of the translated words was manipulated such that some items were
concrete (i.e., represented entities that were perceptible; e.g., table), whereas other items were
abstract (i.e., represented entities that were imperceptible; e.g., beauty). Because the
hypothesized difference between these two word types is in meaning, any difference in translation
time was taken to indicate conceptually mediated translation. De Groot and Poot found that there
was a concrete–abstract difference for bilinguals in all three proficiency groups and therefore
concluded that translation is always conceptually mediated and bilinguals do not need to rely on L1
for access to meaning. Furthermore, the results showed that translation in both directions was
influenced by concreteness to a similar extent, and they therefore concluded that both directions
of translation are conceptually mediated, inconsistent with the predictions of the Revised
Hierarchical Model.

De Groot and Poot's (1997) results are important because the performance of learners of different
proficiency levels was directly compared. However, because the results of this study are counter to
much of the research on language production, we must interpret them carefully. In particular, more
recent research has revealed that concrete words are likely to have fewer translations across
languages than abstract words (Schönpflug, 1997; Tokowicz & Kroll, 2003; Tokowicz et al., 2002).
Furthermore, Tokowicz and Kroll reported that the existence of an alternate translation slows
translation speed considerably. Therefore, the effects of concreteness on translation may come from
multiple sources and must be interpreted carefully.

Altarriba and Mathis (1997) reported another study that found results counter to the predictions of
the Revised Hierarchical Model. In this study, naive learners were trained on four color words in
Spanish. After scoring 100% accuracy on several quizzes testing their knowledge of the Spanish
color words, they were tested on a Stroop-type interference task using the color words they had
just learned. The findings showed that the learners indeed showed interference in L2. Because they
were at the very earliest stages of learning a new language (they had learned only a few words),
these results were contrary to the predictions of the Revised Hierarchical Model. However, it is
not clear that these results are representative of what would be found with L2 learners in a more
typical learning situation (i.e., with more word pairs studied over a longer period of time). What
is interesting about these results is that they demonstrate the capabilities of the
language-learning situation under unique circumstances: When a small number of items are learned
with extensive training, the results mimic those of proficient bilinguals. This finding provides
evidence that individual items can become conceptually mediated; the bounds of this learning have
yet to be demonstrated.

Perhaps the most compelling evidence supporting the prediction of the Revised Hierarchical Model
that L1-to-L2 translation is conceptually mediated but L2-to-L1 translation is not comes from a
study that examined transfer from picture naming to translation (Sholl, Sankaranarayanan, & Kroll,
1995). If only translation from L1 to L2 is conceptually mediated, then only L1-to-L2 translation
should benefit from prior study during which concepts are named as pictures, a task also believed
to be conceptually mediated (e.g., Potter & Faulconer, 1975). This is precisely the result reported
by Sholl et al. Translation from L1 to L2 was facilitated when concepts had been named previously
as pictures in L2 or L1. In contrast, translation from L2 to L1 was unaffected by prior picture
naming.

The conclusions of Sholl et al. (1995) were subsequently challenged by a study reported by La Heij
et al. (1996) in which Dutch-English bilinguals were asked to translate words in each direction and
to name words in each language. The critical conditions of the La Heij et al. study consisted of
picture primes that were related to the target word to be translated or named. Like the results of
Kroll and Stewart (1994), there was little effect of the semantic context on word naming. However,
unlike the results of Kroll and Stewart and Sholl et al., there were significant semantic effects
of picture
primes in both directions of translation, suggesting that both directions of translation are
conceptually mediated. Because the Dutch-English bilinguals in the La Heij et al. study were very
similar to the Dutch-English participants in the Kroll and Stewart study, it seems unlikely that
the nature of the participants' bilingualism was responsible for the different pattern of results
across these studies.

In a study similar to the one by Sholl et al. (1995), Francis, Tokowicz, and Kroll (2003) showed
that translation from L1 to L2 was facilitated by previous translation only in the same direction,
whereas translation from L2 to L1 was facilitated by previous translation in either direction,
further suggesting that the two directions of translation may engage different component processes.
The results of the Francis et al. study also suggest that, as bilinguals become more proficient,
the two directions of translation become more similar because the asymmetrical priming disappears,
and both directions of translation are primed by previous translation in either direction. Finally,
the results suggest that, within an individual, there may be some words that are conceptually
mediated and others that are not. The easier items in the Francis et al. study, as defined by
relatively higher word frequency, showed symmetrical priming regardless of the bilingual group's
proficiency.

This result may help to explain some of the apparently conflicting results in the studies reviewed
here. In the La Heij et al. (1996) experiments, items were chosen intentionally to be high
frequency, and therefore likely to be known by all participants, and were repeated throughout the
experiment. In the Kroll and Stewart (1994) study, the items were generally much lower in frequency
and presented only once to a given participant. The pattern of results reported by Francis et al.
(2003) suggests that both the proficiency of the bilingual participants and the nature of the items
will determine the likelihood of observing asymmetries in performance. These findings highlight the
developmental nature of becoming bilingual: Transitions from less to more proficient are not
limited to the individual bilingual but also are relevant for individual words.

Summary

In this chapter, we provided a review of the state of bilingual models of representation and
processing. Although our review was limited to the lexicon, a topic on which there has been a
disproportionate amount of research in recent years, many of the issues covered will also apply to
other domains of language processing. Early models of the bilingual lexicon were general and
largely failed to provide an adequate characterization of how information in each language might be
represented. Later models responded to that criticism by providing a more specific account but
within a relatively narrow focus. Concerns about control mechanisms, about the manner in which
processing changes in the face of task demands, and about the consequences of the ways in which
proficiency develops in the L2 will all be crucial to the next stages of model development.

With the exception of the BIA model (see also Grosjean, 1997; Thomas, 1997), few models have been
implemented computationally. Likewise, it is only recently that a range of evidence has been
available to test behavioral predictions of bilingual models and then to assess their
neurocognitive underpinnings. We anticipate that the next period of research and model construction
will be informed by all of these perspectives. Although in the future it may become more difficult
to answer the question of whether the bilingual's two languages are maintained in separate or
shared memory systems, we are confident that research on bilingual representation and processing
will provide important insights not only into the nature of bilingualism, but also more
fundamentally into the relation between language and cognition.

Acknowledgments

The writing of this chapter was supported in part by National Science Foundation Grant BCS-0111734
and National Institute of Mental Health grant RO1MH62479 to Judith F. Kroll and by a National
Research Service Award NIMH HD-42948-01 to Natasha Tokowicz. We thank Erica Michael for helpful
comments on an earlier version of the chapter and Nora Kroll-Rosenbaum for assistance in graphic
design.

References

Altarriba, J. (1990). Constraints on interlingual facilitation effects in priming in
Spanish-English bilinguals. Unpublished doctoral dissertation, Vanderbilt University, Nashville, TN.
Altarriba, J., Kroll, J. F., Sholl, A., & Rayner, K. (1996). The influence of lexical and
conceptual constraints on reading mixed-language sentences: Evidence from eye-fixation and naming
times. Memory & Cognition, 24, 477–492.
Altarriba, J., & Mathis, K. M. (1997). Conceptual and lexical development in second language acquisition. Journal of Memory and Language, 36, 550–568.
Anderson, J. R., & Bower, G. H. (1973). Human associative memory. New York: Wiley.
Bloem, I., & La Heij, W. (2003). Semantic facilitation and semantic interference in word translation: Implications for models of lexical access in language production. Journal of Memory and Language, 48, 468–488.
Brysbaert, M., Van Dyck, G., & Van de Poel, M. (1999). Visual word recognition in bilinguals: Evidence from masked phonological priming. Journal of Experimental Psychology: Human Perception and Performance, 25, 137–148.
Cattell, J. M. (1886). The time it takes to see and name objects. Mind, 11, 63–65.
Chee, M. W. L., Tan, E. W. L., & Thiel, T. (1999). Mandarin and English single word processing studied with functional magnetic resonance imaging. Journal of Neuroscience, 19, 3050–3056.
Chen, H.-C., & Leung, Y.-S. (1989). Patterns of lexical processing in a nonnative language. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 316–325.
Chen, H.-C., & Ng, M.-L. (1989). Semantic facilitation and translation priming effects in Chinese-English bilinguals. Memory & Cognition, 17, 454–462.
Colomé, À. (2001). Lexical activation in bilinguals' speech production: language-specific or language independent? Journal of Memory and Language, 45, 721–736.
Costa, A., & Caramazza, A. (1999). Is lexical selection language specific? Further evidence from Spanish-English bilinguals. Bilingualism: Language and Cognition, 2, 231–244.
Costa, A., Caramazza, A., & Sebastián-Gallés, N. (2000). The cognate facilitation effect: Implications for models of lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1283–1296.
Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the bilingual's two lexicons compete for selection? Journal of Memory and Language, 41, 365–397.
De Bot, K., & Schreuder, R. (1993). Word production and the bilingual lexicon. In R. Schreuder & B. Weltens (Eds.), The bilingual lexicon (pp. 191–214). Amsterdam: Benjamins.
De Bruijn, E. R. A., Dijkstra, A., Chwilla, D. J., & Schriefers, H. J. (2001). Language context effects on interlingual homograph recognition: Evidence from event-related potentials and response times in semantic priming. Bilingualism: Language and Cognition, 4, 155–168.
De Groot, A. M. B. (1992a). Bilingual lexical representation: A closer look at conceptual representations. In R. Frost & L. Katz (Eds.), Orthography, phonology, morphology, and meaning (pp. 389–412). Amsterdam: Elsevier Science.
De Groot, A. M. B. (1992b). Determinants of word translation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1001–1018.
De Groot, A. M. B. (1995). Determinants of bilingual lexicosemantic organization. Computer Assisted Language Learning, 8, 151–180.
De Groot, A. M. B. (2002). Lexical representation and lexical processing in the L2 user. In V. Cook (Ed.), Portraits of the L2 user (pp. 32–63). Clevedon, U.K.: Multilingual Matters.
De Groot, A. M. B., Dannenburg, L., & Van Hell, J. G. (1994). Forward and backward word translation by bilinguals. Journal of Memory and Language, 33, 600–629.
De Groot, A. M. B., & Poot, R. (1997). Word translation at three levels of proficiency in a second language: The ubiquitous involvement of conceptual memory. Language Learning, 47, 215–264.
Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93, 283–321.
Dijkstra, A., Grainger, J., & Van Heuven, W. J. B. (1999). Recognizing cognates and interlingual homographs: The neglected role of phonology. Journal of Memory and Language, 41, 496–518.
Dijkstra, A., & Van Heuven, W. J. B. (1998). The BIA model and bilingual word recognition. In J. Grainger & A. M. Jacobs (Eds.), Localist connectionist approaches to human cognition (pp. 189–225). Mahwah, NJ: Erlbaum.
Dijkstra, A., & Van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175–197.
Dufour, R., & Kroll, J. F. (1995). Matching words to concepts in two languages: A test of the concept mediation model of bilingual representation. Memory & Cognition, 23, 166–180.
Durgunoğlu, A. Y., & Roediger, H. L. (1987). Test differences in accessing bilingual memory. Journal of Memory and Language, 26, 377–391.
Elston-Güttler, K. E. (2000). An inquiry into cross-language differences in lexical-conceptual relationships and their effect on L2 lexical processing. Unpublished doctoral dissertation, University of Cambridge, Cambridge, U.K.
Finkbeiner, M., Forster, K., Nicol, J., & Nakamura, K. (2004). The role of polysemy in masked semantic and translation priming. Journal of Memory and Language, 51, 1–22.
Fraisse, P. (1960). Recognition time measured by verbal reaction to figures and words. Perceptual and Motor Skills, 11, 204.
Francis, W. S. (1999). Cognitive integration of language and memory in bilinguals: Semantic representation. Psychological Bulletin, 125, 193–222.
Francis, W. S., Tokowicz, N., & Kroll, J. F. (2003, April). Translation priming as a function of bilingual proficiency and item difficulty. Poster presented at the Fourth International Symposium on Bilingualism, Tempe, AZ.
Gerhand, S., & Berry, C. (1998). Word frequency effects in oral reading are not merely age of acquisition effects in disguise. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 267–283.
Gollan, T. H., Forster, K. I., & Frost, R. (1997). Translation priming with different scripts: Masked priming with cognates and noncognates in Hebrew-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1122–1139.
Gollan, T., & Kroll, J. F. (2001). Bilingual lexical access. In B. Rapp (Ed.), The handbook of cognitive neuropsychology: What deficits reveal about the human mind (pp. 321–345). Philadelphia: Psychology Press.
Gollan, T. H., & Silverberg, N. (2001). Tip-of-the-tongue states in Hebrew-English bilinguals. Bilingualism: Language and Cognition, 4, 63–84.
Grainger, J., & Dijkstra, A. (1992). On the representation and use of language information in bilinguals. In R. Harris (Ed.), Cognitive processing in bilinguals (pp. 207–220). Amsterdam: Elsevier.
Grainger, J., & Jacobs, A. M. (1996). Orthographic processing in visual word recognition: A multiple read-out model. Psychological Review, 103, 518–565.
Green, D. W. (1986). Control, activation, and resource: A framework and a model for the control of speech in bilinguals. Brain and Language, 27, 210–223.
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1, 67–81.
Grosjean, F. (1997). Processing mixed language: Issues, findings, and models. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 225–254). Mahwah, NJ: Erlbaum.
Grosjean, F., Li, P., Münte, T., & Rodriquez-Fornells, A. (2003). Imaging bilinguals: When the neurosciences meet the language sciences. Bilingualism: Language and Cognition, 6, 159–165.
Hermans, D. (2000). Word production in a foreign language. Unpublished doctoral dissertation, University of Nijmegen, Nijmegen, The Netherlands.
Hermans, D., Bongaerts, T., De Bot, K., & Schreuder, R. (1998). Producing words in a foreign language: Can speakers prevent interference from their first language? Bilingualism: Language and Cognition, 1, 213–229.
Illes, J., Francis, W. S., Desmond, J. E., Gabrieli, J. D. E., Glover, G. H., Poldrack, R., et al. (1999). Convergent cortical representation of semantic processing in bilinguals. Brain and Language, 70, 347–363.
Izura, C., & Ellis, A. W. (2002). Age of acquisition effects in word recognition and production in first and second languages. Psicológica, 23, 245–281.
Jared, D., & Kroll, J. F. (2001). Do bilinguals activate phonological representations in one or both of their languages when naming words? Journal of Memory and Language, 44, 2–31.
Jared, D., & Szucs, C. (2002). Phonological activation in bilinguals: Evidence from interlingual homograph recognition. Bilingualism: Language and Cognition, 5, 225–239.
Jiang, N. (2000). Lexical representation and development in a second language. Applied Linguistics, 21, 47–77.
Keatley, C., Spinks, J., & De Gelder, B. (1994). Asymmetrical semantic facilitation between languages. Memory & Cognition, 22, 70–84.
Kim, K. H. S., Relkin, N. R., Lee, K. M., & Hirsch, J. (1997). Distinct cortical areas associated with native and second languages. Nature, 388, 171–174.
Kirsner, K., Smith, M. C., Lockhart, R. L. S., King, M. L., & Jain, M. (1984). The bilingual lexicon: Language-specific units in an integrated network. Journal of Verbal Learning and Verbal Behavior, 23, 519–539.
Klein, D., Milner, B., Zatorre, R., Meyer, E., & Evans, A. (1995). The neural substrates underlying word generation: A bilingual functional-imaging study. Proceedings of the National Academy of Sciences U.S.A., 92, 2899–2903.
Kolers, P. A. (1963). Interlingual word associations. Journal of Verbal Learning and Verbal Behavior, 2, 291–300.
Kroll, J. F., & Curley, J. (1988). Lexical memory in novice bilinguals: The role of concepts in retrieving second language words. In M. Gruneberg, P. Morris, & R. Sykes (Eds.), Practical aspects of memory (Vol. 2, pp. 389–395). London: Wiley.
Kroll, J. F., & De Groot, A. M. B. (1997). Lexical and conceptual memory in the bilingual: Mapping form to meaning in two languages. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 169–199). Mahwah, NJ: Erlbaum.
Kroll, J. F., & Dijkstra, A. (2002). The bilingual lexicon. In R. Kaplan (Ed.), Handbook of applied linguistics (pp. 301–321). Oxford, U.K.: Oxford University Press.
Kroll, J. F., Dijkstra, A., Janssen, N., & Schriefers, H. (in preparation). Selecting the language in which to speak: Cued-picture naming experiments on lexical access in bilingual production. Unpublished manuscript, The Pennsylvania State University, University Park.
Kroll, J. F., & Dussias, P. (2004). The comprehension of words and sentences in two languages. In T. Bhatia & W. Ritchie (Eds.), Handbook of bilingualism (pp. 169–200). Cambridge, MA: Blackwell.
Kroll, J. F., Michael, E., Tokowicz, N., & Dufour, R. (2002). The development of lexical fluency in a second language. Second Language Research, 18, 137–171.
Kroll, J. F., & Sholl, A. (1992). Lexical and conceptual memory in fluent and nonfluent bilinguals. In R. Harris (Ed.), Cognitive processing in bilinguals (pp. 191–204). Amsterdam: Elsevier.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Kroll, J. F., & Sunderman, G. (2003). Cognitive processes in second language acquisition: The development of lexical and conceptual representations. In C. Doughty & M. Long (Eds.), Handbook of second language acquisition (pp. 104–129). Cambridge, MA: Blackwell.
Kroll, J. F., & Tokowicz, N. (2001). The development of conceptual representation for words in a second language. In J. L. Nicol (Ed.), One mind, two languages: Bilingual language processing (pp. 49–71). Cambridge, MA: Blackwell.
La Heij, W., De Bruyn, E., Elens, E., Hartsuiker, R., Helaha, D., & Van Schelven, L. (1990). Orthographic facilitation and categorical interference in a word-translation variant of the Stroop task. Canadian Journal of Psychology, 44, 76–83.
La Heij, W., Kerling, R., & Van der Velden, E. (1996). Nonverbal context effects in forward and backward translation: Evidence for concept mediation. Journal of Memory and Language, 35, 648–665.
Lambert, W. E. (1961). Behavioral evidence for contrasting forms of bilingualism. In M. Zarechnak (Ed.), Report of the 12th Annual Round Table Meeting on Linguistics and Language Studies. Washington, DC: Georgetown University Press.
Lambert, W. E. (1969). Psychological studies of the interdependencies of the bilingual's two languages. In J. Puhvel (Ed.), Substance and structure of language (pp. 99–126). Berkeley: University of California Press.
Lambert, W. E. (1972). Language, psychology, and culture. Stanford, CA: Stanford University Press.
Lambert, W. E., Havelka, J., & Crosby, C. (1958). The influence of language acquisition contexts on bilingualism. Journal of Abnormal and Social Psychology, 56, 239–244.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–75.
Macnamara, J., & Kushnir, S. L. (1971). Linguistic independence of bilinguals: The input switch. Journal of Verbal Learning and Verbal Behavior, 10, 480–487.
MacWhinney, B. (1997). Second language acquisition and the competition model. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 113–142). Mahwah, NJ: Erlbaum.
Mägiste, E. (1984). Stroop tasks and dichotic translation: The development of interference patterns in bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 304–315.
Marian, V., & Neisser, U. (2000). Language-dependent recall of autobiographical memories. Journal of Experimental Psychology: General, 129, 361–368.
Marian, V., & Spivey, M. (1999). Activation of Russian and English cohorts during bilingual spoken word recognition. In M. Hahn & S. C. Stoness (Eds.), Proceedings of the 21st Annual Conference of the Cognitive Science Society (pp. 349–354). Mahwah, NJ: Erlbaum.
Marian, V., Spivey, M., & Hirsch, J. (2003). Shared and separate systems in bilingual language processing: Converging evidence from eyetracking and brain imaging. Brain and Language, 86, 70–82.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception, Part 1: An account of basic findings. Psychological Review, 88, 375–405.
McCormack, P. D. (1977). Bilingual linguistic memory: The independence-interdependence issue revisited. In P. A. Hornby (Ed.), Bilingualism: Psychological, social, educational implications (pp. 57–66). New York: Academic Press.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical costs of language selection. Journal of Memory and Language, 40, 25–40.
Meyer, D. E., & Ruddy, M. G. (1974). Bilingual word recognition: Organization and retrieval of alternative lexical codes. Paper presented at the annual meeting of the Eastern Psychological Association, Philadelphia.
Miller, N. A., & Kroll, J. F. (2002). Stroop effects in bilingual translation. Memory & Cognition, 30, 614–628.
Morrison, C. M., & Ellis, A. W. (1995). Roles of word frequency and age of acquisition in word naming and lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 116–153.
Morrison, C. M., Ellis, A. W., & Chappell, T. D. (1997). Age of acquisition norms for a large set of object names and their relation to adult estimates and other variables. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 50A, 528–559.
Paivio, A. (1991). Mental representation in bilinguals. In A. G. Reynolds (Ed.), Bilingualism, multiculturalism, and second language learning: The McGill conference in honour of Wallace E. Lambert (pp. 113–126). Hillsdale, NJ: Erlbaum.
Pavlenko, A. (1999). New approaches to concepts in bilingual memory. Bilingualism: Language and Cognition, 2, 209–230.
Perani, D., Abutalebi, J., Paulesu, E., Brambati, S., Scifo, P., Cappa, S. F., et al. (2003). The role of age of acquisition and language usage in early, high-proficient bilinguals: An fMRI study during verbal fluency. Human Brain Mapping, 19, 170–182.
Peterson, R. R., & Savoy, P. (1998). Lexical selection and phonological encoding during language production: Evidence for cascaded processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 539–557.
Potter, M. C. (1979). Mundane symbolism: The relations among objects, names, and ideas. In N. R. Smith & M. B. Franklin (Eds.), Symbolic functioning in childhood (pp. 41–65). Hillsdale, NJ: Erlbaum.
Potter, M. C., & Faulconer, B. A. (1975). Time to understand pictures and words. Nature (London), 253, 437–438.
Potter, M. C., So, K.-F., Von Eckardt, B., & Feldman, L. B. (1984). Lexical and conceptual representation in beginning and more proficient bilinguals. Journal of Verbal Learning and Verbal Behavior, 23, 23–38.
Poulisse, N., & Bongaerts, T. (1994). First language use in second language production. Applied Linguistics, 15, 36–57.
Rodriquez-Fornells, A., Rotte, M., Heinze, H.-J., Nösselt, T., & Münte, T. (2002). Brain potential and functional MRI evidence for how to handle two languages with one brain. Nature, 415, 1026–1029.
Schönpflug, U. (1997, April). Bilingualism and memory. Paper presented at the First International Symposium on Bilingualism, Newcastle-upon-Tyne, U.K.
Schrauf, R. W., & Rubin, D. C. (1998). Bilingual autobiographical memory in older adult immigrants: A test of cognitive explanations of the reminiscence bump and the linguistic encoding of memories. Journal of Memory and Language, 39, 437–457.
Schwanenflugel, P. J., & Rey, M. (1986). Interlingual semantic facilitation: Evidence for a common representational system in the bilingual. Journal of Memory and Language, 25, 605–618.
Schwanenflugel, P. J., & Shoben, E. J. (1983). Differential context effects in the comprehension of abstract and concrete verbal materials. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 82–102.
Schwartz, A. (2003). Word and sentence-based processes in second language reading. Unpublished doctoral dissertation, The Pennsylvania State University, University Park.
Schwartz, A., Kroll, J. F., & Diaz, M. (2003). Reading words in Spanish and English: Mapping orthography to phonology in two languages. Unpublished manuscript, The Pennsylvania State University, University Park.
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523–568.
Shaffer, D. (1976). Is bilingualism compound or coordinate? Lingua, 40, 69–77.
Sholl, A., Sankaranarayanan, A., & Kroll, J. F. (1995). Transfer between picture naming and translation: A test of asymmetries in bilingual memory. Psychological Science, 6, 45–49.
Silverberg, S., & Samuel, A. G. (2004). The effect of age of second language acquisition on the representation and processing of second language words. Journal of Memory and Language, 51, 381–398.
Snodgrass, J. G. (1980). Towards a model for picture-word processing. In P. A. Kolers, M. E. Wrolstad, & H. Bouma (Eds.), Processing of visible language (Vol. 2, pp. 565–584). New York: Plenum.
Sunderman, G. (2002). Lexical development in a second language: Can the first language be suppressed? Unpublished doctoral dissertation, The Pennsylvania State University, University Park.
Talamas, A., Kroll, J. F., & Dufour, R. (1999). Form related errors in second language learning: A preliminary stage in the acquisition of L2 vocabulary. Bilingualism: Language and Cognition, 2, 45–58.
Thomas, M. S. C. (1997). Connectionist networks and knowledge representation: The case of bilingual lexical processing. Unpublished doctoral dissertation, Oxford University, Oxford, U.K.
Tokowicz, N. (2000). Meaning representation within and across languages. Unpublished doctoral dissertation, The Pennsylvania State University, University Park.
Tokowicz, N., & Kroll, J. F. (2003). Accessing meaning for words in two languages: The effects of lexical and semantic ambiguity in bilingual production. Unpublished manuscript, The Pennsylvania State University, University Park.
Tokowicz, N., Kroll, J. F., De Groot, A. M. B., & Van Hell, J. G. (2002). Number of translation norms for Dutch-English translation pairs: A new tool for examining language production. Behavior Research Methods, Instruments, & Computers, 34, 435–451.
Tzelgov, J., & Eben-Ezra, S. (1992). Components of the between-language semantic priming effect. European Journal of Cognitive Psychology, 4, 253–272.
Van Hell, J. G. (1998). Cross-language processing and bilingual memory organization. Unpublished doctoral dissertation, University of Amsterdam, Amsterdam.
Van Hell, J. G., & De Groot, A. M. B. (1998). Conceptual representation in bilingual memory: Effects of concreteness and cognate status in word association. Bilingualism: Language and Cognition, 1, 193–211.
Van Hell, J., & Dijkstra, T. (2002). Foreign language knowledge can influence native language performance: Evidence from trilinguals. Psychonomic Bulletin & Review, 9, 780–789.
Van Heuven, W. J. B. (2000). Visual word recognition in monolingual and bilingual readers: Experiments and computational modeling. Doctoral thesis, University of Nijmegen, Nijmegen, The Netherlands. NICI Technical Report 20-01.
Van Heuven, W. J. B., Dijkstra, A., & Grainger, J. (1998). Orthographic neighborhood effects in bilingual word recognition. Journal of Memory and Language, 39, 458–483.
Von Studnitz, R., & Green, D. W. (2002). Interlingual homograph interference in German-English bilinguals: Its modulation and locus of control. Bilingualism: Language and Cognition, 5, 1–23.
Vigliocco, G., & Hartsuiker, R. J. (2002). The interplay of meaning, sound, and syntax in sentence production. Psychological Bulletin, 128, 442–472.
Weinreich, U. (1953). Languages in contact. New York: The Linguistics Circle of New York.
Author Index
555
556 Author Index
Baker, A., 31, 47 405, 418, 419, 421, 422, 424, 425, 429,
Bakker, P., 31, 44, 328, 345 493, 493
Baldwin, G., 32, 46 Biardeau, A., 183, 199, 209, 223
Ball, J., 91, 92, 104n.3, 104 Bickerton, D., 328, 345
Bamford, J. M., 502, 513 Biederman, I., 359, 370
Banaji, M., 439, 449 Bierwisch, M., 292, 305
Bargh, J. A., 372, 384, 385 Bijeljac-Babic, R., 183, 190, 199, 209, 223
Barik, H. C., 45657, 464, 466, 475 Binder, P., 365, 36869
Barrena, A., 3839, 44 Birdsong, D., 90, 91, 95, 98, 99, 103, 104n.2,
Barresi, J., 257, 267 104, 10911, 114, 11724, 125, 276, 278,
Barroso, F., 446, 450 437, 448, 501, 513
Barsalou, L. W., 293, 305 Bjork, R. A., 256, 264
Barshi, I., 377, 386 Bjorklund, D. F., 425, 430
Barton, G. E., 142, 149 Black, M., 519, 527
Baruch, O., 380, 388 Blackwell, A., 54, 64, 219, 223
Basden, B. H., 256, 257, 262 Bladon, R. A. W., 502, 513
Basden, D., 256, 262 Blanc, M. H. A., 393, 406
Basi, R. K., 257, 262 Blazquez-Domingo, R., 342, 345
Basso, G., 459, 476, 487, 493 Bleckley, M. K., 404, 406
Bates, E., 35, 44, 50, 51, 54, 57, 6466, 131, Bley-Vroman, R., 49, 64, 100, 104, 121, 125,
147, 149, 164, 167 132, 150, 278n.3, 278
Bauman, A., 77, 84 Bliss, L., 445, 451
Bava, A., 459, 476, 487, 493 Bloem, I., 295, 305, 540, 549
Bavelier, D., 481, 495, 501, 514 Blommaert, J., 340, 346
Beaton, A., 1113, 16, 20, 25, 26 Blue, S., 79, 86
Beaujour, E., 437, 448 Blumstein, S. E., 502, 513
Beauvillain, C., 183, 199, 258, 259, 264 Bock, J. K., 269, 279, 285, 286, 287, 313, 324,
Bechtel, W., 156, 167 327, 337, 338, 345
Becker, A., 442, 444, 446, 448 Boehm-Jernigan, H., 269, 27879
Beers, M. H., 120, 125 Bogaards, P., 10, 21, 26
Bellugi, U., 89, 105, 481, 495 Bohn, O. S., 72, 86
Belnap, K., 440, 446, 449 Boies, S. J., 371, 387
Belsham, R. L., 23, 26 Boland, J. E., 269, 27879
Beltramello, A., 511, 513, 522, 527 Bolgar, M., 350, 368
Ben-Zeev, S., 418, 429 Bolonyai, A., 327, 328, 340, 342, 34547
Bentahila, A., 43, 44 Bonanni, M. P., 393, 406
Benton, A. L., 395, 405 Bongaerts, T., 93, 95, 104, 120, 125, 259, 264,
Bergman, C., 32, 34, 44 3015, 306, 31214, 316, 317, 324, 325, 350,
Berkow, R., 120, 125 369, 390, 406, 426, 430, 472, 477, 520, 529,
Berlin, B., 57, 64, 43739, 448, 450 536, 538, 539, 550, 552
Berman, R., 442, 445, 448 Bonilla-Meeks, J. L., 256, 262
Bernardo, A. B. I., 254, 262 Bookheimer, S., 364, 367, 427, 430, 510, 513,
Bernsten, J., 341, 345 521, 529
Bernston, G., 517, 530 Booth, J. R., 64, 64
Berry, C., 543, 550 Borgwaldt, S., 191, 199
Bersick, M., 276, 281 Borkowski, J. G., 395, 405
Bertoncini, J., 69, 76, 8586 Born, D. G., 254, 267
Berwick, R. C., 142, 149 Bornstein, M. H., 111, 125
Besien, V., 465, 475 Boroditsky, L., 439, 440, 443, 446, 448, 449
Besner, D., 185, 199 Bos, M., 191, 199
Bessel, N. J., 502, 513 Bosch, L., 32, 44, 70, 71, 73, 75, 77, 83, 86,
Best, C. T., 72, 74, 83, 166, 167 353, 366
Bever, T. G., 114, 125, 269, 278, 285, 287 Bottini, G., 506, 513
Bhatia, T., 328, 347 Boumans, L., 341, 346
Bhatt, R., 130, 151 Bourne, E., 444, 452
Bi, Y., 322n.2, 323 Bourne, L. E., Jr., 17, 28, 377, 385
Bialystok, E., 88, 9093, 95, 9799, 102, 104, Bourque, T. A., 425, 432
110, 115, 116, 119, 122, 123, 125, 126, Boustagui, E., 95, 1056, 120, 126, 437, 450
137, 14950, 268, 278, 380, 385, 398, Bowen, J., 57, 67
Author Index 557
Hahne, A., 95, 105, 198, 200, 27577, 278n.6, Hernandez, A. E., 364, 367, 427, 430, 510, 511,
280, 48991, 494, 520, 528 513, 521, 529
Haith, M. M., 259, 264, 426, 430 Heyman, G., 445, 449
Hakansson, G., 132, 143, 144, 147, 148, 151, 152 Hickerson, N., 439, 446, 448
Hakuta, K., 54, 65, 93, 97, 98, 101, 104, 105, 115, Hickok, G., 51, 65
116, 119, 122, 125, 126, 418, 430, 468, 478 Hicks, R. E., 255, 265
Hala, S., 423, 430 Hill, J., 434, 435, 449
Hall, D. G., 484, 485, 49596 Hillyard, S. A., 275, 280
Hamayan, E., 257, 266 Hinton, G. E., 156, 169
Hambrick, D. Z., 428, 432 Hird, K., 18, 27, 227, 249
Hamers, J. F., 259, 264, 393, 406, 484, 494 Hirsch, J., 81, 85, 95, 106, 357, 368, 411, 414,
Hammond, C. S., 526, 530 426, 430, 502, 513, 536, 543, 550, 551
Hancin-Bhatt, B., 56, 65, 165, 168 Hirsh-Pasek, K., 32, 45
Happel, B. L. M., 159, 168 Hix, H. R., 423, 429
Hardin, C., 438, 439, 449 Hlavac, J., 335, 342, 346
Harkins, J., 444, 449 Ho, C., 258, 263
Harley, B., 89, 98, 100, 101, 103, 105 Hodges, J. R., 359, 370, 500, 514
Harley, T. A., 290, 292, 306, 313, 324, 472, Hoefnagel-Hohle, M., 89, 96, 107
477, 523, 528 Hoenkamp, E., 133, 135, 137, 151
Harman, D. W., 23, 26 Hoff, M. E., 157, 169
Harnishfeger, K. K., 425, 430 Hoffman, C., 444, 449
Harre, R., 444, 449, 451 Hoffman, E., 437, 449
Harrington, M., 57, 65, 99, 102, 106, 120, 126, Hoffman, R. R., 471, 477
131, 151, 270, 276, 280, 389, 401, 406 Hofstadter, D., 63, 65
Harris, B., 468, 477 Hogben, D., 13, 27
Harris, R. J., 149n.5, 151 Hohle, B., 32, 47
Hart, D., 98, 100, 101, 105 Hohne, E. A., 77, 84
Hartsuiker, R. J., 259, 265, 540, 551, 553 Holcomb, P. J., 269, 276, 28081
Hasher, L., 425, 430 Hollan, D., 444, 449
Hasper, M., 191, 201 Hollingshead, A., 260, 265
Hasselmo, N., 334, 346 Holm, J., 343, 346
Hatch, E., 144, 151 Holmes, A. P., 505, 514
Hauch, J., 259, 264, 426, 430 Holmes, J., 445, 449
Haugen, E., 334, 346 Holmes, V. M., 269, 280
Haughton, V. M., 502, 515 Hon, N., 261, 262, 508, 513
Hauser, M. D., 56, 65, 69, 86 Hooglander, A., 259, 265, 310, 324, 357, 368,
Hausmann, M., 487, 494 391, 406, 464, 478
Hausser, R., 52, 65 Hoover, M. L., 270, 276, 280
Havelka, J., 25556, 265, 542, 551 Hopeld, J. J., 158, 165, 168
Hawkins, R., 111, 127, 146, 152 Horn, P., 89, 105
Haxby, E., 500, 514 Houston-Price, C. M., 423, 432
Hayes, R., 130, 151 Hsieh, S., 353, 366
Hazenberg, S., 10, 27 Huang, J. Y. S., 419, 431
Healy, A. F., 17, 28, 377, 385, 386 Hubbard, E. M., 62, 66
Hebb, D. O., 155, 156, 168, 523, 528 Hubel, D. H., 109, 126
Heekeren, H. R., 509, 515 Hudson, P. T. W., 295, 306
Heelas, P., 444, 449 Hughes, C., 423, 430
Heider, E., 438, 449 Hulk, A., 37, 39, 45
Heilenman, K., 57, 66, 131, 151 Hull, R., 482, 48492, 494, 496, 520, 530
Heilman, K. M., 526, 530 Hulstijn, J. H., 10, 11, 27, 372, 376, 382,
Heinze, H.-J., 427, 432, 510, 515, 519, 530, 384, 386
536, 552 Hulstijn, W., 372, 386
Helaha, D., 259, 265, 551 Hummel, K. M., 257, 264
Henderson, J. M., 269, 279 Humphreys, G. W., 230, 248, 295, 299, 306,
Henik, A., 259, 267, 379, 380, 388 313, 324, 359, 369
Heredia, R., 256, 257, 264, 372, 387 Hung, D. L., 258, 265, 353, 368
Hermans, D., 259, 264, 3013, 306, 31314, Hunt, E., 43639, 449
316, 318, 319, 324, 390, 406, 426, 430, 472, Hunt, R. R., 11, 27
477, 520, 526, 529, 536, 538, 540, 550 Huot, R., 255, 266
Author Index 563
Hurford, J. R., 89, 105 Johnson, J. S., 8992, 9799, 102, 106, 107,
Hyltenstam, K., 88, 91, 92, 99, 102, 103, 105, 112, 114, 11620, 122, 126, 481, 494, 501,
120, 126 504, 508, 512, 513
Hyona, J., 466, 477 Johnson, K., 372, 373, 383, 386
Johnson, M., 74, 86
Idiazabal, I., 36, 3839, 44, 45 Johnson-Laird, P. N., 519, 529
Iglesias, F. J., 427, 430 Johnston, M., 145, 151, 152
Ignatow, M., 254, 265 Jonasson, J. T., 185, 199
Igoa, J. M., 173, 175, 176, 227, 233, 236, Jonides, J., 500, 515
24950 Joseph, J. S., 371, 387
Ijaz, H., 57, 65 Joseph, R., 485, 494
Illes, J., 261, 264, 426, 430, 489, 494, 5024, Joshi, A., 334, 346
506, 513, 538, 543, 550 Juan-Garau, M., 37, 39, 40, 45
Imai, M., 439, 44950 Juang, B. H., 155, 169
Indefrey, P., 162, 167, 501, 513 Juffs, A., 99, 102, 106, 120, 126, 270, 276, 280
Inoue, K., 275, 281 Juliano, C., 52, 64
Ioup, G., 49, 64, 91, 95, 99, 1056, 120, 126, Jun, S.-A., 82, 83, 110, 125
437, 450 Juncos-Rabadan, O., 427, 430, 527, 529
Isham, W. P., 459, 46163, 472, 477 Jusczyk, A. M., 76, 84
Ito, T., 57, 65, 131, 151 Jusczyk, P. W., 70, 76, 77, 79, 8486, 112, 126
Ivry, R., 160, 167 Just, M. A., 54, 66, 398, 400, 406, 407
Izura, C., 175, 176, 543, 550
Kade, O., 457, 477
Jackendoff, R., 252, 264, 326, 332, 34445, 346 Kahneman, D., 371, 386
Jackson, G. M., 364, 369, 488, 494, 521, Kaiser, G., 37, 39, 45, 46
522, 529 Kamwangamalu, N. M., 336, 346
Jackson, S., 158, 169 Kan, I. P., 505, 515
Jackson, S. R., 364, 369, 488, 494, 521, 529 Kane, M. J., 404, 406
Jacobs, A. M., 206, 208, 209, 212, 224, Kanwisher, N. G., 260, 261, 264, 265
534, 550 Kaplan, E., 399, 405
Jacques, S., 421, 430 Kaplan, R. M., 132, 137, 141, 151
Jacquet, M., 308, 324 Kapur, N., 523, 529
Jain, M., 214, 224, 228, 249, 257, 265, 349, Kareev, Y., 100, 106
368, 532, 550 Karmiloff-Smith, A., 74, 86
Jake, J., 32728, 33031, 333, 33638, 34144, Katsaiti, L. R., 257, 267
346, 347 Katz, E., 445, 450
Jamieson, D. G., 74, 84 Katz, L., 227, 249
Janssen, N., 317, 319, 324, 539, 551 Katz, W., 502, 513
Jared, D., 183, 185, 200, 258, 264, 472, 477, Kauffman, S., 165, 168
535, 550 Kawaguchi, S., 14547, 149n.12, 150, 151
Jarnum, H., 129, 150 Kawamoto, A. H., 222, 224
Jarvis, S., 442, 443, 446, 450, 451 Kay, A. R., 500, 514
Javier, R., 446, 450 Kay, P., 57, 64, 43739, 448, 450
Jenkins, W. M., 427, 431 Kazarian, S., 254, 255, 267
Jensen, A. R., 353, 368 Keatley, C., 15, 27, 258, 260, 265, 536, 550
Jerrett, D., 439, 448 Keele, S. W., 298, 306, 365, 368
Jerrett, T., 439, 448 Keidel, J. L., 166, 168
Jersild, A. T., 359, 368 Keijzer, R., 1618, 20, 25, 26
Jescheniak, J. D., 322n.2, 324, 392, 406, Kellerman, E., 132, 151
490, 494 Kellman, S., 437, 450
Jesse, A., 465, 477 Kello, C., 269, 281
Jia, G. X., 91, 92, 97, 103, 106 Kempe, V., 401, 406
Jiang, N., 229, 232, 249, 260, 264, 538, 550 Kempen, G., 133, 135, 137, 151, 472, 477
Jin, Y.-S., 228, 234, 249, 258, 260, 264 Kempton, W., 439, 450
Jisa, H., 37, 39, 40, 45 Kennard, C., 359, 370
Joanette, Y., 517, 527 Kennedy, A., 270, 281
Johnson, C., 32, 45 Kennison, S., 227, 249
Johnson, D., 444, 449 Kerling, R., 189, 201, 259, 265, 310, 324, 357,
Johnson, E. K., 70, 86 368, 391, 406, 464, 478, 537, 551
564 Author Index
Malakoff, M. E., 258, 266, 468, 478 McGuthry, K. E., 428, 432
Malotki, E., 443, 451 McKelvie, S. J., 22, 23, 26
Malt, B., 440, 451 McKinnon, R., 276, 281
Mangun, G. R., 160, 167 McLaren, J., 425, 432
Manly, T., 360, 369 McLaughlin, B., 256, 257, 264, 372, 373, 387
Mann, V., 56, 65, 96, 105 McLaughlin, J., 275, 280, 281, 520, 529
Mannheim, B., 434, 435, 449 McLeod, B., 372, 373, 387
Manuel, S., 123, 126 McMahon, K. L., 519, 528
Marchman, V. A., 114, 126 McNamara, J., 312, 325
Marcus, G., 162, 168, 333, 346 McNaughton, B. L., 62, 66
Mareschal, D., 219, 225 McNeill, D., 393, 405
Marhsall, D. B., 88, 106, 110, 126 McQueen, J., 78, 85
Marian, V., 54, 67, 80, 81, 85, 87, 254, 266, McRae, K., 269, 270, 273, 278n.5, 280
352, 370, 411, 414, 446, 451, 53436, 551 McRoberts, G. W., 72, 83, 166, 167
Marinova-Todd, S. H., 88, 89, 101, 106, 110, McVie, K., 73, 83
123, 126 Meara, P., 15, 2728, 165, 168
Markman, A., 63, 65 Mecklinger, A., 276, 280
Markus, H., 444, 451 Medin, D. L., 293, 307
Marler, P., 109, 127 Meechan, M., 328, 346
Marschark, M., 11, 27 Meeuwis, M., 340, 346
Marsh, L. G., 419, 431 Mehler, J., 6971, 76, 8486
Marslen-Wilson, W., 80, 84, 227, 237, 247n.2, Meiran, N., 359, 360, 362, 368
249, 250 Meisel, J. M., 31–35, 37, 39, 45–46, 131, 132,
Martin, A., 456, 479, 500, 514 149nn.1–11, 151–52
Martin, J., 57, 67 Menn, L., 56, 66
Martin, M., 422, 429 Mennen, S., 95, 104
Martinez, A., 364, 367, 427, 430 Menon, R., 441, 452
Mascaro, J., 247n.3, 250 Merikle, P. M., 399, 405
Massaro, D. W., 455, 459, 465, 477, 478 Merrifield, W., 439, 450
Mathey, S., 187, 201 Merzenich, M. M., 427, 431
Mathis, K. M., 20, 25, 189, 199, 259, 262, 546, Mestre, J. P., 421, 431
547, 549 Meuter, R. F. I., 274, 278, 280, 304, 306, 320,
Matras, Y., 328, 346 321, 325, 353–65, 368–69, 379–80, 387, 391,
Mattingly, I. G., 455, 478 406–7, 458, 478, 490, 493, 495, 519, 529,
Maxwell, J. T., 141, 151 541, 552
May, C. P., 425, 430 Meyer, A. S., 287, 287, 290, 292, 306–7, 309,
Mayberry, R. I., 89, 106, 120, 127 312–16, 322n.2, 324–25, 395, 406, 475n.1,
Maye, J., 73, 85 478, 519, 529, 538, 551
Mayer, P., 413, 414 Meyer, D. E., 203, 225, 359, 370, 532,
Mayes, A. R., 136, 151 536, 552
Mayr, U., 365, 368, 400, 406 Meyer, E., 426, 431, 502, 514, 522, 529,
Mazoyer, B. M., 506, 514 543, 550
Mazziotta, J., 364, 367, 427, 430, 510, 513, Michael, E. B., 16, 27, 189, 200, 391, 401–2,
521, 529 406–7, 464, 478, 547, 551
McAllister, H., 259, 264 Michals, D., 23, 26
McCabe, A., 445, 451 Miikkulainen, R., 160, 165, 168–69
McCandliss, B. D., 166, 168 Mildred, H. V., 230, 250
McClain, L., 419, 431 Milech, D., 183, 199, 214, 223, 227, 247,
McClelland, J. L., 51, 62, 65, 66, 80, 85, 123, 256, 263
127, 155, 156, 162–64, 166, 168–69, 190, Miller, B., 88, 90–92, 97–99, 102, 104, 110,
201, 206, 210, 213, 219, 221, 223, 225, 243, 123, 125
250, 269, 270, 281, 297, 305, 534, 551, 552 Miller, C., 69, 86
McCloskey, M. S., 259, 266, 293, 306 Miller, G. E., 11, 27, 28
McCormack, P. D., 254, 263, 266, 531, 552 Miller, K., 441, 451
McCormick, C. B., 11, 27 Miller, M. D., 261, 265
McCulloch, W. S., 162, 168 Miller, N. A., 259, 266, 538, 540, 552
McDonald, J. L., 57–59, 66, 89–91, 97, 99, 100, Mills, A., 34, 46
102, 104, 106, 131, 151, 175, 176, 457, 478 Mills, D., 490, 495
McGee, T., 74, 87 Mills, M., 447, 448
Milner, B., 363, 369, 426, 430, 431, 480, 494, Murtha, S., 413, 414
502, 514, 522, 529, 543, 550 Muysken, P., 49, 64, 132, 150, 328, 346
Mimica, I., 51, 64 Myers, J., 77, 85
Mimouni, Z., 185, 199, 349, 366 Myers-Scotton, C., 327–31, 333–44, 345–47
Minsky, M. L., 157, 169
Miozzo, A., 511, 513, 522, 527 Naatanen, R., 74, 85
Miozzo, M., 258, 263, 290, 301, 305–6, 312, Nachtsheim, C., 115, 127
322n.2, 323, 397, 405, 472, 476, 520, 528, Nair, N. P. V., 124, 126
536, 549 Nakada, T., 480, 495
Mishina-Mori, S., 38, 39, 46 Nakamura, K., 538, 550
Mistry, J., 445, 451 Nakayama, K., 371, 387
Mitchell, D. C., 268, 269, 272–74, 279–80 Narayanan, S., 62, 64
Mitchell, J. P., 404, 407 Nas, G. L. J., 15, 26, 183, 199, 201, 227–29,
Miura, I., 441, 451 234, 239, 248, 258–60, 263, 266, 349,
Miyake, A., 54, 66, 359, 363, 365, 367, 389, 366–67
400, 401, 407 Nation, P., 9, 10, 28
Molis, M., 90, 91, 98, 103, 104n.2, 104, Naumann, K., 144, 152
11722, 125 Navar, M. I., 255, 267
Mononen, L., 484, 494 Nazzi, T., 69, 70, 8586
Monsell, S., 353, 358, 359, 369, 370, 378, 379, Neely, J. H., 373, 387
387, 519, 529 Neisser, U., 254, 266, 446, 451, 534, 551
Montoya, R. I., 393–95, 397, 406 Nelson, T. O., 260, 266
Montrul, S., 120, 121, 127 Nespor, M., 69, 84, 86
Moon, C., 56, 66 Neter, J., 115, 127
Moore, A. W., 155, 167 Neubert, A., 455, 468, 478
Moore, C. J., 526, 530 Neufeld, G. G., 93, 107
Moore, J. C., 13, 15, 23, 28 Neumann, E., 259, 266
Morales, R. V., 419, 431 Neumann, O., 372, 384, 387
Morgan, J. L., 76, 85 Neville, H. J., 95, 99, 102, 108, 198, 201,
Moro, A., 501, 514 275–76, 278n.6, 281, 481, 490–92, 495, 496,
Morosan, D. E., 74, 84 501, 509, 512, 514, 515, 520, 530
Morris, B. S. K., 89, 106 Newell, A., 55, 66, 372, 387
Morris, S. K., 394, 406 Newport, E. L., 56, 65, 89–91, 97–100, 102,
Morrison, C. M., 543, 552 105–7, 111, 112, 114, 116–20, 122, 126, 127,
Morton, J., 203, 225, 295, 298, 300, 304, 306, 481, 494, 495, 501, 504, 508, 512, 513
518, 529 Newson, M., 149n.2, 150
Moselle, M., 95, 105–6, 120, 126, 437, 450 Ng, M.-L., 218, 223, 228, 234, 247, 258, 260,
Moser, B., 461–63, 465, 478 263, 426, 429, 536, 549
Moser-Mercer, B., 455, 456, 461, 465, Ni, W., 269, 280
467–71, 478 Nicol, J., 538, 550
Moses, L. J., 423, 429 Niemeier, S., 434, 451
Moskovlijevic, J., 228, 248 Nikelski, J., 426, 430
Mous, M., 328, 345 Nilipour, R., 517, 529
Moyer, A., 97, 106, 111, 123, 127 Noel, M., 441, 451
Moyer, R. S., 355, 369 Nooteboom, C., 385, 387
Muhlausler, P., 444, 451 Noppeney, U., 521, 529
Muldrew, S., 274, 278, 490, 493 Norman, D. A., 352, 363, 369, 419, 431
Muller, N., 37, 39, 46 Norris, D., 51, 66, 71, 80, 84, 86
Muller, U., 522, 529 Nosofsky, R. M., 155, 161, 169
Mummery, C. J., 500, 505, 514 Nosselt, T. M., 427, 432, 510, 515, 519, 530,
Munnich, E., 437, 442, 451 536, 552
Munoz, C., 101, 106–7 Noppeney, U., 521, 529
Munoz, M., 446, 450 Novack, T. A., 521, 528
Munro, M. J., 93, 94, 96, 97, 105, 115, 126, Novell, J. A., 254, 266
512, 513 Nozawa, T., 97, 105, 122, 126
Munte, T. F., 427, 432, 510, 515, 519, 530, 536, Nuyts, J., 434, 451
550, 552
Murre, J. M. J., 154, 155, 158–60, 163, 166, Ober, B., 399, 405
167–69, 518, 523, 526, 530 Obermeyer, J., 254, 267
Obler, L. K., 165, 167, 352–54, 363, 366, 369, Paus, T., 81, 84
411, 414, 480, 483, 484, 493–95, 499, 513, Pavlenko, A., 252, 266, 433, 435, 437, 438,
519, 527 442, 444, 446, 447, 451, 460, 479, 537, 552
Ochs, E., 437, 451 Pavlovitch, M., 30, 47
Odlin, T., 442, 446, 450 Pawley, A., 383, 387
Ogawa, S., 500, 514 Peal, E., 418, 428, 431
Ogden, W. C., 372, 387 Pearlmutter, B. A., 166, 169
Oh, J. S., 82, 83, 110, 125 Pearlmutter, N. J., 52, 66, 268, 280
Ojemann, G. A., 499, 513, 514 Pecher, D., 256, 267
Okamoto, Y., 441, 451 Pechmann, T., 312, 324
Older, L., 227, 250 Pederson, E., 434, 435, 441, 442, 451, 452
Oliphant, G., 228, 250 Pelligrino, J. W., 419, 431
Ollinger, J., 481, 493 Penfield, W., 89, 100, 107, 351, 363, 369,
Olson, L., 89, 107 499, 514
O'Neill, W., 255–57, 266 Peperkamp, S., 76, 77, 84, 86
Opoku, J., 256, 266 Perani, D., 81, 86, 111, 125, 222, 223, 356–57,
O'Reilly, R. C., 62, 66 369, 426, 431, 500–509, 512, 512, 514, 515,
Osgood, C. E., 253, 263 520, 527, 543, 552
Oskarsson, M., 101, 108 Pereiro, A. X., 527, 529
Osterhout, L., 269, 275, 276, 278n.8, 280–81, Perez, A., 257, 266
488, 494, 520, 521, 528, 529 Perez-Vidal, C., 37, 39, 40, 45
Ouellette, J. A., 11, 12, 28 Perfetti, C. A., 9, 28, 173, 176, 176, 480, 495
Oyama, S., 89, 91–94, 96–98, 107, 116, Perner, J., 423, 431
120, 127 Perret, E., 363, 369, 425, 431
Perry, C., 212, 223, 225
Paap, K. R., 372, 387 Persaud, A., 403, 407
Padilla, F., 454, 469, 475, 478–79 Persaud, A., 403, 407
Padilla, P., 454, 456, 469, 475, 478–79 Peters, A. M., 32, 47, 100, 107
Page, M., 205, 221, 225 Peterson, B. S., 519, 530
Paget, R., 62, 66 Peterson, R. R., 313, 325, 392, 407, 538, 552
Paivio, A., 11, 28, 253, 256–57, 260, 266, Peterson, R. R., 313, 325, 392, 407, 538, 552
534, 552 Peynircioglu, Z. F., 254, 256, 257, 259, 266, 267
Pak, Y., 98, 107 Pfaff, C., 34, 47, 328, 347
Palfai, T., 422, 429 Pfeifer, E., 491, 494
Pallier, C., 71, 75, 80–82, 86, 349, 369 Phaf, R. H., 158, 169, 295, 297, 306
Palmer, H., 128, 152 Phelps, E., 17, 25
Palmer, M. B., 254, 266 Phillips, C., 74, 86
Pantev, C., 427, 429 Phinney, M., 147, 152
Papagno, C., 16, 17, 19–20, 24, 25, 28, 397, Pickering, M. J., 269, 270, 281
398, 405, 407 Pienemann, M., 132, 134–36, 139, 141–45,
Papathanasiou, I., 523, 529 149nn.9–11, 151–52, 373, 387
Papert, S. A., 157, 169 Piepenbrock, R., 208, 223
Paradis, J., 37, 39, 46 Pine, J., 32, 46
Paradis, M., 133, 135–37, 152, 301, 306, 358, Pinker, S., 89, 100, 107, 114, 123, 127, 130,
369, 376, 378, 387, 411–14, 414–15, 426, 140, 152, 162, 169, 326, 333, 346, 347,
431, 436, 451, 458–62, 466, 471, 479, 419, 431
481–83, 486, 487, 493, 495, 498, 499, 514, Pisoni, D. B., 56, 66, 75, 80, 85
516–18, 520–22, 524, 526, 527, 529, 530 Pitres, A., 498, 499, 514, 520, 530
Parault, S. J., 100, 104 Pitts, W., 162, 168
Parker, J. T., 377, 385 Planken, B., 93, 95, 104
Parodi, T., 37, 39, 46 Platzack, C., 131, 152
Partee, B. H., 155, 169 Plaut, D. C., 100, 107, 214, 220, 225, 523, 530
Parziale, J., 261, 264 Pleh, C., 51, 52, 54, 66
Pascoe, K., 11, 27 Plunkett, K., 216, 220, 223, 225
Pashler, H. E., 384, 387 Poetzl, O., 511, 514
Patkowski, M. S., 89, 91–94, 98, 99, 101, 107, Poline, J. B., 505, 514
116, 117, 120, 127 Polka, L., 72, 79, 86
Patterson, K., 123, 127, 500, 514 Polkey, C. E., 359, 370
Paulesu, E., 500, 505, 514 Pollatsek, A., 241, 248
Pollmann, S., 522, 529 Relkin, N. R., 95, 106, 357, 368, 411, 414, 426,
Pollock, J., 278n.3, 281 430, 502, 513, 543, 550
Poot, R., 20, 26, 189, 199, 227, 248, 262, 263, Rey, A., 212, 224
357, 367, 391, 405, 464, 476, 532, 547, 549 Rey, M., 228, 234, 250, 258, 267, 536, 552
Popiel, S. J., 258, 266 Reynolds, A. G., 418, 431
Poplack, S., 328, 346, 347 Ricciardelli, L. A., 418, 431–32
Posnansky, C. J., 290, 306 Richman, C. L., 11, 27
Posner, M. I., 69, 86, 371, 387, 427, 431 Rickard, T. C., 377, 385, 523, 524, 530
Potter, J., 434, 452 Riddoch, M. J., 313, 324
Potter, M. C., 15, 20, 28, 189, 201, 203, 218, Rinne, J. O., 464, 467, 479
225, 226, 250, 257, 258, 260, 261, 265, 266, Rintell, E., 444, 452
460, 479, 532, 536, 543–45, 547, 552 Ristad, E. S., 142, 149
Potzl, O., 522, 530 Ritchie, W., 328, 347
Poulin-Dubois, D., 32, 47 Ritter, H., 160, 163, 165, 166, 169
Poulisse, N., 301–5, 306, 312, 315–19, 322n.2, Rivera-Gaxiola, M., 74, 86
325, 327, 328, 347, 350, 355, 369, 472, 474, Rivers, W. M., 383, 387
479, 538, 539, 552 Robbins, M., 94, 108
Poulsen, C., 375, 388 Robbins, T. W., 359, 370
Pouratian, N., 426, 431 Roberson, D., 439, 448
Powell, D., 360, 369 Roberts, L., 89, 100, 107, 351, 363, 369
Prat, C. S., 404, 406 Roberts, M., 364, 369
Prather, P., 133, 153 Roberts, P. M., 527, 530
Pressley, M., 11, 12, 27–28 Robertson, D., 273, 281
Preston, M. S., 258, 266, 351, 353, 354, 369 Robertson, I. H., 360, 369, 518, 523,
Price, C. J., 222, 224, 364, 369, 413, 415, 427, 526, 530
431, 458, 479, 481, 488, 493, 494, 495, 501, Robinson, P., 101, 107, 376, 378, 384, 387
505, 507–8, 510, 511, 514, 518, 521, 522, Robles, B. E., 254, 267
524–26, 528–30 Rockstroth, B., 427, 429
Prince, A., 162, 169 Rodríguez, M. J., 11, 28, 527, 529
Prince, P., 15, 28 Rodriguez-Fornells, A., 427, 432, 510–12, 515,
Proctor, R. W., 372, 387 519, 530, 536, 550, 552
Pulvermuller, F., 89, 100, 107, 114, 123, 127 Roediger, H. L., 226, 248, 254, 256–57, 263,
Purcell, E. T., 96, 107 426, 429, 534, 549
Putz, M., 434, 452 Roelofs, A., 287, 287, 290, 292, 293, 295,
Pynte, J., 258, 264, 269–70, 272, 277, 298–300, 304, 306, 307, 309, 313–16, 324,
279, 281 325, 395, 406, 475n.1, 478, 519, 529,
538, 551
Quay, S., 31–33, 37, 39, 42, 45, 47 Rogers, R. D., 353, 358–59, 363, 370, 378,
379, 387
Raaijmakers, J. G. W., 155, 169 Rohde, D. L. T., 100, 107
Rabin, L. A., 256, 267 Rohwer, W. D., 353, 368
Rabiner, L. R., 155, 169 Romaine, S., 38, 43, 47, 437, 446, 452
Rado, J., 269, 279 Romani, C., 241, 247
Raichle, M. E., 82, 86, 481, 493, 501, 505, Romo, L. F., 261, 264
514, 515 Ronjat, J., 31, 32, 41, 47
Ramachandran, V. S., 62, 66 Rosaldo, M., 444, 452
Ramus, F., 69, 86 Rose, P. R., 257, 266
Ransdell, S. E., 392, 398, 407 Rose, R. G., 257, 266
Rapp, B., 313, 324, 325 Rosen, V. M., 395, 399, 400, 407
Rapport, R. L., 520, 530 Rosenberg, C. R., 156, 160, 169
Rapus, T., 421, 432 Rosenberg, S., 256, 257, 266, 286, 287
Rastle, K., 212, 223 Rosenblatt, F., 155, 157, 169
Raugh, M. R., 11, 25 Rosenbloom, P. S., 372, 387
Rayner, K., 198, 199, 241, 248, 269, 281, 290, Rosselli, M., 392, 395, 407
306, 535, 548 Rossi-Landi, F., 436, 452
Reason, J., 305n.1, 306–7 Rossman, T., 372, 387
Recanzone, G. H., 427, 431 Roth, S., 9, 28
Redbond, J., 423, 432 Rothbart, M., 69, 86
Redfern, B. B., 521, 528 Rothi, L. J. G., 526, 530
Rotte, M., 427, 432, 510, 515, 519, 530, Schlyter, S., 37, 39, 40, 47
536, 552 Schmidt, R., 377–79, 386, 387
Roy, L., 257, 266 Schmitt, B., 439, 453
Rubenstein, J. S., 359, 370 Schmitt, E., 337, 347
Rubin, D. C., 254, 267, 446, 452, 534, 552 Schmitt, N., 383, 387
Ruddy, M. G., 203, 225, 532, 536, 552 Schneider, V. I., 17, 28
Ruin, I., 144, 152 Schneider, W., 371, 387
Rumelhart, D. E., 156, 157, 162–64, 168–69, Schoknecht, C., 210, 223, 228, 229, 232,
206, 225, 243, 250, 534, 551 248, 249
Rumiati, R. I., 359, 369 Schomaker, L. R. B., 160, 169
Rumsey, A., 434, 452 Schonpflug, U., 547, 552
Runquist, W. N., 14, 28 Schoonen, R., 382, 387–88
Russell, J., 423, 430 Schrauf, R. W., 254, 267, 446, 452, 534, 552
Rusted, J., 258, 266 Schreuder, R., 10, 28, 259, 264, 292, 301,
Ryalls, J. H., 502, 513 305–6, 313–14, 324, 352, 358, 366,
Ryan, E. B., 258, 263 390, 406, 426, 430, 458, 472, 476,
477, 519, 520, 528, 529, 536, 538, 549, 550
Sacco, K., 49, 67 Schriefers, H. J., 185, 187, 191, 192, 195,
Sadoski, M., 11, 28 199–201, 208, 210, 223, 224, 290, 307, 312,
Saegert, J., 254–57, 265–67 313, 316, 317, 319, 324–25, 352, 367, 454,
Saer, D. J., 418, 428, 432 455, 462, 470, 472, 476, 493, 493, 536, 539,
Saffran, E. M., 520, 527 549, 551
Saffran, J. R., 76, 85 Schriefers, K. I., 392, 406
Sahakian, B. J., 359, 370 Schulpen, B., 191, 201
Sais, E., 397, 405 Schumann, J. H., 89, 100, 107, 114, 123, 127
Salasoo, A., 228, 248 Schwanenflugel, P. J., 17, 28, 228, 234, 250,
Salthouse, T. A., 428, 432 258, 267, 536, 538, 552
Samuel, A. G., 538, 543, 552 Schwartz, A., 535, 536, 552
Samuels, J., 372, 386 Schwartz, B. D., 130, 132, 146, 152
Samuels, S., 89, 107 Schweda-Nicholson, N., 457, 479
Sanchez-Casas, R. M., 18, 28, 71, 83, 175, 176, Schweisguth, M., 32, 45
183, 201, 227, 229, 231, 233–39, 248–50, Scoresby-Jackson, R., 498, 515
260, 261, 267 Scovel, T., 88, 89, 100, 107, 111, 127
Sanders, L. D., 275–76, 281 Sebastian, N., 75, 86
Sandra, D., 227, 250 Sebastian-Galles, N., 32, 44, 69–71, 73, 75–77,
Sandston, J., 363, 370 83–84, 86, 87, 313, 316, 323, 349, 353, 366,
Sankaranarayanan, A., 16, 20, 27, 28, 189, 201, 369, 395, 405, 539, 549
547, 552 Sebova, E., 398, 407
Sansavini, A., 76, 86 Secada, W. G., 419, 432
Santesteban, M., 308, 321, 323, 358, 365, 366 Segalowitz, N. S., 274, 281, 372–76, 378, 379,
Santiago, J., 299, 300, 305, 307 382–84, 386, 388, 389, 407
Sapir, A., 359, 368 Segalowitz, S. J., 374–76, 388, 493n.1, 495,
Sapir, E., 434, 452 499, 515
Sarter, M., 517, 530 Seger, C. A., 256, 267
Sasaki, Y., 131, 152 Segui, J., 71, 84, 85, 235, 248, 304, 305
Saunders, B., 439, 446, 452 Seidenberg, M. S., 52, 66, 166, 168, 205, 213,
Saunders, G., 43, 47 220, 225, 268, 280, 380, 388, 534, 552
Savoy, P., 313, 325, 392, 407, 538, 552 Seitz, M., 484, 494
Sawyer, M., 389, 401, 406 Sejnowski, T. J., 156, 160, 169
Saxton, M., 441, 452 Seleskovitch, D., 54, 66, 461, 479
Sayehli, S., 132, 151 Seliger, H. W., 95, 102, 107, 111, 123, 127
Scarborough, D. L., 181–82, 188, 200, 214, 224, Selinker, L., 121, 127
228, 230, 233, 249, 250, 257, 259, 264, 267 Semcesen, T. K., 421, 430
Scarcella, R. C., 89, 106 Senft, G., 441, 452
Schaal, S., 155, 167 Senman, L., 424, 429
Schachter, J., 89, 91, 92, 102, 106, 107 Serratrice, L., 35, 38, 39, 47
Scheibel, A. B., 124, 127 Servan-Schreiber, D., 163, 169
Schelleter, C., 38, 39, 47 Service, E., 16, 19, 28, 398, 407
Schils, E., 93, 95, 104 Setton, R., 468, 479
Van der Slik, F., 95, 104 Warburton, E. A., 525, 526, 530
Van der Velde, F., 297, 306 Wartenburger, I., 82, 87, 507, 509, 515
Van der Velden, E., 189, 201, 259, 265, 310, Wasserman, W., 115, 127
324, 357, 368, 391, 406, 464, 478, 537, 551 Waters, G., 380, 388
Van Dyck, G., 183, 199, 535, 549 Watkins, M. J., 257, 267
Van Gelderen, A., 382, 388 Watson, V., 388
Van Hell, J. G., 12–21, 23, 26, 28, 175, 176, Webber, A., 255, 267
184, 185, 189, 199, 200, 201, 310, 325, 390, Weber, A., 78, 81, 87
407, 426, 432, 466, 479, 535–38, 549, 553 Weber-Fox, C. M., 95, 99, 102, 108, 198, 201,
Van Heste, T., 182, 201 275–76, 278n.6, 281, 490, 491, 496, 509,
Van Heuven, W. J. B., 63, 64, 164, 167, 183, 512, 515, 520, 530
185, 187–91, 197, 200, 201, 206, 208–10, Wee, G. C., 258, 265, 353, 368
212, 213, 223–25, 227, 239, 243, 245, 246, Weerman, F., 385, 387
248, 258, 263, 349, 352, 358, 367, 370, 458, Wei, L., 328, 333, 342, 347–48
472, 476, 479, 519, 520, 528, 532–35, 541, Weinberg, A., 142, 149
549, 553 Weiner, W. J., 522, 529
Van Jaarsveld, H., 181, 182, 184, 186, 190–93, Weinreich, U., 15, 20, 28, 128, 129, 152, 188,
200, 210, 223, 235, 248, 472, 476 201, 226, 250, 253, 267, 334, 348, 531,
Van Mier, H., 501, 514 532, 553
Van Montfort, R., 57, 64 Weise, R., 333, 346
Van Ooyen, B., 69, 84 Weissenborn, J., 32, 47
Van Orden, G. C., 380, 388 Wellman, H. M., 423, 432
Van Petten, C., 489, 495 Weltens, B., 220, 225
Van Rijn, H., 208, 223 Werker, J. F., 56, 62, 67, 72–73, 79, 80, 83–87
Van Schelven, L., 259, 265, 551 Werner, G., 395, 406
Van Summeren, C., 95, 104 Wernicke, C., 498, 515
Van Wuijtswinkel, K., 120, 127 Wessels, J. M. I., 76, 77, 84
Vandenberghe, R., 505, 514 Wetherell, M., 434, 452
VanPatten, B., 381, 388 Wexler, K., 130, 152–53
Varela Ortega, S., 243, 250 Whitaker, H. A., 499, 514, 520, 530
Vermeij, M., 300, 306 White, G., 444, 450
Verspoor, M., 434, 452 White, L., 95, 108, 120, 121, 127, 146, 147,
Vigliocco, G., 540, 553 153, 268, 271, 273, 278nn.3–4, 281, 373, 388
Vigorito, J., 112, 126 White, N., 466, 477
Vihman, M., 32, 47 Whorf, B., 434–36, 438, 443, 447, 452
Villringer, A., 509, 515 Whurr, R., 523, 529
Vitkovitch, M., 300, 307 Wickens, D. D., 254, 264, 267
Volterra, V., 30, 33, 34, 47 Widdicombe, S., 31, 48
Von Cramon, D. Y., 522, 529 Widrow, B., 157, 169
Von Eckardt, B., 15, 28, 189, 201, 203, 225, Wieland, L. D., 11, 26
226, 250, 257, 266, 460, 479, 532, 552 Wienbruch, C., 427, 429
Von Elek, T., 101, 108 Wierzbicka, A., 444, 449
Von Studnitz, R. E., 182, 187, 194, 201, 364, Wiesel, T. N., 109, 126
369, 413, 415, 427, 431, 458, 479, 488, 495, Wiezbicka, A., 437, 444, 453
508, 514, 521, 530, 541, 553 Wiggs, C. L., 500, 514
Vorberg, D., 312, 324 Wijnen, F., 385, 387
Voyer, D., 486, 496 Wiley, E., 115, 126
Vrignaud, N., 465, 477 Wilkins, D., 441, 452
Vygotsky, L. S., 61, 67 Williams, D., 461, 478
Williams, J., 378, 386
Wade, E., 394, 405 Williams, J. N., 229, 250, 258, 260, 267, 318,
Waksler, R., 227, 250 319, 324
Waldorp, L. J., 469, 475 Williams, K. A., 72, 85
Wall, R., 155, 169 Williams, R. J., 156, 169
Walsh, T. M., 89, 108 Willingham, D. B., 378, 388
Wang, A. Y., 11–13, 28, 257, 262 Wilson, S. J., 519, 528
Wang, W., 103, 105 Wimer, C. C., 14, 16, 23, 28
Wang, Y., 440, 451 Wingfield, A., 375, 388
Wanner, P., 43, 47 Winkler, I., 74, 87
576 Subject Index
bilingual word recognition, 179–82, 185, evidence from priming studies, 227–31
186–99, 202, 234, 235, 239 priming effect in, 232–47
binding problem, 297 cognition, bilingualism and, 433, 447
biological bases of bilingualism. See neural color, 438
mediation of language in bilinguals numerical, 441
blend errors, 317–18 spatial, 440, 442
blended recovery, 517 cognitive complexity and control theory, 422
blood oxygenation level dependence (BOLD), cognitive development, bilingualism and,
500 417–19, 424–28
bottom-up connections, 220 cognitive fluency, 378
brain color and language, 438–39
duality, 498. See also lateralization common memory theory, 226
language and the, 498–99 common representations, 253
visualizing the bilingual, 501–2. See also communicative language teaching (CLT), 383
neuroimaging studies compensation, 486–87
brain plasticity, 481 competition, 50, 51
Broca, Pierre Paul, 498 cross-language, 318
Broca's aphasia, 137 competition model, 50, 131–32, 164. See also
bulk transfer, 142–43 unified competition model
competitive arenas, 51–52
canonical word order, 140–41 competitive learning, 158, 159
cardinality, 421–22 schematic overview, 158, 159
case marking cues, 52 complex access, simple selection, 286, 290,
category–exemplar generation priming, 256 327
central executive (working memory model), 462 complex sentences, 39
child–adult differences, 6 compound bilingualism, 226, 542
CHILDES, 43 161, 165 compound representations, 253
Chomsky, Noam, 130, 154 comprehension, 57–58, 532, 534, 536, 538–41
chunking, 51, 54–55, 63. See also parsing in bilinguals, neuroimaging studies
clitics, 39 investigating, 506–10
closed class elements, 339 computational models of bilingual word, 204,
clustering, 254 222–23. See also distributed models;
code switching, 302, 327, 329, 331, 337, 340 localist models
composite, 334 individual differences in, 176
examples of, 329–30, 334–37, 340 processes of, 173–76
with system morphemes, exemplifying computational models, use of, 203–4
asymmetries in, 339–42 computational simulation, 7
code switching data, 342, 344 concept mediation, 189, 391–92, 403
as window of combinations, 333 concept mediation model, 543–46
code-switching evidence holistic nature of concept selection, 290, 291, 296–98
noninfinite verb form, 335–37 concept(s), 435
codes, 51 attrition of previously learned, 438
coefficient of variability, 375–76, 378–79, internalization of new, 438
382–84 two words for one, 310
cognate relations, 247 conceptual domain
and models of bilingual memory, 238–47 restructuring of a, 438
as reducible to form or meaning relationships, shift from L1 to L2, 438
231–36 conceptual representations in L2. See also
as special kind of morphological relationship, semantic/conceptual level of
236–38 representation
cognate status, 15, 16, 18, 227, 228, 247 developing, 543–48
and speed of word production, 317 conceptual system, 290
cognate vs. noncognate translations, as conceptual transfer, L1-based, 438
represented in BIA model, 244–46 conceptualizer, 133, 134, 373, 475n.1
cognates, 12, 17, 210. See also false friends/false concreteness, word, 15, 17–18, 547
cognates conference interpreting. See simultaneous
language nonselective access of, 182–85 interpreting
language-selective access of, 181–82 congruence, 39
vs. noncognates congruent condition, 259
speed-up vs. automaticity, 375–76 syntax, 45. See also specific topics
spelling-to-sound, 183, 184 anatomy of, 130
spreading activation, 295, 296, 298, 304 initial hypothesis of, 131
stage of L2 acquisition hypothesis, 484 syntax first model, 269
starting small, advantage of, 100 system morpheme principle, 330, 334, 335,
stimulus onset asynchronies. See SOAs 339–41, 344
storage, 51, 53–54 system morphemes, 337, 342. See also early
storytelling, 445 system morphemes; late system
stress, getting, 75–77 morphemes
stress deaf bilinguals, 77 defined, 338
stress distance, 77 exemplifying asymmetries in code switching
stress languages, 71 with, 339–42
strong interface position (grammar rule reconfiguring content morphemes as,
acquisition), 376–78 343–44
Stroop effect, 258, 381, 398, 404
Strooplike interference tasks, 258–59, 353–55, T (temporal) retention state, 18
398, 404 tags, 186, 294, 303, 519
Strooplike tasks, 298, 521–22 target grammar, 130
structural models, 155 task activation, 297, 298
subcortical lesions, 522 task/decision system, 194, 196–98
subject identification, cues for, 52 task schema, 390
subject-object-verb (SOV) vs. subject-verb-object task set, inhibition of language vs., 304
(SVO) order, 54, 60, 145, 165 task switching and concept formation,
subject realization, 39 422–23
sublexical representations temporal (T) retention state, 18
activation, 319 text representation, 173
selection, 316, 319 and integration, 175–76
subset activation, 190 thematic roles, 338
subset hypothesis, 458 theory of mind, 419, 423–24
substances. See objects and substances theory theory, 423
successive recovery, 517 thought
superstrate, 337 bilingualism and, 433
supervisory attentional system (SAS), 352, 363, language(s) and, 414, 435, 447
426 interactions between, in bilingual
suppression, articulatory, 19 individuals, 446–47
suppression skills, 400. See also inhibitory levels of interaction between, 434
control (IC) model possible relationships between, 438
improved by bilingualism, 393, 398–99 time, conceptualization of, 443
suprasegmentals, 75–77 time markings, 39
switching, 287, 304, 522 tip-of-the-tongue (TOT) phenomenon,
switching cost, 320, 321. See also language 317, 319
selection tip-of-the-tongue (TOT) states, 393–94, 396–98
switching effects, language, 187 tone–word interference task, 258
switching mechanism, language, 351, 363. See top-down connections, 220–21
also language selection topic-nding approach, 163
syllabic languages, 71 TRACE model, 80, 164, 210
syllable detection task, 71 transcoding interpreting strategy, 459–62,
symbolic-deductive theory, 154 471
synonyms and translation equivalents, transfer, 7. See also first language (L1) transfer;
260–61 processability theory
syntactic accent, 57 of adaptation, 71
syntactic ambiguities, types of, 269 in articulation, 56
syntactic anomalies. See under sentence in audition, 56
processing codes and, 55
syntactic processing, 490–92. See also sentence in lexical learning, 56–57
processing in morphology, 58–60
in late bilinguals, L2, 277–78 in pragmatics, 58
syntactic strings, 276 in sentence completion, 60
syntactic word order, 39 in sentence comprehension, 57–58