Explicit Learning in the L2 Classroom: A Student-Centered Approach, by Ronald P. Leow (Second Language Acquisition Research Series). Routledge, 2015.

August 8, 2017 | Author: Olga Ichshenko | Category: Second Language Acquisition, Second Language, Awareness, Language Acquisition, Psycholinguistics

EXPLICIT LEARNING IN THE L2 CLASSROOM

Explicit Learning in the L2 Classroom offers a unique five-prong (theoretical, empirical, methodological, pedagogical, and model-building) approach to the issue of explicit learning in the L2 classroom from a student-centered perspective. To achieve this five-prong objective, the book reports the theoretical underpinnings, empirical studies, and research designs employed in current research to investigate the constructs of attention and awareness in SLA, with the objectives of (1) proposing a model of the L2 learning process in Instructed SLA that accounts for the cognitive processes employed during this process and (2) providing pedagogical and curricular implications for the L2 classroom. The book also provides a comprehensive treatise on research methodology that is aimed at not only underscoring the major features of conducting robust research designs with high levels of internal validity but also preparing teachers to become critical readers of published empirical research.

Ronald P. Leow is Professor of Applied Linguistics and Director of Spanish Language Instruction in the Department of Spanish and Portuguese at Georgetown University, USA. His areas of expertise include language curriculum development, teacher education, SLA, psycholinguistics, cognitive processes in language learning, research methodology, and CALL.

Second Language Acquisition Research Series: Theoretical and Methodological Issues
Susan M. Gass and Alison Mackey, Editors

Monographs on Theoretical Issues:
Schachter/Gass Second Language Classroom Research: Issues and Opportunities (1996)
Birdsong Second Language Acquisition and the Critical Period Hypotheses (1999)
Ohta Second Language Acquisition Processes in the Classroom: Learning Japanese (2001)
Major Foreign Accent: Ontogeny and Phylogeny of Second Language Phonology (2001)
VanPatten Processing Instruction: Theory, Research, and Commentary (2003)
VanPatten/Williams/Rott/Overstreet Form-Meaning Connections in Second Language Acquisition (2004)
Bardovi-Harlig/Hartford Interlanguage Pragmatics: Exploring Institutional Talk (2005)
Dörnyei The Psychology of the Language Learner: Individual Differences in Second Language Acquisition (2005)
Long Problems in SLA (2007)
VanPatten/Williams Theories in Second Language Acquisition (2007)
Ortega/Byrnes The Longitudinal Study of Advanced L2 Capacities (2008)
Liceras/Zobl/Goodluck The Role of Formal Features in Second Language Acquisition (2008)
Philp/Adams/Iwashita Peer Interaction and Second Language Learning (2013)
VanPatten/Williams Theories in Second Language Acquisition, Second Edition (2014)
Leow Explicit Learning in the L2 Classroom (2015)

Monographs on Research Methodology:
Tarone/Gass/Cohen Research Methodology in Second Language Acquisition (1994)
Yule Referential Communication Tasks (1997)
Gass/Mackey Stimulated Recall Methodology in Second Language Research (2000)
Markee Conversation Analysis (2000)
Gass/Mackey Data Elicitation for Second and Foreign Language Research (2007)
Duff Case Study Research in Applied Linguistics (2007)
McDonough/Trofimovich Using Priming Methods in Second Language Research (2008)
Larson-Hall A Guide to Doing Statistics in Second Language Research Using SPSS (2009)
Dörnyei/Taguchi Questionnaires in Second Language Research: Construction, Administration, and Processing, 2nd Edition (2009)
Bowles The Think-Aloud Controversy in Second Language Research (2010)
Jiang Conducting Reaction Time Research for Second Language Studies (2011)
Barkhuizen/Benson/Chik Narrative Inquiry in Language Teaching and Learning Research (2013)
Jegerski/VanPatten Research Methods in Second Language Psycholinguistics (2013)

Of Related Interest:
Gass Input, Interaction, and the Second Language Learner (1997)
Gass/Sorace/Selinker Second Language Learning Data Analysis, Second Edition (1998)
Mackey/Gass Second Language Research: Methodology and Design (2005)
Gass/Selinker Second Language Acquisition: An Introductory Course, Third Edition (2008)

PRAISE FOR THIS BOOK

“This book brilliantly explains how attention and awareness mediate adult second language learning. Synthesizing theory and research from multiple fields, Leow proposes a cogent model for language processing. Engaging, enlightening, and humorous, Explicit Learning in the L2 Classroom provides an essential understanding of how language learning works.”
Melissa Baralt, Florida International University, USA

“Ron Leow takes us on a clever and entertaining journey that looks at the internal processes involved in the development of a second language. He starts with a comprehensive account of existing theory, and continues with the presentation of his model of the L2 learning process, showcasing recent empirical studies that support it. This work has important theoretical and methodological contributions to the field and will inform SLA researchers and teaching practitioners alike.”
Nina Moreno, University of South Carolina, USA

“Clearly written by a remarkable scholar with decades of experience both in the classroom and in empirical classroom research, this outstanding volume approaches a critical issue for those of us in the field of teacher education: do L2 teachers know how students learn in the L2 classroom? This book is a long overdue contribution to the L2 teacher education field and the best case for explicit learning in the L2 classroom.”
María J. de la Fuente, Associate Professor of Spanish and Director of the Spanish Language Program, The George Washington University, USA

EXPLICIT LEARNING IN THE L2 CLASSROOM A Student-Centered Approach

Ronald P. Leow Georgetown University

First published 2015 by Routledge 711 Third Avenue, New York, NY 10017 and by Routledge 2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN Routledge is an imprint of the Taylor & Francis Group, an informa business © 2015 Taylor & Francis The right of Ronald P. Leow to be identified as author of this work has been asserted by him in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. Trademark notice : Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Library of Congress Cataloging-in-Publication Data Leow, Ronald P. (Ronald Philip), 1954– Explicit learning in the L2 classroom : a student-centered approach / Ronald P. Leow, Georgetown University. pages cm. — (Second language acquisition research series) Includes bibliographical references and index. 1. Language and languages—Study and teaching. 2. Second language acquisition. I. Title. P51.L4968 2015 418.0071—dc23 2014036365 ISBN: 978-0-415-70705-3 (hbk) ISBN: 978-0-415-70706-0 (pbk) ISBN: 978-1-315-88707-4 (ebk) Typeset in Bembo by Apex CoVantage, LLC

To Deborah, Philip, and Stephanie

CONTENTS

Preface
Acknowledgments
1 Introduction, or Strolling Down Memory Lane: Raising Your Awareness

SECTION 1
Theoretical Foundations
2 A Preliminary Theoretical Framework for the L2 Learning Process in SLA
3 Theoretical Foundations for the Role of Attention in Learning from Non-SLA Fields
4 Theoretical Foundations for the Role of Awareness in Learning from Non-SLA Fields
5 Theoretical Foundations for the Roles of Attention and Awareness in L2 Learning in SLA

SECTION 2
Research Methodology
6 Methodological Issues in Research on the Relationships between Attention, Awareness, and L2 Learning in SLA
7 Deconstructing the Construct of Learning
8 Location, Location, Location: Probing Inside the Box

SECTION 3
Empirical Research Investigating the Role of Attention/Noticing in L2 Development
9 Your Attention, Please
10 Learning Explicitly or Implicitly: That Is the Question
11 Depth of Processing in L2 Processing

SECTION 4
Model Building
12 Toward a Model of the L2 Learning Process in Instructed SLA

SECTION 5
Pedagogy
13 Toward the Development of Psycholinguistics-Based E-Tutors
14 Conclusion: The Changing L2 Classroom, and Where Do We Go From Here?

Index

PREFACE

To begin at the beginning: Let us begin with two agreements, one professional, one theoretical. Professionally, no one will disagree with the statement that teaching is one of the most rewarding and, at the same time, potentially one of the most frustrating professions to undertake (yes, I am going to wear two hats in this book: Teacher’s, based on four decades of teaching experience, and researcher’s, based on my research in the second language acquisition (SLA) field over the last two decades). It is rewarding when our students perform according to our objectives and “master/mistress” plans, and frustrating when, despite our efforts to facilitate their learning by providing them with an “appropriate” environment, and adequate exposure to and practice with the second or foreign (L2) language (accompanied by much love), they apparently fail to grasp even the “simplest” (that is, from our perspective) grammatical rule we teach them or to which we expose them. What can explain this apparent contradiction? To answer this question, let us discuss some important processes in language learning. First, did you pay attention to or notice the quotation marks around the descriptors “appropriate” and “simplest”? I even bolded them to draw your attention to them, with the hope of making you process them at a deeper level (this, incidentally, is the argument behind any effort to make specific aspects of the written L2 (and even oral L2 via pauses, intonation, funny facial expressions, etc.) more salient to L2 learners, and falls in the research strand of input/textual enhancement, which I will discuss later). Were you aware (pun intended) that there are deeper connotations embedded in these words? You could have paid more attention to these words but perhaps did not process them further—but that is getting into your heads, an internal process. 
“Simplest” refers to a teacher-centered perspective of what constitutes a simple rule for the teacher and not necessarily from a student-centered perspective. “Appropriate” is based on our
perception of language learning and teaching. Take a pause and contemplate this simple but challenging question: What is your perception of language learning and teaching? In other words, how do we think an L2 is learned (or acquired), and how should it be taught or presented? This is what drives all of us in the “classroom” setting (OK, these quotation marks around “classroom” refer to the fact that nowadays we have the hybrid curriculum in which the classroom can be physical or virtual/electronic or both). There are many responses to this simple but challenging question, some based on personal experience or attendance at teacher education courses, some based on the SLA and/or non-SLA (e.g., cognitive psychology or cognitive science, cognitive neuroscience, etc.) literatures, some based on a combination of two or more, and so on. However, regardless from where our perception may be derived, the fact remains that we, however hard we try, cannot learn for our students. In other words, learning is an internal process that may or may not be manipulated by external factors and, as we know, learning may be explicit, that is, with awareness, or implicit, that is, without awareness. What, then, is awareness? Are you aware that the literature in both SLA and non-SLA fields is literally littered with the role of awareness being explicitly or implicitly subsumed within a remarkable number of variables? We have type of learning (e.g., subliminal, incidental, implicit, explicit), type of learning condition (e.g., implicit, explicit), type of instructional condition (e.g., implicit, explicit, inductive, deductive), type of awareness (e.g., language, metacognitive, phenomenal, situational, self, conscious, unconscious) and so on. We also have constructs such as noticing (attention with awareness), detection (attention, cognitive registration without awareness), perception (with or without awareness), consciousness (even conscious awareness! 
So we can assume there is unconscious awareness?), and the list goes on. There are quite a large number of definitions, but here are two: “[T]he function of the interpretation of the nature of the encoding and retrieval processes required by the task” (Robinson, 1995: 301), and “a particular state of mind in which an individual has undergone a specific subjective experience of some cognitive content or external stimulus” (Tomlin & Villa, 1994: 193). For now, simply note that the construct of awareness as defined refers to something taking place in our brain as we process language. Now we can go to the theoretical agreement: We are still not sure which type of learning (explicit, that is, with awareness, or implicit, that is, without awareness) works better for our students in the L2 classroom. In other words, the role awareness plays in the learning process is a theoretically valid question and it plays out every day in our classrooms (with or without our own awareness). Yet from a practitioner’s perspective, we “know” (as in “intuition” that comes from vast experience) that raising our students’ awareness of grammatical rules, learning strategies, etc., is the way to go. Many if not all of us even do it today in our classrooms, since it feels so natural and instinctive! Take a look at every single foreign language textbook. Did you find one WITHOUT any grammatical

rules? It is not going to sell because we teachers (and students) expect such grammatical explanations, at least in some form, be it traditional—for example, complete verbal paradigms—or a little more progressive, as in partial paradigms that only address the persons involved (e.g., you and I, tú y yo, je et tu, etc.). For those teachers who have been in the profession as long as I have been, I am sure you will fondly remember the good old days of the Grammar Translation Method, when we taught the grammatical rules and then told our students to apply the rules clearly (from our perspective), with a high level of grammatical awareness, in translations. For our slightly younger teachers, you have likely been exposed to myriad methods and theoretical perceptions that view language learning from either a formal or traditional perspective (grammar comes first), an informal perspective (grammar is embedded in the L2 input, so you are wasting your time teaching it), or a combination of the two. Any way we look at all these perspectives with regard to grammar instruction or exposure, and their relationships with the role of awareness of these grammar rules, the focus has clearly shifted from a teacher-centered perspective to a more learner-centered one (at least for most of us), albeit with one caveat: Does “learner-centeredness” mean that (1) we make our students more active by participating in more activities, etc., but we still provide the essential grammatical information based on our individual perception of how the learning process operates, or (2) we make our students more responsible for the learning process obviously premised on a better understanding of the internal processes involved in language learning? 
The irony of the current status of language learning and teaching, however, may lie in the absence of a clear treatise on the roles of many important variables postulated to promote language learning that not only are theoretically and empirically supported but also offer solid pedagogical implications for the L2 classroom. Explicit Learning in the L2 Classroom: A Student-Centered Approach, then, is a book that provides a theoretically grounded and empirically supported approach to the promotion of explicit learning, that is, learning with awareness, in L2 development with a direct connection to learning in the classroom setting. Put another way, it is not a book on justifying the teaching of grammatical rules in the second/foreign language (L2) classroom. It is also not a book on the acquisition of an L2 in an L2 setting since the author believes that this kind of formal setting does not lend itself to the natural acquisition process that is usually associated with a process similar to that of a child acquiring his/her first language (e.g., L2 classroom contexts are often characterized by impoverished input or L2, an inadequate amount of exposure and interaction, and practice and homework focused on grammar, be it by the teacher or the textbook, etc.). It is a book on learning, a process that involves quite a lot of processing and potential learner awareness while interacting with the L2 inside and outside the classroom setting. I shall elaborate on, or, more specifically, deconstruct this construct of learning in Chapter 7. Now that I have gotten your attention (I am assuming that since you have read this far you must have been paying some attention to the message and, in some

cases, processing and interacting with the content from a personal viewpoint), the main purpose of this book is to raise your awareness (pun intended) of many important variables that contribute to language learning and, ultimately, to language teaching. We are going to take a closer and critical look at internal cognitive processes and, more specifically, constructs such as attention, awareness, and working memory, together with activation of prior knowledge and depth of processing postulated to play important roles in language learning, from both a theoretical and empirical perspective, in order to better inform ourselves as we teach in the L2 “classroom” (think hybrid). For those readers who were processing the above information with some level of awareness and making inferences with respect to the title of this book, it will come as no surprise that the overwhelming number of studies that have empirically and directly or indirectly investigated the construct of awareness or lack thereof in L2 development (e.g., Bowles, 2003; de la Fuente, 2015; Faretta-Stutenberg & Morgan-Short, 2011; Hama & Leow, 2010; Hsieh, Moreno, & Leow, 2015; Leow, 1997, 1998, 2000, 2001, 2015; Leung & Williams, 2011, 2012; Martínez-Fernández, 2008; Medina, 2015; Robinson, 1996; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007; Williams, 2005) have reported substantial and robust support for its role in the learning process, that is, explicit learning in the classroom setting viewed from an internal perspective rocks! However, please do not take my word since, as a human being, I may be biased, given that the bulk of my published research lies in the attentional and awareness strands of research. There is absolutely no need on my part to mention that the classroom setting is far more conducive to explicit learning than implicit learning, for obvious reasons.
So what I propose to offer in this book is simply a presentation of the facts—from an unbiased perspective—and you, my dear readers, will be the judge of the substance and purpose of the book: Promoting explicit learning, that is, learning with learner awareness in the L2 classroom. To this end, I shall present the relevant data from a five-prong approach (theoretical, empirical, methodological, model-building, and pedagogical). Section 1 is all theoretical. Without theory, everything is explained in an ad hoc way, which is not very scholarly—plus I sound smart. Chapter 1 takes a stroll down memory lane to situate the construct of awareness in the SLA literature. To situate the theoretical foundations for explicit learning in both SLA and non-SLA, Chapter 2 presents, from a psycholinguistic perspective, a coarse-grained theoretical framework of the L2 learning process in SLA, to which we shall refer several times throughout the book. Chapters 3 and 4 then present relatively broad overviews of several major theoretical underpinnings postulated for the roles of attention and awareness in learning in non-SLA fields that have influenced those in SLA. Chapter 5 provides a more in-depth discussion of the theoretical underpinnings postulated for the roles of attention and awareness in language learning, together with a summary of their major tenets postulated to account for the learning process in SLA.

Before we discuss the empirical research, we need to have a relatively good idea of what constitutes a robust research design that produces findings upon which we can place a high level of confidence. Let’s get critical! To this end, Section 2 (Research Methodology) provides an in-depth report of the heart of any empirical study, namely the research design, and its corresponding level of internal validity. Chapter 6 focuses on the methodological issues surrounding the investigation of the relationship between the roles of attention and awareness, or lack thereof, and learning in the SLA field. Chapter 7 presents a tri-dimensional perspective of the construct of learning to address the potential terminological confusion as to what comprises “learning” in SLA. Chapter 8 addresses three major concurrent data elicitation procedures (reaction time, eye-tracking, and think aloud) employed to gather data on learner processes while they are interacting with L2 data, and their benefits to further research on learner processes. Related methodological issues such as reactivity and veridicality are also discussed. Section 3 (Empirical Research) then provides synopses of the empirical research conducted in SLA on the roles of attention/noticing (Chapter 9) and awareness or lack thereof (Chapter 10) in L2 development. As I mentioned above, Section 2 will assist you in being critical of published work. Chapter 11 discusses the concept of depth of processing in the L2 learning process and suggests that we centralize the process of learning not on the construct of attention, which clearly plays a crucial role in L2 learning, but on the notion of how L2 learners process the L2 data. Theoretical, methodological, empirical, and pedagogical benefits are discussed. 
Section 4 (Model Building) presents in Chapter 12 a proposed model of the L2 learning process in Instructed SLA that draws from previous theoretical underpinnings in both the SLA and non-SLA fields, and attempts to capture the important roles several cognitive processes play along the L2 learning process from input to output. Learning will be represented in this model as being both a process and a product (knowledge), and special emphasis will be placed on the potential roles attention, depth of processing, (levels of) awareness, and activation of prior knowledge play along several stages in the learning process. Section 5 (Pedagogy) is based exclusively on the previous chapters. Chapter 13 provides suggestions for the development of psycholinguistics-based tasks and e-tutors designed to enhance robust learning, especially of problematic grammatical points in the L2. Specific examples are provided. Finally, Chapter 14 discusses some conclusions and questions gleaned from the previous chapters, reports on the inroads technology is currently making at both the curricular and instructional level leading to the changing dynamics of the L2 classroom, and provides one feasible curricular suggestion, namely, a partial hybrid curriculum, to embrace the changing format of the traditional L2 classroom, the role of technology, and Instructed SLA research. It is hoped that after reading this book, readers’ awareness of several important variables postulated to contribute to language learning (and teaching) will be raised, together with more creative ways to enhance and stimulate students’

explicit learning, that is, learning with awareness, in the L2 classroom. While Baars (1997) wrote, “Paying attention—becoming conscious of some material—seems to be the sovereign remedy for learning anything applicable to many very different kinds of information. It is the universal solvent of the mind” (sec. 5, p. 304), I personally like to think that the depth of processing, with its potential of raising the level of awareness, is an important step to potential change in the learning process.

References

Baars, B. J. (1997). In the theater of consciousness: The workspace of the mind. New York: Oxford University Press.
Bowles, M. A. (2003). The effects of textual input enhancement on language learning: An online/offline study of fourth-semester Spanish students. In P. Kempchinsky & C. E. Piñeros (Eds.), Theory, practice, and acquisition: Papers from the 6th Hispanic Linguistics Symposium and the 5th Conference on the Acquisition of Spanish and Portuguese (pp. 395–411). Somerville, MA: Cascadilla Press.
De la Fuente, M. (2015). Explicit corrective feedback and computer-based, form-focused instruction: The role of L1 in promoting awareness of L2 forms. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Faretta-Stutenberg, M., & Morgan-Short, K. (2011). Learning without awareness reconsidered: A replication of Williams (2005). In G. Granena, J. Koeth, S. Lee-Ellis, A. Lukyanchenko, G. Prieto Botana, & E. Rhoades (Eds.), Selected proceedings of the 2010 Second Language Research Forum: Reconsidering SLA research, dimensions, and directions (pp. 18–28). Somerville, MA: Cascadilla Proceedings Project.
Hama, M., & Leow, R. P. (2010). Learning without awareness revisited: Extending Williams (2005). Studies in Second Language Acquisition, 32, 465–491.
Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). A comparison of level of awareness and depth of processing in two types of instructional media (C-FTF vs. CAI): Revisiting Hsieh (2008). In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language Learning, 47, 467–506.
Leow, R. P. (1998). Toward operationalizing the process of attention in second language acquisition: Evidence for Tomlin and Villa’s (1994) fine-grained analysis of attention. Applied Psycholinguistics, 19, 133–159.
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware versus unaware learners. Studies in Second Language Acquisition, 22, 557–584.
Leow, R. P. (2001). Attention, awareness and foreign language behavior. Language Learning, 51, 113–155.
Leung, J. H. C., & Williams, J. N. (2011). The implicit learning of mappings between forms and contextually derived meanings. Studies in Second Language Acquisition, 33, 33–55.
Leung, J. H. C., & Williams, J. N. (2012). Constraints on implicit learning of grammatical form-meaning connections. Language Learning, 62, 634–662.
Martínez-Fernández, A. (2008). Revisiting the involvement load hypothesis: Awareness, type of task and type of item. In M. Bowles, R. Foote, S. Perpiñán, & R. Bhatt (Eds.), Selected proceedings of the 2007 Second Language Research Forum (pp. 210–228). Somerville, MA: Cascadilla Proceedings Project.
Medina, A. (2015). The variable effects of level of awareness and CALL versus non-CALL textual modification on adult L2 readers’ input comprehension and learning. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Robinson, P. (1995). Review article: Attention, memory and the ‘noticing’ hypothesis. Language Learning, 45, 283–331.
Robinson, P. (1996). Learning simple and complex second language rules under implicit, incidental, rule-search and instructed conditions. Studies in Second Language Acquisition, 18, 27–68.
Rosa, E., & O’Neill, M. D. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21, 511–556.
Rosa, E. M., & Leow, R. P. (2004). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25, 269–292.
Sachs, R., & Suh, B-R. (2007). Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.), Conversational interaction in second-language acquisition: A series of empirical studies (pp. 197–227). Oxford: Oxford University Press.
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–203.
Williams, J. N. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, 269–304.

ACKNOWLEDGMENTS

I would like to thank all my previous teachers from pre-kindergarten to college (especially James F. Lee, my mentor, and Bill VanPatten); my current and former students/colleagues who have, in so many ways, contributed to the content of this book; and especially the think aloud/awareness/CALL group (Elena Rosa, Miguel Angel Novella, Maite Camblor, Melissa Bowles, Ana-María Nuevo, Ya-Chin Tsai, Takako Egi, Maite Camblor, Almitra Medina, Claudia Guidi, Laura Gurzynski-Weiss, Rebecca Sachs, YunDeok Choi, Maymona Khalil Al Khalil, Marisa Filgueras Gómez, de la Fuente), the research methodological group (Kara Morgan-Short, Bo-Ram Suh, Melissa Baralt, Luis Cerezo, Mika Hama, Ellen Johnson Sarafini, Germán Zárate-Sández, Sarah Grey, Silvia Marijuan, Colleen Moorman), the more recent depth-of-processing group (Hui-Chen Hsieh, Nina Moreno, Ana Martínez-Fernández, Annie Calderón, Sergio Andrada, Johnathan Mercer), and my two co-creators of the criteria for coding depth of processing (Annie Calderón and Ellen Johnson Sarafini) in Chapter 11. Thanks also to the Initiative on Technology-Enhanced Learning (ITEL) grant (coordinated by Peter Janssens) awarded by Georgetown University’s Center for New Designs in Learning and Scholarship (CNDLS) that permitted me to create the Gustar Maze that was designed by Bill Garr and extended by Allison Caras. A special thanks to my graphic designer Steven Mercer for the model of the L2 learning process in Instructed SLA, Johnathan Mercer (yes, they are siblings) for contributing to the visual conceptualization of the model, and Celia Zamora for the indexing. Finally, much appreciation to my two editors (Susan Gass and Alison Mackey) who have been, without their awareness, sources of inspiration to me, and to Renata Corbani and Carmen Baumann for making the publishing process enjoyable.

1 INTRODUCTION, OR STROLLING DOWN MEMORY LANE Raising Your Awareness

Before I jump into the nitty-gritty of the book components, I would like first to broadly describe what comprises L2 learning in this book and where this book is situated, and then I would like to take a quick stroll down memory lane regarding the many changes pertaining to our students’ role in L2 learning in the SLA field, ranging from the theoretical and pedagogical approaches to learning in the 1960s to the current empirical focus on implicit learning in SLA.

L2 Learning and Setting L2 learning, for now ( Chapter 7 elaborates on this construct), can be described broadly as a process in which many changes take place in L2 learners’ cognition as they try to create new representations for the L2 grammar, internalize such data, and restructure if necessary, all the while developing their ability to comprehend and produce the L2, either orally or written, in real time. It takes place in a setting in which the L2 is either viewed as a foreign language (as in English speakers taking the foreign language requirement in an L1 environment) or a second language (as in Japanese speakers taking English classes in an L2 environment, for example, in the USA). In either setting, L2 learners are exposed to naturally occurring languages and are interacting with the language, be it communicatively or performing a task of some ecological validity. In the typical formal classroom setting, the L2 is taught by an instructor, and students learn the so-called traditional four skills of listening, reading, writing, and speaking. There is a curriculum that provides information on, for example, the grading criteria, attendance policy, percent weight of each section of the curriculum, objectives for all four skills, and a syllabus that provides a guideline for each class session. Homework is usually assigned and students follow a prescribed textbook.

The amount of time spent in this formal setting varies, but the minimum is usually around one hour either daily (for intensive classes) or three or four times a week (for non-intensive classes, depending upon the language program). SLA research that seeks to probe into learner cognition, then, needs to focus on the identification and explanation of the cognitive processes employed by L2 learners as they learn the L2 in these two settings.

Memory Lane

Now, let us situate this book’s perspective regarding our students’ role in L2 learning by taking a quick stroll down memory lane over the many selected changes in focus, both theoretical and empirical, toward what Omaggio (1993) calls the “presumed locus of control of the process of language acquisition” (p. 43) over the last several decades. First, a quick pop quiz: How many of you are aware that the construct of awareness has always been subsumed in the teaching profession? Isn’t it true that we language teachers (well, most of us) have this innate desire to promote our students’ “knowledge” of what they are learning or their awareness of what they are producing? Be it theoretically, empirically, or pedagogically driven, we, as L2 teachers, do incorporate activities or tasks that require some role of awareness or lack thereof on our students’ part. Now that we are on the same page, let us proceed to some previous theoretical perspectives regarding the L2 learning process. As many of us will recall, the two dominant theoretical approaches to L2 learning in the 50s and early 60s were the behaviorist/empiricist (e.g., Hilgard, 1962; Skinner, 1957) versus the rationalist/mentalist/nativist (e.g., Chomsky, 1957) perspective of learning. The former postulated that learning was literally teacher-centered, that is, it was the teacher who was responsible for providing the appropriate stimuli or grammatical data. The student was like the little baby (or Pavlovian puppy) conditioned to absorb all this important information without any personal input or cognitive processes involved, but being rewarded when following instructions correctly (which may explain why many teachers do have jars of candy in their offices). Thus, we had our pedagogical repertoire of repetition exercises or rote memory that was relatively divorced from a relationship with meaning or even real communication (cf. the well-known Audio-lingual Methodology in the 60s).
The latter theoretical approach viewed the student as a more active participant in the learning process (after all, s/he already possessed innately Chomsky’s famous language acquisition device (LAD), also referred to as “the black box”), so teachers shifted more responsibilities to the students for the learning process, and in doing so subtly acknowledged the role of cognitive processes in L2 learning. However, as in real life, the rationalist/mentalist/nativist approach to L2 learning led to several interpretations of this new learner-centered perspective of the learning process. This resulted in several variations of

teaching practices, depending on one’s personal perception of language learning and teaching within this approach: Witness, for example, the Grammar Translation Method (50s), the Cognitive Code Method that placed a premium on formal instruction of grammar before practice, the humanistic perception as evident in the 70s in The Silent Way, Community Language Learning, and Suggestopedia, and the focus on communication first as in the Direct Method (60s), the Total Physical Response (70s), the Communicative Approach (70s), the Natural Approach (80s), Task-based Learning (current) and so on. I am proud to say that I have tried, during my four decades of teaching, most of these methods, approaches, techniques, etc., as bandwagons came and rode off into the sunset. Interestingly, the two major publications, namely Corder (1967) and Selinker (1972), that provided the foundation for current research in SLA were both learner-centered and repudiations against a behaviorist approach to language learning. For language instruction, Corder suggested the need to seriously address what L2 learners bring to the task of learning an L2, which he called their “internal syllabus.” In addition, he coined the term “intake” that sought to differentiate what learners are exposed to, for example, the L2 (the input), and what they take in. Theoretically, it is assumed that not everything that learners pay attention to in the input is automatically “taken in” or processed, most likely due to processing demands and attentional constraints. Selinker suggested that we acknowledge the internal system, which he calls “interlanguage,” that L2 learners possess as they develop their ability to learn the L2. As the term connotes, interlanguage is a system that is somewhere between the first language (L1) and the L2. 
Given the status of interlanguage being a system with its own rules, doesn’t it make you wonder whether errors produced by our students are really systematically “correct” according to their own interlanguage system, but are being graded from a native or near-native speaker’s perspective? Put another way, perhaps they are “right” and we are “wrong.” The 70s witnessed several empirical efforts to address Corder’s and Selinker’s calls for more focus on the learner’s involvement in the learning process. Many of these studies were essentially based on L1 research conducted within an L2 context and provided quite a contrast in their pursuits. On the one hand were the acquisition order studies (do some of you recall the famous morpheme studies by, for example, Dulay and Burt (1973, 1974) that attempted to equate the L1 acquisitional process with that of the L2 based on an apparent natural order of morphemes?), while on the other hand we had the error analysis studies that sought to prove otherwise, that is, that the L1 transfer process may not be entirely similar to the L2 (cf. Corder, 1967; Schachter, 1974). The 80s, in my opinion, began a fruitful period of research in the L2 learning process both from an empirical and theoretical perspective. Even though the shift from a strict behaviorist perspective of language learning was relatively accepted in the SLA field, early empirical research still began to focus on the external features of the L2 input (cf. studies on simplification such as Blau, 1982;

Davies, 1984; Parker & Chaudron, 1987) and the role of interaction in L2 language learning, mostly from a descriptive perspective (cf. Hatch, Shapira, & Wagner-Gough, 1978; Henzl, 1979). At the same time, there were some important theoretical underpinnings that began to focus more closely on learners’ internal processes in relation to the role of awareness. First, the term “consciousness-raising” (Sharwood Smith, 1981) came into being with a direct relationship to students’ internal processes. If we can raise our students’ consciousness of the underlying grammatical rules, this will greatly facilitate their learning (by the way, the Grammar Translation Method could be credited for doing this, though learning was defined as the ability to write or translate instead of the ability to speak). However, Sharwood Smith came to realize (became aware) that he was dealing with an internal process and, consequently, modified his term to “input enhancement” (Sharwood Smith, 1993), arguing that this term was more appropriate in depicting exactly what was being proposed, namely enhancing the L2 input via, for example, grammatical rules, additional emphasis, or anything that could potentially draw students’ attention to the enhanced aspect of the L2 input. Needless to say, this strand of research exploded in the 90s, given its relatively broad definition of what comprises input enhancement (cf. Leow, 2009, for a more elaborated and critical discussion of this issue), and is still current today. In my opinion, there are some milestones along the theoretical and empirical routes to current studies that have gone beyond investigating the role of awareness in L2 learning to addressing whether the absence of awareness also plays a role in L2 learning (cf. Chan & Leung, 2014; Hama & Leow, 2010; Leow, 2000; Leung & Williams, 2011, 2014; Williams, 2005). 
The first milestone was Krashen’s (1982) Monitor Theory, with its pedagogical sidekick the Natural Approach that initiated and maximally contributed to this theoretical and empirical impetus on internal processes. The Monitor Theory, premised on children’s first language acquisition, was the first theoretical underpinning to raise the issue of the role of the construct of awareness (termed “consciousness” in those days and also today with some researchers) in the L2 learning process and to distinguish between learning (with consciousness), resulting in learned/explicit knowledge, and acquiring (without consciousness), resulting in acquired/implicit knowledge. Krashen also argued that there was no interface (connection) between implicit (acquired) and explicit (learned) knowledge, which led to quite a discussion of whether there exists in SLA a weak interface; for example, explicit knowledge can lead to implicit knowledge (e.g., R. Ellis, 2006), or implicit knowledge may be assisted by explicit knowledge (e.g., N. Ellis, 2005). A strong interface (e.g., DeKeyser, 2007) derived from skill acquisition theory in cognitive psychology (cf. Anderson, 1982) postulates that SLA is largely a conscious process, so we begin the learning process with declarative knowledge that can then become procedural knowledge (after much practice), or none at all (Krashen, 1982; Paradis, 2009). The interesting aspect of this interface debate that rears its head every now and then is that while type of knowledge (a product) is under consideration,

by attaching the dichotomies implicit versus explicit or acquired versus learned or conscious versus subconscious to the term knowledge, we have shifted the product knowledge to include a process of learning, that is, learning with or without awareness. So the theoretical question is not only whether knowledge can be identified as implicit or explicit but also how such knowledge got to be explicit or implicit. In other words, the end result (product) may not reflect the process of how the knowledge made its way into the internal system, and in order to address adequately the interface issue, concurrent or online data on learners’ processes need to be gathered instead of making extrapolations based on nonconcurrent or offline data. See how convoluted the issue can become, and yet it remains charmingly challenging and stimulating to research? In addition to other postulations, Krashen’s theory equated L2 “acquisition” with L1 acquisition and also postulated that acquisition followed a predictable order. In addition to the obvious critique of the inability to test his theory of L2 learning (“hmm, comprehensible input, to whom? input comprehensible to you may not be comprehensible to me; hmm, i plus 1, where to locate each student’s i, GPS, anyone? and what is the 1 again?”), serious questions such as, “Do we treat our adult students like babies following a first language (L1) acquisitional trajectory?” (Krashen: look at the evidence of an apparent unchangeable acquisitional sequence, albeit based on morphemes, those little pieces that make up a word), or “Do we intervene in an ‘appropriate’ way (we need to consider the psycholinguistic or sociocultural factors involved in learning) in their learning process?” (let us provide feedback at an appropriate point during interaction or let us put them into collaborative groups and learning will take place), still need to be more fully addressed.
As I mentioned above, Krashen’s scholarly contribution to the SLA field via his Monitor Model ranks very high in my estimation. While we do have the phenomenon called “Krashen bashin’” (cf. for example, Gregg, 1984; McLaughlin, 1978 and others who took him to task), without his theoretical postulations serious research on learners’ internal processes would most likely have taken place at a later date. When you publish a study or postulate a theory or model and subsequently encourage a whole string of further investigation into the issue(s) you initiated, you have my highest respect, irrespective of any potential bashing you may receive, because in a weird sense you have contributed to a better understanding of the learning process by stimulating further and, ideally, more robust research. Around the late 80s, both teachers and researchers were beginning to contemplate the use of two types of instruction: Explicit versus implicit (cf. e.g., Scott, 1989; Shaffer, 1989). Explicit (also referred to as “deductive”) instruction laid the responsibility on the teacher to explain the grammatical rules first, keeping their fingers crossed that students did understand the explanation, before allowing them to practice the rules. Implicit (also referred to as “inductive”) instruction exposed the students to the L2 with many of the targeted grammatical or lexical items embedded in the input, and teachers also kept their fingers (and

toes) crossed that students would induce these grammatical features all by themselves (given the impoverished environment of the classroom and the paucity of extensive exposure to and interaction with the L2, tough luck). Interestingly, the two studies cited above (Scott, 1989; Shaffer, 1989) actually defined inductive instruction differently. While Scott followed the definition of inductive instruction as described above, Shaffer actually oriented her participants to pay attention to the targeted linguistic items in the input, which, in my opinion, may be best described as a partially combined deductive/inductive definition. Two other terms that arose and were employed very loosely in this period were the constructs of explicit learning, that is, learning with awareness, and implicit learning, that is, learning without awareness, although at that time the construct of awareness was not independently operationalized or measured. Both terms are now playing a prominent role in current SLA research, as mentioned above. With regard to a focus on learners’ internal processes, McLaughlin (1987) posited his Cognitive Theory based on cognitive psychology tenets on the role of attention, limited attentional capacity, and types of processing (controlled versus automatic) assumed to play a role in input processing during the early stages of the learning process. The next year witnessed the first attempt to capture the L2 learning process from input (called ambient speech) > apperceived input > comprehended input > intake > integration > output (Gass, 1988, which has been refined and updated in Gass, 1997, and Gass & Selinker, 2008). 
The 90s began with Schmidt’s (1990) seminal article on his noticing hypothesis that brought into the SLA field a theoretical postulation that the roles of focal attention and awareness were isomorphic (two sides of the same coin) and crucial in any L2 development, and more specifically during the early stage of the learning process, namely the input-to-intake stage. This is the second milestone along the route to current unawareness or implicit learning studies. The noticing hypothesis immediately became arguably the most influential theoretical underpinning of many strands of SLA research that include input enhancement, interaction, learning conditions, output hypothesis, and so on. Making the attentional strand of research even more interesting and debatable was Tomlin and Villa’s (1994) model of input processing in SLA derived from a cognitive neuroscience perspective that did not posit any role for awareness in the initial stages of the learning process (intake), soon followed by Robinson’s (1995) model of the relationship between attention and memory that attempted to reconcile these two theoretical perspectives in addition to elaborating on the important role of memory. Appearing also were theoretical models (e.g., Gass, 1997, an update of Gass, 1988; VanPatten, 1996, updated in 2004 and 2007) of the L2 learning process that went beyond the initial stage of the learning process to include other stages postulated to occur along this process, for example, beyond intake (internalization) and output (production) (cf. Gass, 1988), and a modular framework (Truscott & Sharwood Smith, 2004, updated in 2011) that attempts to integrate

language acquisition accounts with proposals of how language, and more specifically phonology and (morpho)syntax, and cognition interact. At this point, methodologically, empirical studies were still gathering data after experimental exposures or treatments. Thus, while theoretically the focus was on the learning process and not on external input or output features, product data were being employed to account for internal processes. Here was (and still is for many current studies) the classic research design: Pretest (to establish that experimental and control groups were statistically similar in ability to perform some task) > treatment or exposure (to the targeted linguistic item in the input) > immediate posttest (e.g., a recognition and/or production test, etc.). Usually the raw scores of the groups were entered into a statistical program, and if a significant difference in performance or between the means of the groups was revealed (ideally, the experimental group outperforming the control group), the researcher would jump up and proclaim to everyone that it was the treatment or type of exposure that contributed to the results, which boils down to the equivalent of assuming that participants did exactly what was expected of them during the experiment and that all variables that could have potentially affected the results were well controlled. In other words, it was inherently assumed that there was high internal validity in the study, that we could safely place our confidence in the findings, and that we could, from a pedagogical perspective, incorporate the findings into our classroom activities, etc. It was only in the latter part of the 90s that the constructs of attention and awareness began to be addressed both methodologically and empirically in an effort to investigate their effects on L2 development. 
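To make the logic of the classic pretest > treatment > posttest design described above concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the scores are invented, the function names are hypothetical, and an exact permutation test on the difference between group means stands in for whatever test the statistical program would have run, which the description does not specify.

```python
from itertools import combinations
from statistics import mean


def mean_difference(experimental, control):
    """Difference between group means (experimental minus control)."""
    return mean(experimental) - mean(control)


def exact_permutation_p(experimental, control):
    """Two-sided exact permutation p-value for the difference in means.

    Every possible way of splitting the pooled scores into two groups of
    the original sizes is enumerated, so this is only practical for small
    samples such as the hypothetical ones below.
    """
    pooled = list(experimental) + list(control)
    n_exp = len(experimental)
    observed = abs(mean_difference(experimental, control))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_exp):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Count splits at least as extreme as the observed difference.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical raw scores, invented for illustration only.
pretest_experimental = [3, 4, 2, 5, 3, 4]
pretest_control = [4, 3, 3, 4, 2, 5]
posttest_experimental = [9, 8, 10, 9, 7, 10]
posttest_control = [4, 5, 3, 5, 4, 6]

# Pretest: the groups should be statistically similar before treatment.
pretest_p = exact_permutation_p(pretest_experimental, pretest_control)
# Posttest: a small p-value is what gets attributed to the treatment.
posttest_p = exact_permutation_p(posttest_experimental, posttest_control)

print(f"pretest p = {pretest_p:.4f}, posttest p = {posttest_p:.4f}")
```

With these invented scores the groups are indistinguishable at pretest and clearly different at posttest. Note that nothing in this product-based comparison shows what participants actually did during the treatment, which is exactly the assumption of high internal validity discussed above.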
While some studies employed offline measures, that is, after exposure (e.g., Robinson, 1996, 1997), other studies first methodologically established the constructs of attention and awareness before submitting the data to statistical analyses (e.g., Alanen, 1995; Leow, 1997). Data elicitation procedures such as online verbal reports or think-aloud protocols (in which participants were requested to say aloud whatever came into their head as they processed the L2 data, that is, while they interacted with the new grammatical or lexical information in the L2) began to be employed in many studies in an effort to gather online or concurrent insights into students’ cognitive processes. These cited studies also addressed levels of awareness. Similarly, in the Vygotskian sociocultural strand of research, a closely related phenomenon to think alouds was called inner speech (e.g., Vocate, 1994), and currently languaging, defined as “the process of making meaning and shaping knowledge and experience through language” (Swain 2006: 89). As can be seen, then, the use of concurrent data elicitation procedures in the research methodology signaled a distinct methodological shift in gathering both process (online) and product (offline) data in an effort to gather important information during and not only after the learning process. The outcome of this new methodological approach to operationalize and measure the constructs of attention and awareness is data that allow us to peek into the internal processes

learners employed while interacting with the L2 input, without having to rely on researchers’ assumptions of what actually took place based on the results of postexposure assessment tasks. Should I mention that this use of concurrent verbal reports triggered another strand of research that methodologically addressed and is still addressing whether asking participants to think aloud could potentially affect their thought processes? The buzzwords are reactivity (thinking aloud potentially affecting learner primary processes) for concurrent think-aloud protocols and veridicality (memory decay) for post-exposure, offline procedures, namely stimulated recalls that ask participants to try to recall what they were thinking at specific points during an interaction via a video of the interaction, or verbal reports that ask participants to provide an underlying rule embedded in the input. Now, with this background in mind, let us return to the issues of implicit and explicit learning, which are both internal processes. Several studies (e.g., de la Fuente, 2015; Hsieh et al., 2015; Leow, 2000; Martínez-Fernández, 2008; Medina, 2015; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007) that began to explore the role of awareness in L2 development initiated the opportunity to investigate both theoretically and empirically the role of explicit learning in the SLA field. The 2000s began with attempts to address the construct of unawareness in L2 development (e.g., Leow, 2000), followed by at least seven other studies purporting to do the same (cf. Chan & Leung, 2014; Chen, Guo, Tang, Zhu, Yang, & Dienes, 2011; Faretta-Stutenberg & Morgan-Short, 2011; Hama & Leow, 2010; Leung & Williams, 2011, 2012, 2014). Currently, like life (and politics), the studies are divided in their support or lack thereof regarding the role of implicit learning in L2 development. I will elaborate a bit more on this debate later in Chapter 10.

Conclusion

I hope you enjoyed the stroll down “awareness lane.” We shall leave this chapter with one empirically supported finding: Explicit learning is highly related to L2 development.

References

Alanen, R. (1995). Input enhancement and rule presentation in second language acquisition. In R. Schmidt (Ed.), Attention and awareness in foreign language learning and teaching (pp. 395–411). Honolulu: University of Hawai’i Press.
Anderson, J. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369–406.
Blau, E. (1982). The effect of syntax on readability for ESL students in Puerto Rico. TESOL Quarterly, 16, 517–527.
Chan, R., & Leung, J. (2014). Implicit learning of L2 natural language stress rules. Second Language Research, 30, 463–484.
Chen, W., Guo, X., Tang, J., Zhu, L., Yang, Z., & Dienes, Z. (2011). Unconscious structural knowledge of form-meaning connections. Consciousness and Cognition, 20, 1751–1760.
Chomsky, N. (1957). Syntactic structures. The Hague, The Netherlands: Mouton and Company.
Corder, S. (1967). The significance of learners’ errors. International Review of Applied Linguistics, 5, 161–170.
Davies, A. (1984). Simple, simplified and simplification: What is authentic? In J. Alderson & A. Urquhart (Eds.), Reading in a foreign language (pp. 181–195). New York: Longman.
DeKeyser, R. (2007). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 97–113). Mahwah, NJ: Erlbaum.
De la Fuente, M. (2015). Explicit corrective feedback and computer-based, form-focused instruction: The role of L1 in promoting awareness of L2 forms. To appear in R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Doughty, C., & Varela, E. (1998). Communicative focus on form. In C. Doughty & J. Williams (Eds.), Focus on form in classroom SLA (pp. 114–138). New York: Cambridge University Press.
Dulay, H., & Burt, M. (1973). Should we teach children syntax? Language Learning, 23, 245–258.
Dulay, H., & Burt, M. (1974). Natural sequences in child second language acquisition. Language Learning, 24, 37–53.
Ellis, N. C. (2005). At the interface: Dynamic interactions of explicit and implicit language knowledge. Studies in Second Language Acquisition, 27, 305–352.
Ellis, R. (2006). Modeling learning difficulty and second language proficiency: The differential contributions of implicit and explicit knowledge. Applied Linguistics, 27(3), 431–463.
Faretta-Stutenberg, M., & Morgan-Short, K. (2011). Learning without awareness reconsidered: A replication of Williams (2005). In G. Granena, J. Koeth, S. Lee-Ellis, A. Lukyanchenko, G. Prieto Botana, & E. Rhoades (Eds.), Selected proceedings of the 2010 Second Language Research Forum: Reconsidering SLA research, dimensions, and directions (pp. 18–28). Somerville, MA: Cascadilla Proceedings Project.
Gass, S. M. (1988). Integrating research areas: A framework for second language studies. Applied Linguistics, 9, 198–217.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Gass, S. M., & Selinker, L. (2008). Second language acquisition: An introductory course (3rd ed.). New York: Routledge.
Gregg, K. (1984). Krashen’s monitor and Occam’s razor. Applied Linguistics, 5, 79–100.
Hama, M., & Leow, R. P. (2010). Learning without awareness revisited: Extending Williams (2005). Studies in Second Language Acquisition, 32, 465–491.
Hatch, E., Shapira, R., & Wagner-Gough, J. (1978). ‘Foreigner talk’ discourse. ITL: Review of Applied Linguistics, 39–40, 39–59.
Henzl, V. (1979). Foreign talk in the classroom. International Review of Applied Linguistics, 17, 159–167.
Hilgard, E. R. (1962). Introduction to psychology (3rd ed.). New York: Harcourt, Brace and World, Inc.
Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). A comparison of level of awareness and depth of processing in two types of instructional media (C-FTF vs. CAI): Revisiting Hsieh (2008). To appear in R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Krashen, S. D. (1982). Principles and practice in second language acquisition. Oxford: Pergamon Press.
Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language Learning, 47, 467–506.
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware versus unaware learners. Studies in Second Language Acquisition, 22, 557–584.
Leow, R. P. (2009). Input enhancement and L2 grammatical development: What the research reveals. In J. Watzinger-Tharp & S. L. Katz (Eds.), Conceptions of L2 grammar: Theoretical approaches and their application in the L2 classroom (pp. 16–34). Boston, MA: Heinle Publishers.
Leung, J. H. C., & Williams, J. N. (2011). The implicit learning of mappings between forms and contextually derived meanings. Studies in Second Language Acquisition, 33, 33–55.
Leung, J. H. C., & Williams, J. N. (2012). Constraints on implicit learning of grammatical form-meaning connections. Language Learning, 62, 634–662.
Leung, J. H. C., & Williams, J. N. (2014). Crosslinguistic differences in implicit language learning. Studies in Second Language Acquisition, 29, 1–23.
Martínez-Fernández, A. (2008). Revisiting the involvement load hypothesis: Awareness, type of task and type of item. In M. Bowles, R. Foote, S. Perpiñán, & R. Bhatt (Eds.), Selected proceedings of the 2007 Second Language Research Forum (pp. 210–228). Somerville, MA: Cascadilla Proceedings Project.
McLaughlin, B. (1978). The monitor model: Some methodological considerations. Language Learning, 28, 309–332.
McLaughlin, B. (1987). Theories of second language learning. London: Edward Arnold.
Medina, A. (2015). The variable effects of level of awareness and CALL versus nonCALL textual modification on adult L2 readers’ input comprehension and learning. To appear in R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Omaggio, A. (1993). Teaching language in context (2nd ed.). Boston: Heinle & Heinle.
Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam: John Benjamins.
Parker, K., & Chaudron, C. (1987). The effects of linguistic simplification and elaborative modifications on L2 comprehension. The University of Hawaii Working Papers in ESL, 6, 107–133.
Robinson, P. (1995). Review article: Attention, memory and the ‘noticing’ hypothesis. Language Learning, 45, 283–331.
Robinson, P. (1996). Learning simple and complex second language rules under implicit, incidental, rule-search and instructed conditions. Studies in Second Language Acquisition, 18, 27–68.
Robinson, P. (1997). Individual differences and the fundamental similarity of implicit and explicit adult second language learning. Language Learning, 47, 45–99.
Rosa, E., & O’Neill, M. D. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21, 511–556.
Rosa, E. M., & Leow, R. P. (2004). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25, 269–292.
Sachs, R., & Suh, B-R. (2007). Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.), Conversational interaction in second-language acquisition: A series of empirical studies (pp. 197–227). Oxford: Oxford University Press.
Schachter, J. (1974). An error in error analysis. Language Learning, 24, 205–214.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Scott, V. (1989). An empirical study of explicit and implicit teaching strategies in French. Modern Language Journal, 73, 14–22.
Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 219–231.
Shaffer, C. (1989). A comparison of inductive and deductive approaches to teaching foreign languages. The Modern Language Journal, 73, 395–403.
Sharwood Smith, M. (1981). Consciousness-raising and the second language learner. Applied Linguistics, 2, 159–168.
Sharwood Smith, M. (1993). Input enhancement in instructed SLA: Theoretical bases. Studies in Second Language Acquisition, 15, 165–179.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
Swain, M. (2006). Languaging, agency and collaboration in advanced second language proficiency. In H. Byrnes (Ed.), Advanced language learning: The contribution of Halliday and Vygotsky (pp. 95–108). London: Continuum.
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–203.
Truscott, J., & Sharwood Smith, M. A. (2004). Acquisition by processing: A modular approach to language development. Bilingualism: Language and Cognition, 7, 1–20.
Truscott, J., & Sharwood Smith, M. A. (2011). Input, intake, and consciousness: The quest for a theoretical foundation. Studies in Second Language Acquisition, 33, 497–528.
VanPatten, B. (1996). Input processing and grammar instruction: Theory and research. Norwood, NJ: Ablex.
VanPatten, B. (2004). Input processing in SLA. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 5–31). Mahwah, NJ: Lawrence Erlbaum.
VanPatten, B. (2007). Input processing in adult second language acquisition. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 115–135). Mahwah, NJ: Lawrence Erlbaum.
Vocate, D. R. (1994). Self-talk and inner speech: Understanding the uniquely human aspects of intrapersonal communication. In D. R. Vocate (Ed.), Intrapersonal communication: Different voices, different minds (pp. 3–32). Hillsdale, NJ: Erlbaum.
Williams, J. N. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, 269–304.

SECTION 1

Theoretical Foundations

2 A PRELIMINARY THEORETICAL FRAMEWORK FOR THE L2 LEARNING PROCESS IN SLA

Theory is usually perceived as abstract and, as we shall see later, even the construct of learning (and even more so the dual phrase of implicit learning) can also be challenging to pin down adequately. Before I discuss explicit learning, or learning with awareness, in SLA, it is necessary to situate its role along the stages postulated to occur during the learning process. This chapter, then, presents a finer-grained preliminary framework of the L2 learning process that takes into account the notion of postulated stages through which the learning process passes. It also elaborates on the process of learning as comprising both processes and products. To this end, a brief description of the different stages along the framework is provided, including input, input processing, intake, intake processing, internal system, output/knowledge processing, and output. This global and theoretical view of the learning process has the following main purpose: It allows us to be visually aware of the stage along the learning process at which the construct of learning is being discussed and investigated. But first, as an introduction, let us take a look at a coarse-grained theoretical framework for the L2 learning process in SLA.

A Coarse-Grained Theoretical Framework for the L2 Learning Process in SLA

In SLA, most major theoretical perspectives (I will discuss each individually later in Chapter 5), whether they propose a partial (e.g., Ellis, 2007; McLaughlin, 1987; Robinson, 2003; Schmidt, 2001; Swain, 2005; Tomlin & Villa, 1994; VanPatten, 2007) or full (e.g., Gass, 1997) theoretical account of the learning processes postulated for SLA, agree on the following coarse-grained framework:

INPUT > Intake > Internal system > Output

in which INPUT is the second or foreign language (L2), which viewed more narrowly contains the linguistic and semantic features learners need to pay attention to; Intake is a subset of input that may be taken in by the learner but not necessarily processed further (it can disappear!) into the Internal system; the Internal system is the place where what is learned, correctly or incorrectly (often referred to as knowledge), is stored; and Output is the learner’s production of the L2 and is assumed to represent the L2 knowledge the learner has at that point in time in his/her internal system.

Now, let us discuss briefly the issue of your level of awareness involved in processing this coarse-grained theoretical framework. From a researcher’s perspective, I am positive we can find several levels of awareness: (1) Some of you noticed, that is, paid attention with a low level of awareness to, the fonts of INPUT, Intake, and Output. That is it! (2) Some noticed the fonts and thought, “Interesting, hmm, he is being fancy here,” without arriving at any connection to the purpose of the fonts or what they represented, and (3) MANY of you made the full or partial connection of INPUT as being a large amount of the L2 that our students are usually exposed to, noted that only a subset, represented by a smaller font, is actually taken in, and that output is clearly only a subset of what is premised to be stored in the developing internal system. As a researcher, I can also surmise that readers with some background knowledge of SLA would have processed this information at a deeper level and shown a higher level of awareness because they made connections to knowledge already stored in their internal system. This process is not learning but activation of prior knowledge, a strengthening of the cognitive bonds between incoming information and existing knowledge.
Readers with no background do have the potential to learn and would be able to hold this information in working memory, with the potential of the information being discarded or further processed by deeper processing and/or further similar information. Please note that for humanistic purposes, I have not identified the reader(s) who either paid little attention to or minimally processed or showed no awareness of the presence of these different fonts (they were only skimming or paying peripheral attention).

A Finer-Grained Theoretical Framework for the L2 Learning Process in SLA

The framework presented above is relatively coarse-grained, and I have proposed a more fine-grained version that takes into account not only the notion of stages through which the learning process passes but also the notion of learning to include both processes and products (cf. Leow, 2015):

Stages of the Learning Process in SLA: Of Processes and Products

INPUT { > INTAKE > INTERNAL SYSTEM > } OUTPUT

INPUT: product (input)
Stage 1: process (input)
Stage 2 (INTAKE): product (intake)
Stage 3: process (intake)
Stage 4 (INTERNAL SYSTEM): product (L2 knowledge)
Stage 5: process (L2 knowledge/output)
OUTPUT: product (representative L2 knowledge)

As can be seen, Input and Output are external products, whereas there are minimally five internal stages comprising three processes (input processing, intake processing, and knowledge processing) and two products (intake and L2 knowledge). Learning as a process, which occurs internally, takes place at Stages 1 (input processing), 3 (intake processing), and 5 (L2 knowledge processing), while learning as a product (what is learned) is represented internally at Stage 4 (L2 knowledge) and externally as representative L2 knowledge. Knowledge at this point is termed “representative” L2 knowledge, given that it is accepted in the field that learner output does not reflect the totality of what is stored in the learner’s developing L2 grammar. Stage 2 represents intake as an initial product kept in working memory that may be retrieved via receptive tests (e.g., recognition or multiple-choice tests) but has yet to be further processed and internalized or learned. I am going to elaborate briefly on each process and product below. They will be further elaborated in later chapters.
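The breakdown into stages, processes, and products described above can also be written down as a small data structure. The following is only an illustrative sketch in Python: the names `Step`, `FRAMEWORK`, and `stages_of` are hypothetical and are not part of the framework itself, while the stage numbers and the process/product labels simply follow the description in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    label: str                   # name used in the chapter
    kind: str                    # "process" or "product"
    internal: bool               # True for the five internal stages
    stage: Optional[int] = None  # stage number; None for external input/output


# Hypothetical encoding of the framework, in the order described in the text.
FRAMEWORK = [
    Step("input", "product", internal=False),
    Step("input processing", "process", internal=True, stage=1),
    Step("intake", "product", internal=True, stage=2),
    Step("intake processing", "process", internal=True, stage=3),
    Step("L2 knowledge", "product", internal=True, stage=4),
    Step("knowledge processing", "process", internal=True, stage=5),
    Step("output (representative L2 knowledge)", "product", internal=False),
]


def stages_of(kind):
    """Return the internal stage numbers that are of the given kind."""
    return [s.stage for s in FRAMEWORK if s.internal and s.kind == kind]


print("learning as a process:", stages_of("process"))
print("internal products:", stages_of("product"))
```

Laid out this way, it is easy to check which stage a given study is addressing: a recognition test taps the product at Stage 2, whereas a production task taps the external product that follows Stage 5.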

Input

L2 input broadly refers to the second or foreign language learners are exposed to, be it aural or written. Input may be authentic, that is, used by native speakers for oral and written communication, or it may be pedagogical, that is, modified for L2 learners, mostly for use in the formal classroom setting.

Input Processing

Input processing usually refers to the processing of both content and linguistic data found in the input. Input processing is postulated to represent the initial stage of the learning process and is theoretically placed at the input-to-intake stage of the learning process. This initial process finds its roots in the information-processing strand of research in the cognitive psychology field in the 1970s, which viewed the human mind as a kind of processor constantly engaging in mental processes (cf. McLaughlin, 1987). The notion of input processing in SLA, then, appears to draw from the metaphor of a limited capacity channel or processor (e.g., Broadbent, 1958;
Kahneman, 1973; Norman, 1968; Treisman, 1964), postulated by what were called capacity theories. I will elaborate more fully on these theories in Chapter 3, but what is important to note here is the general idea that (1) there is competition for attentional resources to be paid to incoming information, (2) what is paid attention to may depend on the amount of mental effort required to process the incoming information, and (3) the allocation of attentional resources to incoming information may come from a pool of cognitive resources. Hopefully to pique your interest, I am going to take a slightly different take on the notion of attentional resources as a central function of input processing, but you need to wait until later chapters, as this take unfolds during the chapters. If you cannot wait, you may skip to Chapter 12. Even though input processing in SLA is well known and discussed, it is not as straightforward as it may seem. While some theoretical perspectives view such initial processing of the L2 data as minimally dependent upon attention and some low level of cognitive effort, depth of processing, awareness or lack thereof (McLaughlin, 1987; Robinson, 2003; Schmidt, 2001; Tomlin & Villa, 1994), others appear to assign higher depths of processing during this initial stage (Gass, 1997; Gass & Selinker, 2008; Swain, 2005; VanPatten, 2007). However, what appears uncontroversial in input processing is that L2 learners need to employ selective attention minimally in order to isolate and process some content information or linguistic feature(s), with or without awareness, in the incoming input.

Intake

The earliest reference to the concept of intake was made by Corder (1967), who pointed out that there is a fundamental difference between input and intake and that not all input may be attended to by the learners, which appears to fall neatly under the metaphor of L2 learners as limited capacity processors of information (cf. Leow, 2012). As mentioned above, most major theoretical SLA frameworks have adopted this distinction between input and intake by positing at least one intermediate stage through which the input L2 learners receive and process must pass before any or all of it can be learned (or acquired). Given that intake occurs before any learning is assumed to take place, intake crucially does not represent internal L2 knowledge, which occurs further along the learning process. However, as pointed out in Leow (2012), what constitutes intake is not clearly defined in the field. Here we go again! Faerch and Kasper (1980) proposed two types of intake: (1) intake for communication and (2) intake for learning, which refers specifically to an eventual change in the learners’ interlanguage or current state of linguistic knowledge. Gass (1988; cf. also Gass, 1997) defined intake as a process that assimilates linguistic material and involves mental psycholinguistic activity prior to being incorporated into the L2 learner’s interlanguage. Chaudron (1985) viewed intake as part of a series of cognitive stages (preliminary intake to final intake) through which input passes until it is fully incorporated
in the L2 learners’ grammar. Slobin (1985) also proposed two types of processes: (1) those involved in converting input into stored data that may be used for constructing language, and (2) those used to organize stored data into linguistic systems. The first type of process is what has been empirically addressed in most studies that have investigated the effects of some variable on learners’ intake. This type of intake has been defined as follows:

[T]hat part of the input that has been attended to by the second language learners while processing the input. Intake represents stored linguistic data which may be used for immediate recognition and does not imply language acquisition. (Leow, 1993: 334)

Similarly, VanPatten (2004: 7) writes that

intake is the subset of input that has been processed in working memory and made available for further processing (i.e., possible incorporation into the developing system) . . . I do not use intake to refer to internalized data.

VanPatten goes on to postulate that any linguistic data perceived or noticed but not processed in terms of making the form-meaning/function connection is dropped from further processing (p. 9). Based on the different perspectives of what constitutes the construct of intake, it appears that (1) due to L2 learners’ cognitive, attentional constraints, only a subset of input can be converted into intake, (2) not all intake is further processed, and (3) what is processed may be incorporated into the developing L2 grammar. Perhaps these three stages faithfully reflect the differences in our students’ subsequent performances after exposure to new linguistic data. Intake, then, can be viewed from different angles, dependent upon the stage along the learning process and in association with the depth of processing involved. Consequently, initial intake based on simple recognition is postulated to occur before any learning is assumed to take place (Leow, 2012).

Intake Processing

Given the theoretical perspective that not all intake is further processed, it is not difficult to assume that intake can be viewed and measured as a product (Stage 2) that can be either discarded from working memory or processed further (Stage 3) for potential incorporation into learners’ internal system (Gass, 1997; Slobin, 1985). Postulated to function in this intake processing component are variables that include data-driven processing and conceptually-driven processing (Robinson, 1995), form-meaning connection (VanPatten, 2004), hypothesis formation and testing, hypothesis rejection, hypothesis modification, and hypothesis confirmation (Gass, 1997).

Internal System

This is the location where the L2 knowledge is stored in the brain. Gass (1988, 1997) and Gass and Selinker (2008) postulate that there are at least two outcomes that are derived from the intake processing stage, both of which are a form of integration (Stage 4). According to Gass, one is the development per se of a learner’s second language grammar and the other is storage of some linguistic data awaiting additional information before it becomes integrated. Integration is continuous and the integration component does not function as an independent unit, given that the model “is dynamic and interactive, with knowledge itself being accumulative and interactive” (Gass, 1997: 25). Important variables involved in integration include different levels of analysis and reanalysis from storage into the grammar and within the grammar itself. L2 knowledge, then, may be viewed and measured as a product in progress or fully integrated in the internal system.

Knowledge Processing

Knowledge processing (Stage 5), which is the final stage of the continuum of internal processing, is one area that has not received much investigation (cf. Swain’s Output Hypothesis, 2005). This stage deals with learners’ manipulation of the L2 linguistic knowledge, together with other knowledge bases that govern, for example, phonological, syntactic, semantic, cultural, pragmatic, and discourse features that register aspects of the L2 that are employed to produce the L2. Depending upon level of language proficiency, this stage may be characterized by levels of fluency and accuracy.

Output

Output is the L2 data (product) assumed to reflect different linguistic aspects of learners’ internal grammar system or interlanguage. It is any visual or oral manifestation or grammatical description of the learned L2 knowledge.

Conclusion

This chapter has presented a finer-grained preliminary framework that takes into account the notion of postulated stages through which the learning process passes. It also elaborates on the process of learning as comprising both processes and products. The notion of stages permits us to be visually aware of the stage along the learning process at which the construct of learning is being discussed and investigated. Viewing the learning process in terms of processes and products is important to differentiate between learning as a process, that is, an event taking place, and learning as a product, that is, something learned or internalized or
produced. The framework, as will be elaborated in Chapter 12, can be used to empirically address and report each process or product. Let us now visit theoretical models outside the SLA field that have impacted many of the theoretical underpinnings in SLA.

References

Broadbent, D. (1958). Perception and communication. London: Pergamon Press.
Chaudron, C. (1985). Intake: On models and methods for discovering learners’ processing of input. Studies in Second Language Acquisition, 7, 1–14.
Corder, S. (1967). The significance of learners’ errors. International Review of Applied Linguistics, 5, 161–169.
Ellis, N. C. (2007). The associative-cognitive CREED. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 77–95). Mahwah, NJ: Lawrence Erlbaum.
Faerch, C., & Kasper, G. (1980). Processes and strategies in foreign language learning and communication. The Interlanguage Studies Bulletin Utrecht, 5, 47–118.
Gass, S. M. (1988). Integrating research areas: A framework for second language studies. Applied Linguistics, 9, 198–217.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Gass, S. M., & Selinker, L. (2008). Second language acquisition: An introductory course (3rd ed.). New York: Routledge.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall.
Leow, R. P. (1993). To simplify or not to simplify: A look at intake. Studies in Second Language Acquisition, 15, 333–355.
Leow, R. P. (2012). Intake. In P. Robinson (Ed.), The Routledge encyclopedia of second language acquisition (pp. 327–329). New York: Taylor & Francis.
Leow, R. P. (2015). Implicit learning in SLA: Of processes and products. In P. Rebuschat (Ed.), Implicit and explicit learning of languages. Amsterdam: John Benjamins.
McLaughlin, B. (1987). Theories of second language learning. London: Edward Arnold.
Norman, D. A. (1968). Toward a theory of memory and attention. Psychological Review, 75, 522–536.
Robinson, P. (1995). Attention, memory and the ‘noticing’ hypothesis. Language Learning, 45, 283–331.
Robinson, P. (2003). Attention and memory in SLA. In C. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 631–678). Oxford: Blackwell.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–32). New York: Cambridge University Press.
Slobin, D. (1985). Crosslinguistic evidence for the language-making capacity. In D. Slobin (Ed.), The crosslinguistic study of language acquisition: Theoretical issues (Vol. 2, pp. 1157–1249). Hillsdale, NJ: Lawrence Erlbaum.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 471–483). Mahwah, NJ: Lawrence Erlbaum.
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–203.
Treisman, A. (1964). Verbal cues, language, and meaning in selective attention. American Journal of Psychology, 77, 206–219.
VanPatten, B. (2004). Processing instruction: Theory, research, and commentary. Mahwah, NJ: Lawrence Erlbaum.
VanPatten, B. (2007). Input processing in adult second language acquisition. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 115–135). Mahwah, NJ: Lawrence Erlbaum.

3 THEORETICAL FOUNDATIONS FOR THE ROLE OF ATTENTION IN LEARNING FROM NON-SLA FIELDS

One of the phrases we heard as we grew up, especially in the classroom setting, was, “Pay attention!” Yet have we thought seriously about what this admonition really means? Have we paused to contemplate the value of attention in every single task that we perform in our lives? Are we aware that when we attend to some information in the L2 input we raise our perception, which may then lead to some of the information being taken into our short-term or working memory, which may then lead to potential internalization of such information, ultimately leading to learning and remembering? Indeed, what do we mean when we use the term “attention”? Do we mean it is a single entity or mechanism that originates from one or more pools of resources, if viewed from a psychological perspective? Or perhaps it comprises more than one entity or mechanism, if viewed from a cognitive science and neuroscience perspective, each associated with a finite set of modal-specific brain processes, all working together with other brain processes to fulfill specific tasks. As far back as 1890, William James provided the well-cited definition of attention as “the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration, of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state which in French is called distraction, and Zerstreutheit in German” (pp. 403–404). Clearly, the process of attention may not be as simple as saying, “Pay attention,” and it is useful to be aware of the theoretical foundations of this process before addressing how it is viewed in the SLA field.
The SLA research field is, relatively speaking, like a baby when compared to other fields, and we like to assume that the other fields are more established in their research paradigms and frankly know what they are doing (well, sort of, as
you will read later on). Consequently, many second language acquisition (SLA) researchers have looked mainly to the field of cognitive psychology or science and cognitive neuroscience to provide an explanation of or theoretical account for the role cognitive processes play in SLA (e.g., Bialystok, 1978; DeKeyser, 2007; Robinson, 1995; Schmidt, 1990; Truscott & Sharwood Smith, 2011; VanPatten, 2004). Given that this book is situated within the human information processing framework of attention and/or awareness in the learning process, I shall focus on such models and even provide an overall description of the process before we begin to discuss the various tenets of non-SLA attentional models. In this way, we can use this general knowledge to access the finer details of said models. To this end, this chapter discusses succinctly some of the major theoretical models of attention in cognitive psychology/science and neuroscience. From cognitive psychology/science, these models include filter theories, capacity and non-capacity models (including the notions of selective and focal attention), controlled versus automatic processing, and Wickens’ model of the structure of multiple resources. From neuroscience, Posner’s research on the relationships between attentional networks and other cognitive networks in the brain is reported. The chapter also discusses the process of attention in relation to short- and long-term memory, working memory, and whether learning without attention is possible.

Overview of an Attentional Model of Information Processing

The first step is perception of information that is conveyed by our senses. Some researchers (e.g., Gass, 1997, 1998) go further and associate such perception with some type of prior knowledge related to the sensory data received (apperception). A selection of some aspect(s) of the sensory data (via peripheral, selective, or focal attention) is then made, potentially based on what is perceived as important to the learner. This selected information (intake?) enters into the learner’s working memory (read short-term memory) and can potentially remain there for some time or be discarded from memory. For this selected information to move forward into the learner’s internal system (learning?), it needs to remain in this stage for at least some time to be further processed or rehearsed. What has been further processed may then be available for output (production). Here is another version of the same process with different terminology. External stimuli activate memory representations that remain in memory for a short period of time. Type of attention (peripheral, selective, focal) determines the level of activation and quality of these representations and allows this information to be available across several networks in the brain. Now that we have a general idea of the basic sequence of information processing, let us discuss succinctly some of the major theoretical models of attention in cognitive psychology/science and neuroscience. Keep in mind that these theories
were generally supported by visual attention and take a close look at the measures employed to address the role or function of attention.

Cognitive Psychology/Science

Filter Theories

The early theories of attention were what were known as the filter theories of attention (e.g., Broadbent, 1958; Norman, 1968; Treisman, 1964), although attentional theories date back to 18th-century philosophy (cf. Neumann, 1996, for an excellent review of attentional theories). Let us take a closer look at Broadbent’s influential filter theory, since most reviews of theory building do not report the assumptions underlying the tenets of a given theory. This theory actually originated from Broadbent’s interest in studying the working environment in an aviation control tower in which flight controllers were simultaneously communicating vocally with pilots in several planes. The inspirations of theory building are quite interesting (cf. Wickens, 1980, 1989, 2007, discussed below), aren’t they? Simulating this scenario, Broadbent presented his participants with verbal questions that they needed to answer based on information visually presented (Broadbent, 1952a, 1952b, 1952c). He found that providing a temporal overlap between two different messages impeded participants’ performance, but that informing participants that one of the two messages was irrelevant lessened this interference. So, how does this translate into theory? Well, he postulated three tenets of his filter theory. First, interference was surely due to a limited capacity central channel, which could only handle so much information at the same time (cognitive overload). Second, a filter was postulated that would reduce the potential of unwanted information entering the central channel by preventing this information from moving forward.
Finally, based on the results of ‘split-span’ or dichotic experiments (in which different sequences, for example three-digit sequences, are presented simultaneously to each ear) that revealed participants’ ability to recall first the digits presented to one ear and then those presented to the other (Broadbent, 1958), Broadbent postulated the role of short-term memory, since the second set of digits was presumably held in a short-term storage system. Filter theories, then, viewed the processing of incoming information as moving along a serial path comprising several storage structures (e.g., sensory register > detection device > short-term memory). More specifically, Broadbent’s (1958) model postulated that as information passes through an early sensory register, a selective filter selects specific information based on the form of the message and conveys this information to a detection device that, at this point, assigns semantic value to the message before it is encoded into short-term memory. Broadbent’s model, mainly based on acoustic processing, postulated that one basis on which events might be selected to pass through the attentional channel was their “physical intensity” (p. 297).
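The serial path just described (sensory register > selective filter > detection device > short-term memory) can be sketched as a simple pipeline. This is a minimal illustration under stated assumptions, not Broadbent’s own formalization: the function names are hypothetical, the ear (“channel”) stands in for the physical form of the message, converting a digit string to a number stands in for assigning semantic value, and the `span` parameter is an invented stand-in for the limited size of short-term storage.

```python
from collections import deque


def sensory_register(messages):
    """Briefly hold everything that arrives, unanalysed for meaning."""
    return list(messages)


def selective_filter(registered, attended_channel):
    """Select on the form of the message (assumption: the ear it arrived at)."""
    return [m for m in registered if m["channel"] == attended_channel]


def detection_device(selected):
    """Assign semantic value only to what passed the filter (toy: digit -> int)."""
    return [int(m["signal"]) for m in selected]


def short_term_memory(detected, span=3):
    """Keep at most `span` items; older items drop out (assumed limit)."""
    store = deque(maxlen=span)
    store.extend(detected)
    return list(store)


def broadbent_pipeline(messages, attended_channel, span=3):
    """Run the stages strictly in series, as filter theories assume."""
    registered = sensory_register(messages)
    selected = selective_filter(registered, attended_channel)
    detected = detection_device(selected)
    return short_term_memory(detected, span)


# Hypothetical dichotic input: three digits to each ear.
dichotic = [
    {"channel": "left", "signal": "7"}, {"channel": "right", "signal": "2"},
    {"channel": "left", "signal": "3"}, {"channel": "right", "signal": "8"},
    {"channel": "left", "signal": "9"}, {"channel": "right", "signal": "4"},
]
print(broadbent_pipeline(dichotic, "left"))
```

In this sketch the unattended message never reaches the detection device at all, which is exactly the restrictive property that later models took issue with.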

Capacity in this model, according to Neumann (1996), was conceptualized as the transmission capacity of a channel, while it was a filter that performed the selection by blocking or attenuating (reducing) information from moving forward. This model was aptly called the “bottle-neck” model, if you imagine the selective filter aspect of the model as two lanes on a highway becoming one lane. It is interesting to see not only how each successive model began to build upon the limitations or critiques of the previous ones, but also how additional factors began to appear alongside the construct or process of attention. Treisman (1964), in her attenuated filter model, argued that Broadbent’s model was too restrictive, since findings gleaned from dichotic listening tasks (again!) revealed that both the attended information and the information provided in an “unshadowed” ear were registered by participants. She postulated that her selective filter, or what she called attenuation control, unlike Broadbent’s model, allows for processing of all information (both the form and meaning of the message) before being relayed to the detection device. Treisman’s model was subsequently critiqued for the potential for cognitive overload occurring during such an early pre-attentive stage of the attentional process, and this led to what is called the late selection model (Norman, 1968) that removed the previous storage structures of selective filter and attenuation control. A late selection model views all incoming information as being processed in parallel, and a decision to process it beyond short-term memory is made in this very storage structure based on its importance. Whatever aspect of the information is deemed to be important is further elaborated or rehearsed, while the rest of the information is quickly discarded.
A simple term for this phenomenon is what is known as “selective or focal attention” versus “peripheral” attention, which may be broadly exemplified by the description of “glancing out of the corner of your eye.” Logical extensions of Norman’s (1968) late selection model of attention and sensory processing led to what are called capacity models, which have provided quite a strong foundation for several theoretical underpinnings in the SLA field.

Capacity Models

Three features of the filter theory began to emerge, namely, (1) the metaphor of a limited capacity channel (cf. the L2 learner as a limited capacity processor of incoming information), (2) learners’ voluntary control of the deployment of their limited attentional resources toward the incoming information, and (3) the amount of effort related to the nature of the task being performed. Put another way, there was the assumption that the human brain could only handle so much information at any one given time and, consequently, due to insufficient resources and the processing limitations of the human brain, incoming information was selected by the attentional system. Think of computer technology, upon which the capacity perspective was built, by the way, in the 1960s, when its capacity was relatively limited when compared to current computer capacities
today. Such attention and processing were also affected by what specifically the learners were doing. While capacity theories agree that there is competition for attentional resources to be paid to incoming information, they went a step further by postulating that what is paid attention to may depend on the amount of mental effort required to process the incoming information. According to Neumann (1996), Kahneman’s (1973) capacity model of attention, in which capacity was now conceptualized as a general, unspecified processing capacity, became the dominant idea and led to the dual-process theory of controlled versus automatic processes (e.g., Posner & Snyder, 1975; Shiffrin & Schneider, 1977), to be discussed later. The metaphor for capacity now began to be one of a supplier, while selection was its allocation. This model, dependent on the learner’s state of arousal, postulated the allocation of attentional resources from a pool of cognitive resources to incoming information. Whereas the filter theories viewed an inevitable competition for the allocation of attentional resources for incoming information, Kahneman’s capacity model allows the possibility of dividing the allocation of resources to different aspects of incoming information. According to Kahneman, performance may not be negatively affected once the state of arousal is adequate and the task demands are not overwhelming. Put another way, the filter theory is like giving our kids money (attentional resources with the assumption that we have limited cash flow) and telling them to buy only one item (in the store/input), while the capacity theory is telling them that they may buy more than one item (in the store/input with the assumption that we have unlimited cash flow), and this is dependent upon whether they are interested in doing so, that is, going to the store in the mall. 
Measures employed to test the unspecified capacity notion included responses gleaned from a probe stimulus (e.g., pressing a button when a tone was heard), which were compared to similar responses obtained when the probe stimulus was presented simultaneously with the identification of a visual stimulus that was supposedly engaging participants’ processing system. However, the concept of unspecified capacity fell in popularity due to evidence indicating that simultaneously performing two tasks that employed the same processes did not suffer from interference (e.g., Posner & Boies, 1971). This concept was then replaced by the notion of multiple, specific resources, while maintaining the notion of effort and diminishing the focus on selection. To this end, Wickens’ (e.g., 1980, 1989; cf. also Allport, Antonis, & Reynolds, 1972; Navon & Gopher, 1979) influential model of the structure of multiple resources (later known as the Multiple Resources Model of divided attention to task demands, cf. Wickens, 2007) expanded Kahneman’s (1973) single pool of attentional resources model of attention to include the allocation of attentional resources from multiple pools and an additional focus on the nature of the task being performed. Based on the different locations of these attentional resources along three intersecting dimensions of resource systems (cf. Wickens, 1980, for further elaboration), Wickens argued that the difficulty level of two tasks

performed simultaneously may depend on whether the attentional resources are coming from the same pool (serial processing) or different pools (parallel processing). For example, serial processing (try participating in two conversations at the same time) is a much more demanding task than parallel processing, which may be exemplified by driving a car and reading the billboards at the same time. However, Wickens concedes that concurrent processing may be possible in serial processing if one of the tasks has been automatized, that is, it has been practiced many times, thus freeing up additional resources for the other task. What needs to be kept in mind (and if you recall Broadbent’s source of his “bottle-neck” model) is that this Multiple Resources Model (together with the SEEV (selection, effort, expectancy, and value) model of selective attention that also addressed the early stage of the information processing sequence, cf. Wickens, Goh, Helleberg, Horrey, & Talleur, 2003) was initially developed to represent attention via scanning in dynamic visual worlds such as driving (Horrey, Wickens, & Consalus, 2006) and flying (Wickens et al., 2003). In addition, the early resource theories have been critiqued for their use of dual-task data that may rely on too small a number of basic resources to address observed patterns of interference or to be inadequate to tease out performance-resource functions, unless it is known in advance that two tasks are both subserved by the same resource (Neumann, 1996). It is also interesting to note that the 1980s witnessed the explosion of studies investigating different and specific attentional mechanisms and their functions, evidenced in the field of visual attention, the birth of the cuing paradigm, and the emerging impact of connectionism on attention theory by making an association between attention and neural activity in specific parts of the brain.

Non-Capacity Models

It is to be noted that other attentional models began to question the overall concept of limited attentional capacity, that is, the assumption that the need for selection is a consequence of such a capacity. For example, instead of focusing on the interference in processing resulting from the so-called limited capacity, Sanders’ (1983) cognitive-energetical model approached the issue from a psychophysiological perspective and zeroed in on the local aspect of attention (e.g., investigating the effect of sleep loss on arousal) instead of a global one that attempts to explain all types of attentional phenomena. Neisser (1976) questioned the existence of any relationship between selective attention and brain capacity and, like Sanders, viewed interference in the performance of dual tasks not as a competition for limited resources but as local difficulties arising from the performance of one task having an effect on the other (cf. Allport, 1980, for relatively similar views). This line of theorizing on interferences resulting from problems of coordination and control was continued (e.g., Allport, 1993), but now these problems were not viewed as the causes of interference. According to these researchers, there are specific mechanisms that are designed to deal with these problems, if

one were to consider the functional characteristics of the central nervous system (cf. the modularity of the brain, the potential for massive parallel processing, etc.); they provide the need for selection, while limited capacity is a byproduct of such selection (cf. Neumann, 1996). It is noted that these theories were mainly based on visual attention.

Summary

These attentional theories, discussed above, arguably underlie in some way several of the theoretical underpinnings postulated for the L2 learning process in SLA, with special focus on the perception that the basic feature of attention is the concept of limited (un)specified capacity and that the main function of selection was to alleviate potential cognitive overload taking place due to this capacity. However, it is important to note that three important trends began to take place in the 1980s, namely, the expansion of additional functional roles for selection beyond alleviating the limited capacity, a shift from dual-task interference to sensory attention (and especially the visual modality), and the emerging impact of connectionism on attention theory (Neumann, 1996). Indeed, connectionist models of visual attention began to view the effect of attention as one more piece added to the already existing units in the brain that correspond to the selected stimuli, thereby reinforcing the strength of these units. As a consequence, studies began to address specific attentional mechanisms and their accompanying functions, and the popular cuing paradigm (e.g., Posner, 1980; Posner, Snyder, & Davidson, 1980) began to be employed in the field of visual attention to investigate learner attentional shifts and their effects. I will elaborate below on this new shift in focus.

Neuroscience

The most cited source in this field of neuroscience in relation to attention is the work of Posner and his colleagues (Posner, 1992, 1994, 1995; Posner & Petersen, 1990). Note that these neuroscientists were primarily interested in (1) identifying the locations of different attentional functions in the brains of both humans and animals, (2) examining the relationships between attentional networks and other cognitive networks, and (3) using this information to treat pathologies linked to attentional disorders. Three main attentional networks (posterior, anterior, and vigilance) were identified in the brain and associated with their individual functions (Posner & Petersen, 1990) in the process of attention. As reported in Simard and Wong (2001), the posterior network is found in portions of the parietal cortex, associated thalamic areas of the pulvinar and reticular nuclei, and parts of the midbrain’s superior colliculus, and its associated attentional function is to orient attention to sensory stimuli, especially locations in visual space. The anterior network

is located in the areas of the mid-prefrontal cortex, including the anterior cingulate and the closely related but more superior supplementary motor area, and its associated attentional function is to detect target events, whether sensory or from memory. The vigilance network involves norepinephrine input to the cortex and is most active in the right frontal lateral lobe, and its associated attentional function is to maintain the alert state. That was tough to process, right? Just take away the information that the prefrontal cortex is assumed to play an important role in learning, as you will see later. Let us now take a quick look at the experimental methodology employed to investigate the process of attention in this field. One of the more popular techniques over a decade ago was positron emission tomography (PET), which, like the fMRI technique currently employed in this field, provides scans of brain activity as a person or animal is performing an activity. Detectors are placed on the head, a small amount of radioactivity is introduced into the body, and the behavior of the blood flow in different brain regions is registered based on the type of activity. For example, when the animal or human being had to orient their attention from one visual location to another, an increase of blood flow was registered in the components of the posterior network, signaling that this network was activated to perform this action. Similarly, when a target stimulus was detected, blood flow increased in the anterior network, signaling activation of this network. The attentional function of alertness or being in an alert state was found in the vigilance network in the right frontal lobe. One major feature of neuroscience is the belief that all networks or nodes are interrelated and that one activity in one network or node affects similarly related networks or nodes, which results in faster and more efficient processing. 
According to Posner and his colleagues, in spite of the separate attentional functions of the three networks, they are anatomically linked and consequently work together to carry out their individual main functions of alerting, orienting, and detecting. They also caution that the networks may function independently under special circumstances and that this may depend upon the nature of the task and the amount of cognitive effort required to perform it. Posner and his colleagues’ work inspired Tomlin and Villa’s (1994) model of input processing in SLA.

Summary

Up to this point, we have discussed the early attentional models that have impacted several of our current SLA theoretical underpinnings. As you may have noticed, I have also highlighted the what (perceptual or visual attention to nonlanguage stimuli) and the how (type of tasks employed such as dual or dichotic tasks) given that, in some instances, extrapolating the postulations and empirical findings of non-SLA fields to that of SLA needs to be viewed with some caution. Now let us discuss the issue of memory.

Attention and Memory

Attention is also linked to some kind of memory, and it is useful to touch on two types of memory stores employed in capacity theories, namely, short-term memory store (STM) vs. long-term memory store (LTM).

Short-Term Memory Store (STM) vs. Long-Term Memory Store (LTM)

Capacity theories are rooted in two important distinctions to describe skills development and performance: (1) short-term memory store (STM) vs. long-term memory store (LTM) and (2) controlled versus automatic processes, which were originally applied to visual detection and search phenomena. These two distinctions account for the learning, storage, and production of language. According to Shiffrin and Schneider (1977: 155), memory is “conceived to be a large and permanent collection of nodes, which become complexly and increasingly inter-associated and inter-related through learning.” Each node has a set of informational elements and is usually inactive and passive. When the system of interconnected nodes is in this state, it is called long-term store. When some of these nodes are activated by some external stimulus, these activated nodes constitute the short-term store. Short-term memory store has its nodes activated in memory at the same time and is quite limited in its capacity to deal with incoming information. Learning takes place when information travels from STM and is linked to existing nodes in LTM to form new associations. Whether these two stores comprise two separate systems and stores (Shiffrin, 1993) or one interconnected storage place (Carlson, Khoo, Yaure, & Schneider, 1990) still remains to be resolved. According to Shiffrin and Schneider, these nodes can be activated in two ways, usually referred to as the automatic and controlled modes of information processing. Automatic processing is a learned response generated by a consistent activation of the same input to the same node(s) over a long period of time. 
Because automatic processes are associated with almost the same set of interconnected nodes, once learned, they occur quickly, require a minimum of effort and attention, do not use limited capacity resources, occur without the learner’s attention or control, and are difficult to suppress or modify. Does anyone see the connection with the famous saying of “practice makes perfect,” though we like to throw in the descriptor “meaningful” before “practice” in this saying to differentiate ourselves from the rote memorization and repetition practice of the Audio-lingual Method? On the other hand, controlled processes require a large amount of cognitive effort, are generally conscious, slower, use limited capacity resources, require the conscious attention of the learner, and permit only a limited number of features to be attended to by learners (we will go into more detail in McLaughlin’s (1987) cognitive theory for SLA in Chapter 5). The next time you are teaching a beginning

L2 class, take a closer look at your students’ facial expressions as you talk to them in the L2 for the first, or second, or third time, and try to calculate how quickly they appear to be cognitively overloaded trying to process mostly new information in the input. Then, after a couple of weeks of constant exposure to the L2, see whether they appear capable of taking in more information from the input. By the way, we are speaking of the majority of the students. The distinction between controlled and automatic processes remains a research area of interest in many cognitive fields. For example, given that this theory was originally applied to visual detection and search phenomena, cognitive neuroscientists (e.g., Barber & Carreiras, 2005; Batterink, Karns, Yamada, & Neville, 2009; Newman, Pancheva, Ozawa, Neville, & Ullman, 2001; Yamada & Neville, 2007) have investigated whether it may be applicable in the study of language processing, which is hypothesized to involve both automatic and controlled mechanisms (e.g., Neely, 1991). The classic research design in this field of research to address the role of both automatic and controlled processes in language is the use of behavioral priming designs, and the most popular type of priming is semantic priming. In semantic priming studies, a target word is preceded by either a semantically related or unrelated prime (e.g., the target word is beach, the semantically related preceding prime is sand, while the unrelated prime is coconut). Results typically reveal that faster response times and fewer errors are associated with target words that are preceded by a semantically related prime. Neely (1991) provided three mechanisms to account for priming effects reported in the literature. The first mechanism is automatic spread of activation (ASA) of closely related memory representations sharing strong links with each other. 
Activation of a given node spreads to associated representations resulting in easier processing, reduced reaction times, and fewer errors. This mechanism is automatic, fast, and outside the learner’s control. The second mechanism is expectancy-induced priming, triggered by a related prime that generates an expectancy set of potential targets related to the prime. Subsequent processing of targets belonging to the expectancy set is then facilitated. The third mechanism is post-lexical priming, for example, using both prime and target to access memory instead of the target alone. Post-lexical priming involves processes that take place after the representation of the target has been accessed. Unlike ASA, both expectancy-induced priming and post-lexical priming are thought to be controlled processes. It is useful to describe briefly the current methodological design of these cognitive neuroscientist studies on language processing. One technique is the recording of event-related potentials (ERPs). Given that ERPs have “excellent temporal resolution and do not depend on overt behavioral responses, and thus are sensitive measures of real-time language processing” (Batterink et al., 2009), distinct ERP components have indexed semantic and syntactic processing, establishing the existence of two different neural systems serving these two subsystems (cf. neurocognitive models of the L2 such as Paradis, 2009 and Ullman, 2004).

As Batterink et al. reported, according to Kutas and Hillyard (1980), ERP responses to words that violate semantic expectancy are characterized by a negative-going component that peaks approximately 400 milliseconds post-stimulus, with a posterior and bilateral distribution. This component is known as the N400, and given that larger amplitude N400 responses are associated with words that are semantically unexpected, it has been hypothesized that the N400 component reflects semantic processes of lexical integration (Friederici, Pfeifer, & Hahne, 1993). On the other hand, the processing of syntactic information is indexed by ERP components that differ in distribution and timing. According to cognitive neuroscientists (e.g., Friederici et al., 1993; Hagoort, Brown, & Groothusen, 1993; Neville, Nicol, Barss, Forster, & Garrett, 1991), the classic pattern elicited by syntactic violations is a biphasic response. The first phase, known as the left anterior negativity (LAN), occurs during an early time window (between 100 and 500 milliseconds) and consists of a negativity that is usually maximal over the left anterior scalp. This waveform is then followed by a late positivity, broadly distributed over posterior sites, known as the P600. Friederici (2002) postulated that these effects may index distinct phases of language comprehension. In other words, while the LAN may index more automatic processes associated with syntactic processing, for example, the building of an initial syntactic structure based on word category information, the P600 may reflect later, more controlled mechanisms associated with reanalysis and repair of the initial syntactic structure when new words that cannot be easily incorporated into the initially built syntactic structure are encountered. As reported in Batterink et al. 
(2009), the fact that the ERP components indexing semantic and syntactic processing are distinct in both latency and distribution converges with clinical and neuroimaging evidence in the cognitive neuroscience field to indicate that these subsystems are mediated by non-identical mechanisms and draw upon at least partially dissociable neural substrates (Friederici, Opitz, & von Cramon, 2000; Newman et al., 2001; Ni et al., 2000). Consequently, it may be assumed that automatic and controlled processes may not play equal roles in semantic and syntactic language processing. Let us take a look at a recent neurocognitive study on L2 learning (Morgan-Short, Sanz, Steinhauer, & Ullman, 2010). This study employed an artificial language learning paradigm (BROCANTO2) together with a combined behavioral/event-related potential (ERP) approach to examine the neuro-cognition of the processing of gender agreement, an aspect of inflectional morphology that is problematic in adult L2 learning. Forty-one participants learned to comprehend and speak an artificial language under what the researchers describe as either an explicit (classroom-like) or implicit (immersion-like) training condition, in which they received extensive comprehension and production practice. In each group, both noun-article and noun-adjective gender agreement processing were examined behaviorally and with ERPs at both low and higher levels of proficiency. Results showed that the two groups learned the language to similar levels

of proficiency but showed somewhat different ERP patterns. At low proficiency, both types of agreement violations (adjective, article) yielded N400s, but only for the group with implicit training. Additionally, noun-adjective agreement elicited a late N400 in the explicit group at low proficiency. At higher levels of proficiency, noun-adjective agreement violations elicited N400s for both the explicit and implicit groups, whereas noun-article agreement violations elicited P600s for both groups. According to Morgan-Short et al., the results suggest that interactions among linguistic structure, proficiency level, and type of training need to be considered when examining the development of aspects of inflectional morphology in L2 acquisition. Crucially, the authors were also cautious to make the disclaimer in a footnote (p. 185) that it was important to emphasize that this study examined neurocognitive outcomes of explicit and implicit training conditions, not whether the resulting learning or knowledge might have been (partly or wholly) explicit or implicit. In other words, while the study revealed interesting data on both participants’ behavioral and neural functioning, the processes (e.g., awareness) employed during the training sessions were not methodologically addressed.

Attention and “Learning”

Fortunately, in cognitive psychology/science, the role of attention in learning (note, not necessarily language learning) is not a controversial issue. In cognitive psychology, a popular strand of research to address the importance of attention in learning was the “divided attention” studies that have provided relatively strong empirical support for the claim that no serial learning may take place in the absence of attention, which is crucial for long-term memory storage (e.g., Carr & Curran, 1994; Nissen & Bullemer, 1987). To give you an example of such a study, let us take a look at Nissen and Bullemer (1987). They used a single task in which they asked participants to track the appearance of a light in a series of 10 positions. Attention was measured by participants’ performances on the reaction time test or the serial reaction time task (SRT), which is the standard paradigm for examining all kinds of attention and sequence learning in psychology. In this task, participants demonstrated the attention they paid to the stimulus by hitting corresponding response keys. After several trials, participants were able to decrease their reaction time, indicating that some learning of the sequence took place. The same participants then took part in a dual task in which they had to track the appearance of the light at the same time as they counted tones. Nissen and Bullemer compared their performance with a control group lacking previous exposure to the tasks and found no significant difference between the two groups. According to Nissen and Bullemer, the dual task prevented the experimental learners from paying attention to the appearance of the lights, which resulted in a failure to learn the sequence. Note that divided attention in such studies is related to the appearance of a light and that measurement is via

a reaction-time task, both of which may not be completely pertinent to natural language processing. The divided attention paradigm was also incorporated into the design of Curran and Keele’s (1993) experiments. These researchers reported on what they called a non-attentional type of learning evident in a serial reaction time dual task similar to that used in Nissen and Bullemer’s (1987) study. The reaction or response time procedure is typically assumed to measure implicit memory or learning. Participants in their experiment had to track the appearance of an X marked in different numbered quadrants of a computer screen. One group of participants received information on the rules underlying the appearance of the X mark and the other group did not. Participants were then divided into aware and unaware groups, depending on whether they were able to identify the rules regulating the stimuli. Curran and Keele found that learning (measured in reaction time decreases) was superior in the case of instructed learners and in that of aware uninstructed learners. However, participants in the uninstructed unaware group still showed some improvement in reaction time. In order to see whether learning would still occur under a dual-task condition, Curran and Keele (1993) added a secondary tone-counting task. The researchers found that even under the dual-task condition all participants demonstrated that they had acquired some sequential knowledge. On the basis of these findings, Curran and Keele posited the existence of two different types of learning: an attentional type (evident in instructed and aware uninstructed learners under single-task conditions) and a non-attentional type (evident in non-aware subjects under dual-task conditions). Nonetheless, Curran and Keele themselves stated that the type of learning that they called non-attentional does not imply a complete absence of attention, but rather a lower amount of attention. 
The results of these divided attention studies were interpreted by Carr and Curran (1994) as evidence that structural learning is attenuated or eliminated under attention-distracting conditions. They defended the position that attention is crucial in learning because it allows the learner to parse the sequence of stimuli into chunks so that syntactic processing is facilitated.
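
The learning measure used in these serial reaction time studies, a decrease in reaction time over trials, can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions, not the analysis reported by Nissen and Bullemer (1987) or Curran and Keele (1993): the reaction times are invented, and the function names `block_means` and `learning_score` are hypothetical. It averages reaction times within consecutive blocks of trials and treats the drop from the first block to the last as a rough index of sequence learning.

```python
from statistics import mean


def block_means(reaction_times_ms, block_size):
    """Return the mean reaction time (ms) of each consecutive block of trials."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = [
        reaction_times_ms[i:i + block_size]
        for i in range(0, len(reaction_times_ms), block_size)
    ]
    return [mean(block) for block in blocks]


def learning_score(reaction_times_ms, block_size):
    """First-block mean minus last-block mean; a positive value means faster responses."""
    means = block_means(reaction_times_ms, block_size)
    return means[0] - means[-1]


# Invented reaction times (ms), for illustration only.
single_task = [520, 510, 500, 490, 450, 440, 430, 420, 380, 370, 360, 350]
dual_task = [620, 610, 615, 605, 612, 618, 608, 602, 611, 609, 613, 607]

print(learning_score(single_task, 4))  # large drop in reaction time
print(learning_score(dual_task, 4))    # almost no drop
```

With these invented values, the single-task series shows a sizeable drop in reaction time, whereas the dual-task series stays almost flat. That is the kind of pattern Nissen and Bullemer interpreted as a failure to learn the sequence when attention is divided, although a real analysis would compare groups statistically rather than rely on a single difference score.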

Learning without Attention?

Can unattended information still induce learning? One term that comes to mind is subliminal learning, that is, learning something without the slightest effort or attention paid to the information. This is equivalent to sleeping while playing a tape on the Spanish subjunctive and hoping to learn this problematic structure from the content on the tape. Barring this scenario, it is really challenging to create a condition in which the participant is exposed to a stream of information to which s/he does not pay attention. Let us take a brief look at some studies (cf. Shanks, 2005) that have claimed that it is possible to learn unattended information. As usual, keep in mind what these studies were studying.

Miller (1987) is a priming study, discussed above, that has been used to address the process of attention and is what is well known as the “flanker” task. In this task, participants saw briefly a target letter “flanked” by two other letters that were to be ignored, for example, [C A C]. Once exposed to the stimulus, participants were asked to press quickly a right- or left-finger response once the target letter was identified, which is the response (or reaction) time procedure. Miller then manipulated the stimuli in such a way that the identity of the flankers was correlated with the target such that specific flankers co-occurred regularly with targets requiring a specific response. The stimuli were thus designed to present valid and invalid trials that required either a right- or left-finger response. Miller reported that response times for valid trials were reliably shorter than response times for invalid trials and, given that the flankers were supposedly ignored, interpreted this as evidence of automatic and non-attentional processing of the flankers combined with implicit learning of the flanker—target correlations. Miller also reported that his participants were unable to recall the flankers, thus providing further evidence of unattended learning. A replication study (Schmidt & Dark, 1998) did not provide external validity support for this study. The authors concluded that selective attention to the target item was broken at some point in time during the processing, which could have led to some attention being paid to the flankers, leading to the formation of memory traces that could be recalled. Eich (1984) employed the popular dichotic listening task in which participants listen to or shadow a stream of information in one attended ear while additional information was presented to the non-attended ear. 
In this study, participants were shadowing a prose stream in the attended ear while word pairs that included a descriptor and a homophone, such as TAXI—FARE, were presented to the nonattended ear. Eich reported that participants performed at chance on a recognition test for the unattended words and also wrote on a spelling test the homophones in their low-frequency form (i.e., FARE instead of FAIR). Eich concluded that participants had implicitly paid attention to the unattended word pairs, which in turn led to subsequent priming in the spelling test. Once again, these findings were not empirically supported by a subsequent replication study (Wood, Stadler, & Cowan, 1997). To address the potential role played by the slow speed of the prose passage, which could have allowed a reduced load of shadowing while allowing some shift of attention to the unattended stream of information, Wood et al. varied the presentation rate of the attended information. They reported that under faster rates, which would reduce or eliminate participants’ shifts of attention to the unattended stream, no evidence of learning was found on the homophone spelling task. Don’t you like the use of “learning” in these studies? DeSchepper and Treisman (1996) is another study that attempted to address the possibility of learning without paying attention to specific information in the stimuli. Participants were presented with two overlapped nonsense shapes

and asked to attend to one of them in relation to their color (e.g., the green one, not the red one). Their response was recorded with the response time procedure. After a series of trials in which participants were asked to pay attention to the green shapes only and then match them to other shapes, the researchers then reversed the target and distracter, that is, a shape previously presented in green (target) was now presented in the opposite color, red (distracter). DeSchepper and Treisman reported that the reaction times for responses revealed participants responded more slowly to previously ignored shapes when compared to novel control shapes. In other words, one plausible explanation for these findings was that the representations of these unattended novel shapes were formed in participants’ memory during the exposure experimental phase. Studies of the attentional blink (e.g., Frings, Bermeitinger, & Wentura, 2011) have also addressed the possibility of semantic processing of targeted L1 words (and even distractors) that were not attended to during the attentional blink yet were reported subsequently. In the attentional blink, “looking at one target in a rapid display makes it difficult to identify a subsequent target that follows it by a certain lag time” (Friedenberg, 2013: 95). While Dehaene, Changeux, Naccache, Sackur, and Sergent (2006) call these instances pre-conscious, with the potential of achieving higher levels of processing without the presence of top-down attention being allocated, Williams (2013) cautions that, in addition to other studies addressing attention and awareness in psychology (e.g., Alonso, Fuentes, & Hommel, 2006; Custers & Aarts, 2011) “in the SLA context we must bear in mind that many of these phenomena concern words in the native language. 
Perhaps then they tell us more about the automaticity of processing than the non-selectivity of attention, and we may wonder at what level of fluency such effects could be detected in the L2” (p. 51). Overall, especially in relation to the role of memory in cognitive psychology, it is well accepted that not much learning will take place without attention (Baars, 1988; Kihlstrom, 1984; Logan, 1988; Nissen & Bullemer, 1987; Posner, 1992). It is assumed that unattended input may enter into short-term memory, but without attention it will not remain in memory for much longer. To establish absence of attention is very challenging, to say the least, and when you consider in many studies the focus of content (e.g., shapes, letters, colors, digits, words, numbers, etc.) and the methodological (e.g., response or reaction time, divided attention) and statistical (e.g., chance performance) measures employed to address one’s attention, there will always be conflicting findings regarding the effort to create conditions to separate attention from “learning.” More crucially, when we consider that language learning is clearly divorced from the content of study in the majority of these studies, the only conclusion we can make is that attention does play an important role in language learning. Let us now move on to a construct that is very current in the literatures of both SLA and non-SLA fields, namely, working memory.

Working Memory

The concept of short-term memory (STM) has given way to the concepts of working memory (WM) (e.g., Baddeley & Hitch, 1974) and working memory capacity (WMC), which are currently used in the SLA literature. The major difference between STM and WM lies in the mechanisms postulated to operate during input processing, which go beyond mere storage and activation of information. These mechanisms are responsible for the active maintenance of information and for the cognitive control that coordinates and integrates storage and processing operations to guide specific behaviors, as seen in the following definitions: the “ability to hold in mind information in the face of potentially interfering distraction in order to guide behavior” (Jarrold & Towse, 2006: 39) and the “ability to maintain information in an active and readily accessible state, while concurrently and selectively processing new information” (Conway, Jarrold, Kane, Miyake, & Towse, 2007: 3). In other words, working memory is required both to store information and to integrate new information with prior knowledge held in long-term memory. Because its capacity is limited, new information held in working memory without further processing is likely to be discarded. This postulation is very important, as it is directly related to the roles of level or depth of processing and cognitive effort (cf. Chapter 11). Given this more explicit and multiple role, many researchers have argued that students’ differential performances on complex cognitive tasks (e.g., reading and sentence processing, cf. Daneman & Merikle, 1996, for a meta-analysis) and in general intellectual abilities (e.g., reasoning and general fluid intelligence, cf. Ackerman, Beier, & Boyle, 2005, for a meta-analysis, and a response to this meta-analysis by Kane, Hambrick, & Conway, 2005) may reflect these students’ individual levels of working memory capacity (WMC).
An increasing number of cognitive psychologists have accepted WM as a multi-component system comprising both domain-specific storage mechanisms and domain-general executive functions (e.g., Baddeley, 2012; Miyake & Shah, 1999). However, like every internal process, WM is not without controversy with respect to its exact nature and the individual differences found in performances on varied tasks. Let us discuss briefly some perspectives, namely, a resource-sharing account, a task-switching account, and an executive attention view (cf. Goo, 2010, for further elaboration).

The resource-sharing account (Daneman & Carpenter, 1980, 1983) posits that WM capacity is a limited pool of cognitive resources and that the amount of information that can be stored during processing depends on how efficiently such processing can take place, that is, on a trade-off between processing and storage demands. The task-switching account is an alternative proposal about the nature of WM and variation in WMC (Towse & Hitch, 1995, 2007; Towse, Hitch, & Hutton, 1998). This account posits that WMC is limited because individuals rapidly forget to-be-remembered items during the time spent processing. Thus, performance on WM span tasks is to a large extent determined by the temporal dynamics of WM span, which points to the intrinsic involvement of processing efficiency and time in the maintenance or loss of temporary information (see, however, Conway & Engle, 1996; Friedman & Miyake, 2004, for evidence against these processing-based accounts). The executive attention view posits that differential performances on complex cognitive tasks due to individual differences in WMC derive mainly from variation in domain-general executive attention processes and, to some extent, from variation in domain-specific storage and rehearsal processes (Engle, 2002; Kane, Conway, Hambrick, & Engle, 2007).

It is useful to mention briefly Cowan’s (1999) embedded processes model of working memory as a framework to discuss attention and memory in language learning. Memory representations are activated by external stimuli or internally generated associations. While the number of representations that can be activated in memory has not been established, it is postulated that they will remain active only for a short period of time. For representations to remain active longer, focal attention must be paid to them; such attention boosts their activation level, improves their quality, and enriches the encoding of the information. Focal or selective attention is usually contrasted with peripheral attention, which, as mentioned above, may be broadly exemplified by the description of “glancing out of the corner of your eye.” Peripheral attention is typically associated with the potential of learning something without awareness.

Wen (2012) pinpointed two contrasting research paradigms that have emerged in relation to WM.
According to Wen, WM researchers following the European tradition have focused on the critical role that the phonological component of WM (which embraces a passive sound-based store and an active sub-vocal rehearsal process) plays in vocabulary acquisition and grammar development (e.g., Baddeley, 2003; Gathercole & Baddeley, 1993). Measurements of WM usually include a simple storage-only memory span task such as the digit span, the word span, or the non-word repetition span task. Participants’ total recall score is then taken as their WM capacity (Gathercole, 2006; Gathercole, Willis, Baddeley, & Emslie, 1994). In contrast, many cognitive psychologists based in North America have opted to address the executive functions associated with the WM concept and to focus on teasing out the implications of its attention-regulating mechanisms (e.g., information updating, task-switching, and inhibitory control; Miyake & Friedman, 2012) for language learning and processing. The popular WM measure is usually a complex memory span task that taxes both the storage and processing functions of WM (such as the reading span task designed by Daneman & Carpenter, 1980). Significant effects of WM on selective facets of language processing activities (e.g., comprehension processes, syntactic processing, and speech production, cf. Miyake & Friedman, 1998) have been reported. Taken together, both WM research camps have accumulated increasing evidence for a close link between WM and L1 learning (Baddeley, 2003; Cowan, 2011; Gathercole & Baddeley, 1993).

As noted by Wen (2014), the SLA field is currently emulating the WM-language research paradigms of cognitive psychology, together with an adoption of the measures employed to address WM (recall our perception that they know better?). However, the rationale is reasonably well grounded, logically and theoretically, if viewed from an information processing perspective of learning and in light of the perceived fundamental difference between acquiring our first language and learning an L2. While L1 acquisition is usually associated with automatic processing, L2 learning relies on more controlled processing, which is assumed to place much greater demands on such cognitive resources as WM and/or attention (Harrington, 1992). In SLA, several researchers, like others in cognitive psychology (e.g., Kintsch, Healy, Hegarty, Pennington, & Salthouse, 1999), highlight the connection between working memory and attention. Schmidt (2001) has stated that “at least one aptitude factor, short term or working memory capacity [. . .] is closely related to attention” (p. 10), which is, he argued, necessary to understand virtually every aspect of second language acquisition. Reinterpreting or redefining Schmidt’s (1990) concept of noticing, Robinson (1995) has proposed that attention can be conceived of as detection plus rehearsal in short-term memory, thereby again making a connection between memory and attention (cf. also N. Ellis, 2001, for a similar link, and Robinson, 2003, for further elaboration). Not surprisingly, understanding the nature of WM and the sources of individual variation in WMC remains a major task for researchers in both the SLA and non-SLA fields, and only future research that elicits pertinent data can help test the major tenets of the different theoretical postulations regarding WM and ultimately shed more light on this internal process.

Summary

Robinson, Mackey, Gass, and Schmidt (2012) pinpoint the several information-processing functions that have been addressed by these major attentional models from many areas of cognitive science and that have impacted SLA models and research. I shall comment on these functions, which include the following:

1. The study of mental effort and divided attention (e.g., Kahneman, 1973; Wickens, 2007). In this information-processing function, attention is viewed as limited, a view based on the metaphor of a limited capacity channel assumed to characterize attention in several non-SLA fields. Divided attention and mental effort reflect the view that there are multiple attentional resources that are deployed depending upon the complexity of the concurrent tasks to be performed.

2. The study of selective attention in visual processing (e.g., Posner & Petersen, 1990). It is very important to note that visual processing does not equate to the language learning process per se, and while the findings can provide important information on how attention functions within the visual world, extrapolating such functions may not reflect natural language processing.

3. The study of the relationship of focal attention to rehearsal in working memory (e.g., Cowan, 1995), and

4. The study of the reduced role for attention during the development of automaticity and skilled responding (e.g., Logan, 1988; Shiffrin & Schneider, 1977). This is based on the notion of controlled versus automatic processes, representing two distinct human information processing systems. Controlled processes, which involve quite a large amount of cognitive effort, are postulated to require a lot of attention by the learner to process the incoming information. However, as the same or similar information is processed repeatedly over a period of time, the amount of attention (or reduced depth of processing?) required in the early stages begins to be reduced, and it is then assumed that less attention is deployed to process the information.

Schmidt (2001) summarizes the basic assumptions of attention taken from the non-SLA field in a more direct manner. Attention (1) is limited, (2) is selective, (3) is partially subject to voluntary control, (4) controls access to consciousness, and (5) is important for learning. While all these attributes have been covered above, let us take a closer look at (4), which posits that attention controls access to consciousness or awareness. This can be interpreted in three ways: (1) attention is intricately linked to awareness (à la Schmidt, 1990), (2) attention can be disassociated from awareness, or (3) what is attended to will lead to awareness. The role of attention as a gateway to consciousness was postulated decades ago (e.g., Baars, 1988; Marcel, 1983; Neisser, 1967; Wundt, 1903, cited in Neumann, 1996), though we need to be a bit more careful when we phrase it in this way. A closer look at the early postulations reveals that the concept of attention is actually viewed from different perspectives, and that a key feature of attention is its selectivity, which assumes some effort on the part of the learner. For example, Wundt proposed the concept of apperception, which is based on the focus of consciousness being determined by the direction of attention (#1). Neisser (1967) employed the term focal attention, whose function was to process the information more deeply (depth of processing) and make it available for further analysis (#3). Marcel (1983) went a step further by establishing focal attention as a mechanism that differentiates an early processing stage, associated with unconscious representations of the stimuli, from a higher stage where phenomenal experience is associated with consciousness (#3). Subliminal priming studies (e.g., Naccache, Blandin, & Dehaene, 2002; Spruyt, De Houwer, Everaert, & Hermans, 2012) have also been cited to support a separation between attention and awareness (#2).

The bottom line appears to be that selected stimuli are represented in what has been called conscious awareness, while unselected stimuli remain outside conscious awareness. I shall return to this specific feature of attention in Chapter 4.

Conclusion

What can we definitively take away from the information provided in this chapter? For one, we need to be careful about extrapolating the findings of non-SLA studies to the field of SLA, given that it is well accepted that quite a lot of the data used to address the process of attention, especially in the visual world, may not be applicable to the same process in L2 learning. However, studies in cognitive psychology/science and SLA share the following conclusion: The chances of learning without attention are minimal. In other words, attention is crucial for further long-term memory storage of L2 information to take place. However, as we shall see in the later chapters, merely paying attention is not the end-all; telling our students to pay attention is one way to grab their attention, but we need to be aware that other factors are also involved. Let us now tackle that slippery eel, also known as consciousness (or awareness).

References

Ackerman, P. L., Beier, M. E., & Boyle, M. O. (2005). Working memory and intelligence: The same or different constructs? Psychological Bulletin, 131, 30–60.
Allport, D. A. (1980). Attention and performance. In G. Claxton (Ed.), Cognitive psychology: New directions. London: Routledge and Kegan Paul.
Allport, D. A. (1993). Attention and control: Have we been asking the wrong questions? In D. E. Meyer & S. Kornblum (Eds.), Attention and performance 14. Cambridge, MA: MIT Press.
Allport, D. A., Antonis, B., & Reynolds, P. (1972). On the division of attention: A disproof of the single channel hypothesis. Quarterly Journal of Experimental Psychology, 24, 225–235.
Alonso, D., Fuentes, L. J., & Hommel, B. (2006). Unconscious symmetrical inferences: A role of consciousness in event integration. Consciousness and Cognition, 15, 386–396.
Baars, B. J. (1988). A cognitive theory of consciousness. Cambridge: Cambridge University Press.
Baddeley, A. D. (2003). Working memory and language: An overview. Journal of Communication Disorders, 36, 189–208.
Baddeley, A. D. (2012). Working memory: Theories, models and controversies. Annual Review of Psychology, 63, 1–30.
Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. A. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47–90). New York: Academic Press.
Barber, H., & Carreiras, M. (2005). Grammatical gender and number agreement in Spanish: An ERP comparison. Journal of Cognitive Neuroscience, 17, 137–153.
Batterink, B., Karns, C. M., Yamada, Y., & Neville, H. (2009). The role of awareness in semantic and syntactic processing: An ERP attentional blink study. Journal of Cognitive Neuroscience, 22, 2514–2529.
Bialystok, E. (1978). A theoretical model of second language acquisition. Language Learning, 28, 69–84.

Broadbent, D. (1952a). Speaking and listening simultaneously. Journal of Experimental Psychology, 43, 267–273.
Broadbent, D. (1952b). Listening to one of two synchronous messages. Journal of Experimental Psychology, 44, 51–55.
Broadbent, D. (1952c). Failure of attention in selective listening. Journal of Experimental Psychology, 44, 428–433.
Broadbent, D. (1958). Perception and communication. London: Pergamon Press.
Carr, T., & Curran, T. (1994). Cognitive factors in learning about structured sequences: Applications to syntax. Studies in Second Language Acquisition, 16, 205–230.
Carlson, R. A., Khoo, B. H., Yaure, R. G., & Schneider, W. (1990). Acquisition of a problem-solving skill: Levels of organization and use of working memory. Journal of Experimental Psychology: General, 119, 193–214.
Conway, A. R. A., & Engle, R. W. (1996). Individual differences in working memory capacity: More evidence for a general capacity theory. Memory, 4, 577–590.
Conway, A. R. A., Jarrold, C., Kane, M. J., Miyake, A., & Towse, J. N. (2007). Variation in working memory: An introduction. In A. R. A. Conway, C. Jarrold, M. J. Kane, A. Miyake, & J. N. Towse (Eds.), Variation in working memory (pp. 3–17). Oxford: Oxford University Press.
Cowan, N. (1995). Attention and memory: An integrated framework. Oxford: Oxford University Press.
Cowan, N. (1999). An embedded-processes model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 62–101). Cambridge: Cambridge University Press.
Cowan, N. (2011). Working memory and attention in language use. In J. Guandouzi, F. Loncke, & M. J. Williams (Eds.), The handbook of psycholinguistics and cognitive processes (pp. 75–97). London: Psychology Press.
Curran, T., & Keele, S. (1993). Attentional and nonattentional forms of sequence learning. Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 189–202.
Custers, R., & Aarts, H. (2011). Learning of predictive relations between events depends on attention, not on awareness. Consciousness and Cognition, 20, 368–378.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behaviour, 19, 450–466.
Daneman, M., & Carpenter, P. (1983). Individual differences in integrating information within and between sentences. Journal of Experimental Psychology: Learning, Memory and Cognition, 9, 561–583.
Daneman, M., & Merikle, P. M. (1996). Working memory and language comprehension: A meta-analysis. Psychonomic Bulletin & Review, 3, 422–433.
Dehaene, S., Changeux, J. P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: A testable taxonomy. Trends in Cognitive Sciences, 10, 204–211.
DeKeyser, R. (2007). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 97–113). Mahwah, NJ: Lawrence Erlbaum.
DeSchepper, B., & Treisman, A. (1996). Visual memory for novel shapes: Implicit coding without attention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 27–47.
Eich, E. (1984). Memory for unattended events: Remembering with and without awareness. Memory and Cognition, 12, 105–111.
Ellis, N. (2001). Memory for language. In P. Robinson (Ed.), Cognition and second language instruction (pp. 33–68). New York: Cambridge University Press.

Engle, R. W. (2002). Working memory capacity as executive attention. Current Directions in Psychological Science, 11, 19–23.
Friedenberg, J. (2013). Visual attention and consciousness. New York: Psychology Press.
Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6, 78–84.
Friederici, A. D., Opitz, B., & von Cramon, Y. D. (2000). Segregating semantic and syntactic aspects of processing in the human brain: An fMRI investigation of different word types. Cerebral Cortex, 10, 698–705.
Friederici, A. D., Pfeifer, E., & Hahne, A. (1993). Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations. Cognitive Brain Research, 1, 183–192.
Friedman, N. P., & Miyake, A. (2004). The reading span test and its predictive power for reading comprehension ability. Journal of Memory and Language, 51, 136–158.
Frings, C., Bermeitinger, C., & Wentura, D. (2011). Inhibition from blinked category labels: Combining the attentional blink and the semantic priming paradigm. Journal of Cognitive Psychology, 23, 514–521.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Gass, S. M. (1998). Integrating research areas: A framework for second language studies. Applied Linguistics, 9, 198–217.
Gathercole, S. (2006). Nonword repetition and word learning: The nature of the relationship. Applied Psycholinguistics, 27, 513–543.
Gathercole, S., & Baddeley, A. (1993). Working memory and language. Hillsdale, NJ: Lawrence Erlbaum Associates.
Gathercole, S. E., Willis, C. S., Baddeley, A. D., & Emslie, H. (1994). The children’s test of nonword repetition: A test of phonological working memory. Memory, 2, 103–127.
Goo, J. (2010). Working memory and reactivity. Language Learning, 60(4), 712–752.
Hagoort, P., Brown, C., & Groothusen, J. (1993). The syntactic positive shift (SPS) as an ERP measure of syntactic processing. Language and Cognitive Processes, 8, 439–483.
Harrington, M. (1992). Working memory capacity as a constraint on L2 development. In R. J. Harris (Ed.), Cognitive processing in bilinguals (pp. 123–135). Amsterdam: North Holland.
Horrey, W., Wickens, C. D., & Consalus, K. (2006). Modeling drivers’ visual attention allocation while interacting with in-vehicle technologies. Journal of Experimental Psychology: Applied, 12(2), 67–78.
James, W. (1890). The principles of psychology. New York: Holt.
Jarrold, C., & Towse, J. N. (2006). Individual differences in working memory. Neuroscience, 139, 39–50.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall.
Kane, M. J., Conway, A. R. A., Hambrick, D. Z., & Engle, R. W. (2007). Variation in working memory capacity as variation in executive attention and control. In A. R. A. Conway, C. Jarrold, M. J. Kane, A. Miyake, & J. N. Towse (Eds.), Variation in working memory (pp. 21–48). Oxford: Oxford University Press.
Kane, M. J., Hambrick, D. Z., & Conway, A. R. A. (2005). Working memory capacity and fluid intelligence are strongly related constructs: Comment on Ackerman, Beier, and Boyle (2005). Psychological Bulletin, 131, 66–71.
Kihlstrom, J. (1984). Conscious, subconscious, unconscious: A cognitive perspective. In K. Bowers & D. Meichenbaum (Eds.), The unconsciousness reconsidered (pp. 149–211). New York: Wiley.

Kintsch, W., Healy, A. F., Hegarty, M., Pennington, B. F., & Salthouse, T. A. (1999). Models of working memory: Eight questions and some general issues. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 412–441). New York: Cambridge University Press.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.
Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95, 492–527.
Marcel, A. (1983). Conscious and unconscious perception: Experiments on visual masking and word recognition. Cognitive Psychology, 15, 197–237.
McLaughlin, B. (1987). Theories of second language learning. London: Edward Arnold.
Miller, J. (1987). Priming is not necessary for selective attention failures: Semantic effects of unattended, unprimed letters. Perception and Psychophysics, 41, 419–434.
Miyake, A., & Friedman, N. (1998). Individual differences in second language proficiency: Working memory as language aptitude. In A. Healy & L. Bourne Jr. (Eds.), Foreign language learning: Psycholinguistic studies on training and retention (pp. 339–364). Mahwah, NJ: Lawrence Erlbaum Associates.
Miyake, A., & Friedman, N. (2012). The nature and organization of individual differences in executive functions: Four general conclusions. Current Directions in Psychological Science, 21, 8–14.
Miyake, A., & Shah, P. (Eds.). (1999). Models of working memory: Mechanisms of active maintenance and executive control. Cambridge: Cambridge University Press.
Morgan-Short, K., Sanz, C., Steinhauer, K., & Ullman, M. (2010). Second language acquisition of gender agreement in explicit and implicit training conditions: An event-related potential study. Language Learning, 60, 154–193.
Naccache, L., Blandin, E., & Dehaene, S. (2002). Unconscious masked priming depends on temporal attention. Psychological Science, 13, 416–424.
Navon, D., & Gopher, D. (1979). On the economy of the human processing system. Psychological Review, 86, 214–255.
Neely, J. H. (1991). Semantic priming effects in visual word recognition: A selective review of current findings and theories. In D. Besner & G. W. Humphreys (Eds.), Basic processes in reading: Visual word recognition (pp. 264–336). Hillsdale, NJ: Erlbaum.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.
Neisser, U. (1976). Cognition and reality. San Francisco: Freeman.
Neumann, O. (1996). Theories of attention. In O. Neumann & W. Prinz (Eds.), Handbook of perception and action Vol. 3: Attention (pp. 389–446). San Diego: Academic Press.
Neville, H. J., Nicol, J., Barss, A., Forster, K., & Garrett, M. (1991). Syntactically based sentence processing classes: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience, 3, 155–170.
Newman, A. J., Pancheva, R., Ozawa, K., Neville, H. J., & Ullman, M. T. (2001). An event-related fMRI study of syntactic and semantic violations. Journal of Psycholinguistic Research, 30, 339–364.
Ni, W., Constable, R. T., Mencl, W. E., Pugh, K. R., Fulbright, R. K., Shaywitz, S. E., . . . Shankweiler, D. (2000). An event-related neuroimaging study distinguishing form and content in sentence processing. Journal of Cognitive Neuroscience, 12, 120–133.
Nissen, M., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19, 1–32.
Norman, D. A. (1968). Toward a theory of memory and attention. Psychological Review, 84, 231–259.

Paradis, M. (2009). Declarative and procedural determinants of second language (Vol. 40). Amsterdam: John Benjamins.
Posner, M. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.
Posner, M. (1992). Attention as a cognitive and neural system. Current Directions in Psychological Science, 1, 11–14.
Posner, M. I. (1994). Attention: The mechanism of consciousness. Proceedings of the National Academy of Sciences USA, 91, 7398–7403.
Posner, M. I. (1995). Interaction of arousal and selection in the posterior attention network. In A. Baddeley & L. Weiskrantz (Eds.), Attention: Selection, awareness, and control (pp. 390–405). London: Clarendon.
Posner, M. I., & Boies, S. J. (1971). Components of attention. Psychological Review, 78, 391–408.
Posner, M. I., & Petersen, S. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25–42.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. Solso (Ed.), Information processing and cognition: The Loyola Symposium. Potomac, MD: Erlbaum.
Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160–174.
Robinson, P. (1995). Review article: Attention, memory and the ‘noticing’ hypothesis. Language Learning, 45, 283–331.
Robinson, P. (2003). Attention and memory in SLA. In C. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 631–678). Oxford: Blackwell.
Robinson, P., Mackey, A., Gass, S. M., & Schmidt, R. (2012). Attention and awareness in second language acquisition. In S. M. Gass & A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 247–267). New York: Routledge.
Sanders, A. F. (1983). Toward a model of stress and human performance. Acta Psychologica, 53, 61–97.
Schmidt, P. A., & Dark, V. J. (1998). Attentional processing of ‘unattended’ flankers: Evidence for a failure of selective attention. Perception and Psychophysics, 60, 227–238.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–32). Cambridge: Cambridge University Press.
Shanks, D. R. (2005). Implicit learning. In K. Lamberts & R. Goldstone (Eds.), Handbook of cognition (pp. 202–220). London: Sage Publications Ltd.
Shiffrin, R. M. (1993). Short-term memory: A brief commentary. Memory and Cognition, 21(2), 193–197.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Simard, D., & Wong, W. (2001). Alertness, orientation, and detection: The conceptualization of attentional functions in SLA. Studies in Second Language Acquisition, 23, 103–124.
Spruyt, A., De Houwer, J., Everaert, T., & Hermans, D. (2012). Unconscious semantic activation depends on feature-specific attention allocation. Cognition, 122, 91–95.
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–203.

Towse, J. N., & Hitch, G. J. (1995). Is there a relationship between task demand and storage space in tests of working memory capacity? Quarterly Journal of Experimental Psychology, 48A, 108–124.
Towse, J. N., & Hitch, G. J. (2007). Variation in working memory due to normal development. In A. R. A. Conway, C. Jarrold, M. J. Kane, A. Miyake, & J. N. Towse (Eds.), Variation in working memory (pp. 109–133). Oxford: Oxford University Press.
Towse, J. N., Hitch, G. J., & Hutton, U. M. Z. (1998). A reevaluation of working memory capacity in children. Journal of Memory and Language, 39, 195–217.
Treisman, A. (1964). Verbal cues, language, and meaning in selective attention. American Journal of Psychology, 77, 533–546.
Truscott, J., & Sharwood Smith, M. A. (2011). Input, intake, and consciousness: The quest for a theoretical foundation. Studies in Second Language Acquisition, 33, 497–528.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92(1–2), 231–270.
VanPatten, B. (2004). Input processing in SLA. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 5–31). Mahwah, NJ: Lawrence Erlbaum.
Wen, Z. (2012). Working memory and second language learning. International Journal of Applied Linguistics, 22, 1–22.
Wen, Z. (2014). Theorizing and measuring working memory in first and second language research. Language Teaching, 47, 174–190.
Wickens, C. D. (1980). The structure of attentional resources. In R. Nickerson (Ed.), Attention and performance VIII (pp. 239–257). Hillsdale, NJ: Lawrence Erlbaum.
Wickens, C. D. (1989). Attention and skilled performance. In D. Holding (Ed.), Human skills (pp. 71–105). New York: John Wiley.
Wickens, C. D. (2007). Attention to the second language. International Review of Applied Linguistics, 45, 177–191.
Wickens, C. D., Goh, J., Helleberg, J., Horrey, W., & Talleur, D. (2003). Attentional models of multi-task pilot performance using advanced display technology. Human Factors, 45, 360–380.
Williams, J. N. (2013). Attention, awareness, and noticing in language processing and learning. In J. M. Bergsleithner, S. N. Frota, & J. K. Yoshioka (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 51–69). Honolulu: University of Hawai‘i, National Foreign Language Resource Center.
Wood, N. L., Stadler, M. A., & Cowan, N. (1997). Is there implicit memory without attention? A reexamination of task demands in Eich’s (1984) procedure. Memory and Cognition, 25, 772–779.
Wundt, W. (1903). Grundzüge der physiologischen Psychologie (Principles of physiological psychology) (5th ed.). Leipzig: Engelmann.
Yamada, Y., & Neville, H. (2007). An ERP study of syntactic processing in English and nonsense sentences. Brain Research, 1130, 167–180.

4 THEORETICAL FOUNDATIONS FOR THE ROLE OF AWARENESS IN LEARNING FROM NON-SLA FIELDS

While we are all happy with the role of attention in the learning process, I am afraid the same cannot be said for the construct of consciousness or awareness and its role during this process. Here is one of my favorite statements: “Consciousness as an object of intellectual curiosity is the philosopher’s joy and the scientist’s nightmare” (Tulving, 1993: 283). Tulving certainly knew what he was saying, given that the “multifaceted nature of the construct of ‘awareness’ makes it undoubtedly one of the slipperiest to operationalize and measure in both second language acquisition (SLA) and non-SLA fields such as cognitive psychology, cognitive science, and neuroscience” (Leow, Johnson, & Zárate-Sández, 2011: 61). In addition, the conflation of the terms “consciousness” and “awareness,” which you probably already noticed in previous chapters, is quite remarkable; for example, “any evidence that perception is not necessarily accompanied by an awareness of perceiving attracts attention because it challenges the idea that perception implies consciousness” (Merikle, Smilek, & Eastwood, 2001: 116). In a similar vein, Schachter (1989) used the term “consciousness” interchangeably with “phenomenal awareness” while referring to Dimond’s (1976) definition of consciousness as “the running span of subjective experience” (p. 377). More recently, Friedenberg (2013) not only underscored the slipperiness of the construct of consciousness but also immediately conflated consciousness with awareness: “Consciousness is perhaps one of the greatest mysteries in the universe. How are we aware of ourselves?” (p. 3). This chapter reports on several varieties of consciousness and revisits the role of consciousness in non-SLA information processing models with respect to its association with the metaphor of human beings as limited capacity processors, a limited capacity information selection system, and the concept of a limited capacity central executive. The global neuronal workspace theory is

The Role of Awareness in Learning 49

also discussed, with its key feature of perceiving the mind as modular. This is premised on the hypothesis that automatic or unconscious cognitive processing depends on multiple processors or modules. The relevance of these models to the L2 is discussed, followed by definitions of what comprises the construct of consciousness in non-SLA fields and the way this construct has been operationalized and measured in empirical research in these fields. But first, here are some musings on the construct of consciousness.

Some Musings on Consciousness

Before I go into a concise report of the non-SLA theoretical underpinnings for the role of consciousness (and, as you may have noticed, I use the term “consciousness” in non-SLA fields but the term “awareness” in SLA), here are a couple of questions to ponder. Where does consciousness reside? If one were to respond “in the brain,” then does consciousness have one home address or more? Does it skip from one neural network to another? How do we know someone is conscious of some data in the input? Is someone’s consciousness the same as another person’s, say, during a similar task or interaction? Indeed, what is consciousness? I am sure you are aware that the answers are not easy, but later in the chapter I am going to provide you with some tentative definitions as we take a quick look at empirical studies purporting to address implicit learning in the non-SLA fields. However, there may be a few facts on which we can all agree, and later, I shall summarize others gleaned from the attentional theories in non-SLA fields. First of all, consciousness is a subjective experience, and we all experience events differently, which appears to indicate that there may be levels of consciousness. Friedenberg (2013) put it in an interesting way: “Although we may be able to better understand what it is like to be another human hearing a Beethoven symphony, smelling a rose or seeing a Monet painting, we can never be sure we are actually having the same exact experience ourselves” (p. 3). Friedenberg also pointed out the mismatch between science, being objective in its scope, and a subjective phenomenon like consciousness. Scientists can talk about neurons firing more rapidly in specific parts of the brain than in other regions as blood courses through it, but they cannot tell us much about what it is like to be that person whose brain is being analyzed or, more specifically, the individual’s subjective quality of experience. Indeed, from an experimental perspective, does one’s consciousness (or awareness) rise when they know they are about to take part in an experiment?

Varieties of Consciousness

There may be varieties of consciousness if one were to accept that conscious experience can be linked to heightened firing of neurons in specific parts of the brain (does this mean that consciousness resides in several homes?). In this

50

Theoretical Foundations

perspective, we can differentiate different types or categories of consciousness; for example, brain activity differs between, say, being awake and being in different stages of sleep. Levels of consciousness are also assumed, which can range from being in a conscious state to being in a comatose state to being in a coma (is any level pertinent to some of our students?). Other researchers (e.g., Kihlstrom, 1984) differentiate levels that include conscious, preconscious, and unconscious (cf. Dehaene, Changeux, Naccache, Sackur, & Sergent, 2006, who discussed processing levels that include conscious, preconscious, and subliminal), which, overall, appears to indicate minimally a three-level home (cf. Block, 1995, for a four-fold classification of conscious states). We can also identify two aspects of consciousness, namely, the state of consciousness, as in, for example, being awake or in a coma, and the contents of consciousness, that is, being conscious or aware of a specific item in the input. While these two are not mutually exclusive (Dehaene & Changeux, 2005), we are focusing on the latter aspect in this book. Okay, now that we are all conscious and aware of the slipperiness of this construct (think of trying to get hold of an eel), let us discuss theoretical underpinnings in the non-SLA fields. Just keep in mind, once again, as you read the information below that most of this discussion is provided within the visual attention field, so if we were to throw in natural language processing with all its complexities, the picture does become quite nebulous, doesn’t it?

Theoretical Non-SLA Models of Attention Revisited

As you may recall, in Chapter 3 we discussed the theoretical models of attention from an information processing perspective, which views humans as limited capacity processors of information contained in the input. We also noted that the process of attention was usually addressed via visual attention. We also discussed the notion of several stores postulated to exist along the learning process, from sensory registers to short-term storage of information to long-term storage of said information, with the ensuing discussion of where we should install a filter, leading to the removal of such a filter. We also discussed attention from a neuroscientific perspective, in which this construct or process is viewed as being more fine-grained, occurring in different parts of the brain (a modular perspective), instead of coarse-grained (e.g., a metaphor). In keeping with the rationale for the selection of attentional models in non-SLA fields, let us revisit some of these attentional theories to see what role is assigned to consciousness or awareness, and also consider other theories that have played a role in some SLA theoretical underpinnings and research.

The Role of Consciousness in Non-SLA Information Processing Models

Schmidt (1990) provided quite a concise summary of the notion of consciousness in these information processing theories in psychology. He reported that it is usually associated with this metaphor of human beings as limited capacity processors. Within this metaphor, he identified the notion of consciousness in three ways. The first is associated with the contents of a limited capacity memory system. As you will recall, information processing models typically posit a series of storage structures to account for input processing, ranging from a sensory register, in which pre-attentive and usually unconscious processes are employed to select information from the input, to a short-term memory store (working memory) to a final long-term memory store. Consciousness is usually identified with short-term memory, which in turn is often conflated with the construct of consciousness and focal awareness (Kihlstrom, 1984), and processing in short-term memory is regarded as essential for long-term memory. In other words, information held in short-term memory or working memory, unless further processed, will most likely disappear from memory (e.g., Baars, 1988; Cowan, 1999; Kihlstrom, 1984; Logan, 1988; Nissen & Bullemer, 1987; Posner, 1992). Viewed from this perspective, one postulation may be that the role of consciousness is crucial for learning to take place. Keep in mind, though, that it may not be as simple as viewing the learning of some feature of the L2 input as either with or without awareness, given the many variables that can be associated with the learning process (e.g., type of linguistic item, amount of prior knowledge, motivation or interest, language proficiency, social setting, etc.). The second association, according to Schmidt, is with a limited capacity information selection system.
Consciousness in other information processing models, especially the early filter models discussed in Chapter 3 (e.g., Norman, 1968), is linked to attention viewed as a control process that transforms information (e.g., in the detection storage structure) into focal awareness or as a limited resource, as in the attentional resource models (e.g., Wickens, 1984), where the notion of effort is embedded in the simultaneous performance of two or more tasks. The notion of attention as a resource is also viewed as a distinction between two types of processing, namely, controlled versus automatic processing, where controlled processing is assumed to be under conscious control (Posner & Snyder, 1975; Shiffrin & Schneider, 1977) and typically associated with new or novel information, while automatic processing requires little, if any, mental effort to process the incoming information. The third association has to do with the concept of a limited capacity central executive. In this concept, consciousness is viewed as an internal programmer, executive control center, or a supervisory attentional system to address some sort of planning or making of critical decisions (e.g., Norman & Shallice, 1986). The limited capacity exists in the ability to coordinate mental activity during input processing. There are several other theories or models in non-SLA fields that posit some role for consciousness, but more importantly, these theories or models do allow for the integration of selective attention, working memory, and cognitive control together with consciousness. I am going to select only those that have in some way provided the theoretical foundation for a few current SLA theoretical underpinnings and briefly mention a few others that underscore the care we need to exert when addressing these theoretical underpinnings.

The Global Neuronal Workspace

Before beginning, it is important to draw your attention to and make you aware, once again, of the very important issue of what type of data was gathered to support postulations as to the role of consciousness, namely, perceptual data that were mainly visual and auditory. Probably the best-known non-SLA theory of consciousness is Baars’ (1988) cognitive theory of consciousness, which was later developed further by other researchers, including Dehaene and colleagues (e.g., Dehaene & Naccache, 2001; Dehaene, Sergent, & Changeux, 2003). Premised on empirical findings on (1) the depth of unconscious processing, (2) attention as a prerequisite of consciousness, and (3) the necessity of consciousness for some integrative mental operations, they postulated the “hypothesis of a global neuronal workspace.” One key feature of the global neuronal workspace theory is the perception of the mind as modular (e.g., Fodor, 1983). This is premised on the hypothesis that automatic or unconscious cognitive processing depends on multiple processors or modules (Baars, 1988; Fodor, 1983; Shallice, 1988), with at least two functional and neurobiological definitions dependent upon field of study. Dehaene and Naccache (2001) pointed out that in cognitive psychology, modules are characterized by their information encapsulation, domain specificity, and automatic processing, while in neuroscience specialized neural circuits responsible for processing specific types of input have been identified (e.g., via brain imaging, neuropsychological dissociations, and cell recording) at several spatial scales, ranging from orientation-selective cortical columns to face-selective areas. As information enters, control is distributed across these processors and what is known as a global workspace or central information exchange.
Indeed, Baars (1988) neatly described this global workspace as a broadcasting station that receives information from different sources and shares this information with its listeners. Dehaene and Naccache (2001) proposed that “a given process, involving several mental operations, can proceed unconsciously only if a set of adequately interconnected modular systems is available to perform each of the required operations” (p. 12). This hypothesis implies that multiple unconscious operations can proceed in parallel as long as they do not simultaneously appeal to the same modular systems in contradictory ways (cf. Wickens’ (1989) notion of parallel versus serial processing). In addition, unconscious processing may occur in both low-level, that is, computationally simple, operations and high-level operations, but this kind of processing needs to be associated with “functional neural pathways either established by evolution, laid down during development, or automatized by learning” (p. 13). On the other hand, consciousness depends upon access to this global workspace; conscious experience is informative, and adaptive processes in the nervous system are triggered by conscious events. According to Baars (1988), the learning process begins with the realization that there is something to be learned and undergoes several stages that set the foundation for understanding the new information. Once this foundation for understanding is established (internalized?), the new information fades out of consciousness and becomes part of the unconscious foundation subsequently employed for interpreting new information (think again of activation of prior knowledge). In other words, new information does not appear to be a candidate for implicit or unconscious processing if it is assumed that some connection to previous information is necessary for this type of processing and that some foundation has already been laid down, perhaps via multiple exposures over some period of time, to support such implicit processing. Think of this process as the establishment of knowledge that is subsequently activated to facilitate and process incoming information (or the role of prior knowledge facilitating comprehension, quick retrieval of an associated linguistic form or structure, and learning). Dependent upon the use of prior knowledge, depth of processing or how human beings process incoming information may be affected in the early stages of the learning process. To visualize the workings of this concept of a global workspace, let us take a look at a model of global neuronal workspace theory in relation to the well-researched attentional blink (e.g., Dehaene et al., 2003). In this paradigm, participants are exposed to two successive stimuli presented at two different time intervals. If the interval between the presentations of the two stimuli is short, participants’ ability to report the second stimulus decreases (as if their attention “blinks”). To situate this finding within the model premised on a hierarchical nature of cortical organization, Dehaene et al. (2003) postulated that when the first stimulus was presented, the network created a global state in which the initial stimulus is represented at all levels of the hierarchy.
Due to its recurrent connections, this information may remain for a short period. If the second stimulus is introduced shortly after the first stimulus, it faces top-down competition from the lingering representation of the first stimulus and cannot be processed effectively (sounds like Wickens’ notion of serial processing, doesn’t it, or perhaps to be more up-to-date, multi-tasking?). Interestingly, while the top-down influences in this model derive from the lingering activation of the previous stimulus, Dehaene et al. have observed that other models have identified this top-down influence as representing the focus of attention (e.g., Corchs & Deco, 2002; Spratling & Johnson, 2004), items in working memory (e.g., O’Reilly, 2003; O’Reilly, Braver, & Cohen, 1999), and task demands (e.g., Cohen, Aston-Jones, & Gilzenrat, 2004).

Neuroscience and Connectionist Modeling

Maia and Cleeremans (2005) examined the construct of consciousness from both cognitive neuroscience and connectionist modeling, underscoring the view that in these areas selective attention, working memory, cognitive control, and consciousness involve competition between widely distributed representations, which are biased by top-down processes found notably in the prefrontal cortex in the brain. According to Maia and Cleeremans, these models, including the global workspace theory, embody the fundamental principle used in connectionist models decades ago, namely global constraint satisfaction. They listed the mechanisms hypothesized to be associated with consciousness, provided below (p. 397):

1. Active representation
   Active neuronal firing is necessary (but probably not sufficient) for consciousness.

2. Global competition biased by top-down modulation
   Global competition between representations leads to consciousness. The winning neuronal coalition determines both conscious phenomenal experiences and global accessibility. Active representations maintained by the prefrontal cortex (PFC) are important sources of biases for this competition (Read: higher levels of consciousness based on neuronal firing rate).

3. Global constraint satisfaction
   Global competition implements global constraint satisfaction. Conscious experience can be viewed as the result of a large-scale application of the brain’s knowledge to the current situation (Read: the role of prior knowledge).

4. Reentrant processing
   Recurrent connections are essential to implement global constraint satisfaction. They allow more global interpretations in higher-level areas to influence processing in lower-level areas (which tend to work more like localized feature detection) (Read: the relationship between frequency of input and activation of prior knowledge).

5. Meta-representation
   Higher levels of human consciousness and cognition, such as the ability to think about one’s thoughts, may depend on the creation of representations that are then fed back to the same constraint satisfaction network as input (Read: metacognition, “thinking about thinking,” or monitoring our own performance).

As in most connectionist models, two types of representations are identified (cf. Maia & Cleeremans, 2005). On one hand, there is the long-term knowledge that is embedded or latent in the weights of the connections between units. These representations can influence behavior indirectly by eliciting specific firing patterns of neurons over groups of units, and are not directly accessible. This is the type of unconscious prior knowledge that we employ to make sense of most of what we are exposed to. When some piece of information is difficult to process, the usual culprits may be a simple lack of prior knowledge, an activation of inappropriate knowledge, or the inability to make the appropriate connection. On the other hand, there are representations that are more transient and active in the form of firing patterns, and as such, according to Maia and Cleeremans, conscious representations must depend upon these active representations. The hypothesis, then, is that only the outputs of computations in the brain may be potentially conscious, while the mechanisms of the computations themselves are unconscious. To grasp the concepts of global competition biased by top-down modulation, let us imagine a war of neurons taking place in the brain as information is distributed across several areas of this wonderfully and architecturally designed part of our upper body. One area of neurons is heavily armed and logically overruns a nearby area of neurons. A coalition is then formed between these two areas, leading to increased excitation from the newly combined areas of neurons, thereby increasing its neuronal firing rate. Strong and sustained firing collectively from both areas will most likely lead to being the winning neuronal coalition, which will in turn determine the contents of consciousness. The basic assumption, then, is that the winning coalition of neurons determines conscious experience at a given moment. This assumption explains the observation that information that does not enter the realm of consciousness tends to decay rather quickly (Dehaene & Naccache, 2001). From a more theoretical and anatomical perspective, the prefrontal cortex (PFC) in the brain has been reported to play a crucial role in that it provides the top-down projections that create the idea of large-scale competition, which has been proposed as the mechanism underlying attention (Desimone & Duncan, 1995), working memory (O’Reilly et al., 1999), and cognitive control (Cohen et al., 2004).
The PFC is hypothesized not only to maintain active representations but also to switch rapidly between representations when necessary (O’Reilly, 2003). Indeed, this switching ability is “crucial for the ability to maintain and update representations in working memory, change the representations of task demands in cognitive control, or modify the focus of attention, flexibly and quickly” (Maia & Cleeremans, 2005: 399). Representations that are actively maintained in the PFC, and that appear to have the ability to remain for some time, assist in biasing the competition between representations elsewhere in the brain. Consciousness, then, is viewed as closely related to (1) attention, given that unattended stimuli often fail to enter consciousness, (2) cognitive control, given the mental effort associated with “controlled processes,” and (3) working memory. As pointed out by Maia and Cleeremans, this connectionist perspective differs from that of the global workspace theory (e.g., Baars, 1988; Fodor, 1983) in relation to how the brain functions: While the latter views the brain as consisting of specialized, modular processors and a global workspace connecting these processors, the former postulates that the computation is more distributed and interactive on a global scale, especially due to the existence of massive recurrent connections at all levels of the cortex. This postulation is a typical critique of the existence of strongly encapsulated modules as espoused by the global workspace theory or, put another way, a general versus specific view of the modular brain.

What Do We Take Away from These Theoretical Underpinnings?

Overall, in spite of the association of consciousness with a wide variety of constructs (e.g., limited capacity, WM, attention, control processing, information exchange between autonomous processors, etc.), we may, hopefully, take away four major conclusions. First, consciousness is closely associated with awareness, but whether these two terms can be conflated is another issue. Second, we process incoming information using both conscious and unconscious processes. Conscious processing is viewed as reflective of a limited capacity central processor, and while it is slow, effortful, and mostly serial, it also is subject to deliberate control that can be positively employed to address problems or to set goals. Unconscious processing, on the other hand, is not limited by short-term memory capacity, is not under voluntary control, and is associated with fast and efficient skilled performance. Third, content or information that has not received adequate attention or fails to enter into the realm of consciousness may most likely be discarded from working memory. This assumes that information minimally not attended does not hold much potential to be further processed, and even information minimally attended but not further processed may also be discarded from working memory. Fourth, unconscious processing appears to depend upon some prior knowledge laid down during previous processing by the learner over some period of time. In other words, new information may need to be built upon multiple exposures or occurrences in order to establish a network or connection that would allow minimal effort to process the incoming information. Have you noticed the many references to the role of prior knowledge in performance? Almost everything we do in life is premised on the activation of previous knowledge or experiences, and learning is no different.
Gass’s (1997) notion of apperception captures this simple fact in the sense that any aspect of incoming L2 data necessarily needs to be linked somehow to our prior knowledge, whether it has to do with our expectation of how the input will appear, its physical features, syntax, phonological representation, and so on. I shall discuss this role of prior knowledge in Chapter 8 that deals with concurrent data elicitation procedures.

Relevance to L2 Learning?

Up to this point, we have discussed several theoretical underpinnings postulated for the construct of consciousness in the non-SLA field, mostly from a cognitive science or neuroscience perspective. The question, logically, is how relevant are these models to language learning? Let us take a brief look at the kind of stimuli that were investigated to support these models. The typical stimuli were visual, usually associated with conscious perception as compared to unconscious processing, and one of the most popular measures was the contrastive method proposed by Baars (1988) and embodied in the masking paradigm. This masking paradigm has been claimed to be “one of the simplest and most productive situation in which to study conscious access in normal subjects” (Dehaene, 2009: 90). A simple contrastive method investigates a pair of similar events, of which one is conscious and the other is unconscious. Contrastive events identified in the cognitive neuroscience literature include voluntary versus involuntary actions, normal vision versus blindsight, old versus novel stimuli, masked versus non-masked visual stimuli, distinctions within states of consciousness (e.g., sleep, coma, wakefulness, arousal), accessed versus non-accessed meanings of ambiguous stimuli, and so on. Conducting the masking paradigm is relatively simple. Participants are exposed to a target visual stimulus that is flashed briefly on a computer screen during masking. Either before or after this target stimulus, another visual stimulus is presented either at the same screen location or close by. The purpose of the mask is to erase the perception of the target stimulus, which is usually measured by asking participants to report their ability to see the target stimulus. Even when participants report not being able to see the target stimulus, brain activity still occurs (as measured, for example, by the presence of neurons firing in specific regions of the brain), indicating subsequent behavioral priming effects. Performances are then correlated with activation patterns in the brain known to correspond to the relevant aspect of the stimulus (e.g., conscious versus unconscious perception).
In this way, probing deeper into the kinds of processing that can potentially take place under subliminal masking and unmasked conditions can shed some light on the nature of conscious access during the experiment. Whether we can extrapolate such findings to the field of SLA is an important issue. Attention to a visual stimulus is like the tip of an iceberg when compared to processing the L2, and here the crucial distinction is between attention and perception versus attention, perception, and the processing of a multitude of features embedded in the L2 data. Let us take an example provided by Sharwood Smith, Truscott, and Hawkins (2013) to illustrate this huge difference in terms of what takes place internally. Exposed to the sentence, “Tim meets Kim at the bus-stop every morning,” which would sound acoustically like “TimmeetsKimatthebusstopeverymorning,” Sharwood Smith et al. wrote:

A language learner, whether first or second, has to break into this system and identify the phonemic contrasts, the morphemes, their category membership (whether they are nouns, verbs, members of a Tense category, etc.) and the phrase structure properties and syntactic operations (how questions are formed, how negation is used, whether there is subject-verb agreement, etc.) of the target language. (p. 564)

Enough said. As promised, let us now take a quick look at some non-SLA definitions of what constitutes implicit or explicit learning, or learning with or without awareness, followed by some classic studies in these fields that have attempted to address the issue of “learning.”

Defining Implicit Learning in Cognitive Psychology

Early definitions of and references to consciousness/awareness in fields outside SLA are clear indications of the vagueness of what constitutes awareness, as reported in Leow, Johnson, and Zárate-Sández (2011). For example, Garrett (1943) wrote, “[A]wareness is the searchlight of consciousness. It is the process by means of which the individual consciousness as a whole seeks out and finds its associational affinities everywhere in the universe” (p. 12). You can go back and reread this definition. I did several times and still remain unaware of what Garrett was saying. One of the earliest definitions of the term “implicit learning” was offered decades ago by Thorndike and Rock (1934). Implicit learning was defined as “learning without awareness of what is being learnt or intent to learn it,” which was later adopted by Reber and his associates in the 1960s and even used recently by Leung and Williams (2011) in SLA, as seen in “learning that proceeds without awareness of what is being learned and without intention to learn it” (p. 33). Interestingly, the phrase “implicit learning” is usually attributed to Reber’s (1967) seminal study in cognitive psychology and, more specifically, to the operationalization and measurement of what was assumed to comprise implicit learning, that is, learning without awareness, conducted in an Artificial Grammar Learning (AGL) experiment. Reber’s (1976) definition of implicit learning is relatively similar to Thorndike and Rock’s, as it has been characterized as a “process [author’s italics] whereby a subject becomes sensitive to the structure inherent in a complex array by developing (implicitly) a conceptual model which reflects the structure to some degree” (Reber, 1976: 88). I shall come back to this definition later.

Investigating Implicit Learning in Cognitive Psychology

Irrespective of the vagueness of its definition, whether the role of consciousness/awareness is crucial for further processing to take place and, ultimately, for learning has been and will remain a contentious issue, as we will see here in non-SLA fields (and later in the SLA field, Chapter 10). On the one hand, several researchers have supported a dissociation between learning and awareness (e.g., Carr & Curran, 1994; Curran & Keele, 1993; Hardcastle, 1993; Tomlin & Villa, 1994; Velmans, 1991), while on the other hand, other researchers have not (e.g., Perruchet & Pacteau, 1990, 1991; Shanks, 2005; Shanks, Green, & Kolodny, 1994; Shanks & St. John, 1994). The classical dispute in cognitive psychology regarding the role of awareness in
learning is exemplified in the research on the issue of implicit and explicit learning initiated by Reber and his colleagues on one hand, and by Dulany and his colleagues on the other (e.g., Dulany, Carlson, & Dewey, 1984, 1985; Howard & Ballas, 1980; Reber, 1967, 1976, 1989, 1993; Reber, Kassim, Lewis, & Cantor, 1980; Winter & Reber, 1994). In cognitive psychology, operationalizing implicit and explicit learning was typically achieved by creating (a) a so-called explicit learning condition in which learners are provided with instructions to look for rules underlying the input, and (b) a so-called implicit learning condition in which learners are instructed to memorize the input. The input provided was typically non-linguistic stimuli governed by complex rule or finite-state systems and better known as artificial grammars (AG). The premise underlying this design is that explicit learning conditions promote explicit learning (with some assumed level of awareness present during exposure), while implicit learning conditions promote implicit learning (that is, without the presence of any level of awareness). Let us take a look at Reber’s (1967) seminal study conducted to address Artificial Grammar Learning (AGL). In this study, the first to incorporate artificial grammars into its design, a group of participants was first placed into a so-called incidental learning condition (they were not informed of the purpose of the experimental task and, as a consequence, they entered the experimental condition with no intent to learn), were presented with strings of letters (e.g., VXVS) that were governed by rules in a finite-state system, and were told to memorize them. A second group of participants were exposed to non-ordered letter strings. With practice, the group exposed to rule-ordered stimuli demonstrated some improvement at memorizing and processing strings, whereas the other group showed no such improvement. 
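To make the design concrete, the following is a minimal Python sketch of how rule-governed letter strings and non-ordered control strings of this kind might be produced. The transition table is a hypothetical finite-state grammar invented for illustration and is not the grammar used in Reber (1967); the function names are likewise assumptions.

```python
import random

# Hypothetical finite-state grammar, for illustration only. It is NOT the
# grammar used in Reber (1967). Each state lists its (letter, next_state)
# arcs; a next_state of None marks the end of a string.
GRAMMAR = {
    0: [("T", 1), ("V", 2)],
    1: [("P", 1), ("T", 3)],
    2: [("X", 2), ("V", 3)],
    3: [("S", None), ("X", 1)],
}
LETTERS = "PSTVX"


def generate_string(rng, max_len=8):
    """Walk the grammar from state 0 and return one rule-governed string."""
    while True:
        state, letters = 0, []
        while state is not None and len(letters) < max_len:
            letter, state = rng.choice(GRAMMAR[state])
            letters.append(letter)
        if state is None:  # the walk reached an exit arc within max_len
            return "".join(letters)


def random_string(rng, length):
    """Return a non-ordered control string drawn from the same letters."""
    return "".join(rng.choice(LETTERS) for _ in range(length))


def is_grammatical(string):
    """Return True if the hypothetical grammar can produce the string."""
    state = 0
    for letter in string:
        if state is None:  # letters remain after the grammar has ended
            return False
        arcs = dict(GRAMMAR[state])
        if letter not in arcs:
            return False
        state = arcs[letter]
    return state is None


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        item = generate_string(rng)
        print(item, is_grammatical(item))
```

In this sketch the memorization group would see only output from generate_string, while the comparison group would see output from random_string; is_grammatical then serves as the scoring key for any later judgment task.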
To address the role of consciousness or awareness in learning these strings of letters, Reber operationalized and measured the construct of consciousness or awareness non-concurrently via an offline questionnaire that requested participants to verbalize the rules after exposure to the letter strings. Measurement of learning was obtained on grammaticality judgment tasks that included both old and new exemplars, the scores of which were submitted to t-tests based on chance performance (50%). Performances significantly above chance were interpreted as evidence of incidental/implicit learning and, with respect to new exemplars, as learners’ generalization of the underlying grammar. According to Reber, this fact should be taken as evidence that implicit learning had taken place, that is, individuals could predict at above-chance levels which new strings were acceptable and which were not, even though they were not aware of the existence of a complex rule system underlying them. In a later study, Reber (1976) set out to investigate learning both under implicit and under explicit conditions. He divided participants into two groups, one that was instructed to search for the rules underlying the stimuli and one that was
instructed to memorize exemplars from a synthetic grammar. After the learning phase of the experiment, participants in both groups had to judge the well-formedness of new letter strings. Reber found that participants who had been instructed to search for rules were slower at memorizing strings, and their well-formedness judgments were also less accurate. On the basis of these findings the researcher claimed that implicit learning conditions are more adequate when the stimulus is complex and that, in those circumstances, explicit learning conditions have a detrimental effect. These findings, however, seem to be contradicted by some studies in which learners explicitly instructed to search for patterns underlying certain stimuli outperformed those who were not explicitly instructed. For instance, in Howard and Ballas (1980), making learners aware of the rule-governed nature of the stimuli had a facilitative effect on learning as compared to a group of participants who did not receive any explicit grammatical information. Reber (1989, 1993) and Winter and Reber (1994) argue that the advantage of learning without awareness over explicit learning can only be seen when the underlying rules are complex and cannot be deciphered by the individuals. In Reber’s view, implicit learning is characterized by four crucial traits: (a) It is a process that takes place outside of awareness (i.e., formation and testing of hypotheses do not make a difference in implicit learning); (b) knowledge derived from it is represented tacitly and abstractly (i.e., this knowledge can be generalized to strings that are not presented during the training phase and even to different symbol sets); (c) this knowledge resists conscious inspection, but it can be used for grammaticality judgments; and (d) it is not exclusive to any specific domain of knowledge.
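The above-chance criterion used to score grammaticality judgments in these studies can be sketched as a one-sample t-test against a chance level of 50%. The accuracy scores below are invented for illustration and do not come from any of the studies cited; the function name is also an assumption.

```python
import math
import statistics


def t_against_chance(scores, chance=0.5):
    """One-sample t statistic for mean accuracy versus a chance level.

    Returns the t value and its degrees of freedom (n - 1).
    """
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    t = (mean - chance) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical proportion-correct scores, one per participant.
    accuracy = [0.60, 0.70, 0.65, 0.55, 0.75, 0.60, 0.70, 0.65]
    t, df = t_against_chance(accuracy)
    print(f"t({df}) = {t:.2f}")
```

The resulting t value would then be compared with the critical value for the given degrees of freedom. Note that a significant result shows only that judgments exceed chance; it says nothing by itself about whether the underlying knowledge was held with or without awareness, which is the point on which Reber and his critics disagree.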
The other side of the implicit/explicit debate in the cognitive psychology field is represented by the studies carried out by Dulany and his colleagues, who questioned the degree to which participants in Reber’s experiments held abstract knowledge outside of awareness. Drawing from the experiments conducted by Reber and his associates, Dulany et al. (1984, 1985) argued that individuals exposed to finite-state grammars were able to decide on the grammaticality of new structures, not because they had acquired tacit, unconscious knowledge through exposure, but rather because they had developed an idiosyncratic rule system of limited scope that determined their judgment. This would appear to be some type of explicit knowledge below awareness at the level of understanding (Schmidt, 1990). Moreover, they argued that the information participants had derived from exposure to the stimuli was not to be identified with the formal grammar underlying the input, but rather with an idiosyncratic grammar that was correlated with it. To sum up, Dulany et al. (1984, 1985) defended the view that Reber and his associates had not demonstrated a dissociation between learning and awareness, and that Reber’s interpretation of learning under implicit conditions as abstract and independent of memory was unwarranted. The same stance was also taken by Perruchet and Pacteau (1990, 1991) and Shanks (2005).

In addition to artificial grammar studies, empirical support for the dissociation between learning and awareness at the level of form associations includes studies that have used semantic priming tasks (e.g., Marcel, 1983), the serial reaction-time task to address sequence learning (e.g., Jiménez & Méndez, 1999), and the contextual cuing paradigm in visual perception (e.g., Chun, 2000; Jiang & Chun, 2003), which investigated shapes presented in target and distractor stimuli. These tasks are argued to elicit evidence of participants’ sensitivity to specific regularities without their being aware of such regularities. Recently, Williams (2013) reported that, in addition to investigating implicit learning of a form association, there has been a move to address this type of learning while making semantic associations, such as becoming sensitive to semantic category sequences (for example, that a picture of an animal will follow a picture of a body part; Goschke & Bolte, 2007), and, through the use of contextual cuing paradigms, the prediction of target position by the semantic properties of the distractors, such as the oddness or evenness of digits (Goujon, Didierjean, & Marmeche, 2007), the semantic category of words (Goujon, Didierjean, & Marmeche, 2009), and the type of scene in which the target appeared (Goujon, 2011). To put into perspective what specifically these studies addressed, let us take a closer look at Goujon (2011), who investigated the extent to which learning mechanisms are deployed on semantic categorical regularities during visual search within real-world scenes (e.g., a bathroom, a bedroom, a living room). According to Chun (2000), people develop sensitivity to statistical regularities in the stimulus environment, which, in turn, constrains what to expect and where to look. The contextual cueing paradigm, then, was used with photographs of indoor scenes in which the semantic category did or did not predict the target position on the screen.
According to Goujon, one advantage of the contextual cueing paradigm, in contrast to subjective measures, is that it uses a visual search task to indirectly examine the progressive development of learning associated with contextual regularities. In a visual search task, participants search for a rotated T among a number of distracting rotated Ls. However, participants are not told that the displays are repeated in such a way that certain spatial configurations of distractors tend to appear with certain target positions. In this study, the general principle of the paradigm consisted of presenting regularities within search areas that were predictive of a characteristic of the target (e.g., its location) and exposing participants to these regularities throughout the course of the task. Participants then were instructed to search as quickly as possible for a target, either an L or a T, within photographs that were new in each trial of the task. Some of these targets were in the same spatial location for specific scenes. Once they had identified the target, participants pressed the corresponding key for that target. A follow-up explicit memory task (an offline verbal report) was also administered. Implicit learning was reported. Needless to say, the data gathered in these studies need to be viewed with much caution when being extrapolated to naturally occurring languages.

The overall focus of current studies of awareness in non-SLA fields appears to be on the dichotomy between implicit and explicit knowledge, that is, learning is viewed as a product and not a process, and this is revealed in the way awareness is being operationalized and measured, namely offline or after exposure to the experimental data. As I pointed out in Leow (2015), what is of interest is that implicit learning has been characterized as a “process [author’s italics] whereby a subject becomes sensitive to the structure inherent in a complex array by developing (implicitly) a conceptual model which reflects the structure to some degree” (Reber, 1976: 88). However, in spite of this characterization of implicit learning as a process assumedly taking place during exposure to the experimental data, Reber and his colleagues operationalized and measured the learning process by eliciting offline data. If we view the framework of the learning process in Chapter 2, data gathered in these measures to address the role of unawareness in the learning process are located beyond Stage 5 at the output stage and represent knowledge (a product) that is closely associated with learners’ ability to verbalize the targeted underlying linguistic rules. Interestingly, many studies motivated by Reber’s works, in both cognitive psychology and SLA fields, have followed over the last four decades his offline operationalization of awareness (via verbal reports) as a product, including novel additions to measure awareness non-concurrently, such as participant confidence ratings (e.g., Dienes & Altmann, 1997) along with grammaticality judgment ratings (e.g., Tunney & Shanks, 2003), source attributions (e.g., Dienes & Scott, 2005), and wagering (e.g., Dienes & Seth, 2010). 
In studies employing confidence ratings, participants are asked to report their overall level of confidence regarding their decision on an assessment test (e.g., a GJT), usually from a selection of the following options: No confidence, somewhat confident, very confident, and absolutely certain. Participants selecting no confidence or somewhat confident are coded as unaware, while those selecting very confident and absolutely certain are coded as aware. Source attributions ask participants to decide what knowledge source they used to judge each item on the assessment task (again, usually a GJT) based on one of the options: Rule, memory, intuition, or guess. According to Dienes and Scott (2005), source attributions are believed to reflect the underlying structural knowledge created during training that participants use during testing to make their judgments. Participants selecting rule and memory are coded as explicit learning while those selecting intuition and guess are coded as implicit learning. Wagering involves giving participants a certain amount of “money” and asking them to bet on their decisions on a task. Presumably, the amount of money wagered is an indication of the amount of confidence participants have in their decisions. Note that these measures are all administered beyond Stage 5 and arguably are eliciting knowledge gained during the experimental exposure or treatment. The employment of this four decades–old, nonconcurrent operationalization and measurement of awareness in current studies, both in SLA and non-SLA fields, is indeed remarkable.
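The coding schemes just described for confidence ratings and source attributions amount to two simple lookup tables, which can be sketched as follows. The option labels come from the description above; the function names and the sample responses are hypothetical, and actual studies may word or score the options differently.

```python
from collections import Counter

# Coding of confidence ratings into aware/unaware, as described above.
CONFIDENCE_TO_AWARENESS = {
    "no confidence": "unaware",
    "somewhat confident": "unaware",
    "very confident": "aware",
    "absolutely certain": "aware",
}

# Coding of source attributions into explicit/implicit learning.
SOURCE_TO_LEARNING = {
    "rule": "explicit",
    "memory": "explicit",
    "intuition": "implicit",
    "guess": "implicit",
}


def code_response(confidence, source):
    """Map one participant's two selections onto the two binary codings."""
    awareness = CONFIDENCE_TO_AWARENESS[confidence.strip().lower()]
    learning = SOURCE_TO_LEARNING[source.strip().lower()]
    return awareness, learning


def tally(responses):
    """Count how many responses fall into each (awareness, learning) cell."""
    return Counter(code_response(c, s) for c, s in responses)


if __name__ == "__main__":
    # Hypothetical post-test selections: (confidence rating, source).
    sample = [
        ("Very confident", "Rule"),
        ("No confidence", "Guess"),
        ("Somewhat confident", "Intuition"),
        ("Absolutely certain", "Memory"),
    ]
    for cell, count in sorted(tally(sample).items()):
        print(cell, count)
```

Seen this way, the binary nature of the coding is apparent: each participant, or each item, is assigned to a category after the fact on the basis of a single self-report, which is the sense in which these measures capture a product rather than a process.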

Summary

The research methodology employed to address the role of awareness in learning as a process needs to be viewed with some caution, given that it may not have high internal validity (this is more fully discussed in Chapter 6, on research methodology). Operationalizing awareness by the creation of external learning conditions, based on the assumption that participants in one condition represent that condition, may not be robust enough to elicit representative data regarding the role of awareness while interacting with the data. In addition, several researchers (e.g., Carr & Curran, 1994; Leow, 2000; Leung & Williams, 2011; Schmidt, 1995; Simard & Wong, 2001; VanPatten, 1994) have pointed out that the data participants are exposed to during the experimental phase of the study are not entirely relevant to natural languages. For example, the task demands and learning conditions of finite-state grammar learning or visual stimuli or non-language data do not match those of natural language learning or acquisition. Such data also lack important features that characterize natural languages: They have no semantic component, no surface features that encode morphological form, agreement mapping, and phonological inflections, and no rules that dictate and restrict syntactic movement.

Conclusion

In this chapter, we first acknowledged the vagueness of what comprises the construct of consciousness in non-SLA fields and the challenges facing us to operationalize and measure this slippery construct. We also noted that most of the theoretical underpinnings that have in some way influenced several current theoretical underpinnings in SLA are initially premised on data gathered in the field of visual attention. This, in turn, does require some caution when extrapolating the tenets of these theories to the field of SLA. Finally, we took a brief look at the empirical controversy in non-SLA fields as to whether awareness plays an important role in “learning.” It was noted that we need to be aware that not all studies addressing the role of awareness in non-SLA fields were indeed investigating the process of L2 learning. In the next chapter, we are going to discuss several theoretical underpinnings postulated in SLA to account for different stages of the L2 learning process.

References

Baars, B. J. (1988). A cognitive theory of consciousness. Cambridge: Cambridge University Press.
Block, N. (1995). On a confusion about a function of consciousness. Behavioral and Brain Sciences, 18, 227–247.
Carr, T., & Curran, T. (1994). Cognitive factors in learning about structured sequences: Applications to syntax. Studies in Second Language Acquisition, 16, 205–230.
Chun, M. M. (2000). Contextual cueing of visual attention. Trends in Cognitive Sciences, 4, 170–178.
Cohen, J. D., Aston-Jones, G., & Gilzenrat, M. S. (2004). A system-level perspective of attention and cognitive control: Guided activation, adaptive gating, conflict monitoring, and exploitation versus exploration. In M. I. Posner (Ed.), Cognitive neuroscience of attention (pp. 71–90). New York: Guilford Press.
Corchs, S., & Deco, G. (2002). Large-scale neural model for visual attention: Integration of experimental single-cell and fMRI data. Cerebral Cortex, 12, 339–348.
Cowan, N. (1999). An embedded-processes model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 62–101). Cambridge: Cambridge University Press.
Curran, T., & Keele, S. (1993). Attentional and nonattentional forms of sequence learning. Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 189–202.
Dehaene, S. (2009). Conscious and nonconscious processes: Distinct forms of evidence accumulation? Séminaire Poincaré, 12, 89–114.
Dehaene, S., & Changeux, J. P. (2005). Ongoing spontaneous activity controls access to consciousness: A neuronal model for inattentional blindness. PLoS Biology, 3 (5), e141.
Dehaene, S., Changeux, J. P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: A testable taxonomy. Trends in Cognitive Sciences, 10, 204–211.
Dehaene, S., & Naccache, L. (2001). Towards a cognitive neuroscience of consciousness: Basic evidence and a workspace framework. Cognition, 79, 1–37.
Dehaene, S., Sergent, C., & Changeux, J. P. (2003). A neuronal network model linking subjective reports and objective physiological data during conscious perception. Proceedings of the National Academy of Sciences, 100, 8520–8525.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Dienes, Z., & Altmann, G. (1997). Transfer of implicit knowledge across domains: How implicit and how abstract? In D.C. Berry (Ed.), How implicit is implicit learning? (pp. 107–123). Oxford: Oxford University Press.
Dienes, Z., & Scott, R. (2005). Measuring unconscious knowledge: Distinguishing structural knowledge and judgment knowledge. Psychological Research, 69 (5/6), 338–351.
Dienes, Z., & Seth, A. (2010). Gambling on the unconscious: A comparison of wagering and confidence ratings as measures of awareness in an artificial grammar task. Consciousness & Cognition, 19, 674–681.
Dimond, S. J. (1976). Brain circuits for consciousness. Brain, Behavior, and Evolution, 13, 376–395.
Dulany, D., Carlson, R., & Dewey, G. (1984). A case of syntactical learning and judgment: How conscious and how abstract? Journal of Experimental Psychology, 113, 541–555.
Dulany, D., Carlson, R., & Dewey, G. (1985). On consciousness in syntactic learning and judgment: A reply to Reber, Allen and Regan. Journal of Experimental Psychology, 114, 25–32.
Fodor, J. A. (1983). The modularity of the mind. Cambridge, MA: MIT Press.
Friedenberg, J. (2013). Visual attention and consciousness. New York: Psychology Press.
Garrett, E. (1943). Awareness. New York: Creative Age Press.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Goschke, T., & Bolte, A. (2007). Implicit learning of semantic category sequences: Response-independent acquisition of abstract sequential regularities. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 394–406.
Goujon, A. (2011). Categorical implicit learning in real-world scenes: Evidence from contextual cuing. Quarterly Journal of Experimental Psychology, 64, 920–941.
Goujon, A., Didierjean, A., & Marmeche, E. (2007). Contextual cuing based on specific and categorical properties of the environment. Visual Cognition, 15, 257–275.
Goujon, A., Didierjean, A., & Marmeche, E. (2009). Semantic contextual cuing and visual attention. Journal of Experimental Psychology: Human Perception and Performance, 35, 50–71.
Hardcastle, V. G. (1993). The naturalists versus the skeptics: The debate over the scientific understanding of consciousness. Journal of Mind and Behavior, 14, 27–50.
Howard, J., & Ballas, J. (1980). Syntactic and semantic factors in the classification of nonspeech transient patterns. Perception and Psychophysics, 28, 431–439.
Jiang, Y., & Chun, M. M. (2003). Contextual cueing: Reciprocal influences between attention and implicit learning. In L. Jiménez (Ed.), Attention and implicit learning (pp. 277–296). Amsterdam: John Benjamins.
Jiménez, L., & Méndez, C. (1999). Qualitative differences between implicit and explicit sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 236–259.
Kihlstrom, J. (1984). Conscious, subconscious, unconscious: A cognitive perspective. In K. Bowers & D. Meichenbaum (Eds.), The unconsciousness reconsidered (pp. 149–211). New York: Wiley.
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware vs. unaware learners. Studies in Second Language Acquisition, 22, 557–584.
Leow, R. P. (2015). Implicit learning in SLA: Of processes and products. In P. Rebuschat (Ed.), Implicit and explicit learning of languages. Amsterdam: John Benjamins.
Leow, R. P., Johnson, E., & Zárate-Sández, G. (2011). Getting a grip on the slippery construct of awareness: Toward a finer-grained methodological perspective. In C. Sanz & R. P. Leow (Eds.), Implicit and explicit conditions, processes and knowledge in SLA and bilingualism (pp. 61–72). Washington, D.C.: Georgetown University Press.
Leung, J. H. C., & Williams, J. N. (2011). The implicit learning of mappings between forms and contextually derived meanings. Studies in Second Language Acquisition, 33, 33–55.
Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95, 492–527.
Maia, T. V., & Cleeremans, A. (2005). Consciousness: Converging insights from connectionist modeling and neuroscience. Trends in Cognitive Sciences, 9 (8), 397–404.
Marcel, A. J. (1983). Conscious and unconscious perception: Experiments on visual masking and word recognition. Cognitive Psychology, 15, 197–237.
Merikle, P. M., Smilek, D., & Eastwood, J. D. (2001). Perception without awareness: Perspectives from cognitive psychology. In S. Dehaene (Ed.), The cognitive neuroscience of consciousness (pp. 115–134). Cambridge, MA: MIT Press.
Nissen, M., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19, 1–32.
Norman, D. A. (1968). Toward a theory of memory and attention. Psychological Review, 84, 231–259.
Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control of behavior. In R. J. Davidson, G. E. Schwartz, & D. Shapiro (Eds.), Consciousness and self-regulation (Vol. 4). New York: Plenum Press.
O’Reilly, R.C. (2003). Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia (Technical Report 03–03). Institute of Cognitive Science, University of Colorado at Boulder.
O’Reilly, R.C., Braver, T. S., & Cohen, J. D. (1999). A biologically based computational model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 375–411). Cambridge: Cambridge University Press.
Perruchet, P., & Pacteau, C. (1990). Synthetic grammar learning: Implicit rule abstraction or explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, 264–275.
Perruchet, P., & Pacteau, C. (1991). Implicit acquisition of abstract knowledge about artificial grammar: Some methodological and conceptual issues. Journal of Experimental Psychology: General, 120, 112–116.
Posner, M. (1992). Attention as a cognitive and neural system. Current Directions in Psychological Science, 1, 11–14.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. Solso (Ed.), Information processing and cognition: The Loyola Symposium. Potomac, MD: Erlbaum.
Reber, A. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and Verbal Behavior, 6, 855–863.
Reber, A. (1976). Implicit learning of synthetic languages: The role of instructional set. Journal of Experimental Psychology: Human Learning and Memory, 2, 88–94.
Reber, A. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology, 118, 219–235.
Reber, A. (1993). Implicit learning and tacit knowledge: An essay on the cognitive unconscious. Oxford: Clarendon Press.
Reber, A., Kassim, S., Lewis, S., & Cantor, G. (1980). On the relationship between implicit and explicit modes in the learning of a complex rule structure. Journal of Experimental Psychology, 6, 492–502.
Schachter, D. L. (1989). On the relation between memory and consciousness: Dissociable interactions and conscious experience. In H. L. Roediger & F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honour of Endel Tulving. Hillsdale, NJ: LEA.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Schmidt, R. (1995). Consciousness and foreign language learning: A tutorial on the role of attention and awareness in learning. In R. Schmidt (Ed.), Attention and awareness in foreign language learning (pp. 1–63). Honolulu, HI: University of Hawai’i Press.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge: Cambridge University Press.
Shanks, D. R. (2005). Implicit learning. In K. L. Lamberts & R. L. Goldstone (Eds.), Handbook of cognition (pp. 202–220). London: Sage.
Shanks, D. R., Green, R. E. A., & Kolodny, J. A. (1994). A critical examination of the evidence for unconscious (implicit) learning. In C. Umiltà & M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing (pp. 837–860). Cambridge, MA: MIT Press.
Shanks, D. R., & St. John, M. F. (1994). Characteristics of dissociable human learning systems. Behavioral and Brain Sciences, 17, 367–447.
Sharwood Smith, M., Truscott, J., & Hawkins, R. (2013). Explaining change in transition grammars. In J. Herschensohn & M. Young-Scholten (Eds.), The Cambridge handbook of second language acquisition (pp. 560–580). New York: Cambridge University Press.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Simard, D., & Wong, W. (2001). Alertness, orientation, and detection: The conceptualization of attentional functions in SLA. Studies in Second Language Acquisition, 23, 103–124.
Spratling, M. W., & Johnson, M. H. (2004). A feedback model of visual attention. Journal of Cognitive Neuroscience, 16, 219–237.
Thorndike, E. L., & Rock, R. T. (1934). Learning without awareness of what is being learned or intent to learn it. Journal of Experimental Psychology, 17, 1–19.
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16 (2), 183–203.
Tulving, E. (1993). Varieties of consciousness and levels of awareness in memory. In A. Baddeley & L. Weiskrantz (Eds.), Attention: selection, awareness, and control: A tribute to Donald Broadbent (pp. 283–299). Oxford: Clarendon Press.
Tunney, R. J., & Shanks, D. R. (2003). Subjective measures of awareness and implicit cognition. Memory & Cognition, 31 (7), 1060–1071.
VanPatten, B. (1994). Evaluating the role of consciousness in second language acquisition: Terms, linguistic features & research methodology. AILA Review, 11, 27–36.
Velmans, M. (1991). Is human information processing conscious? Behavioral and Brain Sciences, 14, 651–669.
Wickens, C. D. (1984). Processing resources in attention. In R. Parasuraman & D. R. Davies (Eds.), Varieties of attention (pp. 63–102). New York: Academic Press.
Wickens, C. D. (1989). Attention and skilled performance. In D. Holding (Ed.), Human skills (pp. 71–105). New York, London: John Wiley.
Williams, J. N. (2013). Attention, noticing, and awareness in language processing and learning. In J. M. Bergsleithner, S. N. Frota, & J. K. Yoshioka (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 39–57). Honolulu, HI: University of Hawai’i, National Foreign Language Resource Center.
Winter, B., & Reber, A. (1994). Implicit learning and the acquisition of natural languages. In N. Ellis (Ed.), Implicit and explicit learning of languages (pp. 115–146). London: Academic Press.

5 THEORETICAL FOUNDATIONS FOR THE ROLES OF ATTENTION AND AWARENESS IN L2 LEARNING IN SLA

A cursory review of the theoretical underpinnings postulated to address the learning process (as in stages) or product (as in knowledge) reveals quite an impressive number that are grounded in several different perspectives of the learning process. These include perspectives that are generative Chomskyan linguistic (e.g., Carroll’s (2007) Autonomous Induction Theory), social (e.g., Lantolf & Thorne’s Vygotskian Sociocultural Theory), cognitive neuroscience–based (e.g., Ullman’s (2004) Declarative/Procedural Model of Memory), psychology-based (e.g., McLaughlin’s Cognitive Theory, Swain’s Output Hypothesis, Ellis’s (2001) Associative-Cognitive CREED Framework, DeKeyser’s (2007) Skill Acquisition Theory, VanPatten’s Input Processing Model/Theory, Pienemann’s (2007) Processability Theory, Truscott and Sharwood Smith’s (2004) MOGUL (Modular Online Growth and Use of Language)), and interaction/psychology-based (e.g., Gass and Mackey’s Interaction Approach). I am using the phrase “theoretical underpinnings” to subsume all types of theoretical formats (e.g., hypothesis, model, framework, theory; cf. Chapter 12 for further elaboration). In this chapter, I will provide a brief description of the major tenets of several theoretical underpinnings of second/foreign language (L2) learning/acquisition premised on the psycholinguistic underpinnings of attention and/or awareness (Ellis, 2007; Gass, 1997; McLaughlin, 1987; Robinson, 1995, 2003; Schmidt, 1990, and elsewhere; Swain, 2005; Tomlin & Villa, 1994; Truscott & Sharwood Smith, 2011; VanPatten, 2004, 2007 (cf. Mitchell, Myles, & Marsden, 2013; VanPatten & Williams, 2007, for succinct descriptions of other theoretical underpinnings postulated for SLA)). I shall also provide some comments on some of the more popular theoretical underpinnings. As you will see, they all are derived from some branch of psychology and posit important roles for several cognitive processes in L2 learning, and most acknowledge the learning process as a series of stages.

As you read these descriptions, think back to the models of attention discussed in Chapters 3 and 4, and try to identify the specific psychology-based or neuroscience-based foundations upon which SLA underpinnings are based. Truscott and Sharwood Smith’s MOGUL framework is also included as an example of a modular approach to L2 learning. A summary of the principal cognitive processes in the learning process is provided, followed by a discussion of the role of prior knowledge in L2 learning.

Theoretical Underpinnings of Second/Foreign Language Learning/Acquisition

One indisputable key process in all of these psychology-based theoretical underpinnings is the role attention plays in the L2 learning process, a process that has almost always been assumed in any learning condition and that is clearly substantiated by our constant question to our students: “Are you paying attention?” In the late 1980s and early 1990s, the construct of attention began to take on a more cognitive perspective as SLA researchers began to extrapolate theoretical attentional postulations from other fields, such as cognitive psychology/science and cognitive neuroscience, to the SLA literature. These theoretical postulations addressed either a partial perspective, namely Stage 1 and/or Stage 3 (the input-to-intake stage as in Schmidt, 1990, and elsewhere; Tomlin & Villa, 1994, and including Stage 3 as in McLaughlin, 1987; VanPatten, 2004, 2007; Robinson, 1995, 2003) or Stage 5 (the output stage as in Swain, 2005), or a full perspective, namely from Stage 1 to Stage 5 (input to output as in Gass, 1997, and elsewhere). To situate the theoretical postulations with regard to the learning process, the visual and broad representation of the major stages of the L2 learning process in Chapter 2 can be useful.

Partial Theoretical Underpinnings

Stage 1: Input to Intake

Several of the theoretical underpinnings postulated to account for the L2 learning process have only addressed fully the early stages of this process, namely, the input to intake phase (Stage 1) and/or the intake processing phase (Stage 3). Let us begin with Schmidt’s (1990, 2001) Noticing Hypothesis.

Schmidt’s Noticing Hypothesis

Schmidt’s (1990 and elsewhere) Noticing Hypothesis (modified during later years) was the first theoretical postulation in the SLA field to address the role of attention in direct relation to the construct of awareness at the early input-to-intake stage (Stage 1) of the L2 learning process. Drawing from works in


Theoretical Foundations

cognitive psychology (e.g., the notions of the learner as a limited capacity processor and selective or focal attention) and his own personal experience while learning Portuguese (Schmidt & Frota, 1986), Schmidt’s Noticing Hypothesis agrees with the notion that attention controls access to awareness and therefore is responsible for noticing. Schmidt (1994) defines noticing as “the registration of the occurrence of a stimulus event in conscious awareness and its subsequent storage in long-term memory” (p. 166), and operationalizes this construct as availability for self-report either during or immediately after exposure to the input, with the caveat that lack of self-report does not necessarily imply lack of awareness, since certain subjective experiences may be difficult to verbalize due to memory limitations and to lack of metalanguage. According to Schmidt, to learn any linguistic feature of the L2, for example, sounds, words, grammar, pragmatics, etc., this feature in the L2 input must be noticed, that is, paid focal attention to with minimally a low level of awareness by learners, even though they may lack understanding of the underlying rule associated with this linguistic feature. Drawing on the prominent theoretical assumption in most 19th-century literature in cognitive psychology that focal attention and awareness are two sides of the same coin (cf. Neumann, 1996, for a review), Schmidt rejects the idea of learning without awareness. To ground his hypothesis more firmly on what comprises consciousness, Schmidt points out the terminological confusion in cognitive psychology and in the SLA literature regarding the constructs of awareness and consciousness (cf. Chapter 4 and McLaughlin, 1990).
To address this confusion, Schmidt states that his hypothesis is solely based on the last perspective of consciousness, namely consciousness as awareness (which refers to the issue of acquiring knowledge within or outside awareness at the point of learning), and strongly argues against the possibility of implicit learning when this term is understood as an abstraction that takes place outside of awareness and without the help of conscious processes such as hypothesis formation and testing (Schmidt, 1994). According to Schmidt’s (1994) early postulation of his Noticing Hypothesis, attention controls access to awareness and is responsible for the subjective experience of noticing, which is “the necessary and sufficient condition for the conversion of input to intake” (p. 209). His position is that focal attention is isomorphic with awareness and that learning without awareness must be ruled out. In other words, whereas Schmidt (1995) admits the existence of processes that stand outside of awareness (e.g., implicit learning), he rejects the possibility of abstraction without awareness and contends that abstraction is always associated with conscious cognitive functions. In order to support his hypothesis that only L2 data that are noticed can be converted into intake, Schmidt (1990) provides evidence from his own experience as a learner of Portuguese (Schmidt & Frota, 1986). In this diary study, the researchers found a close correspondence between the elements that Schmidt had noticed and entered into his diary and the items that he produced in a series


of tape-recorded interactions with native speakers. However, as Schmidt (1993, 1994) notes, as well as others (e.g., Leow, 1997), evidence coming from diary studies alone is problematic in that “making a record in a diary requires not only noticing but also a higher level of self-awareness—awareness that one has noticed and needs to make a record of that noticing—which no theory claims as necessary for learning. It appears unlikely, therefore, that diary studies can resolve the zero-point problem” (1993: 211). In Schmidt’s view, even though Schmidt and Frota’s (1986) study provides strong evidence for an association between noticing and emergence in production, it does not show that noticing is sufficient for learning (there were numerous cases where a form was noticed and momentarily used, but never appeared again in his production), nor does it show that noticing is necessary for intake (some forms that appeared in production were never mentioned in the diary as having been noticed). Schmidt (1994), however, concedes that the crucial role of noticing in input processing is controversial, since there is a widespread belief that we can somehow pick up elements of the language without being aware of it (cf. Krashen, 1989; Tomlin & Villa, 1994, and a slew of visual attention studies in cognitive psychology, cf. Chapter 3). Given the methodological problem of establishing zero awareness at the point of noticing or processing, Schmidt has withdrawn from his original postulation of noticing as “the necessary and sufficient condition for the conversion of input into intake” (Schmidt, 1993: 209) to one of “more noticing leads to more learning” (Schmidt, 1994: 129), underscoring the facilitative nature of noticing in the early stages of the learning process. Besides noticing, Schmidt (1990) distinguishes a higher level of awareness that he calls understanding, and which is related to the ability to analyze, compare, and test hypotheses about the linguistic input. 
In the researcher’s view, whereas noticing is necessary for intake to take place, understanding may act as a facilitator for learning, but its presence is not necessary. The crucial difference between noticing and understanding is that, according to Schmidt (1993), the former results in intake and in item learning, while the higher level of awareness promotes deeper learning marked by restructuring and system learning and is underscored by learners’ ability to analyze, compare, and test hypotheses at this level. In other words, understanding seems to be a more sophisticated process than noticing, although they both allow for storage of linguistic material in long-term memory. Schmidt writes:

I use “noticing” to mean conscious registration of the occurrence of some event, whereas “understanding” as I am using the term, implies recognition of a general principle, rule or pattern. Noticing refers to surface level phenomena and item learning, while understanding refers to deeper level of abstraction related to (semantic, syntactic, or communicative) meaning, system learning. (Schmidt, 1995: 29)


Key Features

Here are the key features of Schmidt’s hypothesis:

1. Attention is crucial for intake and, as an extension, for learning;
2. Focal attention is accompanied by a low level of awareness called noticing;
3. What is noticed in the L2 input becomes intake;
4. Intake does not take place without some level of awareness associated with such a process at the preliminary stage of the learning process; and
5. While not necessary for subsequent processing of the input, there is also a higher level of awareness involved during the learning process, namely awareness at the level of understanding.

Comments on Schmidt’s Noticing Hypothesis

It is well accepted that Schmidt’s (1990 and elsewhere) Noticing Hypothesis is arguably one of the most influential theoretical underpinnings in SLA over the last two decades. Like new candy, several researchers became attracted (addicted?) to the hypothesis and began to probe deeper into the operationalization and measurement of the constructs of attention and awareness in an effort to understand better the cognitive processes involved during the early stages of SLA and, more specifically, intake. Interestingly, Schmidt’s hypothesis has been, with or without researchers’ awareness, adopted in several major strands of research that include input enhancement, learning conditions, interaction, feedback, and instruction. Schmidt’s Noticing Hypothesis has been well tested in the SLA field (I will elaborate on this in Chapter 9) and, like most popular postulations (recall Krashen?), also critiqued (e.g., Leow, 2013; Truscott, 1998; Truscott & Sharwood Smith, 2011). Truscott (1998) and Truscott and Sharwood Smith (2011), from a modular perspective of the learning process based on Jackendoff’s (1997, 2002) view of the architecture of the language faculty, underscored the vagueness or confusion of what comprises both noticing and consciousness, the boundaries between awareness at the level of noticing and awareness at the level of understanding, what specifically learners notice in the L2 input or, more specifically, the contents or objects of consciousness, the status of meaning during noticing, and the contrast between the noticing issue and implicit learning. I reflected on the Noticing Hypothesis from three perspectives, namely theoretical, methodological, and terminological.
Theoretically, like Truscott and Sharwood Smith, I viewed the Noticing Hypothesis as being relatively coarse-grained, but more from the perspective that it does not appear to acknowledge that there may be several other variables potentially associated with the process of noticing. I also noted that while noticing is empirically supported to be facilitative of subsequent intake and potential learning, there is no hard evidence that all such noticed intake is logically processed further and, indeed, learned or


internalized in the internal system. In addition, the construct of intake is clearly in need of further elaboration in the field. As discussed above, concurrent data appear to suggest that there may be different types or levels of intake dependent upon the levels of processing or amount of cognitive effort involved while attending to L2 input. Methodologically, Schmidt’s notion of focal attention being isomorphic with awareness, the two constructs comprising the two sides of noticing, is currently raising the issue of whether it is possible to separate these two constructs. I have personally employed think aloud protocols to operationalize and measure both attention and awareness, with the premise that whatever is verbalized necessarily needed to be paid attention to. At the same time, it is well accepted that not everything that has been attended to is verbalized. Perhaps employing concurrently both eye-tracking measures and think aloud protocols (I will elaborate on these two measures in Chapter 8), while controlling for reactivity, would be the more appropriate methodological procedure to minimally establish the process of attention (via eye-tracking) and (levels of) awareness (via think alouds). In this way, the internal validity of the study is promoted and the strengths and limitations of the two procedures are addressed (cf. also Leow, Grey, Marijuan, & Moorman, 2014). Finally, there may be some unfortunate terminological confusion regarding what stage of the learning process the original Noticing Hypothesis (as postulated by Schmidt in 1990) was targeting (cf. several references made to the Noticing Hypothesis and learning, e.g., Truscott, 1998, and Truscott & Sharwood Smith, 2011, above). Schmidt’s hypothesis was clearly premised on the input-to-intake stage (Stage 1) of the learning process.
I pointed out that this terminological confusion was perhaps promoted by Schmidt himself as he elaborated his Noticing Hypothesis in many instances in terms of learning (“more noticing leads to more learning” (Schmidt, 1994: 129)) and indirectly postulated that whatever data were noticed were subsequently processed (“the registration of the occurrence of a stimulus event in conscious awareness and its subsequent storage in long-term memory” (Schmidt, 1994: 166)), most likely not paying attention to or being aware of the stage of the learning process originally postulated in his hypothesis. In addition, concurrent data gathered in several empirical studies also indicate that noticed intake may or may not be further processed, and what appears to account for this is whether further processing of such intake does take place. I shall elaborate on this issue of further processing in Chapter 11.

Tomlin and Villa’s Model of Input Processing in SLA

While concurring with Schmidt’s Noticing Hypothesis on the important role of attention in learning, Tomlin and Villa’s model of input processing in SLA differs sharply from Schmidt’s regarding the role of awareness in the input-to-intake


process (Stage 1). Before presenting their model, Tomlin and Villa first critique previous SLA theoretical postulations for adhering to a coarse-grained perception of attention (gleaned from cognitive psychology), influenced by four different conceptions of attention: (a) The limited-capacity metaphor, which maintains that the human mind can only process a limited amount of information at one time and that the attentional resources of the mind are also limited (e.g., Kahneman, 1973); (b) the notion of selective attention, by which attention selects only critical information for further processing (Wickens, 1989); (c) the automatic versus controlled processing dichotomy, according to which automatic processes require little or no attention whereas controlled processes require attention and interfere with other simultaneous attention-demanding processes (e.g., Shiffrin & Schneider, 1977); and (d) the notion of control of information and action, according to which control of linguistic processing depends on the individual’s ability to intentionally focus attention on relevant parts of a problem (e.g., Allport, 1988). In Tomlin and Villa’s (1994) view, these conceptions are too coarse-grained and do not specify how exactly attention is allocated to the different aspects of a given task. So, given the limitations of psychology-based foundations, these researchers drew from works in neuroscience (e.g., Posner & Petersen, 1990), summarized in Posner (1995), which hold that attention is carried out by a network of anatomical areas, and that the areas involved in attention carry out different functions.

First, there exists an attentional system of the brain that is at least somewhat anatomically separate from various data-processing systems. By data-processing systems, we mean those that can be activated passively by input or output. Second, attention is carried out by networks of anatomical areas. It is neither the property of a single brain area nor is it a collective function of the brain working as a whole. Third, the brain areas involved in attention do not carry out the same function, but specific computations are assigned to different areas. (Posner, 1995: 617)

Based on the evidence of different attentional networks (the source of which will be divulged below in a critique), Tomlin and Villa (1994) then propose a functionally based, fine-grained analysis of attention for input processing in SLA “that integrates these related conceptions of attention into a system that allows investigation of SLA data at the moment of acquisition” (p. 190). In their model, attention has three components (all of which have neurological correlates): (1) Alertness (an overall readiness to deal with incoming stimuli whose function is related to the speed of information selection), (2) orientation (the direction of attentional resources to a certain type of stimuli), and (3) detection (the cognitive registration of stimuli or the process that selects, or engages, a particular and specific bit of information). According to Tomlin and Villa, it is detection alone


that is necessary for further processing of input and subsequent learning to take place. The other two components can enhance the chances that detection will occur, but neither is necessary. Following Allport (1988), Tomlin and Villa (1994) propose a set of criteria to determine the presence of awareness. In order to be considered aware, individuals must: (a) Show some behavioral or cognitive change due to the subjective experience, (b) report their awareness, and (c) describe the subjective experience. Methodologically, awareness is usually assessed by noting a cognitive change accompanied by a report that the learner is aware of the experience (lower level of awareness or meta-awareness) or a cognitive change accompanied by a description of the subjective experience (higher level of awareness). According to Tomlin and Villa (1994) and Allport (1988), if an experience cannot be somehow reported or described, then the presence of awareness is suspect.

Key Features

Here are the key features of Tomlin and Villa’s model:

1. Attention is carried out by a network of anatomical areas, and the areas involved in attention carry out different functions;
2. Some level of cognitive registration needs to be involved in the process of detection that allows linguistic data to be taken in;
3. Detected input becomes intake; and
4. Awareness does not play a crucial role in the preliminary processing of input into intake during exposure, that is, intake can take place without the presence of awareness.
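
For readers who find it helpful to see the assessment criteria stated as an explicit coding rule, the awareness criteria that Tomlin and Villa adopt from Allport (1988) can be sketched roughly as follows. This is only an illustrative sketch: the names `ProtocolEvidence` and `awareness_level` are hypothetical and do not come from the original studies, and the label returned when no cognitive change is observed is an assumption added so that every case receives a code.

```python
from dataclasses import dataclass


@dataclass
class ProtocolEvidence:
    """Hypothetical coding of one learner's data for one targeted form."""
    cognitive_change: bool      # (a) behavioral or cognitive change observed
    reports_awareness: bool     # (b) learner reports being aware of the experience
    describes_experience: bool  # (c) learner describes the subjective experience


def awareness_level(evidence: ProtocolEvidence) -> str:
    """Return a coarse awareness code following the criteria summarized above."""
    if not evidence.cognitive_change:
        # Assumption: without criterion (a), the case is not coded as aware.
        return "not coded as aware"
    if evidence.describes_experience:
        # Cognitive change plus a description of the subjective experience.
        return "higher level of awareness"
    if evidence.reports_awareness:
        # Cognitive change plus a report of being aware (meta-awareness).
        return "lower level of awareness"
    # Change observed, but nothing reported or described.
    return "awareness suspect"
```

A coder might apply such a rule to each think aloud segment, although actual coding schemes in the field are considerably more nuanced than these four labels suggest.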

Comments on Tomlin and Villa’s (1994) Model of Input Processing in SLA

If you recall, Tomlin and Villa had postulated in their model that the construct of attention has three components: (1) Alertness, (2) orientation, and (3) detection. According to Tomlin and Villa, it is detection alone that is necessary for further processing of input and subsequent learning to take place. The other two components can enhance the chances that detection will occur, but neither is necessary. I frankly wanted to test whether these researchers from cognitive science were on the ball with regard to their predictions concerning the usefulness of the attentional mechanisms in relation to intake and potential further processing of the L2 (Leow, 1998). I carefully designed a crossword puzzle task to isolate the three attentional functions of alertness, orientation, and detection. The idea for a crossword puzzle was developed during a discussion with one of my assistant directors, who was pursuing a degree in phonology (thanks, Eric). My main aim, then, was to investigate the effects of the different components of attention on intake and written production of a Spanish morphological item. I hypothesized


that detection of the targeted forms would result in superior performance both in a multiple-choice recognition task and in a controlled fill-in-the-blank production task as compared to simple alertness or orientation. As I (and Tomlin and Villa) had predicted, the two [+detection] groups performed significantly better on both assessment tasks than the [-detection] groups. This beneficial effect for detection was observed not only in the immediate posttest, but also in two delayed posttests administered five and eight weeks after the treatment, respectively. Therefore, the results of my investigation appeared to support Tomlin and Villa’s fine-grained analysis of attention. In other words, detection was found to be fundamental for processing morphological material into short-term memory, whereas alertness and orientation were not. However, the nature of the experimental task and the fact that the construct of awareness was not addressed in this investigation leave open the question of whether I was actually addressing detection or noticing à la Schmidt. In addition, Simard and Wong (2001) critiqued this study, noting (quite correctly) that based on the research design, the three attentional functions were not clearly separated. Consequently, whether my findings could provide empirical support for Tomlin and Villa’s model remains unanswered. Simard and Wong (2001) provided a critique of Tomlin and Villa’s model that is quite detailed and underscores what I have commented on in previous chapters regarding the caution needed when relying on non-SLA sources for theoretical or empirical support for language learning. While complimenting Tomlin and Villa for moving the SLA field along in terms of expanding on the nature of attentional processes, Simard and Wong pointed out that “their claim that alertness and orientation are not necessary for detection to occur is currently unsupportable and does not reflect the complex nature of SLA” (p. 105).
In addition, in relation to my study in 1998 testing Tomlin and Villa’s model, they also pointed out that “Leow’s (1998) efforts to provide empirical support for this model fall short of that goal” (p. 105). To support their critique, Simard and Wong made the following observations:

• The neuroscience sources Tomlin and Villa relied on to support the postulations of their model for input processing in SLA are inadequate to explicate cognitive processes employed in language learning.

“It is not clear how such systems might inform us about how attention might affect L2 learning. Because the acquisition of an L2 as a cognitive system has not been investigated, much still remains to be known about how the three attentional functions might influence the processing of L2 input. Therefore, any claims about the separability of the attentional functions in SLA tasks are probably premature” (p. 109). Indeed, if you recall from Chapter 3 the original research context under which Posner and his colleagues explored the three attentional functions (namely, the


identification of the locations of and relationships between different attentional functions in the brains of both humans and animals, and the effort to use this information to treat pathologies linked to attentional disorders), you can easily agree with Simard and Wong’s critique that such information is not relevant to SLA.

• The references Posner and his associates made to the posterior attentional network and its associated function of orientation were limited to orienting to visual locations (Posner, 1995; Posner & Petersen, 1990).

As pointed out in Chapter 3, the majority of studies cited to support the role of attention and/or consciousness actually originated from studies that are associated with the visual world and not from L2 learning. It is not unusual in the SLA field to find some researchers who appear to rely quite heavily on the findings reported in non-SLA fields to support some explanation or postulation made in this field, even though the findings from these studies are not pertinent to the process of learning or processing L2 data.

• Posner and his colleagues’ discussion of the three networks of attention (posterior, anterior, and vigilance) together with their different functions in separate parts of the brain does not postulate that these attentional functions operate separately or that some of them (e.g., alertness and orientation) are not required for detection to take place.

This point is also valid, and Tomlin and Villa’s extrapolation from their cited studies (many of which investigated sensory information and L1 words, with measures that included reaction time, the cuing paradigm, and dichotic tasks) may require some caution in regard to the tenets of their proposed model of input processing in SLA.

• The virtual impossibility of creating a research design that could provide data on each attentional function during L2 processing if one were to assume that all three are activated simultaneously.

This is a valid point and leaves open the question of whether Tomlin and Villa’s model can indeed be tested in SLA. Now let us take a succinct look at some theoretical underpinnings that addressed both Stage 1 and Stage 3.

Stage 1 (Input-to-Intake) and Stage 3 (Intake Processing)

McLaughlin’s Cognitive Theory

McLaughlin’s (1987) Cognitive theory is arguably the first theoretical attempt in SLA to posit a role for attention in the early stages of the L2 learning process.


Cognitive theory is generally based on the concepts of controlled and automatic processes in which the allocation of attention and the notions of cognitive effort and restructuring play key roles. Do you recall from Chapter 3 the capacity theories that include the metaphor of a limited capacity channel or processor, in which information competes for attentional resources available to the learner and the mental effort involved in processing information? Cognitive theory views L2 learning, then, as the acquisition of a complex cognitive skill, and early stage adult L2 learners as limited capacity processors of information, that is, they are limited to what they can attend to at a given point in time (selective or focal attention) and what they can process on the basis of previous knowledge and expectations (McLaughlin, Rossman, & McLeod, 1983). Adopting the dichotomy of controlled versus automatic processing (cf. Shiffrin & Schneider, 1977), McLaughlin also postulates that during the early stages of SLA learners’ information processing mechanism is regulated by controlled processes. Now, controlled processes are tightly capacity-limited, require a large amount of cognitive effort, and are under voluntary control of the learners. As expected, these controlled processes are used with any new or inconsistent information learners receive in the input. If you notice your students doing weird stuff with their faces—for example, squinting or rolling their eyes, or making funny faces while listening to you speaking to them in the L2—it is most likely they are consuming a tremendous amount of cognitive effort trying to understand what you are saying (or maybe they are just trying to annoy you or draw your attention). For those students who have a blank expression on their faces, you can rest assured (or maybe not) that they are cognitively inactive. Automatic processes, on the other hand, require little cognitive effort, occur rapidly, and presumably do not require awareness. 
In other words, after much meaningful practice (let me repeat, after much meaningful practice), it is quite possible that that very blank face denotes a student who easily takes in what s/he is exposed to. Consequently, the amount of attention paid to the L2 data by L2 learners is largely dependent upon the amount of cognitive effort required by their processing of the input. Thus, when conscious attention is viewed as processing capacity, it is found to be flexible, and the amount of attention can be deployed dependent upon the difficulty of the task and the amount of controlled or automatic processing involved. Interestingly, although the use of a large amount of cognitive effort does imply some level of awareness, McLaughlin (cf. McLaughlin, Rossman, & McLeod, 1983) appears to reject any direct link between controlled processes and awareness by stating that both controlled and automatic processes may be conscious or unconscious. According to these researchers, the distinction between controlled and automatic processing is not based on conscious experience since both controlled and automatic processes can in principle be either conscious or not. McLaughlin also posits a second “phase” of the cognitive process of learning a language, which he calls “restructuring.” This phase appears to belong to


Stage 3, in which restructuring is postulated to take place in the framework. The learner needs to impose organization and to structure the information that has been acquired. As more learning occurs, internalized, cognitive representations change and are restructured. This includes the famous “U-shape behavior” phenomenon. An example of the “U-shape behavior” is the case where a learner produces a correct utterance (evidently learned as a chunk of language without any systematic internalization of the grammar) [top left of the U], then appears to fall back and produce incorrect ones (mainly from L1 interference and generalization) [bottom of the U], and later on returns to the correct version [top right of the U]. A classic example is Spanish me llamo . . . “My name is . . .” (literally “I call myself”). This chunk of language is repeated by the student with the translation “My name is” in his/her mind [100% accuracy, top left of the U]. Quite soon, without much cognitive effort, me llamo becomes associated with me = My, llamo = name, and to show the teacher that s/he knows how to conjugate the verb ser, “to be,” the student adds es to form the sentence Me llamo es . . . “*My name is is . . . .” [0% accuracy, bottom of the U]. Hopefully, the student pays attention to the correct way of stating one’s name in Spanish, produced by his/her teacher and peers, processes the sentence, and finally gets it [once again 100% accuracy, top right of the U]. Agreement issues in Spanish, French, etc., anyone? There is a constant restructuring of internalized representations as learners’ ability to master the L2 increases. The process of “restructuring” is a characteristic of the later stages on the continuum of SLA.
This restructuring is linked to what is known in the literature as “interlanguage” (Selinker, 1972), which refers to a learner’s L2 developing system (current L2 knowledge, which may be accurate or inaccurate) and which is an intermediate system located somewhere between the learner’s L1 and L2. What is very interesting about interlanguage is that it is actually variable and possesses its own unique and coherent internalized rule system that typically fails to be completely congruent with the L2’s system. However, there does exist the possibility of these rules being restructured based on additional exposure to the L2 and feedback. This restructuring of the interlanguage data takes place in Stage 3 and involves an interaction between additional exposure or feedback taken in and the learner’s interlanguage. What we see here in the above-mentioned example is the learner’s interlanguage (Me llamo es . . .) being restructured to Me llamo as his/her ability to master the L2 increases.

Key Features

Here are the key features of McLaughlin’s theory:

1. The postulation that not all incoming input is attended to;
2. Input that is attended to is selected by the learner;
3. Quite a large amount of cognitive effort is involved when processing new or inconsistent information in the input;
4. Input processing may require different amounts of cognitive effort or depth or levels of processing dependent upon the difficulty of the task and the amount of controlled or automatic processing involved;
5. Subsequent processing may be dependent upon prior knowledge; and
6. Activating old information requires minimal attention, minimal cognitive effort, and potentially no awareness.

Comments on McLaughlin’s Cognitive Theory

McLaughlin’s (1987) Cognitive theory is a classic example of a straight extrapolation of the tenets of the limited capacity models in cognitive psychology with a fairly broad stroke of the L2 learning process. As mentioned above, it is the first theoretical attempt in SLA to posit a role for attention in the early stages of the L2 learning process. At the same time, there may be a mismatch between the concept of cognitive effort being required during controlled processing and the impact such an amount of cognitive processing could potentially have on learner awareness.

Robinson’s Model of the Relationship between Attention and Memory

Like Schmidt’s Noticing Hypothesis, Robinson’s (1995, 2003) model of the relationship between attention and memory also derives its theoretical underpinnings from cognitive psychology and, in addition to the construct of attention, includes the role of memory in the early stages of the L2 learning process. Drawing from Cowan’s (1988) unified model of memory and attention, and in an attempt to provide a systems-level characterization of attentional mechanisms, Robinson’s model of the relationship between attention and memory brings together Tomlin and Villa’s (1994) notion of detection (which does not involve awareness) and Schmidt’s (1990 and elsewhere) notion of noticing (which does involve awareness). According to Robinson (1996), detection is responsible for encoding in short-term memory, and therefore it constitutes the first step towards learning (p. 59). Noticing, on the other hand, implies awareness, and it is the result of rehearsal of material stored in short-term memory prior to encoding in long-term memory. Indeed, Robinson (1995, 1996) states that, in spite of the fact that there is some evidence in cognitive psychology for learning following detection without awareness, the effects of this type of learning are negligible, in that they only last a few milliseconds and are quickly forgotten. In Robinson’s model, the only type of input processing that is critical for language learning is that which is accompanied by awareness. Rather than distinguishing between two completely non-interfaced forms of learning (i.e., learning with awareness, or explicit learning, and learning without awareness, or implicit learning), Robinson, following Best (1992), proposes two different kinds of processing strategies, both of which require awareness. The first type is data-driven processing, and it involves rehearsal and maintenance in memory of isolated instances. The second type is conceptually-driven processing, and it involves a more elaborated form of rehearsal that distributes instances into abstract configurations. This second form of processing strategy is typically associated with the role and activation of prior knowledge (e.g., Carrell, 1992) and also resembles Schmidt’s notion of awareness at the level of understanding. Robinson, then, constructs a model in which detection is one early stage in the learning process, sequentially prior to noticing. More specifically, linguistic information may be detected and taken in by the learner, but if this information is not accompanied by awareness, then the chance of this information being further processed is relatively minimal. To Robinson, noticing does involve awareness, as in Schmidt’s Noticing Hypothesis, and it is crucial for learning to take place. Regarding the issue of single- or multiple-resource limited capacity models of attention (cf. the issue of attentional resources competing during processing simultaneously for form and meaning, VanPatten, 1990), which has been critiqued by Tomlin and Villa for being coarse-grained, Robinson (2003) puts forward as an alternative the unlimited capacity interference model from cognitive psychology. “[I]nterference models are lower-level (implementational) approaches to describing the causes of attention switching and task competition during control of information flow” (p. 645); that is, it is not attentional resources that are limited, since a breakdown in processing may come from other sources, for example, a lack of comprehension, inefficiency in processing the incoming information, and so on.
From the perspective of interference theory, explanations linking relative ease or difficulty of L2 comprehension, or different characteristics of L2 production, to task demands may be more legitimately framed in terms of confusion and cross-talk between codes (of L1, interlanguage, and L2 syntax, morphology, semantics, and phonology/orthography) within specific resource pools during task performance, rather than in terms of global capacity limitations. (Robinson, 2003: 646)

Key Features

Here are the key features of Robinson’s model:

1. The notion of two preliminary stages (detection first, then noticing) before intake can take place;
2. The role of working memory in the learning process;
3. Detected intake may disappear from working memory if not accompanied minimally by awareness at the level of noticing;
4. Noticed intake may also disappear;
5. Noticed intake has a higher potential to be processed based on some higher level of processing;
6. Noticed intake appears to be a step beyond the initial input-to-intake phase along the learning process;
7. The association of the role of awareness in relation to two processing strategies (data-driven and conceptually-driven) postulated to occur after detection is associated with awareness at the level of noticing; and
8. The role of prior knowledge is involved in conceptually-driven processes during the processing of intake.

Comments on Robinson’s Model of the Relationship between Attention and Memory

Robinson’s (1995, 2003) model of the relationship between attention and memory appears to have played second fiddle to both Schmidt’s Noticing Hypothesis and Tomlin and Villa’s model of input processing, and is usually viewed as bringing these two theoretical perspectives into a more comprehensive overview, regarding the role of awareness. At the same time, the role of type of processing (data-driven or conceptually-driven) usually provides the theoretical source for any potential activation of prior knowledge reported in the noticing strand of research (e.g., Leow, 1998). Robinson’s (2003) proposal to adopt the unlimited capacity interference model of attention is supported by the fact that this theoretical perspective is less critiqued when compared to the limited capacity models. In addition, it is relatively intuitive if one were to view the role of processing as being more robust than simple attention. Empirical support for the unlimited capacity models in SLA is warranted.

VanPatten’s Input Processing Theory/Model

VanPatten’s input processing theory (previously referred to as a model that is still evolving) is another theoretical underpinning that addresses the early stages of input to intake (Stage 1), where learners process the L2 data during comprehension. VanPatten draws from three theoretical sources: (1) The metaphor of limited processing capacity of human cognition, and especially Wickens’ (1989) notion of multiple attentional resources, popular in cognitive psychology; (2) the generative notion of language learning as distinct from other kinds of learning; and (3) first language parsing strategies. As a theory, he seeks to explain “why learners process input as they do and in particular why they make specific form-meaning connections” (VanPatten & Williams, 2007: 14). As a model, he combines these three underpinnings to explicate how L2 language learners make form-meaning connections. VanPatten (2004), an expansion of VanPatten’s original 1996 model, provides three important stages that involve learners’ internal processes along the learning process, from input to output. Stage I, accorded maximum focus in his model, begins with the conversion of input to intake (input processing) and continues at Stage II with the subsequent partial or total accommodation, restructuring, or incorporation of intake into the L2 learners’ developing system. Stage III occurs between the learner’s developing system and his/her eventual output, characterized by the process of access to the linguistic data currently incorporated in the L2 system. According to VanPatten (2007), the Theory of Input Processing (IP) is premised on the assumption that making form-meaning connections comprises an integral part of language acquisition. To this end, three fundamental questions are formulated:

1. Under what conditions do learners make initial form-meaning connections?
2. Why, at a given moment in time, do they make some and not other form-meaning connections?
3. What internal strategies do learners use in comprehending sentences and how might this affect acquisition? (p. 116)

Subsequently, the theory makes several assumptions or predictions regarding what guides L2 learners’ processing of linguistic information in the input while processing the input for comprehension. These predictions are the following:

1. Learners pay attention to meaning during exposure to the L2 input.
2. Due to learners’ limited cognitive capacity and working memory, comprehension is effortful and this, in turn, has an impact on what the input processing mechanism will pay attention to. This type of processing is unlike that of native speakers in terms of how much information can be processed or stored during input processing.
3. While learners may employ certain universals of input processing, they may also rely on L1 input processor or parser during this exposure.

To address these questions and predictions, VanPatten postulates that at Stage I, the process between input and intake, interaction and negotiation of meaning feed learners’ working memory capacity, which is associated with principles that guide both form-meaning connections and parsing during input processing. VanPatten (2004) defines processing as “making a connection between form and meaning. That is, a learner notes a form and at the same time determines its meaning (or function)” (p. 6). To explicate the first stage, VanPatten discusses input processing vis-à-vis several principles. According to VanPatten, these principles “guide learner attention to linguistic form in the input” (VanPatten, 2004: 5). His first and major IP principle is the Primacy of Content Word Principle, which claims that learners are driven to process or make form-meaning connections of content words (i.e., lexical items) in the input before non-content words (such as the, is and so on). In other words, if learners hear the sentence, “The man is running,” they are most likely going to process man and run before processing is. VanPatten points out that this type of processing, that is, making form-meaning connections, is not equivalent to merely noticing these forms, and that even if such non-content words were processed, (other) processors responsible for data storage and grammar construction may reject them, preventing further processing. VanPatten also postulates several other principles, which are briefly presented below: The Lexical Preference Principle postulates that if grammatical forms express a meaning that can also be encoded lexically (i.e., the grammatical form is redundant), then learners will not initially process those grammatical forms until they have lexical forms to which they can match them. For example, with the sentence, “Yesterday, I studied in the library,” learners will process the semantic notion of the adverb yesterday and use this to ignore the grammatical past tense marker -ed in “studied” that carries the same semantic notion. The Preference for Nonredundancy Principle postulates that learners are more likely to process nonredundant meaningful markers before they process redundant meaningful markers. For example, in the sentences “the man is running” and “the man runs,” learners will process the -ing of “running” sooner than the -s of “runs.” This occurs because there is no other lexical form that carries the semantic notion of the progressive action indicated in the gerund, whereas the third person marker is already embedded lexically in “man” (third person, singular).
The Meaning before Nonmeaning Principle deals with the processing of meaningful grammatical markers before that of nonmeaningful grammatical markers such as, for example, the relative pronoun that, which does not encode any semantic information. VanPatten also includes the notion of parsing, which deals with the computation of the syntactic structure of sentences the learner receives in the L2. With specific reference to word order, he originally postulated The First Noun Principle, in which learners (whether from a Subject-Verb-Object (SVO) or non-SVO language) will tend to process the first noun or pronoun as the subject of the sentence. For example, when English-speaking learners of Spanish hear “La besa Juan,” the tendency will be to assign la as the subject and Juan as the object, leading to an interpretation of the sentence as, “She kisses Juan.” However, VanPatten also adds another alternative principle to take into account the potential of the L1 parser playing a role in such interpretations and postulates the L1 Transfer Principle, in which learners begin acquisition with L1 parsing procedures. However, learners may override the First Noun Principle or The L1 Transfer principle via The Event Probability Principle, which postulates that learners may rely on event probabilities. An example is a sentence like, “The child scolded the mother,” which the L2 learner may parse and reinterpret based on the probability of real-life scenarios that logically assign the role of the person doing the scolding to the mother. A similar issue lies with specific verbs that require specific situations. While a sentence like “Johnny was kissed by Mary” may cause some misinterpretation due to the First Noun Principle or The L1 Transfer Principle, a syntactically similar sentence like “The house was kissed by Johnny” is unlikely to do so, due to what is known as lexical semantics, which are the requirements that specific verbs assign to nouns for some event to take place. For example, the verb “to kiss” requires that an animate subject perform this action. To account for this type of processing, VanPatten postulates The Lexical Semantics Principle. Postulating also that context may play a role in how or why learners process a specific way, VanPatten offers The Contextual Constraint Principle, in which learners may rely less on the First Noun Principle or the L1 Transfer Principle if the preceding context (e.g., “John is in the hospital,” followed by “porque lo atacó Mary”) constrains the possible interpretation of a clause or sentence. Finally, VanPatten postulates the Sentence Location Principle, in which learners tend to process items in sentence initial position before those in final position and those in medial position. The eventual product of input processing is intake, which VanPatten (2004: 7) defines as “the subset of input that has been processed in working memory (i.e., possible incorporation into the developing system)” and which has been made available for further processing. Like Robinson and Gass (cf. below), VanPatten does not view intake as merely a subset of the input but as something that may include incorrectly processed data.
Key Features

Here are the key features of VanPatten’s theory/model:

1. The focus of the model is on input processing;
2. During the early stage of L2 learning, learners process for meaning before processing for form during exposure to the L2 input;
3. Learners have a limited cognitive capacity and limited working memory;
4. Learners may employ both universals of input processing and L1 input processor or parser during exposure;
5. At Stage I (input-to-intake phase), VanPatten provides principles that are assumed to be associated with what specifically learners process in the input; and
6. Intake is not just a subset of input but needs to be processed input, that is, the learner has not only noticed the linguistic data but has also achieved a form-meaning/function connection, arguably indicating a relatively high level of cognitive effort or depth of processing during the input processing stage.

Comments on VanPatten’s Input Processing Theory/Model

Testing VanPatten’s Input Processing Model/Theory, and more specifically his Primacy of Content Word Principle, undoubtedly is one of the more popular strands of research in SLA over the last two decades. From the first study (VanPatten & Cadierno, 1993), researchers began by typically comparing a traditional experimental group that received teacher-centered instruction, together with production practice, with a Processing Instruction (PI) group that received both metalinguistic information on some incorrect learner strategy (e.g., word order SVO in English affecting the processing of Spanish OVS), structured input (SI) comprising well-designed MC interpretation activities that are assumed to alter learners’ previously incorrect strategy, and implicit feedback, usually in a yes/no format. A control group that received no instruction or exposure to the targeted form or structure was also included in the design (cf. Morgan-Short & Bowden, 2006, for one notable exception). Assessment tasks were typically a sentence-level interpretation and a sentence-level controlled written production task (e.g., VanPatten & Cadierno, 1993; VanPatten & Oikkenon, 1996). The combination of PI and SI led to subsequent studies that teased out these two variables (e.g., Benati, 2004; Farley, 2004; VanPatten & Oikkenon, 1996; Wong, 2004), which were then followed by studies addressing type of feedback (e.g., Sanz & Morgan-Short, 2004) and the contrast between input-based versus meaningful output-based instruction (e.g., Morgan-Short & Bowden, 2006).
Languages included Spanish, Italian, French, and Latin; language levels included the first two levels of L2 proficiency; and targeted items included Spanish word order with clitic pronouns (Fernández, 2006; Morgan-Short & Bowden, 2006; Sanz & Morgan-Short, 2004; VanPatten & Cadierno, 1993; VanPatten & Fernández, 2004; VanPatten & Oikkenon, 1996), third person future forms in Italian (Benati, 2004), forms de/d’ in French negative constructions (Wong, 2004), Spanish subjunctive (Farley, 2004; Fernández, 2006), the nominative and accusative case marking on articles in L2 German sentences (Henry, Culman, & VanPatten, 2009), and the assignment of the thematic agent/patient roles to nouns at the sentence level in Latin (Stafford, Bowden, & Sanz, 2012). In addition to the typical interpretation and written production assessment tasks, reaction times (e.g., Sanz, Lin, Lado, Bowden, & Stafford, 2009) and trials-to-criterion (the number of attempts it takes to start processing the input correctly) were also employed (e.g., Fernández, 2006; Henry et al., 2009). The overall results tend to show relatively similar performances between the traditional and PI groups, with both groups outperforming the control group that did not receive exposure to the target item, or, in a few studies, the PI group outperforming the traditional group on the interpretation test. There are several plausible explanations outside of PI itself that could account for or contribute to these findings. One explanation may be that structured input activities are a form of task-essential practice (Sanz & Morgan-Short, 2004, cf. Rosa & Leow, 2004; Rosa & O’Neill, 1999), which is premised on learners paying attention to target items in the input to successfully complete the activity, instead of more exposure. A second explanation may be the role feedback played in the study. Some of these studies (Benati, 2004; Farley, 2004; Sanz & Morgan-Short, 2004; VanPatten & Oikkenon, 1996; Wong, 2004) also provided yes/no feedback during treatment, which does conflate PI with feedback. A third explanation may lie with the type of linguistic item and its potential saliency. Fernández (2006) targeted both Spanish word order (more salient) and the subjunctive (less salient), and the PI group needed fewer trials than the SI group to start processing it correctly. Although Farley (2004) and Wong (2004) did not separate PI from SI, they both reported differential results from previous studies, as their PI outperformed SI in both recognition and production tests.

N. C. Ellis’s Associative-Cognitive CREED

N. C. Ellis (2007) draws from the field of cognitive psychology (especially from the connectionist perspective of learning) and several mechanisms assumed to play a role in learner processing of L2 data, and places these interrelated mechanisms into one theoretical framework called “associative-cognitive” CREED, in which each letter represents the mechanisms of Construction-based, Rational, Exemplar-driven, Emergent, and Dialectic. Let us take a brief look at each of these mechanisms and keep in mind one view of the brain “as a massive parallel network that, among other things, acquires language” (Holme, 2013: 626). A Construction-based perspective of language learning views the L2 learner as learning and recycling symbolic language units that link lexical, morphological, and syntactic forms and structures with corresponding semantic, pragmatic, and discourse functions. According to Ellis, the formats of constructions are very diverse. For example, they can range from simple lexical items and chunks of language such as excuse me, to slot-and-frame constructions, for example, tell [someone] [something], and even very abstract constructions, such as [Subj V Obj Obj2]. Learning, then, does not occur from the construction of abstract rules in the L2 but more from the frequency-based patterns learners extract from multiple repeated associations created during language use. Associations that re-occur frequently will strengthen the cognitive bonds between the elements, becoming part of a larger network of associations. Put very simply, the more learners are exposed to a satisfactory amount of the L2 input that allows them to extract a set of probabilistic patterns from this input, the more they will strengthen these patterns through their repeated activation, almost similar to the notion of “practice makes perfect.” According to N. C. Ellis, Rational accounts for the belief that language representations in the brain are “tuned” to make predictions on the pertinent linguistic constructions needed to process the ongoing discourse context. Factors that play a role in the development of these representations include the frequency of the representations, how recent their occurrences were, and the context of specific constructions. Exemplar-driven is premised on learners’ ability to abstract regularities in similar types of constructions encountered. For example, if learners are exposed to a series of English past tense verbs, they may most likely abstract the past tense morpheme of -ed, as in walked, and subsequently apply this morpheme to all past tense verbs until further data change their past tense representation of the irregular past tense verbs. Contrary to theoretical postulations that there exists a pre-existing language device, this framework postulates that it is through language usage and exposure to the L2 language that regularities in the L2 Emerge. Finally, Dialectic refers to interactions with other speakers or teachers, with the aim of addressing the aspects of associative learning that may cause problems for the L2 learners. Let us review briefly the key features of this framework, namely, frequency-biased probability calculations, overshadowing and attention blocking, sequence or statistical learning, construction learning, and the implicit learning of form-meaning connection. The notion of frequency-biased probability calculations is based on the attested finding that the more a stimulus is encountered, the faster and more accurately it is processed when compared to a less frequent stimulus. Another way of looking at this is the relationship between perceptual saliency and frequency. An item in the L2 input becomes more salient through repeated occurrences, which will then attract more cognitive attention. In addition, the number of stimuli also supports the notion that learners possess the innate ability to abstract patterns in the L2 data without much conscious effort. As N. C.
Ellis (2006: 12) writes, “Learning languages can thus be viewed as a statistical process in that it requires the learner to acquire a set of likelihood-weighted associations between constructions and their functional/semantic interpretations.” In addition to frequency, the concepts of overshadowing or attention blocking in relation to an already established L1 also form part of the associative-cognitive CREED. These concepts have provided an alternative explanation for the apparent limited attainment (when compared to near native-like competence) typically evidenced in adult L2 acquisition. This perspective explores adults’ difficulty in acquiring L2s in terms of “cognitive principles of transfer in the associative learning of form-meaning relations in linguistic constructions” (Ellis & Sagarra, 2010: 554), a perspective that includes the notions of transfer and learned attention also observed in several theoretical underpinnings, for example, the competition model (MacWhinney, 2001) and the input processing model (VanPatten, 2004). Given adult L2 learners’ limited cognitive resources to process all incoming L2 data, especially learning form-meaning connections that require much mental effort, learners have “to select which aspects of the input to process” (Ellis & Sagarra, 2010: 554). According to the researchers, one factor for such selection is cue saliency. For example, recall the verbal inflection discussed above, such as the past tense morpheme -ed in walked. Such inflections are less likely to be paid attention to when compared to, for example, temporal adverbs, such as “yesterday,” that are quite pronounced in the speech stream. Viewed from an associative learning theory, L2 learners’ prior knowledge of “yesterday” as a cue for past action will block attention to, or perhaps, to be more precise, processing of, the redundant verbal inflection -ed as a result of prior experience. Finally, the associative-cognitive CREED leans heavily on the concept of statistical learning (also referred to as chunking, or artificial language learning) that originated in L1 child acquisition studies (e.g., Saffran, Aslin, & Newport, 1996; Saffran, Aslin, Johnson, & Newport, 1999), which reported children’s ability to extract underlying structures or tones from exposure to a stream of input. The idea behind statistical learning is that the learners, after exposure to a tremendous amount of unknown data, begin to extract regularities in the data leading to implicit knowledge. Statistical learning is closely associated with connectionist views of language learning, based on the strengthening of links between cognitive nodes due to frequent and repeated activation of the same information. This is fairly similar in concept to that of automaticity that is achieved after doing some task or activity over and over again until not much effort is needed to perform the task or activity (think of learning to drive that manual shift car and being able to finally do so without shutting the car down). Relatively similar notions to the concept of statistical learning are construction learning, comprising the need to learn a diverse variety of form-meaning mappings as a result of language usage over time, and the implicit learning of such mappings or connections, which I will discuss more fully in Chapter 10.

Key Features

Here are the key features of N. C. Ellis’s (2007) associative-cognitive CREED:

1. Connectionist perspective of language learning;
2. L2 learning as a recycling of symbolic language units that are linguistically linked with corresponding semantic, pragmatic, and discourse functions;
3. Language learning takes place through usage and exposure;
4. Social aspect of learning;
5. Frequency-driven learning;
6. Overshadowing or blocked attention;
7. Sequence or statistical learning;
8. Construction learning; and
9. The implicit learning of form-meaning connection.
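
The idea of extracting regularities from a stream of input can be made concrete with a small sketch. One common way of operationalizing statistical learning (not spelled out in the discussion above) is the transitional probability between adjacent syllables, which tends to be high inside a word and lower across a word boundary. The syllables, the three nonsense words, the fixed word order, and the function name `transitional_probabilities` are all invented here for illustration; this is a toy under those assumptions and not a reconstruction of any study's materials.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Return P(next | current) for each adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor, so that the
    # probabilities leaving any given syllable sum to 1.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Hypothetical three-syllable nonsense "words" (assumption, for illustration only).
words = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["da", "ko", "ti"]}
order = "ABCACBABCBAC"  # fixed word order so the result is reproducible
stream = [syl for w in order for syl in words[w]]

tps = transitional_probabilities(stream)
within = tps[("tu", "pi")]  # transition inside a word
across = tps[("ro", "go")]  # transition across a word boundary
print(f"within-word TP: {within:.2f}, across-boundary TP: {across:.2f}")
```

In this toy stream, tu is always followed by pi (a transitional probability of 1.0), whereas ro is followed by go only half of the time (0.5). A learner tracking such differences would, in principle, have a cue to where the word boundaries fall without being told.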

Comments on N. C. Ellis’s Associative-Cognitive CREED

While there is no study that has tested Ellis’s associative-cognitive CREED in its entirety, several of its features (e.g., blocked attention, frequency-driven learning, statistical learning, implicit learning of form-meaning connection) have been empirically investigated with or without being grounded theoretically on his theoretical underpinning. Grounding their theoretical underpinning on the concepts of overshadowing or attention (processing?) blocking as postulated in Ellis’s associative-cognitive CREED, N. C. Ellis, Sagarra, and colleagues conducted a series of studies (e.g., Ellis & Sagarra, 2010, 2011; cf. also earlier studies by Lee, Cadierno, Glass, & VanPatten, 1991; Musumeci, 1989, for similar designs) that have provided overall empirical support for these concepts. To address the features of frequency-driven learning (e.g., Hamrick & Rebuschat, 2014) or in combination with saliency (e.g., Goldschneider & DeKeyser, 2001), statistical learning (cf. Rebuschat & Williams, 2012 for a recent edited book on this issue), and implicit learning (e.g., Leung & Williams, 2014), recent studies have relied on L1 acquisition (e.g., Lieven, 2010; Saffran, 2003) or psychology-based sources for support (e.g., Conway & Christiansen, 2006; Perruchet & Pacton, 2006). Like attention blocking, findings appear to support these features, although there remains some debate whether statistical learning may be pertinent when dealing with L2 learners (e.g., Robinson, 2005) or whether it is methodologically possible to address the role that the construct of unawareness plays in L2 learning (e.g., Leow & Hama, 2013).
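
The logic of attention blocking can be illustrated with a minimal simulation. The sketch below uses the Rescorla-Wagner learning rule, a standard formalization from associative learning theory that the text above does not itself name; the learning rate, the number of trials, the cue labels, and the function name `rescorla_wagner` are assumptions chosen for illustration only. Cues that are present on a trial share a single prediction error, so a cue that already predicts the outcome leaves little for a redundant cue to learn.

```python
def rescorla_wagner(trials, strengths=None, rate=0.2, max_strength=1.0):
    """Update cue strengths over trials; each trial is (cues_present, outcome_present)."""
    strengths = dict(strengths or {})
    for cues, outcome in trials:
        # Combined prediction from all cues present on this trial.
        total = sum(strengths.get(c, 0.0) for c in cues)
        error = (max_strength if outcome else 0.0) - total  # shared prediction error
        for c in cues:
            strengths[c] = strengths.get(c, 0.0) + rate * error
    return strengths

# Phase 1 (stand-in for prior experience): the adverb alone signals past time.
pretrained = rescorla_wagner([({"yesterday"}, True)] * 50)
# Phase 2: the adverb and the inflection co-occur with past time.
compound = [({"yesterday", "-ed"}, True)] * 50
blocked = rescorla_wagner(compound, strengths=pretrained)
control = rescorla_wagner(compound)  # no prior experience with the adverb

print(round(blocked["-ed"], 3), round(control["-ed"], 3))
```

With prior experience of the adverb, the inflection ends up with an associative strength close to zero, whereas without that experience the two cues split the available strength roughly evenly. This mirrors, in a very simplified way, the claim that a well-established cue such as “yesterday” can block processing of the redundant -ed.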

Stage 5 (Knowledge Processing)

Swain’s Output Hypothesis

Swain’s (2005) Output Hypothesis is the only theoretical postulation in SLA that is directed at Stage 5 (the knowledge processing stage) of the learning process, although this stage plays an important role in the Interactionist Approach (cf. Gass & Mackey, 2007). It was a direct rebuttal to Krashen’s (1982) Monitor Model and, more specifically, his Input Hypothesis and postulation that learning takes place via exposure to comprehensible input. Interestingly, Swain came to question the role of comprehensible input after observing students’ performances in content-based L2 French classrooms in immersion schools in Canada. While students demonstrated near native comprehension ability in French, most likely due to the emphasis on listening comprehension and reading activities, their speaking ability did not match this comprehension ability. While this result is not surprising (think of the good old Grammar Translation Method in which speaking was also not easy), Swain felt that allowing students to have more opportunities to produce the L2 would lead to an improvement of their productive ability. According to Swain, producing the L2 (output) allows the students to move away from semantic processing, or processing for meaning, and to begin to involve more grammatical (such as syntactic and morphological) processing. In her Output Hypothesis, Swain makes three major claims regarding the functions of learner production during the learning process at this later stage:

• A noticing/triggering function, or what might be referred to as a consciousness-raising role.

Here Swain is claiming that the opportunity to produce the L2 may push learners to raise their awareness of potential gaps between their current interlanguage (or knowledge) and the L2. Of course, learner awareness may also depend upon several other variables that may prevent such awareness-raising.

• A hypothesis-testing function.

This function is related to the opportunity to experiment with new forms and structures and see whether they are correct or need correction. This function appears to be more relevant to learners willing to consciously produce the language for feedback.

• A metalinguistic function, or what might be referred to as its “reflective” role.

Swain is assuming that opportunities to produce the L2 will lead learners to move away from the task of communication and to spend time explicitly analyzing the mismatches identified during output. She also refers to metatalk, that is, the use of language to talk about language, as seen in students who ask questions or make comments such as, “I guess I should make the adjective agree with the noun, right?” While the Output Hypothesis was originally postulated within a cognitive interactionist framework, with its references to consciousness-raising and noticing and tested within this cognitive interactionist framework (e.g., de la Fuente, 2002; Izumi, 2002), it is interesting to note that Swain’s early work with her colleague Lapkin (Swain & Lapkin, 1995) focused more on her metalinguistic function and, more specifically, the role of collaborative metalinguistic discussion between L2 learners in L2 development. Even though Swain has subsequently begun to frame the collaborative aspect of the learning process within a Vygotskian perspective of language learning (Swain, 2000; cf. Lantolf & Thorne, 2007), researchers still appear to view her Output Hypothesis as related to psycholinguistic underpinnings and directly associated with L2 development.

Key Features

Here are the key features of Swain’s hypothesis:

1. The notion of effort or depth of processing during output processing;
2. The role of learner interlanguage or current state of knowledge;
3. The activation of prior knowledge; and
4. The role of awareness.

Comments on Swain’s Output Hypothesis

Swain’s (2005) Output Hypothesis has been tested in several empirical studies in both the oral (e.g., de la Fuente, 2002) and written modes (e.g., Izumi, 2002; Swain & Lapkin, 1995). The overall results appear to provide some support for the Output Hypothesis, with studies underlining the role of deeper processing of the target items (mostly vocabulary) during output production when compared to the absence of such output. However, no empirical research has shown that output is necessary for learning to take place. At the same time, it may be argued that output does allow some type of feedback to be provided during interaction (cf. the Interactionist Approach) to the L2 learner, who then uses this feedback, if noticed, to correct their original utterance(s). However, making an argument that learning takes place at Stage 5 appears to be empirically unsupported and reinforces the importance of input in the L2 learning process.

From Stage 1 to Stage 5 (Input Processing to Knowledge Processing)

Gass’s (1988, 1997) Model of Second Language Acquisition

Gass’s model (1988, 1997; slightly expanded later in, e.g., Gass & Selinker, 2001, 2008), when compared to the other theoretical underpinnings discussed above, is arguably the most elaborated model that has attempted to address the stages between input and output. Gass posits in her Model of Second Language Acquisition, framed within an interactionist perspective, five stages in the learning process from input to output, namely, (1) apperceived input, (2) comprehended input, (3) intake, (4) integration, and (5) output. Apperceived input is based on learners’ recognition that there is a gap between what they know and do not know, or as she wrote in 1988, “to perceive in terms of past perceptions” (p. 201), and is related to selective attention. According to Gass, it is that piece of language that is noticed (à la Schmidt) by the learner due to some recognizable features that are linked to their prior knowledge. While Gass postulates that at her preliminary level of apperception specific linguistic data are noticed, Robinson allows for the possibility of such linguistic data to be detected, that is, without awareness playing a role at this stage. Apperception is “an internal cognitive act in which a linguistic form is related to some bit of existing knowledge (or gap in knowledge)” (Gass, 1997: 4). Once the particular piece of input has been apperceived, the potential for intake to take place depends upon what Gass calls comprehended input. Comprehended input may be analyzed at different levels of analysis—for example, global comprehension versus a more linguistic focus—and these analyses have an impact on what becomes intake, which is controlled by the learner. The stage of intake is viewed as “a process of assimilating linguistic material. 
It refers to the mental activity that mediates between input and grammars and is different from apperception or comprehension as the latter two do not necessarily lead to grammar formation” (Gass & Selinker, 2001: 302). According to Gass and Selinker, intake is not merely a subset of input, indicating apparently that no linguistic data can be taken in without relatively robust processing or a high level of analysis, relatively similar to VanPatten’s postulation. It is in this intake component that psycholinguistic processes take place in relation to internalized grammatical prior knowledge, (over)generalizations are made, memory traces are formed, and fossilization originates. Major processes include hypothesis formation and testing, hypothesis rejection, hypothesis modification, and hypothesis confirmation. Whether awareness plays a role in these processes is not elaborated. According to Gass and Selinker (2001), “Input refers to what is available to the learner, whereas intake refers to what is actually internalized” (p. 197). There are at least two outcomes that are derived from the intake stage, both of which are a form of integration (Stage 4). According to Gass, one is the development per se of a learner’s second language grammar, and the other is storage. The distinction lies between integration and non-integration of new linguistic data. Input may be dealt with via four possibilities: The first two take place in the intake component and involve hypothesis confirmation/rejection or hypothesis reconfirmation that strengthens existing underlying rules, the third results in storage of some linguistic data awaiting additional information before it becomes integrated (somewhat similar to Robinson’s data-driven data), and the fourth is nonuse, that is, learners make no use of the input at all, potentially due to a lack of comprehension at a useful level. Integration is continuous and the integration component does not function as an independent unit given that the model “is dynamic and interactive, with knowledge itself being accumulative and interactive” (Gass & Selinker, 2001: 304). Important variables involved in integration include different levels of analysis and reanalysis from storage into the grammar and within the grammar itself. 
The final stage is output, although Gass points out that it is not truly a stage in the acquisition process but more of an overt manifestation of the process. One important role of this stage is that it may serve as a means of confirming or disconfirming prior hypotheses of the L2 via feedback provided by someone else. Not elaborated but clearly depicted in Gass’s model is the role output plays in the learner’s intake component and, given the interactionist framework within which the model is subsumed, in subsequent negotiation with the L2 speaker and potential modifications based on the learner’s output.

Key Features

Here are the key features of Gass’s (and Gass and Selinker’s) model:

1. Minimally two stages occur before specific features of the input are converted into intake, namely apperception and comprehended input;
2. At the initial stage of apperception, there is a notion of a link between the incoming linguistic data and prior knowledge;
3. Noticing the gap at the stage of apperception involves some low level of awareness;
4. At the stage of comprehended input, there are subsequent levels of analyses (depth of processing?) that are responsible for converting the noticed input into intake;
5. Intake is not merely a subset of the input but requires some higher level of processing to be converted from input (different from apperception and comprehension);
6. Intake is viewed as a process and appears to be a step beyond the initial input-to-intake phase along the acquisitional (learning) process;
7. Prior knowledge also plays a role in the processing of intake;
8. What is taken in is controlled by the learner;
9. There is a distinction between integrated linguistic data and stored nonintegrated linguistic data that appears to equate the notions of conceptually-driven and data-driven processes (cf. Robinson, 1995, above), respectively; and
10. The notion of amount of exposure that provides opportunities for restructuring or strengthening of prior knowledge.

Comments on Gass’s Model of Second Language Acquisition

In spite of the relative completeness of Gass’s Model of Second Language Acquisition in relation to the other theoretical underpinnings and stages addressed, not many researchers have reported grounding their theoretical foundations on this model. What is even more surprising is that the concept of apperception, to which Schmidt in his Noticing Hypothesis sometimes relates his construct of noticing, was postulated over a decade before the better-known Noticing Hypothesis! One plausible explanation for the dearth of theoretical grounding may lie in the fact that Gass’s model was framed within the interactionist framework, and perhaps more focus was placed on Long’s Interaction Hypothesis and his works (e.g., Long, 1981, 1983, 1996) around that period. This model, even though it differs in many aspects from the one that I am proposing in Chapter 12, nevertheless provided an overall framework for mine.

One Modular Approach to the L2 Learning Process: Truscott and Sharwood Smith’s (2011) MOGUL

Truscott and Sharwood Smith’s (2011) MOGUL (Modular Growth and Use of Language), an update of their 2004 model, is clearly different from the theoretical underpinnings presented above due to its modular approach to the L2 learning process. While it does not easily fit within the framework that I have presented, I do feel it is instructive to include it in this book, given its attempt to address some of the terminological and theoretical confusion related to concepts and constructs such as input, intake, and awareness. MOGUL is an interdisciplinary, processing-oriented framework of L2 development that integrates Jackendoff’s (1997, 2002) modular view of language with
a processing component (cf. also Carroll’s (2007) Autonomous Induction Theory). This framework is derived from the researchers’ perception of key SLA concepts such as input, intake, and consciousness, and motivated by what Truscott and Sharwood Smith refer to as the fundamental problem of Schmidt’s (1990) Noticing Hypothesis, namely the lack of a solid account of what consciousness is (cf. their critique above). Truscott and Sharwood Smith’s framework attempts to integrate language acquisition accounts with proposals of how language (and, more specifically, phonology and (morpho)syntax) and cognition interact. According to the researchers, “The integration of linguistic theory and language processing accounts can allow the mapping out, in a more detailed and coherent manner, of both the content aspects of linguistic knowledge and the real-time processing of that content” (Truscott & Sharwood Smith, 2011: 507). As you recall the description of a modular approach to the learning process in Chapter 3, you can begin to visualize the brain being divided into different components or modules, each responsible for processing some aspect of the L2 data. In MOGUL, there is a modularized core language system that contains two separate subsystems: The phonological module and the syntactic module. It also has an interface system that determines which phonological structures and syntactic structures are linked up, as well as interfaces with adjacent systems outside. On one side lie the auditory-acoustic systems that feed into the phonology and the articulatory systems responsible for the production of speech. On the other side lies the conceptual system that not only is responsible for the interpretation and encoding of meaning but also plays a crucial role in conscious introspection, as it is the basis for thought. Conceptual structures more or less cover the traditional areas of semantics and pragmatics. 
According to Truscott and Sharwood Smith, each of the modules, while operating according to their own unique code, shares the same basic internal structure consisting of (a) an information store of structures (representations), some of which are universal primitives (think of Chomsky’s language acquisition device or LAD) and others that are language-specific structures (think of grammatical differences between languages), and (b) a computational system or processor, which determines how the structures are selected, combined, and integrated into larger structures in the same code—that is, informally, the module’s rule system. The interface processors serve to match up structures in adjacent modules, such as the phonological representation of a word with its syntactic function (e.g., /hot/ with (adj.)). Truscott and Sharwood Smith also include an additional system, the affective structure that, like the other systems, consists of an information store that contains representations of the possible emotions and a processor responsible for dealing with them. All the modality-specific processing systems are tightly connected via the interfaces, and the current state of one system can potentially affect that of the other systems (do you recall the neuronal battle in Chapter 3?). In turn, such connectedness can produce a tendency to synchronize the final products
(perceptual output structures) of each system. So if activation in one or more systems is stronger than in others, then these systems will form a coalition and synchronize their perceptual output structures, which will remain activated while the others will simply disappear. Put another way, if we divide our attention to or processing of several aspects of the L2 data simultaneously, the chances of retaining in our working memory some solid data (synchronized perceptual output structures) to further process are quite minimal. Interestingly, Truscott and Sharwood Smith associate these perceptual output structures to Atkinson and Shiffrin’s (1968) short-term store, Baddeley’s (1986) working memory, and Baars’s (1988) global workspace. Working memory (WM) in MOGUL, following Cowan (1988), is defined as “the set of currently activated items in long-term memory (LTM)” (Sharwood Smith, Truscott, & Hawkins, 2013: 571). For example, syntactic LTM refers to all items in the syntactic memory store, and syntactic WM is a subset of those items currently undergoing activation. According to the framework, “acquisition is the lingering effects of processing” (p. 572). The concept of awareness or consciousness within the MOGUL framework is explicitly defined and is premised on the network of strong interconnections between systems and the accompanying tendency toward synchronization of the perceptual output structures. According to the framework, the content of awareness is not dependent upon terms such as input, language, language forms, information, and so on, but upon representation of such terms in the brain. Once the content of awareness is viewed as representations, then these representations can only become conscious if their activation level crosses a threshold, that is, the perceptual output structures reach exceptionally high levels of activation. Incomplete activation results in the failure of the perceptual output structure to attain the level of consciousness. 
People can only be aware of linguistic information in the form of sound (auditory structures) or written form (visual structures). However, indirect awareness of extra-modular linguistic knowledge is possible by virtue of its connections with perceptual representations that cross the consciousness threshold. Within the MOGUL framework, language learning can be explained via several representations. First, input is defined as the representation(s) available to the language module. In the case of language acquisition, input is a perceptual representation (an auditory or visual structure) of spoken or written language. If the representation fails to reach a sufficient level of activation to cross the threshold of consciousness, then any subsequent use of this representation qualifies as subliminal perception. On the other hand, if it does cross the threshold, then this is a case of awareness of input. If this input or representation is subsequently abandoned, then only awareness of the incoming input has occurred. However, a portion of the representation may remain and form a new representation. If this new representation then receives a higher level of activation, it may cross the threshold of awareness and qualify for what Schmidt calls “noticing.”

On the other hand, intake consists of any information in an input representation that can be used by the processor for a certain purpose at that moment. However, this information must be extracted from the perceptual output structure representation because this representation is nonlinguistic. Therefore, whether or not an input representation is actually intake depends on the module’s ability to perform this extraction, which in turn depends partially on the current state of the information store. The final output is represented by perceptual output structures, each dealing with input from one of the senses. Awareness of the input is considered to be awareness of a perceptual representation, while noticing is awareness of a follow-up perceptual representation that contains one part of the original representation. Notably, Truscott and Sharwood Smith posit that within the MOGUL framework noticing is important in acquiring metalinguistic knowledge, but should not play a direct role in language development. Truscott and Sharwood Smith propose the following levels of processing-awareness, shown in Table 5.1.

TABLE 5.1 Hierarchy of processing-awareness

Level of processing-awareness | Description
Subliminal perception | When an AS representation of an utterance is constructed but does not reach a sufficiently high activation level to become conscious, any use of information it contains constitutes subliminal perception.
Awareness of input | The AS representation of an utterance reaches a sufficiently high activation level to become conscious.
Noticing-understanding | A follow-up POpS representation, consisting of one portion of the original representation, is constructed as the result of processing that treats it as an instance of a particular form, and it reaches an activation level sufficient for awareness.
Conscious understanding beyond the noticed representation | Additional CS-POpS processing produces one or more CS of the meaning or significance of the noticed representation, resulting in a perceptual representation with a high enough activation level to become conscious.

Source: Truscott and Sharwood Smith (2011: 520). AS = auditory structure; CS = conceptual structure; POpS = perceptual output structures

Key Features

Here are the key features of Truscott and Sharwood Smith’s (2011) MOGUL framework:

1. Provides an attempt to refine the key concepts of input, intake, and awareness in SLA;
2. Provides a modular perspective of the learning process;
3. Provides a more fine-tuned differentiation between the concepts of noticing and levels of awareness;
4. Provides a more fine-tuned postulation of the role of awareness during the input-to-intake stage of the learning process;
5. Provides different levels of processing in relation to awareness.

Comments on Truscott and Sharwood Smith’s (2011) MOGUL Framework

The MOGUL framework, which first appeared over a decade ago in 2004 and seeks to explicate the key concepts of input, intake, and consciousness (awareness) in SLA, is similar to the other theoretical underpinnings in that it is derived from a non-SLA source, but different in the sense that it provides a modular perspective of the L2 learning process. Crucially, Truscott and Sharwood Smith (2011) point out that their interdisciplinary, processing-oriented framework of L2 learning is “not a theory designed to explain specific phenomena. It is rather a broad framework within which specific theories can be formulated to allow research and theory from a range of disciplines to be brought together. Thus, the issues of testability that are crucial for theories are not directly applicable to MOGUL in itself” (p. 522). This statement may account for the paucity of empirical studies that have grounded their theoretical underpinning in their framework or attempted to address empirically its tenets. This is not surprising, given the potential challenge of not only teasing out the roles of the various modules from their interconnectedness with each other but also addressing representations beyond lexical forms. A similar issue, as you may recall, lies with Tomlin and Villa’s (1994) model that postulated the three attentional functions and attempted to separate these functions during the learning process.

Summary of Principal Cognitive Processes in the L2 Learning Process

The key features of the different theoretical underpinnings to account for the preliminary exposure to L2 input and learners’ eventual output may be summed up in the following chart.

Theoretical underpinning | Stages before intake | Attention | Awareness | Levels of awareness | Levels of processing | Prior knowledge | Amount of exposure | WM
Noticing hypothesis | 1 | yes | yes | yes | (yes) | no | no | no
Model of input proc. (T/V) | 1 | yes | no | n/a | no | no | no | no
Model of SLA & memory | 2 | yes | yes | yes | yes | yes | no | yes
Cognitive theory | 1 | yes | no | n/a | yes | (yes) | yes | no
CREED | n/a | yes | n/a | n/a | n/a | yes | yes | no
Model of input proc. (VP) | 1 | yes | (yes) | no | yes | (yes) | no | yes
Output hypothesis | n/a | yes | yes | no | (yes) | yes | no | n/a
Model of SLA (Gass) | 2 | yes | yes | (no) | yes | yes | no | (yes)
MOGUL | n/a | yes | yes | yes | yes | yes | no | yes

If we look carefully at all these theoretical postulations concerning the stages of the learning process, we can easily identify the principal cognitive processes and features shared by over half of the different theoretical frameworks to account for the preliminary exposure to L2 input and learners’ eventual output. In addition to input, intake, and knowledge processing already discussed in Chapter 2, and working memory in Chapter 3, we have attention, awareness, depth or levels of processing, and prior knowledge. Before I elaborate on the constructs of attention and awareness more fully in the following chapters, accompanied by concise reports of empirical studies that have addressed these processes, let us take a brief look now, as promised in Chapter 4, at the role of prior knowledge.

The Role of Prior Knowledge

It is well established in the L2 reading research conducted within the “schema theory” framework (e.g., Carrell, 1987, 1992; Rumelhart, 1980) that background knowledge and top-down processing are the major factors in both native and foreign language reading comprehension (Bransford, 1979). Most researchers posit in their theoretical underpinning, explicitly or implicitly, a role for prior knowledge in relation to noticing and processing L2 data. For example, recall that Gass’s (1997) apperception is “an internal cognitive act in which a linguistic form is related to some bit of existing knowledge (or gap in knowledge)” (p. 4). Robinson (1995) notes that “prior experience may predispose learners to attend, for example, to form or meaning in processing a stimulus . . . leading to which information gets noticed during task performance” (p. 296) and underscores the role of conceptually-driven processing after the occurrence of detected intake. Swain views output processing as the opportunity to push learners to raise their awareness of potential gaps between their current interlanguage (current knowledge) and the L2. Evidence of activation of prior knowledge was found in think-aloud protocols used to provide “valuable insights into instances of activation of the linguistic forms at the time of the second exposure” (Leow, 1998: 53). A review of the verbal reports showed that whereas several individuals in the multiple-exposure LC group often made reference to prior knowledge of the target forms, some other learners performed the task as if it were the first time. In addition, learners who showed higher levels of awareness in the first exposure (i.e., awareness at the level of rule formulation) were precisely those who demonstrated activation of prior knowledge. Participants who demonstrated awareness at the level of noticing during the first exposure failed to report activation of prior knowledge. Based on this evidence, Leow hypothesized that perhaps the former group had engaged conceptually-driven processing, whereas the latter activated data-driven processing. Gass’s (1988) initial postulation of apperception made decades ago and subsequently linked to Schmidt’s (1990) and Robinson’s (1995) notion of noticing later in 1997 makes sense if we are to accept the simple fact that in many cases our students already have, in addition to other types of knowledge, an L1 that shares grammatical structures with the L2. The role of prior knowledge in the processing of the L2 may be a no-brainer.

Conclusion

This chapter has provided a brief description of the major cognitive tenets of several theoretical underpinnings of L2 learning premised on psycholinguistic underpinnings and reported empirical findings to support such crucial cognitive processes postulated to play important roles in L2 processing and learning. In many of the theoretical underpinnings, the process of attention clearly plays a major role in L2 development in SLA, and is broadly viewed as the first step for some aspect or feature in the L2 input to be taken in, whether at the input processing stage (Stage 1) or the knowledge processing stage (Stage 5), or stored briefly in working memory with the potential to be further processed and/or linked to awareness. Indeed, some researchers (e.g., Schmidt, 2001) view the allocation of attention as “the pivotal point at which learner-internal factors . . . and learner-external come together” (p. 11). However, as we shall see later, it may be the case that it is after attention has been paid to particular aspects of the L2 input that any potential for learning or L2 development takes place. Now, we do not want to simply accept everything we are told, so we will also take a close look in Chapters 9, 10, and 11 at the empirical support for these processes based on studies conducted in the SLA literature. Even more interesting, and to explicate empirically these key processes, these three chapters will incorporate the insights gleaned from the increasing amount of concurrent (online) empirical data reported for more than a decade in the SLA literature on L2 learners’ cognitive processes while interacting with L2 data. However, we need to arm ourselves methodologically before we tackle this empirical support, and to this end Chapters 6, 7, and 8 will provide key information on what we need to look for in published empirical studies.

References Allport, A. (1988). What concept of consciousness? In A. J. Marcel & E. Bisiach (Eds.), Consciousness in contemporary science (pp. 159–182). London: Clarendon Press. Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory (Vol. 2). New York: Academic Press. Baars, B. J. (1988). A cognitive theory of consciousness. Cambridge: Cambridge University Press. Baddeley, A. (1986). Working memory. Oxford: Oxford University Press. Benati, A. (2004). The effects of structured input activities and explicit information on the acquisition of the Italian future tense. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 207–225). Mahwah, NJ: Lawrence Erlbaum. Best, J. B. (1992). Cognitive psychology. New York: West Publishing. Bransford, J. D. (1979). Human cognition: Learning, understanding and remembering. Belmont, CA: Wadsworth Publishing. Carrell, P. (1987). Content and formal schemata in ESL pedagogy. TESOL Quarterly, 21, 461–481. Carrell, P. L. (1992). Awareness of text structure: Effects on recall. Language Learning, 42, 1–20. Carroll, S. E. (2007). Autonomous induction theory. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 155–173). Mahwah, NJ: Lawrence Erlbaum. Conway, C., & Christiansen, M. (2006). Statistical learning within and between modalities. Psychological Science, 17, 905–912. Cowan, N. (1988). Evolving conceptions of memory storage, selective attention and their mutual constraints within the human information processing system. Psychological Bulletin, 104, 163–191. DeKeyser, R. (2007). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 97–113). Mahwah, NJ: Lawrence Erlbaum. de la Fuente, M. (2002). 
Negotiation and oral acquisition of L2 vocabulary: The roles of input and output in the receptive and productive acquisition of words. Studies in Second Language Acquisition, 24, 81–112. Ellis, N. C. (2001). Memory for language. In P. Robinson (Ed.), Cognition and second language instruction (pp. 33–68). Cambridge: Cambridge University Press. Ellis, N. C. (2006). Language acquisition as rational contingency learning. Applied Linguistics, 27, 1–24. Ellis, N. C. (2007). The associative-cognitive CREED. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 77–95). Mahwah, NJ: Lawrence Erlbaum. Ellis, N. C., & Sagarra, N. (2010). The bounds of adult language acquisition: Blocking and learned attention. Studies in Second Language Acquisition, 32, 553–580. Ellis, N. C., & Sagarra, N. (2011). Learned attention in adult language acquisition: A replication and generalization study and meta-analysis. Studies in Second Language Acquisition, 33, 589–624. Farley, A. P. (2004). Processing instruction and the Spanish subjunctive: Is explicit information needed? In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 227–239). Mahwah, NJ: Lawrence Erlbaum. Fernández, C. (2006). Re-examining the role of explicit information in processing instruction. Studies in Second Language Acquisition, 30, 277–305.

Gass, S. M. (1988). Integrating research areas: A framework for second language studies. Applied Linguistics, 9, 198–217. Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum. Gass, S. M., & Mackey, A. (2007). Input, interaction, and output in second language acquisition. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 175–199). Mahwah, NJ: Lawrence Erlbaum. Gass, S. M., & Selinker, L. (2001). Second language acquisition: An introductory course (2nd ed.). New York: Routledge. Gass, S. M., & Selinker, L. (2008). Second language acquisition: An introductory course (3rd ed.). New York: Routledge. Goldschneider, J., & DeKeyser, R. (2001). Explaining the ‘natural order of L2 morpheme acquisition’ in English: A meta-analysis of multiple determinants. Language Learning, 51, 1–50. Hamrick, P., & Rebuschat, P. (2014). Frequency effects, learning conditions, and the development of implicit and explicit lexical knowledge. In J. Connor-Linton & L. W. Amoroso (Eds.), Measured language: Quantitative studies of acquisition, assessment, and variation (pp. 125–139). Washington, D.C.: Georgetown University Press. Henry, N., Culman, H., & VanPatten, B. (2009). The role of explicit information in processing instruction: An on-line study with German accusative case inflections. Die Unterrichtspraxis, 42, 19–31. Holme, R. (2013). Emergentism, connectionism, and complexity. In J. Herschensohn & M. Young-Scholten (Eds.), The Cambridge handbook of second language acquisition (pp. 605–626). New York: Cambridge University Press. Izumi, S. (2002). Output, input enhancement, and the noticing hypothesis: An experimental study on ESL relativization. Studies in Second Language Acquisition, 24, 541–577. Jackendoff, R. (1997). The architecture of the language faculty. Cambridge, MA: MIT Press. Jackendoff, R. (2002). Foundations of language: Brain, meaning, grammar, evolution. New York: Oxford University Press. 
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall. Krashen, S. (1982). Principles and practice in second language acquisition. Oxford: Pergamon. Krashen, S. (1989). We acquire vocabulary and spelling by reading: More evidence for the input hypothesis. Modern Language Journal, 73, 440–464. Lantolf, J. P., & Thorne, S. L. (2007). In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 201–224). Mahwah, NJ: Lawrence Erlbaum. Lee, J. F., Cadierno, T., Glass, B., & VanPatten, B. (1991). The effects of lexical and grammatical cues on processing past temporal reference in second language input. Applied Language Learning, 8 (1), 2–14. Leow, R. (1997). Attention, awareness, and foreign language learning. Language Learning, 47, 467–505. Leow, R. (1998). Toward operationalizing the process of attention in SLA: Evidence for Tomlin and Villa’s (1994) fine-grained analysis of attention. Applied Psycholinguistics, 19, 133–159. Leow, R. P. (2013). Schmidt’s noticing hypothesis: More than two decades after. In J. M. Bergsleithner, S. N. Frota, & J. K. Yoshioka (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 23–35). Honolulu, HI: University of Hawai‘i, National Foreign Language Resource Center. Leow, R. P., Grey, S., Marijuan, S., & Moorman, C. (2014). Concurrent data elicitation procedures, processes, and the early stages of L2 learning: A critical overview. Second Language Research, 30 (2), 111–127.

Leow, R. P., & Hama, M. (2013). Implicit learning in SLA and the issue of internal validity: A response to Leung and Williams’ “The implicit learning of mappings between forms and contextually derived meanings.” Studies in Second Language Acquisition, 35(3), 545–557. Leung, J. H. C., & Williams, J. N. (2014). Crosslinguistic differences in implicit language learning. Studies in Second Language Acquisition, 29, 1–23. Lieven, E. (2010). Input and first language acquisition: Evaluating the role of frequency. Lingua, 120, 2546–2556. Long, M. (1981). Input, interaction, and second language acquisition. Annals of the New York Academy of Sciences, 379, 259–278. Long, M. (1983). Linguistic and conversational adjustments to non-native speakers. Studies in Second Language Acquisition, 5, 177–193. Long, M. (1996). The role of the linguistic environment in second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 413–468). San Diego: Academic Press. MacWhinney, B. (2001). The competition model: The input, the context, and the brain. In P. Robinson (Ed.), Cognition and second language instruction (pp. 69–90). New York: Cambridge University Press. McLaughlin, B. (1987). Theories of second language learning. London: Edward Arnold. McLaughlin, B. (1990). “Conscious” versus “unconscious” learning. TESOL Quarterly, 24, 617–634. McLaughlin, B., Rossman, T., & McLeod, B. (1983). Second-language learning: An information-processing perspective. Language Learning, 33, 135–158. Mitchell, R., Myles, F., & Marsden, E. (2013). Second language learning theories (3rd ed.). New York: Routledge. Morgan-Short, K., & Bowden, H. W. (2006). Processing instruction and meaningful output-based instruction: Effects on second language development. Studies in Second Language Acquisition, 28, 31–65. Musumeci, D. (1989). The ability of second language learners to assign tense at the sentence level: A cross-linguistic study. 
(Unpublished doctoral dissertation). University of Illinois at Urbana-Champaign, Champaign, IL. Neumann, O. (1996). Theories of attention. In O. Neumann & W. Prinz (Eds.), Handbook of perception and action Vol. 3: Attention (pp. 389–446). San Diego: Academic Press. Perruchet, P., & Pacton, S. (2006). Implicit learning and statistical learning: One phenomenon, two approaches. Trends in Cognitive Sciences, 10, 233–238. Pienemann, M. (2007). Processability theory. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 137–154). Mahwah, NJ: Lawrence Erlbaum. Posner, M. I. (1995). Interaction of arousal and selection in the posterior attention network. In A. Baddeley & L. Weiskrantz (Eds.), Attention: Selection, awareness, and control (pp. 390–405). London: Clarendon. Posner, M. I., & Petersen, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25–42. Rebuschat, P., & Williams, J. (2012). Statistical Learning and Language Acquisition. Berlin: Mouton de Gruyter. Robinson, P. (1995). Attention, memory and the ‘noticing’ hypothesis. Language Learning, 45, 283–331. Robinson, P. (1996). Learning simple and complex second language rules under implicit, incidental, rule search and instructed conditions. Studies in Second Language Acquisition, 18, 27–67. Robinson, P. (2003). Attention and memory in SLA. In C. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 631–678). Oxford: Blackwell.

Robinson, P. (2005). Cognitive abilities, chunk strength, and frequency effects in implicit artificial grammar and incidental learning: Replications of Reber, Walkenfeld, and Hernstadt (1991) and Knowlton and Squire (1996) and their relevance to SLA. Studies in Second Language Acquisition, 27, 235–268. Rosa, E. M., & Leow, R. P. (2004). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25, 269–292. Rosa, E., & O’Neill, M. D. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21, 511–556. Rumelhart, D. E. (1980). Schemata: The building blocks of cognition. In R. J. Spiro, B. C. Bruce, & W. F. Brewer (Eds.), Theoretical issues in reading comprehension (pp. 33–58). Hillsdale, NJ: Lawrence Erlbaum. Saffran, J. (2003). Statistical language learning: Mechanisms and constraints. Current Directions in Psychological Science, 12, 110–114. Saffran, J. R., Aslin, R. N., & Newport, E. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928. Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27–52. Sanz, C., Lin, H-J., Lado, B., Bowden, H. W., & Stafford, C. A. (2009). Concurrent verbalizations, pedagogical conditions, and reactivity. Two CALL studies. Language Learning, 59, 33–71. Sanz, C., & Morgan-Short, K. (2004). Positive evidence versus explicit rule presentation and explicit negative feedback: A computer-assisted study. Language Learning, 54, 35–78. Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158. Schmidt, R. W. (1993). Awareness and second language acquisition. Annual Review of Applied Linguistics, 13, 206–226. Schmidt, R. W. (1994). Implicit learning and the cognitive unconscious: Of artificial grammars and SLA. In N. Ellis (Ed.), Implicit and explicit learning of languages (pp. 165–209). 
London: Academic Press. Schmidt, R. (1995). Consciousness and foreign language learning: A tutorial on the role of attention and awareness in learning. In R. Schmidt (Ed.), Attention and awareness in foreign language learning and teaching (pp. 1–64). (Second Language Teaching and Curriculum Center Technical Report No. 9). Honolulu, HI: University of Hawai’i Press. Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–32). New York: Cambridge University Press. Schmidt, R., & Frota, S. (1986). Developing basic conversational ability in second language. In R. Day (Ed.), Talking to learn (pp. 237–326). Rowley, MA: Newbury House. Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 219–231. Sharwood Smith, M., Truscott, J., & Hawkins, R. (2013). Explaining change in transition grammars. In J. Herschensohn & M. Young-Scholten (Eds.), The Cambridge handbook of second language acquisition (pp. 560–580). New York: Cambridge University Press. Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190. Simard, D., & Wong, W. (2001). Alertness, orientation, and detection: The conceptualization of attentional functions in SLA. Studies in Second Language Acquisition, 23, 103–124.

Stafford, C., Bowden, H., & Sanz, C. (2012). Optimizing language instruction: Matters of explicitness, practice, and cue learning. Language Learning, 62 (3), 741–768. Swain, M. (2000). The output hypothesis and beyond: Mediating acquisition through collaborative dialogue. In J. Lantolf (Ed.), Sociocultural theory and second language learning (pp. 97–114). Oxford: Oxford University Press. Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 471–483). Mahwah, NJ: Lawrence Erlbaum. Swain, M., & Lapkin, S. (1995). Problems in output and the cognitive processes they generate: A step towards second language learning. Applied Linguistics, 16 (3), 370–391. Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16 (2), 183–203. Truscott, J. (1998). Noticing in second language acquisition: A critical review. Second Language Research, 14 (2), 103–135. Truscott, J., & Sharwood Smith, M. A. (2004). Acquisition by processing: A modular approach to language development. Bilingualism: Language and Cognition, 7, 1–20. Truscott, J., & Sharwood Smith, M. A. (2011). Input, intake, and consciousness: The quest for a theoretical foundation. Studies in Second Language Acquisition, 33, 497–528. Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/ procedural model. Cognition, 92 (1–2), 231–270. VanPatten, B. (1990). Attending to form and content in the input: An experiment in consciousness. Studies in Second Language Acquisition, 12, 287–301. VanPatten, B. (2004). Input processing in SLA. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 5–31). Mahwah, NJ: Lawrence Erlbaum. VanPatten, B. (2007). Input processing in adult second language acquisition. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 115–135). 
Mahwah, NJ: Lawrence Erlbaum. VanPatten, B., & Cadierno, T. (1993). Explicit instruction and input processing. Studies in Second Language Acquisition, 15, 225–241. VanPatten, B., & Fernández, C. (2004). The long term effects of processing instruction. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 273–289). Mahwah, NJ: Lawrence Erlbaum. VanPatten, B., & Oikkenon, S. (1996). Explanation versus structured input in processing instruction. Studies in Second Language Acquisition, 18, 495–510. VanPatten, B., & Williams, J. (Eds.). (2007). Theories in second language acquisition. Mahwah, NJ: Lawrence Erlbaum. Wickens, C. D. (1989). Attention and skilled performance. In D. Holding (Ed.), Human skills (pp. 71–105). New York: John Wiley. Wong, W. (2004). The nature of processing instruction. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 33–63). Mahwah, NJ: Lawrence Erlbaum.

SECTION 2

Research Methodology

6 METHODOLOGICAL ISSUES IN RESEARCH ON THE RELATIONSHIPS BETWEEN ATTENTION, AWARENESS, AND L2 LEARNING IN SLA

This chapter provides an in-depth report of the heart of any empirical study, namely the research design. If the research design of a study is not robust, then the findings are not going to be robust, the level of confidence we can place in the findings is going to be lessened, and any extrapolation of the findings to the classroom setting warrants much caution. I am not afraid to admit that in my early days of academia I recall skimming the research section of a study, especially the statistical component, and concentrating on the conclusion without much critique. The ability to critique research designs is of paramount importance not only for extracting robust research findings, from both the SLA and non-SLA literatures, to implement in our language curriculum and classrooms but also for improving our professional education as capable judges of empirical research. To this end, I will spend some time focusing on several methodological issues surrounding the investigation of the relationship between attention and awareness, or lack thereof, and learning in the SLA field. More specifically, we will discuss the heart of the research design, namely the level of validity (internal and external) of the study. Put simply, the higher the level of internal validity, the more confidence we can place in the findings. I will provide and discuss a checklist of the validity criteria I have proposed for classroom-based research in SLA premised on the roles of attention and awareness in L2 development, which will easily generate many questions as we read empirical studies conducted within this attentional framework.

Research Methodology and Attentional SLA Studies

Did I mention that research methodology is arguably the cornerstone of any empirical study? It is the part of a study in which researchers, based on well-motivated research questions or hypotheses, carefully

• select the appropriate participants and type of research design (e.g., experimental, that is, variables being investigated are tightly controlled; quasi-experimental, with less control of variables; case study with few participants, limiting generalizability; etc.);
• develop or create experimental materials or tasks to elicit specific data to address the research questions or hypotheses, tests (e.g., pre-, post-, delayed, multiple-choice, written production, etc.);
• plan the procedure to conduct the experiment (e.g., what exactly participants are going to do during the study);
• decide on the scoring and/or coding of data elicited, including assessing inter-rater reliability (that is, having more than one rater code the data to ensure that the overlap of similar codings between the raters is high) if the data are not categorical (one point or none) or can be potentially coded subjectively;
• submit the data elicited to appropriate quantitative (statistical) or qualitative analyses to address the research questions or hypotheses; and, finally,
• write up the results section in relation to the research questions or hypotheses (in the driest tone possible).
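
The inter-rater reliability check mentioned in the list above can be illustrated with a minimal Python sketch. It assumes two raters who have each assigned one categorical code to the same set of protocols; the function names and the sample codes are hypothetical and are not taken from any published study. Simple percent agreement is reported alongside Cohen's kappa, a commonly used index that corrects for agreement expected by chance.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items to which both raters assigned the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten think-aloud protocols (1 = aware, 0 = unaware).
rater_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
rater_b = [1, 1, 0, 0, 1, 0, 1, 1, 1, 1]

print("Percent agreement:", percent_agreement(rater_a, rater_b))
print("Cohen's kappa:", round(cohens_kappa(rater_a, rater_b), 3))
```

What counts as a sufficiently high value is a judgment the researcher must justify; the sketch only shows how the overlap between raters might be quantified.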

How do we evaluate whether a study employed a robust research design that produced findings in which we can place quite a high level of confidence and take them to our classrooms for implementation? We look at the internal and external validities of the study. Let us discuss these two validities in more detail.

Internal and External Validities

Two crucial features of a robust research design are the levels of the internal validity and external validity of the study itself. Internal validity deals with whether the interpretation of the research findings is firmly based within the study itself or, in other words, how well the data elicited answer the research questions (Hatch & Lazaraton, 1991). External validity deals with whether the findings can be generalized to the participant population and other similar settings. A study cannot have external validity if it does not have internal validity. The level (high or low) of internal and external validity depends not only on the number of inherent methodological limitations found in the study but also on the degree of impact each limitation may have on the interpretation of the results. For example, a study conducted within an attentional framework with a pretest–posttest–delayed posttest design might have omitted information on the participant pool and still achieved a relatively high level of internal validity. However, another study did not control for outside contamination of the data during the period between the posttest and delayed posttest and therefore could not rule out the alternative interpretation of the results due to confounding
of experimental exposure and external information. As can be seen, the latter limitation is far more serious than the former. Consequently, it is important to control, as far as possible, all variables that can potentially provide an alternative interpretation of and/or affect the generalizability of the results.

The following sample of an empirical study exemplifies an experiment with a high level of internal validity. The study is conducted to investigate the effects of type of exposure (e.g., textual enhancement) on learners’ second/foreign language development using a pretest (exposure)–posttest–delayed posttest design. Delayed posttests, if well controlled for the potential impact of external contamination on the subsequent results, provide important information on whether the targeted linguistic information has indeed been internalized or retained in the learners’ internal system, classic evidence of robust learning. The experimental exposure is carefully designed to draw learners’ attention to or noticing of a specific linguistic form in the input. The premise of the study is that the externally manipulated input will draw substantially more attention to the targeted form from the experimental participants when compared to the attention paid by the control participants. The outcome will be statistically superior performance by the experimental participants when compared to the control participants as measured on the posttests.

Participants are randomly assigned to either the experimental group or the control group. Participants’ lack of prior knowledge of the targeted linguistic form is statistically measured and established on a pretest administered before the exposure (if the same pretest is being used as the posttest, it is useful to administer the pretest about three weeks before the experimental exposure and also to reorder the items in an effort to control for any potential test effect, that is, participants remembering the targeted items from the pretest instead of the experimental exposure). While the ideal design will only include participants with zero knowledge of the targeted form, a low cutoff point of prior knowledge is used to include participants in the study (the higher the cutoff point, the higher the potential for prior knowledge also playing a role in the results). All variables with a potential to impact the results (e.g., time on task, location) are controlled. Concurrent or online data (i.e., think aloud protocols, discussed in detail below) are gathered while participants are exposed to the targeted input to establish that (1) participants did pay attention to the targeted forms and (2) prior knowledge did not play a crucial role in participants’ performances. Once the roles of attention/noticing and prior knowledge are addressed methodologically via the concurrent data collected, that is, participants failing to satisfy the experimental conditions of the study are removed from the participant pool, statistical analyses are then conducted on the posttest scores.

The results indicate that the experimental group did not perform significantly better after the exposure when compared to the control group’s performance. The quantitative analysis of the online data confirms that both groups paid a relatively similar amount of attention to the targeted items in the input, but that participants reporting higher levels of awareness or depth
of processing of the targeted items performed substantially better than those who did not on all assessment tasks. The researcher can then claim that the statistically similar performance between the two groups was due to the similar amount of attention paid during the experimental exposure, which reduced the effect of the manipulated input. The researcher may also suggest that, based on the successful participants’ protocols, perhaps it was not a simple case of paying attention to the targeted items but also of processing such items at a deeper level, as evidenced in several other published studies on levels of awareness. This study is said to be high in internal validity when no alternative interpretation can be provided for the results found. If another similar study, employing a different online data elicitation procedure (e.g., eye-tracking, which is a better procedure to measure the process of attention, as discussed below), reports that the experimental group did in fact pay a statistically greater amount of attention but still no significant difference in performance was observed between the experimental and control groups, then the first study receives empirical support for both its external validity and its plausible explanation of deeper processing of the targeted items in the input.

Note, however, that a relatively similar study can potentially have a lower level of internal validity. For example, suppose the researcher had used only a pretest–exposure–posttest design with no concurrent or online data collected that addressed the roles of attention and prior knowledge. Given that these two variables were not methodologically addressed or controlled, the study may be said to have a lower level of internal validity due to (1) the absence of any process measure or concurrent data elicitation procedure and (2) an alternative interpretation for the findings, namely, the potential role of prior knowledge that somehow could have triggered further processing of the targeted linguistic form by some participants in one group, with the subsequent performance a result of a confounding of attention plus prior knowledge. In other words, the research design leaves open more than one interpretation of the results: was it attention, prior knowledge, or a combination of both variables that contributed to them? I have critiqued in several publications (e.g., Leow, 1999, 2000; Leow & Hama, 2013) the internal validity of studies addressing internal processes such as attention and awareness that have employed a pretest–exposure (minus online data elicitation procedure)–posttest design by describing the failure to operationalize and measure these two internal processes in second language acquisition (SLA) studies.

Given that the research methodology employed in any empirical study is so vital to the reader’s confidence in the findings of said study, I am going to walk through the criteria for both internal and external validity (Leow, 1999) that I listed to be seriously considered while preparing a research design or reading about one in a published study premised on the role of learner internal processes. I will also include the recommendations I made to raise the internal validity level of the study. Let us begin with the more important one: internal validity.

Internal Validity

I identified 17 criteria for internal validity divided into three categories, namely General design, Possible confounds, and Measurement.

For Internal Validity (Leow, 1999)

Table 1: 17 selected criteria for internal validity

General design
1. There was a control group
2. There was randomization of participants to experimental and control groups
3. Participant attrition (mortality) was roughly equivalent in all groups
4. There was an explicit description of independent variable(s)
5. There was an explicit description of dependent variable(s)
6. Hawthorne effects were unlikely

Possible confounds
7. Both control and experimental groups were exposed to the same material
8. If so, an equal amount of time was allotted to both groups
9. Amount of time was reported on dependent variable tasks for both groups
10. Same experimenter provided treatment for all conditions
11. External exposure (e.g., outside the classroom, textbook, etc.) was controlled

Measurement
12. There were manipulation checks (to ensure that participants were performing as instructed)
13. There were process measures (e.g., think alouds or other measures of actual attention paid)
14. Alternate forms were used in repeated measures designs
15. Reliability of dependent measures was reported
16. Inter-rater reliabilities were reported
17. Regression to the mean was eliminated as a possible interpretation of results

General Design

As can be seen in General design, there are six criteria that address internal validity. To compare the effect of any independent variable (the variable we want to study, such as type of instruction or level of proficiency, etc.) on some dependent variable (such as learners’ oral or written production or comprehension, etc.), the presence of a control group, that is, one that is not exposed to any special treatment or exposure, is vital to the research design. Control groups can
also be viewed from two perspectives: one that controls for normal maturation, that is, over a period of time learners will learn, and one that is called the true control group, which receives a shared baseline exposure but not the additional exposure that the experimental group receives. Because both groups share the same baseline of information, the comparison is between the additional exposure and the lack thereof. For example, we want to investigate the effects of a special type of instruction on learners’ ability to produce in writing the targeted grammatical structure, which is unknown to the participants. Our design has two groups: a (maturational) control group that is not exposed to the targeted grammatical structure and an experimental group that does receive instruction on this structure. We administer the typical pretest (luckily the two groups are statistically similar in grammatical ability, with almost zero pretest scores!) and posttest and then statistically compare the two groups’ performances on the posttest. The experimental group statistically outperforms the control group, so we claim that instruction works! What do you think? Now let us return to this same design, but instead of using this maturational control group we opt for the true control one. In this design, we also expose the control group to the targeted grammatical structure in a reading passage but do not inform them of its presence. The experimental group receives the instruction. Statistical analyses on the posttest reveal that the experimental group performs significantly better than the control group. These findings are more robust than the other study’s, given that this is more of an apple-to-apple comparison than the pineapple-to-apple comparison of the first study.

Participants need to be randomly assigned to both experimental and control groups, that is, every participant must have an equal chance of being assigned to any condition. For instructional studies, the computer makes this far easier to achieve than using whole classes, which is often done because of administrative issues.

Why do we also need to report the equivalence in participant attrition (mortality) in groups? Let us imagine that after randomization, we have 2 groups of 25 participants. A large number of unmotivated participants in one group decide to leave their group, leaving highly motivated ones, while the other group remains intact with a mixture of both motivated and unmotivated participants. Any difference in performance may be related to the composition of the groups and not necessarily to the independent variable being investigated. In other words, participant attrition may provide an alternative interpretation of the results and therefore weakens the internal validity of the study.

Explicit descriptions of both independent and dependent variables are obviously needed so that we know exactly what the study is all about and, ideally, can replicate the study. There is currently an increasing focus on the need for and value of replication studies (cf. Mackey, 2012), easily exemplified in several refereed journals that do publish this type of research (e.g., Language Teaching, Studies in Second Language Acquisition). Hawthorne effects, which refer to participants in one group feeling so special for being in this group that they put greater effort
than normal into their participation, may also provide an alternative explanation for this group outperforming another group in the study. We can eliminate Hawthorne effects as a potential interpretation of the results by simply informing the control group of their participation in the experiment, especially if the regular teacher is not the one providing the treatment to the experimental group.
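
The randomization criterion discussed in this section (every participant has an equal chance of being assigned to any condition) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure taken from the studies cited: the function name, the participant IDs, and the condition labels are hypothetical, and the fixed seed is used only so that the assignment can be reproduced and reported.

```python
import random


def randomly_assign(participant_ids, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn.

    Shuffling first gives every participant the same chance of landing in
    any condition; dealing in turn keeps group sizes as equal as possible.
    """
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    assignment = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        assignment[condition].append(participant)
    return assignment


# Hypothetical pool of 50 participants, as in the attrition example above.
groups = randomly_assign(range(1, 51), ["experimental", "control"], seed=2015)
for condition, members in groups.items():
    print(condition, len(members), sorted(members)[:5], "...")
```

Recording the group sizes at assignment also gives the baseline against which attrition in each group can later be compared.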

Possible Confounds In this category of Possible confounds, I identified five criteria for internal validity. Relatively similar to the use of the type of control group discussed above, for true comparisons to be made both control and experimental groups must be exposed to the same experimental material. Possible interpretations of the differential performances between the groups could include lack of exposure to the materials or the experimental treatment. The amount of time of exposure should not be statistically different, and we also need to report the amount of time spent on the dependent variable tasks (e.g., a written production task) for both groups since the difference in performance may be due to the amount of time spent during exposure or on dependent tasks and not related to the experimental treatment. One way to avoid the time on dependent variable task violation may be to allow all participants to complete the task at their own pace and then analyze time on the task as an independent variable. If there is a significant difference in performance due to time, then we need to report this and water down our conclusions of the study. Other researchers may opt for careful piloting testing of the time spent on task before administering the tasks. If the same experimenter does not provide treatment for all conditions (like in whole classes), this could weaken the findings of the study since it will not be clear whether effects found between groups are due to the differences between the teachers themselves (e.g., their individual presentations) or differences between instructional treatments or exposures. Two ways we can avoid this limitation is to counterbalance the treatment so that all groups get an equal exposure to different teachers, or we can videotape or record all groups in order to ascertain that no substantial difference exists between the teachers’ expectations regarding their role in the study. 
Fortunately, the current use of technology in classroom-based research has been useful to address this potential issue by tightly controlling several experimental variables (such as type, amount, and timing of input). Finally, in this category it is very important to control for any access to additional information related to the targeted item in the study that could come from external exposure (e.g., outside the classroom, textbook, instructors, etc.) during the experimental period, especially between the posttests and delayed posttests, or when the experimental treatment takes place on several occasions or over more than one day. Whether participants’ performances on subsequent occasions or on the delayed posttest were due to the experimental exposure or confounded with external exposure provides an alternative interpretation of said performances. One attempt to

address this limitation is to administer a debriefing questionnaire at the end of the study requesting specific information concerning potential external exposure during the experimental period, that is, from pretest to delayed posttest or from the beginning to the end of the experiment. Note that this is not 100% perfect but minimally an effort to control for an important limitation of the research design.

Measurement Six internal validity criteria are identified for the measurement component. It is very important to ensure that participants are following instructions. Several studies have reported that, based on both online and offline data, several participants did not perform according to their experimental condition. In other words, similar performances by participants in different groups and variation in performance within one cell are further alternative interpretations identified by some researchers. To underscore the importance of participants faithfully following the instructions provided per experimental condition, imagine the following study investigating the effect of types of dance on participants’ subsequent happiness. We randomly assign participants to one of three experimental conditions: In condition 1, participants are asked to dance the salsa; in condition 2, they are asked to dance the merengue; and in condition 3 (control), they are asked to do nothing while listening to the music. To control for some participants’ reluctance to dance in public, all conditions are dark. The music begins and we quietly enter each condition. We note that few of the salsa participants are dancing, and of those who are, some are definitely not dancing the salsa. A different observation is made in the merengue condition, where most of the participants are having a ball, though this group includes some salsa dancers. And in the control group, apparently the music is too much and most of the participants are dancing. Without this information on participants’ performances, we would have assumed that all participants followed the instructions faithfully and submitted the scores on the happiness test to statistical analyses. The results indicate that there was a significant difference between condition 1 (salsa) and condition 2 (merengue), and even between condition 1 and control, given that almost everyone in the merengue and control groups enjoyed themselves. 
So, can we conclude that, yes, type of dance did indeed have an effect on subsequent happiness, although we will have to explain why the control group also expressed more happiness than condition 1? Or do we eliminate those participants who did not represent their experimental condition faithfully? Incidentally, a relatively high rate of exclusion of participants is typically reported in many studies (e.g., Gurzynski-Weiss, Al Khalil, Baralt, & Leow, 2015; Hama & Leow, 2010; Leow, 1997, 1998a, 1998b; Rosa & Leow, 2004a, 2004b; Rosa & O’Neill, 1999) that have employed concurrent data elicitation procedures (e.g., verbal reports) to ascertain what processes are employed by learners while exposed to the L2 data. Concurrent data tend
to reveal that several participants do not faithfully follow the instructions they receive or do not perform according to the condition designed by the experimenter. Re-assigning or eliminating these participants from the study for noncompliance raises the level of internal validity in the study. Interestingly, even studies employing offline measures also report noncompliance with instructions. Williams and Evans (1998) reported that participants in both experimental and control groups appeared to be aware of the targeted forms in the experiment: “[T]he subject even came up in one discussion in group C” (p. 150). Likewise, Leeman, Arteagoitia, Fridman, and Doughty (1995) reported that there were some non-quantifiable external evidence that at least some of them noticed the enhancement in the input they received . . . At least some participants also seemed aware of the feedback they received and one person in the Focus on Form group clearly used feedback as a way to evaluate his own performance in a debate . . . Nonetheless, it also seems that not all of the enhancement were noticed by all the participants . . . On the other hand, a number of participants did comment specifically on the target structure, which seems to indicate that instruction did increase their attention to these forms. (p. 248) Harley (1998) reported that her participants’ responses during an informal interview “revealed that some children were consciously aware of the relevance of noun endings for gender attribution” (p. 168). Similarly, Robinson’s (1996) statistical analyses “do show evidence of wide-scale awareness at each of these levels during training under all conditions” (Implicit, Incidental, Rule-search, Instructed; p. 76). I described in 1999 the failure to use process measures (e.g., think alouds, eye-tracking, or other measures of actual attention paid) as “probably the most important shortcoming in measurement in the field of attention in SLA” (p. 65). 
That is, I found that 87% of the attentional studies up to that year failed to establish methodologically participants’ online processing of the L2 input and the role attention played in such processing. Most researchers appeared to have relied totally on the performances of participants on the post-exposure tasks to infer as to what they paid attention to while exposed to the L2 data. Leow and Hama (2013) cautioned that for studies empirically addressing cognitive processes such as learner attention and learner awareness, high internal validity means that (a) the independent variable has been well operationalized, that is, the stage of operationalization is appropriate, (b) the measure is sensitive to providing empirical evidence or data to establish that the independent variable was indeed playing a role during the experimental phase and thus clearly contributed to the results, (c) most variables have been controlled in the study, and (d) appropriate statistical analyses have been employed. Without concurrent

data or empirical evidence to demonstrate that no intent or conscious effort was made during exposure to learn targeted items in the input (whether learners did, for example, pause at some targeted items and processed them with some level of cognitive effort or awareness), type of learning remains an unanswered question and ultimately lowers the level of internal validity of the study. The inclusion of more online data elicitation procedures in addition to think aloud protocols (discussed below) certainly augurs well for more robust research designs, although it still remains a methodological issue in current research premised on the role of cognitive processes in L2 development. Another way to avoid or reduce test effects is to use alternate forms instead of using the same pretest as the posttest(s), especially when the time period between the pretest and posttest is relatively short, for example, within a week. Designing parallel tests is one way to address this problem. For example, two sets of pretest and posttests are prepared. One group receives A as the pretest and B as the posttest while the other group receives the tests in reverse. It is very important to ensure that these tests are equivalent or potential gains may be lost in the data. This procedure can also be used for three or more groups and is usually known as the split block design. To increase the internal validity of the study, we also need to report the reliability of dependent measures, that is, whether the assessment tasks or tests we are employing are internally reliable based on participants’ performances on the tests. A typical statistical analysis used to provide this information is the Cronbach’s alpha. Cronbach’s alpha is a measure of internal consistency, that is, how closely related a set of items are as a group. If the value of alpha is “high,” we often use this level as evidence that the items on the test measure an underlying (or latent) construct, for example, comprehension. 
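As a rough illustration of the internal-consistency coefficient just described, the sketch below computes Cronbach's alpha from a small table of item scores using the standard formula: alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha`, the use of sample variances, and the scores themselves are assumptions made for illustration; they do not come from any study discussed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a participants-by-items score table.

    `scores` is a list of rows, one row per participant, each row holding
    that participant's score on every test item (hypothetical layout).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")

    # Sample variance of each item (column) across participants.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in scores])

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Invented scores: 3 participants (rows), 2 items (columns).
example_scores = [[2, 3], [4, 4], [6, 5]]
print(round(cronbach_alpha(example_scores), 3))
```

A value near 1 suggests that the items vary together across participants. What counts as a "high" value remains a judgment call that the sketch does not settle.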
Technically speaking, Cronbach’s alpha is not a statistical test—it is a coefficient of reliability (or consistency). A report of inter-rater reliabilities is also essential when we need to code data that require some level of interpretation on the part of the coder. For example, we have collected several think aloud protocols and need to code these protocols in relation to reported noticing of targeted items in the input. To avoid researcher bias, a minimum of two coders is required. A high inter-rater reliability indicates that confidence can be placed on the relative accuracy of the coded data. Finally, eliminating regression to the mean as a possible interpretation of results is only necessary when there is a significant difference in performances between the groups on the pretest. For example, Group A scores a mean of 8.5 on the pretest when compared to a mean of 2.6 for Group B, a difference found to be statistically significant. Both groups are then exposed to either a simplified (Group B) or un-simplified text (Group A). On the posttest Group A has a mean of 6.2 while Group B has a mean of 5.2. There is no difference between these posttest means. Can we conclude that simplification works? No, we cannot, since it is well accepted that such a statistical difference in means on the pretest will likely

result in the higher mean decreasing or regressing to the mean (approximately 5.0), while the lower mean will rise to this level. In summary, it is very important that empirical studies have a robust research design with a high level of internal validity. We do not place a lot of confidence in research designs with a low level of internal validity, that is, designs with many limitations and alternative interpretations of the data due to uncontrolled variables that could have potentially impacted the findings. At this stage, if you use the checklist in Table 1, you will also be able to ascertain whether the research design of any published empirical study has a high level of internal validity. Go ahead and select any published study, even in our prestigious refereed journals, and see whether you can be confident in what the researcher offers in his/her interpretation and conclusion of the data elicited. Have fun!
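The regression-to-the-mean argument above can be illustrated with a small simulation. In this sketch, every score is a stable "true" level plus random measurement noise, no treatment is applied at all, and participants are split into high and low groups by pretest score. All names and parameter values (`simulate_pre_post`, `group_means_by_pretest`, the mean of 5.0, the noise level, the quartile split) are hypothetical choices for illustration, not figures from any study.

```python
import random
from statistics import mean


def simulate_pre_post(n_participants=2000, seed=0):
    """Simulate pretest/posttest scores with NO treatment effect.

    Each observed score = stable true level + independent measurement noise.
    Returns (pretest, posttest) lists in the same participant order.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    true_levels = [rng.gauss(5.0, 1.0) for _ in range(n_participants)]
    pretest = [level + rng.gauss(0.0, 2.0) for level in true_levels]
    posttest = [level + rng.gauss(0.0, 2.0) for level in true_levels]
    return pretest, posttest


def group_means_by_pretest(pretest, posttest):
    """Take the bottom and top quarter of pretest scorers; return mean scores.

    Returns {"low": (pre_mean, post_mean), "high": (pre_mean, post_mean)}.
    """
    order = sorted(range(len(pretest)), key=lambda i: pretest[i])
    quarter = len(order) // 4
    groups = {"low": order[:quarter], "high": order[-quarter:]}
    return {
        name: (mean(pretest[i] for i in idx), mean(posttest[i] for i in idx))
        for name, idx in groups.items()
    }


pre, post = simulate_pre_post()
for name, (pre_mean, post_mean) in group_means_by_pretest(pre, post).items():
    print(f"{name}: pretest {pre_mean:.2f} -> posttest {post_mean:.2f}")
```

Under these settings the high group's mean falls and the low group's mean rises on the posttest even though nothing was done to either group, which is why a significant pretest difference makes a treatment interpretation unsafe.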

External Validity Having established several selected criteria for addressing the internal validity of a study, let us now discuss the other type of validity, namely external validity. A study is said to have external validity if the findings can be generalized to the participant population and also to other academic settings. For example, my participants were at the Intermediate 2 level. I found that a certain type of exposure worked wonders. Can I extrapolate my findings to all Intermediate 2 students at other institutions? If there are other variables that may play a role in the results (for example, amount of hours, different curricula, theoretical underpinnings driving specific instruction, etc.), it is safer to cautiously state that the findings are only pertinent to the participants used in this study. Once again, it is very important to note that a study cannot have external validity if it does not have internal validity. Hatch and Lazaraton (1991) write:

For internal validity, we worry about how well the data answer the research questions from a descriptive standpoint for this specific data set. When we want to generalize from the data set, we are concerned not only with internal validity but external as well—how representative the data are for the group(s) to which we hope to generalize. We need to overcome the threats to external validity so that we can generalize, can make inferential claims. (pp. 41–42)

Table 2: Four selected criteria for external validity
1. There was a theoretical or research basis for research questions/hypotheses
2. The sample was adequately described
3. Information on prior knowledge of targeted forms before the experiment
4. There were measures of delayed effects

I identified four criteria for external validity. First, studies must have a theoretical or research basis for research questions/hypotheses, or results can only be explained from an ad hoc perspective. A theoretical underpinning frames the study and subsequent interpretation of the data gathered. Second, we need to provide an adequate description of the sample population. While it is almost impossible to replicate a study with the same population from another institution, we can minimally provide useful information that includes level of language experience, teaching methodology, age, gender, skills promoted, amount of hours classes meet per week, textbooks, and so on. Third, we need to report whether the participants possessed some level of prior knowledge of targeted forms before the experiment (this is usually achieved via the pretest). Finally, it is always useful to include measures of delayed effects when we want to address the issue of retention of the effect of the independent variable(s). Given the obvious methodological benefits of employing concurrent data elicitation procedures to address empirically internal processes and the issue of internal validity in an appropriate way, I am positive you have in your mind the $64,000 question: Are concurrent data elicitation procedures more the norm than the exception to investigate internal cognitive processes in current SLA literature? Well, no, and yes. The negative side is that there is still a fair amount of studies that fail to follow this methodological advice to address internal processes, and inevitably report in the conclusion sections the need to have employed concurrent data elicitation procedures in order to more robustly address the internal process being investigated. 
The positive aspect is that we have seen a concerted effort over the last decade or so by several researchers to address the operationalization of the process of attention, noticing, and the construct of awareness, as evident in the use of concurrent or online procedures (taking place during experimental exposure) such as verbal reports and, more recently, eye-tracking, reaction time, mouse-tracking, event-related potentials, self-paced reading or listening paradigms, the recording of voice onset time in studies on L2 phonetics, and so on, as well as non-concurrent or offline procedures (taking place after experimental exposure), such as stimulated recall in the interaction research strand where, for obvious reasons, concurrent data are hard to gather during an oral interaction, and offline verbal reports in which participants are asked to reflect on whether they were aware of the targeted item in the experimental phase of the study. Whether all these procedures access the same data is an issue that will be discussed in more detail in the next chapter.

Conclusion Achieving a high level of internal validity in our studies should be the goal of empirical research. In other words, if we report that A had an effect on B, we are saying that we have controlled as many variables as possible in our research design and that if we were to replicate this study (hence the importance of
providing adequate information on the research design), the findings should be similar. High internal validity = high level of confidence we can place in the findings and, especially with respect to the formal classroom setting, having this high confidence is extremely important to us teachers. I provided a checklist for both internal and external validity that can be used by both budding researchers and lay persons interested in confirming that a study is indeed high in internal validity. Note that it is the internal validity that matters most, so even if a study has a high level of external validity, without a corresponding level for internal validity, the findings are moot and uninteresting. Let us now proceed to a discussion of a construct that actually lies at the heart of what we (both researchers and teachers) pursue in our endeavors, namely the construct of learning.

References
Gurzynski-Weiss, L., Al Khalil, M., Baralt, M., & Leow, R. P. (2015). Levels of awareness in relation to type of recast and type of linguistic item in computer-mediated communication: A concurrent investigation. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Hama, M., & Leow, R. P. (2010). Learning without awareness revisited: Extending Williams (2005). Studies in Second Language Acquisition, 32, 465–491.
Harley, B. (1998). The role of focus on form tasks in promoting child L2 acquisition. In C. Doughty & J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 156–174). Cambridge: Cambridge University Press.
Hatch, E., & Lazaraton, A. (1991). The research manual: Design and statistics for applied linguistics. Boston: Heinle & Heinle.
Leeman, J., Arteagoitia, I., Fridman, B., & Doughty, C. (1995). Integrating attention to form with meaning: Focus on form in content-based Spanish instruction. In R. W. Schmidt (Ed.), Attention and awareness in foreign language learning (pp. 217–258). Honolulu, HI: University of Hawai’i, Second Language Teaching and Curriculum Center.
Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language Learning, 47, 467–506.
Leow, R. P. (1998a). Toward operationalizing the process of attention in second language acquisition: Evidence for Tomlin and Villa’s (1994) fine-grained analysis of attention. Applied Psycholinguistics, 19, 133–159.
Leow, R. P. (1998b). The effects of amount and type of exposure on adult learners’ L2 development in SLA. Modern Language Journal, 82, 49–68.
Leow, R. P. (1999). The role of attention in second/foreign language classroom research: Methodological issues. In J. Gutiérrez-Rexach & F. Martínez-Gil (Eds.), Advances in Hispanic linguistics: Papers from the 2nd. Hispanic Linguistics Symposium (pp. 60–71). Somerville, MA: Cascadilla Press.
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware vs. unaware learners. Studies in Second Language Acquisition, 22, 557–584.
Leow, R. P., & Hama, M. (2013). Implicit learning in SLA and the issue of internal validity: A response to Leung and Williams’ ‘The implicit learning of mappings between forms and contextually derived meanings.’ Studies in Second Language Acquisition, 35(3), 545–557.
Mackey, A. (2012). Why (or why not), when and how to replicate research. In G. Porte (Ed.), Replication research in applied linguistics (pp. 34–69). Cambridge: Cambridge University Press.
Robinson, P. (1996). Learning simple and complex second language rules under implicit, incidental, rule search and instructed conditions. Studies in Second Language Acquisition, 18, 27–67.
Rosa, E. M., & Leow, R. P. (2004a). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25, 269–292.
Rosa, E. M., & Leow, R. P. (2004b). Computerized task-based exposure, explicitness and type of feedback on Spanish L2 development. Modern Language Journal, 88, 192–217.
Rosa, E., & O’Neill, M. D. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21, 511–556.
Williams, J., & Evans, J. (1998). What kind of focus and on which forms? In C. Doughty & J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 139–155). Cambridge: Cambridge University Press.

7 DECONSTRUCTING THE CONSTRUCT OF LEARNING

Before the birth of the SLA field several decades ago, non-SLA fields had been using the term “learning,” and still do, to describe stimuli such as colors, numbers, lights, dark spots on a computer, and so on. With the birth of SLA, researchers have placed a premium on the exploration of the construct of learning, as in second or foreign language (L2) learning in naturalistic (immersion or study abroad), classroom, and laboratory settings. If we were to take a quick survey of any representative number of published theoretical or empirical studies in both SLA and non-SLA fields, we would find an inevitable mention of the term “learning.” At the same time, a closer and more careful look reveals that what comprises “learning” within and between the SLA and non-SLA fields may not be the same construct. For example, the concept of intake is not well acknowledged in many non-SLA fields, and whatever is taken in may be viewed as learning. In addition, there may be quite a lot of terminological confusion given that the construct of learning appears to be operationalized or measured by quite a wide range of assessment tasks. In an effort to avoid terminological confusion, this chapter proposes looking at the construct of learning from a tri-dimensional perspective, namely (1) learning as a process versus learning as a product, (2) the kind of learning, namely item versus system learning, and (3) the type of processing involved, namely explicit (i.e., with awareness) versus implicit (i.e., without awareness). In addition, this construct will be situated within the stages postulated to occur along the learning process in SLA (Chapter 2). 
This global and theoretical view of the learning process has a three-fold purpose: It allows researchers to (1) identify which stage along the learning process learning is being investigated, (2) address the appropriate assumptions in relation to the tri-dimensional perspective of learning, and (3) interpret the results within this theoretical framework. A visual framework
for researchers to measure and interpret the process and product of learning will be provided, together with a discussion of both receptive and productive measures employed to address learning. Finally, I shall administer a test to address this construct (so take good notes). But first, a terminological issue to discuss.

Acquisition Versus Learning Before we begin to deconstruct the construct of learning, I would like to situate this construct in relation to the term “acquisition” and raise one basic question: Is the learning process the same as the acquisitional process? It would appear that some researchers do not see any difference between these two processes, and this conflation is observed in both SLA and non-SLA fields. In other words, there appears to be a somewhat lax approach in reporting these two processes in terms of what type of processing was specifically addressed. For example, two recent articles in cognitive science appeared to indicate that the two processes are similar: “higher order concept learning” “. . . the other set was focusing on acquiring the individual input-output pairs (exemplars) vs. . . .” “this set of participants was attempting to learn the function rule during training.” (McDaniel, Cahill, Robbins, & Wiener, 2013: 18–19, emphasis added) “Chinese participants learned more knowledge . . . Chinese participants acquired greater knowledge . . .” (Fu, Dienes, Shang, & Fu, 2013: 9, emphasis added) In SLA, similar conflation is easily found in, for example, the title of Godfroid, Boers, and Housen (2013), which included a reference to incidental L2 vocabulary acquisition while the authors discussed “word learning” in their interpretation of the data. Stafford, Bowden, and Sanz (2012) reported that their interpretation test assessed “what they had learned during treatment” (p. 752) while their production test assessed “to what degree knowledge of Latin morphosyntax acquired through interaction with the input-based treatment could be transferred to a guided production task” (p. 753). Indeed, this conflation may be due to the name of the field in which we conduct our research, namely, SLA. 
As I mentioned in Chapter 1, Krashen’s (1982) Monitor Theory was the first theoretical underpinning in those days to raise the issue of the role of the construct of awareness or “consciousness” in the L2 learning process and to distinguish between learning (with consciousness), resulting in learned/explicit knowledge, and acquiring (without consciousness), resulting in acquired/implicit knowledge. Now we need to consider the source of Krashen’s model, namely L1 acquisition and, more specifically, children’s L1 acquisition. Acquisition of a language takes place in the environment in which we learn our first language, surrounded by this language and using it to communicate, read, share information, and so on. We acquire our L1 by living it, to put it mildly, with minimal effort on our part to acquire it. L1 processing is largely unconscious and takes place over a long period of time until we hit kindergarten (or beyond), where we begin to be exposed to explicit instruction, including how to read, write, pronounce words properly, and so on. In this formal environment, we begin to learn explicitly how to use our first language, and this continues for quite some time (it is all relative) after. The one environment that can offer an L2 learner a similar acquisitional condition is that of the immersion setting, with the understanding that the L2 learner will engage in similar activities as the L1 speaker did over a relatively long period of time. Accordingly, the learning process is distinct from the acquisition process for obvious reasons, and this distinction is followed in this book. Now let us begin our deconstruction.

The Construct of Learning First, let me ask a very basic question (both researchers and teachers share the same love for asking questions): How do you define learning? Take a pause and think a bit deeply (put in some cognitive effort) about this question. When someone says that learning is taking place or took place, are these two statements referring to the same idea or are they conceptually different? Dictionary definitions of what comprises learning view this construct from minimally two main perspectives, namely as a process (verb) and as a product (noun). As a process, it is defined in Wikipedia as “acquiring new, or modifying existing knowledge, behaviors, skills, values, or preferences, and may involve synthesizing different types of information,” and in the Merriam-Webster dictionary as “to gain knowledge or understanding of or skill in by study, instruction, or experience.” As a noun, it is the knowledge or skill acquired by instruction, study, or experience (Merriam-Webster Online). Based on these definitions, the construct of learning appears to involve both old and new information with the potential for the old information (prior knowledge) to be modified based on the new information. However, note that these definitions do not delve explicitly into whether learning (as a process) (1) can be focused on discrete items in the input (item-learning), resulting in knowledge (as a product) comprising learners’ subsequent ability to recognize or even produce these individual items, or (2) involves some internalization of rules underlying such discrete items (system-learning) resulting in knowledge (as a product) comprising learners’ subsequent ability to verbalize a grammatical rule or, minimally, to demonstrate an ability to generalize this underlying rule to new exemplars. OK, you may go back and re-read that last sentence (keep in mind the stages of the learning process), but I am going to elaborate on it later. In addition, these definitions do not offer much insight into whether the process and product of learning involve a role for awareness (explicit learning and explicit knowledge, respectively) or lack thereof (implicit learning and implicit knowledge, respectively). Let us first take a look at some operationalizations of what comprises learning.

Operationalizing the Construct of Learning A close review of many empirical studies in both SLA and non-SLA literatures will easily demonstrate that the construct of learning is rarely operationalized. Here are two recent noteworthy attempts to provide an operationalization in both non-SLA and SLA fields, just to give you a feeling of different perceptions. This recent operationalization (McDaniel et al., 2013) from cognitive psychology appears to posit levels of learning:

1. Learning: Learners were distinguished from non-learners as having a mean absolute error of less than 10 during training.
2. Exemplar learning: Learning of the individual cue-criterion pairing for the 20 training points was operationalized as showing relatively flat extrapolation after having met a strict learning criterion (the one above).
3. Rule learning: Learning of the relations/rules among training points was operationalized as showing general extrapolation along the slopes of the bilinear function.
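
To make the first criterion concrete, the sketch below shows how a mean-absolute-error threshold of this kind could separate learners from non-learners. It is only an illustration: the function names, the response data, and the choice to average over all training trials supplied are assumptions, and the actual procedure in McDaniel et al. (2013) may differ in its details.

```python
def mean_absolute_error(responses, targets):
    """Average absolute gap between a participant's responses and the targets."""
    if len(responses) != len(targets) or not responses:
        raise ValueError("responses and targets must be non-empty and equal length")
    return sum(abs(r - t) for r, t in zip(responses, targets)) / len(responses)


def is_learner(responses, targets, threshold=10):
    """Classify as a learner when mean absolute error is below the threshold.

    The default of 10 mirrors the criterion quoted above; everything else
    here is a hypothetical simplification.
    """
    return mean_absolute_error(responses, targets) < threshold


# Invented training data: criterion values and two participants' responses.
targets = [20, 40, 60, 80]
participant_a = [22, 35, 63, 78]  # absolute errors 2, 5, 3, 2
participant_b = [50, 10, 90, 40]  # absolute errors 30, 30, 30, 40

print(is_learner(participant_a, targets))
print(is_learner(participant_b, targets))
```

Keeping the threshold as a parameter makes the operational definition explicit, so a reader can see exactly which numeric cut-off stands behind the label "learner."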

This recent operationalization (Stafford et al., 2012) from SLA appears to posit different stages (initial vs. late?) of learning in relation to specific offline assessment tasks: Initial language learning was operationalized as follows: overall accuracy on tests of written interpretation, aural interpretation, grammaticality judgment, and written production. (p. 754) The failure to operationalize or even to define the construct of learning in most studies has led one to seek this operationalization in either what (e.g., colors, visual stimuli) is being measured in non-SLA fields or the assessment measures employed in SLA to address this construct. The many measures of “learning” appear to indicate several interpretations of what comprises the construct of learning, leading to an inevitable terminological confusion. For example, learning has been measured by a series of receptive tests such as recall or remembering (e.g., Shekary & Tahririan, 2006), trials-to-criterion (the number of attempts it takes to start processing the input correctly; e.g., Fernández, 2008), selection/recognition (Godfroid et al., 2013; Leow, 2000), selection/interpretation (Williams, 2005), a four-alternative, forced-choice picture matching task

(Hamrick & Rebuschat, 2012), productive tests such as written interpretation (Stafford et al., 2012), acceptability judgment (Grey, Williams, & Rebuschat, 2014), fill-in-the-blank (Medina, 2015), modified cloze test (e.g., Rossomondo, 2007), and so on. In addition, learning has also been measured concurrently via reaction time (Leung & Williams, 2014) and non-concurrently (most empirical studies), etc. Embedded in these measures are several perspectives of what comprises learning, as evident in whether learning is viewed as a process (e.g., Rosa & Leow, 2004a) or a product (e.g., Williams, 2005), the kind of learning assumed, that is, item (e.g., Leow, 2000) versus system learning (e.g., Rosa & Leow, 2004b), or the type of learning assumed, that is, implicit or explicit learning (Chan & Leung, 2014).

A Tri-Dimensional Perspective of the Construct of Learning In an effort to avoid terminological confusion, I propose that we look at the construct of learning from a tri-dimensional perspective, namely, (1) learning as a process versus learning as a product, (2) the kind of learning, namely item versus system learning, and (3) the type of processing involved, namely explicit (i.e., with awareness) versus implicit (i.e., without awareness). In addition, we situate this construct within the stages postulated to occur along the learning process in SLA (Chapter 2). This global and theoretical view of the learning process has a three-fold purpose: It allows researchers to (1) identify which stage along the learning process learning is being investigated, (2) address the appropriate assumptions in relation to the tri-dimensional perspective of learning, and (3) interpret the results within this theoretical framework.

Learning as a Process Versus Learning as a Product As the postulated stages of the L2 learning process in Chapter 2 show, learning as a construct may be viewed from two perspectives: (1) learning as a process, which occurs internally and takes place at Stages 1 (input processing), 3 (intake processing), and 5 (L2 knowledge/output processing), and (2) learning as a product (what is learned), which is represented internally (at Stage 4) in the learner system as knowledge and externally as representative L2 knowledge. Stage 2 represents intake as an initial product kept in working memory; it may be retrieved immediately via concurrent receptive tests that only require learners to recognize, select, or identify discrete target items presented in the input, but it has yet to be further processed, internalized, or learned. In other words, when we report about the learning process, we are referring to the process of converting input into intake, the process of incorporating intake into the internal system, which is typically assumed to contain some type of knowledge (systemized or discrete/un-systemized items, explicit or implicit, declarative or procedural), and the process of producing output that allows the learner to potentially receive additional L2 input that allows either a confirmation of his/her L2 knowledge or a restructuring of his/her interlanguage. To measure any stage of the learning process, concurrent data elicitation procedures that can provide insights into learners’ thought processes may be more appropriate than non-concurrent or offline measures. A process or, more specifically, the processing of the L2 goes beyond paying mere attention to the linguistic features in the input (Gass, 1997; VanPatten, 2004).

When we report about learning as a product, we may be referring to what has been initially attended to or processed internally (e.g., intake) or the result or outcome of what has been further processed along the learning process (e.g., L2 knowledge) and demonstrated externally as output. At Stage 2, the product is usually referred to as linguistic data stored in working memory at the stage of intake; these data are available for concurrent recognition and potential further processing but do not represent internalized or learned knowledge, which occurs further along the learning process. Typical assessment tests employed to measure intake include multiple-choice (MC) recognition or selection/identification tests. These tasks, usually administered offline, may differ in the content of the items (whether or not the content is true to the input provided) or in the number of options, which usually ranges from 4 (e.g., Leow, 2000) to 18 (e.g., Godfroid et al., 2013). Tests with more than two options are usually considered more robust because they reduce the potential for guessing.
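The point about guessing can be made concrete with a little arithmetic. The sketch below is only an illustration and assumes that each MC item has exactly one correct option and that a guessing learner picks among the options with equal probability; the function names are hypothetical and do not come from any of the studies cited.

```python
from fractions import Fraction


def chance_accuracy(n_options: int) -> Fraction:
    """Probability of answering one MC item correctly by blind guessing.

    Assumes one correct option and equally likely choices (an idealization).
    """
    if n_options < 2:
        raise ValueError("a multiple-choice item needs at least two options")
    return Fraction(1, n_options)


def expected_correct_by_guessing(n_items: int, n_options: int) -> Fraction:
    """Expected number of items answered correctly by guessing alone."""
    return n_items * chance_accuracy(n_options)


# Compare a two-option, a four-option, and an 18-option format.
for k in (2, 4, 18):
    print(k, "options -> chance accuracy", float(chance_accuracy(k)))
```

Under these assumptions, moving from two to four options halves the score a learner can be expected to obtain by guessing, and an 18-option format reduces it much further.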

Kind of Learning (Item Versus System Learning) Intake may also be available for subsequent processing for potential incorporation into the developing internal system as systemized knowledge or un-systemized discrete items. If a learned product is hypothesized to reside at the internalization stage (Stage 4) of the learning process, that is, after the intake processing stage (Stage 3), then intake as a product at Stage 2 does not represent learning when viewed as L2 knowledge. This learned product is referred to at Stage 4 as stored knowledge that forms part of the developing grammatical system of the learner, potentially available for restructuring and subsequent language use. This stored knowledge may be accurate or inaccurate. If inaccurate, it is available to undergo further restructuring. Learner product or L2 knowledge in SLA is usually measured by offline assessment tasks after exposure to the L2 data. Tests designed to measure explicit knowledge include either oral (e.g., describe a series of drawings) or written (e.g., fill-in-the-blank) production tests, grammaticality judgment tests, offline verbal reports, and so on that require some visual or oral manifestation or grammatical description of the learned L2 knowledge. Tests employed to measure implicit knowledge include oral production, elicited imitation tests, and grammaticality judgment tests that are all timed to promote time pressure to encourage the use of feel rather than

rule and to reduce the opportunity to access metalinguistic knowledge (cf., e.g., Ellis, 2005). To measure un-systemized (item) and systemized knowledge, tests with old and new exemplars are employed, respectively. This learner product or L2 knowledge is typically what we test in our classrooms on quizzes and exams, especially if we ask our students to produce the L2 orally or in writing. To address robust learning, it is always advisable to include delayed posttests to measure retention.
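One way to keep item and system learning apart at the scoring stage is to report accuracy separately for old and new exemplars. The following is a minimal sketch under that assumption; the record layout (`ItemResponse`, `seen_in_exposure`) and the sample data are hypothetical and serve only to illustrate the bookkeeping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemResponse:
    item_id: str
    seen_in_exposure: bool  # True = "old" exemplar, False = "new" exemplar
    correct: bool


def accuracy_by_exemplar(responses):
    """Return the proportion correct for old and new exemplars separately.

    A category with no items is reported as None rather than 0.
    """
    result = {}
    for label, seen in (("old", True), ("new", False)):
        subset = [r for r in responses if r.seen_in_exposure == seen]
        result[label] = (
            sum(r.correct for r in subset) / len(subset) if subset else None
        )
    return result


# Hypothetical posttest responses from one learner.
responses = [
    ItemResponse("a", True, True),
    ItemResponse("b", True, True),
    ItemResponse("c", True, False),
    ItemResponse("d", False, True),
    ItemResponse("e", False, False),
]
print(accuracy_by_exemplar(responses))
```

Reporting the two proportions side by side makes it visible when a learner performs well on old exemplars only, which would speak to item rather than system learning.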

Type of Learning (Implicit vs. Explicit) Learning can also be viewed as being implicit, that is, without awareness, or explicit, that is, with awareness. Type of learning, then, refers to whether the process of learning involved the construct of awareness. Awareness is defined by Tomlin and Villa (1994) as “a particular state of mind in which an individual has undergone a specific subjective experience of some cognitive content or external stimulus” (p. 193). Note that implicit or explicit learning is different from implicit or explicit knowledge, that is, the former takes place in Stages 1 and 3 while the latter resides in Stage 4 and is usually measured beyond Stage 5. Type of learning (implicit versus explicit) will be fully elaborated in Chapter 10.

Measuring the Construct of Learning in SLA Now with this tri-dimensional perspective of what comprises learning in this book in mind, which will form the foundation for my proposed model of the L2 learning process in Instructed SLA (Chapter 12), the next step is to situate the experimental and assessment tasks employed in SLA studies to measure the construct of learning within the fine-grained framework of the L2 learning process presented in Chapter 2.

[Figure: Stages of the learning process in SLA: Of processes and products. INPUT (product: input) > Stage 1 (process: input) > INTAKE, Stage 2 (product: intake) > Stage 3 (process: intake) > INTERNAL SYSTEM, Stage 4 (product: L2 knowledge) > Stage 5 (process: L2 knowledge/output) > OUTPUT (product: representative L2 knowledge). Measures listed beneath the stages: eye-tracking, recognition, reaction time, stimulated recall, verbal reports, confidence ratings, source attribution. Receptive tasks: MC recognition, MC interpretation/selection, MC interpretation. Productive tasks: fill-in-the-blank, oral elicited imitation, GJT (plus correction), etc.]

Regarding measurement of learning as a process or a product, once again the stage along the learning process from which data are elicited will indicate what type of process or product is being measured. With regard to learning as a process, be it item learning (exposure to old exemplars) or system learning (evidenced by a task or test that contains new exemplars), there are currently three major concurrent data elicitation procedures being employed in the SLA field, namely eye-tracking, online verbal reports or think aloud protocols, and reaction time (these are discussed in more detail in Chapter 8). At the input and intake processing stages, the three procedures may be used to gather data on learners’ processing and processes. As will be discussed later, whether processing or processes are the focus of the study will determine the selection of the concurrent data elicitation procedure. With regard to learning as a product, there are several measures that can be employed to elicit data on learner attention or noticing reportedly paid (stimulated recalls), learner intake (e.g., offline MC recognition, interpretation), which would qualify as minimally addressing intake as a product held in working memory or lodged in the internal system as un-systemized discrete items, knowledge (e.g., GJT, fill-in-the-blank, etc.), and type of knowledge, whether implicit or explicit (e.g., offline verbal reports, confidence ratings, and source attribution).

Receptive Assessment Tasks As can be seen, receptive tasks and tests are situated to address an early stage along the learning process, namely, the (processing of) intake held in working memory (e.g., Leow, 2000) or in relation to some kind of knowledge already stored in the internal system (e.g., Leung & Williams, 2014). Receptive tasks and tests can be further divided into concurrent (e.g., interpretation/selection) and non-concurrent (e.g., offline MC recognition, interpretation). Let us take a closer look at two popular receptive tasks employed in current SLA research, namely, the MC recognition (e.g., Godfroid et al., 2013; Leow, 2000) and the MC interpretation (e.g., VanPatten & Cadierno, 1993; Williams, 2005) tasks. Leow (2000) employed a four-option offline MC recognition task of old items immediately after exposure to the L2 data and reported that the aware group improved significantly from the pretest to the immediate posttest. Based on this assessment task alone, Leow could only report with some certainty that the ability to recognize the old targeted items was due to the presence of these items currently held in working memory (or episodic memory), and could only assume, without hard evidence or much confidence, that such items were also internalized as discrete items. However, a controlled written production test was also administered, and gain scores on this test were also significant. These scores on a production test provided tangible evidence that targeted items were indeed internalized in the learners’ system, which allowed their production. However, neither assessment task (recognition or production) addressed

new exemplars, so this study only investigated item or un-systemized learning and could not report whether such learning was robust enough to remain over a longer period of time since no delayed posttest was administered. The interpretation of this study, then, is simple, as seen below: For this sample of adult beginning learners of Spanish, the findings of the present study appear to indicate that learners who demonstrated awareness of the targeted morphological forms during the experimental exposure took in and produced in writing significantly more of these forms when compared with the group that demonstrated a lack of such awareness. Also, aware learners significantly increased their ability to recognize and produce the targeted morphological forms in writing after exposure, whereas the unaware group did not. (Leow, 2000: 573) Godfroid et al. (2013) employed an 18-option offline MC recognition task of old items immediately after exposure to the L2 data and reported that participants learned the targeted items. Did the participants learn the targeted items or did they simply demonstrate an ability to identify or recognize such items immediately after exposure (cf. Leow, 2000)? In other words, were participants accessing these items from their internal system in order to successfully recognize them on the posttest, or were these items still in working or episodic memory that was providing the source of these items? The inclusion of a production test or a delayed recognition posttest in the research design could have provided some tangible evidence that these items had progressed from working memory to the internal system as discrete items, which would then provide evidence of internalization or learning. 
The interpretation task is widely employed in the Processing Instruction (PI) strand of research (e.g., VanPatten & Cadierno, 1993), mainly as an offline assessment task, and in the implicit learning strand, in which it has been used as both an online (e.g., Leung & Williams, 2014) and an offline assessment task (e.g., Williams, 2005). Participants are exposed during a treatment phase or after instruction to usually two pictures or stimuli (cf. Morgan-Short & Bowden, 2006, for a three-option format and Hama & Leow, 2010, for a four-option format) and are asked to select one option based on a sentence that they hear or see. The task is premised on the activation of some type of knowledge that participants gained during the experimental treatment phase. In other words, this task is not similar to the recognition assessment task described above (Godfroid et al., 2013; Leow, 2000) given that it is closely linked to the assumed deployment of some type of knowledge in the responses. This explains why this type of receptive task is placed physically and visually closer to the internal system in the figure above, reflecting a closer connection to this system when compared to the simple recognition task. Significant performances measured by mean scores in the PI strand and mean scores statistically above chance in

the implicit learning strand are reported as successful learning. While the PI strand typically includes both production and delayed posttests that may provide tangible evidence of some type of learning having taken place, the implicit strand relies on concurrent or immediate posttest performances, including generalization items, on the interpretation task to do so. Given the brevity of the exposure, including an immediate written production test and delayed posttests in the research design would clearly contribute to a better understanding of the robustness of what was learned and whether memory may be playing a key role in participants’ performances.
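What "statistically above chance" means can be illustrated with an exact binomial calculation for a single learner's item-level responses. This is only a sketch under stated assumptions: the numbers (30 correct out of 40 two-option items) are hypothetical, the function name is invented for illustration, and published studies may instead compare group mean scores to chance with other tests.

```python
from math import comb


def binomial_p_above_chance(n_correct: int, n_trials: int, chance: float) -> float:
    """One-sided exact binomial p-value.

    Returns P(X >= n_correct) for X ~ Binomial(n_trials, chance), that is,
    the probability of scoring at least this well by guessing alone.
    """
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical learner: 30 of 40 two-option interpretation items correct.
p = binomial_p_above_chance(n_correct=30, n_trials=40, chance=0.5)
print(f"p = {p:.5f}")
```

A small p-value here says only that guessing is an unlikely explanation for the score; it does not say whether the items were held in working memory or internalized, which is why production tests and delayed posttests remain informative.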

Productive Assessment Tasks Productive tasks and tests, on the other hand, due to their placement in the output stage, are only able to address learning as a product or knowledge given that their immediate source of information resides in the learner’s internal system, which will be used to access such knowledge or lack thereof to perform the test. These tasks and tests are also used to access data in relation to type of knowledge (implicit vs. explicit) and kind of knowledge (systemized or un-systemized). Regarding the kind of knowledge or what is assumed to be representative of knowledge stored in the internal system, a production assessment task that elicits learners’ generation of “old” exemplars of targeted linguistic information may be measuring knowledge of the targeted linguistic data (Stage 4) that potentially have not been fully internalized systematically, while generation of “new” exemplars is interpreted as system learning with its ability to be applied to novel contexts. What these tasks and tests are unable to address is the type of learning (implicit vs. explicit) that occurred before the L2 data became internalized into the learner’s system. In other words, only assumptions can be made with regard to the process or type of learning, that is, whether learners were aware or unaware of target L2 data while processing this information. Now, let us test our understanding of the above. Go ahead and reread the above before taking the test. With some prior knowledge of the content, it should be easier to process.

Putting Learning to the Test I posed the question earlier regarding your definition of learning, and I have provided the one that guides this book, namely that learning is the internalization of new L2 data in the internal system, whether the data are novel or restructured. Ideally, evidence of learning should be evaluated minimally on an interpretation or productive test and retention should be measured. Let us now visit an empirical study reported in Science Now (Grainger et al., 2012) and see whether we can identify and report on what specifically took place in this study.

Five non-English-speaking participants were exposed to English words consisting of four capital letters, for example, “DONE” or “LAND,” and “nonsense” words, for example, “DRAN” or “LONS.” A key difference between the words and nonsense words was that the latter contained pairs of letters (e.g., “HT”) that are infrequent in English. A word was shown on a computer screen and the participants touched the correct shape on the screen: an oval on the right of the screen if the word was real, and a cross on the left if it was nonsense. Whereas participants saw the real words many (500) times during the trials, the nonsense words were rarely repeated. Each participant completed between 40,000 and 60,000 trials over the course of a month and a half. It was reported that at least one participant demonstrated the ability to distinguish scores of real words at 75% accuracy above chance (a statistically significant result). Let us now return to the stages along the learning process. What does the result indicate based on the assessment task and what assumptions or interpretations can we make? This participant

1. learned to read the words (make form-meaning connection). (Internalization/systemized learning)
2. learned to recognize the words only and did not learn to read the words. (Intake)
3. learned to both recognize and read the words. (Intake and internalization/systemized learning)
4. learned the words. (Internalization/item learning)
5. oops, none of the above.

Which answer did you choose? For answer 1, where is the evidence that s/he did or did not learn to read? The research design did not address or measure this ability, so we cannot choose this answer or, similarly, answer 3. Answer 2 assumes that the participant only took in the words, kept them in working memory, and was subsequently able to recognize them. This may be a plausible answer, but then again the exposure took place over a month and a half, which puts a question mark on the role of working memory. If you chose answer 4, the argument that can be made here is that, due to this long period of time, the recognized words were internalized minimally as discrete items with no connection to their meanings, which would result in item learning. Answer 4, then, may be a plausible option. Or did you choose 5 and hope for the best? OK, now let us replace the term “participant” with the real subject of the study (a really cute baboon) and return to the first answer: the baboon learned to read. Intuitively we will say “no,” but where is the evidence? Although we know intuitively that the baboon was clearly not able to make word-meaning connections, that is, she did not learn to read (though this ability was not measured in the study, so the question is still unanswered), it may be argued that she was trained to recognize common arrangements of letters (visual word recognition) among similar-looking English nonsense words, and the possibility may exist that she did retain knowledge of these words in her internal system as discrete items that allowed her to recognize them after a month and a half of exposure (Answer 4). The moral of the study: We can only report on what we asked participants to do in the study and must avoid making too many assumptions beyond this level.

Conclusion In summary, learning in this framework, and in this book, is defined as any new L2 data, correct or incorrect, novel or restructured, that enter the learner’s internal system, while any data before this stage are assumed to be held in the learner’s working memory with some potential to be further processed and learned or internalized. The process of learning and the type of learning are better operationalized via the use of concurrent measures, while the product of learning is better served by both online and offline measures. Based on the measures and type of assessment tasks we employ in our research designs to address L2 learning, it is recommended that we report our findings either based on the stage of the learning process or simply by reporting what participants did in the study. Now it is time to explore, in the next chapter, some popular data elicitation procedures employed in SLA to address internal cognitive processes.

References

Chan, R., & Leung, J. (2014). Implicit learning of natural language stress rules. Second Language Research, 30, 463–484.
Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A psychometric study. Studies in Second Language Acquisition, 27, 141–172.
Fernández, C. (2008). Re-examining the role of explicit information in processing instruction. Studies in Second Language Acquisition, 30, 277–305.
Fu, Q., Dienes, Z., Shang, J., & Fu, X. (2013). Who learns more? Cultural differences in implicit sequence learning. PLoS ONE, 8(8), e71625. doi:10.1371/journal.pone.0071625.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Godfroid, A., Boers, F., & Housen, A. (2013). An eye for words: Gauging the role of attention in L2 vocabulary acquisition by means of eye tracking. Studies in Second Language Acquisition, 1–35.
Grainger, J., Dufau, S., Montant, M., Ziegler, J. C., & Fagot, J. (2012). Orthographic processing in baboons (Papio papio). Science, 336, 245–248.
Grey, S., Williams, J. N., & Rebuschat, P. (2014). Incidental exposure and L2 learning of morphosyntax. Studies in Second Language Acquisition, 29, 1–35.
Hama, M., & Leow, R. P. (2010). Learning without awareness revisited: Extending Williams (2005). Studies in Second Language Acquisition, 32, 465–491.
Hamrick, P., & Rebuschat, P. (2012). How implicit is statistical learning? In P. Rebuschat & J. Williams (Eds.), Statistical learning and language acquisition. Boston: de Gruyter.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford: Pergamon.
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware vs. unaware learners. Studies in Second Language Acquisition, 22, 557–584.
Leung, J. H. C., & Williams, J. N. (2014). Crosslinguistic differences in implicit language learning. Studies in Second Language Acquisition, 29, 1–23.
McDaniel, M. A., Cahill, M. J., Robbins, M., & Wiener, C. (2013). Individual differences in learning and transfer: Stable tendencies for learning exemplars versus abstracting rules. Journal of Experimental Psychology: General, 143(2), 668–693.
Medina, A. (2015). The variable effects of level of awareness and CALL versus non-CALL textual modification on adult L2 readers’ input comprehension and learning. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Morgan-Short, K., & Bowden, H. W. (2006). Processing instruction and meaningful output-based instruction: Effects on second language development. Studies in Second Language Acquisition, 28, 31–65.
Rosa, E. M., & Leow, R. P. (2004a). Computerized task-based exposure, explicitness and type of feedback on Spanish L2 development. Modern Language Journal, 88, 192–217.
Rosa, E. M., & Leow, R. P. (2004b). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25, 269–292.
Rossomondo, A. E. (2007). The role of lexical temporal indicators and text interaction format in the incidental acquisition of the Spanish future tense. Studies in Second Language Acquisition, 29, 39–66.
Shekary, M., & Tahririan, M. H. (2006). Negotiation of meaning and noticing in text-based online chat. The Modern Language Journal, 90(4), 557–573.
Stafford, C., Bowden, H., & Sanz, C. (2012). Optimizing language instruction: Matters of explicitness, practice, and cue learning. Language Learning, 62(3), 741–768.
Tomlin, R. S., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–203.
VanPatten, B. (2004). Input processing in SLA. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 5–31). Mahwah, NJ: Lawrence Erlbaum.
VanPatten, B., & Cadierno, T. (1993). Explicit instruction and input processing. Studies in Second Language Acquisition, 15, 225–243.
Williams, J. N. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, 269–304.

8 LOCATION, LOCATION, LOCATION Probing Inside the Box

In Chapter 6, we discussed the importance of achieving a high level of internal validity in our empirical studies so that our readers can have some level of confidence that we have addressed what specifically we set out to investigate, and, hopefully, the findings can be extrapolated to the classroom setting. Achieving a high level of internal validity is of paramount importance when we probe into learner processing and internal processes, and the preferred way to access data that illuminate internal cognitive processes is arguably via concurrent or online data-elicitation procedures, that is, data gathered while participants are exposed to or interacting with the L2. In other words, without concurrent data on what the learner is currently thinking, processing, or reacting to during exposure to the L2 data, in many cases only speculations or assumptions can be made regarding this processing or these processes. I have chosen to discuss in this chapter, in some detail, three concurrent data-elicitation procedures currently employed in the SLA field, namely eye-tracking (ET), reaction time (RT), and think aloud protocols (TA) or online verbal reports, and in lesser detail two other procedures that are offline (stimulated recalls and verbal reports). This does not mean that other procedures are less important, but I think that these three are more relevant to addressing the early stages of the learning process, namely the input-to-intake and intake processing stages (Stage 1 and Stage 3). The major characteristics of the RT, ET, and TA procedures in SLA research design are provided together with a finer-grained methodological analysis of the construct of awareness in SLA.

Concurrent Data-Elicitation Procedures in SLA In Leow (2000), I discussed the advantages of online or concurrent versus offline or post-exposure data-elicitation measures (for example, performance on a posttest, post-exposure questionnaires, etc.) to gather information on learners’

internal processes. I pointed out that the major difference between the two kinds of data-elicitation measures may be the level of internal validity of the study with respect to the information on learners’ actual performances during exposure to and/or interaction with the L2 data. Online process measures provide relatively more substantial evidence of processing and processes being measured than offline measures and thus are, by nature, higher in internal validity. The terms “processing” and “processes” are typically conflated in the SLA literature, and one way to distinguish them may be to view “processing” as an event taking place and “processes” as what (e.g., attention, awareness, knowledge) are being employed during this event. Offline measures can only make inferences as to whether learners, for example, paid attention to, became aware of, or employed prior knowledge during the processing of targeted items in the input and, consequently, constitute a broad-grained measurement of cognitive processing or processes. In addition, one other benefit of using online process measures is the opportunity for qualitative analyses that provide a richer source of information on learners’ internal processes when compared to quantitative analyses. Recently my colleagues and I (Leow, Grey, Marijuan, & Moorman, 2014) provided a critical overview of these three concurrent data-elicitation procedures in the SLA literature, specifically in relation to the early stages of the L2 learning process. Let us take a brief look at each of these procedures and their benefits and limitations by beginning with the latest arrival on the scene to address the early stages of the L2 learning process: Reaction time.

Reaction Time (RT) Reaction time (RT) measures have been a popular procedure in psychology and other non-SLA fields since the 1800s and have been used to address a range of issues that include retrieval of information from short-term memory (e.g., Klatsky & Smith, 1972) and long-term memory (e.g., Anderson, 1970), parallel and serial information processing (e.g., Egeth, Marcus, & Bevan, 1972), the psychological representation of semantic and logical representations, naming and letter classification tasks (e.g., Posner, 1978), and selective attention (e.g., Pachella, 1973). The measure itself was also a topic of research, seeking to understand what factors could cause or account for variation in RTs, especially with respect to the speed-accuracy trade-off (e.g., Yellott, 1971), and implicit learning (e.g., Reber & Allen, 1978), such as the co-occurrence of cues and first-order dependencies in sequence structure (e.g., Nissen & Bullemer, 1987), first-order dependencies (e.g., Frensch, Buchner, & Lin, 1994) and higher-order dependencies in sequence structure (e.g., Cleeremans & McClelland, 1991). Additionally, RT studies addressed the contextual cueing paradigm, largely carried out with spatial cues on non-linguistic targets (e.g., Chun, 2000) and first language acquisition of word segmentation (e.g., Saffran, Newport, & Aslin, 1996), verb distribution (e.g., Wonnacott, Newport, & Tanenhaus, 2008), and the acquisition of syntax (e.g., Chang, Dell, & Bock, 2006).

The Standard Procedure to Collect RT Data The standard procedure for collecting reaction time data is to ask participants to press a button on a keyboard, computer mouse, or response box as quickly and accurately as possible in response to a particular stimulus. In simple reaction time experiments, participants press the button whenever a pre-defined stimulus appears, such as a particular tone or image. In recognition reaction time experiments, participants respond only to certain stimuli (e.g., a memory set), but not to other stimuli (e.g., a distractor set). Finally, in choice reaction time experiments such as lexical decision or grammaticality judgment, participants are asked to press a pre-assigned button for a certain decision (e.g., “Yes, the sentence is good” or “Yes, that is a word”) and a different button for an alternative decision (e.g., “No, the sentence is bad” or “No, that is not a word”). Before beginning the experimental task, many RT experiments provide participants with practice or warm-up trials so that they can become accustomed to the task demands and to responding quickly. Given its popularity in cognitive psychology for investigating various types of information processing, memory, and implicit learning, it is not surprising to find a similarly wide range of research strands in SLA that have adopted the RT measure to study a variety of topics, including automaticity (e.g., DeKeyser, 2001; Segalowitz & Segalowitz, 1993), feedback (e.g., Lyster & Izquierdo, 2009), explicit instruction (e.g., Sanz, Lin, Lado, Bowden, & Stafford, 2009), L2 processing (e.g., Alarcón, 2009), and, more recently, implicit learning (e.g., Leung & Williams, 2011, 2012, 2014).

Benefits of RT Based on the strands of reaction time research that are most pertinent to the topic of this book, namely automaticity, L2 processing, and implicit learning, the benefits of RTs include the exploration of (1) theoretical issues related to linguistic processing, specifically of gender and gender agreement, and native and non-native learner processing differences; (2) different speeds of processing for certain linguistic cues (e.g., animacy vs. noun class); (3) the role of L2 automaticity; and (4) the online operationalization of type of learning. At the same time, it may be worthwhile to discuss the notion of automaticity more fully given that this notion appears to underlie most uses of RT in the SLA field. The notion of automaticity refers to the instances when “we perform aspects of a task automatically, we perform them without the need to invest additional effort and attention . . . Also, performance appears to be more efficient; it is faster, more accurate, and more stable” (Segalowitz, 2003: 383). Automaticity has been operationalized as faster processing, ballistic (unstoppable) processing, or that which functions the same regardless of the amount of information to be processed (load independent). It has also been framed in terms of effortless or unconscious processing (Segalowitz, 2003). In SLA, the view of automaticity as being “faster processing” appears to be the most dominant angle (Hulstijn, Van Gelderen, & Schoonen, 2009). As mentioned earlier, it appears that many strands of RT research in SLA have subsumed, either explicitly or implicitly, the notion of automaticity in their research designs. Many of the SLA studies that used RT measures appear to place a premium on faster reaction times, which is reflective of effortless or unconscious processing (e.g., Leung & Williams, 2011, 2012, 2014), positive effects of feedback or instructional context (barring decreases in accuracy; e.g., Lyster & Izquierdo, 2009), and speed of processing (e.g., Alarcón, 2009). As such, data evidencing faster mean reaction times compared to some experimental baseline (a priori assumptions of no or limited knowledge, or a pretest measure, for example) would be considered evidence of L2 automaticity. In other words, automaticity has been empirically measured by assuming that decreases in average reaction time over the course of experimental observation index increased automatic processing on the part of the L2 learner. However, some researchers argue that it is not mean reaction time per se, but instead the coefficient of variation (calculated using participant mean reaction times and standard deviations) that is the most informative measure of automaticity, in that it may tease apart automatic processing and speeded-up control-like processing (Segalowitz, 2003; Segalowitz & Segalowitz, 1993; but see also Hulstijn et al., 2009, and Lim & Godfroid, 2014, for lexical decision and semantic classification in L2).
The theoretical and empirical concern is essentially that “faster processing,” as indexed by analyses of mean reaction times, may not adequately distinguish between automatically carrying out a task and quickly applying control-related procedures to the task, where such a distinction would be crucial in L2 automaticity research. Note, however, that this analytic debate on reaction time data (i.e., means versus coefficients of variation) is somewhat tangential to favoring the collection of RT data for research on L2 automaticity. The use of reaction time measures to answer research questions on this issue seems to be a valid and potentially powerful application, assuming that researchers’ operationalization of automaticity coincides with the “faster processing” perspective and that this perspective appropriately captures qualitative differences in processing over time. However, it remains an open issue whether this is in fact the most valid operationalization and how RT data might be used both for differentiating between operationalizations of automaticity and especially for determining the status of automatic and controlled processing. With future research using RTs that is premised on L2 automaticity, researchers should be cognizant of all of these factors before making strong conclusions about how speed of processing, as measured by mean RTs, translates to one type of processing (automatic) compared to another (speeded-up control), and whether speed of processing is the counterpart of controlled processing, as opposed to automatic processing closely tied to implicit learning and knowledge.
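The contrast between mean RT and the coefficient of variation can be made concrete with a small calculation. The sketch below is only illustrative: the reaction times are invented, the function name is hypothetical, and it assumes the coefficient of variation is computed per participant as the standard deviation divided by the mean, in line with the description above.

```python
from statistics import mean, stdev

def coefficient_of_variation(rts_ms):
    """Return SD / mean for one participant's reaction times (in ms)."""
    if len(rts_ms) < 2:
        raise ValueError("need at least two reaction times")
    return stdev(rts_ms) / mean(rts_ms)

# Hypothetical data for one participant at three points in time.
session_1 = [900, 1100, 700, 1300]  # mean 1000 ms
session_2 = [450, 550, 350, 650]    # mean 500 ms: every RT simply halved
session_3 = [480, 520, 470, 530]    # mean 500 ms: faster AND more stable

cv_1 = coefficient_of_variation(session_1)
cv_2 = coefficient_of_variation(session_2)
cv_3 = coefficient_of_variation(session_3)

for label, rts, cv in [("session 1", session_1, cv_1),
                       ("session 2", session_2, cv_2),
                       ("session 3", session_3, cv_3)]:
    print(f"{label}: mean = {mean(rts):.0f} ms, CV = {cv:.3f}")
```

Sessions 2 and 3 have the same mean RT, so an analysis of means alone cannot tell them apart. In session 2 every response is proportionally faster, which leaves the coefficient of variation unchanged from session 1; only in session 3 does the coefficient drop. How such a drop should be interpreted remains subject to the debate cited above.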

Eye-Tracking (ET) Did you know that early information about eye movement behavior, obtained centuries ago, was actually achieved using visual observation of the eyes? Eye-tracking is the process of measuring either the point of gaze (that is, where one is looking, which could be lateral or vertical) or the motion of an eye relative to the head. An eye tracker is a device that measures eye positions and voluntary or involuntary movements of the eye, which helps in obtaining, fixating, and tracking visual stimuli. Not surprisingly, eye trackers are heavily used in research on the visual system, which forms part of the central nervous system that gives us the ability to process visual details as we receive them. Do you recall the studies on visual perception so popular in non-SLA attentional theories? Fortunately, with the advances in technology, studies on the usefulness of the eye-tracker have revealed a close link between the eye and the mind (Carpenter & Just, 1976; Rayner, 1998), that is, there may be a direct relationship between eye movements and underlying cognitive processes. It is an uncontroversial measure of the allocation of overt attention (Blair, Watson, Walshe, & Maj, 2009), and a close link between covert attentional processes and eye movement has been established (see Godfroid, Boers, & Housen, 2013; Rayner, 1998; and Wright & Ward, 2008, for reviews). Interestingly, it has even been proposed that eye-tracking data can be used to measure cognitive effort by means of intensity and time (observed through pupillary dilation), both of which can be captured by eye-fixation location and eye-movement time (Kahneman, 1973).

The Standard Procedure to Collect ET Data Eye-tracking data are typically gathered in a laboratory setting and, depending on the type of equipment (e.g., a head-mounted, video-based eye-tracker like EyeLink II or a remote eye-tracker like Tobii 1750), some time is spent calibrating the eye-tracker to participants’ eyes in order to accurately record their gaze direction. Participants are usually given a practice run before actual data collection begins.

Benefits of ET The use of eye-tracking in SLA research is a recent effort to employ another concurrent data-elicitation procedure to address the initial stages of the learning process with, not surprisingly, much focus on the attention paid to L2 input by participants. In SLA, the eye-tracking procedure has been employed to address, for example, the constructs of attention and noticing in L2 development (e.g., Ellis et al., 2012; Godfroid, Housen, & Boers, 2013; Smith, 2010, 2012), L2 sentence and discourse processing while reading (e.g., Foucart & Frenck-Mestre, 2012), and L2 speech processing (e.g., Lew-Williams & Fernald, 2010).

The benefits of eye-tracking methodology include the following: (1) it is non-intrusive (Dussias, 2010; Godfroid et al., 2013), (2) it has been argued to measure overt attention (i.e., conscious focus, Ellis et al., 2012) and learner processing (e.g., L2 gender agreement) and to detect very subtle effects in relation to when and where difficulties occur during syntactic processing, as well as the extent of the difficulty (Foucart & Frenck-Mestre, 2012), (3) it offers high temporal resolution and the ability to divide reading time into distinct components during online L2 sentence comprehension (e.g., Dussias & Sagarra, 2007), (4) it is clearly superior to other measures of reading, such as self-paced reading, in that it allows for the naturalness of reading to take place, (5) it is arguably the most robust measure of learner attention given the type of data it gathers in relation to participants’ eye movements, and (6) unlike other concurrent procedures, it can provide insights even into what has only been peripherally attended to in the input.
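To illustrate what dividing reading time into distinct components can look like in practice, here is a minimal sketch. It rests on several assumptions that are not taken from the studies cited above: the three measure names (first-fixation duration, gaze duration, total time) follow common usage in reading research, the fixation record is invented, and the function ignores complications such as a region being skipped on the first pass.

```python
def reading_measures(fixations, region):
    """Split reading time on one region of interest into three components.

    `fixations` is a chronological list of (region_index, duration_ms)
    pairs; durations are assumed to be positive.
    """
    first_fixation = 0   # duration of the first fixation on the region
    gaze_duration = 0    # sum of fixations before the eyes first leave it
    total_time = 0       # sum of all fixations on the region, rereading included
    first_pass_done = False

    for reg, dur in fixations:
        if reg == region:
            total_time += dur
            if not first_pass_done:
                if gaze_duration == 0:
                    first_fixation = dur
                gaze_duration += dur
        elif gaze_duration > 0:
            # The eyes have left the region after fixating it at least once.
            first_pass_done = True

    return {
        "first_fixation": first_fixation,
        "gaze_duration": gaze_duration,
        "total_time": total_time,
    }

# Hypothetical record: the reader fixates region 1 twice, moves on to
# region 2, then returns to region 1 before continuing to region 3.
fixations = [(0, 200), (1, 250), (1, 180), (2, 220), (1, 300), (3, 210)]
print(reading_measures(fixations, region=1))
```

In this invented record the first fixation on region 1 lasts 250 ms, the first pass totals 430 ms, and the later regression brings the total time to 730 ms. Separating early from late measures in this way is what allows researchers to ask when, and not only whether, a processing difficulty arises.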

Online Verbal Reports or Think Aloud Protocols (TA) The use of online or concurrent verbal reports to investigate participants’ cognitive processing, thought processes, and strategies in many areas of psychology, cognitive science, and education is not a new data-elicitation procedure. Indeed, their use has been documented extensively in other fields since the 1950s. I have elected to discuss the think aloud procedure last given its extensive usage to address the language learning process when compared to reaction time and eye-tracking procedures. Since TAs appear to have the advantage of providing insights into learners’ cognitive processes as opposed to simple processing, as demonstrated in both non-SLA and L1 literature, the strands of research that have employed TAs in SLA also include L2 reading and writing (e.g., Cohen & Cavalcanti, 1987), as well as comparisons between L1 and L2 strategies (e.g., Yamashita, 2002), L2 test-taking strategies (e.g., Cohen, 2000), translation (e.g., Jaaskelainen, 2000), interlanguage pragmatics (e.g., Kasper & Blum-Kulka, 1993), and L2 attention and awareness studies (e.g., de la Fuente, 2015; Alanen, 1995; Hama & Leow, 2010; Hsieh, Moreno, & Leow, 2015; Leow, 1997, 1998a, 1998b, 2000, 2001a, 2001b; Martínez-Fernández, 2008; Medina, 2015; Rosa & Leow, 2004a, 2004b; Rosa & O’Neill, 1999; Sachs & Suh, 2007 [cf. Bowles, 2010, for a review]). However, all verbal reports are not equal. It is important to point out the different methods of eliciting verbal reports, broadly categorized as either introspective (concurrent or online) or retrospective (online or offline) and metacognitive or non-metacognitive (Ericsson & Simon, 1993).

Introspective vs. Retrospective Introspective verbalization is gathered as participants are performing a task. Hence, verbalizations are not constrained by memory. Retrospective verbalization is usually conducted immediately after some form of processing has taken place, either during specific breaks in the actual task (online) or immediately after the completion of the task (offline). This type of verbalization has been critiqued for the potential effects of memory constraints and reconstructive processes, that is, additional information reported in one’s recall of the data (Nisbett & Wilson, 1977). Ericsson and Simon (1993) advise that retrospective protocols be used with caution, since it is impossible to “rule out the possibility that the information [subjects] retrieve at the time of the verbal report is different from the information they retrieved while actually performing the experimental task” (p. xii) or to rule out the issue of veridicality, that is, whether memory decay could be playing a role in the protocols.

Metacognitive vs. Non-Metacognitive Verbal Reports In non-metacognitive verbalization, learners are focused on the task, with the think-aloud secondary, and only voice their thoughts without explaining them (Type 1 and Type 2 verbalization). In metacognitive verbalization, the researcher may ask for specific information (e.g., reasoning or explanation), and learners provide a metacognitive report on what they think their processes are (Type 3 verbalization). Cohen (2000) distinguished metacognitive verbalizations from non-metacognitive verbalizations by characterizing the former as self-observational and the latter as self-revelational. In order for verbalizations to reflect learners’ processes, it has been recommended that introspective, non-metacognitive verbalizations be gathered (Cohen, 2000; Ericsson & Simon, 1993) “to avoid this problem of accessing information at two different times—first during the actual cognitive processing and then at the time of report” (Ericsson & Simon, 1993: xiii).

To Think Aloud or Not to Think Aloud: The Issue of Reactivity One of the prominent critiques of the TA procedure is that it is intrusive and may be subject to the issue of reactivity, that is, whether thinking aloud could have affected participants’ primary cognitive processes while engaging with the L2 or even added an additional processing load or secondary task on participants, which would not reflect a pure measure of their thoughts. Additionally, as noted by Rosa and O’Neill (1999), TAs may also present considerable variation due to individual differences. At the same time, as pointed out in Leow et al. (2014), the level of intrusiveness may depend on the type of protocol employed (non-metacognitive vs. metacognitive) and the type of experimental task employed (e.g., problem-solving vs. reading). Other variables may include working memory, language of report, and proficiency level. To empirically address this methodological issue in SLA (as in other non-SLA fields), Leow and Morgan-Short (2004) reported the failure to find a reactive effect on participants’ performances after a reading exposure when compared to a control group. It is noteworthy that they also cautioned readers that “given the many variables that potentially impact the issue of reactivity in SLA research methodology, it is suggested that studies employing concurrent data-elicitation procedures include a control group that does not perform verbal reports as one way of addressing this issue” (p. 50). The reactivity strand of research grew exponentially in the second part of the decade with several studies (e.g., Bowles, 2008; Bowles & Leow, 2005; Egi, 2008; Rossomondo, 2007; Sachs & Polio, 2007; Sachs & Suh, 2007; Sanz et al., 2009; Yoshida, 2008) addressing the issue of reactivity in relation to various variables. While a cursory glance at the eight studies (with a total of ten experiments) published in this period would reveal one study reporting positive (Sanz et al., 2009) and one reporting negative effects in one of two experiments (Sachs & Polio, 2007, in which the protocols were produced in the L2), and another reporting positive effects (Rossomondo, 2007), a recent meta-analysis (Bowles, 2010) has reported an effect size value that “is not significantly different from zero” (p. 138), that is, it is not a reliable effect. A few more recent empirical studies (Morgan-Short, Heil, Botero-Moriarty, & Ebert, 2012; Stafford, Bowden, & Sanz, 2012) reported similar findings, while Stafford et al. (2012) also appeared to contradict Sanz et al.’s (2009) reactive findings in one of their experimental groups. Goo (2010), on the other hand, reported negative reactivity for comprehension based on a trend toward statistical significance (p = .054) with a medium effect size (d = .62). Another recent study (Yanguas & Lado, 2012) addressed the issue of reactivity in the written mode (learners writing in their heritage language) and reported positive reactivity in terms of fluency and accuracy.
Currently, the standard methodological practice in research designs employing concurrent TAs is to follow Leow and Morgan-Short’s (2004) suggestion cited above.
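Because the reactivity findings above are reported partly as effect sizes (e.g., d = .62), a brief sketch of how a standardized mean difference can be computed for a think-aloud group and a silent control group may be helpful. The scores below are invented, and the pooled-standard-deviation version of Cohen’s d is an assumption; the studies cited above may have computed their effect sizes differently.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b), pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical comprehension scores (not data from any cited study).
silent_control = [14, 16, 15, 17, 13]   # mean 15
think_aloud = [12, 14, 13, 15, 11]      # mean 13

d = cohens_d(silent_control, think_aloud)
print(f"d = {d:.2f}")  # positive d: the control group scored higher
```

With these invented scores the control group outperforms the think-aloud group, which is the pattern that would be described as negative reactivity. With groups this small, however, such a difference would say little on its own, which is one reason meta-analytic estimates across studies carry more weight.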

The Standard Procedure to Collect TA Data The standard procedure to collect concurrent data is to ask participants to think aloud while performing an experimental task and to record the protocols produced during this experimental phase for subsequent coding. To guide participants to produce non-metacognitive protocols, instructions usually request that participants be as natural as possible as they perform the experimental task, think aloud constantly from the time they start the task until they finish it, and not try to plan out or explain what they are saying. The data collected are then coded to establish the presence or absence of the cognitive construct(s) under investigation. Before beginning the experimental task, many TA experiments provide participants with practice or warm-up so that they can become accustomed to the task demands.

Here is a typical instruction for participants to follow, taken from Bowles (2008): INSTRUCTION In this experiment I am interested in what you think about when you complete these tasks. In order to find out, I am going to ask you to THINK ALOUD as you work through the mazes. What I mean by “think aloud” is that I want you to verbalize your thoughts the entire time you are working on the tasks. I would like you to talk CONSTANTLY. Do not plan out what you are saying or explain what you’re saying. Just act as if you are alone in the room talking to yourself while you complete the tasks. What is most important is that you keep talking throughout and talk clearly into the microphone. You can speak in English. Just say whatever passes through your mind as you complete the tasks.

Benefits of TA The benefits of TA protocols gathered in the SLA field include information on (1) participants’ allocation of attention to or noticing of targeted forms or structures in the input (e.g., Alanen, 1995; Leow, 1997, 2001b; Martínez-Fernández, 2008), (2) the operationalization of the construct of (un)awareness (Hama & Leow, 2010; Leow, 2000), (3) the roles of different levels of awareness (e.g., Leow, 1997, 2000; Martínez-Fernández, 2008; Rosa & Leow, 2004b; Rosa & O’Neill, 1999, Sachs & Suh, 2007), (4) different levels or depths of processing and strategies employed (e.g., Hama & Leow, 2010; Hsieh et al., 2015; Leow, Hsieh, & Moreno, 2008; Morgan-Short et al., 2012; Qi & Lapkin, 2001; Rott, 2005), and (5) different types of processing, that is, conceptually-driven (activation of prior knowledge) versus data-driven processing (e.g., de la Fuente, 2015; Leow, 1998a). From a methodological perspective, TA protocols have revealed important data about the representativeness of participants within an experimental cell (e.g., Alanen, 1995; Hama & Leow, 2010; Leow, 1997, 1998a, 1998b, 2000; Rosa & Leow, 2004a, 2004b; Rosa & O’Neill, 1999) and evidence of additional exposure to targeted forms or structures via posttests (Hama & Leow, 2010; Leow, 2000). Overall, these protocols assist in ascertaining, to a certain extent, whether participants’ performances are directly linked to what they received within each different experimental condition. As discussed earlier in Chapter 6, several studies that have employed concurrent data-elicitation procedures (think alouds) have clearly revealed that the assumption that all participants in a cell are going to perform according to the specific conditions created for that cell needs to be carefully addressed. This is an issue that can clearly have an impact on the internal validity of the study. 
For example, Leow (2000) found that even though all the participants in this study received the same instructions and the same experimental exposure task, the think-aloud protocols clearly revealed that one-half performed differently from the other half (cf. also Alanen, 1995; Leow, 1997, 1998a, 1998b; Leow et al., 2008; Rosa & Leow, 2004a, 2004b; Rosa & O’Neill, 1999). These findings raise the serious question of how representative learners’ performances in experimental groups can claim to be in studies premised on the minimal role of attention or awareness if efforts have not been made to ascertain what learners really attended to or became aware of while exposed to or interacting with L2 data. In other words, not all participants may represent what constitutes an experimental group. This is clearly a very important methodological issue that needs to be carefully addressed by future empirical studies on the process of attention in SLA. As can be seen, concurrent data have provided a wealth of data that allow us to peek a little deeper into not only the roles of constructs such as attention and awareness, but also how participants process L2 data, that is, insights into the roles of depth or levels of processing, levels of awareness, and activation of prior knowledge and potential interactions between them during processing. I shall return to these insights in Chapter 11.

Major Characteristics of the RT, ET, and TA Procedures in SLA Research Design Leow et al. (2014: 122) provided a useful chart summarizing the major characteristics of the three concurrent data-elicitation procedures in SLA research design and the processes that may be measured in the early stages of the L2 learning process. They pointed out that all three concurrent data-elicitation procedures have their strengths and limitations, and one way to maximize the strengths of a particular procedure while minimizing its weaknesses may be to employ a procedural combination of ET, RT, and TA that aims to increase the level of internal validity of the study (cf. also Godfroid & Uggen, 2013; Leow, 2013; and Winke, 2013, for similar suggestions). As can be seen, the three concurrent data-elicitation procedures have common and distinct characteristics. TA relies on verbalizations, while both ET and RT use time as their unit of measure (with ET also addressing eye movement). All three procedures interpret the data gathered to address how learners are processing the L2 data. However, while ET and RT make assumptions regarding the underlying cognitive processes, TA directly observes such processes from the raw data (do you recall the abundance of raw data provided above?). Regarding the impact of the procedure on the task and learner, unlike TA, both ET and RT procedures are non-intrusive. At the same time, the level of intrusion in TA may depend on the type of task (e.g., problem-solving vs. reading) or protocol (metacognitive vs. non-metacognitive). Finally, both RT and TA are relatively easy to incorporate in any research design, while implementing ET is dependent upon access to expensive equipment.

TABLE 8.1 Summary of major characteristics of the three concurrent data-elicitation procedures in SLA research design and processes that may be measured in the early stages of the L2 learning process

| Major characteristics | Eye-tracking | Reaction time | Think aloud |
|---|---|---|---|
| Measure | Time; eye movement behavior | Time | Protocols |
| Evidence of how learners are processing | Interpretable; assumptions | Interpretable; assumptions | Interpretable; observation |
| Impact on Task/Learner | Non-intrusive | Non-intrusive | Intrusive (dependent upon task and type of TA) |
| Usage | Requires access to equipment: head-mounted or remote eye-tracker. Laboratory restricted. | Requires access to computer or handheld electronic device. | Requires access to voice recorder. |
| Input–intake system processes | | | |
| Attention | Yes | Yes | Yes |
| Peripheral Attention | Yes | No | No |
| Awareness | Dichotomy: +/– noticing | N/A | Continuum |
| Levels of Awareness | No | N/A | Yes |
| Processing | Yes | Yes | Yes |
| Depth of Processing | Low | Low | Low and above |
| Type of Processing | Data-driven; conceptually-driven? | Data-driven | Data-driven; conceptually-driven |

Source: Leow, Grey, Marijuan, & Moorman (2014: 122)

Pertaining to the cognitive processes postulated to play a role in the early stages of the learning process, all three procedures share the ability to address the construct of attention, with preference given to ET to provide more robust data, especially on lower levels of attention (e.g., peripheral). Similarly, all three procedures may be employed to address a low level of processing that includes data-driven processing. When data are to be used to address higher levels (e.g., conceptually-driven processing) or differential amounts of cognitive effort, preference is given to TA that relies on raw verbalizations. Finally, both ET and TA may be employed to investigate the construct of awareness, but this may depend upon the operationalization of awareness, that is, whether as a dichotomy (ET) or a continuum (TA), with the TA procedure preferred to obtain more robust data to address levels of awareness.

Offline Procedures I am going to briefly discuss two other procedures employed to address internal processes but that are administered offline or after the actual exposure to the L2 data. The first is offline verbal reports.

Offline Verbal Reports Arguably one of the most popular procedures for measuring awareness in non-SLA fields and recently employed in the SLA field (e.g., Williams, 2004, 2005; Leung & Williams, 2011, 2012, 2014) is to prompt participants to verbalize any rule they might have learned during the experimental or treatment phase of the study (e.g., Dienes, Broadbent, & Berry, 1991; Reber, 1967; see Rebuschat, 2013, for a review). Prompts include questions such as what criteria they had used to make choices (Williams, 2005) or whether they had any feelings about when certain targeted items were used (Leung & Williams, 2011, 2012). Participants were then coded as unaware if they did not provide any minimal references to the targeted underlying rule or connection, and if their performance on a subsequent assessment task (for example, a grammaticality judgment task) was statistically above the chance level, researchers assumed that the knowledge that was employed to perform the task correctly was unconscious. Did you notice that knowledge is in italics? This is because, if you recall the stages of the L2 learning process in Chapter 2, gathering data offline has moved away from the initial concurrent stage of intake processing or encoding (Stage 3) to a stage beyond Stage 5 that is at the non-concurrent (offline) stage of retrieval of stored knowledge of the targeted rule (more of this below), that is, when learners indicate offline (beyond Stage 5) after they have processed the incoming information whether they were aware of the targeted underlying rule or connection during the experimental exposure. Operationalizing awareness or lack thereof at this stage views learning, then, as a product that is more closely associated with learned knowledge at Stage 4, and may not represent whether awareness or lack thereof occurred at Stages 1 and 3. I also pointed out that offline measurement procedures may not address the issue of veridicality or memory decay.
Indeed, memory decay may fail to capture sporadic instances of learner awareness along the L2 learning process during the experimental phase or subsequent exposure to targeted items during the testing phase (cf. Hama & Leow, 2010: 484 for one think aloud exemplar of such instances). In addition, offline data-elicitation procedures have been critiqued by both cognitive psychology (e.g., Eriksen, 1960; Shanks, Green, & Kolodny, 1994) and SLA researchers (e.g., Bialystok, 1979; Leow & Hama, 2013) as being an inaccurate or insensitive measure of awareness. In addition to memory decay and/or fabrication due to the time lag between the exposure and recall when using offline protocols, the validity issue also includes a mismatch between the actual knowledge employed by the learner to process the L2 data and what is being sought (Shanks & St. John, 1994), differential test sensitivity (that is, the withholding of knowledge that learners may not be confident enough to report (Berry & Dienes, 1993)), or even an inability at that point in time to describe the underlying rule in metalinguistic or non-metalinguistic terminology (Leow, 2015). In addition, Eriksen (1960) provides several important aspects upon which the operational meaning of a definition based on data gathered at a non-concurrent stage hinges critically: [T]he adequacy of the questioning of the subject (S), the motivation of the S to respond with the care and precision that is required, the care taken to assure that the S understands what is being asked him, consideration of the effects of the interpretation itself upon the delicate process of awareness, and most importantly an adequate schema for classifying the S’s verbalizations along relevant dimensions. (p. 280)
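The coding logic described in this subsection (no reference to the rule in the offline report, combined with above-chance performance on a later test) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names and category labels are hypothetical, a one-sided exact binomial test against a 50% chance level stands in for whatever statistical test a given study used, and the check is applied to a single participant although published studies typically test group performance.

```python
from math import comb

def p_above_chance(correct, trials, chance=0.5):
    """One-sided exact binomial p-value: P(X >= correct) under pure guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

def classify(mentions_rule, correct, trials, alpha=0.05):
    """Code one participant from an offline report plus a test score."""
    if mentions_rule:
        return "aware"
    if p_above_chance(correct, trials) < alpha:
        # Under the assumption discussed above, this pattern is read
        # as evidence of unconscious knowledge.
        return "unaware, above chance"
    return "unaware, at chance"

# Hypothetical participants on a 20-item grammaticality judgment task.
print(classify(mentions_rule=False, correct=15, trials=20))
print(classify(mentions_rule=False, correct=12, trials=20))
print(classify(mentions_rule=True, correct=12, trials=20))
```

Note that the sketch makes the weak link in the inference visible: the label attached to the first participant depends entirely on the offline report being an accurate and sensitive record of what the learner was aware of during exposure, which is precisely what the critiques above call into question.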

Stimulated Recall (SR) The stimulated recall procedure (Gass & Mackey, 2000) is the second retrospective (offline) procedure employed in SLA to access participants’ reflections on their cognitive or mental processes during an oral interaction (or task) in which they had previously participated. SR was clearly motivated by the inability of the interactionist strand of research to address empirically the important role of attention/noticing in relation to feedback, especially recasts, during meaning-focused oral interaction. Recasts are a type of negative feedback that an interlocutor provides in response to an L2 learner’s erroneous utterance that also maintains the L2 learner’s original meaning. The procedure is quite simple. The interaction is recorded or videotaped and, after the interaction, participants are shown the recording and asked to recall or verbalize what they were thinking or noticing at specific points (especially during feedback episodes) during the interaction. If learners make any mention of the targeted item or the corrective nature of the feedback, this is taken as evidence of noticing. Given the issues of memory (cf. Egi, 2004; Philp, 2003, who attempted to address this issue), retrieval, timing, and instructions, Mackey and Gass (2005), in an effort to strengthen the robustness of the administration of SR, provided recommendations that include (1) administering the SR as soon as possible after the interaction, (2) training participants to minimally carry out the procedure without providing irrelevant information, (3) employing a strong stimulus, and (4) allowing participant involvement in the selection and control of the stimulus episodes to decrease researcher interference. According to Mackey and Gass, SR is an effective way to gain the perspectives of learners, their interpretation of events, and their thinking at a particular point in time.
SR has been subsequently employed in several areas of research inside and outside of the interactionist strand (e.g., Beers, Boshuizen, Kirschner, Gijselaers, & Westendorp, 2006; Egi, 2004; Mackey, Al-Khalil, Atanassova, Hama, Logan-Terry, & Nakatsukasa, 2007; Mackey, Gass, & McDonough, 2000; Mackey, Philp, Egi, Fujii, & Tatsumi, 2002; Sime, 2006). Like offline verbal reports, SR also suffers from the issue of veridicality or memory decay, in addition to double exposure to the targeted item under investigation in the interaction. However, unlike offline verbal reports, which have a counterpart procedure online, SR, despite its limitations, is clearly a methodological effort to gather some data on the cognitive processes potentially employed during oral interaction and should be continued with the limitations in mind. At the same time, the increased interest in the synchronous computer-mediated communication (SCMC) platform, which has been argued to possess characteristics relatively similar to those of oral interaction (e.g., Li, 2010), may provide the opportunity to probe a bit deeper into these cognitive processes by obtaining concurrent data during said SCMC (e.g., Baralt, 2013; Gurzynski-Weiss, Al Khalil, Baralt, & Leow, 2015).

Summary Now that we have discussed the importance of achieving a high level of internal validity and an appropriate data-elicitation procedure that addresses adequately the internal construct that we are investigating, it is time to view, from a global perspective, a finer-grained methodological approach to the study of the construct of awareness in both non-SLA and SLA fields and its effects on learning (including both L2 and non-L2 learning).

Toward a Finer-Grained Methodological Analysis of the Construct Awareness in SLA As mentioned earlier, investigating the construct of awareness, specifically in relation to learning, is thorny in both SLA and non-SLA fields, and to get a broader and clearer overview of this issue, my colleagues and I reviewed representative studies in different fields from the perspective of the what (is being learned), the where (awareness is being investigated, concurrently or nonconcurrently), and the how (experimental task, type and location of measurement employed to investigate awareness). To fully appreciate this report, we need to keep in mind once more the theoretical framework in SLA that I presented in Chapter 2. Our review of both non-SLA and SLA fields provided several interesting revelations that we proposed future studies need to consider carefully in any investigation of or report on the role of awareness or lack thereof in learning in non-SLA and SLA fields. First of all, the “what” in non-SLA and SLA fields was inherently different, with the latter field investigating naturally occurring languages that are intrinsically different from much of the targeted information employed in the non-SLA fields, such as colors, sequences, nonsense words, pictures, symbols, etc. (cf. Leow, 2000; Leung & Williams, 2011; Simard & Wong, 2001). The closest targets to naturally occurring languages found in non-SLA fields are the artificial languages employed by researchers such as Reber and his associates, though artificial languages cannot be equated with naturally occurring languages, for obvious reasons. “Where” awareness was investigated is determined by the stage along the learning process at which measurement of awareness is performed. Recall that we can investigate awareness at the concurrent (online) stage of encoding or accessing the incoming experimental information, that is, where learners receive and process online the incoming information (Stage 3). We can also investigate awareness at the non-concurrent (offline) stage of retrieval of stored knowledge of the construct, that is, where learners indicate offline after they have processed the incoming information (beyond Stage 5) whether they were aware of the targeted underlying rule during the experimental exposure. The non-SLA fields predominantly chose the retrieval stage for their measurement, while the SLA field had opted, until recently, to measure awareness more at the stage of encoding, that is, concurrently. Indeed, since 2010 we have witnessed an increase of implicit learning studies in SLA (e.g., Chan & Leung, 2014; Faretta-Stutenberg & Morgan-Short, 2011; Hama & Leow, 2010; Leung & Williams, 2011, 2012, 2014) that have shifted the stage of measurement to the offline stage due to a strong methodological influence from the field of cognitive psychology/science. We noted that many studies appeared to conflate the role of awareness at one stage along the learning process (that is, from input to intake to output) with the role played at a different stage beyond this process.
The stage at which awareness is operationalized and measured brings us back to the issue of whether we are addressing learning as a process (concurrent) or learning as a product (non-concurrent). It is quite debatable to address learning as a product (e.g., implicit knowledge) and then claim with much confidence that the data can inform us of how the data got to be such a product. We reported that “how” awareness was measured was also fairly different in the non-SLA and SLA fields, but with the advent of the recent implicit learning studies in SLA this difference has numerically changed. The popular offline measurements in the non-SLA field, namely questionnaires, verbal reports, confidence ratings, and source attributions, are now fairly predominant in both fields in terms of percentages. In addition to the what, the where, and the how, our careful review of the awareness literature in the SLA field also revealed that investigating this construct needs to take into account the findings that there does appear to be levels of awareness (cf. Leow, 1997; Martínez-Fernández, 2008; Rosa & O’Neill, 1999; Sachs & Suh, 2007). I will elaborate on these findings in Chapter 11.

Conclusion

I hope this chapter and section have provided you with a healthy methodological dose of the requirements for research designs in SLA studies that address the constructs of attention and awareness and the roles they should have in L2 development so we can place confidence in their findings. The importance of employing appropriate measures to address internal processes cannot be overstated. Armed with this methodological tool, let us take a look at the empirical studies on cognitive processes.

References

Alanen, R. (1995). Input enhancement and rule presentation in second language acquisition. In R. W. Schmidt (Ed.), Attention and awareness in foreign language learning (pp. 259–302). Honolulu, HI: University of Hawai’i, Second Language Teaching and Curriculum Center.
Alarcón, I. (2009). Applied linguistics in the processing of gender agreement L1 and L2 Spanish. Hispania, 92, 814–828.
Anderson, J. A. (1970). Two models for memory organization using interacting traces. Mathematical Biosciences, 8, 137–160.
Baralt, M. (2013). The impact of cognitive complexity on feedback efficacy during online versus face-to-face interactive tasks. Studies in Second Language Acquisition, 35, 689–725.
Beers, P. J., Boshuizen, H. P. A., Kirschner, P. A., Gijselaers, W., & Westendorp, J. (2006). Cognitive load measurements and stimulated recall interviews for studying the effects of information and communication technology. Education Technology Research Development, 56, 309–328.
Berry, D. C., & Dienes, Z. (1993). Implicit learning: Theoretical and empirical issues. Hove: Lawrence Erlbaum.
Bialystok, E. (1979). Explicit and implicit judgments of L2 grammaticality. Language Learning, 29, 81–103.
Blair, M. R., Watson, M. R., Walshe, R., & Maj, F. (2009). Extremely selective attention: Eye-tracking studies of the dynamic allocation of attention to stimulus features in categorization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 1196–1206.
Bowles, M. A. (2008). Task type and reactivity of verbal reports in SLA: A first look at an L2 task other than reading. Studies in Second Language Acquisition, 30, 369–387.
Bowles, M. A. (2010). The think-aloud controversy in second language research. New York: Taylor & Francis.
Bowles, M., & Leow, R. P. (2005). Reactivity and type of verbal report in SLA research methodology: Expanding the scope of investigation. Studies in Second Language Acquisition, 27(3), 415–440.
Carpenter, P. A., & Just, M. A. (1976). Eye fixations and cognitive processes. Cognitive Psychology, 8, 441–480.
Chan, R., & Leung, J. (2014). Implicit learning of natural language stress rules. Second Language Research, 30, 463–484.
Chang, F., Dell, G. S., & Bock, K. (2006). Becoming syntactic. Psychological Review, 113(2), 234–272.

Chun, M. M. (2000). Contextual cueing of visual attention. Trends in Cognitive Sciences, 4(5), 170–178.
Cleeremans, A., & McClelland, J. L. (1991). Learning the structure of event sequences. Journal of Experimental Psychology: General, 120, 235–253.
Cohen, A. D. (2000). Exploring strategies in test taking: Fine-tuning verbal reports from respondents. In G. Ekbatani & H. Pierson (Eds.), Learner-directed assessment in ESL (pp. 127–150). Mahwah, NJ: Lawrence Erlbaum Associates.
Cohen, A. D., & Cavalcanti, M. C. (1987). Viewing feedback on compositions from the teacher’s and the student’s perspective. ESPecialist, 16 (Apr), 13–28.
DeKeyser, R. M. (2001). Automaticity and automatization. In P. Robinson (Ed.), Cognition and Second Language Instruction (pp. 125–151). Cambridge: Cambridge University Press.
De la Fuente, M. J. (2015). Explicit corrective feedback and computer-based, form-focused instruction: The role of L1 in promoting awareness of L2 forms. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Dienes, Z., Broadbent, D. E., & Berry, D. C. (1991). Implicit and explicit knowledge bases in artificial grammar learning. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 875–882.
Dussias, P. E. (2010). Uses of eye-tracking data in second language sentence processing research. Annual Review of Applied Linguistics, 30(1), 149–166.
Dussias, P. E., & Sagarra, N. (2007). The effect of exposure on syntactic parsing in Spanish-English bilinguals. Bilingualism: Language and Cognition, 10(1), 101–116.
Egeth, H., Marcus, N., & Bevan, W. (1972). Target-set and response-set interaction: Implication for models of human information processing. Science, 176, 1447–1448.
Egi, T. (2004). Verbal reports, noticing, and SLA research. Language Awareness, 13(4), 243–264.
Egi, T. (2008). Investigating stimulated recall as a cognitive measure: Reactivity and verbal reports in SLA research methodology. Language Awareness, 17, 212–228.
Ellis, N. C., Hafeez, K., Martin, K. I., Chen, L., Boland, J., & Sagarra, N. (2012). An eye-tracking study of learned attention in second language acquisition. Applied Psycholinguistics, 1(1), 1–33.
Ericsson, K., & Simon, H. (1993). Protocol Analysis: Verbal reports as data (Revised ed.). Cambridge, MA: MIT Press.
Eriksen, C. W. (1960). Discrimination and learning without awareness: A methodological survey and evaluation. The Psychological Review, 67, 279–300.
Faretta-Stutenberg, M., & Morgan-Short, K. (2011). Learning without awareness reconsidered: A replication of Williams (2005). In G. Granena, J. Koeth, S. Lee-Ellis, A. Lukyanchenko, G. Prieto Botana, & E. Rhoades (Eds.), Selected proceedings of the 2010 second language research forum: Reconsidering SLA research, dimensions, and directions (pp. 18–28). Somerville, MA: Cascadilla Proceedings Project.
Foucart, A., & Frenck-Mestre, C. (2012). Can late learners acquire new grammatical features? Evidence from ERPs and eye-tracking. Journal of Memory and Language, 66(1), 226–248.
Frensch, P. A., Buchner, A., & Lin, J. (1994). Implicit learning of unique and ambiguous serial transitions in the presence and absence of a distracter task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 567–584.
Gass, S. M., & Mackey, A. (2000). Stimulated recall methodology in second language research. Mahwah, NJ: Lawrence Erlbaum.

Godfroid, A., Boers, F., & Housen, A. (2013). An eye for words: Gauging the role of attention in L2 vocabulary acquisition by means of eye tracking. Studies in Second Language Acquisition, 35, 483–517.
Godfroid, A., & Uggen, M. S. (2013). Attention to irregular verbs by beginning learners of German. Studies in Second Language Acquisition, 35(2), 291–322.
Goo, J. (2010). Working memory and reactivity. Language Learning, 60(4), 712–752.
Gurzynski-Weiss, L., Al-Khalil, M., Baralt, M., & Leow, R. P. (2015). Levels of awareness in relation to type of recast and type of linguistic item in synchronous computer-mediated communication: A concurrent investigation. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Hama, M., & Leow, R. P. (2010). Learning without awareness revisited: Extending Williams (2005). Studies in Second Language Acquisition, 32(3), 465–491.
Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). Awareness, type of medium, and L2 development: Revisiting Hsieh (2008). In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Hulstijn, J. H., Van Gelderen, A., & Schoonen, R. (2009). Automatization in second language acquisition: What does the coefficient of variation tell us? Applied Psycholinguistics, 30(4), 555–582.
Jaaskelainen, R. (2000). Focus on methodology in think-aloud studies on translating. In S. Tirkkonen Condit & R. Jaaskelainen (Eds.), Tapping and mapping the processes of translation and interpreting: Outlooks on empirical research (pp. 71–82). Amsterdam: John Benjamins.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice-Hall.
Kasper, G., & Blum-Kulka, S. (1993). Interlanguage pragmatics. New York: Oxford University Press.
Klatsky, R. L., & Smith, E. E. (1972). Stimulus expectancy and retrieval from short-term memory. Journal of Experimental Psychology, 94, 101–107.
Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language Learning, 47, 467–506.
Leow, R. P. (1998a). The effects of amount and type of exposure on adult learners’ L2 development in SLA. Modern Language Journal, 82, 49–68.
Leow, R. P. (1998b). Toward operationalizing the process of attention in second language acquisition: Evidence for Tomlin and Villa’s (1994) fine-grained analysis of attention. Applied Psycholinguistics, 19, 133–159.
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware vs. unaware learners. Studies in Second Language Acquisition, 22, 557–584.
Leow, R. P. (2001a). Attention, awareness and foreign language behavior. Language Learning, 51, 113–155.
Leow, R. P. (2001b). Do learners notice enhanced forms while interacting with the L2?: An online and offline study of the role of written input enhancement in L2 reading. Hispania, 84, 496–509.
Leow, R. P. (2013). Schmidt’s noticing hypothesis: More than two decades after. In J. M. Bergsleithner, S. N. Frota, & J. K. Yoshioka (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 23–35). Honolulu, HI: University of Hawai‘i, National Foreign Language Resource Center.
Leow, R. P. (2015). Implicit learning in SLA: Of processes and products. In P. Rebuschat (Ed.), Implicit and explicit learning of languages. Amsterdam: John Benjamins.

Leow, R. P., Grey, S., Marijuan, S., & Moorman, C. (2014). Concurrent data elicitation procedures, processes, and the early stages of L2 learning: A critical overview. Second Language Research, 30(2), 111–127.
Leow, R. P., & Hama, M. (2013). Implicit learning in SLA and the issue of internal validity: A response to Leung and Williams’ “The implicit learning of mappings between forms and contextually derived meanings.” Studies in Second Language Acquisition, 35(3), 545–557.
Leow, R. P., Hsieh, H-C., & Moreno, N. (2008). Attention to form and meaning revisited. Language Learning, 58, 665–695.
Leow, R. P., & Morgan-Short, K. (2004). To think aloud or not to think aloud: The issue of reactivity in SLA research methodology. Studies in Second Language Acquisition, 26, 35–57.
Leung, J. H. C., & Williams, J. N. (2011). The implicit learning of mappings between forms and contextually derived meanings. Studies in Second Language Acquisition, 33, 33–55.
Leung, J. H. C., & Williams, J. N. (2012). Constraints on implicit learning of grammatical form-meaning connections. Language Learning, 62, 634–662.
Leung, J. H. C., & Williams, J. N. (2014). Crosslinguistic differences in implicit language learning. Studies in Second Language Acquisition, 29, 1–23.
Lew-Williams, C., & Fernald, A. (2010). Real-time processing of gender-marked articles by native and non-native Spanish speakers. Journal of Memory and Language, 63, 447–464.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60, 309–365.
Lim, H., & Godfroid, A. (2014). Automatization in second language sentence processing: A partial conceptual replication of Hulstijn, Van Gelderen, and Schoonen’s 2009 study. Applied Psycholinguistics, 35, 1–36.
Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language Learning, 59(2), 453–498.
Mackey, A., Al-Khalil, M., Atanassova, G., Hama, H., Logan-Terry, A., & Nakatsukasa, K. (2007). Teachers’ intentions and learners’ perception about corrective feedback in the L2 classroom. Innovation in Language Learning and Teaching, 1, 129–152.
Mackey, A., & Gass, S. M. (2005). Second language research: Methodology and design. Mahwah, NJ: Lawrence Erlbaum.
Mackey, A., Gass, S. M., & McDonough, K. (2000). How do learners perceive interactional feedback? Studies in Second Language Acquisition, 22, 471–497.
Mackey, A., Philp, J., Egi, T., Fujii, A., & Tatsumi, Y. (2002). Individual differences in working memory, noticing of interactional feedback, and L2 development. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 181–210). Amsterdam: John Benjamins.
Martínez-Fernández, A. (2008). Revisiting the involvement load hypothesis: Awareness, type of task and type of item. In M. Bowles, R. Foote, S. Perpiñán, & R. Bhatt (Eds.), Selected proceedings of the 2007 Second Language Research Forum (pp. 210–228). Somerville, MA: Cascadilla Proceedings Project.
Medina, A. (2015). The variable effects of level of awareness and CALL versus non-CALL textual modification on adult L2 readers’ input comprehension and learning. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.

Morgan-Short, K., Heil, J., Botero-Moriarty, A., & Ebert, S. (2012). Allocation of attention to second language form and meaning: Issues of think alouds and depth of processing. Studies in Second Language Acquisition, 34(4), 659–685.
Nisbett, R., & Wilson, T. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231–259.
Nissen, M. J., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19, 1–32.
Pachella, R. G. (1973). An interpretation of reaction time in information processing research. In B. Kantowitz (Ed.), Human information processing: Tutorials in performance and cognition (pp. 41–82). Hillsdale, NJ: Erlbaum.
Philp, J. (2003). Constraints on “noticing the gap”: Non-native speakers’ noticing of recasts in NS-NNS interaction. Studies in Second Language Acquisition, 25, 99–126.
Posner, M. I. (1978). Chronometric explorations of mind. Hillsdale, NJ: Erlbaum.
Qi, D. S., & Lapkin, S. (2001). Exploring the role of noticing in a three-stage second language writing task. Journal of Second Language Writing, 10, 277–303.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Reber, A. S. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and Verbal Behavior, 6, 317–327.
Reber, A. S., & Allen, R. (1978). Analogic and abstraction strategies in synthetic grammar learning: A functionalist interpretation. Cognition, 6(3), 189–221.
Rebuschat, P. (2013). Measuring implicit and explicit knowledge in second language research: A review. Language Learning, 63, 595–626.
Rosa, E. M., & Leow, R. P. (2004a). Awareness, different learning conditions, and L2 development. Applied Psycholinguistics, 25, 269–292.
Rosa, E. M., & Leow, R. P. (2004b). Computerized task-based exposure, explicitness and type of feedback on Spanish L2 development. Modern Language Journal, 88, 192–217.
Rosa, E. M., & O’Neill, M. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21(4), 511–556.
Rossomondo, A. E. (2007). The role of lexical temporal indicators and text interaction format in the incidental acquisition of the Spanish future tense. Studies in Second Language Acquisition, 29, 39–66.
Rott, S. (2005). Processing glosses: A qualitative exploration of how form-meaning connections are established and strengthened. Reading in a Foreign Language, 17, 95–124.
Sachs, R., & Polio, C. (2007). Learners’ uses of two types of written feedback on an L2 writing revision task. Studies in Second Language Acquisition, 29, 67–100.
Sachs, R., & Suh, B. R. (2007). Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 197–227). Oxford: Oxford University Press.
Saffran, J. R., Newport, E., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35(4), 606–621.
Sanz, C., Lin, H-J., Lado, B., Bowden, H. W., & Stafford, C. A. (2009). Concurrent verbalizations, pedagogical conditions, and reactivity: Two CALL studies. Language Learning, 59, 33–71.
Segalowitz, N. (2003). Automaticity and second language. In C. Doughty & M. Long (Eds.), The handbook of second language acquisition (pp. 382–408). Oxford: Blackwell.

Segalowitz, N., & Segalowitz, N. J. (1993). Skilled performance, practice and the differentiation of speedup from automatization effects. Applied Linguistics, 14, 369–385.
Shanks, D. R., Green, R. E. A., & Kolodny, J. A. (1994). A critical examination of the evidence for unconscious (implicit) learning. In C. Umiltà & M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing (pp. 837–860). Cambridge, MA: MIT Press.
Shanks, D. R., & St. John, M. F. (1994). Characteristics of dissociable human learning systems. Behavioral and Brain Sciences, 17, 367–447.
Simard, D., & Wong, W. (2001). Alertness, orientation, and detection: The conceptualization of attentional functions in SLA. Studies in Second Language Acquisition, 23, 103–124.
Sime, D. (2006). What do learners make of teachers’ gestures in the language classroom? International Review of Applied Linguistics in Language Teaching, 44(2), 211–230.
Smith, B. (2010). Employing eye-tracking technology in researching the effectiveness of recasts in CMC. In F. M. Hult (Ed.), Directions and Prospects for Education Linguistics (pp. 79–97). New York: Springer.
Smith, B. (2012). Eye-tracking as a measure of noticing: A study of explicit recasts in SCMC. Language Learning & Technology, 16(3), 53–81.
Stafford, C., Bowden, H., & Sanz, C. (2012). Optimizing language instruction: Matters of explicitness, practice, and cue learning. Language Learning, 62(3), 741–768.
Williams, J. N. (2004). Implicit learning of form-meaning connections. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet (Eds.), Form meaning connections in second language acquisition (pp. 203–218). Mahwah, NJ: Erlbaum.
Williams, J. N. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, 269–304.
Winke, P. M. (2013). The effects of input enhancement on grammar learning and comprehension. Studies in Second Language Acquisition, 35(2), 323–352.
Wonnacott, J. N., Newport, E., & Tanenhaus, M. K. (2008). Acquiring and processing verb argument structure: Distributional learning in a miniature language. Cognitive Psychology, 56(3), 165–209.
Wright, L. M., & Ward, L. M. (2008). Orientation of attention. New York: Oxford University Press.
Yamashita, J. (2002). Reading strategies in L1 and L2: Comparison of four groups of readers with different reading ability in L1 and L2. ITL, Review of Applied Linguistics, 135–136, 1–35.
Yanguas, I., & Lado, B. (2012). Is thinking aloud reactive when writing in the heritage language? Foreign Language Annals, 45(3), 380–399.
Yellott, J. I. (1971). Correction for fast guessing: The speed-accuracy tradeoff in choice reaction times. Journal of Mathematical Psychology, 8, 159–199.
Yoshida, M. (2008). Think-aloud protocols and type of reading task: The issue of reactivity in L2 reading research. In M. Bowles, R. Foote, S. Perpiñán, & R. Bhatt (Eds.), Selected proceedings of the 2007 Second Language Research Forum (pp. 109–209). Somerville, MA: Cascadilla Proceedings Project.

SECTION 3

Empirical Research Investigating the Role of Attention/Noticing in L2 Development

9 YOUR ATTENTION, PLEASE

The current interest in the crucial role of attention as a construct in learning that arguably began in the early 90s has generated quite a large body of empirical classroom studies that are explicitly or implicitly premised on the role(s) of attention and/or noticing (attention plus a low level of awareness) in L2 development. Not surprisingly, the impetus for this theoretical line of investigation is the assumption that “attention” to form and meaning is necessary for language development in the early stages of the learning process, and since, for some studies, “attentional” resources are limited under the metaphor of limited capacity (recall the attentional models in psychology in Chapter 3), it may be possible to devise specific techniques to promote intake and L2 learning by directing learners’ “attention” to formal aspects of the L2 (but not in the old traditional way of directly teaching grammar). This chapter provides a synopsis of four popular strands of research, with their respective theoretical underpinnings, that are premised minimally on the role of attention/noticing and which are conducted with the intention of promoting L2 development primarily in the classroom setting. Before we proceed to the synopses of the SLA studies conducted under the metaphor of adult L2 learners as limited capacity processors of information (or the strand of “simultaneous attention to form and meaning”), input/textual enhancement, processing instruction, and interactional feedback, let us first address another terminological issue.

Attention Versus Noticing Versus Processing

Did you notice that the term “attention” is surrounded by inverted commas and that I placed “noticing” next to “attention”? Like acquisition and learning in Chapter 7, the terms attention, noticing, and processing sometimes appear to be referring to the same concept. For example, you will see later a conflation of “(level of) attention” and “type of noticing” with “(level of) processing,” as in “attentional levels,” “focused attention,” and “quality of noticing,” in addition to, for example, references to “simultaneous attention to form and meaning” and “blocked attention.” It appears that the term “attention” subsumes “noticing” and further “processing” of the targeted items in the input. However, one can pay attention to an item in the input without that item necessarily being noticed, that is, attended and processed with a low level of awareness à la Schmidt (2001), or even further processed (elaboration, activation of prior knowledge, form-meaning connection, etc.). At this point, simply keep in mind that we need to view the process of attention as having several types or phases dependent upon whether it is peripheral, selective, or focal. These types or phases will be distinguished in Chapter 12.

Empirical Research on Attention/Noticing in SLA

SLA research is quite rich in the number of strands seeking to find ways to promote L2 development, and researchers usually claim to conduct these studies from a classroom-based perspective with the purpose of improving our students’ L2 development. The techniques devised to promote intake and L2 learning by drawing, directly or indirectly, learners’ attention to formal aspects of the L2 (with the implicit hope that they do process a bit deeper what is attended) date back several decades to the early 90s, when researchers became more aware (levels of awareness) of the central role that the construct of attention plays in SLA. Any guess as to the theoretical source of this “new” interest in attention (Chapter 3)? These techniques included aural and written simplification of texts (e.g., Leow, 1993, 1995; Wong, 2001) as well as a series of form-focusing or consciousness/awareness-raising tasks (e.g., Fotos, 1994; Leow, 1997a). In addition, some researchers have investigated whether attention to form (mainly grammatical) may also be encouraged through a variety of input-enhancement and focus-on-form (FonF) techniques (e.g., Doughty, 1991; Lyddon, 2011), input flood or an increase in the frequency of the target form (e.g., J. White, 1998; Williams & Evans, 1998), processing instruction (e.g., Morgan-Short & Wood Bowden, 2006; VanPatten & Cadierno, 1993), learning or training conditions (e.g., Morgan-Short, Sanz, Steinhauer, & Ullman, 2010; Robinson, 1996), the provision of feedback in the oral/aural (e.g., Leeman, 2003; Lyster & Saito, 2010) or written (e.g., Bitchener & Knoch, 2010; Storch & Wigglesworth, 2010) mode, accompanied or not by explicit rule presentation (e.g., Rosa & Leow, 2004a), face-to-face or computerized (e.g., Hsieh, Moreno, & Leow, 2015; Sanz & Morgan-Short, 2004), and so on.

Attention/noticing has been measured by a variety of instruments in these studies that include offline questionnaires (e.g., Alanen, 1995), online uptake charts (Mackey, McDonough, Fujii, & Tatsumi, 2001), learning diaries (e.g., Schmidt & Frota, 1986), online verbal reports (e.g., Leow, 2001a), offline verbal reports such as stimulated recall protocols (e.g., Egi, 2008), and eye-tracking (e.g., Smith, 2012). In addition, in some studies participants were prompted to take notes while reading an L2 text (Izumi, 2002), to underline, circle, or check targeted linguistic structures in written text (Greenslade, Bouden, & Sanz, 1999), or to make a check mark every time a targeted item was heard (VanPatten, 1990).

Quite a large range of linguistic items has also been empirically investigated, and these include Spanish imperatives (e.g., Leow, 1997b), imperfect and preterit forms (e.g., Overstreet, 1998), present perfect forms (e.g., Leow, Egi, Nuevo, & Tsai, 2003), relative pronouns (e.g., Shook, 1994), past conditional (Rosa & Leow, 2004a), Finnish locative suffixes (e.g., Alanen, 1995), English possessive determiners (e.g., J. White, 1998), relative clauses (Izumi, 2002), French past participle agreement (e.g., Wong, 2003), and so on. Different levels of language experience have also been explored, ranging from beginning learners of an L2 (e.g., Alanen, 1995) to intermediate (e.g., Bowles, 2003) to advanced levels (e.g., Rosa & Leow, 2004b). Amount of exposure has also varied, ranging from less than an hour (e.g., Leow, 1997a) to several days (e.g., J. White, 1998).

Overall, the findings of these studies provide strong support for the role of attention and/or noticing in L2 development. However, the research designs of many of these studies did not methodologically tease out the specific role attention played during exposure to incoming L2 data. Before we take a look at some of the popular strands of research and their respective theoretical underpinnings, we need to first discuss the early attentional studies addressing directly the assumed metaphor of adult learners as limited capacity processors of information.

SLA Studies Under the Metaphor of Adult Learners as Limited Capacity Processors of Information

As I mentioned above, we (including yours truly as a young scholar many decades ago) like to follow non-SLA fields to provide a theoretical foundation for our empirical studies. Not surprisingly, then, drawing from the attentional theories postulated in cognitive psychology (Chapter 3), the early 90s witnessed the first SLA studies (Leow, 1993, 1995; Shook, 1994; VanPatten, 1990) conducted under the metaphor of adult L2 learners as limited capacity processors of incoming information.

VanPatten investigated the attentional capacity of adult L2 learners in the aural mode. He hypothesized that attending to both form and meaning simultaneously would result in a cognitive overload. Participants were exposed to Spanish aural input in one of four conditions: Condition 1 required participants to listen for meaning only, condition 2 required them to listen to the word “inflación” (“inflation”) only, condition 3 required them to listen to the definite article “la,” while condition 4 required the participants to listen to the third-person plural morpheme -n. Attention was operationalized by asking participants to mark on a sheet of paper every instance in which the targeted linguistic form was noticed in the input. VanPatten found an overall decrease of comprehension when participants appeared to have attended to la and -n (argued to be of less communicative value) when compared to attention paid to inflación (argued to be of more communicative value). He also found superior performance at the advanced level when compared to the less advanced level, prompting him to postulate that “only when input is easily understood can learners attend to form as part of the intake process” (VanPatten, 1990: 296). In other words, there may be a tendency at the early stages of language learning to process only for meaning because attentional resources are too constrained to accommodate both form and meaning simultaneously. These findings appear to have led to VanPatten’s (1996, 2004, 2007) postulation of his first and major input processing principle, namely, the Primacy of Content Word Principle (cf. Chapter 5), which claims that learners are driven to process or make form-meaning connections of content words (e.g., lexical items such as words) in the input before non-content words such as the, is, and so on.

Inspired by the notion of limited capacity or inability to divide one’s attention to incoming information in the L2, at least five published empirical studies specifically investigated the issue of simultaneous attention to (read: processing of) form and meaning in the SLA literature by conceptually or partially replicating VanPatten’s original study (Greenslade et al., 1999; Wong, 2001) or by extending the research design to address some methodological issues (Han & Peverly, 2007; Leow, Hsieh, & Moreno, 2008; Morgan-Short, Heil, Botero-Moriarty, & Ebert, 2012).
Whereas Greenslade et al.’s replication study changed the input mode from aural to written, Wong conducted a partial replication of both VanPatten (1990) and Greenslade et al. (1999). Her research design differed from both VanPatten’s and Greenslade et al.’s in that it directly compared the aural and written modes within the same participant pool and sought to explore whether similar results would hold across different modalities. In addition, given that her participants were (French) students learning English as a foreign language, VanPatten’s experimental text was translated into English, resulting in the loss of the morpheme -n as one of the targeted forms.

Greenslade et al.’s (1999) results paralleled those found in VanPatten’s (1990) study with one apparently major difference: No significant difference in comprehension was found between the lexical item inflación and verbal morpheme -n groups, arguably the two experimental groups representing the ends of a form continuum in this study in terms of saliency of item. In spite of this contradictory finding, Greenslade et al. concluded that during the early stages of L2 acquisition, processing for meaning and form in the written mode also competes for learners’ limited attentional resources.

In the aural mode, Wong (2001) reported, as VanPatten (1990) did, that participants listening to content only comprehended significantly more than participants listening to the definite article the, but performed statistically similarly to the inflation group. However, differing from VanPatten, no significant difference in comprehension was found between the inflation and the definite article the groups. In the written mode Wong reported, as Greenslade et al. (1999) did, that there was no significant difference in comprehension between the read-for-content-only group and the inflation group. However, her findings differed in the other two conditions that were identical across the two written studies: No differences in comprehension were found between the control and the definite article the groups and between the inflation and the definite article the groups. Overall, only the control and inflation groups’ statistically similar performances supported the previous studies. Wong (2001) concluded that her findings suggest that “learners’ limited attentional capacity is not constrained in the same way during input processing in the aural and written modes” (p. 358; cf. Leow, 1995).

Leow et al. (2008) continued this strand of inquiry in the written mode, making adjustments in the research design to address some methodological issues with the previous studies.
The changes included (a) the collection of online verbal reports as a means of determining the baseline of the research design, that is, whether participants did pay attention to both form and meaning as instructed; (b) the use of a ten-item written multiple choice test as the measure of comprehension instead of the free recall method; (c) the use of sol, meaning ‘sun,’ instead of inflación as the lexical item to control for saliency differences between the lexical and grammatical items; (d) the inclusion of a new grammatical form, lo, a direct object clitic pronoun meaning “him” or “it,” that was claimed to have a higher communicative value than the definite article and the verbal morpheme; and (e) a more even distribution of the targeted forms throughout a new reading passage that had been modified from an authentic Spanish article. As in previous studies, participants in the experimental groups were included in the analysis if they had circled at least 60% of the occurrences of their particular target form. Similar to the results for the written passage in Wong (2001), Leow et al. (2008) found no differences in the level of comprehension between the experimental and control conditions and suggested one plausible explanation for the results based on evidence gleaned from the think aloud protocols, namely that all experimental groups evidenced a low level of processing. A conceptual replication of Leow et al. (Morgan-Short et al., 2012) provided empirical support for the external validity of the original study by reporting similar findings with a larger number of participants (p. 361).

Finally, Han and Peverly (2007), arguing that studies investigating learners’ capacity to simultaneously process form and meaning have “emanated almost exclusively from studies of learners who had some knowledge of the target language” (p. 18), addressed this issue with naïve learners exposed to Norwegian, a language with which none “had any prior experience” (p. 25).
The participants were randomly divided into a Sequential (SQ) and a Simultaneous (SM) group and were then exposed to a written text in Norwegian taken from a popular Norwegian textbook. They reported that directing attention to either form or meaning did not have a significant effect on processing for meaning by naïve learners.

Under the same limited capacity metaphor, Leow (1993, 1995) addressed the issue of simplification on learners’ intake in both written and aural modes. Leow hypothesized that textual simplification of input, resulting in a more statistically comprehensible text, should reduce the processing demands of adult L2 learners, thereby facilitating their intake of the linguistic items under study. In both the written (Leow, 1993) and aural (Leow, 1995) modes, the results were the same: Simplification did not appear to facilitate any significant intake of the targeted forms in the input. In other words, there was no direct evidence that suggested that learners reallocated their attention to form when the processing demands to attend to meaning were significantly reduced. Attention was operationalized by learners’ performances on a post-exposure multiple-choice recognition task.

Shook (1994) investigated the effects of attentional condition on learners’ intake of two linguistic items, the present perfect and the relative pronouns (que and quien), contained in a written Spanish text. Participants were divided into three groups: The first was exposed to the text alone, the second was exposed to the grammatical items bolded with no instructions, and the third received the same enhanced text as the second group together with the request to deduce a grammatical rule for the bolded items. The premise underlying the study was that the saliency of the targeted forms in the input would draw learners’ attention to them while processing the text content. Shook found significant effects for type of attentional condition on learners’ intake of grammatical information contained in written input.
Participants exposed to the enhanced texts presumably paid substantially more attention to the targeted forms when compared to participants not exposed to such enhanced forms in the input. As in Leow (1993, 1995) and VanPatten (1990), Shook’s results also indicated different processing for different linguistic items, where the present perfect form was taken in significantly more than the relative pronoun form. However, he also had a mixed bag of findings with respect to language experience: Second-year learners performed better than first-year learners on the present perfect production task, but the opposite was found for the relative pronoun production task.

Summary

Overall, studies empirically addressing L2 learners’ limited attentional capacity, or ability to simultaneously process both grammatical information and content in the L2, appear to indicate that while modality may play a role in performance due to cognitive constraints, depth of processing may also play some role in accounting for non-significant differences in performance between experimental conditions. The question remains whether it is methodologically possible to separate the processing of grammatical information embedded in a form from processing for meaning, or vice versa. To this end, further studies are required to provide more substantial empirical support for this principle.

The Input or Textual Enhancement Strand

The notion of input enhancement has permeated several distinct strands of research in the pedagogical second language acquisition (SLA) literature (e.g., processing instruction, consciousness-raising, input flooding, focus on form, textual enhancement, interaction). This strand of research is relatively popular, perhaps due to the fact that many researchers (and teachers) feel that enhancing the L2 input does work. Keep in mind that there are several issues (which will be enhanced later) that are very important to consider when reading the literature on input enhancement.

Theoretical Underpinning

The term “input enhancement” was coined by Sharwood Smith (1991, 1993) to replace his previous term “language consciousness-raising” (Sharwood Smith, 1981), that is, the guidance teachers provide for promoting second/foreign language (L2) learners’ self-discovery or conscious awareness of the formal features of the L2. According to Sharwood Smith, input enhancement can be defined from two perspectives: An external perspective, which is any pedagogical attempt (usually by a teacher) to make more salient specific features of L2 input in an effort to draw learners’ attention to such enhanced features (raise your hand if you write on the blackboard and then underline, circle, capitalize, etc., some grammatical point you are trying to underscore! I still do after 40 years of teaching), and an internal perspective, that is, it is the learners’ internal mechanism that makes salient specific features in the input. The major theoretical underpinning of either perspective is, without doubt, that learners need to pay attention to specific items in the input before such information can be taken in, with the potential of being processed further into the learners’ language system.

In an effort to move away from the perception of consciousness-raising as “a complete and unrelenting focus on the formal structure of the TL” (Sharwood Smith, 1981: 160), Sharwood Smith proposed that language consciousness-raising be viewed from two axes: Degrees of elaboration and explicitness leading to four basic types of manifestation of consciousness-raising based on these two axes. From this discussion, it appears that the focus was more on product (explicit knowledge) than on process (input processing).

In 1991, Sharwood Smith appeared to view input enhancement somewhat more from an input processing perspective than a product perspective as previously posited. First of all, he replaced his previous term consciousness-raising with input enhancement, while acknowledging the discrepancy in the two terms’ assumptions regarding the input/intake dichotomy in light of an internal processing (consciousness-raising) versus an external manipulation of the L2 input (input enhancement). In other words, while consciousness-raising assumed that learners became conscious of all the input they were exposed to, leading to some linguistic change in their mental state, input enhancement assumed that manipulated input (by the teacher) may or may not be taken in by the learners. Think of learners not even paying attention to the language. Based on language learnability (e.g., Hornstein & Lightfoot, 1981), that is, that input may be viewed in terms of various kinds of evidence, usually referred to as being either positive or negative, Sharwood Smith cautioned that the enhanced input might or might not be further processed into the language system, that is, L2 knowledge. In other words, this may be interpreted to mean that while enhanced input is noticed and taken in by the learner, such linguistic information may not be further processed due to the kind of evidence to which learners are exposed.

However, in this same article Sharwood Smith also began to question the role of awareness in input enhancement by drawing on notions from non-SLA fields such as “disunity of awareness” (Jackendoff, 1987), that is, the possibility of being aware of something and not aware of it, and suggested that language learning be viewed also from a modular perspective (Fodor, 1983). These issues are elaborated in his 1993 article in which he followed Jackendoff’s (1987) postulation that no activity of the mind is conscious and that, at most, one is only aware of a succession of states. Did anyone notice (with some level of awareness) the connection between this theoretical shift and MOGUL (Truscott & Sharwood Smith, 2011)?

Empirical Studies

Studies purporting to address the benefits of input enhancement over unenhanced input have generally followed Sharwood Smith’s (1991) definition of input enhancement as any pedagogical intervention on the part of the teacher to make targeted items in the L2 input more salient in an effort to draw learners’ attention to these enhanced items. In other words, studies in the research strand of input enhancement are minimally premised on the role of attention, that is, learners exposed to enhanced input should pay more attention to enhanced items in the input, and process them substantially better, when compared to learners not exposed to such enhancement.

Several studies have attempted to address a permutation of the different exemplars of Sharwood Smith’s two axes, that is, elaboration and explicitness. According to Sharwood Smith, exemplars on the elaboration axis range from a one-time signal to indicate a learner error to repeated signals for the same type of error. Exemplars on the explicitness axis range from a facial gesture to a metalinguistically sophisticated rule explanation. However, as I pointed out (Leow, 2009), the definition of input enhancement, based on the two axes (elaboration and explicitness) provided by Sharwood Smith, is arguably too open ended to be empirically tested, and this may have led to several misinterpretations of what comprises an empirical study, with an appropriate research design, setting out to address the effects of the variable input enhancement on comprehension, intake, and L2 development.

Not surprisingly then, what constitutes input enhancement has been interpreted from several perspectives. For example, many studies have visually enhanced written input via the use of bolding, capitalizing, underlining, italicizing, different fonts and sizes, and so on (e.g., Alanen, 1995; Bowles, 2003; Izumi, 2002; Jourdenais, Ota, Stauffer, Boyson, & Doughty, 1995; Kim, 2006; Lee, 2007; Leow, 1997b, 2001b; Leow et al., 2003; Shook, 1994; J. White, 1998; Overstreet, 1998; Winke, 2013). These studies typically compared the performance of an enhanced group with that of an unenhanced group. Other studies have subsumed enhancement (metalinguistic or visual) within some type of instruction, be it focus on form (e.g., Leeman, Arteagoitia, Fridman, & Doughty, 1995) or processing instruction (e.g., Morgan-Short & Wood Bowden, 2006; VanPatten & Cadierno, 1993). Input enhancement has also been employed in the research strand of oral interaction via the use of recasts (e.g., Leeman, 2003) or corrective feedback (e.g., White, Spada, Lightbown, & Ranta, 1991) and also in the CALL strand of research (Gurzynski-Weiss, Al Khalil, Baralt, & Leow, 2015; Lyddon, 2011; Sachs & Suh, 2007). Studies have provided participants the opportunity to discuss metalinguistically targeted grammatical structures in groups (e.g., Fotos, 1994). Other studies have interpreted input enhancement as an overdose or increase of targeted forms or structures in the input by, for example, input flooding (e.g., Williams & Evans, 1998).
Quite a large range of linguistic items has also been empirically investigated in studies of input enhancement that have controlled for this variable in their research designs. These include Spanish imperatives (e.g., Bowles, 2003; Leow, 1997b, 2001b), Spanish imperfect and preterit forms (e.g., Jourdenais et al., 1995; Overstreet, 1998), Spanish present perfect forms (e.g., Leow et al., 2003; Shook, 1994), Spanish relative or object pronouns (e.g., Gurzynski-Weiss et al., 2015; Shook, 1994), Spanish gender agreement (e.g., Gurzynski-Weiss et al., 2015; Leeman, 2003), Finnish locative suffixes (e.g., Alanen, 1995), English possessive determiners (e.g., J. White, 1998), relative clauses (Izumi, 2002), passives (e.g., Winke, 2013), and French past participle agreement (Wong, 2003). Different levels of language experience have also been explored, ranging from beginner learners (e.g., Alanen, 1995) of an L2 to intermediate levels (e.g., Bowles, 2003). Amount of exposure also varies, ranging from less than an hour (e.g., Leow, 1997b) to several days (e.g., J. White, 1998). The typical research design comprised the classic sequence of pretest, exposure or instruction, and posttest, with very few studies including a delayed posttest in the design (e.g., Leeman, 2003; Leow, 2001b; Leow et al., 2003). Posttests typically were designed to measure learners’ intake (e.g., recognition and error correction tasks), written production (e.g., picture-cued, sentence completion, sentence combination, narration, fill-in-the-blank), grammaticality judgment, and L2 comprehension.

The overall findings of the more than twenty studies investigating the effects of input enhancement on L2 development reveal quite inconsistent benefits. These surprising findings may make you want to stop and take a pause. During this pause, it may be noted that many of these studies did not operationalize the process of attention, so we need to take a look at those studies that did in order to arrive at a plausible explanation for the effect or lack thereof of input enhancement.

Studies Employing Concurrent Data Elicitation Procedures

Only six studies to date have employed concurrent data elicitation procedures to establish learner attention to enhanced or unenhanced items in the input: think aloud protocols or online verbal reports (Alanen, 1995; Bowles, 2003; Gurzynski-Weiss et al., 2015; Leow, 2001b; Leow et al., 2003) and eye-tracking (Winke, 2013). One further study (Izumi, 2002) measured noticing via note-taking during exposure to the L2 input, a measurement that may be subject to critique due to the qualitatively poor data it might have provided. Attentional data for most of the studies were typically elicited via the use of non-metacognitive think aloud protocols, that is, learners were requested to think aloud without providing any metacognitive explanation of their thoughts while interacting with the L2 input.

Revelations of Concurrent Data on the Effects of Textual Enhancement

These revelations are going to be brief and perhaps surprising. First of all, the three studies that reported amount of noticing were split into two camps: Two (Bowles, 2003, and Winke, 2013) reported that the enhanced group noticed substantially more enhanced items (in Winke’s study through longer gaze durations and rereading times via an eye-tracker) when compared to the unenhanced group, while Leow (2001b) reported no statistical difference. Given that the eye-tracker is a more robust measure of attention, it can be safely concluded that enhancement does appear to draw more attention to targeted items in the input. Second, Leow (2001b) and Bowles (2003) also reported a few outliers who performed very well on the assessment tasks. Their think aloud protocols revealed awareness at the level of understanding, quite a high level of processing. Third, in spite of this greater amount of attention paid to the enhanced items, the five studies (Alanen, 1995; Bowles, 2003; Leow, 2001b; Leow et al., 2003; and Winke, 2013) that addressed the effects of enhancement on subsequent L2 development reported no benefits for enhancement when compared directly with an unenhanced group, while the sixth (Gurzynski-Weiss et al., 2015), which did not address L2 development, reported no relationship between level of awareness and enhancement. As Winke (2013) concluded, enhancement (in her study) functioned as intuitively and originally proposed by Sharwood Smith (1991, 1993); it promoted substantially more noticing, but in itself was not effective for improved performance when compared to the unenhanced exposure.

What, then, explains the lack of benefits for subsequent L2 development? Two simple plausible explanations were revealed by the protocols and suggested by the eye-tracking data: (1) Participants did not necessarily attempt to process enhanced items in the input for linguistic information at any deep level, but simply tried to extract semantic information from the targeted forms. This is not surprising given that instructions provided to learners in these studies typically requested that they read or listen to the L2 input for information. (2) This issue of depth of processing is underscored by the few participants in the two studies who did demonstrate a higher level of awareness of the targeted items, together with accompanying higher levels of processing, and performed significantly differently from those who processed these items at a low level. Additional nonconcurrent support may come from Shook (1994), whose experimental group, instructed to pay closer attention to the enhanced items and formulate a rule, performed significantly better than the group not receiving this instruction. I shall elaborate more on this depth of processing later in Chapter 11.

Overviews of the Input Enhancement Strand of Research

There are at least two overviews of the input enhancement strand of research (Han, Park, & Combs, 2008; Leow, 2009). Han et al. (2008) pointed out that studies investigating the effects of input enhancement may differ in their conclusions due to several methodological differences found in the research designs: (1) Employing simple versus compound enhancement; (2) employing isolated words versus sentences versus discourse as stimuli; (3) enhancing a meaning-bearing versus a non-meaningful form; (4) employing learners with or without prior knowledge of the target form; (5) enhancing the target form many versus one or a few times; (6) using a longer versus a shorter text; (7) employing a single versus multiple short sessions over an extended period of time; (8) enhancing one form versus multiple forms; (9) providing (or not) comprehension support prior to the treatment; and (10) providing (or not) explicit instruction on what to focus on prior to the treatment (Han et al., 2008).

Given the broad definition that the term input enhancement represents in the SLA literature, I (Leow, 2009) took a more research-design approach to the issue and categorized the studies into two sub-strands: (1) Studies that have incorporated some kind of input enhancement within some pedagogical intervention on the part of the teacher/researcher to make targeted items in the L2 input more salient in an effort to draw learners’ attention to these enhanced items, and (2) those that have methodologically teased out the variable enhancement in their designs.

The first sub-strand, which I called “conflated input enhancement” (CIE), incorporates a permutation of the different exemplars of Sharwood Smith’s two axes, that is, elaboration and explicitness. Input enhancement is operationalized as a conflation of the variable input enhancement plus one or more independent variables (some not methodologically controlled), and the research design employed views this conflation as one independent variable. For example, participants in a CIE study may be exposed to L2 texts in which targeted grammatical forms are highlighted or enhanced in an effort to draw their attention to these forms while they read the texts. In addition to these enhanced texts, participants are also provided with grammatical information (e.g., feedback, explicit metalinguistic information or instruction) regarding the grammatical rules of the targeted forms in the texts. These two different sources of exposure and grammatical information are then combined or conflated to represent input enhancement of the L2 data. The control group is not exposed to any enhanced input or provided with any grammatical information. While the possible direct comparisons between the two groups are (1) the absence or presence of enhancement, (2) the absence or presence of grammatical information, and (3) the absence or presence of enhancement plus grammatical information, the research designs of these studies typically address only the third comparison. In other words, the true effects of enhancement are conflated with those of grammatical information. Did the effects come from enhancement only, grammatical information only, or a combination of both? Only the last question is answered in the CIE design.

The second sub-strand, called “non-conflated input enhancement” (NCIE), avoids conflating enhancement with another variable by methodologically teasing out the variable input enhancement and comparing an enhanced group (+enhanced input) with an unenhanced one (-enhanced input).
For example, one experimental group receives a text enhanced with targeted forms bolded and underlined. The other group (control) receives the same text but without any forms bolded and underlined. No other source of exposure or grammatical information is provided. NCIE studies, then, address only the first direct comparison listed above for the CIE studies, namely the presence or absence of enhancement. Consequently, any statistical difference in performance can be attributed to the variable enhancement and nothing else, everything being equal. To summarize the critical review of the literature on the grammatical benefits of input enhancement, I (2009) reported three major findings.

1. A conflation of input enhancement with one or more variables (e.g., instruction, feedback, metalinguistic information, oral discussion, etc.) together with an exposure that lasts for more than three hours appears to contribute to better grammatical development with the caveat that it is unknown what specific role the variable input enhancement played in this development.
2. In the written mode, the variable input enhancement provided in a period of less than one hour does not appear to hold any superior grammatical benefit when compared to the absence of such enhancement. While learners may report noticing more targeted items in the input (Bowles, 2003; cf. Winke, 2013), the low level of awareness revealed in online data may indicate that internalization of grammatical information may require a higher level of awareness (Leow, 2001b; Leow et al., 2003).
3. The issue of delayed effects of input enhancement on L2 grammatical development has not been robustly investigated.

Summary

In summary, while studies investigating the effects of input enhancement on L2 development do overall report beneficial effects for enhancement in itself, as measured by gain scores from pretest to posttest, whether its effects are superior to those of unenhanced input still requires more robust evidence. Two plausible explanations to account for many of the non-significant differences between enhanced and unenhanced input may lie in (1) the focus on reading for meaning and, as a consequence, (2) the low depth of processing employed in relation to grammatical items in the input. The results from input flooding studies (e.g., J. White, 1998; Williams & Evans, 1998) revealing little benefit appear to support these two plausible explanations.

Processing Instruction Strand

Processing Instruction (PI) is an input-based instructional technique based on VanPatten’s (1996, 2004, 2007) model of learners’ input processing strategies (cf. Chapter 5). According to VanPatten, PI is designed to affect L2 learners’ existing learning strategies employed during input processing, which in turn should have an impact on their developing linguistic system. To this end, PI provides learners first with (a) explicit, non-paradigmatic (i.e., forms and relevant examples presented sequentially) grammatical instruction that includes input through examples and information about a processing strategy (e.g., in Spanish object pronouns usually come before the verb and not after the verb as in English, etc.), (b) structured input practice comprising meaningful activities that are both referential and affective, and (c) implicit feedback, that is, the participants are not corrected explicitly. Structured input is “input that is manipulated in particular ways so that learners become dependent on form and structure to get meaning and/or to privilege the form or structure in the input so that learners have a better chance of attending to it” (VanPatten, 2002: 764–765).

The typical research design of the first set of PI studies is exemplified in VanPatten and Cadierno’s (1993) study in which they empirically investigated the effectiveness of PI when compared to a more traditional instruction (TI) on the acquisition of Spanish preverbal direct object pronouns. Participants were college-level students of Spanish at the second-year level and were divided by class into one of the following three instructional conditions: PI, TI, and control (in which participants simply followed their regular syllabus with no experimental exposure to the targeted L2 form or structure, the classic maturational control group that assumes that over time learning will eventually take place). The instructional differences between the two experimental conditions were the following: For PI, instruction (grammatical explanation) was non-paradigmatic and comprised information on processing strategies to use when exposed to sentences with Spanish preverbal object pronouns, while for TI the instruction was paradigmatic, together with examples. Type of practice also differed. For PI, practice was always meaningful and focused on meaning (input-based), while for TI practice was meaningful sometimes and was focused on production (output-based). L2 development was measured via an interpretation task (participants had to select one of two options) and a controlled written production task (a sentence-level completion).

Empirical Studies

VanPatten and Cadierno (1993) was the first study in this strand of research and reported that PI significantly outperformed TI and control on the interpretation task but performed similarly on the production task when compared to TI. TI outperformed the control on the production task. According to VanPatten and Cadierno, the findings of this study indicate that PI appears to be superior to TI.

VanPatten and Cadierno (1993) was followed by a slew of replication and extension studies. Following the original research design, these studies addressed new linguistic forms or structures (e.g., the Spanish preterit form in Cadierno, 1995; the copulas ser and estar in Cheng, 2002; the Italian future tense in Benati, 2001; the French causative in Allen, 2000; the Spanish subjunctive in Collentine, 1998, etc.) and included a wider range of assessment tasks (e.g., sentence level in VanPatten & Sanz, 1995; a suprasentential production task in Cheng, 2002, etc.). Other studies addressed variables that arose from the first set of PI studies. These included a more fine-grained view of TI as output instruction and its accompanying issue of meaningfulness (e.g., Benati, 2001; Morgan-Short & Wood Bowden, 2006), which appeared to play some role in L2 development, and the role of explicit grammar instruction in relation to structured input only (SIO) or both combined (full PI) (Benati, 2004; Farley, 2004; VanPatten & Oikkenon, 1996). The latter studies revealed that while both full PI and SIO appear to promote L2 development, full PI appeared to hold an edge over SIO.

A more robust study (Morgan-Short & Wood Bowden, 2006) investigated the effects of meaningful input- and output-based practice on L2 development. Forty-five first-semester Spanish students were assigned to PI, meaningful output-based instruction (MOBI), or control groups. PI and MOBI received the same input in instruction but received meaningful practice that was input- or output-based.
Both PI and MOBI showed significant gains on immediate and delayed interpretation and production tasks. Repeated-measures ANOVAs showed that overall for interpretation both experimental groups outperformed the control group. For production, only the meaningful output-based group outperformed the control group. Morgan-Short and Wood Bowden concluded that these results indicate that not only input-based but also output-based instruction and practice can lead to linguistic development.

Summary

The overall findings of the PI strand of research appear to reveal the following results: (1) PI appears to promote L2 development for different linguistic forms (Benati, 2001; Cadierno, 1995; Cheng, 2002; VanPatten & Cadierno, 1993; VanPatten & Sanz, 1995); (2) while PI appears to be more effective than TI in some cases (Cadierno, 1995; VanPatten & Cadierno, 1993), it is noted that the significant differences are found on the interpretation task, and that whenever TI is more meaningful, or when the targeted linguistic form is more semantically complex (a point made by DeKeyser, Salaberry, Robinson, & Harrington, 2002), the advantage of PI is debatable (Benati, 2001; Cheng, 2002); (3) while SIO appears to be sufficient for L2 development, the full form of PI appears to hold more potential for development (Benati, 2004; Farley, 2004; VanPatten & Oikkenon, 1996; Wong, 2004); and (4) arguments made in this strand, that the only way to impact the developing internal system is via input-based exposure, need to be watered down given the finding that not only input-based but also output-based instruction and practice can lead to linguistic development (Morgan-Short & Wood Bowden, 2006).

While VanPatten argues that his 2004 model of input processing, which feeds his pedagogical offshoot Processing Instruction, is only concerned with input processing per se and not with the construct of attention, he does provide a role for attention and, more specifically, noticing prior to input processing. Consequently, based on the consistent positive findings reported in this strand of research, it goes without saying that the role attention and noticing played in these studies clearly was important in subsequent L2 development.

Interactional Feedback Strand

Since Hatch’s (1978: 63) proposal that “language learning evolves out of learning how to carry on conversations, out of learning how to communicate” and Long’s (1981) initial formulation of the Interaction Hypothesis that claimed that “participation in conversation with native speakers, made possible through modification of interaction, is the necessary and sufficient condition for SLA” (p. 275), research on interactional feedback in SLA has grown exponentially over the last three decades. It continues to thrive with the establishment of this hypothesis as an interactionist approach (Gass & Mackey, 2007) that subsumes tenets of other theoretical underpinnings (e.g., Schmidt’s Noticing Hypothesis, Krashen’s Input Hypothesis, and Swain’s Output Hypothesis). The core tenets of the Interactionist Approach are the following: “[I]nteractionally modified input, having the learner’s attention drawn to his/her interlanguage and to the formal features of the L2, opportunities to produce output, and opportunities to receive feedback” (Mackey, Abbuhl, & Gass, 2012: 10). It is arguably one of the most productive strands of research in SLA, having moved away from, according to Mackey et al., addressing whether interaction has an effect on L2 development to probing deeper into the role of individual differences (e.g., working memory, affective factors), the L2 aspects benefitting most from interaction, types of interaction, and, more specifically, types of feedback that benefit L2 learners the most. Indeed, testimony gleaned from three meta-analyses (Li, 2010; Lyster & Saito, 2010; Russell & Spada, 2006) on corrective feedback appears to support the overall conclusions that corrective oral and written feedback in both laboratory and classroom settings was beneficial (Russell & Spada, 2006), that the effects of prompts were longer lasting than those of recasts (Lyster & Saito, 2010), and that explicit feedback was superior to implicit feedback on both immediate and short-delayed posttests, while feedback provided in a formal setting was better than that provided in a second-language setting (Li, 2010).

Probably the most popular type of feedback investigated has been the recast, which is typically defined as a reformulation of the original utterance produced by the L2 learner while keeping the meaning or communicative focus established. However, the effectiveness of oral interaction with recasts for L2 development has been relatively mixed, with some studies reporting limited to no learning benefits for recasts (e.g., Lyster, 2004), and other studies supporting their facilitative role in L2 development (cf. Mackey & Goo’s, 2007, meta-analysis).
For an interesting debate on this issue, the reader is encouraged to read Lyster and Ranta (2013) and Goo and Mackey (2013). At the same time, the popularity of the recast has been expanded to include written feedback in the electronic platform, for example, synchronous computer-mediated communication (SCMC) (e.g., Baralt, 2013; Sachs & Suh, 2007; Yilmaz, 2012; Yilmaz & Yuksel, 2011).

Some major issues associated with the effectiveness of recasts have been raised. One issue lies with learners’ perception of the recast, that is, whether it was a corrective reformulation of their own statement or a simple repetition (e.g., Lyster, 1998; Mackey, Gass, & McDonough, 2000). To reduce misinterpretation, suggestions have been made to increase the salience (e.g., overt prosodic cues, shorter forms, stress or intonation) of recasts (e.g., Loewen & Philp, 2006; Lyster, 1998; Philp, 2003), which is hypothesized to alleviate working memory constraints that may come from processing the juxtaposition of recasts to original non-target-like utterances. Another issue with recast effectiveness is a methodological one that addresses the very premise underlying the provision of recasts, namely the need to ascertain that learners not only perceived or paid attention to the recast (cf. Leow, 1999), but also were minimally aware of the target of this feedback. A corollary of this methodological issue is the dearth of cognitive data we can gather during oral interaction. Methodological efforts have been made to partially address this issue (cf. Gass and Mackey’s (2000) stimulated recall protocol and Philp’s (2003) study, in which Philp used the “knock-knock” technique, stopping the oral interaction several times to ask participants to share their noticing of the targeted form in the recast in an effort to gather concurrent data on her learners’ processes). Ascertaining learners’ attention to and awareness of the target of a recast can only be done by gathering concurrent data while learners are processing the recasts, and the expansion into the technology platform offers some insights into the issues discussed here.

Synchronous Computer-Mediated Communication (SCMC)

The first question pertaining to the type of medium (oral versus computer) is whether the type of interaction performed in an oral setting can be duplicated in a computerized setting. Several studies have empirically demonstrated that interaction in the SCMC medium does appear to provide relatively similar opportunities for interaction, negotiation for meaning, and the provision of feedback when compared to those that occur in the oral mode (cf. Li’s (2010) meta-analysis; Ortega, 2009, for a review).

If we think of the issues discussed above, it is quite easy to assume that the characteristics of text-based SCMC can potentially enhance the efficacy of recasts. Text-based SCMC, for example, may allow non-salient linguistic forms or structures to gain more saliency (either externally by the interlocutor via enhancement or by the learners themselves) and, consequently, hold more potential for being noticed and processed further by the L2 learner/reader. The slower turn taking in SCMC may also permit the L2 learner/reader the opportunity not only to process the recast a bit longer when compared to oral interaction, but also to spend more time preparing the response (Lai & Zhao, 2006; Smith, 2005). In addition, given that turns in text chats remain on screen, the opportunity to make comparisons between recasts and previous productions is provided, as reported in Baralt (2013). Indeed, it is well accepted that the SCMC platform holds more potential for the manipulation of the salience or enhancement of targeted items in the recasts when compared to the oral mode. One logical advantage of enhancement in this medium may be that L2 learners are not only reading for meaning but are also participating in a productive and interactive activity, which should encourage some deeper processing of the enhanced feedback or recast (but see Sachs & Suh, 2007, below).
Finally, given the potential to gather concurrent data during online interaction (e.g., Gurzynski-Weiss et al., 2015; Sachs & Suh, 2007), SCMC appears to be a satisfactory methodological platform to empirically address the roles played by the attention to, noticing, and further processing of the recasts provided, upon which such saliency is premised. To this end, I shall report on two studies that have used the SCMC platform to address the effects of recasts in addition to levels of awareness, enhancement, and type of linguistic item.

Empirical SCMC Studies

Sachs and Suh (2007) is one attempt to address the issues discussed above (i.e., enhancing the recast to address the issue of misperception and adhering to a more robust research design to gather concurrent data on learner processes). More specifically, Sachs and Suh employed the SCMC medium to combine three strands of research (interactional recasts, textual enhancement, and awareness) by investigating the effects of two types of computerized recasts (typographically enhanced and unenhanced) and the relationship between reported levels of awareness and L2 learners’ subsequent performance on a targeted grammatical structure. The participants were 30 Korean intermediate and high-intermediate learners who were randomly assigned to one of four experimental conditions based on a permutation of ± enhanced recast and ± think aloud. The targeted linguistic structure was English back-shifting of verbs in the past tense to the past perfect tense. They interacted with an English native speaker via text chat and the task they used was a guided story retelling. To complete this task, participants read a story in their L1 and then retold the story to the researcher in English with prompts that attempted to solicit their use of the targeted structure. While performing the task, they received online enhanced or unenhanced recasts on this structure. The structure was enhanced by underlining the matrix verb and bolding the back-shifted verb. For example, in “He said that she had lied about her job,” the matrix verb said would be underlined and the back-shifted verb had lied bolded. To gather data on participants’ reported levels of awareness, Sachs and Suh required that some participants think aloud as they communicated with the researcher, while others did not, to control for the issue of reactivity in their research design. The researchers then coded participants’ verbal think-aloud protocols for instances of levels of awareness.
Sachs and Suh reported significant gains in both groups from the pretest to posttest, which indicated that learning, based on recasts and interaction in SCMC, did take place, replicating the overall benefits of interactional feedback on L2 development reported in the interactionist strand of research. While no effect for saliency via textual enhancement was found, they did report that higher levels of awareness led to better L2 development, once again replicating previous findings in the awareness strand of research, namely, that higher levels of awareness are correlated with higher levels of performance. Two important issues were raised in this study, namely how levels of awareness are potentially related to type of linguistic item, and the potential interaction between type of recast (enhanced vs. unenhanced) and type of linguistic item in SCMC. These issues were addressed by Gurzynski-Weiss et al. (2015).

Gurzynski-Weiss et al. (2015) addressed whether levels of awareness (high vs. low) were related to type of feedback (enhanced vs. unenhanced) and to type of linguistic item (lexis vs. morphology vs. syntax), and whether there were interactions between these variables. Participants were 24 second- and third-semester adult college-level L2 learners of Spanish, who were randomly assigned to either an enhanced or unenhanced recast group. Each participant completed three consecutive story re-tell tasks (each with a different linguistic item) individually via iChat with a native or near-native interlocutor while thinking out loud. Each re-tell contained 12 target linguistic items. When participants made a mistake with a target item, the interlocutor provided an enhanced or unenhanced recast. Results revealed that learners’ reported level of awareness was not related to type of recast (enhanced or unenhanced), providing empirical support for the external validity of the original study (Sachs & Suh, 2007), and for studies in the enhancement strand of research reporting similar non-significant differences between enhanced and unenhanced input and L2 development. However, level of awareness was found to be significantly related to type of linguistic item, as reported in previous studies (e.g., Leow, 1997a; Leow et al., 2003; Leow et al., 2008). Learners reported more awareness for recasts targeting lexis as compared to morphology and syntax. The authors pointed out that these findings deviated somewhat from other studies that have examined saliency within target items that share the same physical feature, such as morphemes (e.g., Yilmaz, 2012; Yilmaz & Yuksel, 2011). They recommended that future research address whether type of linguistic item that includes lexis, morphology, and syntax provided within recasts is related to subsequent L2 performance, and suggested that it would be worthwhile to pursue this line of research with the end goal of determining what feedback types might be best for specific linguistic items in both traditional and online learning environments.

Summary

The overall benefits of interactional feedback have been relatively well established in the SLA field, and this is not surprising if we consider that feedback provides information that can confirm or disconfirm our initial hypotheses or even reinforce our prior knowledge of some linguistic data we hold in our brains. The caveat here is, of course, whether the L2 learner does do something with the feedback received or even has time to process it further. Mackey et al. (2012) provide some useful directions for future research that include a potential marriage between cognitive neuroscience (read electroencephalography, EEG, and functional magnetic resonance imaging, fMRI) and learners’ internal processes employed during interaction with feedback (cf. also Mackey, 2006), individual differences, the inclusion of delayed posttests to address robust retention, replication studies to provide empirical support for the external validity of the original studies, the role of the context (in and out of the classroom, a more socio-cognitive perspective), and more research in the SCMC platform. These avenues of research will keep the Interactionist Approach quite alive for many years to come.

Conclusion

To establish the presence of attention/noticing in input processing, arguably the most productive measure should be one that is conducted concurrently while learners are exposed to and engaged in processing the L2 data. Three concurrent or online measures that are used in both SLA and non-SLA fields are online verbal reports or think aloud protocols, eye-tracking, and reaction time (see Chapter 8). The revelations of these online procedures, together with those of studies employing offline procedures, clearly establish the fact that in SLA the role of attention is, at the very least, important for any linguistic data to have the chance of being further processed. So telling our students to “pay attention” is fine, but just be aware that simply paying attention is not going to cut it. Students need to do something with the input they receive for it to remain a short while in working memory, and if we can get them to process the information, we may be on the right track in promoting L2 development. It is also important to keep a clear distinction between attention and processing as noted in the terminology in the SLA field. For example, we have a strand of “simultaneous attention to form and meaning” (e.g., VanPatten, 1990), where the term “attention” appears to be following the perceptual and visual attention studies in non-SLA fields. We also have the recent phenomenon of “blocked attention” (e.g., Ellis & Sagarra, 2010), which appears once again to be referring to some deeper processing of the redundant item in the input instead of mere attention. Keep this distinction in mind as we move on to the studies on the role of awareness in L2 learning, and, more specifically, that thorny dual phrase of implicit learning.

References

Alanen, R. (1995). Input enhancement and rule presentation in second language acquisition. In R. Schmidt (Ed.), Attention and awareness in foreign language learning (pp. 259–302). Honolulu, HI: University of Hawai’i Press.
Allen, L. Q. (2000). Form-meaning connections and the French causative: An experiment in processing instruction. Studies in Second Language Acquisition, 22, 69–84.
Baralt, M. (2013). The impact of cognitive complexity on feedback efficacy during online versus face-to-face interactive tasks. Studies in Second Language Acquisition, 35, 689–725.
Benati, A. (2001). A comparative study of the effects of processing instruction and output-based instruction on the acquisition of the Italian future tense. Language Teaching Research, 5(2), 95–127.
Benati, A. (2004). The effects of structured input activities and explicit information on the acquisition of the Italian future tense. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 207–225). Mahwah, NJ: Lawrence Erlbaum.
Bitchener, J., & Knoch, U. (2010). The contribution of written corrective feedback to language development: A ten month investigation. Applied Linguistics, 31(2), 193–214.
Bowles, M. A. (2003). The effect of textual input enhancement on language learning: An online/offline study of fourth semester Spanish students. In P. Kempchinsky & C. Piñeros (Eds.), Theory, practice, and acquisition: Papers from the 6th Hispanic Linguistics Symposium and 5th Conference on the Acquisition of Spanish and Portuguese (pp. 395–411). Somerville, MA: Cascadilla Press.
Cadierno, T. (1995). Formal instruction from a processing perspective: An investigation into the Spanish past tense. The Modern Language Journal, 79, 179–193.
Cheng, A. C. (2002). The effects of processing instruction on the acquisition of ser and estar. Hispania, 85(2), 308–323.
Collentine, J. (1998). Processing instruction and the subjunctive. Hispania, 81, 576–587.
DeKeyser, R. M., Salaberry, R., Robinson, P., & Harrington, M. (2002). What gets processed in processing instruction? A commentary on Bill VanPatten’s “Processing instruction: An update.” Language Learning, 52(4), 805–823.
Doughty, C. (1991). Second language instruction does make a difference: Evidence from an empirical study of SL relativization. Studies in Second Language Acquisition, 13, 431–496.
Egi, T. (2008). Investigating stimulated recall as a cognitive measure: Reactivity and verbal reports in SLA research methodology. Language Awareness, 17(3), 212–228.
Ellis, N. C., & Sagarra, N. (2010). The bounds of adult language acquisition: Blocking and learned attention. Studies in Second Language Acquisition, 32, 553–580.
Farley, A. P. (2004). Processing instruction and the Spanish subjunctive: Is explicit information needed? In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 227–239). Mahwah, NJ: Lawrence Erlbaum.
Fodor, J. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fotos, S. (1994). Integrating grammar instruction and communicative language use through grammar-consciousness tasks. TESOL Quarterly, 28(2), 323–351.
Gass, S. M., & Mackey, A. (2000). Stimulated recall methodology in second language research. Mahwah, NJ: Lawrence Erlbaum.
Gass, S. M., & Mackey, A. (2007). Input, interaction, and output in second language acquisition. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 175–199). Mahwah, NJ: Lawrence Erlbaum.
Goo, J., & Mackey, A. (2013). The case against the case against recasts. Studies in Second Language Acquisition, 35, 127–165.
Greenslade, T., Bouden, L., & Sanz, C. (1999). Attending to form and content in processing L2 reading texts. Spanish Applied Linguistics, 3, 65–90.
Gurzynski-Weiss, L., Al Khalil, M., Baralt, M., & Leow, R. P. (2015). Levels of awareness in relation to type of recast and type of linguistic item in synchronous computer-mediated communication: A concurrent investigation. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Han, Z., Park, E. S., & Combs, C. (2008). Textual enhancement of input: Issues and possibilities. Applied Linguistics, 29(4), 597–618.
Han, Z.-H., & Peverly, S. (2007). Input processing: A study of ab initio learners with multilingual backgrounds. The International Journal of Multilingualism, 4(1), 17–37.
Hatch, E. M. (1978). Acquisition of syntax in a second language. In J. C. Richards (Ed.), Understanding second and foreign language learning (pp. 34–70). Rowley, MA: Newbury House.
Hornstein, N., & Lightfoot, D. (1981). Introduction. In N. Hornstein & D. Lightfoot (Eds.), Explanation in linguistics: The logical problem of language acquisition. London: Longman.
Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). Awareness, type of medium, and L2 development: Revisiting Hsieh (2008). In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Izumi, S. (2002). Output, input enhancement, and the noticing hypothesis: An experimental study on ESL relativization. Studies in Second Language Acquisition, 24, 541–577.
Jackendoff, R. (1987). Consciousness and the computational mind. Cambridge, MA: MIT Press.
Jourdenais, R., Ota, M., Stauffer, S., Boyson, B., & Doughty, C. (1995). Does textual enhancement promote noticing? A think-aloud protocol analysis. In R. Schmidt (Ed.), Attention and awareness in foreign language learning. Honolulu, HI: University of Hawai’i Press.
Kim, Y. (2006). Effects of input elaboration on vocabulary acquisition through reading by Korean learners of English as a second language. TESOL Quarterly, 40(2), 341–370.
Lai, C., & Zhao, Y. (2006). Noticing and text-based chat. Language Learning & Technology, 10, 102–120.
Lee, Sang-Ki. (2007). Effects of textual enhancement and topic familiarity on Korean EFL students’ reading comprehension and learning of passive form. Language Learning, 57(1), 87–118.
Leeman, J. (2003). Recasts and second language development: Beyond negative evidence. Studies in Second Language Acquisition, 25, 37–63.
Leeman, J., Arteagoitia, I., Fridman, B., & Doughty, C. (1995). Integrating attention to form with meaning: Focus on form in content-based Spanish instruction. In R. Schmidt (Ed.), Attention and awareness in foreign language learning (pp. 217–258). Honolulu, HI: University of Hawaii, Second Language Teaching and Curriculum Center.
Leow, R. P. (1993). To simplify or not to simplify: A look at intake. Studies in Second Language Acquisition, 15, 333–355.
Leow, R. P. (1995). Modality and intake in second language acquisition. Studies in Second Language Acquisition, 17, 79–89.
Leow, R. P. (1997a). Attention, awareness, and foreign language behavior. Language Learning, 47(3), 467–505.
Leow, R. P. (1997b). The effects of input enhancement and text length on adult L2 readers’ comprehension and intake in second language acquisition. Applied Language Learning, 8, 151–182.
Leow, R. P. (1999). The role of attention in second/foreign language classroom research: Methodological issues. In F. Martínez-Gil & J. Gutiérrez-Rexach (Eds.), Advances in Hispanic linguistics: Papers from the 2nd Hispanic Linguistics Symposium (pp. 60–71). Somerville, MA: Cascadilla Press.
Leow, R. P. (2001a). Attention, awareness, and foreign language behavior. Language Learning, 51(Suppl. 1), 113–155.
Leow, R. P. (2001b). Do learners notice enhanced forms while interacting with the L2? An online and offline study of the role of written input enhancement in L2 reading. Hispania, 84(3), 496–509.
Leow, R. P. (2009). Input enhancement and L2 grammatical development: What the research reveals. In J. Watzinger-Tharp & S. L. Katz (Eds.), Conceptions of L2 grammar: Theoretical approaches and their application in the L2 classroom (pp. 16–34). Boston: Heinle Publishers.
Leow, R. P., Egi, T., Nuevo, A. M., & Tsai, Y-C. (2003). The roles of textual enhancement and type of linguistic item in adult L2 learners’ comprehension and intake. Applied Language Learning, 13(2), 1–16.
Leow, R. P., Hsieh, H-C., & Moreno, N. (2008). Attention to form and meaning revisited. Language Learning, 58, 665–695.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60(2), 309–365.
Loewen, S., & Philp, J. (2006). Recasts in the adult L2 classroom: Characteristics, explicitness and effectiveness. The Modern Language Journal, 90, 536–556.
Long, M. (1981). Input, interaction, and second language acquisition. Annals of the New York Academy of Sciences, 379, 259–278. Lyddon, P. (2011). The efficacy of corrective feedback and textual enhancement in promoting the acquisition of grammatical redundancies. The Modern Language Journal, 95, 104–129. Lyster, R. (1998). Recasts, repetition, and ambiguity in L2 classroom discourse. Studies in Second Language Acquisition, 20, 51–81. Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruction. Studies in Second Language Acquisition, 26, 399–432. Lyster, R., & Ranta, L. (2013). Counterpoint piece: The case for variety in corrective feedback research. Studies in Second Language Acquisition, 35, 167–184. Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA. Studies in Second Language Acquisition, 32, 265–302. Mackey, A. (2006). From introspections, brain scans, and memory tests to the role of social context: Advancing research on interaction and learning. Studies in Second Language Acquisition, 28, 369–379. Mackey, A., Abbuhl, R., & Gass, S. M. (2012). Interactionist approach. In S. M. Gass and A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 7–23). New York: Routledge. Mackey, A., Gass, S. M., & McDonough, K. (2000). How do learners perceive interactional feedback? Studies in Second Language Acquisition, 22, 471–497. Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 407–452). Oxford: Oxford University Press. Mackey, A., McDonough, K., Fujii, A., & Tatsumi, T. (2001). Investigating learners reports about the L2 classroom. International Review of Applied Linguistics, 39 (4), 285–308. Morgan-Short, K., Heil, J., Botero-Moriarty, A., & Ebert, S. (2012). 
Allocation of attention to second language form and meaning: Issues of think alouds and depth of processing. Studies in Second Language Acquisition, 34, 4, 659–685. Morgan-Short, K., Sanz, C., Steinhauer, K., & Ullman, M. (2010). Second language acquisition of gender agreement in explicit and implicit training conditions: An event-related potential study. Language Learning, 60, 154–193. Morgan-Short, K., & Wood Bowden, H. (2006). Processing instruction and meaningful outputbased instruction: Effects on second language development. Studies in Second Language Acquisition, 28, 31–65. Ortega, L. (2009). Understanding second language acquisition. London: Hodder Education. Overstreet, M. (1998). Text enhancement and content familiarity: The focus of learner attention. Spanish Applied Linguistics, 2, 229–258. Philp, J. (2003). Constraints on ‘noticing the gap’: Nonnative speakers’ noticing of recasts in NS-NNS interaction. Studies in Second Language Acquisition, 25, 99–126. Robinson, P. (1996). Learning simple and complex second language rules under implicit, incidental, rule search and instructed conditions. Studies in Second Language Acquisition, 18, 27–67. Rosa, E., & Leow, R. P. (2004a). Computerized task-based exposure, explicitness, type of feedback, and Spanish L2 development. Modern Language Journal, 88, 193–217. Rosa, E., & Leow, R. P. (2004b). Awareness, different learning conditions, and L2 development. Applied Psycholinguistics, 25, 269–292. Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for the acquisition of L2 grammar: A meta-analysis of the research. In J. Norris & L. Ortega (Eds.),

182

Empirical Research Investigating the Role of Attention

Synthesizing the research on language learning and teaching (pp. 133–164). Amsterdam: John Benjamins. Sachs, R., & Suh, B-R. (2007). Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 197–227). Oxford: Oxford University Press. Sanz, C., & Morgan-Short, K. (2004). Positive evidence vs. explicit rule presentation and explicit negative feedback: A computer-assisted study. Language Learning, 53(4), 35–78. Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–32). New York: Cambridge University Press. Schmidt, R., & Frota, S. N. (1986). Developing basic conversational ability in a second language: A case study of an adult learner of Portuguese. In R. R. Day (Ed.), Talking to learn: Conversation in second language acquisition (pp. 237–326). Rowley, MA: Newbury House. Sharwood Smith, M. (1981). Consciousness-raising and the second language learner. Applied Linguistics, 2, 159–168. Sharwood Smith, M. (1991). Speaking to many minds: On the relevance of different types of language information for the L2 learner. Second Language Research, 7(2), 118–132. Sharwood Smith, M. (1993). Input enhancement in instructed SLA: Theoretical bases. Studies in Second Language Acquisition, 15(2): 165–179. Shook, D. J. (1994). FL/L2 reading, grammatical information, and the input-to-intake phenomenon. Applied Language Learning, 5(1): 57–93. Smith, B. (2005). The relationship between negotiated interaction, learner uptake, and lexical acquisition in task-based computer-mediated communication. TESOL Quarterly, 39, 33–58. Smith, B. (2012). Eye-tracking as a measure of noticing: A study of explicit recasts in SCMC. Language Learning & Technology, 16 (3): 53–81. Storch, N., & Wigglesworth, G. (2010). 
Learners’ processing, uptake, and retention of corrective feedback on writing: Case studies. Studies in Second Language Acquisition, 32, 303–334. Truscott, J., & Sharwood Smith, M. A. (2011). Input, intake, and consciousness: The quest for a theoretical foundation. Studies in Second Language Acquisition, 33, 497–528. VanPatten, B. (1990). Attending to form and content in the input: An experiment in consciousness. Studies in Second Language Acquisition, 12, 287–301. VanPatten, B. (1996). Input processing and grammar instruction: Theory and research. Norwood, NJ: Ablex. VanPatten, B. (2002). Processing instruction: An update. Language Learning, 52, 755–803. VanPatten, B. (2004). Input processing in SLA. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 5–32). Mahwah, NJ: Lawrence Erlbaum. VanPatten, B. (2007). Input processing in adult second language acquisition. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 115–135). Mahwah, NJ: Lawrence Erlbaum. VanPatten, B., & Cadierno, T. (1993). Explicit instruction and input processing. Studies in Second Language Acquisition, 15, 225–243. VanPatten, B., & Oikkenon, S. (1996). Explanation versus structured input in processing instruction. Studies in Second Language Acquisition, 18, 495–510. VanPatten, B., & Sanz, C. (1995). From input to output: Processing instruction and communicative tasks. In F. Eckman, D. Highland, P. Lee, J. Mileham, & R. Rutkowski

(Eds.), Second language acquisition: Theory and pedagogy (pp. 169–185). Hillsdale, NJ: Lawrence Erlbaum. White, J. (1998). Getting the learners’ attention: A typographical input enhancement study. In C. Doughty & J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 85–113). New York: Cambridge University Press. White, L., Spada, N., Lightbown, P., & Ranta, L. (1991). Input enhancement and L2 question formation. Applied Linguistics, 12, 416–432. Williams, J., & Evans, J. (1998). What kind of focus and on which forms? In C. Doughty & J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 139–155). Cambridge: Cambridge University Press. Winke, P. (2013). The effects of input enhancement on grammar learning and comprehension: A modified replication of Lee, 2007, with eye-movement data. Studies in Second Language Acquisition, 35(2), 323–352. Wong, W. (2001). Modality and attention to meaning and form in the input. Studies in Second Language Acquisition, 23, 345–368. Wong, W. (2003). Textual enhancement and simplified input: Effects on L2 comprehension and acquisition of non-meaningful grammatical form. Applied Language Learning, 13(2), 17–45. Wong, W. (2004). Processing instruction in French: The roles of explicit information and structured input. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 187–205). Mahwah, NJ: Lawrence Erlbaum. Yilmaz, Y. (2012). The relative effects of explicit correction and recasts on two target structures via two communication modes. Language Learning, 62, 1134–1169. Yilmaz, Y., & Yuksel, D. (2011). Effects of communication mode and salience on recasts: A first exposure study. Language Teaching Research, 15, 457–477.

10 LEARNING EXPLICITLY OR IMPLICITLY That Is the Question

As you will recall from Chapters 2 and 4, the construct of awareness has been present in the SLA field since its inception and has been couched in several terms, such as implicit versus explicit and deductive versus inductive. From a theoretical perspective, we have the empiricist versus the rationalist; from a methodological perspective, we have the Grammar Translation Method (explicit) versus the Audiolingual Method (implicit). From an instructional perspective, we have inductive (implicit) instruction, which is assumed to involve implicit learning, versus deductive (explicit) instruction, which is assumed to involve explicit learning. Even a research perspective reveals quite a mixture of approaches that are based on the theoretical underpinnings upon which the studies are grounded. From the early 1990s (McLaughlin, 1990; Schmidt, 1990), the term awareness entered the SLA literature in lieu of consciousness, and, subsequently, several empirical studies began to incorporate some reference to learners’ awareness of targeted forms or structures in the input in an effort to provide an interpretation of the results of the study. Different sources have provided the data for such references, for example, findings from online or offline data elicitation procedures or the researchers’ personal interpretation of the results of the study. In this chapter we are going to discuss the definition and operationalization of this thorny construct and then the empirical research in SLA on the role of awareness, or lack thereof, in L2 learning, that is, explicit learning and implicit learning.

Addressing the Role of (Un)Awareness in L2 Learning or Implicit and Explicit Learning

To address directly the role of (un)awareness in the L2 learning process, it is important to include only studies that have methodologically investigated this construct as an independent variable and investigated it within naturally occurring

languages (e.g., Hama & Leow, 2010; Leow, 2000). For example, studies that have created so-called experimental learning or training conditions (e.g., DeKeyser, 1995; N. C. Ellis, 1993; Morgan-Short, Sanz, Steinhauer, & Ullman, 2010; Robinson, 1996, 1997) in which a specific type of learning is assumed to take place based on experimental conditions/instructions may not be sensitive enough to address the role awareness or lack thereof played in the learning process. As mentioned earlier, several studies that have employed both concurrent (e.g., Alanen, 1995; Leow, 2000; Rosa & O’Neill, 1999) and non-concurrent (e.g., Robinson, 1997) data elicitation procedures have reported clear instances of behavior that did not represent the type of learning assumed to have occurred within each experimental condition. Likewise, studies that have subsumed the construct of awareness within, for example, type of instruction (e.g., de Graaff, 1997; Sato & Ballinger, 2012), noticing (Mackey, 2006; Smith, 2012; Uggen, 2012), and so on are not included in the review. First, let us define and operationalize the construct of awareness.

Defining the Construct of Awareness in SLA

The definitions of the construct of awareness in SLA include Tomlin and Villa’s (1994) restricted definition of awareness as “a particular state of mind in which an individual has undergone a specific subjective experience of some cognitive content or external stimulus” (p. 193), and Schachter’s (1989) definition that awareness “refers to a state of mind in which one has become cognizant of the regularities underlying the data” (p. 577). Awareness, according to Leow (1997, 2001) and based on Allport’s (1988) criteria for the presence of awareness, may be demonstrated through (a) some resulting behavioral or cognitive change, (b) a metareport of the experience but without any metalinguistic description of a targeted underlying rule, or (c) a metalinguistic description of a targeted underlying rule. However, implicit learning, or learning without awareness, is an issue that has been debated in the SLA literature for quite some time (e.g., Hulstijn, 2005; N. C. Ellis, 1994; Schmidt, 1994a; Williams, 2009) and that parallels the contentious debate that has been taking place for over four decades in cognitive psychology (cf. Dulaney, Carlson, & Dewey, 1985; Reber, 1967, 1976, 1989; Shanks, Green, & Kolodny, 1994; Shanks & St. John, 1994). Indeed, interest in the implicit learning strand of research has been increasing over the last four years (cf. Chan & Leung, 2014; Chen et al., 2011; Faretta-Stutenberg & Morgan-Short, 2011; Hama & Leow, 2010; Leung & Williams, 2011, 2012, 2014) since Leow (2000), who attempted to operationalize this construct concurrently or online, with many of these studies adopting Williams’s (2005) clever design. As in non-SLA fields, to date no solid consensus on its role in the L2 learning process has emerged. 
As I have stated (Leow, 2015), this current situation may have to do with the different perceptions of what comprises the phrase “implicit learning,” as evident in the several definitions and perspectives in the SLA literature and the manner in

which the construct of unawareness has been operationalized and measured. For example, some researchers view implicit learning as a direct relationship between (un)awareness and the L2 underlying rule. Definitions of implicit learning thus include input processing without a conscious intention to grapple with and learn the linguistic or grammatical features of the input information (Hulstijn, 2005: 131) and uninstructed learning, that is, “learning without the benefit of rule explanation” (Ortega, 2009: 157), which both appear to view implicit learning as being directly associated with the processing of the underlying linguistic rules of the L2. Recent definitions appear to equate implicit knowledge with the process of learning without awareness—for example, “[t]he term implicit learning was first employed by Arthur Reber (1967) to describe a process during which subjects in experimental studies acquire knowledge about a complex, rule-governed stimulus domain without intending to and without becoming aware of the knowledge they have acquired” (Rebuschat, 2013: 595–596), which also includes the notion of incidental learning (learning without intention) observed also in the following definition: “[L]earning that proceeds without awareness of what is being learned and without intention to learn it” (Leung & Williams, 2011: 33). Other researchers view awareness during learning as minimally some form of cognitive change taking place during input processing without any reference to the underlying linguistic rule. From this perspective, one broad definition of implicit learning is that any L2 learning may take place without any “awareness at the point of learning” (Schmidt, 1994b: 20) or “conscious operations” (N. C. Ellis, 1994) occurring on the learner’s part, which opens up the possibility of levels of awareness (cf. de la Fuente, 2015; Leow, 2001; Medina, 2015; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007; Schmidt, 1990). 
This distinction between these two perspectives of what comprises implicit learning appears to account for the two stages discussed below (concurrent vs. non-concurrent) at which the construct of awareness is currently operationalized and measured in the SLA literature (cf. Leow, Johnson, & Zárate-Sández, 2011).

Operationalizing the Construct of Awareness in SLA

An overview of both non-SLA and SLA methodological approaches to operationalizing the construct of awareness in learning (Leow, Johnson, & Zárate-Sández, 2011) reported two stages. The first stage is at the concurrent (online) stage of encoding or accessing the incoming experimental information (e.g., Hama & Leow, 2010; Leow, 2000), that is, where learners receive, process, and encode online the incoming information (Stages 1 and 3). Operationalizing awareness or lack thereof at this stage views learning as a process and provides a richer insight into the actual point of learning the L2. The second stage is at the non-concurrent (offline) stage of retrieval of stored knowledge of the targeted linguistic rule (e.g., Chan & Leung, 2014; Chen et al., 2011; Faretta-Stutenberg & Morgan-Short, 2011; Leung & Williams, 2011, 2012, 2014; Williams, 2004, 2005), that is,

when learners indicate offline (beyond Stage 5), after they have processed the incoming information, whether they were aware of the targeted underlying rule during the experimental exposure. Operationalizing awareness or lack thereof at this stage views learning as a product that is more closely associated with learned knowledge at Stage 4, and may not represent whether awareness or lack thereof occurred at Stages 1 and 3. Let us begin with the less controversial issue of explicit learning in the L2 context.

Learning With Awareness (Explicit Learning)

Support for Explicit Learning: Empirical Concurrent Evidence in SLA

Like the process of attention, explicit learning or learning with awareness is not controversial, and the evidence in support of this process is rather substantial. To provide a description of the online operationalization of the construct of awareness, I shall use my 1997 study and Rosa and O’Neill (1999). In the 1997 study, I quantitatively and qualitatively addressed the role of awareness in foreign language behavior in relation to Schmidt’s noticing hypothesis. My targeted item was a morphological form, the “irregular” third person singular and plural preterit forms of stem-changing -ir verbs in Spanish. My experimental task was a crossword puzzle that was premised on the notion of task-essentialness (Loschky & Bley-Vroman, 1993), which simply means that in order to successfully complete the task, the participant needs to pay attention to the targeted forms in the input. To this end, I carefully created mismatches between the vowels that changed in the “irregular” third person singular and plural preterit forms of stem-changing -ir verbs in Spanish, for example, *morió “he died” > murió and *pedió “he asked for” > pidió. I undertook to ensure that noticing did indeed occur by employing think aloud protocols during the experimental exposure before attempting to address the role of levels of awareness and their effects on L2 behavior. Awareness in this study was based on Tomlin and Villa’s (1994) restricted definition and Allport’s (1988) criteria for the presence of awareness: (a) A show of some behavioral or cognitive change (e.g., verbal or written production of the stem-change of the targeted form) due to the experience and either (b) a report of being aware of the experience or (c) some form of metalinguistic description of the underlying rule. 
I then analyzed the think aloud protocols produced by 28 adult beginning learners of Spanish, who were required to complete a problem-solving task (a crossword puzzle), and their immediate performances on two post-exposure tasks designed to elicit recognition and written production of the targeted forms in Spanish. From the analysis of the think alouds I identified three levels of awareness: [+ cognitive change, − meta-awareness, − morphological rule formation], where participants did not provide a report of their subjective experience, nor did

they verbalize any rule, [+ cognitive change, + meta-awareness, − morphological rule formation], where participants did report their subjective experience but did not provide any verbalization of the rule, and [+ cognitive change, + meta-awareness, + morphological rule formation], where participants provided both a report and a verbalization of rule formation (similar to Schmidt’s (1990) notion of understanding that is a higher level of awareness). Based on the data, I put forward three conclusions. First, different levels of awareness led to differences in processing. More specifically, meta-awareness appeared to correlate with an increased usage of hypothesis testing and morphological rule formation (conceptually-driven processing) while absence of meta-awareness appeared to correlate with an absence of such processing. Second, the findings indicated that more awareness contributed to more recognition and accurate production of the noticed forms by facilitating or enhancing further processing of such forms contained in the L2 data. Finally, I concluded that the findings provided empirical support for the facilitative effects of awareness on foreign language behavior. Rosa and O’Neill (1999) extended the 1997 study by also employing a problem-solving task premised on the notion of task-essentialness to examine the role of awareness in L2 learning, and, more specifically, a syntactic structure that required the use of the imperfect subjunctive in contrary-to-fact conditional sentences in Spanish. The problem-solving task was a multiple-choice jigsaw puzzle divided into two pasted sections on a page: (1) A piece of the puzzle depicting an event, a person, or the result of an event, and (2) another piece of the puzzle with the main clause of a conditional sentence of either one of two experimental targeted structures. Each page also had three other pieces of the puzzle, each with a subordinate clause written on it. 
Participants were required to select one of the three un-pasted pieces that would correctly fit between the picture and the main clause. Sixty-seven adult L2 learners of Spanish were randomly divided into five conditions of different degrees of explicitness. Two factors were varied to create the five conditions: Explicit instruction on Spanish contrary-to-fact conditional sentences and directions to search for rules. Concurrent data on learners’ awareness were gathered through think aloud protocols produced while they performed the problem-solving tasks. Rosa and O’Neill (1999) found that awareness at the level of noticing and at the level of understanding translated into a significant improvement in intake scores from the pretest to the posttest. In addition, they also found, like Leow (1997, 2001), that learners who demonstrated understanding of the target structure performed significantly better on intake posttests than learners who evidenced noticing only. These studies were followed by several others that also independently investigated the construct of awareness by employing concurrent or online data elicitation procedures to operationalize and measure this construct (e.g., de la Fuente, 2015; Hama & Leow, 2010; Hsieh, Moreno, & Leow, 2015; Leow, 2000, 2001; Martínez-Fernández, 2008; Medina, 2015; Rosa & Leow, 2004; Sachs &

Suh, 2007). Quite a large range of linguistic items (e.g., Spanish imperatives, present perfect forms, past conditional, passive se ; Finnish locative suffixes; English sequence of tenses, etc.), language levels (e.g., beginning, intermediate, advanced), different assessment types (e.g., multiple-choice, controlled production, grammaticality judgment tasks, etc.), and stages (e.g., pre-experiment, immediate post-experiment, delayed post-experiment) also have been empirically investigated in these studies. Amount of exposure has also been different, ranging from less than an hour to over several days. Overall, these studies appear to provide relatively strong empirical support for the facilitative effects of awareness on foreign language behavior and learning. The major findings of these studies include (1) awareness at the level of noticing and understanding contributed substantially to a significant increase in learners’ ability to take in the targeted form or structure (Leow, 2000, 2001; Medina, 2015; Rosa & Leow, 2004; Rosa & O’Neill, 1999) and to produce in writing the targeted form or structure (de la Fuente, 2015; Leow, 2001; Medina, 2015; Rosa & Leow, 2004), including novel exemplars (Rosa & Leow, 2004); (2) awareness at the level of understanding led to significantly more intake when compared to awareness at the level of noticing (Leow, 2001; Medina, 2015; Rosa & Leow, 2004; Rosa & O’Neill, 1999); (3) there is a correlation between awareness at the level of understanding and usage of hypothesis testing / rule formation (Hsieh et al., 2015; Leow, 2000, 2001; Rosa & Leow, 2004; Rosa & O’Neill, 1999); (4) there is a correlation between level of awareness and formal instruction and directions to search for a rule (Rosa & O’Neill, 1999); (5) there is a correlation between awareness at the level of understanding and learning conditions providing an explicit pre-task (with grammatical explanation), as well as implicit or explicit concurrent feedback (Rosa & Leow, 
2004); and (6) there is a strong correlation between reported awareness and comprehension and production scores (de la Fuente, 2015). In addition, there appear to be correlations between level or depth of processing, amount of cognitive effort, and level of awareness (Leow, 2012). These correlations will be discussed more fully in Chapter 11. The benefits of explicit learning have also been reported in several studies that did not control for the construct of awareness but did employ conditions that appeared to promote some level of awareness. For example, in Leow (2001) and also in Bowles (2003), high performing participants, whom we called “outliers” given that their scores on both the immediate and delayed posttests were substantially outside the mean of the experimental groups, were the only ones who reported high levels of awareness while performing in the experiments. Likewise, in studies methodologically exploring the construct of awareness or lack thereof (e.g., Hama & Leow, 2010; Leung & Williams, 2011; Williams, 2005), there were clear behavioral differences between participants demonstrating a high level of awareness and those who failed to do so. To summarize, these studies have provided strong empirical support for the facilitative effects of awareness on foreign language behavior and learning, that

is, when we focus on how the learner processes the incoming L2 data. In addition, the gains reported are quite substantial, and retention has been reported for both lexical (e.g., Rott, 2005) and grammatical information (e.g., Hsieh et al., 2015; Leow, 1998; Rosa & Leow, 2004; Rosa & O’Neill, 1999).

Learning Without Awareness (Implicit Learning)

To do justice to the studies that have operationalized and measured the role of awareness or lack thereof in learning and to situate the findings within an SLA theoretical model, once again we need to recall the framework I presented in Chapter 2 on the stages postulated for the learning process in SLA and the distinction between learning as a process and a product. Currently, there are at least nine empirical studies that purport to have addressed the role of awareness or lack thereof in L2 learning, although some of these studies are fairly similar in design, and two are conceptual replications. Two have employed concurrent data elicitation procedures (Hama & Leow, 2010; Leow, 2000), while the others have employed non-concurrent or offline data elicitation procedures (Chan & Leung, 2014; Chen et al., 2011; Faretta-Stutenberg & Morgan-Short, 2011; Leung & Williams, 2011, 2012, 2014; Williams, 2005) and therefore may be viewed as addressing implicit knowledge premised on implicit learning having taken place. The two concurrent studies reported no benefits of implicit learning, while all of the non-concurrent studies, with the exception of Faretta-Stutenberg and Morgan-Short (2011), a replication of Williams (2005), reported benefits of implicit learning. So, in an effort to explicate these contradictory findings, let us partake in a careful and critical review of the research designs (the heart) of the above-mentioned studies, with a special focus on their operationalization of the construct of unawareness and measurement of what comprises learning, both in relation to the stages along the learning process. I shall put on my researcher’s hat while reporting the studies that have provided such conflicting findings, while you take a quick trip back to the methodological section of the book (Chapters 6, 7, and 8).

Support for “Implicit Learning” Derived From Evidence of Implicit Knowledge

There is currently a series of studies employing a relatively similar research design to address the issue of implicit learning, with a few methodological changes in measurement in the more recent studies (Chan & Leung, 2014; Chen et al., 2011; Faretta-Stutenberg & Morgan-Short, 2011; Leung & Williams, 2011, 2012, 2014; Williams, 2004, 2005). To situate the debate on the role of implicit learning in SLA, it is important to report the first studies (Williams, 2004, 2005) that provided evidence of learning without awareness.

Following up on his 2004 study with methodological improvements, Williams (2005) conducted two experiments to test whether participants were able to learn miniature noun class systems without awareness. The 41 participants in his studies were from a variety of language- and linguistics-related backgrounds. The noun phrases used for Williams’ experiments contained four novel determiners, gi, ro, ul, and ne. Gi and ro are the English translation equivalents of “near,” and ul and ne are equivalent to “far.” These determiners also carry animacy values: Gi and ul are animate and ro and ne are inanimate. Before the training phase, participants were provided with the translation of the four novel determiners for “near” or “far.” However, they were not informed about the animacy rules of these determiners. During the training phase, the participants were presented aurally with individual sentences with noun phrases that comprised a novel determiner and an English noun (e.g., gi dog, “near dog”). For each sentence, they (a) listened to a sentence, (b) indicated if the novel word meant “near” or “far,” (c) repeated the sentence aloud, and (d) were requested to create a mental image of the situation described by the sentence. Participants were told that the main focus of the study was memory in an effort to distract them from paying attention to the underlying animacy rules. After the training phase, to test participants’ learning of the animacy rules, participants engaged in a written selection assessment task with new sentences that contained either trained or untrained (new) noun phrases. The sentences provided to the participants were incomplete in that the targeted structure, the noun phrase, was missing. 
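The determiner system just described can be laid out as a small lookup table, which makes the hidden animacy regularity easier to see. The sketch below is only an illustration: the dictionary layout and the helper names `determiner_for` and `respects_animacy` are hypothetical and are not taken from Williams’s materials.

```python
# Illustrative encoding of the four novel determiners described above.
# The dictionary layout and helper names are assumptions made for this sketch.
DETERMINERS = {
    "gi": {"distance": "near", "animate": True},
    "ro": {"distance": "near", "animate": False},
    "ul": {"distance": "far", "animate": True},
    "ne": {"distance": "far", "animate": False},
}


def determiner_for(distance: str, animate: bool) -> str:
    """Return the determiner that matches both distance and animacy."""
    for form, features in DETERMINERS.items():
        if features["distance"] == distance and features["animate"] == animate:
            return form
    raise ValueError(f"no determiner for distance={distance!r}, animate={animate!r}")


def respects_animacy(determiner: str, noun_is_animate: bool) -> bool:
    """Check the hidden regularity: does the determiner's animacy match the noun's?"""
    return DETERMINERS[determiner]["animate"] == noun_is_animate


# "gi dog" (near, animate noun) follows the system; "ro dog" would not.
print(determiner_for("near", True), respects_animacy("ro", True))
```

Participants were told only the distance value of each determiner, so the animacy column of this table is exactly the information that was withheld from them.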
The participants were then given two noun phrases (one animate, the other inanimate) and instructed to select in a two-option MC assessment task the noun phrase that seemed “more familiar, better, or more appropriate” on the basis of what they had heard during the training task (Williams, 2005: 282–283). Operationalization of awareness was via an offline questionnaire in which participants were asked what criteria they had used to make these choices. If they did not mention “any references to living or nonliving, moves or does-not-move, and so forth” (p. 283) as a reason for their choice, they were classified as unaware. According to the author, the results showed “that, at least for some individuals, it is possible to learn form-meaning connections without awareness of what those connections are” (p. 293). Two conceptual replications of Williams (2005) were conducted and conflicting results were reported. While Chen et al. (2011) provided support for the original study, Faretta-Stutenberg and Morgan-Short (2011) did not (reported below). Chen et al. (2011) was a conceptual replication of Williams (2005) that replaced the offline verbal reports of the original study with an offline measure of source attributions. Source attributions ask participants to decide what knowledge source they used to judge each item on the assessment task (in this study a sentence completion task) based on one of the options: Rule, memory, intuition, or guess. According to Dienes and Scott (2005), source attributions are believed to reflect the underlying structural knowledge created during

training that participants use during testing to make their judgments. Participants selecting rule and memory were coded as explicit learning, while those selecting intuition and guess were coded as implicit learning. If participants did not respond within 20 s, the next sentence was displayed. Accuracy feedback was given after every response. According to Chen et al., “Experiment 1 conceptually replicated Williams’ (2005) finding that people could acquire unconscious knowledge [my italics] about form-meaning connections, but using trial by trial measures of awareness” (p. 1754), but not when a feature was not linguistically relevant, for example, when size was the stimulus (cf. Leung & Williams, 2012, Experiment 2). Leung and Williams (2011) extended Williams’ (2005) research agenda by investigating the implicit learning of a mapping between thematic roles and a set of novel determiners. What was unique in this study was that learning of the mapping between thematic roles and a set of novel determiners was measured by participants’ performances on a reaction time test, the serial reaction time (SRT) task, which is the standard paradigm for examining sequence learning in psychology (e.g., Cleeremans & McClelland, 1991; Nissen & Bullemer, 1987). In this task, participants demonstrated their comprehension or interpretation of a stimulus by hitting corresponding response keys. According to Leung and Williams, the critical data came from the last 32 trials of the experiment, which were divided into control and violation blocks with no division between these trials and the training. In the control block, the sentences respected the same system used in training, whereas in the violation block, the mapping between articles and thematic roles was reversed so that gi and ul were used with patients and ro and ne were used with agents. No post-exposure learning test was administered. 
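The reversal used in the violation block can be sketched in a few lines. In the sketch below, the training mapping is inferred from the reversal described above, and the reaction-time comparison (violation minus control) is an assumption about how such blocks are commonly compared, not a reproduction of Leung and Williams’s analysis; the function names and the reaction times are hypothetical.

```python
from statistics import mean

# Article-to-role mapping during training, inferred from the reversal
# described above (gi/ul with agents, ro/ne with patients).
TRAINING_MAPPING = {"gi": "agent", "ul": "agent", "ro": "patient", "ne": "patient"}


def reverse_mapping(mapping: dict[str, str]) -> dict[str, str]:
    """Build the violation-block mapping by swapping the two thematic roles."""
    swap = {"agent": "patient", "patient": "agent"}
    return {article: swap[role] for article, role in mapping.items()}


def violation_cost(control_rts: list[float], violation_rts: list[float]) -> float:
    """Mean reaction-time difference (violation minus control), in the input units.

    Assumption for this sketch: a positive value is read as sensitivity
    to the trained mapping.
    """
    return mean(violation_rts) - mean(control_rts)


VIOLATION_MAPPING = reverse_mapping(TRAINING_MAPPING)

# Made-up reaction times in milliseconds, for illustration only.
print(VIOLATION_MAPPING["gi"], violation_cost([900.0, 950.0], [1000.0, 1050.0]))
```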
The authors concluded that participants “had implicitly learned that certain articles were associated with certain thematic roles, which supports the hypothesis that form-meaning connections can be learned implicitly” (Leung & Williams, 2011: 48–49). Leung and Williams (2012) followed the same research design employed in their 2011 study but, in addition to animacy and distance in Experiment 1, they investigated animacy and size in Experiment 2. While Experiment 1 replicated previous studies of implicit learning related to animacy, Experiment 2 revealed no implicit learning for size (cf. Chen et al., 2011, for a similar finding). Similarly, Leung and Williams (2014) extended the research design to address the potential effect of prior English and Cantonese linguistic knowledge on implicit language learning. In Experiment 1, both participant groups showed evidence of learning a mapping between articles and noun animacy, replicating earlier studies. In Experiment 2, neither group showed learning of a mapping between articles and a linguistically anomalous concept (the number of capital letters in an English word or that of strokes in a Chinese character), replicating earlier studies. In Experiment 3, the Chinese group, but not the English group, showed evidence of learning a mapping between articles and

a concept derived from the Chinese classifier system. It was concluded that first language knowledge affected implicit language learning and that implicit learning, at least when natural language learning is concerned, is subject to constraints and biases. Chan and Leung (2014) employed a relatively similar design to Williams (2005) but addressed in two experiments the implicit learning of Spanish word stress rules by Cantonese L2 English participants. In Experiment 1, the researchers investigated words that end in -o versus -ar and in Experiment 2, words that end in a vowel versus words that end in a consonant, both of which, incidentally, follow the general Spanish stress rule that words ending in a vowel have their stress on the penultimate syllable and words that end in a consonant have their stress on the last syllable. The training phase consisted of 64 randomized trials, each containing a Spanish word and its English translation. A set of 16 Spanish words, half of which ended with -ar and the other half with -o, was repeated four times. Participants repeated each word aloud after the recording. According to the researchers, this provided participants with exposure to the target stress rules without explicitly directing their attention to them. In Experiments 1 and 2, a pronunciation judgment test of novel exemplars (whether one pronunciation “sounded better” over another) was employed to measure learning. To measure awareness, an inclusion-exclusion test “that required participants to read aloud 8 two-syllable words in each of the two conditions: 1) ‘as similarly to Spanish pronunciation as possible’ (inclusion) and 2) ‘as differently from Spanish pronunciation as possible’ (exclusion)” (Chan & Leung, 2014: 193) was used in both experiments together with an offline verbal report in Experiment 1 or a confidence rating test in Experiment 2. 
Chan and Leung reported that Experiment 1 demonstrated the implicit learning of association between the ending phoneme and word stress and that Experiment 2 demonstrated the implicit learning of a more abstract rule of stress placement, and concluded that L2 word stress rules may be learned implicitly. Put in a nutshell, this series of studies operationalized and measured awareness offline via offline verbal reports, empirically investigated participants’ interpretation (Stage 3) via a two-option test offline or together with a reaction time test online, with the assumption that participants were using recently learned implicit knowledge of the embedded or hidden rule, and reported that statistically above chance performances were evidence of implicit learning taking place in their experiments. To summarize, the above-cited studies have provided empirical evidence that L2 learners may learn implicitly form-meaning connections or phonological rules of targeted items presented in the L2 input. It is noted that several of the studies also reported the substantial difference in performances between the aware and unaware groups. For example, while the unaware participants averaged around 57% mean accuracy (which is statistically above chance performance), the aware participants averaged 70% and above.
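To make the phrase “statistically above chance” concrete for a two-option test, where guessing yields 50% accuracy, the following sketch computes an exact one-sided binomial probability for a single set of responses. The trial counts are hypothetical, and the sketch does not reproduce the statistical procedures the studies themselves applied to their group means.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, chance: float = 0.5) -> float:
    """Exact probability of at least `successes` correct answers by guessing."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical scores: the same 57% accuracy on tests of different lengths.
for correct, total in [(57, 100), (228, 400)]:
    p_value = binomial_upper_tail(correct, total)
    print(f"{correct}/{total} correct: p = {p_value:.4f}")
```

The same percentage can be more or less convincing depending on how many responses it rests on, which is why the number of test items matters when reading above-chance claims.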

Comments on Studies

The major limitation of all these studies is the operationalization of awareness (cf. Bialystok, 1979; R. Ellis, 2004; Eriksen, 1960; Hama & Leow, 2010; Leow & Hama, 2013; Shanks et al., 1994; Shanks & St. John, 1994, and others for this and other critiques). Given that the construct of awareness is operationalized non-concurrently or offline and measured via offline verbal reports seeking to elicit learners’ knowledge of the targeted underlying rules, they all appear to view implicit learning as a product and, more specifically, as learned implicit knowledge (cf. Chen et al.’s [2011] conflation of learning and knowledge in their study). In other words, awareness appears to be associated with some form of explicit declarative L2 knowledge of the rules associated with what is being learned. This conclusion fits in quite well with an increasing number of empirical studies (e.g., Grey, Williams, & Rebuschat, 2014; Hamrick & Rebuschat, 2012; Rebuschat & Williams, 2012) that are currently addressing type of knowledge (implicit vs. explicit), employing a variety of offline measures (e.g., confidence ratings, source attributions) adopted from cognitive psychology (cf. Rebuschat, 2013, for a review of such measures).

Regarding the assessment tasks employed to measure learning based on the stages along the learning process, both the two-option interpretation test employed in Williams’s (2004, 2005) studies and the concurrent two-option (accompanied by the reaction time test) interpretation task/test employed in Leung and Williams’s (2011, 2012, 2014) studies are situated at Stage 3 of the learning process, although the premise in the latter studies is that at this point in time while performing this task, participants implicitly learned and were applying this newly learned and implicit knowledge to interpret correctly the targeted mapping, based on the accuracy of selecting the correct option.
The use of reaction time, employed in Leung and Williams (2011, 2012, 2014), is indeed an improvement over offline assessment tasks to gather concurrent data, once the issue of automaticity or speeded-up performance is controlled (cf. Chapter 8 on the limitations of the reaction time procedure). However, while reaction time is assumed by the researchers to measure learning based on the accuracy of the responses over time, it still cannot account for the measure of awareness employed in these studies. What can be taken away from these studies is that evidence of implicit knowledge (as measured in these studies) presupposes that implicit learning did take place during the experimental phase of the studies.

No Support for Implicit Learning

Leow (2000) employed a hybrid design that drew on both concurrent and nonconcurrent data and analyses, and reported that learning did not appear to occur among unaware learners for his particular population. His study addressed the effect of awareness or lack thereof on L2 learners’ subsequent intake and written
production of L2 forms. The targeted forms were the third persons of Spanish irregular preterit verbs (ending in either -er or -ir) that have a stem change in this tense. For example, the third person singular of the preterit of morir, “to die,” does not follow the regular verb paradigm *morió, but murió, “he died,” where the stem vowel o changes to u ; likewise, pedir, “to ask for,” is not pedió, but pidió, “he asked for,” where the stem vowel e changes to i. Thirty-two English-speaking adults learning Spanish as a foreign language at a college level participated in a problem-solving task (crossword puzzle) in which they were exposed to ten incomplete exemplars of the targeted forms among fifteen clues. Prior knowledge (recognition and controlled written production) of the targeted forms was controlled via a pretest administered three weeks before exposure. The crossword puzzle was designed to require learners to fill in the correct endings of the irregular verbs without the need to pay attention to the stem-change present in the incomplete targeted verb form (e.g., mur-). Immediately upon completion of the crossword puzzle, participants engaged in a four-option MC recognition assessment task with items that were identical to those in the crossword clues and a written production assessment task (a fill-in-the-blank) in which they were to produce the targeted forms in different contexts (the same assessment tasks used for the pretest). To operationalize the construct of awareness, participants were instructed to think aloud nonmetalinguistically while completing the crossword puzzle and also during the post-exposure assessment tasks. The verbal reports from the think aloud protocols were then coded to establish whether a participant was either aware or unaware, based on the criteria posited in Leow (2001) for levels of awareness. 
Participants were assigned to the aware group if they “provided a report of being aware of the target forms [a simple reference to the target forms which does not require mentioning of rules] or some form of metalinguistic description of the underlying rule” (p. 564), and others to the unaware group (their unawareness level was also cross-checked with offline awareness measures via post-exposure questions and an interview). The posttest scores of the aware participants’ recognition and written production assessment tasks revealed significant gains from the pretest scores of those tasks and also significant superiority on the immediate posttest when compared to the unaware group. On the other hand, among unaware learners there were no significant differences between their performances on the pretest and posttest. Leow (2000) concluded that “no disassociation between awareness and learning was found in this study” (p. 573) and cautioned that the findings reported were only directed to his sample of “adult beginning learners of Spanish” (p. 573).

Hama and Leow (2010) extended Williams’s (2005) study by addressing several methodological issues of Williams’s research design, namely (1) employing a hybrid design to gather data at the concurrent stage of encoding, during the testing phase, and after the experimental exposure, (2) increasing the number of options (four instead of two) on his two-option multiple-choice test to include the
presence of distance in learners’ selection of options (i.e., animacy plus distance) in order to replicate the training context and a more normal learning context, (3) including a production test in addition to the MC test to address participant performance after the internalization stage (cf. Leow, 1998, for the need to employ several assessment tasks), and (4) providing the same modality for both the learning and testing phases to address the potential impact of this variable not addressed in the original study. The think aloud protocols served to eliminate participants who demonstrated awareness of the animacy rule, employed a non-animacy-based strategy, or became aware of the rule while performing the post-exposure task. A critical statistical analysis (chance correction formula, Hama & Leow, 2010: 488) was then performed to align the study’s four-option MC test to the original two-option one employed in Williams (2005) to ensure statistical comparability between the current and original studies. The findings revealed that “unaware learners, at the stage of encoding, did not appear to demonstrate any significant animacy bias in either their selection or production of the trained or new determiner-noun combinations” (Hama & Leow, 2010: 482).

Faretta-Stutenberg and Morgan-Short (2011) was the second conceptual replication of Williams (2005). More specifically, they examined implicit learning in a group of learners with specific linguistic backgrounds (cf. Leung & Williams, 2014; Williams, 2004) and employed, according to the researchers, a more fine-grained analysis of awareness level, classified as noticing, understanding, and no report (cf. Rosa & O’Neill, 1999; Hama & Leow, 2010).
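The idea of aligning a four-option test with a two-option one, as mentioned in the account of Hama and Leow (2010), can be illustrated with the standard correction-for-guessing formula. This is an assumption made for illustration: the formula actually reported in Hama and Leow (2010: 488) is not reproduced in this chapter, and the function names and the sample score below are hypothetical.

```python
def correct_for_guessing(accuracy, n_options):
    """Rescale accuracy so that chance performance is 0 and perfect is 1."""
    if n_options < 2:
        raise ValueError("a multiple-choice test needs at least two options")
    chance = 1.0 / n_options
    return (accuracy - chance) / (1.0 - chance)


def rescale_to_options(accuracy, from_options, to_options):
    """Express accuracy from one test format on the scale of another.

    Chance on the source format maps to chance on the target format,
    and perfect accuracy maps to perfect accuracy.
    """
    corrected = correct_for_guessing(accuracy, from_options)
    target_chance = 1.0 / to_options
    return target_chance + corrected * (1.0 - target_chance)


# Hypothetical score: 40% correct on a four-option test (chance = 25%).
equivalent = rescale_to_options(0.40, from_options=4, to_options=2)
print(f"Two-option equivalent accuracy: {equivalent:.2f}")
```

On this kind of rescaling, a raw score can only be compared with 50% chance after the difference in guessing rates between the two formats has been removed.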
Faretta-Stutenberg and Morgan-Short (2011) did not find evidence for learning without awareness, or of learning with awareness at the level of noticing, duplicating the overall findings reported by Hama and Leow (2010) that no empirical evidence was found to support the original study’s claim of implicit learning. As in previous studies, only learners coded as demonstrating awareness at the level of understanding were able to perform at levels significantly above chance on generalization items. With regard to prior linguistic knowledge, Faretta-Stutenberg and Morgan-Short (2011) did not find evidence for a relationship between knowledge of gendered languages and performance in aware or unaware learners. However, according to the researchers, this null result might be because (a) the accuracy scores on generalization items among the unaware participants were not significantly higher than chance, and (b) there was little variation among participants in terms of number of gendered languages known.

Comments on Studies

Given that both Leow (2000) and Hama and Leow (2010) have operationalized the construct of awareness at the concurrent (online) stage of encoding or accessing the incoming experimental information, the definition that underlies these studies on the role of awareness in language learning appears to be one that views
implicit learning as a process (N. C. Ellis, 1994; Schmidt, 1994b; Tomlin & Villa, 1994), and, more specifically, taking place during the early stages (Stage 1 and Stage 3) along the learning process. To measure intake, both studies employed a four-option MC test, while to measure learning, a controlled written production test (Leow, 2000) and a fill-in-the-blank oral production test (Hama & Leow, 2010) were employed. The written production test employed in Leow (2000) did not contain novel exemplars, thereby limiting its findings to the aware group’s knowledge that might or might not have been fully systemized. However, with respect to the unaware group’s scores, this issue may not be relevant given that no significant increase in performance was reported. The fill-in-the-blank oral production test employed in Hama and Leow (2010) was quite controlled and not spontaneous in nature but, as in Leow (2000), this might not be an issue given the performance of the participants. Neither of these studies employed a control group to address the issue of reactivity, that is, the act of thinking aloud potentially impacting participants’ cognitive processes while processing the L2 data, and both may need to rely on the findings of the several studies that have empirically addressed this methodological issue. The issue of reactivity is an important methodological concern with studies employing concurrent data elicitation procedures, and I have reviewed this issue earlier.

Differences Between the Two Measures of Awareness

As can be seen, there are clear differences between these two perspectives of what comprises implicit learning. First, with respect to learning, the assessment task used to measure learning, namely, receptive at the stage of intake processing (Stage 3) (online, Leung & Williams, 2011, 2012, 2014; offline, Williams, 2005) versus productive at the stage of output (beyond Stage 5) (Hama & Leow, 2010; Leow, 2000), appears to be crucial in explicating where along the learning process data are being elicited. Second, with respect to unawareness, there are two perspectives of this construct. The first is the perception of where awareness occurs along the learning process, as exemplified in the concurrent or online (Hama & Leow, 2010; Leow, 2000) versus the non-concurrent or offline (Chan & Leung, 2014; Chen et al., 2011; Faretta-Stutenberg & Morgan-Short, 2011; Leung & Williams, 2011, 2012, 2014; Williams, 2005) operationalization and measurement of the construct of awareness. Another way of viewing this distinction is that of awareness while engaging in the process of learning (concurrent, Stage 3) versus awareness of one’s knowledge after exposure (non-concurrent, beyond Stage 5). Addressing unawareness at the encoding and concurrent stage via the use of online verbal protocols is likely to be more valid when compared to offline operationalization and measurement because the information is still fresh in working memory
and directly accessible for verbal reports. The second perspective is viewing the construct of awareness either at one level or as a dichotomous variable, or as a construct that occurs on a continuum with several levels. To elicit tangible evidence that learning did take place, it is recommended that studies investigating any type or kind of learning employ minimally a productive task with both trained and untrained items to control for any potential role for memory. To address robust learning, the inclusion of a delayed posttest is highly warranted.

Pedagogical Implications?

OK, I am back as a teacher wearing my researcher’s cap. What can we take away from the current literature on the role of awareness or lack thereof in learning? There is no question that promoting learning with awareness in the L2 classroom has both theoretical and empirical support in the SLA field. At the same time, should we promote learning without awareness in the L2 classroom? That is, should I expose my students to the L2, albeit accompanied by well-designed materials or input, and hope that they learn?

To answer this question, let us return to the evidence gleaned from the SLA literature on implicit learning. On the one hand, we have reported evidence that L2 learners can make form-meaning connections of non-language items embedded within a naturally occurring language, but presented at a sentential level with a huge number of exemplars. We are, however, not sure whether this type of learning is robust, given that participants performed a receptive task, no immediate productive test or delayed posttest was administered, and the gain scores are minimal despite the huge number of exemplars. On the other hand, it is also reported in several of these studies that the aware learners typically performed substantially better than the unaware learners, replicating the positive benefits reported in other awareness studies on L2 development.

Why did aware participants consistently perform significantly better than unaware participants? Studies employing think aloud protocols may provide the answer: These participants most likely processed the data at a deeper level and employed reported processes of hypothesis testing and rule formation (cf. e.g., Bowles, 2003; Hsieh et al., 2015; Leow, 2001; Rosa & O’Neill, 1999; Rosa & Leow, 2004). At this point, it may be advisable to await further investigation into the robustness of implicit learning before the findings of this type of learning can be extrapolated to the classroom setting.

Conclusion

In summary, there is really no argument or debate concerning the beneficial role of awareness in L2 development or the fact that explicit learning does promote L2 development, while the jury is still out regarding implicit learning. At the same time, while it is clearly challenging to create a research design that can elicit data
on unawareness during L2 input processing (hence the methodological issues that I have raised), I don’t think anyone will disagree that implicit learning, that is, learning without awareness, is intuitively one way an adult L2 learner may learn or even acquire the L2. Of course, there is a huge caveat: This process will naturally depend on the ideal conditions that include quite a long period of exposure to and interaction with the L2 in a meaningful and communicative way (think immersion setting), though the potential for both explicit and implicit learning taking place may be quite high. The obvious question is whether the classroom setting, be it formal (as in attending classes in a classroom) or online (as in long-distance classes or even a hybrid curriculum), can provide such optimal conditions for implicit learning to take place. Given the formality of the classroom setting and pedagogical implications derived from SLA research regarding the role of awareness in L2 learning, I am going to elaborate in the next chapter on a fine distinction I have observed between levels of awareness and depth of processing, which will have important ramifications for both pedagogical extrapolations and the tenets of my proposed model of the L2 learning process in Instructed SLA.

References

Alanen, R. (1995). Input enhancement and rule presentation in second language acquisition. In R. Schmidt (Ed.), Attention and awareness in foreign language learning and teaching. Honolulu, HI: University of Hawai’i Press. Allport, A. (1988). What concept of consciousness? In A. J. Marcel & E. Bisiach (Eds.), Consciousness in contemporary science (pp. 159–182). London: Clarendon Press. Bialystok, E. (1979). Explicit and implicit judgments of L2 grammaticality. Language Learning, 29, 81–103. Bowles, M. A. (2003). The effect of textual input enhancement on language learning: An online/offline study of fourth semester Spanish students. In P. Kempchinsky & C. Pineros (Eds.), Theory, practice, and acquisition: Papers from the 6th Hispanic Linguistic Symposium and 5th Conference on the Acquisition of Spanish and Portuguese. Somerville, MA: Cascadilla Press. Chan, R., & Leung, J. (2014). Implicit learning of natural language stress rules. Second Language Research, 30, 463–484. Chen, W., Guo, X., Tang, J., Zhu, L., Yang, Z., & Dienes, Z. (2011). Unconscious structural knowledge of form-meaning connections. Consciousness and Cognition, 20, 1751–1760. Cleeremans, A., & McClelland, J. L. (1991). Learning the structure of event sequences. Journal of Experimental Psychology: General, 120, 235–253. de Graaff, R. (1997). The eXperanto experiment: Effects of explicit instruction on second language acquisition. Studies in Second Language Acquisition, 19, 249–276. DeKeyser, R. M. (1995). Learning second language grammar rules: An experiment with a miniature linguistic system. Studies in Second Language Acquisition, 17, 379–410. De la Fuente, M. J. (2015). Explicit corrective feedback and computer-based, form-focused instruction: The role of L1 in promoting awareness of L2 forms. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.

Dienes, Z., & Scott, R. (2005). Measuring unconscious knowledge: Distinguishing structural knowledge and judgment knowledge. Psychological Research, 69 (5/6), 338–351. Dulaney, D. E., Carlson, R. A., & Dewey, G. I. (1985). On consciousness in syntactic learning and judgment: A reply to Reber, Allen, and Regan. Journal of Experimental Psychology: General, 114, 25–32. Ellis, N. C. (1993). Rules and instances in foreign language learning: Interactions of explicit and implicit knowledge. European Journal of Cognitive Psychology, 5, 289–318. Ellis, N. C. (Ed.). (1994). Implicit and explicit learning of languages. London: Academic Press. Ellis, R. (2004). The definition and measurement of explicit knowledge. Language Learning, 54, 227–275. Eriksen, C. W. (1960). Discrimination and learning without awareness: A methodological survey and evaluation. The Psychological Review, 67, 279–300. Faretta-Stutenberg, M., & Morgan-Short, K. (2011). Learning without awareness reconsidered: A replication of Williams (2005). In G. Granena, J. Koeth, S. Lee-Ellis, A. Lukyanchenko, G. Prieto Botana, & E. Rhoades (Eds.), Selected proceedings of the 2010 Second Language Research Forum: Reconsidering SLA research, dimensions, and directions (pp. 18–28). Somerville, MA: Cascadilla Proceedings Project. Grey, S., Williams, J. N., & Rebuschat, P. (2014). Incidental exposure and L3 learning of morphosyntax. Studies in Second Language Acquisition, 1–35. Hama, M., & Leow, R. P. (2010). Learning without awareness revisited: Extending Williams (2005). Studies in Second Language Acquisition, 32, 465–491. Hamrick, P., & Rebuschat, P. (2012). How implicit is statistical learning? In P. Rebuschat & J. Williams (Eds.), Statistical learning and language acquisition. Boston: de Gruyter. Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). Awareness, type of medium, and L2 development: Revisiting Hsieh (2008). In R. P. Leow, L. Cerezo, & M. 
Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton. Hulstijn, J. H. (2005). Theoretical and empirical issues in the study of implicit and explicit second-language learning. Studies in Second Language Acquisition, 27, 129–140. Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language Learning, 47, 467–506. Leow, R. P. (1998). The effects of amount and type of exposure on adult learners’ L2 development in SLA. Modern Language Journal, 82, 49–68. Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware versus unaware learners. Studies in Second Language Acquisition, 22, 557–584. Leow, R. P. (2001). Attention, awareness and foreign language behavior. Language Learning, 51, 113–155. Leow, R. P. (2012). Explicit and implicit learning in the L2 classroom: What does the research suggest? The European Journal of Applied Linguistics and TEFL , 2, 117–129. Leow, R. P. (2015). Implicit learning in SLA: Of processes and products. In P. Rebuschat (Ed.), Implicit and explicit learning of languages. Amsterdam: John Benjamins. Leow, R. P., & Hama, M. (2013). Implicit learning in SLA and the issue of internal validity: A response to Leung and Williams’ ‘The implicit learning of mappings between forms and contextually derived meanings.’ Studies in Second Language Acquisition, 35(3), 545–557. Leow, R. P., Johnson, E., & Zárate-Sández, G. (2011). Getting a grip on the slippery construct of awareness: Toward a finer-grained methodological perspective. In C. Sanz & R. P. Leow (Eds.), Implicit and explicit conditions, processes and knowledge in SLA and bilingualism (pp. 61–72). Washington, D.C.: Georgetown University Press.

Leung, J. H. C., & Williams, J. N. (2011). The implicit learning of mappings between forms and contextually derived meanings. Studies in Second Language Acquisition, 33, 33–55. Leung, J. H. C., & Williams, J. N. (2012). Constraints on implicit learning of grammatical form-meaning connections. Language Learning, 62, 634–662. Leung, J. H. C., & Williams, J. N. (2014). Crosslinguistic differences in implicit language learning. Studies in Second Language Acquisition, 29, 1–23. Loschky, L., & Bley-Vroman, R. (1993). Grammar and task-based learning. In G. Crookes & S. Gass (Eds.), Tasks and language learning: Integrating theory and practice (pp. 123–167). Clevedon, UK: Multilingual Matters. Mackey, A. (2006). Feedback, noticing and instructed second language learning. Applied Linguistics, 27(3), 405–430. Martínez-Fernández, A. (2008). Revisiting the involvement load hypothesis: Awareness, type of task and type of item. In M. Bowles, R. Foote, S. Perpiñán, & R. Bhatt (Eds.), Selected proceedings of the 2007 Second Language Research Forum (pp. 210–228). Somerville, MA: Cascadilla Proceedings Project. McLaughlin, B. (1990). “Conscious” versus “unconscious” learning. TESOL Quarterly, 24, 617–634. Medina, A. (2015). The variable effects of level of awareness and CALL versus non-CALL textual modification on adult L2 readers’ input comprehension and learning. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton. Morgan-Short, K., Sanz, C., Steinhauer, K., & Ullman, M. (2010). Second language acquisition of gender agreement in explicit and implicit training conditions: An event-related potential study. Language Learning, 60, 154–193. Nissen, M. J., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19, 1–32. Ortega, L. (2009). Understanding second language acquisition. London: Hodder. Reber, A. S. (1967). 
Implicit learning of artificial grammars. Journal of Verbal Learning and Verbal Behavior, 6, 855–863. Reber, A. S. (1976). Implicit learning of synthetic languages: The role of instructional set. Journal of Experimental Psychology: Human Learning and Memory, 2, 88–94. Reber, A. S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118, 219–235. Rebuschat, P. (2013). Measuring implicit and explicit knowledge in second language research: A review. Language Learning, 63, 595–626. Rebuschat, P., & Williams, J. (2012). Implicit and explicit knowledge in second language acquisition. Applied Psycholinguistics, 33, 829–856. Robinson, P. (1996). Learning simple and complex second language rules under implicit, incidental, rule-search and instructed conditions. Studies in Second Language Acquisition, 18, 27–68. Robinson, P. (1997). Generalizability and automaticity of second language learning under implicit, incidental, enhanced, and instructed conditions. Studies in Second Language Acquisition, 19, 223–247. Rosa, E. M., & Leow, R. P. (2004). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25, 269–292. Rosa, E., & O’Neill, M. D. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21, 511–556. Rott, S. (2005). Processing glosses: A qualitative exploration of how form-meaning connections are established and strengthened. Reading in a Foreign Language, 17, 95–124.

Sachs, R., & Suh, B-R. (2007). Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.), Conversational interaction in second-language acquisition: A series of empirical studies (pp. 197–227). Oxford: Oxford University Press. Sato, M., & Ballinger, S. (2012). Raising language awareness in peer interaction: A cross-context, cross-methodology examination. Language Awareness, 21, 159–179. Schachter, D. L. (1989). On the relation between memory and consciousness: Dissociable interactions and conscious experience. In H. L. Roediger and F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honour of Endel Tulving. Hillsdale, NJ: LEA. Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158. Schmidt, R. W. (1994a). Implicit learning and the cognitive unconscious: Of artificial grammars and SLA. In N. Ellis (Ed.), Implicit and explicit learning of languages (pp. 165–209). London: Academic Press. Schmidt, R. W. (1994b). Deconstructing consciousness in search of useful definitions for applied linguistics. In J. H. Hulstijn & R. W. Schmidt (Eds.), AILA Review: Consciousness and second language learning: Conceptual, methodological and practical issues in language learning and teaching, 11, 11–26. Shanks, D. R., Green, R. E. A., & Kolodny, J. A. (1994). A critical examination of the evidence for unconscious (implicit) learning. In C. Umiltà & M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing (pp. 837–860). Cambridge, MA: MIT Press. Shanks, D. R., & St. John, M. F. (1994). Characteristics of dissociable human learning systems. Behavioral and Brain Sciences, 17, 367–447. Smith, B. (2012). Eye-tracking as a measure of noticing: A study of explicit recasts in SCMC. Language Learning and Technology, 16, 53–81. Tomlin, R. S., & Villa, V. (1994). 
Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–203. Uggen, M. S. (2012). Reinvestigating the noticing function of output. Language Learning, 62(2), 506–540. Williams, J. N. (2004). Implicit learning of form-meaning connections. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet (Eds.), Form-meaning connections in second language acquisition (pp. 203–218). Mahwah, NJ: Erlbaum. Williams, J. N. (2005). Learning without awareness. Studies in Second Language Acquisition, 27, 269–304. Williams, J. N. (2009). Implicit learning in second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), The new handbook of second language acquisition (pp. 319–353). Bingley, UK: Emerald Group Publishing.

11 DEPTH OF PROCESSING IN L2 PROCESSING

As described in Chapter 2, input processing is postulated to represent the initial stage of the learning process and is theoretically placed at the input-to-intake stage of this learning process. I cautioned that input processing in SLA is not as straightforward as it may seem, given that, while some theoretical underpinnings view such initial processing of the L2 data as minimally dependent upon attention and some low level of cognitive effort (Robinson, 1995, 2003; Schmidt, 1990, 2001), others appear to assign higher levels of processing during this initial stage (Gass, 1997; Truscott & Sharwood Smith, 2011; VanPatten, 2004). Of course, ascertaining whether learners do expend a specific amount of cognitive effort while processing L2 data is not easy, and insights into such levels usually come from concurrent data elicitation procedures such as think aloud (TA) protocols and, to a lesser extent, eye-tracking (ET) technology and reaction time (RT) procedures. A careful review of data gathered concurrently while learners were processing L2 written data reveals that, in several instances, learners who appeared to have paid peripheral attention to targeted items in the input failed to even recognize such items in a post-exposure assessment task. Evidence of such peripheral attention paid to targeted items in the input has also been reported in studies that have employed eye-tracking technology to operationalize and measure the construct of attention/noticing (e.g., Godfroid, Boers, & Housen, 2013).
At the same time, we also have evidence of learners demonstrating noticing of targeted items in L2 written input, yet performing differentially on subsequent assessment tasks: for example, some were able to recognize some of the targeted items but unable to produce these recognized items on a controlled written production test, some were able to perform well on both recognition and written production tasks, and, very surprisingly, others were unable to even recognize some of the targeted
items, much less produce them (e.g., Bowles, 2003; Leow, 1997; Leow, Hsieh, & Moreno, 2008). Of interest here are the differential depths of processing (DOP) demonstrated by learners in these protocols. On the one hand, protocols revealed the rapidity of the reading process, with no attempt to decode the targeted items beyond demonstrating paying attention to or briefly noticing them, while on the other hand, there were obvious efforts to decode minimally or to process further the targeted items (cf. Baralt, 2013 for such evidence in a synchronous computer-mediated communication (SCMC) platform). Clearly, then, we do need to take seriously the issue of level or depth of processing during the input, intake, and output processing stages.

This chapter reports on the theoretical and empirical research on the notion of depth of processing in the L1 in cognitive psychology and replicates this report in the SLA field, operationalized via both experimental conditions and concurrent think alouds. Depth of processing is discussed in relation to levels of awareness, as well as its potential inhibitory effect, and the benefits of adopting a DOP perspective to code TA protocols are presented. A chart for operationalizing the concept of depth of processing for both lexical and grammatical items is provided, and this concept is, in turn, employed to explicate findings from different strands of research such as textual enhancement, experimental learning conditions, and oral feedback strand/CMC. But first, here is a definition of depth of processing.

Definition of Depth of Processing

Depth of processing is defined as the relative amount of cognitive effort, level of analysis, and elaboration of intake, together with the usage of prior knowledge, hypothesis testing, and rule formation employed in decoding and encoding some grammatical or lexical item in the input.

Depth of Processing in the L1 in Cognitive Psychology

The concept of level or depth of processing is usually attributed to Craik and Lockhart’s (1972) levels of processing framework in the cognitive psychology field, which employed this concept to refer to conceptual or semantic processing (i.e., deep processing) versus perceptual processing (i.e., shallow processing). To support their levels of processing framework, Craik and Lockhart relied on several previous studies that reported that (1) deep processing was superior to shallow processing in terms of memory performance related to word-structure processing under incidental conditions (participants were not given specific instructions to learn or pay attention to any information), such as ranking a word as to its pleasantness versus checking for the letter e (Hyde & Jenkins, 1969) or writing down an adjective that could be modified versus a rhyming word (Johnston & Jenkins, 1971), and (2) increased repetition within a low level of processing did not lead to increased performance (Tulving, 1966; Turvey, 1967).

Depth of Processing and Input Processing

Craik and Lockhart (1972) claimed that remembering information depended not only on having attended to it during its occurrence or having rehearsed it after its occurrence, but also on how deeply it was processed. Shallow processing may be either structural processing, which occurs when we encode physical features of something (e.g., the appearance of letters in a word), or phonemic processing, which is when we encode the sound of the item. The potential for retention is not strong for this type of processing since it only involves maintenance rehearsal or repetition to hold it in short-term memory. Deep processing, on the other hand, is when we decode the word in relation to its meaning and potential relationship with other similar words already existing in our current knowledge system. This type of processing involves elaboration rehearsal that incorporates deeper analysis of the item, such as activation of prior knowledge and meaningful analysis, and leads to superior recall of the item. Craik and Lockhart’s proposal included the following notions: (a) Processing is the analysis of information that underlies perception and comprehension, (b) there is a hierarchy of levels of analysis, running from early analyses of sensory and surface features to later analyses of semantic and conceptual features, (c) analysis of meaning requires more attention than analysis of sensory features, and, therefore, the later levels of analysis are “deeper” levels of processing, and (d) semantic deeper processing is associated with stronger and longer-lasting memory traces. In a series of ten experiments on word or lexical processing, Craik and Tulving (1975) reported overall empirical evidence for the effects of levels of processing on both incidental and intentional memory performance. Put another way, then, the concept of depth of processing is quite simple. 
If one were to process incoming information at a deeper level, that is, employ greater cognitive effort during processing while using prior knowledge to strengthen the process, the chances of remembering such information are substantially increased. What occurs is that when some aspect of the input is attended to, a trace is formed, and whether this trace remains in memory to be accessed later will depend upon the depth with which the trace has been processed. From a cognitive neuroscience perspective, the activation level of the processed information is higher than normal and the neurons are firing strongly. In addition, retention of said information over a period of time is predicted to occur based on depth of processing (there is evidence of this in SLA, as will be discussed below). To measure depth of processing, Craik and Lockhart proposed addressing the retention, amount of attention, and compatibility of the analyzing structures, together with amount of time spent, although they also did not place much emphasis on this last variable given that time alone cannot be used as a reliable measure of depth of processing across different tasks (e.g., familiar vs. unfamiliar tasks). Empirical studies have investigated the effect of levels of processing not only on memory performance but also on Tulving’s (1983) notions of episodic memory, which involves recollection of contextual details (i.e., remembering) and

semantic memory, characterized by a sense of familiarity (i.e., knowing). Using Tulving’s proposed remember-know procedure, studies have addressed different types of modes and items, such as L1 words in the visual mode (e.g., Gardiner, 1988; Gardiner, Java, & Richardson-Klavehn, 1996; Rajaram, 1993) and L1 words in the auditory mode (e.g., Karayianni & Gardiner, 2003; Rajaram, 1993), and pictures (e.g., Konstantinou & Gardiner, 2005; Rajaram, 1996). Overall, these studies have found support for two main hypotheses: (1) Fairly minimal encoding conditions (shallow processing) are sufficient for items to be registered in semantic memory and/or to be recognized with an experience of knowing, and (2) optimal encoding conditions (deep processing) are necessary for items to be registered in episodic memory and/or to be recognized with an experience of remembering. The operationalizations of deep and shallow processing in these studies included the following: (a) Encoding while carrying out semantic versus graphemic/phonetic tasks; (b) encoding while completing generating versus reading tasks; (c) encoding while carrying out undivided attention versus divided attention tasks; and (d) encoding of words versus non-words. Most studies used a within-subject design so that all participants completed both types of task. The number of items for each condition ranged between 15 and 100, and variables such as number of letters or syllables, frequency, and, to a lesser extent, word class and word imageability, were controlled. The retention interval ranged from a few minutes in most cases to one week. Findings of these studies indicated that deep processing gives rise to significantly more remember responses than know responses. In contrast, shallow processing does not have a significant effect on either remember or know responses. Overall, deep processing was related to conceptual or semantic processing, amount of attention, and elaborative processing.
The role of awareness in several of these studies has also been addressed by assuming that the responses of remembering provided evidence of awareness, while knowing provided evidence of unawareness (e.g., Gardiner, 1988; Gardiner et al., 1996; Khoe, Kroll, Yonelinas, Dobbins, & Knight, 2000; Rajaram, 1993), which, interestingly, led to methodological issues associated with the validity of using remembering and knowing as indices of awareness, together with the role of guessing in such designs. Researchers, however, have pointed out that the view of semantic and perceptual processing as deep and shallow processing respectively may not be pertinent given that both conceptual and perceptual features can be processed with different degrees of depth (cf. Baddeley, 2002, and Craik, 2002, for a review of the levels of processing framework). Consequently, there has been a shift in the notion of depth of processing to align it closer to “degree of elaboration,” “degree of consciousness,” or “level of awareness” rather than to semantic processing (e.g., Craik & Tulving, 1975). Gardiner and Richardson-Klavehn (2000) established a relationship between these notions and memory: “All that is necessary for encoding into the semantic system is some initial awareness of the events, however

fleeting. In contrast, encoding into episodic memory must depend on greater conscious elaboration of the events” (p. 234). According to this view, findings would postulate a relationship between attention and awareness at encoding on the one hand, and memory awareness at retrieval on the other hand. There are, nevertheless, certain methodological limitations of the way the notions of “deep” versus “shallow,” “degree of elaboration,” “degree of consciousness,” and “level of awareness” were operationalized in the empirical studies reviewed (e.g., encoding while completing generating versus reading tasks, undivided attention versus divided attention tasks, and encoding of words versus non-words). First of all, the concepts of “deep” and “shallow” were not defined independently, that is, the definition of one level depended on the definition of the other level (e.g., memorizing real words is assumed to induce deep processing, but only when this task is compared to memorizing non-words). Second, process measures such as think aloud protocols were not employed to ensure that participants’ performances were representative of their different experimental conditions (e.g., it was assumed that they paid attention in a task with no divided attention requirements). Therefore, most studies investigated the effects of different types of tasks or items that were assumed to reflect a type of processing, but they did not attempt to measure the type of processing itself. Third, most of the studies reviewed have used a short retention interval. Studies outside the levels of processing framework suggest that know responses may lead to lack of recognition after one week, while remember responses become know responses over time (Conway, Gardiner, Perfect, Anderson, & Cohen, 1997; Knowlton & Squire, 1995). This phenomenon is referred to as the “remember-to-know-shift” (declarative/explicit knowledge to procedural/implicit knowledge?). 
Because the effect of retention interval may have important implications for learning, studies investigating depth of processing should also address this time variable, that is, retention is a good indicator of deep processing and, ultimately, robust learning. Operationalizations of depth of processing, then, may not be completely satisfactory in cognitive psychology, and one main limitation of the levels of processing framework may be the absence of an objective index of depth of processing (Craik, 2002).

Summary

In summary, in all of the studies reviewed here, deeper, semantic processing appeared to lead to higher retention (e.g., Gardiner, 1988; Gardiner et al., 1996; Gardiner, Brandt, Vargha-Khadem, Baddeley, & Mishkin, 2006; Khoe et al., 2000; Rajaram, 1993). The majority of these studies reported that this higher retention was associated only with the remember response type, with the know response type being unaffected by processing condition (e.g., Gardiner, 1988; Gardiner et al., 1996; Gardiner et al., 2006). It has also been suggested that, given the relationship between remember responses and processing level, the advantage

provided by the higher processing condition is due to its enhancement of episodic memory (e.g., Gardiner, 1988; Rajaram, 1993). Research in the L1, then, provides substantial evidence in favor of a role for depth of processing (and awareness) in mostly lexical learning and retention. Craik and Lockhart’s (1972) levels of processing framework is viewed as an improvement on earlier multi-store models of memory (e.g., Shiffrin and Schneider’s (1977) model of short- and long-term memory), given that it underscores the importance of focusing more on the processes involved in memory than on postulated stores or structures of memory. Encoding of information, then, was not a simple process, and this new perspective had a huge impact on memory research in cognitive psychology by moving beyond the perception of memory as simple stores and viewing it as a complex processing system. One of the major plausible explanations for the finding that deeper encoding produces better retention is that such encoding is more elaborate. Elaborative encoding creates a rich memory representation of an item by not only activating many aspects of its meaning but also linking it into the learner’s pre-existing network of semantic associations (read activation of appropriate prior knowledge). However, teasing out the roles of time and effort clearly needs to be addressed.

Depth of Processing in SLA

As discussed, the concept of levels or depth of processing (DOP) has been refined during the last few decades, and instead of referring to semantic processing, the notion of depth or level of processing in SLA refers to the strength of the connections made within one domain. In this way, depth of processing in second language acquisition is similar to Craik and Tulving’s (1975) degrees of elaboration and has been adopted in the SLA field to refer also to, for example, (1) amount (Shook, 1994) or type (e.g., focused vs. non-focused; Gass, Svetics, & Lemelin, 2003) of attention, (2) mental effort, elaboration, or involvement (Calderón, 2013; de la Fuente, 2015; Hsieh, Moreno, & Leow, 2015; Kim, 2008; Laufer & Hulstijn, 2001; Leow et al., 2008; Martínez-Fernández, 2008; Rott, 2005), (3) substantive vs. perfunctory noticing or quality of noticing (Qi & Lapkin, 2001) and shallow versus deep processing (Bird, 2012), and (4) levels of awareness (Hsieh et al., 2015; Leow, 2012). DOP is also subsumed in several of the psycholinguistic underpinnings discussed in Chapter 5. As you may recall, McLaughlin’s (1987) cognitive theory postulates that the amount of attention paid to the L2 data by L2 learners is largely dependent upon the amount of cognitive effort required by their processing of the input. Gass’s (1997) model posits that comprehended input may be analyzed at different levels of analysis, for example, global comprehension versus a more linguistic focus. VanPatten’s (2004) Primacy of Meaning Principle postulates that deeper levels of processing (e.g., processing for form and meaning simultaneously), which presumably require greater levels of cognitive effort and elaboration, should interfere more with comprehension processes. Chaudron

(1985) refers to additional stages of cognitive effort, from preliminary intake to final intake, while Robinson includes in his model data-driven and conceptually-driven processes that occur after initial intake. Ellis (2007) permits conscious attention to problematic input, while Swain (1985) requires learners to process quite deeply during output. Finally, Truscott and Sharwood Smith provide several levels of processing-awareness in the MOGUL framework. The several studies in SLA that have addressed the role of depth of processing in L2 development operationalized DOP either indirectly via experimental conditions assumed to promote deeper processing (Bird, 2012; Gass et al., 2003; Laufer & Hulstijn, 2001; Shook, 1994) or directly via concurrent verbal reports (Hsieh et al., 2015; Leow et al., 2008; Morgan-Short, Heil, Botero-Moriarty, & Ebert, 2012; Qi & Lapkin, 2001; Rott, 2005). Let us take a look at these studies.

DOP Operationalized via Experimental Conditions

Shook (1994) empirically investigated the effect of three attentional levels (implicitly assumed to engage learners’ differential amount of attention) on participants’ intake and production of the Spanish present perfect tense and relative pronouns. Participants were 125 English-speaking college-level first- and second-year students of Spanish who were randomly assigned to one of three Attentional Conditions. In Attentional Condition 1, the control group simply read the experimental texts for meaning; in Attentional Condition 2, participants read for meaning the same texts with the targeted grammatical items enhanced (bolded and uppercase); Attentional Condition 3 was similar to Attentional Condition 2, with the added instruction to “come up with a rule for the use of [the grammatical items]” (Shook, 1994: 70). Participants read, on two separate days, two passages, each containing six exemplars of the targeted grammatical item (the present perfect tense, for example, ha comprado, “he has bought” or the relative pronouns que/quien(es), “who”). The design was a pretest-exposure-posttest design, and the assessment tasks were a written production and recognition task. Shook found that while participants in Attentional Conditions 2 and 3 performed significantly better on both the written production and recognition tests when compared to the Control group, no significant differences were found between Attentional Conditions 2 and 3. This study appears to suggest that more attention plays a facilitative role in intake and production. One key limitation of this study is the failure to establish that participants did indeed pay the appropriate amount of attention to the targeted items in the texts. Similarly, Gass et al. (2003) also noted that Shook’s manipulation of input only made it “more or less likely that learners will focus attention on something” (p.
508), and therefore the design failed to demonstrate that participants actually paid attention to the extent that was intended by the experimental conditions. Gass et al. (2003) attempted to address their methodological critique of Shook (1994) by investigating the effect of focusing participants’ attention on syntactic,

morphosyntactic, and lexical items on their ability to (1) judge the grammaticality of sentences, in the case of syntactic and morphosyntactic items, and (2) translate sentences for lexical items. Participants were 34 English-speaking college-level students of Italian at the first-, second-, and third-year level of proficiency and were assigned to one of four groups: (1) Focused attention for syntax and lexicon; (2) Focused attention for morphosyntax; (3) Focused attention for syntax; and (4) Focused attention for morphosyntax and lexicon. Focused attention was operationalized as requesting participants to pay attention to underlined targeted items, phrases, or structures in a reading passage and to respond to a specific question related to the similarity between these items. This was followed by a rule presentation or instructions for word guessing and practice with the targeted items, phrases, or structures. Unfocused attention was operationalized as requesting participants to focus on meaning while reading the unenhanced text, answering comprehension questions, followed by instructions to memorize non-targeted words and practice a word substitution task for synonyms of non-targeted words. Participants read three stories, each with several exemplars of one type of targeted item, together with a series of additional tasks. Gass et al. (2003) reported that focused attention had a significant effect on first-year learners’ performance for all linguistic items. However, the only significant effect found for the second-year learners was for the lexical items. In other words, the results indicated that the more abstract and complex syntactic structure appeared to benefit more from focused attention, while type of attention was not differentiated with regard to lexical items. Gass et al. (2003) concluded that “focused attention does seem to be a powerful mechanism for learning” (p. 526), even though “learning can take place without focused-attentional intervention” (p.
526) and focused attention has an impact on short-term learner development. A key methodological issue with Gass et al.’s (2003) study, however, lies with the multiple variables, including input enhancement and practice, that were used to operationalize focused attention, which leads to some uncertainty regarding which variable or variables contributed to the results reported in the study. In addition, like Shook’s (1994) study, this study failed to methodologically establish that participants did indeed focus appropriately on the targeted items in the materials. The assumption that making a mental effort had a positive effect on vocabulary learning (e.g., Hulstijn, 1992) led Laufer and Hulstijn (2001) to propose the Involvement Load Hypothesis within the incidental L2 vocabulary learning strand. This hypothesis is rooted in both the levels of processing framework proposed in cognitive psychology and the attentional models in SLA, that is, depth of processing was viewed as elaboration and amount of attention. According to Hulstijn (2001), “processing new lexical information more elaborately (e.g., by paying attention to the word’s pronunciation, orthography, grammatical category, meaning and semantic relations to other words) will lead to higher retention than by processing new lexical information less elaborately (e.g., by paying attention to only one or two of these dimensions)” (p. 270). Consequently,

Laufer and Hulstijn (2001) proposed the notion of “involvement” as one way to operationalize the construct of depth of processing in SLA. The Involvement Load Hypothesis posits that incidental tasks that induce higher involvement can promote the type of processing deemed crucial for vocabulary retention. The notion of involvement includes three task-specific components: A motivational component, “need” (+N), and two cognitive components, “search” (+S) and “evaluation” (+E). “Need” is defined as “the drive to comply with task requirements, whereby the task requirements can be either externally imposed (i.e., moderate need, +N) or self-imposed (i.e., strong need, ++N)” (Laufer & Hulstijn, 2001: 14). “Search” and “evaluation” require the allocation of attention to form-meaning relationships. “Search” is defined as the attempt to find the meaning of an unknown word, while “evaluation” involves “a comparison of a given word with other words, a comparison of a specific meaning of a word with its other meanings, or combining the word with others in order to assess whether a word (i.e., a form-meaning pair) does or does not fit its context” (Laufer & Hulstijn, 2001: 14). According to Laufer and Hulstijn, there is also “moderate evaluation” (+E), when words being evaluated must fit in a given context, and “strong evaluation” (++E), when words being evaluated must be combined with additional words in an original context created by the learner. According to the permutation of these three components, tasks may induce different degrees of learner involvement and, depending upon the presence or absence of these components, may lead to noticing and elaborated processing of the words, and ultimately to vocabulary retention. 
Laufer and Hulstijn also suggest that (a) task effectiveness does not depend on whether the task is input or output oriented, but only on its involvement load, and (b) there may be an interaction between the effect of involvement load and other factors such as type of item and quantity of exposure. To test the Involvement Load Hypothesis, Laufer and Hulstijn’s (2001) participants were three intact classes of advanced university learners of English in the Netherlands (N = 87) and in Israel (N = 99), who were randomly assigned to one of three conditions varying in the involvement load induced by the task completed: (a) In the gloss condition [+N, -S, -E], participants read a text with L1 marginal glosses for ten targeted words, and answered ten multiple-choice comprehension questions; (b) In the fill-in condition [+N, -S, +E], participants read the same text and answered the same questions but the targeted words were deleted from the text, leaving ten blanks, which they had to fill by choosing a word from a list that contained fifteen words with their L1 translations and L2 explanations; (c) In the writing condition [+N, -S, ++E], participants were requested to write a composition using the targeted words, for which grammatical category, L2 explanation, example, and L1 translation were provided. To measure vocabulary retention, an unannounced immediate and delayed production posttest in which participants provided either an L1 translation or an L2 explanation for the targeted words was employed.
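
The involvement loads of the three conditions just described can be compared with a simple tally. The following is only an illustrative sketch: it assumes that an absent component counts as 0, a moderate one (+) as 1, and a strong one (++) as 2, and the names `LEVELS`, `CONDITIONS`, `involvement_index`, and `rank_conditions` are hypothetical labels introduced here rather than terms from the study.

```python
# Assumed scoring (not stated in this chapter): absent = 0, moderate = 1, strong = 2.
LEVELS = {"-": 0, "+": 1, "++": 2}

# The three conditions as described above: (need, search, evaluation).
CONDITIONS = {
    "gloss": ("+", "-", "-"),
    "fill-in": ("+", "-", "+"),
    "writing": ("+", "-", "++"),
}


def involvement_index(need, search, evaluation):
    """Sum the assumed scores of the three involvement components."""
    return LEVELS[need] + LEVELS[search] + LEVELS[evaluation]


def rank_conditions(conditions):
    """Return condition names ordered from lowest to highest assumed load."""
    return sorted(conditions, key=lambda name: involvement_index(*conditions[name]))


if __name__ == "__main__":
    for name in rank_conditions(CONDITIONS):
        print(name, involvement_index(*CONDITIONS[name]))
```

Under this assumed scoring the writing condition carries the highest load and the gloss condition the lowest, which is the ordering the hypothesis predicts for vocabulary retention.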

The results indicated that the writing condition (i.e., the condition with the highest involvement load) yielded significantly higher retention than the fill-in and gloss conditions in both experiments. The fill-in group produced significantly higher retention than the gloss condition in only one experiment. The hypothesis, then, appeared to be fully supported in the experiments conducted in Israel and partially in the Netherlands. Additional empirical support for the Involvement Load Hypothesis was provided by Kim (2008) and Keating (2008), who both conducted conceptual replications of this study controlling for time on task and proficiency level. However, according to Martínez-Fernández (2008), these results can be questioned in light of the following methodological limitations: (a) Experimental tasks differed not only in the degree of evaluation (-E/+E/++E), but also in input versus output orientation and amount and quality of information provided with the targeted words in each task; (b) there was no control group; (c) process measures, such as think aloud protocols, were not employed to ensure that tasks induced the involvement load predicted; (d) no pretest was used; instead, the likelihood of target-word familiarity was assessed in a pilot study, and prior knowledge was controlled via a post-exposure questionnaire; (e) targeted items included expressions and words of different classes, so that word type was not held constant; (f) there was no randomization of participants; and (g) retention was measured by a production task only. Martínez-Fernández (2008) and Rott (2005) both situated their studies within the Involvement Load Hypothesis; these studies are reported in the next section, which addresses studies employing concurrent data elicitation procedures.
Bird (2012) recently conducted a replication of one (the 5th) of Craik and Tulving’s (1975) ten experiments, in which they compared the effects of shallow and deep encoding tasks on recognition of target lexical items. Whereas Craik and Tulving (1975) investigated native speakers of English exposed to words in English, Bird (2012) included both non-native speakers of English with Arabic as their L1 (n = 24) and native speakers of English (n = 24). According to Bird, the non-native speakers’ first language (Arabic) should not play a role while performing the tasks and subsequent test in English when compared to the native speakers. A total of 120 target words were split into 3 blocks of 40 words each, further divided into 20 higher frequency and 20 lower frequency words. Participants were assigned to each of these lists. All words rotated so that they appeared in all four of the following conditions: Non-semantic yes and no and semantic yes and no. Results replicated those found by Craik and Tulving (1975) in their original study. Deeper processing led to better recognition than did shallow processing for both low and high frequency words, and this was found for both native and non-native speakers. Again, as seen in previous studies, depth of processing was pre-determined by the inherent characteristics of each condition, and no online measures were employed to see whether participants were processing in the expected way. Also, as with Laufer and Hulstijn (2001), these findings are limited to L2 vocabulary.

Summary

In summary, studies that relied on experimental conditions to address the role of DOP in L2 grammatical or lexical development generally appear to provide empirical support for the role of depth of processing (referred to as amount of attention, focused attention, mental effort, elaboration, and shallow or deep processing). At the same time, they all suffer from the major methodological limitation of failing to establish that participants were indeed processing at the level assumed to have taken place in the experimental processing conditions. To gain further insights into these depths of processing, let us now take a look at the SLA studies that did employ concurrent think aloud protocols to probe deeper into the actual depth of processes that were employed by L2 learners while exposed to and interacting with the L2 data.

DOP Operationalized via Concurrent Think Aloud Protocols

Situating her study within the Involvement Load Hypothesis, Rott (2005) explored why certain vocabulary interventions were more facilitative for word learning than others. The quality and quantity (Hulstijn, 2001) of word processing strategies of ten English-speaking participants at the intermediate level of German were recorded to determine the effects on (a) establishing and (b) strengthening lexical form-meaning connections (FMCs), as well as (c) text comprehension. The text was a modified adaptation of Shade for Sale: A Chinese Tale (Dresser, 1994), which was adopted for the study for the following reasons: (a) It provided a clearly developed story line, (b) the story was culturally neutral, and (c) the text length (535 words) was appropriate for third semester learners. To further ensure comprehension, seven words (besides the target words) were glossed. A native speaker translated the text into German. L2 learners read the text enhanced with either multiple-choice glosses (MCGs) or single-translation glosses (STGs). In both conditions the target words (TWs) occurred three more times in the text after the first glossed occurrence. The data analyses suggested that MCGs might have led to more robust and complete FMCs than STGs. According to the think aloud protocols, “strengthening of FMCs seemed to be related to the integration of multiple meta-cognitive and semantic-elaborative resources, the repeated search and evaluation of individual word meanings as well as recursive reading strategies” (Rott, 2005: 95). Weaker FMCs were marked by the use of only metacognitive resources, linear text processing, and a lack of motivation to assign concrete word meaning. Rott also reported that retention (four weeks later) for the MCG group was superior when compared to the STG.
Overall, the results appeared to support the role of depth of processing in L2 readers’ ability to make form-meaning connections and to retain the target words, as espoused by the Involvement Load Hypothesis. Martínez-Fernández (2008) addressed the methodological issues discussed above in Laufer and Hulstijn’s (2001) research design. Forty-five English-speaking

participants enrolled in college-level second-year Spanish language courses completed a pretest and were randomly assigned to one of four conditions that differed in whether or not the input-oriented task included need, search, and evaluation components: Multiple-choice gloss condition, which involved need, search, and evaluation of the meaning of the targeted words (+N, +S, +E); a fill-in condition, which involved need and evaluation (+N, –S, +E); a single gloss condition, which included only the need component (+N, –S, –E); and a control condition. The text was an adaptation of the text used by Rott (2005). In all conditions the targeted words, four concrete and four abstract unfamiliar nouns, occurred four times in the text but were glossed or deleted only in their first occurrences. To reduce guessing, four options were provided in the multiple-choice gloss task: three possible translations and a “don’t know” option. All participants thought aloud while reading, and completed a written retelling task immediately after. Participants’ production and recognition of the targeted words, and their ability to use them in a sentence, were measured immediately after exposure and one week later. Similar to Rott (2005), the think aloud protocols revealed that the involvement load predicted was met by the experimental conditions. In addition, incidental learning was controlled by a post-debriefing questionnaire that asked learners whether they expected to be tested on the glossed words or not. Results revealed that, contrary to the prediction of the Involvement Load Hypothesis, the fill-in task, when compared to multiple-choice gloss and control groups, led to the highest vocabulary development on both the immediate and delayed posttests and the highest reported awareness. No significant difference was found between the fill-in and single gloss conditions in vocabulary development. However, frequency of the targeted words might have played a role in these results.
Therefore, the question of whether fill-in conditions are more conducive to vocabulary development than single gloss and control conditions, where targeted words appear only once, remains unanswered. Finally, no significant difference was found between groups in amount of global ideas (i.e., ideas not expressed by the targeted items) recalled, but the single gloss group significantly outperformed multiple-choice gloss and control groups on amount of local ideas (i.e., ideas expressed by the targeted items). Interestingly, even though the fill-in task required focusing on word form, word meaning, and text content simultaneously to a greater extent than the single gloss condition, there was no significant difference between these groups in either local or global text comprehension. Qi and Lapkin (2001) investigated the role of depth of processing, or what they refer to as the “quality of noticing,” on two English as a second language (ESL) learners’ uptake of feedback during a writing task. One ESL student had a high proficiency level, while the other student had a low proficiency level. During three sessions, participants wrote a story based on a drawing of a crime scene that they were given (Stage 1), compared it with a reformulated or corrected version (Stage 2), and revised it without access to the reformulation (Stage 3).

Participants performed think alouds while addressing the feedback in Stage 2, were video-recorded during all tasks, and data on language-related events (LRE) were collected. An analysis of the think aloud protocols produced by the learners as they interacted with the feedback made on the reformulated version revealed two types of noticing, which Qi and Lapkin referred to as substantive and perfunctory. Substantive noticing episodes were those in which the learner articulated understanding of the feedback received (noticing with giving a reason, which can be viewed as a high level of processing) while perfunctory noticing lacked this reported understanding (“noticing without giving a reason,” which can be seen as a low level of processing). Qi and Lapkin found that substantive noticing by the higher proficiency learner led to greater improvements. Interestingly, the notions of substantive and perfunctory noticing appear to be relatively similar to Schmidt’s postulation of awareness at the levels of understanding and noticing reported in other studies (e.g., Leow, 1997; Rosa & O’Neill, 1999, etc.). Leow et al. (2008) originally set out to address, from a methodological perspective, the simultaneous attention to form and meaning strand of research to determine whether limited attentional resources during input processing compete to be allocated to either form or meaning (cf. VanPatten’s (2004) Primacy of Meaning Principle). Participants were 72 second semester English-speaking students of Spanish who were randomly assigned to one of five experimental conditions: (1) Read for meaning only (control), (2) read for meaning plus circle instances of a lexical item (sol ), (3) read for meaning plus circle instances of the feminine article (la), (4) read for meaning plus circle instances of a masculine object pronoun (lo), or (5) read for meaning plus circle instances of a verbal morpheme (-n). 
Participants read the passage, followed their condition’s directions, and recorded think alouds. Comprehension was assessed by a multiple-choice assessment task, and the results revealed no difference in performances between all five experimental conditions. To probe deeper into the non-significant difference, the think aloud protocols were coded for depth of processing. Leow et al. (2008) reported three levels of processing: (1) simply circling the targeted forms in Level 1, (2) providing a simple pronunciation of the forms, a slight raising of their intonation of the targeted forms, or an occasional comment such as “oh, here is another one” in Level 2, and (3) interpreting or translating the targeted form, whether correct or incorrect, in Level 3. They then hypothesized that, based on VanPatten’s (2004) Primacy of Meaning Principle, participants who had reported not only circling but also deeper processing of the target forms would obtain lower comprehension scores in comparison with those who had not. The data, however, did not show a clear trend in support of this hypothesis, although statistical analyses were not feasible due to the small number of instances. Leow et al. (2008) concluded that the relatively low level of processing of the target forms observed in all experimental groups might explain the non-significant difference in comprehension between the experimental conditions. Low levels of processing seemingly did not create any cognitive overload while processing for meaning.

Morgan-Short et al. (2012) conceptually replicated Leow et al. (2008) in hopes of obtaining more conclusive findings regarding the possible relationship between comprehension and depth of processing. Modifications included increasing the number of participants to 308 and adding a silent (non-think-aloud) group; otherwise, all materials and procedures mirrored those of Leow et al. (2008). They reported a significant positive correlation between depth of processing and comprehension score, with a medium effect size. According to Morgan-Short et al. (2012), this result contradicts VanPatten’s (2004) Primacy of Meaning Principle, which “may predict . . . [that] processing written L2 input for both form and meaning simultaneously is detrimental to comprehension” (p. 24). In the computer-assisted language learning (CALL) strand of research, Hsieh et al. (2015) revisited Hsieh (2008) to qualitatively compare the role of awareness in the L2 development of a morphosyntactic structure (Spanish gustar, “to please”) across two types of medium: a computerized version of traditional face-to-face instruction (C-FTF) versus computer-assisted instruction (CAI). The instructional features of C-FTF were [teacher-centered, +grammatical explanation, exemplars shown without any learner-initiated practice, no feedback] while those for the CAI were [learner-centered, performing a problem-solving task, -grammatical explanation, learner-initiated practice, implicit feedback]. Participants were 13 English-speaking college-level students of beginning Spanish. The original study (Hsieh, 2008) reported no significant difference in performance on either an oral or a written production assessment task between the two instructional conditions at either the immediate or the delayed posttest, but reported substantial gain scores for the CAI condition by the delayed posttest. However, the qualitative data in Hsieh et al. 
(2015) revealed that the features of each instructional medium clearly prompted differential levels of awareness. The C-FTF condition reported substantially more instances of higher awareness, although it was noted that the protocols contained reports that closely resembled the script delivered by the teacher in the video. At the same time, a qualitative difference in type and amount of processing of the targeted structure between the two sets of protocols was observed while coding for level of awareness. A reanalysis of the think aloud data was then conducted to shed some light on this lack of differential performance between the C-FTF and CAI conditions. Three levels of processing were identified: low, medium, and high. The new results revealed a substantially larger amount of verbal reports and more mental and cognitive effort during the experimental exposure in the CAI condition when compared to the C-FTF condition. In other words, participants in the CAI condition needed to put more effort into processing the gustar structure compared to those in the C-FTF condition, who received the grammatical information on its usages. This higher depth of processing could provide one plausible explanation for the non-significant differences in performances between the two groups (cf.
the usual findings that higher levels of awareness correlated with higher performances when compared to lower levels in the awareness studies) and, perhaps more interestingly, for the superior gain scores on the delayed posttests for the CAI group reported in the original study (cf. Rott, 2005, for similar findings on lexical items). I shall return to this plausible explanation for superior retention later.

Summary

The preliminary evidence gleaned from concurrent data supports the overall findings reported in non-concurrent studies that depth of processing has a facilitative effect on several aspects of L2 grammar and lexical learning in the early stages. The attentional conditions with the assumed higher amount of attention (Shook, 1994), focused attention (Gass et al., 2003), or substantive noticing (Qi & Lapkin, 2001) all led to better performances. On the other hand, there is still some inconsistency between studies addressing depth of processing of lexical items within the Involvement Load Hypothesis (Laufer & Hulstijn, 2001; Keating, 2008; Kim, 2008; Martínez-Fernández, 2008; Rott, 2005). Based on think aloud protocols, Qi and Lapkin reported that a higher level of processing was more beneficial when compared to a lower level. Hsieh et al. (2015) employed an expanded list of criteria (cf. below) to code depth of processing and found that it played an important role in retention, as evidenced on both delayed oral and written production assessment tasks. Very similar findings were reported in Rott (2005) regarding lexical items. While Leow et al. (2008) reported qualitatively that participants who reported having processed the target form at deeper levels did not have lower comprehension scores than those who had not, Morgan-Short et al. (2012) replicated this study with a larger sample of participants and reported that participants demonstrating a higher level of processing performed significantly better in text comprehension, with a medium-size effect, when compared to lower levels of processing. Whether based on concurrent or non-concurrent data, the results indicate overall that higher levels or greater depths of processing appear to be correlated with higher levels of performance. 
These findings fall neatly in line with the strand of awareness research (e.g., de la Fuente, 2015; Leow, 1997, 2000; Medina, 2015; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007, etc.), with Wickens’s (1989, 2007) notion of effort invested that represents the amount of mental effort a learner puts in when performing a task, and also with the capacity theories that include the notion of mental effort related to task difficulty employed during processing. Accompanying the amount of mental or cognitive effort is a deeper level of processing that should lead to more robust learning and retention. Doesn’t this sound like our perennial admonishing of our students to “try harder”?

Depth of Processing and Levels of Awareness

Depth of processing, not surprisingly, is closely aligned to the awareness strand of research (e.g., de la Fuente, 2015; Hsieh et al., 2015; Leow, 2001; Medina, 2015; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007), in which concurrent data are gathered in a methodological effort to gain insights into and/or establish the cognitive processes employed by L2 learners during exposure to or interaction with L2 data. Protocols have revealed that awareness at the level of understanding is typically associated with hypothesis testing, rule formation, and conscious activation of prior knowledge (cf. for example, de la Fuente, 2015; Hsieh et al., 2015; Leow, 1997, 1998, 2001; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007), which are all indicative of a high level of processing (cf. Leow, 2012; Leow et al., 2008; Martínez-Fernández, 2008; Morgan-Short et al., 2012; Qi & Lapkin, 2001; Rott, 2005). Lower levels of awareness also appear to be associated with some level of processing that does not reach the depth evidenced at awareness at the level of understanding (for example, making comments on targeted items; cf. Leow, 2001; Bowles, 2003). For quite some time I have been aware of the correlation between levels of grammatical awareness and depth or levels of processing after exposure to hundreds of think aloud protocols. To this end, I revisited the awareness strand of research and associated the levels of awareness reported in Leow (1997) with corresponding levels of processing (Leow, 2012). Based on the think aloud protocols, there was a clear distinction in the amount of cognitive effort, elaboration, and time spent processing the targeted items in the input, as seen in this table taken from Leow (2012):

TABLE 11.1 Coding levels of processing and amount of cognitive effort in relation to Leow’s (1997: 480) data on levels of awareness

Awareness at the Level of Understanding
• 4 down (mumble) tu so dormir is irregular in the third person so that’s gotta be durmió with a u
• mmm alright, the stems are changing, from e to i and ah o to u . . .
[high level of processing: hypothesis testing and rule formation; deeper amount of cognitive effort]

Awareness at the Level of Reporting
• and the verb to go is ir . . . oh cool, so that corrects number 24 across, repitieron, so you find out that’s ir OK . . .
• so 11 horizontal would no longer be mentieron and is now mintieron, so I have to remember that (changes mentieron to mintieron) . . .
[medium level of processing: some deeper cognitive effort]

Awareness at the Level of Noticing
• 1 down divirtieron . . .
• 12 down opposite of no is sí (changes corregió to corrigió) . . .
[low level of processing: minimum cognitive effort]

Leow (2012: 121)

Here are some other concrete exemplars of the data coded for the three levels of awareness (at the level of noticing, reporting, and understanding) taken from the study conducted by Rosa and Leow (2004). Note that the grammatical information (Spanish past conditional) is novel to these learners, and I have correlated these levels with levels of processing and cognitive effort.

Awareness at the level of understanding

(Learner reads sentence in Spanish) . . . prehistoric men discovered fire . . . and the second part is now we would have a lot of cold in the houses . . . so tendríamos is in the conditional, so the si clause has to be in the subjunctive . . . and if the man had not discovered fire . . . it’s probably past context . . . we would be cold in the house . . . that makes a lot of sense, so I will try that one . . . drag it in there . . . feedback and I was correct . . . Ok, moving on to the next puzzle . . . this summer I’m not going to have vacations . . . I would go to Florida without . . . ok, and the si clauses are . . . if I had a vacation soon I would go to Florida . . . and I think it’s a present context; it’s not the past, so I’m not gonna do the one with hubiera, I’m gonna do tuviera . . . and clicking for feedback, and it’s correct. [high level of processing and cognitive effort: hypothesis testing and rule formation ]

Awareness at the level of reporting

Cristóbal Colón llegó en América in 1492 pero su intención was to travel to India . . . hoy, today they would speak Spanish en español in India . . . if . . . si . . . I don’t ever think it’s hubiera, so I’m gonna go with . . . si Colón viajara en ese país y no a América . . . (fits the piece) . . . no, I was wrong . . . I don’t know why . . . no idea why it’s wrong, but it must be this one (fits the piece), yes, that’s right . . . si Colón hubiera viajado a ese país y no a América. Mi hermana Isabel sabe hablar Italiano . . . my sister Isabel can speak Italian . . . knows how to speak Italian . . . podría . . . would be able to communicate without problems with Italians . . . that’s the second part and that’s in the conditional . . . so . . . so 3 and 4 . . . ah hubiera . . . is it that? spent . . . her next vacation in Italy . . . I’m gonna go with si Isabel pasara sus próximas vacaciones en Italia . . . good! I’m on a roll . . . si Isabel pasara sus próximas vacaciones en Italia . . .

[medium level of processing: medium cognitive effort but no hypothesis testing or rule formation]

Awareness at the level of noticing

(Learner reads sentence in Spanish) . . . gave a concert tonight . . . ah . . . I would . . . could it be this one? . . . hubiera? . . . I don’t think so, but let’s try . . . I don’t understand that hubiera thing . . . I’m really lost . . . si Pearl Jam . . . (fits piece) . . . there we go . . . (writes sentence). Number 7 . . . la película Titanic won many Oscars . . . alright . . . would not be as famous today . . . ah . . . si Titanic ganara menos oscars . . . I don’t think we need to use hubiera, but let’s see . . . (fits piece) . . . and that’s incorrect . . . hubiera ganado . . . so . . . (fits piece) . . . ok, that’s right, there we go [low level of processing: low cognitive effort ]

As can be seen, concurrent data reveal that as the level of awareness rises, so too does the level of processing or amount of cognitive effort and elaboration. Awareness at the level of noticing is usually correlated with a low level of processing and cognitive effort, and awareness at the level of reporting with a medium level of processing and cognitive effort, while at the level of understanding a very high level of processing and instances of hypothesis testing and rule formation are usually reported. Similar associations can be found in many other awareness protocols. Depth of processing, then, may be closely tied to levels of awareness and can be used to account for the statistically superior performances reported for participants evidencing higher levels of awareness or learning explicitly when compared to lower levels (e.g., de la Fuente, 2015; Leow, 1997, 2000, 2001; Martínez-Fernández, 2008; Medina, 2015; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007). Indeed, as reported in Chapter 10, even the studies that have addressed implicit learning have also reported superior results for participants coded as aware (e.g., Hama & Leow, 2010; Leung & Williams, 2011; Williams, 2004, 2005) when compared to unaware learners.
One recent study (Calderón, 2013) investigated the relationships between learner proficiency, depth of processing, levels of awareness, and learners’ intake of linguistic items contained in aural input. Participants were 24 L1 English learners of university-level first and third semester Spanish who were exposed to an aural passage in Spanish with the complex past perfect subjunctive and then immediately performed simultaneous concurrent verbal protocols to measure depth of processing and levels of awareness while completing a multiple-choice recognition test to measure intake. 
While results of a repeated measures ANOVA revealed that there were no significant main effects for proficiency (most likely due to the complex linguistic structure and limited amount of time during exposure, 2 min and 20 s), intermediate participants showed more awareness at the level of
understanding than did low proficiency participants, and the intermediate group also had significantly lower depth of processing. Furthermore, there were positive relationships in the low proficiency group between high depth of processing, levels of awareness, and intake; in the intermediate group, the only significant positive relationship was between high depth of processing and intake. According to Calderón (2013), while depth of processing appears to play a facilitative role at the intake stage of L2 Spanish learning of the past perfect subjunctive when the input is aural, the results highlight the ultimate importance of awareness. In other words, learners may process deeply, but they may need to achieve a higher level of awareness in order to take in successfully the linguistic information. The roles that these two variables play appear to depend on learner proficiency. More specifically, although the low and intermediate proficiency participants all demonstrated depth of processing, they did so to varying degrees. Intermediate participants showed more instances of awareness at the level of understanding than did participants of low proficiency, and the intermediate proficiency group also had significantly fewer instances of low depth of processing than did participants of low proficiency (cf. Gass et al., 2003, who reported that focused attention seemed to have taken a diminished role due to language experience). Furthermore, in the low proficiency group, there were positive relationships between high depth of processing, levels of awareness, and intake; in the intermediate proficiency group, the only significant positive relationship was a very strong one between high depth of processing and intake. 
Overall, it is plausible that low proficiency learners have more difficulty attaining higher levels of awareness when exposed to aural input due to the potential for more cognitive overload when compared to intermediate proficiency learners, as seen in the dominant use of low level of processing. Intermediate learners, on the other hand, may have more resources to tackle more complex grammar and to employ higher levels of awareness. Because they attain higher levels of awareness more than low proficiency learners, there is no need for them to also employ high depth of processing. When they do attain a high depth of processing, however, it appears to provide a boost in regards to what they are able to take in. The significantly fewer instances of low depth of processing in the intermediate proficiency group appear to indicate that once awareness at the level of understanding is achieved, depth of processing logically decreases. Calderón (2013) reported the case of a participant of intermediate proficiency. Within a concurrent verbal report of 588 words, the participant achieved awareness at the level of understanding after 297 words. Before that point, the participant had 11 instances of depth of processing, 9 of which were low and 2 of which were high. After achieving awareness at the level of understanding, the participant showed only three more instances of depth of processing, all of which were at a low level. This hints that once awareness at the level of understanding is reached, high levels of depth of processing are not only unnecessary but also infrequent. Level of proficiency logically leads to the role of prior knowledge in depth of processing.

The Role of Prior Knowledge in Depth of Processing

Concurrent TA data appear to reveal a relationship between depth of processing, level of awareness, and the role of prior knowledge or lack thereof. Leow (1998) reported the use of conceptually-driven processing (cf. Robinson, 1995), as demonstrated by participants who were presented with a second exposure to the L2 input. Participants who verbalized the underlying morphological rules during the first exposure were precisely those who appeared to have relied on some kind of prior knowledge during the second exposure. Differential performances based on this type of processing, as compared to data-driven processing, appeared to be associated with awareness at the level of understanding and processes of elaboration typically found in a high depth of processing. Qi and Lapkin (2001) observed deeper processing or “quality of noticing” in their higher proficiency participant, while TA data from de la Fuente (2015) also revealed type of processing in relation to the type of feedback (L1 or L2). De la Fuente reported differential depth of processing and processing time due to the presence or absence of activation of appropriate prior knowledge of the targeted form, Spanish passive se (cf. similar findings in the outliers’ performances reported in Leow, 2001, and Bowles, 2003). While the conceptually-driven processing was based on the activation of prior knowledge of the target forms during experimental exposure in Leow (1998), another study (Leow, 2000) reported TA data revealing that some participants were activating prior knowledge of a previously learned verbal form to assist in selecting the targeted form on the posttest. 
To demonstrate the role of prior knowledge in one of his participants’ performances on the recognition task, Leow provided the following think aloud protocol in which the participant was clearly using his prior knowledge of the stem change of the targeted verb in the present tense together with a strategy of elimination to arrive at the correct item on the multiple-choice test, the preterit form. Note that performance on these items is not related to the experimental exposure. 3. . . . dormirse, duer-, du- it is not B (se dormeron), it is not A (se dormieron), there is no ‘i’ in C (se durmeron), so it is definitely D (durmieron) . . . 6. . . . dormirse, OK duermo, it is definitely not B (dormió), remember dormir is a spelling changing verb, so and there is an -ió, so it is C (durmió) . . . In sum, it appears that this type of deep processing together with a high level of awareness involves elaboration rehearsal that incorporates deeper analysis of the item such as activation of prior knowledge and meaningful analysis.

Inhibitory Effects of Depth of Processing

The overall positive correlation observed between levels of awareness and depth of processing needs to be tempered by the potential inhibitory effect of deep processing in achieving awareness at the level of understanding. In all of these studies some type of implicit feedback was provided, which, if processed deeply
and correctly, often led to a higher level of awareness that was typically tied to hypothesis testing, successful rule formation, and successful intake and learning of the targeted linguistic information in the input. However, it is informative to note that while a deeper level of processing, accompanied by greater cognitive effort and elaboration, appears to correlate with higher levels of awareness, it does not necessarily lead to a higher level of awareness, for example, at the level of understanding. If the linguistic information is complex, cognitive overload may occur and lead to misunderstanding of, and confusion about, the underlying rule(s). To exemplify this distinction, let us take a look at the following protocol (coded +high depth of processing) taken from an adult L2 learner as she was navigating an experimental path that provided two options, of which one was correct (so implicit feedback was provided), to continue down the path and complete a sentence (Hsieh, 2008), together with extracts from her immediate oral and written posttests. The immediate posttests, sentence completions, are relatively similar in format to the treatment phase, without the implicit feedback provided. The targeted item was the problematic Spanish verb gustar, “to like,” that has several levels of complexity. The unique aspect of this verb is that the subject of the verb is the thing liked, not the person, so a sentence like “I like the houses” is translated as “The houses please to me.” In the first protocol she is processing sentences like this one in which the focus is on yo, “I,” versus the indirect object pronoun (IO) me, “to me,” and on verbal agreement (V).

Protocol I am doing a computer program . . . Hmmmm, I have to click the right one to get to continue on. Oh, I have to finish this sentence, ok . . . yo meeee . . . gusta? Yeah . . . I want to go into something else . . . XXX. Yo me gusta . . . la biología? Thank God they repeat this multiple times so I can get this. Me gusta la fresa. Yo me gusta el futbol. Yo me gustan . . . oh, it’s multiple houses in the picture, so it’s gustaNNN las casas. Me gustaNNN las matematicas? Yeah! There are two tacos, so . . . me gustan los tacos. Me gustan las papas fritas. Me gustan los huevos. [Comment: so far, so good. As you note, after making the typical error of using the personal pronoun yo ‘I’ to say “I like something,” she becomes aware of the underlying verbal agreement, describes it, and then proceeds to apply it in subsequent exemplars. One happy participant.] The next section is a more problematic level of the gustar structure since a sentence like “Juan likes the apple” is translated as A Juan le gusta la manzana, “To John to him pleases the apple.” . . . A Juan le gusta la manzana. A (emphasized) Pedro le gusta el espanol. A (emphasized) Maria le gusta el tenis. A (emphasized) Carmen le gusta la química. A Pedro le gusta . . . this is good; the repetition helps, I think. A Juan le gustaNN las hamburguesas . . . 18, . . .

[Comment: She notices the preposition a (Prep), makes no further report on it, and is highly aware of the verb agreement between the object of the liking and the verb. Still one happy camper.] This is the most complex aspect of the structure. “John and Maria like the apple” is translated as A Juan and a Maria les gusta la manzana “To John and to Maria pleases the apple.” A Juan y a Maria le gustan. No, le gusta??? Why?! What’s wrong with that? Oh! OK, so the plural is for the people, not for the plurality of the thing. A Pepe y a Belen le gusta, le gusta.. yeah!. .. la física. A Elena y a Marta le gusta el pollo. A Pedro y a Andres le gustan, no? Le gusta el queso; that’s confusing, b/c there are multiple cheeses in the picture so it makes me think it should be plural. That was fun! Oh well . . . ah, A Carmen y a Carlos le gusta el taco. Ummm . . . A Juan y a Maria le gustan las naranjas. A Laura y a Maria les gustan las rosas. Yeayyyyyyyy! Finished!!!! [Comment: She appears to be relatively in control with the verb agreement (V), she has not commented on the preposition (Prep) and does not appear to have noticed the indirect pronoun (IO) le, “to him, to her,” versus les, “to them.”] OK, oral production [Comment: implicit feedback has been removed and she now has a series of pictures to describe.] no. 1. Ummm. A Ana y Lucia le gusta el pollo. [misses second Prep and incorrect IO] 2. A Ana y Teresa le gusta la pizza. [misses second Prep and incorrect IO] 4. um. Yo gusto, yo gusta el tomate. [incorrect IO, back to the original error of translating I with yo] 6. a Pepe y Carmen llllle gusta la geografia. [misses IO, incorrect IO but emphasizes IO] 7. a yooo.. oh! Me gusta . . . las . . . me gustan? No, me gusta las hamburguesas. [notices incorrect yo and corrects it, comments on V] 8. A Juan le gustan las peras. No, a Juan le gusta las peras. [applies incorrect verb rule for the next several ítems] . . . 19. me gusta las matematicas. [incorrect V] 20. 
me gusta la astro . . . oh, no, me gustaNNN las matematicas. Me gusta la astronomia. [notices V in matemáticas and begins to apply V rule correctly for the most part later] 22. a Carlos y Maria le gustan le.. las bananas. [misses second Prep, incorrect IO] [Comment: note the behavior changing due to the ítems on the posttest] ok, now written part. Elena y Marta . . . le gusta el pollo. Hmmm. I can’t remember in my head if the subject conforms or the verb. Hmmm!!! I can’t remember whether I make the part that says gusta or gustan according to the thing that is being talked about or the people. Or whether it’s the le . . . Hmmm!!!! XXX in Spanish getting mixed up! OK. We’ll just go on. [CONFUSION] Ummm. Me . . . gustan las hamburguesas. A Ana le gusta la rosa . . . chicle, what is chicle? A Lisa y Maria le gustan . . . le . . . XXx maybe its just gusta b/c it’s geografia. I think that’s it. A Marta le, so we leave out the ‘s’ cuz it’s just her, gustan cuz it’s multiple pineapples, las pinas. [Comment: Here she has both rules correct.] Pedro le gusta la fresa? A Juan le gustan los pimientos. Ok// gustan . . . XXX A Pedro y a XX les gustaNN las cervezas. OK, at least I am consistent now; I don’t know if it’s right or not. But we shall see . . .


Umm. Vero y Josefina le gusta la pizza. Me gusta la sandia. Me gustan . . . las XXX oops, las zanahorias. Me gusta la musica. A Juan le gusta el futbol. It’s always gusta, it’s never gusto; is that different from . . .? maybe that’s . . . oh boy, is the yo form . . . gusto? Doesn’t the yo form always end in o? oh, I don’t remember . . . A Juan y Maria le gustan las bananas. Me gusta, me gustan los aguacates. Umm a Maria le gusta el tenis. Aaaagh XXX. A Pedro y Andres le gustan el queso. A pedro . . . le gustan los huevos. A Juan, yeah, . . . le gustan las naranjas. Le gustan las palomitas de maiz. Luis y Maria le gustan las uvas. A Maria le gustan los tomates. A Pablo y a Juan le gustan la cerveza. Le gustan los tacos. Pablo le gusta el italiano. A Vero y a Jose le gustan los limones. OK. Finished.

As can be seen, there is considerable confusion here and some uncertainty regarding several of the underlying rules governing the gustar structure. She is clearly employing hypothesis testing and rule formation for a few of the underlying rules but does not seem to have a strong grasp of these rules. Processing one rule deeply was relatively manageable, but processing several underlying rules at the same time appeared to create a cognitive overload. In other words, she was finding it quite challenging to use what she had previously learned to facilitate processing of the more complex levels of this problematic structure. Consequently, while the amount of cognitive effort was high, she was achieving only partial awareness at the level of understanding of a few of the underlying rules.

Levels of Awareness Versus Depth of Processing

Given the relative correlation between levels of awareness and depth or levels of processing, the logical question is the following: What differentiates the two? The difference lies at the top of each scale: awareness at the level of understanding versus the highest depth of processing, which is characterized by amount of cognitive effort together with elaboration that includes activation of prior knowledge and evidence of hypothesis testing and rule formation. As demonstrated above, grammatical TA protocols have revealed that a learner may process L2 grammatical data deeply, expend much cognitive effort and time in attempting to understand the targeted item, and even make hypotheses, yet fail to arrive at an accurate or full understanding of the underlying rule. While protocols have generally revealed that the low and medium depths of processing L2 grammatical data correlate quite well with the two lower levels of awareness (noticing and reporting), processing at a high level does not automatically lead to awareness at the level of understanding (the underlying rule). In other words, demonstrating awareness at the level of understanding does imply a high depth of processing [+awareness at the level of understanding, +high level of processing], but the reverse does not hold. A learner’s protocol evidencing a high depth of processing but failing to arrive at a full understanding of the underlying rule would be coded [-awareness at the level of understanding, +high level of processing]. Let us now discuss the benefits of applying a DOP coding perspective to TA protocols rather than coding for levels of awareness.
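The two-part code described above can be sketched as a small data structure. This is only an illustrative sketch, not part of the coding procedure reported here: the names `Depth`, `ProtocolCode`, and `label` are hypothetical, and the only rule it encodes is the one-way implication stated in the text (understanding implies a high depth of processing, but not the reverse).

```python
from dataclasses import dataclass
from enum import Enum


class Depth(Enum):
    """Depth of processing assigned to one coded think-aloud episode."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ProtocolCode:
    """Hypothetical record pairing depth of processing with understanding."""
    depth: Depth
    understanding: bool  # True only if the underlying rule is stated accurately

    def __post_init__(self):
        # One-way implication from the text: understanding presupposes
        # a high depth of processing; high depth alone guarantees nothing.
        if self.understanding and self.depth is not Depth.HIGH:
            raise ValueError("understanding implies a high depth of processing")

    def label(self) -> str:
        u = "+" if self.understanding else "-"
        d = "+" if self.depth is Depth.HIGH else "-"
        return (f"[{u}awareness at the level of understanding, "
                f"{d}high level of processing]")


# A learner who tests hypotheses at length but never reaches the rule:
print(ProtocolCode(Depth.HIGH, understanding=False).label())
```

Keeping the two dimensions as separate fields, rather than a single scale, is what allows the [-understanding, +high processing] case to be represented at all.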


Benefits of Adopting a DOP Perspective to Code TA Protocols

There would be several benefits if TA protocols were initially coded for depth of processing instead of levels of awareness. The first lies in the findings reported in Calderón (2013) that “depth of processing is facilitative at the intake stage of L2 Spanish development but that awareness is even more important” (p. 104). In other words, (depth of) processing is primary, which may then lead to levels of awareness that may further facilitate the process of learning.

The second benefit lies in its comprehensive ability to address both grammatical and lexical items. Level of awareness cannot fully account for lexical processing given that there is usually no underlying rule per se for most underived words. Coding lexical protocols, then, is relatively straightforward since it is usually based on a partial or full form-meaning connection, and learners typically do not process targeted lexical items as deeply as they do grammatical items while reading a text primarily for informational content (unless manipulated as in Rott, 2005). Thus, if we were to take this direct form-meaning connection into consideration, achieving an accurate connection with a new word would be coded for high depth of processing provided that this level of processing is supported by evidence of elaboration or deep analysis having taken place to arrive at the connection. An immediate direct translation, which shows no such elaboration, would be coded [+low depth of processing].

A third benefit is related to the dichotomy of implicit versus explicit learning, or whether awareness plays a role in learning. It appears more feasible to view this dichotomy first from the perspective of how the learner is processing the L2 data and then, based on level of processing, to assign the type of learning (implicit or explicit) taking place. DOP can then provide information on whether the learner demonstrates any level of awareness.

A fourth benefit lies in the argument that coding for depth of processing may be a less thorny approach to coding learner cognitive behavior than coding for the construct of awareness. A fifth benefit may be that, in addition to online verbal reports, eye-tracking and reaction time procedures may be employed to address learner depth of processing once certain criteria, as with TAs, are well established and tested over time.

The benefits of adopting a DOP perspective to address the issue of type of learning are theoretical, methodological, empirical, and pedagogical. Theoretically, we are addressing both the important roles of attention and cognitive processes employed during L2 exposure. More specifically, we are not centralizing the process of learning on the construct of attention, which clearly plays an important role in the early stages of L2 learning, but placing a focus on the notion of how L2 learners process the L2 data. Methodologically, as I have proposed (Leow, 2015), we can continue to operationalize the construct of awareness either as a dichotomy (aware versus unaware) or as occurring on a continuum, for example, no awareness > (awareness) > no awareness or no awareness > awareness, as reported in Hama and Leow (2010). As a dichotomy, if we assume that the


processes involved in both implicit learning and implicit knowledge are similar, we use depth of processing to establish the threshold at which awareness can be operationalized. More specifically, we employ the same descriptors employed in the SLA literature to describe the accessing of implicit knowledge (e.g., “very low depth of processing,” “minimal cognitive or mental effort,” “automatic,” “unverbalizable,” “rapid,” “initiated without intention,” “resistant to modification,” “does not interfere with other processes,” “is unaffected by other processes”), which all describe an extremely low level of processing, with an absence of cognitive effort employed during processing, to operationalize and measure what comprises implicit learning. Any data, and logically concurrent data, that demonstrate behavioral performance that deviates from such descriptors should serve as an indicator that the threshold of awareness has been crossed at that point of encoding. As I pointed out, such classification easily avoids the major limitation of employing a high level of awareness, usually associated with verbalization of linguistic knowledge, and incorporates the lower levels of awareness reported in the SLA literature. On the other hand, if we view the construct of awareness as occurring on a continuum, depth of processing may address, as suggested by Leow (2000), “whether awareness is deployed as particular items in the L2 are encountered or whether this deployment results from being in a general state of awareness” (p. 573). In addition, depth of processing may also address whether both explicit and implicit processes, as postulated by some researchers (e.g., Schmidt, 1994), operate during input processing. Methodologically, such data may only be elicited concurrently. Pedagogically, we can employ learning activities in our classrooms that are designed to promote deeper processing or, in other words, explicit learning.

Operationalization of Depth of Processing for Lexical and Grammatical Items

Taking several of the benefits discussed above into consideration, two colleagues (Annie Calderón and Ellen Johnson Serafini) and I created from dozens of think aloud protocols the following coding scheme to operationalize depth of processing for both lexical and grammatical items. The coding scheme is presented below:

Operationalization of Depth of Processing (DOP): Lexical Items

Level 1: Low depth of processing
Description: Shows no potential for emerging form-meaning connection
Descriptors: Reads target quickly; Translates the phrase to English but leaves the target in Spanish; Says s/he isn’t sure what it is; Says s/he will click something; Repeats the target item; Carefully pronounces target word; Does not spend much time processing target item; Low level of cognitive effort to get meaning of target item

Level 2: Medium depth of processing
Description: Provides some evidence of processing target item
Descriptors: Spends a bit more time processing target item; Makes a comment that indicates some processing of target item; Some level of cognitive effort to get meaning of target item

Level 3: High depth of processing
Description: Provides evidence of making accurate form-meaning connection
Descriptors: Spends time processing target item; Provides an accurate translation of target item or finds a different way to say almost the same thing; High level of cognitive effort to get meaning of target item
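As a rough illustration of how the three lexical levels relate to one another, the scheme can be sketched as a small decision function. This is a sketch under stated assumptions, not the procedure the coders followed: in practice a human coder reads each protocol against the descriptors, and the three boolean flags below (`accurate_connection`, `elaboration`, `some_processing`) are hypothetical summaries of that judgment.

```python
from enum import IntEnum


class LexicalDOP(IntEnum):
    LOW = 1      # no potential for an emerging form-meaning connection
    MEDIUM = 2   # some evidence of processing the target item
    HIGH = 3     # evidence of making an accurate form-meaning connection


def code_lexical_episode(accurate_connection: bool,
                         elaboration: bool,
                         some_processing: bool) -> LexicalDOP:
    """Assign a depth-of-processing level from three coder judgments.

    All three flags are hypothetical inputs standing in for a human
    coder's reading of the think-aloud protocol.
    """
    # High depth requires an accurate connection that is backed by
    # evidence of elaboration or deep analysis.
    if accurate_connection and elaboration:
        return LexicalDOP.HIGH
    # Any visible processing effort short of that counts as medium.
    if some_processing or elaboration:
        return LexicalDOP.MEDIUM
    # Includes an immediate direct translation with no elaboration.
    return LexicalDOP.LOW


# An immediate direct translation: accurate, but no sign of elaboration.
print(code_lexical_episode(True, False, False).name)
```

The point of the sketch is that accuracy alone does not raise the level; what separates Level 3 from Level 1 is the accompanying evidence of cognitive effort.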

Operationalization of Depth of Processing (DOP): Grammatical Items

Low depth of processing
Level of awareness: Noticing
Description: Shows no potential for processing target form grammatically
Descriptors: Reads target quickly; Translates the phrase to English but leaves the target in Spanish; Carefully pronounces target item; Repeats target item; Says s/he isn’t sure what it is; Does not spend much time processing target item; Low level of cognitive effort to process target item grammatically

Medium depth of processing
Level of awareness: Reporting
Description: Comments on target item in relation to grammatical features
Descriptors: Spends a bit more time processing target item; Makes comments that indicate some processing of target item; Some level of cognitive effort to process target item grammatically

High depth of processing
Level of awareness: ± Understanding (based on accuracy of underlying rule or form-meaning connection)
Description: Arrives at an inaccurate, partially accurate, or fully accurate target underlying grammatical rule
Descriptors: Makes hypotheses regarding target item; Provides an inaccurate, accurate, and/or partially accurate rule; Corrects previous translation; Spends much time processing target item; High level of cognitive effort to process target item grammatically


Note that we have included levels of awareness on the grammatical operationalization chart in order to provide an important contrast between depth of processing and levels of awareness. As discussed above, processing at a high level does not automatically lead to awareness at the level of understanding (the underlying rule), hence the plus or minus code.

Using DOP for Explication in Some Strands of Research in SLA

The concept of DOP may offer a plausible explanation for findings gleaned from other strands of research. Let us select a few popular strands of SLA research (e.g., textual enhancement, experimental learning conditions, and oral interaction/CMC) and take a brief look at this proposition in the sections that follow.

Textual Enhancement

I have discussed this popular strand of research in Chapter 9, so I shall only underscore the pertinent role of depth of processing in relation to noticing and levels of awareness. Based on TAs, Leow (2001) reported no significant difference in amount of noticing between the enhanced and unenhanced groups, while Bowles (2003), based on TAs, and Winke (2013), based on duration of eye gazes obtained on an eye-tracker, reported substantial differences. Leow and Bowles, based on TA data, also suggested one plausible explanation, namely low depth of processing, which was clearly evident in the majority of protocols that included comments such as “I don’t know what that one means,” “Adquiera? I don’t remember that” (Bowles, 2003: 407) and “I am not sure,” “I don’t know why this is underlined” (Leow, 2001: 502; cf. also Leow et al., 2008). As in other studies employing a concurrent data elicitation procedure (Alanen, 1995; Gurzynski-Weiss, Al Khalil, Baralt, & Leow, 2015; Leow, Egi, Nuevo, & Tsai, 2003), no significant difference in performance or relationship between the enhanced and unenhanced groups was reported. Interestingly, in Leow’s (2001) and Bowles’s (2003) studies there were a few high performers in each group (enhanced and unenhanced) whose protocols revealed instances of awareness at the level of understanding or, put another way, a high depth of processing of the targeted items. Bowles (2003) reported:

. . . these two high-scoring participants’ comments revealed deeper levels of processing. It was obvious from the think-aloud data that these participants were actively involved in rule search and hypothesis testing. The thought progression of one participant is clearly demonstrated in his protocol. His first comment on the targeted forms was, “Haga y ponga . . . they are underlined, which means they have some significance . . . and those are


subjunctive forms, I think.” Then, when he saw the next targeted verb, he tried to confirm this hypothesis: “Um, ad-quiera, also looks like it’s in the subjunctive form . . . don’t know what it says.” Then, after reading a few more sentences, he changed his hypothesis, saying, “Oh, maybe this is a command form.” He confirmed this once and for all when he saw the subsequent targeted form: “Duerma . . . um, that’s a command form . . . I feel like they’re telling me to go to sleep . . .” Then, during the post-exposure tasks, he continued to search for a rule, applying the hypothesis he had formulated. (p. 407)

Given the substantial difference in amount of noticing and yet the non-significant difference in performance between the enhanced and unenhanced groups, the role that depth of processing played in learners’ subsequent performance after exposure in these studies is quite clear.

Experimental Learning Conditions

Several studies have provided some type of experimental exposure or instruction to L2 learners and have reported similar performances between experimental groups on the immediate posttest (e.g., Hsieh et al., 2015; Morgan-Short, Sanz, Steinhauer, & Ullman, 2010; Rott, 2005; Sanz & Morgan-Short, 2004) but different performances on the delayed posttest (Hsieh et al., 2015; Morgan-Short et al., 2010; Rott, 2005). Of these selected studies, two employed concurrent data elicitation procedures (Hsieh et al., 2015; Rott, 2005) and two did not (Morgan-Short et al., 2010; Sanz & Morgan-Short, 2004). Hsieh et al.’s (2015) higher processing group was an implicit condition in which participants performed a problem-solving task while receiving implicit feedback. Their lower processing group was an explicit condition in which participants received a grammatical presentation. Rott’s (2005) higher processing group was a multiple-choice glossed condition, while her lower processing group was a single-translation gloss condition. TA protocols were used to code depth of processing. Hsieh et al. (2015; grammatical structure) and Rott (2005; lexical items) reported that both experimental groups performed statistically similarly on the posttest (cf. similar findings in Morgan-Short et al., 2010, and Sanz & Morgan-Short, 2004). However, while the higher processing group retained most of its immediate posttest means on the delayed posttest (2 weeks and 4 weeks, respectively), the lower processing group demonstrated a substantial decrease in its means on this delayed posttest. In Hsieh et al.’s (2015) study, the decreases (higher vs. lower processing group) were –7.49 vs. –25.76 on the oral production test (gain scores +47.66 vs. +26) and –2.66 vs. –28.76 on the written production test (gain scores +38.27 vs. +11.50); in Rott’s (2005) study, they were –5 vs. –35 on the word knowledge test and –10 vs. –35 on the syntactic knowledge test.
In sum, then, deeper processing of L2 data appears to contribute to superior retention of both grammatical and lexical information.
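To make the size of the retention gap concrete, the decreases reported above can be compared pairwise. The short sketch below simply subtracts the lower processing group's decrease from the higher processing group's decrease for each measure. The helper name `retention_advantage` is hypothetical, the figures are copied from the passage, and their units are not stated there, so the results should be read only as differences on whatever scale each study used.

```python
# Decrease in mean score from immediate to delayed posttest, as reported
# in the passage: (higher processing group, lower processing group).
reported_decreases = {
    "Hsieh et al. (2015), oral production": (-7.49, -25.76),
    "Hsieh et al. (2015), written production": (-2.66, -28.76),
    "Rott (2005), word knowledge": (-5.0, -35.0),
    "Rott (2005), syntactic knowledge": (-10.0, -35.0),
}


def retention_advantage(higher: float, lower: float) -> float:
    """How much less the higher processing group lost than the lower one."""
    return round(higher - lower, 2)


advantages = {
    measure: retention_advantage(higher, lower)
    for measure, (higher, lower) in reported_decreases.items()
}

for measure, advantage in advantages.items():
    print(f"{measure}: {advantage}")
```

On every measure the difference is positive, which is the pattern the summary sentence above describes: the group that processed more deeply lost less between the two posttests.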


If this were the case, then given the relatively similar learning conditions (implicit vs. explicit) between Hsieh et al.’s (2015), Morgan-Short et al.’s (2010), and Sanz and Morgan-Short’s (2004) studies, it is quite plausible that the role of depth of processing could have also accounted for the non-significant differences in performances on the immediate posttest reported in these two latter studies that did not gather concurrent data. Similarly, the superior performance reported for Morgan-Short et al.’s (2010) implicit training condition on the delayed posttest could also be explicated by the similar findings reported in Hsieh et al.’s (2015) and Rott’s (2005) studies, with the assumption that implicit learning conditions providing implicit feedback may promote deeper processing of the L2 data.

Oral Feedback Strand/CMC

Finally, in the oral feedback strand, depth of processing may also account for differential performances demonstrated by participants. Let us take a classic interaction scenario from Oliver and Mackey (2003), in which a recast is provided.

STUDENT: Why did you fell down?
TEACHER: Why did you fall down?
STUDENT: Fall down, yes.

This recast could have been processed at different depths. Did the student perceive the recast to be a correction of his/her error? The “yes” appears to answer this question. Did processing stop or continue at this point? If processing was abandoned, then s/he processed the recast at a low depth of processing, and the chances of addressing the error in the future might not be strong. However, if s/he spent some more cognitive effort to probe deeper into the source of the error and perhaps arrived at the solution to the error, then one can assume that depth of processing was higher, the content of the utterance remained in working memory, and the chances for addressing any potential error or avoiding this type of error were greater. The literature on the effects of recasts provided within an interaction (e.g., Leeman, 2003; Mackey & Philp, 1998; Philp, 2003) has generally reported beneficial effects on L2 development, although such development is not reported for all participants receiving recasts. Perhaps depth of processing could have accounted for those who performed well or not so well on subsequent assessment tasks or production during the interaction.

Given that it is quite challenging to gather concurrent data during oral interaction, let us take a look at learner behavior (via screen recording videos) involving recasts provided during computer-mediated communication (CMC; Baralt, 2013) to probe a bit into the depth of processing employed by L2 learners during interaction in this medium. To investigate why cognitive complexity was experienced so differently in face-to-face (FTF) versus CMC conditions, Baralt (2013) conducted


a follow-up analysis, looking at the qualitative nature of interaction and feedback provision for each experimental condition. She noted interesting features about the provision and reception of recasts alongside simple versus cognitively complex tasks. For the better performing group, the screen-recorded videos revealed that some participants were processing the recasts more deeply by attempting to write out the past subjunctive (the targeted structure), then moving their mouse upwards or scrolling up to examine formerly provided feedback before sending their message. Baralt provided the following demonstration of this behavior:

(1) tuvo tuvieron tuviera [scrolled up to confirm form] tuvieran [sent message]

This participant wrote the past tense indicative and singular form of the verb tener (to have): tuvo. He then erased it (represented here with the crossed-out text), and typed out the plural verb form, still indicative: tuvieron (they had). He then erased the indicative morphological ending, replacing it with subjunctive morphology. Before sending the message, the participant scrolled up quickly to view a past subjunctive form (from a previous recast) for comparison. After seeing that it was correct, he scrolled back down, added the plural morpheme -n, and then sent the message to the researcher. This example indicates that learners in the CMC-C group had more time as well as more attentional resources available during the task in order to test their hypotheses about a form and confirm the accuracy of their production. (Baralt, 2013: 717)

Conclusion

The concept of depth of processing in relation to attention, noticing, levels of awareness, cognitive effort, elaboration, deep analysis, and prior knowledge is quite intriguing given the broad empirical support for its effectiveness in both cognitive psychology and SLA (e.g., the awareness strand of research), especially in relation to retention of both grammatical and lexical items several weeks after the experimental exposure. Put simply, it not only offers an explanation of why some of us are able to perform much better, and for much longer, than others, but may also underscore the importance of acknowledging its role in L2 processing and of promoting our students’ deeper processing of the L2 data to which they are exposed in the formal setting.

References

Alanen, R. (1995). Input enhancement and rule presentation in second language acquisition. In R. Schmidt (Ed.), Attention and awareness in foreign language learning and teaching. Honolulu, HI: University of Hawai’i Press. Baddeley, A. (2002). Human memory: Research and practice. Hove, UK: Psychology Press.


Baralt, M. (2013). The impact of cognitive complexity on feedback efficacy during online versus face-to-face interactive tasks. Studies in Second Language Acquisition, 35, 689–725. Bird, S. (2012). Expert knowledge, distinctiveness, and levels of processing in language learning. Applied Psycholinguistics, 33, 665–689. Bowles, M. A. (2003). The effect of textual input enhancement on language learning: An online/offline study of fourth semester Spanish students. In P. Kempchinsky & C. Piñeros (Eds.), Theory, practice, and acquisition: Papers from the 6th Hispanic Linguistic Symposium and 5th Conference on the Acquisition of Spanish and Portuguese. Somerville, MA: Cascadilla Press. Calderón, A. M. (2013). The effects of L2 learner proficiency on depth of processing, levels of awareness, and intake. In J. M. Bergsleithner, S. N. Frota, & J. K. Yoshioka (Eds.), Noticing and second language acquisition: Studies in honor of Richard Schmidt (pp. 103–121). Honolulu, HI: University of Hawai’i, National Foreign Language Resource Center. Chaudron, C. (1985). Intake: On models and methods for discovering learners’ processing of input. Studies in Second Language Acquisition, 7, 1–14. Conway, M. A., Gardiner, J. M., Perfect, T. J., Anderson, S. J., & Cohen, G. M. (1997). Changes in memory awareness during learning: The acquisition of knowledge by psychology undergraduates. Journal of Experimental Psychology: General, 126, 393–413. Craik, F. I. M. (2002). Levels of processing: Past, present . . . and future? Memory, 10(5–6), 305–318. Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11(6), 671–684. Craik, F., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104(3), 268–294. De la Fuente, M. J. (2015). Explicit corrective feedback and computer-based, form-focused instruction: The role of L1 in promoting awareness of L2 forms. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton. Dresser, C. (1994). Shade for sale: A Chinese tale. In The rainmaker’s dog (pp. 223–234). New York: St. Martin’s Press. Ellis, N. C. (2007). The associative-cognitive CREED. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 77–95). Mahwah, NJ: Lawrence Erlbaum. Gardiner, J. (1988). Functional aspects of recollective experience. Memory & Cognition, 16(4), 309–313. Gardiner, J. M., Brandt, K. R., Vargha-Khadem, F., Baddeley, A., & Mishkin, M. (2006). Effects of level of processing but not of task enactment on recognition memory in a case of developmental amnesia. Cognitive Neuropsychology, 23(6), 930–948. Gardiner, J. M., Java, R. I., & Richardson-Klavehn, A. (1996). How level of processing really influences awareness in recognition memory. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 50(1), 114–122. Gardiner, J. M., & Richardson-Klavehn, A. (2000). Remembering and knowing. In E. Tulving & F. I. M. Craik (Eds.), Handbook of memory (pp. 229–244). New York: Oxford University Press. Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum. Gass, S., Svetics, I., & Lemelin, S. (2003). Differential effects of attention. Language Learning, 53(3), 497–546.


Godfroid, A., Boers, F., & Housen, A. (2013). An eye for words: Gauging the role of attention in L2 vocabulary acquisition by means of eye tracking. Studies in Second Language Acquisition, 35, 483–517. Gurzynski-Weiss, L., Al Khalil, M., Baralt, M., & Leow, R. P. (2015). Levels of awareness in relation to type of recast and type of linguistic item in synchronous computer-mediated communication: A concurrent investigation. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), Technology and L2 learning: A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton. Hama, M., & Leow, R. P. (2010). Learning without awareness revised. Studies in Second Language Acquisition, 32 (3), 465–491. Hsieh, H-C. (2008). The effects of type of exposure and type of post-exposure task on L2 development. Journal of Foreign Language Instruction, 2, 117–138. Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). Awareness, type of medium, and L2 development: Revisiting Hsieh (2008). In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton. Hulstijn, J. H. (1992). Retention of inferred and given word meanings: Experiments in incidental learning. In P. J. Arnaud & H. Béjoint (Eds.), Vocabulary and applied linguistics (pp. 113–125). Basingstoke: Macmillan Academic and Professional. Hulstijn, J. H. (2001). Intentional and incidental second language vocabulary learning: A reappraisal of elaboration, rehearsal and automaticity. In P. Robinson (Ed.), Cognition and second language instruction (pp. 258–286). Cambridge: Cambridge University Press. Hyde, T. S., & Jenkins, J. J. (1969). Differential effects of incidental tasks on the organization of recall of a list of highly associated words. Journal of Experimental Psychology, 82 (3), 472–481. Johnston, C. D., & Jenkins, J. J. (1971). Two more incidental tasks that differentially affect associative clustering in recall. 
Journal of Experimental Psychology, 89 (1), 92–95. Karayianni, I., & Gardiner, J. M. (2003). Transferring voice effects in recognition memory from remembering to knowing. Memory & Cognition, 31(7), 1052–1059. Keating, G. D. (2008). Task effectiveness and word learning in a second language: The involvement load hypothesis on trial. Language Teaching Research, 12 (3), 365–386. Khoe, W., Kroll, N. E. A., Yonelinas, A. P., Dobbins, I. G., & Knight, R. T. (2000). The contribution of recollection and familiarity to yes–no and forced-choice recognition tests in healthy subjects and amnesics. Neuropsychologia, 38 (10), 1333–1341. Kim, Y. (2008). The role of task-induced involvement and learner proficiency in L2 vocabulary acquisition. Language Learning, 58 (2), 285–325. Knowlton, B. J., & Squire, L. R. (1995). Remembering and knowing: Two different expressions of declarative memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(3), 699–710. Konstantinou, I., & Gardiner, J. M. (2005). Conscious control and memory awareness when recognizing famous faces. Memory, 13(5), 449–457. Laufer, B., & Hulstijn, J. (2001). Incidental vocabulary acquisition in a second language: The construct of task-induced involvement. Applied Linguistics, 22 (1), 1–26. Leeman, J. (2003). Recasts and second language development: Beyond negative evidence. Studies in Second Language Acquisition, 25, 37–63. Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language Learning, 47(3), 467–505. Leow, R. P. (1998). Toward operationalizing the process of attention in SLA: Evidence for Tomlin and Villa’s (1994) fine-grained analysis of attention. Applied Psycholinguistics, 19 (1), 133–159.


Leow, R. P. (2000). A study of the role of awareness in foreign language behavior. Studies in Second Language Acquisition, 22(4), 557–584.
Leow, R. P. (2001). Attention, awareness and foreign language behavior. Language Learning, 51, 113–155.
Leow, R. P. (2012). Explicit and implicit learning in the L2 classroom: What does the research suggest? The European Journal of Applied Linguistics and TEFL, 2, 117–129.
Leow, R. P., Egi, T., Nuevo, A. M., & Tsai, Y-C. (2003). The roles of textual enhancement and type of linguistic item in adult L2 learners’ comprehension and intake. Applied Language Learning, 13(2), 1–16.
Leow, R. P., Hsieh, H.-C., & Moreno, N. (2008). Attention to form and meaning revisited. Language Learning, 58(3), 665–695.
Leung, J. H. C., & Williams, J. N. (2011). The implicit learning of mapping between forms and contextually derived meanings. Studies in Second Language Acquisition, 33(1), 33–55.
Mackey, A., & Philp, J. (1998). Conversational interaction and second language development: Recasts, responses, and red herrings? Modern Language Journal, 82(3), 338–356.
Martínez-Fernández, A. (2008). Revisiting the Involvement Load Hypothesis: Awareness, type of task and type of item. In M. Bowles, R. Foote, S. Perpiñán, & R. Bhatt (Eds.), Selected proceedings of the 2007 Second Language Research Forum (pp. 210–228). Somerville, MA: Cascadilla Proceedings Project.
McLaughlin, B. (1987). Theories of second-language learning. London, Baltimore: Edward Arnold.
Medina, A. (2015). The variable effects of level of awareness and CALL versus non-CALL textual modification on adult L2 readers’ input comprehension and learning. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), Technology and L2 learning: A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Morgan-Short, K., Heil, J., Botero-Moriarty, A., & Ebert, S. (2012). Allocation of attention to second language form and meaning: Issues of think alouds and depth of processing. Studies in Second Language Acquisition, 34(4), 659–685.
Morgan-Short, K., Sanz, C., Steinhauer, K., & Ullman, M. (2010). Second language acquisition of gender agreement in explicit and implicit training conditions: An event-related potential study. Language Learning, 60, 154–193.
Oliver, R., & Mackey, A. (2003). Interactional context and feedback in child ESL classrooms. The Modern Language Journal, 87, 519–533.
Qi, D. S., & Lapkin, S. (2001). Exploring the role of noticing in a three-stage second language writing task. Journal of Second Language Writing, 10(4), 277–303.
Philp, J. (2003). Constraints on “noticing the gap”: Nonnative speakers’ noticing of recasts in NS-NNS interaction. Studies in Second Language Acquisition, 25, 99–126.
Rajaram, S. (1993). Remembering and knowing: Two means of access to the personal past. Memory & Cognition, 21(1), 89–102.
Rajaram, S. (1996). Perceptual effects on remembering: Recollective processes in picture recognition memory. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 365–367.
Robinson, P. (1995). Attention, memory, and the “Noticing” Hypothesis. Language Learning, 45(2), 283–331.
Robinson, P. (2003). Attention and memory in SLA. In C. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 631–678). Oxford: Blackwell.
Rosa, E. M., & Leow, R. P. (2004). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25(2), 269–292.
Rosa, E., & O’Neill, M. D. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21(4), 511–556.
Rott, S. (2005). Processing glosses: A qualitative exploration of how form-meaning connections are established and strengthened. Reading in a Foreign Language, 17, 95–124.
Sachs, R., & Suh, B. R. (2007). Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 197–227). Oxford: Oxford University Press.
Sanz, C., & Morgan-Short, K. (2004). Positive evidence vs. explicit rule presentation and explicit negative feedback: A computer-assisted study. Language Learning, 53(4), 35–78.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129–158.
Schmidt, R. W. (1994). Implicit learning and the cognitive unconscious: Of artificial grammars and SLA. In N. Ellis (Ed.), Implicit and explicit learning of languages (pp. 165–209). London: Academic Press.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language learning (pp. 3–32). New York: Cambridge University Press.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Shook, D. J. (1994). FL/L2 reading, grammatical information, and the input-to-intake phenomenon. Applied Language Learning, 5(1), 57–93.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible output in its development. In S. M. Gass & C. Madden (Eds.), Input in second language acquisition (pp. 235–253). Rowley, MA: Newbury House.
Truscott, J., & Sharwood Smith, M. A. (2011). Input, intake, and consciousness: The quest for a theoretical foundation. Studies in Second Language Acquisition, 33, 497–528.
Tulving, E. (1966). Subjective organization and effects of repetition in multi-trial free-recall learning. Journal of Verbal Learning and Verbal Behavior, 5(2), 193–197.
Tulving, E. (1983). Elements of episodic memory. Oxford: Oxford University Press.
Turvey, M. T. (1967). Repetition and the preperceptual information store. Journal of Experimental Psychology, 74(2, Pt. 1), 289–293.
VanPatten, B. (2004). Processing instruction: Theory, research, and commentary. Mahwah, NJ: Lawrence Erlbaum.
Wickens, C. D. (1989). Attention and skilled performance. In D. Holding (Ed.), Human skills (pp. 71–105). New York: John Wiley.
Wickens, C. D. (2007). Attention to the second language. International Review of Applied Linguistics, 45, 177–191.
Williams, J. N. (2004). Implicit learning of form-meaning connections. In B. VanPatten, J. Williams, S. Rott, & M. Overstreet (Eds.), Form-meaning connections in second language acquisition (pp. 203–218). Mahwah, NJ: Erlbaum.
Williams, J. (2005). Learning without awareness. Studies in Second Language Acquisition, 27(2), 269–304.
Winke, P. (2013). The effects of input enhancement on grammar learning and comprehension: A modified replication of Lee, 2007, with eye-movement data. Studies in Second Language Acquisition, 35, 323–352.

SECTION 4

Model Building


12 TOWARD A MODEL OF THE L2 LEARNING PROCESS IN INSTRUCTED SLA

In the previous chapters we have discussed both a coarse-grained and a finer-grained framework of the learning process, based generally on the different stages postulated to occur along the L2 learning process. We have discussed the need to view the learning process as one that includes both processes and products. Now, who can remember the three processes and the four products? We have also discussed the theoretical underpinnings and empirical research conducted on the constructs of attention and awareness, together with discussions on the variables of working memory, depth of processing, and prior knowledge and their pertinent research literature. Now is the time to engage in some model building. To this end, Chapter 12 discusses the criteria for a good theory in SLA, including models, frameworks, and hypotheses (cf. VanPatten & Williams, 2007; Mitchell, Myles, & Marsden, 2013, for more elaboration). I then present my proposed model of the L2 learning process in Instructed SLA, followed by a detailed description of each of its stages along the learning process, namely the input processing, intake processing, and knowledge processing stages together with the L2 developing system. The major features of the model will then be presented. Finally, I shall report a recent study that tested empirically the early stages of the model.

Theory Building

As some of you may be aware, we can classify theory building along at least two dimensions, namely deductive (top-down) or inductive (bottom-up). Briefly, a deductive theory proposes some concepts or constructs that are related in a series of propositions assumed to be true without empirical evidence (but they can be tested). These concepts form the axioms or premises of the theory. The next step is to deduce the consequences of the theory by postulating some hypotheses. If these hypotheses can be empirically supported, then the laws and facts of the theory are postulated. An inductive theory, by contrast, progresses from a series of facts and laws until it becomes a theory. Instead of beginning with a series of axioms that are assumed to be true, an inductive theory is strongly based on the findings of empirical studies. Irrespective of the approach to theory building, here are several criteria a good theory should meet:

Criteria for a Good Theory (SLA)

1. applicable: pertinent to the classroom setting, SLA
2. explicit: defines its principal concepts, testable
3. coherent and consistent: relationship between the multiplicity of components, describes the various components and what takes place in each
4. explanatory: lies in the theory’s ability to explain, predict, and stimulate empirical research
5. simple and clear: the greater its simplicity, clarity, and economy, the more communicable it is to the field
6. pliant: capable of modification
7. heuristic: promotes the search for new and more powerful generalizations

A classic example of an SLA theory is the skill acquisition theory (DeKeyser, 2007) derived from cognitive psychology. This theory claims that a set of basic principles common to the acquisition of all skills can account for the learning process in SLA.

Model Building

A model does not need to be explanatory or predictive like a theory, but should share the rest of the criteria listed above, especially the criteria of coherence and consistency. Indeed, the crucial difference between a model and a theory is that the former provides a description of processes and sets of processes and, ideally, how these processes perform and interact among themselves. A classic example is VanPatten’s (2004) model of input processing, derived from cognitive psychology and universal grammar (UG), in which VanPatten describes how adult L2 learners process incoming input during the early stage of the learning process.

A Theoretical Framework

A theoretical framework is structured from a set of broad ideas and theories that help researchers to address issues that confront them. Some researchers seek information from outside their specialized field to provide plausible solutions to the problems they have encountered in their field. A classic example of a framework in SLA is Truscott and Sharwood Smith’s (2011) MOGUL (Modular Online Growth and Use of Language), which is an interdisciplinary, processing-oriented framework of L2 development that integrates Jackendoff’s (1997, 2007) modular view of language with a processing component. In this framework, Truscott and Sharwood Smith seek to address important concepts such as input, intake, and consciousness in SLA. The limitation of a framework is that it does not need to be tested.

A Hypothesis

A hypothesis is a simple idea dedicated to one phenomenon; it can form part of a theory’s predictions and it is testable. The classic example in SLA is Schmidt’s (1990) noticing hypothesis, in which Schmidt postulates that attention and awareness are two sides of the same coin, meaning that L2 intake cannot take place without some level of awareness present.

The model that I am proposing to account for the L2 learning process in Instructed SLA and, more specifically, in the L2 classroom, is clearly inductive, drawing from some tenets of previous theoretical underpinnings in SLA and cognitive psychology, critiques of the empirical findings of the pertinent studies related to the many variables postulated to play a role in this model, and an abundant amount of concurrent data gathered in several studies investigating the cognitive processes employed by adult L2 learners. The overall framework is based on the finer-grained one presented in Chapter 2 that will now be made even more finely grained.

Model of the L2 Learning Process in Instructed SLA

This model is premised on the role of attention in the process of learning an L2; that is, without attention minimally paid to new information in the L2 data, the process of learning anything is highly unlikely to occur, especially in the foreign language classroom context. The model of the L2 learning process in Instructed SLA is displayed on the next page. Paying attention to new L2 information in the input is regulated by several variables that may accompany the allocation of attentional resources, namely the depth of processing or the amount of cognitive effort employed while paying attention, cognitive registration (the process that selects or engages a particular and specific bit of information), and/or the level of awareness of the new information. As can be seen, from input to output there are several stages during which L2 linguistic information contained in the input is processed (cf. Chaudron, 1985; Gass, 1997; VanPatten, 2004) and produced. In the model, there are three major processing stages, namely the input processing stage, the intake processing stage, and the knowledge processing stage. Each stage is elaborated below.

Input Processing Stage

INPUT > attended intake > detected intake > noticed intake

The first stage (input processing) is between the input and the intake of specific linguistic information, and what is taken in (intake) is initially stored in working memory. This stage is largely dependent upon the level of attention (peripheral, selective, or focal) paid to such information by the learner and may be accompanied by depth of processing, cognitive registration, and level of awareness. To this end, this stage is divided into three phases: attended intake, detected intake, and noticed intake.

The first phase indicates that while peripheral attention was paid to some linguistic data in the input, attention was not accompanied by any high level of processing, cognitive registration, or awareness of the data. This attended intake phase may be in line with Chaudron’s (1985) notion of the initial stages of perception of input, although attended intake in this model is clearly premised on an extremely low level of processing. Given that what is peripherally attended to in the input may not necessarily be processed further, this product (attended intake) is most likely to be discarded without any storage in working memory or further processing (Corder, 1967). Studies employing concurrent data elicitation procedures such as eye-tracking (cf. Godfroid, Boers, & Housen, 2013) and concurrent verbal reports (cf. Leow, 2001a) confirm these postulations.

The second phase (detected intake) indicates that some amount of selective attention together with a very low level of processing was minimally paid to the linguistic data. The learner cognitively took note of the new information without any level of awareness. This detected intake phase is in line with Tomlin and Villa’s (1994) notion of detection, that is, cognitive registration without the presence of awareness. In other words, the learner has detected the key unknown grammatical information to be potentially learned but is not aware of doing so. While the potential for storage in working memory and further processing increases in relation to attended intake, it may depend on the learners’ current working memory and whether a higher level of processing or cognitive effort is allocated subsequently to the detected intake (cf. Leow, 2000, 2001a). The claims of implicit learning made by some researchers (cf. Leung & Williams, 2011, 2014) may provide empirical support for this type of detected intake once the large number of exemplars is taken into consideration.

The third phase indicates that linguistic data were attended to and cognitively registered with a low level of awareness by the learner, but still accompanied by a low level of processing. This noticed intake is in line with Schmidt’s (1990) notion of noticing, that is, focal attention accompanied by a low level of awareness. Given the relatively “higher” level of processing and cognitive effort, albeit still at a low level, noticed intake holds the most potential to remain stored in working memory and be made available for further processing that may lead to incorporation into the L2 learners’ grammar system, as reported in the many studies that have addressed the effects of noticing in L2 development via think aloud protocols (cf. Leow, 1997, 1998a, 1998b, 2000, 2001a, 2001b; Leow, Egi, Nuevo, & Tsai, 2003; Martínez-Fernández, 2008; Rosa & Leow, 2004; Rosa & O’Neill, 1999) and eye-tracking (cf. Godfroid et al., 2013; Smith, 2012).

Crucially, while detected intake, noticed intake, and, to a substantially lesser extent, attended intake may all be lodged in working memory and made available for subsequent recognition by L2 learners, they can all be discarded if not minimally processed further (cf. Hama & Leow, 2010; Leow, 2000, 2001b).
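
The three phases of intake described above can be summarized in a small sketch. The Python fragment below is an illustration only and is not part of the model itself: the names `Attention`, `IntakePhase`, `TYPICAL_ATTENTION`, and `classify_intake` are hypothetical, and reducing each phase to two yes/no features (cognitive registration and awareness) is a simplifying assumption, since the model treats depth of processing and awareness as matters of degree.

```python
from enum import Enum


class Attention(Enum):
    PERIPHERAL = "peripheral"
    SELECTIVE = "selective"
    FOCAL = "focal"


class IntakePhase(Enum):
    ATTENDED = "attended intake"
    DETECTED = "detected intake"
    NOTICED = "noticed intake"


# Level of attention that the text associates with each phase.
TYPICAL_ATTENTION = {
    IntakePhase.ATTENDED: Attention.PERIPHERAL,
    IntakePhase.DETECTED: Attention.SELECTIVE,
    IntakePhase.NOTICED: Attention.FOCAL,
}


def classify_intake(cognitive_registration: bool, awareness: bool) -> IntakePhase:
    """Map two binary features onto a phase of preliminary intake."""
    if awareness and not cognitive_registration:
        # Assumption: the text does not describe awareness without registration.
        raise ValueError("awareness without cognitive registration is not modeled")
    if not cognitive_registration:
        return IntakePhase.ATTENDED  # attention only, nothing registered
    if not awareness:
        return IntakePhase.DETECTED  # registered, but without awareness
    return IntakePhase.NOTICED       # registered with a low level of awareness


for registered, aware in [(False, False), (True, False), (True, True)]:
    phase = classify_intake(registered, aware)
    print(phase.value, "<-", TYPICAL_ATTENTION[phase].value, "attention")
```

The sketch deliberately says nothing about whether a given phase survives in working memory; in the model, any of the three may be discarded unless it is processed further.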

Intake Processing Stage

Preliminary intake, which may be attended, detected, or noticed, and which comprises the first exemplar of linguistic data, may be processed in one of two ways depending upon depth or level of processing and/or cognitive effort. It may be accompanied by minimal data-driven processing (cf. Robinson, 1995) that allows the data to be entered into learners’ L2 developing system encoded as a non-systemized chunk of language (cf. Gass, 1997), also referred to as item learning. Not much cognitive effort has been expended in such processing. Subsequent exemplars not accompanied by higher levels of processing may follow this path, forming a collection of encoded discrete data or entities lodged in learners’ L2 developing system. These data may be measured by recognition or, more robustly, simple controlled production assessment tasks of old exemplars (cf. Leow, 2001a).

Preliminary intake can also be processed via another stage of further processing if accompanied by relatively higher levels of processing, such as consciously encoding and decoding the linguistic information (cf. Hsieh, Moreno, & Leow, 2015; Leow, 2012) and conceptually-driven processing (cf. Leow, 1998a; Robinson, 1995). This stage may be accompanied by higher levels of awareness that keep the L2 data alive in working memory in order to facilitate its potential entry and incorporation into the learner’s systemized grammatical system (cf. Leow, 1997; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Sachs & Suh, 2007). This stage of higher processing may occur in at least one of two ways.

In the first, the first exemplar is cognitively linked to prior knowledge of some old related linguistic data that are used to facilitate the encoding and decoding of the linguistic information contained in the preliminary intake (cf. Leow, 1998a; de la Fuente, 2015). Potential activation of prior knowledge is indicated by the magnet that represents both old and new knowledge (see below) in the L2 developing system. An example in Spanish would be learners’ activation of the radical-changing verb forms in the present tense of the third person singular of morir, “to die” > muere, “he dies,” and linking this vocalic stem change to the irregular preterit third person form of the same verb murió, “he died.” This conceptually-driven processing necessitates a higher level of processing, accompanied by a higher level of awareness. Repeated activation of prior knowledge of the same linguistic data will result in a reduction of the level of awareness and depth of processing required to process the L2 data (cf. Calderón, 2013).

The second way may follow the data-driven processing path, in which the linguistic information has been encoded and lodged in the L2 developing system but remains un-systemized (cf. Gass, 1997). As the second or more exemplars are taken in, intake processing may be viewed from two perspectives: activation of old data (as described above) or of new data. With respect to new data at this stage, linguistic un-systemized data recently stored in the L2 developing system may be reactivated by further exposure to the same or related linguistic data (cf. Leow, 1998a).

Dependent upon the depth of processing or amount of cognitive effort and/or levels of awareness, this activation may lead to either implicit or explicit systemized learning of the L2 information, which is then stored in the grammatical system within the system learning component. A low level of processing may potentially lead to implicit restructuring, if necessary, of the L2 information, and implicit systemized learning. However, this kind of processing depends heavily on many factors that include the provision of large numbers of exemplars in meaningful contexts and quite a long period of time to process and internalize the exemplars and to have the knowledge available for subsequent usage.

With regard to explicit learning, as the depth of processing increases to include hypothesis testing and rule formation (cf. Hsieh et al., 2015; Leow, 1997; Rosa & Leow, 2004; Rosa & O’Neill, 1999), so too does the potential level of awareness increase: from awareness at the level of noticing > awareness at the level of reporting > to awareness at the level of understanding (cf. Leow, 2001a). It is important to note that while higher depths of processing hold potential for achieving higher levels of awareness, the correlation does not hold true for all instances. While a higher depth of processing logically involves greater amounts of cognitive effort and elaboration in attempting to, for example, arrive at an underlying grammatical rule or some lexical root, awareness at the level of understanding is only attained when the correct underlying rule is obtained and fully understood (cf. de la Fuente, 2015; Hsieh et al., 2015).

The combination of prior knowledge activation, depth of processing, and potential higher level of awareness allows the linguistic data to be explicitly restructured if necessary and stored in the grammatical system within the system learning component. Awareness at the level of understanding and subsequent automatization of the linguistic data via subsequent multiple exposures and meaningful practice will lead to a sharp reduction in the depth required to process the relevant linguistic data in the L2 input, which may lead to a less important role for awareness and depth of processing during intake processing of previously learned linguistic data (cf. Calderón, 2013). In addition, depth of processing may be variable even during the same exposure.

The second stage of further processing, then, occurs between preliminary intake (attended, detected, and noticed) and the L2 developing system. Crucial variables in this intake processing stage are the roles pertaining to levels of cognitive effort, depth of processing, levels of awareness, number of exemplars, conceptually-driven processing (prior knowledge), data-driven processing, and restructuring. Other variables that may play a role include motivation, individual differences, type of linguistic item, language experience, and so on.
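
The ordering of the three levels of awareness, and the point that awareness at the level of understanding depends on arriving at the correct rule, can be illustrated with a short sketch. It is offered as an aid to reading rather than as part of the model: the names `Awareness` and `awareness_reached` are hypothetical, and treating the levels as cumulative yes/no steps is an assumption made for illustration.

```python
from enum import IntEnum


class Awareness(IntEnum):
    """Levels of awareness, ordered from lowest to highest."""
    NONE = 0
    NOTICING = 1
    REPORTING = 2
    UNDERSTANDING = 3


def awareness_reached(noticed: bool, reported: bool,
                      correct_rule_understood: bool) -> Awareness:
    """Return the highest level reached, assuming the levels are cumulative."""
    if not noticed:
        return Awareness.NONE
    if not reported:
        return Awareness.NOTICING
    if not correct_rule_understood:
        # Effortful hypothesis testing that ends in a wrong rule stops here.
        return Awareness.REPORTING
    return Awareness.UNDERSTANDING


# Two learners who both process deeply; only one arrives at the correct rule.
wrong_rule = awareness_reached(noticed=True, reported=True,
                               correct_rule_understood=False)
right_rule = awareness_reached(noticed=True, reported=True,
                               correct_rule_understood=True)
print(wrong_rule.name, right_rule.name, wrong_rule < right_rule)
```

Depth of processing is left out of the function on purpose: in the model it raises the potential for higher awareness but does not guarantee it.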

L2 Developing System

What is stored in the L2 developing system, then, are two kinds of product or stored linguistic knowledge of what has been processed up to this point in the learning process, namely, un-systemized (discrete linguistic data) and systemized (internalized or learned) data. This separation of internalized data in the system is reminiscent of Gass’s (1997) postulation and accounts for item versus system learning. Accuracy of the product is not of importance at this point, but there is a correlation between higher levels of awareness and greater accuracy (cf. Leow, 1997; Rosa & Leow, 2004; Rosa & O’Neill, 1999).

Knowledge Processing Stage

The third and final process occurs at Stage 5, between the L2 developing system and what is produced by the learner (knowledge processing, e.g., assigning phonological features to the L2 in oral production, monitoring production in relation to learned grammar, etc.). Depth of processing and potential level of awareness may also play a role at this stage, together with the ability to activate (appropriate) knowledge. Speed of activation and appropriateness of knowledge may be observed in the fluency and accuracy of L2 learners’ production. Stages 1 through 5 are representative of the internal learning mechanism of the L2 learner.

Finally, this representation of the L2 learning process is not viewed as linear given that learners’ output may also serve as additional input (as represented by the loop from the output stage back to the input stage). Learners may monitor what they produce or use potential feedback based on what they have just produced as confirmation or disconfirmation of their L2 output. Dependent upon depth of processing or level of awareness, they may reinforce their current knowledge or restructure their current interlanguage.

Here are the key features of the model of the L2 learning process in Instructed SLA:

1. The postulation that it is not the limited attentional capacity that is responsible for any potential breakdown in processing the L2 (at the input, intake, and knowledge processing stages) but learners’ limited processing capacity, hence the potential roles of depth of processing and awareness at all three processing stages.
2. The postulation of three phases of intake that may be taken into learners’ working memory.
3. Awareness does not play an important role at the input-to-intake stage.
4. All phases of intake may disappear from working memory unless further processed.
5. The shift in the centrality of both attention and awareness to the role of depth of processing taking place in the intake processing stage.
6. Higher depth of processing may lead to higher levels of awareness.
7. Activation of two types of prior knowledge (old and new).
8. High depth of processing does not necessarily lead to awareness at the level of understanding.
9. Learning occurs in the internal system.
10. Both implicit and explicit learning are possible, even during the same exposure, with the former dependent upon specific conditions.
11. There are two types of learning: item learning and system learning.
12. The view of the L2 learning process as both processes and products.
13. This representation of the learning process is not viewed as linear given that learners’ output may also serve as additional input.

Testing the Early Stages of the Model of the L2 Learning Process in Instructed SLA

Calderón (2014) sought to test the tenets of my model’s early stages of the L2 learning process by addressing the potential existence of different levels of intake and the role of depth of processing during these early stages. The study also addressed whether type of linguistic item (grammatical versus lexical) played a role. To explicate the roles of level of intake, depth of processing, type of linguistic item, and reactivity in adult L2 learners’ subsequent intake, the study employed both eye-tracking and concurrent verbal reports. Ninety-six beginning learners of Spanish read a text and then completed production, recognition, and comprehension assessments in a pretest/posttest design. Results revealed that (1) there was no reactivity, (2) there were differences in processing type of linguistic item, (3) different levels of intake do appear to exist, and (4) depth of processing not only may play a role in subsequent processing of intake but also appears to facilitate the deeper processing needed for incorporation of intake into the developing system, providing preliminary empirical support for the postulations of the model of the L2 learning process.

Conclusion

As mentioned earlier, the postulations of my model draw from previous theoretical underpinnings in SLA and cognitive psychology, the findings from several empirical studies probing the cognitive processes employed by adult L2 learners as they interact with the L2 data during some kind of experimental exposure, and the vast amount of concurrent data gathered over the last decade and a half. It is sincerely hoped that these postulations do provide the opportunity for validation or refutation and, as Calderón (2014) noted, “the other postulations made in Leow’s model need to be empirically tested” (p. 266). Now, keeping in mind the important role cognitive processes play in the L2 learning process, let us proceed to the next chapter, which will provide some suggestions for developing activities or tasks premised on the use of these cognitive processes.

References

Calderón, A. (2013). The effects of L2 learner proficiency on depth of processing, levels of awareness, and intake. In J. Bergsleithner, S. Frota, & J. Yoshioka (Eds.), Noticing: L2 studies and essays in honor of Dick Schmidt (pp. 103–121). Honolulu, HI: University of Hawaii, National Foreign Language Resource Center.
Calderón, A. (2014). Level of intake, depth of processing, and type of linguistic item in L2 development. (Unpublished dissertation). Georgetown University, Washington, D.C.
Chaudron, C. (1985). Intake: On models and methods for discovering learners’ processing of input. Studies in Second Language Acquisition, 7, 1–14.
Corder, S. (1967). The significance of learners’ errors. International Review of Applied Linguistics, 5, 161–170.
DeKeyser, R. (2007). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 97–113). Mahwah, NJ: Lawrence Erlbaum.
De la Fuente, M. J. (2015). Explicit corrective feedback and computer-based, form-focused instruction: The role of L1 in promoting awareness of L2 forms. In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence Erlbaum.
Godfroid, A., Boers, F., & Housen, A. (2013). An eye for words: Gauging the role of attention in L2 vocabulary acquisition by means of eye tracking. Studies in Second Language Acquisition, 35, 483–517.
Hama, M., & Leow, R. P. (2010). Learning without awareness revisited. Studies in Second Language Acquisition, 32(3), 465–491.
Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). Awareness, type of medium, and L2 development: Revisiting Hsieh (2008). In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Jackendoff, R. (1997). The architecture of the language faculty. Cambridge, MA: MIT Press.
Jackendoff, R. (2007). Language, consciousness, culture: Essays on mental structure. Cambridge, MA: MIT Press.
Leow, R. P. (1997). Attention, awareness, and foreign language behavior. Language Learning, 47, 467–506.
Leow, R. P. (1998a). The effects of amount and type of exposure on adult learners’ L2 development in SLA. Modern Language Journal, 82, 49–68.
Leow, R. P. (1998b). Toward operationalizing the process of attention in second language acquisition: Evidence for Tomlin and Villa’s (1994) fine-grained analysis of attention. Applied Psycholinguistics, 19, 133–159.
Leow, R. P. (2000). A study of the role of awareness in foreign language behavior: Aware vs. unaware learners. Studies in Second Language Acquisition, 22, 557–584.
Leow, R. P. (2001a). Attention, awareness and foreign language behavior. Language Learning, 51, 113–155.
Leow, R. P. (2001b). Do learners notice enhanced forms while interacting with the L2?: An online and offline study of the role of written input enhancement in L2 reading. Hispania, 84, 496–509.
Leow, R. P. (2012). Explicit and implicit learning in the L2 classroom: What does the research suggest? The European Journal of Applied Linguistics and TEFL, 2, 117–129.
Leow, R. P., Egi, T., Nuevo, A-M., & Tsai, Y. (2003). The roles of textual enhancement and type of linguistic item in adult L2 learners’ comprehension and intake. Applied Language Learning, 13, 93–108.
Leung, J. H. C., & Williams, J. N. (2011). The implicit learning of mapping between forms and contextually derived meanings. Studies in Second Language Acquisition, 33(1), 33–55.
Leung, J. H. C., & Williams, J. N. (2014). Crosslinguistic differences in implicit language learning. Studies in Second Language Acquisition, 29, 1–23.
Martínez-Fernández, A. (2008). Revisiting the involvement load hypothesis: Awareness, type of task and type of item. In M. Bowles, R. Foote, S. Perpiñán, & R. Bhatt (Eds.), Selected proceedings of the 2007 Second Language Research Forum (pp. 210–228). Somerville, MA: Cascadilla Proceedings Project.
Mitchell, R., Myles, F., & Marsden, E. (2013). Second language learning theories (3rd ed.). New York: Routledge.
Robinson, P. (1995). Attention, memory and the ‘noticing’ hypothesis. Language Learning, 45, 283–331.
Rosa, E. M., & Leow, R. P. (2004). Awareness, different learning conditions, and L2 development. Applied Psycholinguistics, 25, 269–292.
Rosa, E. M., & O’Neill, M. (1999). Explicitness, intake, and the issue of awareness. Studies in Second Language Acquisition, 21(4), 511–556.
Sachs, R., & Suh, B. R. (2007). Textually enhanced recasts, learner awareness, and L2 outcomes in synchronous computer-mediated interaction. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 197–227). Oxford: Oxford University Press.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Smith, B. (2012). Eye-tracking as a measure of noticing: A study of explicit recasts in SCMC. Language Learning & Technology, 16(3), 53–81.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 471–483). Mahwah, NJ: Lawrence Erlbaum.
Truscott, J., & Sharwood Smith, M. A. (2011). Input, intake, and consciousness: The quest for a theoretical foundation. Studies in Second Language Acquisition, 33, 497–528.
VanPatten, B. (2004). Input processing in SLA. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 5–31). Mahwah, NJ: Lawrence Erlbaum.
VanPatten, B., & Williams, J. (2007). Theories in second language acquisition. Mahwah, NJ: Lawrence Erlbaum.


SECTION 5

Pedagogy


13 TOWARD THE DEVELOPMENT OF PSYCHOLINGUISTICS-BASED E-TUTORS

Now that we have arrived at the pedagogical section of the book, what can we teachers take away from all the theoretical underpinnings and empirical research on so many different aspects of the L2 learning process? We can all agree that the role of input, the L2 data (form-based and/or meaning-based) that learners receive whether in the formal classroom setting, in a naturalistic setting, and/or online, is undoubtedly crucial in the process of second/foreign language (L2) learning. But we also noted in the section on empirical research (Chapters 9, 10, and 11) that how L2 input is presented to L2 learners can have an important impact on the processes learners employ to interact with the input (input processing) if viewed from a psycholinguistic perspective. If we consider the classroom setting as a place to promote our students’ communicative abilities, then we know that the limited amount of exposure to and interaction with the L2 is relatively inadequate to promote deep learning. If we seriously consider the important role played by cognitive processes in the L2 learning process and the impoverished classroom setting, then promoting explicit learning on an individual basis as a stepping-stone to practice communicating in the L2 is not rocket science. Premised on the benefits of promoting more robust L2 learning before actual practice in the classroom setting, this chapter will first provide the rationale for developing psycholinguistics-based e-tutors (computer software that allows learners to practice independently, without the help of a teacher or a peer) to promote our students’ L2 grammatical development, then describe the definition, features, format, and creation of these tasks, and finally provide three samples of psycholinguistics-based tasks.

Rationale for Developing Psycholinguistics-Based E-Tutors

In a review of receptive practice employed in several strands of SLA classroom-based research (e.g., the attention/awareness strand, processing instruction, and skill acquisition), Leow (2007) reported that the empirical findings suggest that providing learners with explicit grammatical information prior to and/or during practice may be positive for SLA in that the levels of accuracy appear to be higher than those found for conditions not exposed to such information. He also noted that receptive practice that incorporates task-essentialness and feedback (discussed below) appears “to reduce the need for such explicit grammatical information prior to or during practice, most likely due to the importance of attending to the targeted structure and feedback on the potential accuracy of the structure” (Leow, 2007: 43). In other words, “increasing the sources of explicit grammatical information and placing them at strategic points during input processing may enhance learners’ capacity to acquire generalizable knowledge from a limited set of input data” (Rosa & Leow, 2004a: 211). Interestingly, the tasks that have proven successful in the SLA literature in promoting deeper learning are all e-tutors, which allow the researchers the additional advantage of specifically manipulating how our students process the L2 input while performing the tasks (e.g., Bowles, 2008; Hsieh, 2008; Hsieh, Moreno, & Leow, 2015; Leow, 2001; Rosa & Leow, 2004a, 2004b; Rosa & O’Neill, 1999; Sanz & Morgan-Short, 2004). One recent meta-analysis (Grgurovic, Chapelle, & Shelley, 2013) of 37 computer-assisted language learning (CALL) studies concluded: “Second/foreign language instruction supported by computer technology was found to be at least as effective as instruction without technology, and in studies using rigorous research designs the CALL groups outperformed the non-CALL groups” (p. 165).
Similar findings were reported by Cerezo, Baralt, Suh, and Leow (2013), who reported that the overall effects of CALL tasks on L2 development appear to indicate relatively strong support for its usage. Additionally, studies comparing CALL to FTF (face-to-face) instruction yielded greater learning outcomes for the former, at least if technology takes the form of an e-tutor. This kind of CALL task is theoretically driven and psycholinguistic in nature, and designed, for the most part, to encourage students’ usage of crucial learner processes such as learner attention, depth of processing, levels of awareness, conceptually-driven processing (activation of prior knowledge), and working memory to promote deeper learning of targeted linguistic items. Given the positive results manifested in these CALL tasks, let us fine-tune the development of psycholinguistics-based e-tutors, keeping in mind our students’ learning outcomes.

Defining the Psycholinguistics-Based E-Tutor

In this chapter I adopt Rosa and Leow’s (2004a) view that “a task potentially has an influence, not only on automatization of already internalized structures . . . but also as a means of helping learners focus on formal aspects of the second/foreign language (L2)” (p. 192). To this end, a task is taken to be a computerized activity that requires that participants comprehend, manipulate, or produce in the second or foreign language, and that aims at impacting L2 development by attempting to promote deeper processing and to raise learners’ awareness of particular grammatical features of the language via the provision of concurrent feedback and prompts.

Toward the Development of a Psycholinguistics-Based E-Tutor As discussed in this book, crucial learner processing and processes that need to be seriously considered in any psycholinguistic-based model of language learning are the roles pertaining to attention, depth of processing, levels of awareness, intake processing, conceptually-driven processing (activation of prior knowledge), and working memory. The consistent findings of quite a large number of studies that report the beneficial effects of these cognitive processes do underscore the need for teachers to recognize the crucial roles they play in subsequent processing of L2 data. To develop any psycholinguistics-based task, and, more specifically in this chapter, a CALL e-tutor, let us start with the very first premise based on most of the models discussed previously, namely, that any type of L2 instruction or exposure is premised, minimally, on the fact that students do pay attention to the L2 input or information being targeted by the teacher, be it semantic or linguistic content or both. However, as reported in both SLA and non-SLA (e.g., cognitive psychology) literatures, mere attention to information in the L2 input may lead to this information being stored briefly in working memory and, without further cognitive processing, may likely be discarded without being internalized into the students’ learning system. Consequently, to promote deeper processing and ultimately more robust learning, we need to ensure that the student is indeed cognitively engaged in attending to and processing the L2 information. Deeper processing can be reinforced by carefully designed learning activities or tasks that promote students’ usage of identified beneficial cognitive processes while interacting with the targeted content during instructional exposure to L2 data, whether the instruction is online or offline. We also need to consider students’ exposure to an adequate amount of examples in the input to be processed. 
The exposure provides the opportunity to make connections with prior knowledge of such data, and the potential not only to create hypotheses or rule formation (awareness at the level of understanding the grammatical rule) but also to receive some kind of feedback, whether explicit or implicit, that confirms or disconfirms such hypotheses.

Major Features of an E-Tutor

There are three major features of an e-tutor.

• The first is what is known as “task-essentialness” (Loschky & Bley-Vroman, 1993), that is, students need to minimally pay attention to the targeted items in the task in order to successfully complete the task.
• The second is the provision of concurrent implicit feedback as students perform the tasks. Feedback serves to confirm or disconfirm the earlier hypothesis formation facilitated by task-essentialness.
• The third feature is the use of prompts. Prompts encourage deeper processing (e.g., hypothesis formation or testing) as students interact with the L2 data. For example, a student chooses one out of two options. A prompt may be, “Great. Do you know why that option is correct?” Or if the option is incorrect, the prompt may be, “Perhaps you may want to think of looking at . . .” There may be no need to provide explicit grammatical explanations, since the idea is to make the student process more deeply the implicit feedback being provided in the task.
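To make the interplay of the three features concrete, here is a minimal sketch of a single two-option item. It is an illustration only, not the code of any published e-tutor: the class `MazeItem`, its field names, and the function `respond` are hypothetical, while the two prompt texts are the examples quoted above. The student cannot move on without choosing an option (task-essentialness), and the response carries a prompt but no grammatical rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MazeItem:
    """One hypothetical two-option item; all field names are illustrative."""
    question: str           # English sentence the student must put into Spanish
    options: tuple          # the two choices shown to the student
    correct: str            # the option that counts as correct
    prompt_if_correct: str  # invites reflection on why the choice is right
    prompt_if_wrong: str    # nudges the student toward a cue, states no rule


def respond(item, choice):
    """Return (is_correct, prompt) for the option the student selected."""
    if choice not in item.options:
        raise ValueError(f"{choice!r} is not one of the options")
    is_correct = choice == item.correct
    prompt = item.prompt_if_correct if is_correct else item.prompt_if_wrong
    return is_correct, prompt


item = MazeItem(
    question="I like the house",
    options=("Yo", "A mí"),
    correct="A mí",
    prompt_if_correct="Great. Do you know why that option is correct?",
    prompt_if_wrong="Perhaps you may want to think of looking at . . .",
)

print(respond(item, "Yo"))
print(respond(item, "A mí"))
```

Keeping the prompts as data rather than as fixed program text is one way an instructor could adjust how much guidance each item gives without touching the program logic.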

Empirical Support for E-Tutors

Evidence for the success of the features of task-essentialness and feedback comes from studies conducted within a psycholinguistics framework, in which learners’ attention to and/or awareness of targeted linguistic information in the L2 input has been methodologically operationalized and measured during online input processing, and feedback has been selectively provided during practice (e.g., Bowles, 2008; Hsieh, 2008; Rosa & Leow, 2004a; Sanz & Morgan-Short, 2004). The importance of implicit feedback in structure-based tasks is also directly related to the body of current empirical research in SLA, showing that this type of feedback encourages hypothesis formation and testing, which is indicative of high levels of processing that contribute to learning and system restructuring in addition to robust retention (e.g., Hsieh et al., 2015; Leow, 2001; Rosa & Leow, 2004a, 2004b; Rosa & O’Neill, 1999). The feature of prompts finds support in the type of feedback reported in the interactionist strand of research (e.g., Lyster, 2004; Lyster & Izquierdo, 2009).

Format of E-Tutor

Based on both empirical support and student interest, the tasks will generally be of a problem-solving format (e.g., mazes, crossword puzzles with mismatches to promote noticing of targeted items, competitive games that embed different steps or levels for completion, etc.) that promotes a relatively deep level of processing and encourages students to form hypotheses and rules as they attempt to solve the problem via several stages of the tasks. As they receive more feedback, they begin to confirm or disconfirm their initial hypotheses. This in turn allows them not only to process the information more carefully but also to potentially raise their awareness of the linguistic form or structure at the level of understanding the underlying grammatical rule. Crucially, students’ performances can be tracked online (e.g., response times, online verbal reports) if so desired and will, in turn, provide valuable data to indicate not only what they are paying attention to and processing but also how they are doing so (if concurrent verbal reports are used, which can be recorded digitally) as they perform the task. Such data can be shared with students and also disseminated in research venues interested in cognitive processes employed during the learning process.
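As an illustration of the online tracking of response times mentioned above, the following sketch records what a student chose and how long the choice took. It is a hypothetical design: the class `ResponseLog`, its method names, and the item identifier are assumptions made for this example and do not describe any of the e-tutors cited in this chapter.

```python
import time


class ResponseLog:
    """Hypothetical log of each selection and the time taken to make it."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable so the log can be tested
        self._pending = None   # (item id, time shown) for the open item
        self.records = []

    def item_shown(self, item_id):
        """Call when a question appears on screen."""
        self._pending = (item_id, self._clock())

    def response_given(self, choice, correct):
        """Call when the student selects an option."""
        if self._pending is None:
            raise RuntimeError("no item is currently displayed")
        item_id, shown_at = self._pending
        self.records.append({
            "item": item_id,
            "choice": choice,
            "correct": correct,
            "seconds": self._clock() - shown_at,
        })
        self._pending = None


log = ResponseLog()
log.item_shown("level1-item1")
log.response_given("A mí", correct=True)
print(log.records)
```

A monotonic clock is used here because it is not affected by changes to the computer's wall-clock time while the student is working.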

E-Tutor: Receptive and Productive

E-tutors can be divided into receptive and productive practice tasks for L2 learners. Receptive practice tasks are typically designed to address the input-to-intake and intake processing stages (Stages 1 and 3) along the learning process, while productive practice tasks are designed to focus on the input, intake, and knowledge processing stages (Stages 1, 3, and 5). In addition, I shall include a sample of a content/form-based hybrid activity. It is important to note that this chapter does not presume that these kinds of computerized tasks constitute the only pedagogical avenue for successful L2 development in the classroom setting. On the contrary, these computerized tasks only address one aspect of the learning and teaching processes. Indeed, the ideal setting for these computerized tasks is outside the classroom, and they should be viewed as ancillary tools to prepare students for communicative practice in the actual classroom setting, powered by the important role of the teacher.

Sample of a Receptive E-Tutor Practice Task

An example of an online student-centered task that is meaningful and fun, yet that involves quite a great deal of processing, is the 3D Gustar Maze that I created with an Initiative on Technology-Enhanced Learning (ITEL) grant provided by the Center for New Designs in Learning and Scholarship (CNDLS) at Georgetown University. The actual implementation of the computerized maze was designed by Bill Garr from CNDLS. The Gustar Maze game was made by adapting a version of the popular Minecraft game, written in JavaScript for the Web. This JavaScript port is called “VoxelJS” (http://voxeljs.com/). Its purpose is to provide a programming environment for producing relatively simple 3D games. A simple world with space for a square structure was created, to which a facility for defining maze walls within that structure through a configuration file was added, and questions were attached to locations in the maze. By uploading new configuration files to the game website, an instructor can add maze games of various complexity, each presenting its own custom content.

The purpose of this e-tutor (cf. also Bowles, 2008; Hsieh, 2008, for other games with gustar) is to promote a deeper understanding of the uses of a syntactic construction involving the problematic Spanish psych verb gustar, which requires a dative experiencer, obligatorily doubled by a dative clitic. This structure contrasts sharply with English, where experiencers can only have nominative or accusative case, and it is quite common for English-speaking students of Spanish to have problems processing this structure. Do you recall this verb in Chapter 11 and its potential to create a cognitive overload? Perhaps this may explain why textbooks typically divide the formal presentation of the different levels of this verb across several chapters. This game includes, in addition to the notion of task-essentialness and implicit feedback employed in previous e-tutors, the use of prompts to reinforce deeper processing. The gustar structure is carefully presented at all four levels:

Level 1 (6 exemplars): A mí me gusta(n) la(s) casa(s) “I like the house(s)”
Level 2 (4 exemplars): A ti te gusta(n) la(s) casa(s), A nosotros nos gusta(n) la(s) casa(s) “You/We like the house(s)”
Level 3 (4 exemplars): A él/A ella le gusta(n) la(s) casa(s) “He/She likes the house(s)”
Level 4 (6 exemplars): A Mary/A John le gusta(n) la(s) casa(s) and A María y a John les gusta(n) la(s) casa(s) “Mary/John likes the house(s)” and “Mary and John like the house(s)”
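The pattern running through the four levels can be stated compactly: the dative clitic is chosen by the experiencer, and the verb agrees with the thing liked. The sketch below simply restates that pattern as code for readers who find it easier to follow in that form. It is not part of the Gustar Maze; the function `gustar_sentence`, its parameters, and the `CLITICS` table are hypothetical names introduced for this illustration.

```python
# Dative clitic that doubles each pronoun experiencer (Levels 1 to 3).
CLITICS = {
    "a mí": "me",
    "a ti": "te",
    "a nosotros": "nos",
    "a él": "le",
    "a ella": "le",
}


def gustar_sentence(experiencer, thing, thing_is_plural,
                    experiencer_is_plural=False):
    """Build a gustar sentence from the experiencer and the thing liked.

    The clitic depends on the experiencer; the verb agrees with the
    thing liked, not with the person doing the liking.
    """
    key = experiencer.lower()
    if key in CLITICS:
        clitic = CLITICS[key]
    else:
        # Level 4: named experiencers take le (one person) or les (several).
        clitic = "les" if experiencer_is_plural else "le"
    verb = "gustan" if thing_is_plural else "gusta"
    return f"{experiencer} {clitic} {verb} {thing}"


print(gustar_sentence("A mí", "la casa", thing_is_plural=False))
print(gustar_sentence("A ella", "las casas", thing_is_plural=True))
print(gustar_sentence("A María y a John", "la casa", thing_is_plural=False,
                      experiencer_is_plural=True))
```

Note that the verb form never consults the experiencer, which is exactly the point at which a literal translation from English breaks down.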

Playing the Game

The student accesses the game on the computer and reads the following instruction:

Gustar

Everyone likes something, and gustar, “to give pleasure” or “to please” is the verb to use in Spanish to express this liking. However, whether you like it or not, this verb is rather tricky for many English-speaking learners of Spanish since it does not follow a one-to-one translation from English to Spanish. So, let us see whether you can learn (mostly on your own) how this verb works in Spanish, that is, you are the one in charge of learning the usages of this verb. All you need to do is to enter this maze and try to put into Spanish the sentences that you find in the maze. If you pay close attention to the tools you will receive and think a bit deeply about them, you should be able to get out of this maze. For every tool used correctly the first time, you will be awarded $5. For every tool used incorrectly, you will lose $2. By the way, this maze has different levels, each level dependent upon the previous one, so, for example, if you make a mistake on Level 2 that is based on knowledge gained at Level 1, you will drop back to Level 1, lose all your money earned, and will need to start over again. Try it and you will like it! ¡Vamos!

Moving their cursor or using the keypads, the student enters the maze and is greeted by a bubble that pops up with the following sentence: ¿Cómo se dice en español (How do you say in Spanish) “I like the house”? At the top of the screen la casa, “the house,” appears in the right corner, and under the sentence are two options: Yo, “I,” and a mí, “to me.” If the student selects the wrong option (100%), the following screen appears:

FIGURE 13.1

When the option is correct, the following screen appears:

FIGURE 13.2

The student then hits the Okay button and moves deeper into the maze until another sentence pops up. Here are several more examples of the type of feedback or prompt given for incorrect and correct selections:

FIGURE 13.3

FIGURE 13.4

FIGURE 13.5

FIGURE 13.6

FIGURE 13.7

FIGURE 13.8

Once all the sentences on a level are completed correctly, the student reaches an end-of-level summary that involves them in rule formation. They are provided with several rules that are correct or incorrect and asked to select those that they think are correct, based on their hypotheses or what they think they have learned up to this point.

Here is a sample from Level 2 before the student is transported to Level 3:

FIGURE 13.9

The accuracy of their selections does not play a role in being transported to the upper level, where they are warned that if they make a wrong selection pertaining to the lower level, they will be transported back to that level for more practice. Here is a transition from Level 1 to Level 2: Now, if you think you have grasped the equivalent of saying “I like something(s)” = “something(s) please(s) to me”, let us head upstairs to the second level, where you will tackle somebody else liking something(s) and get the opportunity to earn some more dinero.
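The money and level rules stated in the game's instructions can be sketched as follows. This is a reading of the instructions, not the game's actual code: the names `MazeState` and `record_attempt` are hypothetical, and two points that the instructions leave open are treated as assumptions. First, the caller decides whether a wrong selection depends on knowledge from a lower level and passes that level in. Second, the balance is allowed to go below zero.

```python
from dataclasses import dataclass

REWARD_FIRST_TRY = 5  # dollars awarded for a tool used correctly the first time
PENALTY_WRONG = 2     # dollars lost for a tool used incorrectly


@dataclass
class MazeState:
    level: int = 1
    money: int = 0


def record_attempt(state, correct, first_try=True, lower_level=None):
    """Update the state after one selection and return it.

    lower_level: for a wrong selection, the earlier level whose knowledge
    the mistake depends on, or None if the mistake concerns the current
    level. How this is detected is an assumption left to the caller.
    """
    if correct:
        if first_try:
            state.money += REWARD_FIRST_TRY
        return state
    if lower_level is not None and lower_level < state.level:
        # Drop back to the lower level and lose all money earned.
        state.level = lower_level
        state.money = 0
        return state
    state.money -= PENALTY_WRONG
    return state


state = MazeState()
record_attempt(state, correct=True)    # +5
record_attempt(state, correct=False)   # -2
print(state)
state.level = 2
record_attempt(state, correct=False, lower_level=1)
print(state)
```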

Transported to Level 2 A bubble pops up with: ¡BIENVENIDO/A AL NIVEL 2! ¿Estás preparado/a para continuar? On this level we are going to change the person(s) doing the liking to “You” (singular and informal) and “We.” ¡Vamos! Once they exit the maze after completing all correct options, the student is welcomed by a series of fireworks to celebrate his/her survival. Total money won and time spent appear in a corner. It takes approximately 15 minutes to complete this game. How do we know that the student processed deeply and learned this problematic gustar construction? Here is a typical think aloud protocol taken from among several beginning students of Spanish who played the game (this version of the game had a few technical glitches, so references such as “Technology and I just do not get along today” and “being lost” are reflective of some difficulty in moving the cursor in the maze). Note the high depth of processing, hypothesis

formation and testing, activation of recent prior knowledge, and finally the breakthrough at Level 3. This student’s scores on an oral and written production test comprising both old and new exemplars (total 12 items) were the following: pretests (0 on both tests), immediate posttest (12 on both tests), and delayed posttest (10 and 11 on the oral and written production tests, respectively) two weeks later. For readers not familiar with Spanish, just keep in mind that the equivalent of an English sentence such as “John and Mary like Spanish” is A John y a Mary les gusta el español, “To John and to Mary to them pleases Spanish.”

Okay, this is Mark (not real name), I’m back and on this one. Let’s see. Next, next. There we go. ¿Cómo se dice en español I like the house? A mí, me, gusta, la casa. There we go. ¿Cómo se dice I like Spanish? A mí, I don’t know why this option’s correct, no. I thought it was yo. Oh they’re not gonna tell me. Okay. Uh a mí, me, I don’t know why this option’s correct either. Uh I’m gonna say gusto el español. Gusto is not used to indicate, oh really? (1:00) I don’t know why that’s correct again. Oh here. (Mumbling) that may be helpful to you. Yes. A mí I didn’t know a mí refers to either. I probably should know that by now in this semester shouldn’t I? Um, okay. Close. I like the houses. A mí because it means, a mí. A mí is I. Okay, I like the houses. Me. Yeah, they mean the same thing (notices the connection). Why do they use them twice then? I don’t know. (2:00) Gustan. I chose this option because the n indicates plural (rule formation). I like Spanish and French. A mí, I, me, gustan because it’s gonna be plural (rule formation). All right. Let’s keep going. All right. Technology and I just do not get along today. I like soccer. A mí, me, gusta (3:00) because gusto does not indicate liking. I’m not sure why, but it doesn’t. I like my classes. A mí is I, me for some reason, gusta. Ooo. I like my classes. Oh. A mí, me, gustan (4:00).
I am getting the hang of it. I like my subjects this semester. A mí, me, mis is gonna be gustan because it’s plural (rule formation). Oh did I actually make it to the second level? This is exciting. You like the house. A ti. I did, see I got that from number one (recent prior knowledge). (5:00) Te. La casa is gonna be gusta. I’m gonna say a nosotros. Oh! Is it because it’s the ob- no, it’s not the object. It’s the subject. I’m not even gonna ask why (hypothesis formation). Gusta? Gustamos? No. I’m gonna say el español is gusta. You like the houses. A ti, te, gustan the casas. We like mass. Math. A nosotros, nos, (6:00) haha oh and los ma- las matemáticas is plural (rule formation). You like the class. A ti, te, a ti te, gusta, la class. We like books. A nosotros, nos, gustan because it’s plural (rule formation). Yes. Okay, Level three. Okay, let’s see. Next. (7:00) Haha I’m getting so lost in this maze. There we go. She likes Spanish. A ella because it’s gonna follow the same pattern as the last levels (recent prior knowledge), still get that. Le because we need some kind of thing there (prior knowledge), el español it’s gonna be gusta because it’s singular (rule formation). All right, so I know at least that. Makes me feel somewhat good about myself I guess. He likes the house. A él, le, gusta la casa. Gusta is singular. (8:00) She likes the houses. A ella, le, gustan las casas. ¿Cómo se dice en español He likes Spanish and French? A él, a, whatever, él uh le gustan because it’s plural. Once again gustan the verb to like agrees with what is being liked and not necessarily

the people (rule formation). She likes the class. A ella, they put the a there (notices the preposition). The reason, I’m confused why they put the a because we just learned that means it’s an object. Ohhh! It is an object! (9:00) Because, that’s why gustan agrees with the subject. The class is pleasing to her making her the object and the class the subject. That’s why it doesn’t follow a literal English translation! There we go. I just had a breakthrough. Thank God. And that’s why I’m also doing the Spanish lab now while we’re learning all about this. A él, because a, because he is the object, le, gustan. ¡Excelente! You have survived up to this point so let us see where you stand regarding verbs like gustar in Spanish. Select which of the following statements are accurate. Gustar is very simple. It follows a literal translation. No. One is wrong. Gustar is tricky but I know how to handle it. Yeah, that’s true. Yes! Oh that’s what I just said, the real subject of the verb is the thing liked and not the person doing the liking. In fact, the person doing the liking becomes the indirect object of the verb gustar. That’s why there’s an a, the a, I don’t know how to pronounce it. (10:00) To me is, to you is te is me, is to him or to her is le, yeah that’s true. [Mark has just reinforced what he has learned by interacting with the potential rules presented at the end of Level 3. From this point on, Mark is on a roll and continues to reinforce the knowledge he gained from the lower levels.] All right. Level four. How many levels are there? Whoops. Um, hold on I’m lost. There we go. There we go! Mary likes Spanish music. So a Mary because she’s the object, le, gusta la música español. (11:00) I wish I could pronounce any of this. That would be nice too. John likes sports. A John, le, gustan. A John and Mary, oooh it’s gonna be les plural. Aha. Because that agrees with them. Oh. This makes sense now. ¿Cómo se dice en español John and Mary like fast food? A John, oh. 
I didn’t realize there was an a before Mary too. I didn’t do that last time. Uh les plural, gusta. (12:00) All right. ¿Cómo se dice The students like? A los estudiantes, les, gustan. One more sentence and you are out of the maze. One slip and slide back. Oh no. All right here goes nothing. All right. So, the students like this game. A los estudiantes because they are the object, les, and este juego is gusta. It’s singular. I’m out of the maze. Good. (13:00) Haha yeah. Perfect, thank you. And now I do this? Great. All right. Thank you. As noted, this game promotes a relatively high level of processing and encourages students to make hypotheses and rule formation as they attempt to solve the problem of exiting the maze. As they receive more feedback and prompts, they begin to confirm or disconfirm their initial hypotheses and activate recent prior knowledge. This in turn allows them not only to process the information more carefully but also to hold the potential of raising their awareness of the structure at the level of understanding. This task can be followed by students sharing in class their likes and dislikes.

Sample of a Productive E-Tutor Practice Task

Cerezo (2010) designed and created Talking to Avatars, an e-tutor that uses audiovisual recordings to simulate a conversation between the learner and a series of “avatars,” or pre-filmed human actors. He targeted two linguistic forms that are assumed to present different degrees of difficulty to learners of Spanish, namely, Spanish prepositional relative clauses (e.g., La persona CON LA QUE VIVO se llama Pepa, “The person WHO I LIVE WITH is called Pepa”) and present subjunctive in relative clauses (e.g., Quiero un apartamento QUE TENGA tres habitaciones, “I would like an apartment THAT HAS-SUBJ three bedrooms”). Unlike the sample above, Cerezo also provided either implicit or explicit feedback. Following is a description of its components.

The software component. To facilitate data access, storage, and sharing, Talking to Avatars was developed as a web-based application, using a combination of PHP and HTML. The application consists of two components, a static component that contains the program code, and a dynamic component that can be edited by the administrator. There are two different gateways to access the program, one for the students and one for the administrator. Both gateways can be accessed by visiting http://www.talkingtoavatars.com and entering a password. The administrator gateway grants access to two different databases, the Questions database and the Answers database. The Questions database contains all the necessary information to be displayed to each student, including which video should be played at each point, the specific task to be completed by the student, a catalog of correct answers, discourse strategies on how to proceed depending on the students’ answers, and some extras (e.g., transcripts, translations). According to Cerezo (2010), this information can be edited to fine-tune or expand the program. In turn, the Answers database contains a log of the students’ responses, a search engine for selective display of collected data, and an application that generates transcripts of individual sessions in Adobe Acrobat PDF format.
Finally, to analyze the students’ output, Talking to Avatars uses a string-matching algorithm with different specifications for every targeted structure.

The video recordings. By using Talking to Avatars, learners are immersed in two real-life situations (finding an apartment and a roommate, reporting a theft) during a fictional year-abroad program in Spain. To simulate these interactions, Cerezo scripted and videotaped 217 monologues that he created. In these videos, the avatars looked directly into the camera, the students’ point of view, asking a question or providing feedback (implicit or explicit) in several ways, as appropriate. To ensure that the avatars delivered the scripted questions and feedback messages verbatim, they read their lines off autocues. These autocues were created with Microsoft PowerPoint and then projected onto an HD flat-panel TV placed in front of the actors and underneath the camera, and which remained in place for the entire shooting. After the shooting, all videos were captured onto a PC using Adobe Premiere Pro CS4 and compressed from the standard AVI format (Audio Video Interleave) into FLV format (Flash Video) to facilitate faster download over the Internet. This file conversion process was done using eRightSoft Super, with a video compression rate of 25 frames per second, an audio sampling frequency of 22050 Hz in mp3 format, and a bitrate of 64 kbps.

Each situation was designed to instruct one targeted form at a time and consisted of an introductory presentation and 15 items or mini-episodes, with 10 of them eliciting the targeted form and the remaining 5 eliciting the non-targeted counterparts (i.e., present subjunctive vs. indicative in relative clauses; prepositional vs. non-prepositional relative clauses). In every mini-episode, students were orally addressed by an avatar that was looking directly into the camera, the students’ point of view, thus initiating an interaction sequence in several steps. Here is a description of the main steps that the students receiving explicit feedback follow:

1. Students are informed of the real-life situation (e.g., finding an apartment and a roommate, reporting a theft) while on a simulated year-abroad experience in Málaga, Spain.
2. “Question”: The avatar asks a question to elicit information from the student (e.g., Tengo varios apartamentos . . . ¿Cuánto dinero puedes pagar? “I have several apartments . . . How much are you able to spend?”)
3. “Activity”: A fill-in-the-blank exercise is displayed on the screen to help the students provide the requested information in a task-essential manner. Specifically, the students are asked to provide a written translation of the targeted form from English into Spanish and to supply the appropriate content information by either filling in a blank or selecting an option from a drop-down list (e.g., Busco un apartamento (THAT COSTS) ________ (menos de 500 | de 500 a 1000 | más de 1000) euros al mes. “I’m looking for an apartment (THAT COSTS) ________ (less than 500 | from 500 to 1000 | more than 1000) euros a month.”)
4. “Answer”: The students then provide the requested information, producing an answer (e.g., Busco un apartamento que *cuesta menos de 500 euros al mes “I am looking for an apartment that *costs-IND less than 500 euros a month.”)
5. “Feedback”: The avatar replies with feedback on the content and form, depending on the students’ experimental group (e.g., if explicit feedback is provided, the student hears Bueno, hay que ahorrar dinero . . . Una cosita: el verbo COSTAR no es correcto. ¿Puedes corregirlo? “Sure, you should try to save some money . . . One thing, though: the verb COSTAR is not correct. Could you please correct it?”)
6. Students then proceed to the next of 15 scenarios, of which 10 focus on the target linguistic item.
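The "Answer" and "Feedback" steps above depend on the program recognizing whether the targeted form was produced. The source says only that a string-matching algorithm with structure-specific settings is used, so the sketch below is an assumption about how such a check might look, not Cerezo's algorithm. The functions `normalize` and `check_answer`, the three outcome labels, and the idea of listing one anticipated error per item are all hypothetical.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text.strip().lower())


def check_answer(answer, target, anticipated_error):
    """Classify an answer as 'correct', 'form_error', or 'other'.

    target: the expected form for this item, e.g. 'que cueste'.
    anticipated_error: a known wrong form, e.g. indicative 'que cuesta'.
    """
    text = normalize(answer)
    # Whole-word matching keeps the pattern from matching inside a longer word.
    if re.search(rf"\b{re.escape(normalize(target))}\b", text):
        return "correct"
    if re.search(rf"\b{re.escape(normalize(anticipated_error))}\b", text):
        return "form_error"
    return "other"


answer = "Busco un apartamento que cuesta menos de 500 euros al mes"
print(check_answer(answer, target="que cueste", anticipated_error="que cuesta"))
```

Under this design, a `form_error` result is what would trigger a feedback video such as the explicit message about the verb COSTAR quoted in step 5, while `other` could be routed to a more general discourse strategy.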

As can be seen, these two samples of receptive and productive practice tasks are premised on psycholinguistics tenets that underscore, during some type of receptive or productive interaction with the L2 data, the important roles played by student attention (task-essentialness), feedback, and prompts (in the first sample) provided to encourage deeper processing of the target linguistic items in the task.

Sample of a Content/Form-Based Hybrid Activity An example of a learning activity that focuses on semantic content and is coupled with lexical and linguistic practice may be a reasoning activity that addresses identified variables found to play a role in robust learning. The objectives (which can be modified) of this ecologically valid activity are to promote the use of specific vocabulary associated with traveling (a real-life experience), the ability to share information in the L2 using such vocabulary, the ability to share cultural information, and the ability to comment on the activity using comparative expressions. Students (in pairs) are provided with, for example, US $1,000 and asked to plan a budget (airfares are excluded) for a family of three (e.g., two adults and one child, age ten) to spend a week in one of, for example, six L2 countries. The budget needs to include, minimally, information on boarding, lodging, cultural sites to visit and interesting information about these sites, and transportation inside the foreign country. Online links that provide information on the several travel categories are carefully chosen by the instructor to include an overlapping of vocabulary and expressions, that is, essentially almost the same information is repeated in different contexts. Students are strongly encouraged to visit each link before making their respective budget. After completion of the project outside the classroom, pairs are required to present in the classroom and in the L2 their budget (e.g., via a PowerPoint presentation) covering the information required, and to justify their choices by making comparisons between the countries, etc. The class decides which pair has the most interesting and frugal budget by comparing different aspects of the budgets presented. The level of language proficiency may determine the level of presentation and discussion required. 
An activity like this one is clearly grounded in a psycholinguistic approach to hybrid instruction that incorporates both media of class time and online exposure, albeit in a controlled way to maximize the amount of interaction, in line with the objectives and cognitive processes involved in completing such an ecologically valid activity.

Conclusion

This chapter was premised on the benefits of promoting more robust L2 learning before actual practice in the classroom setting. To this end, I presented theoretically and empirically supported samples of classroom tasks or activities that are designed to engage our students’ cognitive processes in a relatively controlled way to maximize L2 development. More specifically, they are designed to address the roles pertaining to attention, depth of processing, levels of awareness, intake processing, conceptually-driven processing (activation of prior knowledge), and working memory. The tasks also clearly acknowledge the role of technology in the traditional classroom and seek to use this source as one way of getting students to process the L2 data more deeply. By the way, we do use an “AAPP” (Attention, Awareness, Preparation, and Practice) for our language classrooms. And talking about technology, perhaps we do need to reflect on the changing L2 classroom, which is going to affect all teachers inevitably, and researchers later. I shall do this in the final chapter.

References

Bowles, M. A. (2008). Task type and reactivity of verbal reports in SLA: A first look at an L2 task other than reading. Studies in Second Language Acquisition, 30(4), 359–387.
Cerezo, L. (2010). Talking to avatars: The computer as a tutor and the incidence of learner’s agency, feedback, and grammatical form in SLA. (Unpublished doctoral dissertation). Georgetown University, Washington, D.C.
Cerezo, L., Baralt, M., Suh, B-R., & Leow, R. P. (2013). Does the medium really matter in L2 development? The validity of CALL research designs. Computer Assisted Language Learning, 27(4), 294–310.
Grgurovic, M., Chapelle, C., & Shelley, M. (2013). A meta-analysis of effectiveness studies on computer technology-supported language learning. ReCALL Journal, 25(2), 165–198.
Hsieh, H-C. (2008). The effects of type of exposure and type of post-exposure task on L2 development. Journal of Foreign Language Instruction, 2(1), 117–138.
Hsieh, H-C., Moreno, N., & Leow, R. P. (2015). Awareness, type of medium, and L2 development: Revisiting Hsieh (2008). In R. P. Leow, L. Cerezo, & M. Baralt (Eds.), A psycholinguistic approach to technology and language learning. Berlin: De Gruyter Mouton.
Leow, R. P. (2001). Attention, awareness and foreign language behavior. Language Learning, 51, 113–155.
Leow, R. P. (2007). Input in the L2 classroom: An attentional perspective on receptive practice. In R. DeKeyser (Ed.), Practice in second language learning: Perspectives from applied linguistics and cognitive psychology (pp. 21–50). Cambridge: Cambridge University Press.
Loschky, L., & Bley-Vroman, R. (1993). Grammar and task-based methodology. In G. Crookes & S. M. Gass (Eds.), Task and language learning: Integrating theory and practice (pp. 123–167). Clevedon, UK: Multilingual Matters.
Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruction. Studies in Second Language Acquisition, 26, 399–432.
Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language Learning, 59, 453–498.
Rosa, E. M., & Leow, R. P. (2004a). Computerized task-based exposure, explicitness and type of feedback on Spanish L2 development. The Modern Language Journal, 88, 192–217.
Rosa, E., & Leow, R. P. (2004b). Awareness, different learning conditions, and second language development. Applied Psycholinguistics, 25, 269–292.
Rosa, E., & O’Neill, M. (1999). Explicitness, intake and the issue of awareness. Studies in Second Language Acquisition, 21(4), 511–556.
Sanz, C., & Morgan-Short, K. (2004). Positive evidence versus explicit rule presentation and explicit negative feedback: A computer-assisted study. Language Learning, 54, 35–78.

14 CONCLUSION The Changing L2 Classroom, and Where Do We Go From Here?

We have been on a theoretical, methodological, empirical, model building, and pedagogical journey to provide some justification for the title of this book, namely, Explicit learning in the L2 classroom: A student-centered approach. First, I shall discuss some conclusions we can make from the previous chapters, while raising some questions to ponder on what specifically SLA research may offer the L2 classroom. We will discuss the inroads technology is currently making at both the curricular and instructional level, leading to the changing dynamics of the L2 classroom, and I shall provide one feasible curricular suggestion, namely a partial hybrid curriculum, to embrace the changing format of the traditional L2 classroom, the role of technology, and SLA research.

Some Conclusions (and Questions)

Based on the previous chapters, there may be a few conclusions we can make, but they also come with some questions. First, our perception of language learning and teaching is implicitly or explicitly going to be shaped by the way we think the process of learning takes place, and this will inevitably translate into how we teachers perform in the L2 classroom. Second, research and teaching performed to promote L2 learning in the L2 classroom need to be both theoretically-driven and empirically supported. To this end, we need to understand as much as possible from robust research on how learning takes place, and then use this knowledge to inform pedagogy. We researchers then need to ensure that the findings of our empirical studies that have addressed L2 development from a classroom-based perspective are robust enough to be extrapolated to the formal classroom setting. Think about this: The SLA field is now several decades old, yet can we categorically state that we know the best way, based on research, to teach the L2 or to promote L2 learning in the L2 classroom? There is no question that SLA research has come a very long way in illuminating and increasing our understanding of many aspects of the L2 learning process, yet there still appears to be a disconnect between what we researchers report and publish and what we teachers find relevant to our classrooms. Perhaps we teachers are not entirely sure what the research is all about, given the many variables that contribute to both language learning and teaching. In other words, SLA research, divided into its many strands, may only address one partial aspect of what really takes place in the formal classroom setting given all the variables involved in this context.

Third, in spite of the diverse perceptions of this learning process (e.g., psycholinguistic, sociocultural, interactionist, connectionist, linguistic, etc.), what is certain is that there is indeed some marriage between all these perceptions (cf. a timely publication by Hulstijn et al., 2014, on different perspectives on L2 learning and teaching). But here lie several of the different marital viewpoints. We do not all appear to agree on the construct of learning (cf. Chapter 7) or how or where learning takes place. For example, does learning take place primarily in the brain, independent from the environment, or does learning take place primarily in the environment that then promotes learning? Our natural instinct is to say both, but the question still remains: Where does learning take place primarily? In addition, we conduct our research within different theoretical underpinnings, using different methodologies, yet we all claim that our findings address the same issue, namely L2 learning, and L2 learning is typically framed within the classroom setting.

Even the use of the term “learning” is interesting. For example, when we talk about L2 learning, are we talking about “learning” as in “learning” our first language? Unless we are referring to formal instruction in the classroom setting where we do learn about our first language, then perhaps we should think more of acquisition. Are we talking about “learning” an L2? Then where is this learning taking place? In an immersion setting in which the acquisitional process is dominant, perhaps both implicit and explicit learning may describe the processes employed to master this L2. If it is in the classroom setting, then we may be talking about “learning” and not “acquisition,” and we need to consider seriously the context and conditions under which this type of learning is taking place, as described above. As the preacher said, just some thoughts, not a sermon. Let us now discuss the changing dynamics currently taking place in the L2 classroom in relation to our curricula, traditional classroom format, and how instruction is being impacted.

The (Changing) State of the Traditional L2 Classroom

Let us first acknowledge that the term “classroom” (like the word “typewriter”) may soon be obsolete as brick-and-mortar classrooms become even more blended (Gruba & Hinkelman, 2012; Rhodes & Pufahl, 2009). Without doubt, technology in our classrooms is playing an increasing role both in our traditional teacher-fronted or face-to-face (FTF) classroom and our language curriculum. We are witnessing a redefinition of the traditional L2 curricula by an increasing number of educational institutions that are enhancing, going hybrid (a combination of both online and the traditional FTF) and, in some cases, even replacing traditional instruction with computer-assisted language learning (CALL). Indeed, the format of the traditional face-to-face, teacher-fronted classroom has increased to four types: “Traditional,” where 0% of the course content is delivered online, “web facilitated” (1–29%), “blended/hybrid” (30–79%), and “online” (80% or more) (Allen & Seaman, 2013).

Goertler (2011) provides some potential explanations for this shift in the language curriculum perspective. First, from a logistical perspective, CALL may be beneficial because it can save costs and increase revenue (quite a sell for administration), institutions can minimize classroom space and staffing while attracting and reaching out to a larger population of students (another sell for administration), and courses in less demand can be developed collaboratively by different institutions. From a theoretical perspective, computers can promote language use in developmentally helpful ways, as posited by several SLA theories. And from a pedagogical perspective, technology may be used to interconnect language and content courses or to provide additional training to students, including skills such as learning autonomy, time management, and computer literacy.

The Advances of Technology

There is no doubt also that e-tutors (computer software that allows learners to practice independently, without the help of a teacher or a peer) and synchronous computer-mediated communication (SCMC; where technology is used as a means of communication between human interlocutors) will be increasingly used in the L2 classroom, as the push for online language learning is ever more present. However, when making curricular decisions language program administrators should take into consideration the findings of theoretically driven and empirically supported research (Cerezo, Baralt, Suh, & Leow, 2013). In addition, as Petersen and Sachs (2015) point out, “the adoption and integration of emerging technologies into effective instruction will continue to be fraught with challenges of policy (Garrison & Vaughan, 2013), accreditation (O’Dowd, 2013), curricular alignment (Taylor & Newton, 2013) and validation (O’Dowd, 2013; Owston, 2013) and will require extensive training and professional development (Graham, Woodfield, & Harrison, 2013).”

Likewise, as discussed in Chapter 13, instruction in L2 classrooms in the 21st century is clearly undergoing drastic structural changes as technology continues to make huge inroads not only in what is taught but also in how the L2 is presented to students. This interest in CALL is not surprising if we note also the proliferation of CALL publications in several journals solely devoted to technology and language learning. Recently, there has been quite an explosion of CALL studies addressing the use of various technologically enhanced tasks, including task-based virtual CALL to promote interaction (En busca de esmeraldas; González-Lloret, 2003) and recent 3D gaming environments in which students enter the virtual world to interact in the L2 (e.g., Bryant, 2006; Jauregui, Canto, de Graaf, Koenraad, & Moonen, 2011; Thompson & Rodriguez, 2004; Wehner, Gump, & Downey, 2011). Recently, Mentiras (Sykes & Holden, 2009) was launched and presented as the “first mobile, place-based, augmented reality game” designed to develop Spanish language skills (http://arisgames.org/featured/mentira/). Indeed, mobile-assisted language learning (MALL) is also making inroads into instruction (cf. soaring sales of tablets and smartphones, Danova, 2013), with the argument that it holds the advantages of device portability (e.g., Kukulska-Hulme, 2009). What happened to the good old days of fostering a community spirit inside the formal classroom setting (cf. Brooks, 1990; Kramsch, 1987) or promoting some kind of social interaction (outside of texting)?

Studies on the use of these task-based, virtual CALL tasks have addressed efforts to promote socialization and to allow the instructor to set his/her own objectives while offering learners attractive activities to practice their language skills (Thompson & Rodriguez, 2004; Wehner et al., 2011), learners’ individual differences, specifically affective factors such as motivation, anxiety levels, degree of interest in learning the target language and degree of motivational intensity (Wehner et al., 2011), intercultural awareness and positive experiential experience (Jauregui et al., 2011), students’ opinion about the four Second Life (SL) tasks, including level of anxiety and shyness (Liou, 2012), collaborative work, output in the target language and attitudes toward the virtual world, including level of anxiety and shyness (Peterson, 2012), and the revelations of in-game task restarts on design and implementation (Sykes, 2014). It is also claimed that these tasks take into account what Jauregui et al. (2011) call “an orientation to intercultural awareness” (p. 78), in addition to the more traditional emphasis on meaning, goal achievement, and language proficiency of off- and online tasks (cf. Byram, 1997).

We also have the Brazilian-based Teletandem: Foreign Languages for All (TTB) Project, which is an international collaboration of researchers from over 22 universities in 11 countries through videoconferencing or Skype (http://www.teletandembrasil.org/page.asp?Page=25). Defined as an autonomous, collaborative, and virtual concept for foreign language learning (cf. Telles, 2009; Telles & Vassallo, 2006; Vassallo & Telles, 2006), Teletandem has reported on the multiple approaches used by students in both synchronous and asynchronous communication, and has devised strategies and suggestions on how to implement its context in university classrooms (see Garcia, 2010), as well as in the public school system in the state of São Paulo, Brazil (Andreu-Funo, 2010). Furthermore, the TTB project has outlined pedagogical, technological, and logistical challenges the instructor may encounter through the role of mediator in Teletandem’s interactive, autonomous, and virtual context.


Let Us Take a Pause Before Jumping on the Technology Bandwagon

Before we jump on the bandwagon of Call of Duty: Advanced Warfare and World of Warcraft (with our kids and students), let us take a closer look, like we did for the literature on e-tutors in Chapter 13, at these virtual worlds/MALL tasks or activities in terms of learning outcomes. What is quite revealing is that these recent studies on the use of virtual games for L2 development, evidenced in the technology journals, have not empirically addressed learning outcomes and appear to follow the early technology-based studies that focused originally on non-learning outcomes such as motivation and untested variables such as positive experiential experience, anxiety, shyness, and so on. Given the open-endedness of these tasks and games, it may be quite challenging to control for L2 development, and whether such tasks or games are indeed ecological and do promote L2 development remains to be addressed empirically. With regard to Teletandem and MALL, given the multitude of variables that could potentially play a role in the classroom setting (e.g., type of language, individual differences, type of language-teaching methodology, etc.), this mode of communication also needs empirical verification that students’ L2 development does benefit from such innovation. As Petersen and Sachs (2015) put it, “it is vital for curriculum and materials developers and language teachers to be proactive in promoting authentically contextualized and communicative uses of mobile technologies that foster beneficial psycholinguistic processes for SLA.” Indeed, from a psycholinguistic perspective, it may be argued that the question of whether or not the medium really matters in L2 development may actually be a futile one. Language learning is an internal process that is driven by individual learner mechanisms. Consequently, it is not in essence what the learner is exposed to (be it externally manipulated L2 data or type of medium) but “what the learner really does cognitively with the input s/he receives” (Cerezo et al., 2013: 307). Sounds familiar?

So where do we go from here? Given the permanent and increasingly prominent role of technology in our curricula and classrooms, one proposal is to combine the two utilizing the strengths of each component. On the one hand, the FTF format is priceless in promoting, among many other benefits, controlled and focused L2 development in a group FTF setting, with the presence of an instructor as the grand maestro who provides students with the opportunity to practice meaningfully the L2. On the other hand, a careful control of the content of the online component can maximize our students’ L2 development on an individual or perhaps paired basis. To take advantage of the benefits of SLA research, technology, and the important role of the teacher, I would like to propose the following partial hybrid curriculum.


Toward a Partial Hybrid Curriculum

Given that technology is here to stay, we need to maximize its usage via well-designed tasks that promote empirically supported practice, leading to robust learning in the classroom. This type of psycholinguistics-based task or activity has been fully discussed in Chapter 13. Fortunately, we do know which grammatical points are problematic to our students, so designing a partial hybrid curriculum may kill two birds with one stone. In other words, to maximize our students’ exposure to and interaction with the L2 in our formal classroom setting, a partial hybrid curriculum can shift the formal classroom-based presentation of several problematic grammatical points in the L2 to an online component. (Let us all admit it: We formally present or teach the subjunctive mood in the classroom. I am guilty.) In this online component of the curriculum, students at all levels of the language curriculum will be engaged in performing psycholinguistics-based tasks (cf. Chapter 13 for samples) that will provide them the opportunity to practice, process, and learn these points at a deeper level outside the classroom setting. Based on both theoretical and empirical support from the SLA field of research, these tasks, as explained in Chapter 13, are designed to encourage students’ usage of crucial learner processes such as learner attention, depth of processing, levels of awareness, conceptually-driven processing (activation of prior knowledge), and working memory to promote deeper learning of these problematic L2 concepts known to present processing problems for our students. Students may be assigned these tasks to complete in advance of their practice in the classroom in order to establish a solid cognitive baseline for subsequent oral and written communication. They can also be used for revision, if so needed. The benefit of this type of curriculum is that we have more time in the classroom to promote communication in the L2 and to reduce the grammar-focused aspect of our classroom setting, putting learning where it belongs: With the students.

An activity like this one is clearly grounded in a psycholinguistic approach to hybrid instruction that incorporates both media of class time and online exposure, albeit in a controlled way to maximize the amount of interaction in line with the objectives and cognitive processes involved in completing such an ecologically valid activity.

Final Conclusion

I will be the first to admit that it was indeed challenging to wear my two hats (and I usually don’t wear hats) while writing a book that took a five-prong approach to the issue of L2 learning in the classroom setting from a student-centered approach. I have been teaching in the language classroom for over four decades, and it is a location in which I have also learned a lot explicitly. The classroom setting is an extremely complex and multifaceted place in which there are not only the usual stakeholders (the teacher and the students) but also the curriculum, administration, policy holders, parents, politicians, and the list goes on. It is a place where the interactions that occur are oftentimes unexpected and the many variables (in some cases uncontrollable) that contribute to language learning and teaching are typically not constant. I can identify, however, two variables that may remain constant (see, experience teaches wisdom!): (1) One’s psycholinguistic perception of the L2 learning process, namely the important role cognitive processes play in this process, and (2) the impoverished “classroom” setting, be it virtual, hybrid, or brick and mortar, in relation to the inadequate amount of exposure to and interaction with the L2, together with the relatively artificial environment, curricular constraints, potential diverse teaching staff; once again, the list goes on. I consider all of these variables, both as a researcher and a teacher, and that explains simply why I opt for Explicit learning in the L2 classroom: A student-centered approach. In other words, I view the L2 classroom as a context in which the responsibility of learning is on the students (we cannot learn for them), and their role is to come prepared to class to practice what they have learned. It is the teacher’s role to maximize such learning by providing well-designed tasks and activities and the opportunity to practice the L2. Of course, I do need to reiterate that this psycholinguistic perspective of the L2 learning process in the L2 classroom is just one of many perspectives with regard to the L2 learning process. Finally, as in life, everything comes to an end, and I would like to conclude with one final statement: Enjoy the process of whatever you do, be it researching and/or teaching! I am indeed blessed to be doing both.

References

Allen, I. E., & Seaman, J. (2013). Changing course: Ten years of tracking online education in the United States. Babson Survey Research Group and Quahog Research Group, LLC.
Andreu-Funo, L. B. (2010). Teletandem e formação contínua de professores vinculados à rede pública de ensino do interior paulista: Um estudo de caso. (Dissertação de Mestrado). Programa de Pós-Graduação em Estudos Linguisticos, UNESP/IBILCE, São José do Rio Preto.
Brooks, F. B. (1990). Foreign language learning: A social interaction perspective. In B. VanPatten & J. Lee (Eds.), SLA-FLL: On the relationship between second language acquisition and foreign language learning (pp. 153–169). Clevedon, UK: Multilingual Matters.
Bryant, T. (2006). Using World of Warcraft and other MMORPGs to foster a targeted, social, and cooperative approach toward language learning. Academic Commons, The Library.
Byram, M. (1997). Teaching and assessing intercultural communicative competence. Clevedon, UK: Multilingual Matters.
Cerezo, L., Baralt, M., Suh, B-R., & Leow, R. P. (2013). Does the medium really matter in L2 development? The validity of CALL research designs. Computer Assisted Language Learning, 27(4), 294–310.
Danova, T. (2013, November 13). Including wearables and smart TVs, global active Internet devices to grow 19% annually through 2018. Business Insider. Retrieved January 21, 2014, from http://www.businessinsider.com/wearables-and-smart-tvs-will-boost-the-global-installed-base-of-internet-devices-10-this-year-17-in-5years-2013-11.
Garcia, D. N. M. (2010). Teletandem: Acordos e negociações entre os pares. (Tese de Doutorado). Programa de Pós-Graduação em Estudos Linguisticos, UNESP/IBILCE, São José do Rio Preto.
Garrison, D. R., & Vaughan, N. D. (2013). Institutional change and leadership associated with blended learning innovation: Two case studies. Internet and Higher Education, 18, 24–28.
Goertler, S. E. (2011). Blended and open/online learning: Adapting to a changing world of language teaching. In N. Arnold & L. Ducate (Eds.), Present and future promises of CALL: From theory and research to new directions in language teaching (pp. 471–502). San Marcos, TX: CALICO.
González-Lloret, M. (2003). Designing task-based CALL to promote interaction: En busca de esmeraldas. Language Learning & Technology, 7(1), 86–104.
Graham, C. R., Woodfield, W., & Harrison, J. B. (2013). A framework for institutional adoption and implementation of blended learning in higher education. Internet and Higher Education, 18, 4–14.
Gruba, P., & Hinkelman, D. (2012). Blending technologies in second language classrooms. Basingstoke, UK: Palgrave Macmillan.
Hulstijn, J. H., Young, R. F., Ortega, L., Bigelow, M., DeKeyser, R., Ellis, N. C., Lantolf, J. P., Mackey, A., & Talmy, S. (2014). Bridging the gap: Cognitive and social approaches to research in second language learning and teaching. Studies in Second Language Acquisition, 36, 361–421.
Jauregui, K., Canto, S., de Graaf, R., Koenraad, T., & Moonen, M. (2011). Verbal interaction in Second Life: Towards a pedagogic framework for task design. Computer Assisted Language Learning, 24(1), 77–101.
Kramsch, C. J. (1987). Interactive discourse in large and small groups. In W. M. Rivers (Ed.), Interactive language teaching (pp. 17–32). New York: Cambridge University Press.
Kukulska-Hulme, A. (2009). Will mobile learning change language learning? ReCALL, 21, 157–165.
Liou, H.-C. (2012). The roles of Second Life in a college computer-assisted language learning (CALL) course in Taiwan, ROC. Computer Assisted Language Learning, 25(4), 365–382.
O’Dowd, R. (2013). Telecollaborative networks in university higher education: Overcoming barriers to integration. Internet and Higher Education, 18, 47–53.
Owston, R. (2013). Blended learning policy and implementation: Introduction to the special issue. Internet and Higher Education, 18, 1–3.
Petersen, K., & Sachs, R. (2015). The language classroom in the age of networked learning. To appear in R. P. Leow, L. Cerezo, & M. Baralt (Eds.), Technology and L2 learning: A psycholinguistic approach. Berlin: De Gruyter Mouton.
Peterson, M. (2012). EFL learner collaborative interaction in Second Life. ReCALL, 24(1), 20–39.
Rhodes, N. C., & Pufahl, I. (2009). Foreign language teaching in U.S. schools: Results of a national survey. Washington, D.C.: Center for Applied Linguistics.
Sykes, J. M. (2014). TBLT and synthetic immersive environments. What can in-game task restarts tell us about design and implementation? In M. González-Lloret & L. Ortega (Eds.), Technology-mediated TBLT: Researching technology and tasks (pp. 149–182). Philadelphia: John Benjamins.
Sykes, J. M., & Holden, C. (2009). Mentiras. http://arisgames.org/featured/mentira/
Taylor, J. A., & Newton, D. (2013). Beyond blended learning: A case study of institutional change at an Australian regional university. Internet and Higher Education, 18, 54–60.
Telles, J. A. (Ed.). (2009). Teletandem: Um contexto virtual, autônomo e colaborativo de aprendizagem de línguas estrangeiras para o século XXI. Campinas: Pontes Editores.
Telles, J. A., & Vassallo, M. L. (2006). Foreign language learning in-tandem: Teletandem as an alternative proposal in CALLT. The ESPecialist, 27(2), 189–212.
Thompson, A., & Rodriguez, J. C. (2004). Computer gaming for teacher educators. Journal of Computing in Teacher Education, 20(3), 94–96.
Vassallo, M. L., & Telles, J. A. (2006). Foreign language learning in-tandem: Theoretical principles and research perspectives. The ESPecialist, 27(1), 83–118.
Wehner, A., Gump, A. W., & Downey, S. (2011). The effects of Second Life on the motivation of undergraduate students learning a foreign language. Computer Assisted Language Learning, 24(3), 277–289.

INDEX

acquisition xi, 78, 83, 92–5, 124–5, 271 apperception 41, 56, 92–4, 99–100; apperceived input 6, 92 awareness 2, 18, 24, 41, 49, 51, 94, 137; activating 80; analysis 196; attention and 7, 37, 70, 207; concept of 96; conscious 42, 56, 70, 165; construct 48, 76, 120, 149, 184–6, 195; and depth of processing 97–9, 123–4, 204, 208–9, 246; differential levels of 216, 225; intercultural 273; learner 91; learning with 5–6, 39, 58, 60–1, 126–7, 129; learning without 190; measures of 192, 197–8; measuring 147, 193; model 239, 241, 246; and the Noticing Hypothesis 72–3; operationalized 150, 187, 194, 256; pedagogical implication 198; presence of 75; prior knowledge and 99–100; psycholinguistic underpinnings 68; raising 255, 265; research design 146; role of 4, 50, 52, 59, 63, 82, 92–3, 145, 166, 178, 206, 245–6; strand 232, 254; studies in 62; Think Aloud Protocols 141, 144, 168, 196, 226; threshold of 227; without 191 awareness at the level of noticing 72, 82, 188–9, 219–20; at the level of understanding 81, 215, 218–19; at the level of reporting 219 awareness levels 16, 71–2, 78, 111–12, 118, 175–7, 199, 221–2, 228–9, 242, 268, 275; higher level of 169, 188, 214, 217, 223, 244; low-level 159–60, 171, 243

Cognitive Code Method 3 cognitive effort 38, 78–80, 125, 216; depth of processing 203–5, 208–13, 217–20, 225–8, 231–3; level of 18, 30–1, 41, 73, 118, 189, 241–2; measure 140; roles of 242–5 cognitive neuroscience 6, 52, 56, 68–9, 177, 205; field of 24, 29, 33, 48; modeling 53; sources 76 cognitive overload 25–6, 29, 161, 215, 225, 258 cognitive psychology 25, 34, 82, 147, 210, 232; defining implicit learning in 58, field of 24, 37, 40, 60, 62 69, 87, 150; investigating implicit learning in 59; limited capacity models in 80; research in 17, 138, 185, 204, 208, 240; studies in 71, 194; theoretical underpinnings 241, 247; theory in 4, 6, 52, 70, 161 comprehended input 92–4, 208 computer-assisted language learning (CALL) 167, 216, 232, 254–5, 272–3 conceptually-driven processing 19, 81, 99–100, 146, 188, 222, 243–5, 254–5, 268 consciousness 77, 95, 124, 184, 206, 241; and attention 41; in cognitive psychology 59; empirical research 160; input or textual enhancement strand 165–6; non-SLA 51–8; raising 4, 48–50, 90–1; Schmidt 70, 72; within MOGUL 96, 98

280

Index

data-driven processing 19, 81, 100, 146, 222, 243–5 deductive see explicit learning depth of processing 38, 41, 146, 218, 220–2, 226–9; in cognitive psychology 204–5; definition 204; in experiments 209–11; in Model of L2 Learning Process in Instructed SLA 241–3, 244–6; in other strands of research in SLA 229–32; prior knowledge role 222–5; in SLA 208; in Think Aloud Protocols 213–17 enhancement 117, 159, 208; input 4, 72, 160, 165–71, 175, 210; textual 111, 168, 176, 204, 229 e-tutors 254–8, 265, 272 explicit knowledge 4, 60, 62, 129, 165, 207; and awareness 126; and learning 124; measuring 128 explicit learning 6, 8, 15, 126–7, 129, 253, 271; benefits of 189, 198; deductive instruction 184; model 246; non-SLA definitions 58–60, 62; pedagogy 227; in studies 192; support for 187 eye-tracking (ET) 73, 120, 129–30, 136, 140–1, 146, 242; depth of processing 202, 226; measuring 129; research methodology 112, 140–1, 146 Gass and Mackey’s Interaction Approach 84, 106, 189–90 Gass’s Model of Second Language Acquisition 42, 92–4 Grammar Translation Method 4, 90, 184 hybrid curriculum x, 199, 270, 272, 274–5 implicit knowledge 4, 89, 124, 126, 150, 186, 227; and learning 190, 193–4; measure 128 implicit learning 6, 8, 15, 190, 197–9, 227, 231, 242; and awareness 126; characterization 62; in cognitive psychology 58–61; Ellis 88–9; and Reaction Time 137–9; Robinson 80; Schmidt 70, 72; in SLA field 185–6; strand 131–2; in studies 150, 192–4, 220 incidental learning 59, 117, 186, 204–5, 214; acquisition 124; tasks 211; vocabulary learning 210 inductive see implicit learning Input Hypothesis 106, 190

input processing 51, 71, 73–7, 80–6, 92, 173, 186, 227; and depth of processing 215; model 239–42; and processing instruction 171; stages of SLA 17–18; and VanPatten 162
input/textual enhancement ix, 159
instructed SLA 129, 239, 241, 246; key features 246–7
intake processing 17–20, 77, 127–30, 257; model 239–46
Interactionist Approach 90, 92, 173–4, 177
interlanguage 3, 18, 20, 79, 90, 128, 174; output 245; prior knowledge 99; Think Aloud 141
internal processes 112, 120, 136–7, 147, 151; learners’ 4–8, 83, 177
internal validity 7, 63, 73, 109, 137; criteria for 113–14; external validity 120–1; measurement 116–19; sample of high 111–12; think alouds 137–8, 149
item learning 123, 125, 128, 131, 133, 243, 245, 246
knowledge processing 17, 20, 90–2, 99, 127, 245; model 239, 241–2, 246
learning condition 59, 69, 229
Leow’s model of the L2 learning process in Instructed SLA viii, xiii, 239, 241–6
McLaughlin’s Cognitive Theory 6, 68, 77–8, 208; comments on 80; of input 242; of language 270
metacognitive 141–3, 145, 168, 213
N.C. Ellis’s Associative-Cognitive CREED 68, 84, 87–9, 99, 103–5
noticing 71–3, 76, 82, 130, 140, 173, 175, 217, 256; awareness at the level of 219–20; depth of processing 208, 211, 214, 228; empirical research 188–9, 196, 203; experiment 111, 118, 120; Gass’s model 93; in input processing 177; MOGUL 95–8; prior knowledge 99–100, 222; role of 159–61; Stimulated Recall 148; studies 168–9, 171; Swain 91; textual enhancement 229–30; Think-Aloud 144, 215; see also Schmidt’s Noticing Hypothesis
perception of 23–4, 40, 48, 52, 57, 96, 165, 185, 197, 205; learners’ perception 174; researchers’ perception 95; subliminal perception 96–7; visual perception 61, 140
prior knowledge 51, 99; activation of 16, 144–5, 205, 218, 245; depth of processing 222; methodology 111–12; model 241, 243–4; role of 54, 56, 81–2, 91, 94; theoretical 80
reaction time (RT) 34–5, 77, 145–6, 129–30, 203; depth of processing 203, 226; empirical research 192–4; major characteristics 146; methodology 120; online verbal reports 141
reactivity 8, 142–3, 176, 197, 246
Robinson’s Model of the Relationship between Attention and Memory 6, 80–2
Schmidt’s Noticing Hypothesis 6, 69–70, 80–1, 95, 173, 187–8; comments on 73, 94; hypothesis 241; key features 72; similar to 215
student-centered ix, 257, 270
subliminal 41, 57; learning 35; perception 96–7; processing 50
Swain’s Output Hypothesis 20, 68, 90–2, 99, 174
synchronous computer-mediated communication (SCMC) 149, 174–5, 204, 272
system learning 123, 125, 128, 132, 133, 244, 245, 246


teacher-centered xi, 2, 86, 216, 257, 276
think aloud protocols (TA) 136, 141–6, 163, 168, 178, 187–8, 196–9, 219–20, 207, 212–18, 227
think alouds 23, 89, 160; awareness 203; depth of processing 231; internal validity 129, 133; research 220
3D Gustar Maze 257–8, 263, 265
Tomlin and Villa’s Model of Input Processing 30, 73–5, 77, 82
Truscott and Sharwood Smith’s Modular Online Growth and Use of Language (MOGUL) 84–5, 110–15, 182, 225, 256
unconscious 41, 50, 55, 60, 147; prior knowledge 54; processing 49, 51–3, 55–7, 78, 125, 139
VanPatten’s Input Processing Model/Theory 84, 98, 102, 104
working memory 16–17, 19, 51, 100; learning 127–8, 130–1; model 241–3, 246; oral feedback strand 231; pedagogy 254–5, 268, 275; recasts 174; Robinson 81–2; theoretical foundations 38–41, 51, 53, 56, 96; VanPatten 83, 85
working memory capacity (WMC) 54–6
