Abstract
Since the pioneering work of John Sinclair on building and using corpora for researching, describing and teaching language, much thought has been given to corpora in Applied Linguistics (Hunston 2002), how to use corpora in language teaching (Sinclair 2004), teaching and learning by doing corpus analysis (Kettemann / Marko 2002) and similar themes. A look at the titles of recent papers, monographs and edited volumes (which are printed in italics in this introduction) suggests that Applied Corpus Linguistics (Connor / Upton 2004) has established itself as a specific and expanding field of study. It has provided ideas on how to manage the step from corpora to classroom (O’Keeffe et al. 2007) and has produced a growing body of research into the use of corpora in the foreign language classroom (Hidalgo et al. 2007). At face value, the enthusiasm of the research community seems to be increasingly shared by practising teachers. At many teacher training seminars at which I have discussed the use(fulness) of corpus resources, I have met teachers who, at the end of the seminar, were eager to use corpora with their students and were especially interested in the growing number of easily accessible web-based resources. But in spite of everyone’s best intentions, the use of corpora in language classrooms remains the exception, and the question of what it takes to get past ‘Groundhog Day’ in corpus-based language learning and teaching is far from being answered.
Spoken corpora may not be the obvious solution. The use of Spoken corpora in Applied Linguistics (Campoy / Luzón 2007) is usually considered to be more challenging than the use of written corpora, since spoken language is often perceived to be ‘messy’, grammatically challenging and lexically poor. Moreover, spoken corpora have traditionally been more difficult to build and distribute. However, multimedia technologies have not only made this easier but have also opened up new ways of exploiting corpus data.
Against this backdrop, this paper will argue that spoken multimedia corpora are not simply an interesting type of corpus for language learning, but that they can in fact lead the way in bringing corpus technology and language pedagogy together (Braun et al. 2006). After a brief review of some of the prevailing obstacles to a more widespread use of corpora by students and some common approaches and solutions to the problems at hand (in section 2), one approach to designing a pedagogically viable corpus will be discussed in more detail (in section 3). The approach will then be exemplified (in section 4) using the ELISA corpus, a spoken multimedia corpus of professional English, to illustrate how corpus-based work can be expanded beyond the conventional methods of ‘data-driven learning’. The paper concludes with an outlook on some more recent initiatives in spoken corpus development (in section 5). The wider aim of this paper is to stimulate further discussion about, and research into, the development of pedagogically viable corpora, tools and methods which can foster student-centred corpus use in language learning and in other areas such as translator / interpreter training and the study of language-based communication in general.