
BASE (British Academic Spoken English) and BASE Plus Collections

Overview of BASE

The British Academic Spoken English (BASE) project took place at the Universities of Warwick and Reading between 2000 and 2005, under the directorship of Hilary Nesi (Warwick), with Paul Thompson (Reading). Natalie Snodgrass and Sarah Creer were employed as research assistants, and Tim Kelly was the project's video producer. Lou Burnard (Oxford University) and Adam Kilgarriff (Lexicography MasterClass Ltd) acted as consultants.

The BASE Corpus consists of 160 lectures and 40 seminars recorded in a variety of departments (video-recorded at the University of Warwick and audio-recorded at the University of Reading). It contains 1,644,942 tokens in total (lectures and seminars). Holdings are distributed across four broad disciplinary groups, each represented by 40 lectures and 10 seminars.
The corpus has been deposited in the Oxford Text Archive and is catalogued by the Arts and Humanities Data Service. It is also available from this site via the BASE Files page.


The early stages of corpus development were assisted by funding from the Universities of Warwick and Reading, BALEAP, EURALEX, and The British Academy (2000–2001, Grant reference: SG 30284). Major funding was provided by the Arts and Humanities Research Council as part of their Resource Enhancement Scheme (2001–2005, Award Number: RE/AN6806/APN13545).

Overview of BASE Plus
BASE Plus is a larger collection of British Academic Spoken English data held at the Centre for Applied Linguistics. It comprises the following:
  • Text and tagged transcripts of the lecture and seminar holdings (i.e. the BASE collection);
  • Video and audio recordings of lectures and seminars recorded in a variety of university departments;
  • Video recordings of academic conference presentations;
  • Interviews with academic staff on aspects of their academic work and field (audio recordings, transcripts, interview notes).
The BASE Plus collection is of great value to researchers (research students, academic staff and visiting academics) for purposes such as the following:
i. Discourse, pragmatic and multimodal analyses of authentic academic English discourse; e.g.:
  • the structure of academic lectures;
  • argumentation in seminars;
  • the discourse function of intonation;
  • the interplay of visual and aural stimuli;
  • patterns of interaction, including turn-taking and topic selection;
  • the representation of ideas and the expression of attitudes;

ii. Corpus linguistic analyses of transcripts; e.g.:

  • the frequency and range of academic lexis;
  • the meaning and use of individual words and multi-word units;

iii. Pedagogic analyses of methods and styles of academic lectures and seminars at a British university; e.g.:

  • the pace, density and delivery styles of academic lectures;
  • styles of engagement and interaction in academic lectures and seminars;
iv. Cross-cultural comparisons with comparable data from other countries and languages;
v. Research methodology illustrations and analyses of interviewing.
BASE Plus may also be compared with other corpora, such as MICASE (Michigan Corpus of Academic Spoken English) and the T2K-SWAL (TOEFL 2000 Spoken and Written Academic Language) corpus.
BASE Plus is a collection of British spoken academic discourse at the turn of the 21st century. In the future it may be compared with corpora compiled to investigate diachronic change in academic language use.
The BASE Plus video recordings have been used for a number of materials development projects at the University of Warwick, most notably the Essential Academic Skills in English (EASE) series; please note that EASE: Seminar Discussions and EASE: Listening to Lectures are now available in an online format.
The text and tagged transcripts of the original BASE corpus were developed as part of the British Academic Spoken English corpus project, 2000–2005, and are available from this site as well as from the Oxford Text Archive. The video and audio resources for the entire BASE Plus collection are held only in the Centre and are NOT available for purchase. They are available only to students and academic staff in the Centre for Applied Linguistics, for research and teaching purposes, and to official academic visitors to the Centre.