Overview of BAWE
The British Academic Written English (BAWE) corpus was created through a project entitled 'An investigation of genres of assessed writing in British Higher Education', which ran from 2004 to 2007. The project was funded by the Economic and Social Research Council (Project number RES-000-23-0800) and was a collaboration between the Universities of Warwick, Reading and Oxford Brookes.
The BAWE corpus contains 2761 pieces of proficient assessed student writing, ranging in length from about 500 words to about 5000 words. Holdings are fairly evenly distributed across four broad disciplinary areas (Arts and Humanities, Social Sciences, Life Sciences and Physical Sciences) and across four levels of study (three undergraduate years and taught masters level). Thirty-five disciplines are represented.
The assignments have been annotated using a system devised in accordance with the TEI guidelines. The header of each file includes factual information, such as gender and year of birth, along with some research findings from the original project team, such as genre family. A DTD file named tei_bawe.dtd must be kept in the same folder as the corpus files. The holdings are described in an Excel spreadsheet, 'BAWE.xls', and the transcription and mark-up conventions are described in the BAWE manual, which is in PDF format.
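As a rough illustration of the folder requirement described above, the following Python sketch checks that tei_bawe.dtd sits alongside the corpus files in a local copy. The function name check_corpus_folder is hypothetical, and the .xml extension for corpus files is an assumption rather than something stated here, so adjust both to match your own download.

```python
from pathlib import Path

# File name taken from the corpus description above.
DTD_NAME = "tei_bawe.dtd"


def check_corpus_folder(folder):
    """Return (dtd_present, xml_files) for a local copy of the corpus.

    Assumption: corpus files carry the .xml extension.
    """
    root = Path(folder)
    # The DTD must be in the same folder as the corpus files.
    dtd_present = (root / DTD_NAME).is_file()
    # List corpus files by name, sorted for a stable result.
    xml_files = sorted(p.name for p in root.glob("*.xml"))
    return dtd_present, xml_files
```

Calling check_corpus_folder on the folder that holds your copy of the corpus returns a flag showing whether the DTD was found, plus the names of the files it would apply to. A False flag suggests the DTD should be copied into that folder before the files are parsed.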
The corpus is available free of charge to non-commercial researchers who register with the Oxford Text Archive and agree to the conditions of use. It can be accessed through the Oxford Text Archive (https://ota.bodleian.ox.ac.uk/repository/xmlui/handle/20.500.12024/2539) as resource number 2539. The resource includes text files, a spreadsheet with contextual information, and a corpus manual.
One of the original Principal Investigators, Professor Hilary Nesi of Coventry University, manages a useful website about BAWE and maintains a database of research articles based on the corpus. Please contact her to add your research to the list.
For more information about the BAWE corpus holdings at Warwick, please email firstname.lastname@example.org
Overview of BAWE Plus
BAWE Plus is a collection of resources for research into academic written English in the UK in the twenty-first century. In addition to the BAWE corpus, it includes the following main components:
Supplementary BAWE data
The WELT pilot corpus
This is a collection of written answers at grade B and above from the former Warwick English Language Test, which is no longer in use.
Other Academic English Resources
Applied Linguistics at Warwick also holds a collection of British Academic Spoken English, comprising a corpus and associated resources. See BASE Plus.
We welcome proposals from potential doctoral students and other researchers interested in working with these resources.
To contact us, please email email@example.com