CABank English CABNC Corpus

Saul Albert
Social Sciences
Loughborough University


Laura de Ruiter
Department of Psychology
Tufts University


J.P. de Ruiter
Department of Computer Science
and Department of Psychology
Tufts University


Participants: ~400
Type of Study: Subcorpus converted to CHAT for TalkBank
Location: UK
Media type: audio
DOI: doi:10.21415/T55Q5R

Browsable transcripts

Download transcripts

Media folder

Citation information

Saul Albert, Laura E. de Ruiter, and J.P. de Ruiter (2015) CABNC: the Jeffersonian transcription of the Spoken British National Corpus.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

The CABNC corpus is an open-licensed, detailed conversation-analytic re-transcription of naturalistic conversations from a subcorpus of the British National Corpus, amounting to around 4.2 million words in 1,436 separate conversations. The project aims to produce transcripts usable for both computational and detailed qualitative analysis. If you are a CA transcriptionist and you use the data, please resubmit your updated transcripts to help improve the corpus over time. The project website, with instructions for contributing, is at /CABNC