TalkBank CABank

This page provides an index to TalkBank CA data. These files include legacy CA transcripts, transcripts in the newer CA/CHAT format, and CA-relevant data not yet in CA format. All of the materials referenced here are naturalistic conversations amenable to CA analysis.

We use stars, from one to five, to rate the extent to which the transcripts capture detailed conversational phenomena. Corpora with only one star have not yet been transcribed in CA format, but they include interesting materials that could eventually be transcribed in that format.

The following table first lists CABank corpora and then lists three additional TalkBank corpus collections relevant to CA work.

You can also browse the CABank database online from this link.

| Corpus | Description | Contributor | Rating |
|---|---|---|---|
| Bergmann | German emergency phone calls, recorded by Jörg Bergmann. | Johannes Wagner | ***** |
| Bradford | Narrative samples from African American adults from Washington, D.C. | Angela Bradford Wainwright | * |
| CallFriend | Phone calls in Chinese, English, French, German, Japanese, and Spanish. | Linguistic Data Consortium | ** |
| CallHome | Phone calls in Arabic, Chinese, English, German, Japanese, and Spanish. | Linguistic Data Consortium | ** |
| CABNC | Spoken language segment of the British National Corpus. | Saul Albert | ***** |
| CLAPI | French conversations from the CLAPI Project. | Lorenza Mondada | ***** |
| CMU | Conversations collected by students at CMU. These can only be used for teaching purposes. | Brian MacWhinney | * |
| Croatian | Spontaneous informal conversations of adult speakers of different ages in a wide variety of Croatian dialects. | Gordana Hrzica and Jelena Kuvac-Kraljevic | *** |
| Examples | Examples for testing the TalkBank browser. | Johannes Wagner and Brian MacWhinney | *** |
| Garfinkel-Seminars | Lectures by Harold Garfinkel, contributed by Johannes Wagner. | Johannes Wagner | ***** |
| GCSAusE | Australian conversations. | Michael Haugh | ***** |
| Grimshaw | An hour-long dissertation defense. | Allen Grimshaw | * |
| GulfWar | Radio call-in show discussions during the first day of the first Gulf War. | Elizabeth Couper-Kuhlen | *** |
| ISL | Conversations recorded to test ASR methods for meetings. | Susie Burger | * |
| JOC | Eight conversations from a special issue of the Journal of Communication. | Curtis LeBaron | * |
| Mambila | Conversations in Mambila. | David Zeitlyn | ** |
| MOVIN | Conversations in Danish, German, French, English, and Italian. | Johannes Wagner | *** |
| Nahuatl | Nahuatl story of a shooting. | Jane Hill | *** |
| Sakura | Videotaped conversations of groups of 4 Japanese college students, not yet in CA format. | Susanne Miyata | ** |
| SBCSAE | The Santa Barbara Corpus of Spoken American English. A wide variety of interactional types. | Jack DuBois | **** |
| SCoSE | The Saarbrücken Corpus of Spoken (American) English. | Dennis Norick | * |
| SPIRE | HCI design discussions. | Jakob Buur | * |
| Taiwan Mandarin | Conversations in Taiwan Mandarin. | Kawai Chui | *** |
| Taiwan Hakka | Conversations and narratives in Taiwan Hakka. | Huei-ling Lai | *** |
| Yiddish | Hassidic Jews in New York speaking Yiddish. | Zelda Newman | ** |
| Yucatec | Storytelling in Yucatec Mayan. | John Haviland | ** |
| Lingua Franca | Lingua Franca transcriptions by Gail Jefferson. | Johannes Wagner | ***** |
| Newport Beach | Newport Beach transcriptions by Gail Jefferson. | Johannes Wagner | ***** |
| Poetics | Lecture in Boston in 1977 by Gail Jefferson. | Johannes Wagner | ***** |
| Watergate | Watergate phone call transcriptions by Gail Jefferson. | Johannes Wagner | ***** |
| SCOTUS-Blackmun | Interview with Justice Harry Blackmun. | Jerry Goldman | * |
| SCOTUS-Douglas | Interview with Justice William O. Douglas. | Jerry Goldman | * |
| SCOTUS_Oral_Arguments | Oral arguments in the US Supreme Court. We have 38 years, each in its own .zip archive. | Jerry Goldman | * |

| Collection | Description | Contributor | Rating |
|---|---|---|---|
| SamtaleBank | Danish CA corpora from the DK-CLARIN Project. Register here for access to protected data. | Johannes Wagner | ***** |
| BilingBank | Corpora from multilingual groups engaged in code-switching. | various | ** |
| ClassBank | A wide variety of videotaped classroom lessons and work group interactions. | various | * |
| Password | Password-protected corpora. | various | * |