Corpus Methods for Research in Pragmatics

Judith Degen

  • Area: LaLo
  • Level: I
  • Week: 1
  • Time: 11:00 – 11:30
  • Room: C2.01

Traditionally, the primary source of data in pragmatics has been researchers’ intuitions about utterance meanings. However, relying on a small number of introspective judgments about examples that are hand-selected by the very researchers who provide those judgments introduces bias into the study of the phenomena under investigation. The recently emerging use of experimental methods for probing linguistically untrained language users’ interpretations has ameliorated the bias introduced by small numbers of judgments. It cannot, however, remove item bias, because researchers artificially construct the stimuli used in experiments. Fortunately, studying corpora of naturally occurring language can reduce item bias. Corpora provide naturally occurring utterances that can be used in tandem with platforms like Mechanical Turk to collect large-scale crowd-sourced interpretations of these utterances, making it possible to construct large databases of different types of meanings (e.g., implicatures) in context.

To introduce course participants to the use of corpora of naturally occurring language for research in semantics/pragmatics, and to equip them with practical skills for conducting their own research in this area, the course will contain a substantial hands-on component. We will use tools for searching syntactically parsed corpora (tgrep2, TDTlite) as well as tools for analyzing and visualizing data (R, in particular the lme4 and ggplot2 packages).