Text Data in Economics Masterclass

QAPEC, in collaboration with the PEPE Research Group, is offering a training opportunity to acquire important skills in text data analysis. The masterclass is intended for MRes-PhD students, staff of the Department, and QAPEC and PEPE affiliates.


Prof Elliott Ash, from ETH Zurich, will deliver the course. Elliott's research and teaching focus on empirical analysis of the law and legal system using techniques from econometrics, natural language processing, and machine learning.


Prerequisite: knowledge of Python basics. Supplementary materials will be provided prior to the course for those who need to learn Python (coming soon).

Learning Objectives

1. To implement and evaluate text-as-data methods.

2. To evaluate the use of text-analysis tools in economics research.

3. To plan a research project using text data.

Teaching Team

Instructor: Elliott Ash

TA: Claudia Marangon


Lectures: two 1.5-hour lectures per week for 4 weeks on Zoom, from mid-June to mid-July 2022

TA office hours: TBC


Course Format

  • 10 lectures on Zoom (10 hours), recorded
  • 4 TA sessions on Zoom (4 hours), recorded
  • In-person workshopping of student project papers


  • 3 problem sets based on the example notebooks
  • In-class presentation of a course reading, with a partner (sign-up sheet)
  • Referee report on one of the course readings
  • Research proposal on a text-data project (first and second draft, individually or with a partner)

Critical Presentations

  • Done in pairs
  • 10 minutes maximum
  • Present and critique the following:
    - research question
    - text-analysis method
    - empirical methods
    - results
    - contribution

Lecture Schedule

June 15th 13h UK Lecture 1
June 17th 10h Lecture 2
June 20th 10h Lecture 3
June 22nd 10h Lecture 4
June 29th 15h Lecture 5
July 4th 10h Lecture 6
July 6th 10h Lecture 7
July 8th 10h Lecture 8
July 11th 10h Lecture 9
July 12th 10h Lecture 10

TA Session Schedule

June 16th TBD TA Session 1
June 21st TBD TA Session 2
July 5th TBD TA Session 3
July 13th TBD TA Session 4

Topics Outline and Main Economics Readings

Sign-up sheet for discussant presentations

1. Overview

a. Gentzkow, Kelly, and Taddy, “Text as Data.”

2. Style Features and Dictionaries

a. Enke (2020), Moral values and voting
b. Michalopoulos and Xue (2021), Folklore

3. Tokenization

a. Gentzkow and Shapiro (2010), What Drives Media Slant? Evidence from U.S. Daily Newspapers.
b. Hassan, Hollander, Van Lent, and Tahoun (2019), Firm-Level Political Risk: Measurement and Effects

4. Document Distance

a. Kelly, Papanikolaou, Seru, and Taddy, Measuring technological innovation over the very long run
b. Cage, Herve, and Viaud, The production of information in an online world

5. Topic Models

a. Hansen, McMahon, and Prat, Transparency and deliberation within the FOMC: A computational linguistics approach.
b. Ash, Morelli, and Vannoni, “More laws, more growth? Evidence from U.S. states.”

6. Supervised Learning

a. Gentzkow, Shapiro, and Taddy (2019), Measuring group differences in high-dimensional choices: Method and application to Congressional Speech
b. Widmer, Galletta, and Ash (2022), Media Slant is Contagious

7. Word Embeddings

a. Ash, Chen, and Ornaghi (2022), “Gender attitudes in the judiciary: Evidence from U.S. Circuit Courts.”
b. Ash, Gennaro, Hangartner, and Stampi-Bombelli (2022), “Immigration and Social Distance: Evidence from Newspapers during the Age of Mass Migration”.

8. Syntactic and Semantic Parsing

a. Antoniak, Mimno, and Levy (2019), Narrative paths and negotiation of power in birth stories

b. Ash, Gauthier, and Widmer (2022), Relatio: Text semantics capture political and economic narratives

9. Additional Topics

a. Ash, Durante, Grebenschikova, and Schwarz (2022), Visual Representation and Stereotypes in News Media.

b. Transformers tutorial


Further helpful resources can be found at the following links:

Form Submission

Please complete the form below to express your interest in partaking in the masterclass.

Privacy notice

The University of Warwick will process your personal data provided in this online registration form for the purposes of registering your interest to attend the sessions.

The legal basis for processing this personal data is that it is necessary for the performance of services.

Your personal data will not be shared or disclosed to any third parties external to the University of Warwick. Your personal data will not be transferred outside of the UK, will be kept securely by the University of Warwick and will be retained indefinitely.

The University of Warwick is the Data Controller of any information you have entered on this form and is committed to protecting the rights of individuals in line with Data Protection Legislation. The University's Data Protection webpages provide further information on your rights and how the University processes personal data. If you wish to submit a data subjects rights request, make a complaint or report a suspected personal data breach, please contact the University’s Data Protection Officer by email at
