AI-powered content analysis: Using generative AI to measure media and communication content
Methods tutorial #28831(a), module (political) communication research methods
Syllabus
Last updated: 01. 12. 2025, 16:43
Important links
Blackboard
Overview
Large language models (LLMs; starting with Google's BERT) and particularly their implementations as generative or conversational AI tools (e.g., OpenAI's ChatGPT) are increasingly used to measure or classify media and communication content. The idea is simple yet intriguing: Instead of training and employing humans for annotation tasks, researchers describe the concept of interest to a model such as ChatGPT, present the coding unit, and ask for a classification. The first tests of the utility of ChatGPT and similar tools for content analysis were positive to enthusiastic (Gilardi et al., 2023; Heseltine & Clemm von Hohenberg, 2024; Rathje et al., 2024). User-friendly tutorials have spread the method among social scientists (Stuhler et al., 2025; Törnberg, 2024a). Yet (closed-source, commercial) large language models are not entirely understood even by their developers, and their uncritical use has been criticized on ethical grounds (Bender et al., 2021; Spirling, 2023; Widder et al., 2024).
In this seminar, we will engage practically with this cutting-edge research method. We start with a quick refresher on the basics of quantitative content analysis (both human and computational) and an overview of the rapidly developing literature on LLMs' utility in this field. The main part of the seminar will be dedicated to learning step-by-step how to use and evaluate a generative AI model for applied content analytical research. In the end, students should be able to use the method in their own research.
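The zero-shot procedure described above (describe the concept, present the coding unit, ask for a classification) can be sketched roughly as follows. This is only an illustration and not the seminar's actual tutorial code: the seminar works in R, Python is used here for brevity, and the category set, the concept description, and the helper names `build_prompt` and `parse_label` are all hypothetical. The model call itself is left out on purpose.

```python
# Hypothetical category set, for illustration only.
CATEGORIES = ["positive", "negative", "neutral"]


def build_prompt(concept_description, categories, coding_unit):
    """Combine the concept description, allowed categories, and one coding unit."""
    options = ", ".join(categories)
    return (
        f"{concept_description}\n\n"
        f"Classify the following text into exactly one of these categories: {options}.\n"
        "Answer with the category name only.\n\n"
        f"Text: {coding_unit}"
    )


def parse_label(model_answer, categories):
    """Map a free-text model answer to a known category, or None if it is unclear."""
    cleaned = model_answer.strip().strip(".").lower()
    return cleaned if cleaned in categories else None


prompt = build_prompt(
    "Sentiment is the overall evaluative tone of a text.",  # assumed concept description
    CATEGORIES,
    "The new policy is a great success.",  # assumed coding unit
)
print(prompt)

# The LLM call is omitted; any chat-style model could receive `prompt` as input.
# Here we only show how a returned answer string would be mapped to a category.
print(parse_label(" Positive. ", CATEGORIES))
```

Returning `None` for answers outside the category set makes unparseable responses visible instead of silently forcing them into a category.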
Course plan
Preliminary, subject to change!
Introduction
(1) 13. 10.: Hello
Class content: Introduction, demo, and organization
Organization: Find a teammate for the state-of-the-art presentation. The goal is to find someone with a complementary skill set.
Homework:
- Listen to this podcast episode with Petter Törnberg: LLMs in Social Science. For context: This podcast was recorded when ChatGPT was still new, so there is a lot of excitement and (too) much focus on OpenAI's GPT models.
- Register your presentation in the Blackboard Wiki.
- Register for an account for the computing environment (instructions in Blackboard).
(2) 20. 10.: Refresher: Traditional content analysis (human and computational)
Class content: Quick refresher on the basics of quantitative content analysis (both human and computational)
Texts (if needed):
State of the art
Class content: Short presentations on current work about LLM-based zero-shot classification
- Short presentations (10-15 minutes)
- One paper presented by two participants (teams of three only if all other texts already have two)
(3) 27. 10.: State of the art I (Pioneers)
(4) 03. 11.: State of the art II (Communication Research)
(5) 10. 11.: State of the art III (Sociology & Political Science)
(6) 17. 11.: State of the art IV (Questions and Ideas)
Tutorial
Class content: Step-by-step tutorial to set up and evaluate a zero-shot content analysis with generative AI tools
(7) 24. 11. … (11) 05. 01.
Application
Class content: Students' own adaptation of the classification procedure from the tutorial, performance evaluation
(12) 12. 01. … (14) 26. 01.
Results
Class content: Short presentations on the students' own adaptations
(15) 02. 02. & (16) 09. 02.
Aims
After the seminar, students should be able to:
- understand the methodological literature on zero-shot content analysis with generative AI tools.
- critically evaluate and improve the performance of a classifier in a (computational) content analysis.
- use zero-shot content analysis with generative AI tools in their own research project.
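Evaluating a classifier's performance, as named in the second aim, typically means comparing model labels against human reference labels. The following is a minimal sketch with made-up toy labels, not the evaluation procedure used in the seminar: the seminar works in R, Python is used here only for illustration, and the function name `evaluate` and the data are hypothetical.

```python
def evaluate(human, model, positive):
    """Compare model labels with human reference labels for one target category."""
    pairs = list(zip(human, model))
    tp = sum(h == positive and m == positive for h, m in pairs)  # true positives
    fp = sum(h != positive and m == positive for h, m in pairs)  # false positives
    fn = sum(h == positive and m != positive for h, m in pairs)  # false negatives

    accuracy = sum(h == m for h, m in pairs) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy data, invented for illustration only.
human_labels = ["pos", "neg", "pos", "neg", "pos"]
model_labels = ["pos", "pos", "pos", "neg", "neg"]

metrics = evaluate(human_labels, model_labels, positive="pos")
print(metrics)
```

Precision and recall are reported per category because overall accuracy can look good even when a rare category is classified poorly.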
Requirements
- Prior knowledge of R, applied data analysis, and interacting with application programming interfaces (APIs) is helpful but not required. However, a willingness to learn the necessary skills and an openness to explore the possibilities of code-based computational social science research during the seminar are mandatory.
- Some prior exposure to (standardized, quantitative) content analysis will be helpful. However, qualitative methods also have their place in evaluating content analysis methods. If you have little experience with the former but can contribute with the latter, make sure to team up with a student whose skill set complements yours.
- Students will primarily use a curated, browser-based RStudio/R environment to interact securely with LLMs hosted on-site. This will work on any device with a web browser. However, using a device with a keyboard will be most practical.
Tasks
- 5 ECTS, corresponding to a workload of 125-150 hours
- Active participation, not graded
- Participation in class: read texts, ask questions, discuss, give feedback to other students
- Short 10-minute presentation of a published evaluation study report (in pairs)
- Follow along in the step-by-step tutorial to set up and evaluate a zero-shot content analysis with generative AI tools (in pairs)
- Implement a small adaptation of the classification procedure, evaluate its performance, and present the results (in pairs)
Materials
Seminar materials
- Work in Progress
- Most materials will be provided on this website or within the browser-based computing environment
- Additional material in Blackboard: e.g., non-public data.
Computing environment
- Please follow the instructions in Blackboard to access the computing environment.
Additional materials
- Getting started: R Primers by Andrew Heiss: browser-based, no local installation required
- Code recipes for typical tasks: Posit Recipes
- Learn how to properly use R for data science: R for Data Science by Hadley Wickham; be sure to use the 2nd edition.
General information
Generative large-language-model-based tools (commonly known as artificial intelligence, AI) challenge learning and teaching in academic contexts. My general advice is to engage with these tools openly but critically. The tools can help you with academic tasks. They can also help you learn. However, the tools are prone to errors, and, unfortunately, they often fail silently: They provide seemingly sensible answers, often in a convinced (and convincing) tone. More importantly, such tools can hinder or even prevent positive learning outcomes if they are used too early, too frequently, or in an inappropriate manner.
If you want to engage with AI tools in the context of (academic) learning, I highly recommend the online course Modern-Day Oracles or Bullshit Machines? How to thrive in a ChatGPT world.
In this methods tutorial, AI tools can help you with programming tasks. In particular, students with little or no prior knowledge of programming languages like R may be tempted to let AI tools create their whole code (so-called vibe coding). This code is then pasted into RStudio and will sometimes (or often, depending on the task's complexity) work on the first attempt. While this might be an effective way to complete tasks for this class, it is also a sure way to avoid learning anything. Therefore, my primary recommendation is to approach each task without using AI tools first. If you get stuck, need more explanation of specific code snippets, or want to check your solution against another one, AI tools can be of great help. Used in this way, they can be beneficial to learning outcomes.
Finally, remember that this class is not graded. There is no need to use AI tools to achieve a "better" result and, consequently, a better grade.
My goal is for all students to feel welcome and actively participate in this class. I strive to ensure that no one is discriminated against or excluded through course planning and my language. Likewise, I expect all participants to behave respectfully and appreciatively, acknowledging the opinions and experiences of other students. At the same time, it is clear that neither I nor the students will always fully meet this expectation. Therefore, please inform me or your peers if you feel uncomfortable or observe discriminatory behavior. If you prefer not to do this yourself, you can also appoint a trusted person to do so.
Attending university is demanding and, as a time of transition, brings many challenges, both within and outside of your academic work. If you feel overwhelmed, please make use of support services such as the Mental Wellbeing support.point or the Psychological Counseling Service. Feel free to contact me directly or through a trusted person if your situation conflicts with the course requirements.
Contact
Arbeitsstelle Digitale Forschungsmethoden
E-Mail: marko.bachl@fu-berlin.de
Telephone: +49-30-838-61565
Webex: Virtual Office
Office: Garystr. 55, Room 274
Office hours: Tuesday, 11:00-13:00, please book an appointment.
