AI-powered content analysis: Using generative AI to measure media and communication content

Methods tutorial #28831(a), module (political) communication research methods

Lecturer

Prof. Dr. Marko Bachl

Day and time

Monday, 14:15

Room
Garystr. 55/302a, seminar room

Syllabus

Last updated: 01. 12. 2025, 16:43

Overview

Large language models (LLMs; starting with Google’s BERT) and particularly their implementations as generative or conversational AI tools (e.g., OpenAI’s ChatGPT) are increasingly used to measure or classify media and communication content. The idea is simple yet intriguing: Instead of training and employing humans for annotation tasks, researchers describe the concept of interest to a model such as ChatGPT, present the coding unit, and ask for a classification. The first tests of the utility of ChatGPT and similar tools for content analysis were positive to enthusiastic (Gilardi et al., 2023; Heseltine & Clemm von Hohenberg, 2024; Rathje et al., 2024). User-friendly tutorials have brought the method to the average social scientist (Stuhler et al., 2025; Törnberg, 2024a). Yet (closed-source, commercial) large language models are not entirely understood even by their developers, and their uncritical use has been criticized on ethical grounds (Bender et al., 2021; Spirling, 2023; Widder et al., 2024).

In this seminar, we will engage practically with this cutting-edge research method. We start with a quick refresher on the basics of quantitative content analysis (both human and computational) and an overview of the rapidly developing literature on LLMs’ utility in this field. The main part of the seminar will be dedicated to learning step-by-step how to use and evaluate a generative AI model for applied content analytical research. In the end, students should be able to use the method in their own research.
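The workflow described in the overview (describe the concept, present the coding unit, ask for a classification) can be sketched in a few lines. This is only a minimal illustration, not the seminar's actual tutorial code: the seminar works in R, while this sketch uses Python, and the category labels, the prompt wording, and the `fake_model` stand-in for a real LLM call are all hypothetical assumptions.

```python
from typing import Callable, List, Optional

# Hypothetical category scheme; a real codebook would define these.
CATEGORIES = ["positive", "negative", "neutral"]


def build_prompt(concept_description: str, categories: List[str], coding_unit: str) -> str:
    """Combine the concept description, the allowed labels, and the coding unit."""
    options = ", ".join(categories)
    return (
        f"{concept_description}\n\n"
        f"Classify the following text into exactly one of these categories: {options}.\n"
        "Answer with the category label only.\n\n"
        f"Text: {coding_unit}"
    )


def parse_label(raw_answer: str, categories: List[str]) -> Optional[str]:
    """Normalize the model's answer; return None if it is not a valid label."""
    cleaned = raw_answer.strip().strip(".\"'").lower()
    return cleaned if cleaned in categories else None


def classify(
    coding_unit: str,
    concept_description: str,
    categories: List[str],
    ask_model: Callable[[str], str],
) -> Optional[str]:
    """Send one coding unit to the model and return a validated label."""
    prompt = build_prompt(concept_description, categories, coding_unit)
    return parse_label(ask_model(prompt), categories)


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): always answers ' Negative. '."""
    return " Negative. "


label = classify(
    "What a disappointing debate performance.",
    "We measure the sentiment of a statement about a politician.",
    CATEGORIES,
    fake_model,
)
print(label)
```

Validating the answer against the list of allowed labels matters because a generative model may return free text instead of a label; such cases are returned as `None` here so they can be inspected rather than silently counted.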

Course plan

Preliminary, subject to change!

Introduction

(1) 13. 10.: Hello

Class content: Introduction, demo, and organization

Organization: Find a teammate for the state-of-the-art presentation. The goal is to find someone with a complementary skill set.

Homework:

  • Listen to this podcast episode with Petter Törnberg: LLMs in Social Science. For context: This podcast was recorded when ChatGPT was still new, so there is a lot of excitement and (too) much focus on OpenAI’s GPT models.
  • Register your presentation in the Blackboard Wiki 🔒.
  • Register for an account for the computing environment (instructions in Blackboard 🔒).

(2) 20. 10.: Refresher: Traditional content analysis (human and computational)

Class content: Quick refresher on the basics of quantitative content analysis (both human and computational)

Texts (if needed):

  • Manual content analysis: Krippendorff (2019), Neuendorf (2017) (but not the parts on computational content analysis)
  • Computational content analysis: Bachl & Scharkow (2024), Van Atteveldt et al. (2022), Kroon et al. (2024)

State of the art

Class content: Short presentations on current work about LLM-based zero-shot classification

  • Short presentations (10-15 minutes)
  • One paper presented by two participants (teams of three only if all other texts already have two)

(3) 27. 10.: State of the art I (Pioneers)

(4) 03. 11.: State of the art II (Communication Research)

  • Stolwijk et al. (2025)
  • Kathirgamalingam et al. (2024)
  • Stoll et al. (2025)

(5) 10. 11.: State of the art III (Sociology & Political Science)

  • Stuhler et al. (2025)
  • Chae & Davidson (2025)
  • Heseltine & Clemm von Hohenberg (2024)

(6) 17. 11.: State of the art IV (Questions and Ideas)

Tutorial

Class content: Step-by-step tutorial to set up and evaluate a zero-shot content analysis with generative AI tools

(7) 24. 11. … (11) 05. 01.

Application

Class content: Students’ own adaptation of the classification procedure from the tutorial, performance evaluation

(12) 12. 01. … (14) 26. 01.

Results

Class content: Short presentations on students’ own adaptations

(15) 02. 02. & (16) 09. 02.

Aims

After the seminar, students should be able to:

  • understand the methodological literature on zero-shot content analysis with generative AI tools.
  • critically evaluate and improve the performance of a classifier in a (computational) content analysis.
  • use zero-shot content analysis with generative AI tools in their own research projects.
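To make "evaluating the performance of a classifier" concrete, the following minimal sketch compares model labels against human-coded reference labels and computes accuracy plus per-category precision, recall, and F1. It is an illustration under assumptions, not the seminar's evaluation procedure: the seminar works in R, this sketch uses Python, and the example labels are invented.

```python
from typing import Dict, List


def accuracy(human: List[str], model: List[str]) -> float:
    """Share of coding units where the model label matches the human label."""
    if len(human) != len(model) or not human:
        raise ValueError("Label lists must be non-empty and of equal length.")
    return sum(h == m for h, m in zip(human, model)) / len(human)


def per_category_metrics(human: List[str], model: List[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1 for each category, treating human labels as reference."""
    if len(human) != len(model):
        raise ValueError("Label lists must be of equal length.")
    results: Dict[str, Dict[str, float]] = {}
    for category in sorted(set(human) | set(model)):
        pairs = list(zip(human, model))
        tp = sum(h == category and m == category for h, m in pairs)
        fp = sum(h != category and m == category for h, m in pairs)
        fn = sum(h == category and m != category for h, m in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        results[category] = {"precision": precision, "recall": recall, "f1": f1}
    return results


# Invented example data (assumption): five coding units, three categories.
human_labels = ["neg", "neg", "pos", "pos", "neu"]
model_labels = ["neg", "pos", "pos", "pos", "neu"]

print(accuracy(human_labels, model_labels))
for category, scores in per_category_metrics(human_labels, model_labels).items():
    print(category, scores)
```

Per-category scores are reported alongside overall accuracy because a classifier can look accurate overall while performing poorly on a rare category.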

Requirements

  • Prior knowledge of R, applied data analysis, and interacting with application programming interfaces (APIs) is helpful but not required. However, a willingness to learn the necessary skills and an openness to explore the possibilities of code-based computational social science research during the seminar are mandatory.
  • Some prior exposure to (standardized, quantitative) content analysis will be helpful. However, qualitative methods also have their place in evaluating content analysis methods. If you have little experience with the former but can contribute with the latter, make sure to team up with a student whose skill set complements yours.
  • Students will primarily use a curated, browser-based RStudio/R environment to interact securely with onsite-hosted LLMs. This will work on any device with a web browser. However, using a device with a keyboard will be most practical.

Tasks

  • 5 ECTS ≈ 125-150 hours workload
  • Active participation, not graded
  • Participation in class: read texts, ask questions, discuss, give feedback to other students
  • Short 10-minute presentation of a published evaluation study report (in pairs)
  • Follow along in the step-by-step tutorial to set up and evaluate a zero-shot content analysis with generative AI tools (in pairs)
  • Implement a small adaptation of the classification procedure, evaluate its performance, and present the results (in pairs)

Materials

Seminar materials

  • Work in Progress
  • Most materials will be provided on this website or within the browser-based computing environment
  • Additional material in Blackboard 🔒: e.g., non-public data.

Computing environment

  • Please follow the instructions in Blackboard 🔒 to access the computing environment.

Additional materials

General information

Generative large-language-model-based tools (commonly known as artificial intelligence, AI) challenge learning and teaching in academic contexts. My general advice is to engage with these tools openly but critically. The tools can help you with academic tasks. They can also help you learn. However, the tools are prone to errors, and, unfortunately, they often fail silently: They provide seemingly sensible answers, often in a convinced (and convincing) tone. More importantly, such tools can hinder or even prevent positive learning outcomes if they are used too early, too frequently, or in an inappropriate manner.

If you want to engage with AI tools in the context of (academic) learning, I highly recommend the online course Modern-Day Oracles or Bullshit Machines? How to thrive in a ChatGPT world.

In this methods tutorial, AI tools can help you with programming tasks. In particular, students who have little or no prior knowledge of programming languages like R can be tempted to let AI tools create their whole code (so-called vibe coding). This code is then pasted into RStudio and will sometimes (or often, depending on the task’s complexity) work on the first attempt. While this might be an effective way to complete tasks for this class, it is also a sure way to avoid learning anything. Therefore, my primary recommendation is to approach each task without using AI tools first. If you get stuck, need more explanation of specific code snippets, or want to check your solution against another one, AI tools can be of great help. Used in this way, they can be beneficial to learning outcomes.

Finally, remember that this class is not graded. There is no need to use AI tools to achieve a “better” result and, consequently, a better grade.

My goal is for all students to feel welcome and actively participate in this class. I strive to ensure that no one is discriminated against or excluded through course planning and my language. Likewise, I expect all participants to behave respectfully and appreciatively, acknowledging the opinions and experiences of other students. At the same time, it is clear that neither I nor the students will always fully meet this expectation. Therefore, please inform me or your peers if you feel uncomfortable or observe discriminatory behavior. If you prefer not to do this yourself, you can also appoint a trusted person to do so.

Attending university is demanding and, as a time of transition, brings many challenges, both within and outside of your academic work. If you feel overwhelmed, please make use of support services such as the Mental Wellbeing support.point or the Psychological Counseling Service. Feel free to contact me directly or through a trusted person if your situation conflicts with the course requirements.

Contact

Prof. Dr. Marko Bachl

Arbeitsstelle Digitale Forschungsmethoden

E-Mail: marko.bachl@fu-berlin.de

Telephone: +49-30-838-61565

Webex: Virtual Office

Office: Garystr. 55, Room 274

Office hours: Tuesday, 11:00-13:00, please book an appointment.