The tutorial focuses on topic modelling with transformers. More specifically, it uses BERTopic to classify a set of news articles relevant to politics and policy areas. BERTopic is a modular technique: each step of its pipeline can be modified to best fit the problem at hand. The dataset used in this tutorial further illustrates the value of BERTopic for time-efficient, low-cost analysis of large datasets, uncovering patterns, themes and main topics across numerous documents of different categories, geographies, etc.
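To make the modularity concrete, here is a minimal sketch of how a BERTopic pipeline can be assembled from individually chosen sub-models. It is an illustration, not the tutorial's own configuration: the helper `build_topic_model`, the `PIPELINE_STEPS` tuple, the embedding model name `all-MiniLM-L6-v2`, and every hyperparameter value are assumptions picked for the example. It also assumes the `bertopic`, `sentence-transformers`, `umap-learn`, `hdbscan` and `scikit-learn` packages are installed.

```python
# Keyword arguments of BERTopic that correspond to its main pipeline steps.
PIPELINE_STEPS = (
    "embedding_model",   # documents -> dense vectors
    "umap_model",        # dimensionality reduction
    "hdbscan_model",     # clustering of the reduced vectors
    "vectorizer_model",  # bag-of-words per topic
    "ctfidf_model",      # class-based TF-IDF term weighting
)


def build_topic_model(**overrides):
    """Hypothetical helper: build a BERTopic model, replacing any step on request."""
    unknown = sorted(set(overrides) - set(PIPELINE_STEPS))
    if unknown:
        raise ValueError(f"Unknown pipeline step(s): {unknown}")

    # Heavy third-party imports are done lazily, only when a model is built.
    from bertopic import BERTopic
    from bertopic.vectorizers import ClassTfidfTransformer
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from sklearn.feature_extraction.text import CountVectorizer
    from umap import UMAP

    # Illustrative defaults (assumed values); each is created only if not overridden.
    defaults = {
        "embedding_model": lambda: SentenceTransformer("all-MiniLM-L6-v2"),
        "umap_model": lambda: UMAP(
            n_neighbors=15, n_components=5, min_dist=0.0,
            metric="cosine", random_state=42,
        ),
        "hdbscan_model": lambda: HDBSCAN(
            min_cluster_size=15, metric="euclidean",
            cluster_selection_method="eom", prediction_data=True,
        ),
        "vectorizer_model": lambda: CountVectorizer(stop_words="english"),
        "ctfidf_model": lambda: ClassTfidfTransformer(),
    }
    components = {
        name: overrides[name] if name in overrides else make()
        for name, make in defaults.items()
    }
    return BERTopic(**components)
```

With a list of article texts in a variable such as `docs` (hypothetical here), `topics, probs = build_topic_model().fit_transform(docs)` would fit the model, and `get_topic_info()` on the fitted model would summarize the topics found. Swapping a single step, for example `build_topic_model(vectorizer_model=CountVectorizer(ngram_range=(1, 2)))`, leaves the rest of the pipeline unchanged.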
The tutorial walks through clear steps to familiarize the user with the dataset, the essentials of topic modelling, and the main components that make up BERTopic. It is also supplemented with several materials to help the user understand how transformers work when used for topic modelling, and how such approaches can be applied in different policy areas. The materials include:
- Presentation Slides, for a brief introduction to the topic and what to expect from the tutorial
- Tutorial in Google Colab, a step-by-step implementation of topic modelling with BERTopic
- Video guide, with a few more details and takeaways on BERTopic
This tutorial was developed by:
