This course introduces Natural Language Processing (NLP) and transformer-based Large Language Models (LLMs). Students will explore foundational NLP concepts, including tokenization, word embeddings, and language modelling. They will learn the core mechanics of LLMs, such as architecture, training, fine-tuning, reasoning, evaluation, and deployment strategies. The curriculum includes practical applications such as text classification, machine translation, summarization, and zero-/few-shot prompting.
Through hands-on work with real-world datasets, students will design NLP pipelines and evaluate model performance in multilingual settings, with particular emphasis on low-resource and under-represented languages. By the end of the course, students will also build a simple language model from scratch.
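As a preview of the final "build a simple language model from scratch" exercise, the sketch below shows a toy character-level bigram model in plain Python; the training string, function names, and sampling setup are illustrative assumptions rather than course-provided code.

```python
# A minimal character-level bigram language model (illustrative sketch only).
from collections import defaultdict
import random

corpus = "students will build a simple language model from scratch"  # toy text

# Count bigram occurrences: counts[prev][nxt] = how often `nxt` follows `prev`.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def sample_next(prev: str) -> str:
    """Sample the next character in proportion to its bigram count."""
    followers = counts.get(prev)
    if not followers:
        return random.choice(corpus)  # back off to a uniform pick from the corpus
    chars, freqs = zip(*followers.items())
    return random.choices(chars, weights=freqs, k=1)[0]

# Generate 40 characters starting from "s".
text = "s"
for _ in range(40):
    text += sample_next(text[-1])
print(text)
```

Lecture 2 revisits this counting idea more formally with n-gram language models.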
| Lecture | Title | Resources | YouTube Videos | Suggested Readings |
|---|---|---|---|---|
| 1 | Introduction to NLP and LLMs (07-Feb-2026) | Slide | YouTube | 1. Natural Language Processing: State of the Art, Current Trends and Challenges 2. The Rise of AfricaNLP: Contributions, Contributors, and Community Impact (2005–2025) 3. HausaNLP: Current Status, Challenges and Future Directions for Hausa NLP |
| 2 | How Language Modelling Started (N-grams) | Slide, Practical Exercise | — | 1. Jurafsky & Martin — Speech and Language Processing, Chapter 3 2. Rosenfeld (2000) — Two Decades of Statistical Language Modeling |
| 3 | Text Classification | Slide, Intro to PyTorch | — | 1. Jurafsky & Martin — Speech and Language Processing, Chapter 4 2. Muhammad et al. (2022) — AfriSenti 3. Learn PyTorch: Zero to Mastery |
| 4 | Word Vectors | Slide, Training Embeddings | — | 1. Mikolov et al. (2013) — Efficient Estimation of Word Representations 2. Mikolov et al. (2013) — Linguistic Regularities |
| 5 | Sequence Modelling | Slide, Sentiment Analysis | — | 1. Goodfellow et al. — Deep Learning, Chapter 6 2. Goldberg (2016) — Neural Network Models for NLP |
| 6 | Attention | Slide, Attention Exercise | — | 1. Bahdanau et al. (2014) — Neural Machine Translation 2. Luong et al. (2015) — Attention-based NMT |

| Lecture | Title | Resources | Suggested Readings |
|---|---|---|---|
| 7 | Introduction to Transformers | Slide 1, Slide 2 | 1. Vaswani et al. (2017) — Attention is All You Need 2. Alammar — Illustrated Transformer |
| 8 | Pretraining | Slide 1, Slide 2, Pre-training Fine-tuning Exercise | 1. BERT: Pre-training of Deep Bidirectional Transformers 2. GPT-3: Language Models are Few-Shot Learners |
| 9 | Post-training | Slide 1, Slide 2 | 1. FLAN: Finetuned Language Models 2. T0: Multitask Prompted Training |
| 10 | Model Compression | Slide | 1. Wei et al. (2022) — Chain-of-Thought Prompting 2. Kojima et al. (2022) — Zero-Shot CoT |
| 11 | Benchmarking and Evaluation | Slide | 1. Holistic Evaluation of Language Models (HELM) |

Each student will conduct a project. More details coming soon.
- Speech and Language Processing – Jurafsky & Martin (Online Draft)
- Hands-On Large Language Models: Language Understanding and Generation – Jay Alammar & Maarten Grootendorst
- LLMs-from-scratch – Sebastian Raschka (GitHub repository)
- LLM-course
- Natural Language Processing with Python – Steven Bird, Ewan Klein, Edward Loper (Free Online)
- Transformers for Natural Language Processing – Denis Rothman
- Deep Learning for NLP – Palash Goyal, Sumit Pandey, Karan Jain
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow – Aurélien Géron