Calendar
YRM W10 - Patrick Zietkiewicz with a tutorial on Transformer-based Language Models
Title: Tutorial on Transformer-based Language Models
Overview: In this talk, we will explore the theory and practical implementation of transformers, with a focus on GPT-3. We will begin with data collection and processing, followed by embeddings, byte-pair encoding, and positional encodings. We will then look at the architecture of transformers, including their use of attention mechanisms, before delving into the specifics of how GPT-3 is trained. From there, we will discuss how ChatGPT arises from GPT-3 via fine-tuning (InstructGPT) and reinforcement learning from human feedback. We will also consider practical questions such as "what can we compute in parallel?" (see the sketch below). The goal is for you to walk away with a basic understanding of the full pipeline from text data to ChatGPT, along with a grasp of the underlying theory.
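As a small taste of the attention mechanism the talk covers, here is a minimal NumPy sketch of scaled dot-product attention with the causal mask used by GPT-style decoders. It is illustrative only, not the speaker's implementation; the function names, toy shapes, and random embeddings are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q, K: (seq_len, d_k); V: (seq_len, d_v).
    mask: optional boolean array; True entries are blocked, which is how a
    decoder like GPT-3 prevents positions from attending to future tokens.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # pairwise query-key similarities
    if mask is not None:
        scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # weighted average of the values

# Toy example: 4 tokens, causal (autoregressive) masking.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                  # stand-in for token embeddings
causal_mask = np.triu(np.ones((4, 4), dtype=bool), k=1)
out = scaled_dot_product_attention(x, x, x, mask=causal_mask)
print(out.shape)                             # (4, 8)
```

Note that the outputs for all positions are produced by a single pair of matrix products, which hints at the "what can we compute in parallel?" question: during training, every position of the sequence is attended to simultaneously rather than one token at a time.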