From the course: Generative AI: Working with Large Language Models
Going further with Transformers
- [Jonathan] We've covered a ton of material in this course. We've looked at many of the large language models since GPT-3. Let's review them really quickly. We saw how Google reduced training and inference costs by using sparse mixture-of-experts models with GLaM. A month later, Microsoft teamed up with Nvidia to create the Megatron-Turing NLG model, which was three times larger than GPT-3 at 530 billion parameters. In the same month, the DeepMind team released Gopher; their largest, 280 billion parameter version was their best performing model. A few months later, the DeepMind team introduced Chinchilla, which turned a lot of our understanding of large language models on its head. The main takeaway was that large language models up to this point had been undertrained. Google released the 540 billion parameter model PaLM in April, training it on their Pathways infrastructure, and this has been the best performing…
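The Chinchilla "undertrained" takeaway can be made concrete with a rough rule of thumb from the paper's scaling analysis: compute-optimal training uses on the order of 20 training tokens per model parameter. The sketch below is a minimal illustration of that heuristic, not DeepMind's actual method; the function name and the 20x ratio as a fixed constant are simplifying assumptions.

```python
# Minimal sketch of the Chinchilla rule of thumb: roughly 20 training
# tokens per parameter for compute-optimal training. The helper name
# and fixed ratio are illustrative assumptions, not the paper's exact fit.

def chinchilla_optimal_tokens(num_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal training-token count for a model size."""
    return num_params * tokens_per_param

# GPT-3: 175B parameters, trained on roughly 300B tokens --
# far fewer than the heuristic suggests, i.e. undertrained.
gpt3_params = 175e9
print(f"GPT-3 suggested tokens: {chinchilla_optimal_tokens(gpt3_params):.2e}")
# → GPT-3 suggested tokens: 3.50e+12

# Chinchilla itself: 70B parameters on ~1.4T tokens,
# which matches the ~20 tokens-per-parameter ratio.
```

Under this heuristic, a smaller model trained on more tokens (Chinchilla, 70B) can outperform a much larger undertrained one (Gopher, 280B) at the same compute budget.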