From the course: Generative AI: Working with Large Language Models

Transformers: History

- [Instructor] The models based on the original transformer paper from 2017 have evolved over the years. One of the challenges with training large language models in 2017 was that you needed labeled data, which required a lot of time and effort. The ULMFiT model proposed by Jeremy Howard and Sebastian Ruder provided a framework where you didn't need labeled data, and this meant large corpora of text, such as Wikipedia, could now be used to train models. In June of 2018, GPT, or Generative Pre-trained Transformer, developed by OpenAI, was the first pre-trained transformer model. It was used for fine-tuning on various NLP tasks and obtained state-of-the-art results. A couple of months later, researchers at Google came up with BERT, or Bidirectional Encoder Representations from Transformers. We saw a couple of examples of BERT being used in production at Google. In February 2019, OpenAI released a bigger and…