From the course: Generative AI: Working with Large Language Models

Transformer: Architecture overview

- [Instructor] You're probably wondering what the transformer architecture looks like. So let me head over to the "Attention Is All You Need" paper and show you. We'll divide this architecture into two sections so that we can understand each component. The left half of the diagram is known as the encoder, and the right half is known as the decoder. We feed an English sentence, such as "I like NLP," into the encoder at the bottom of the diagram. The transformer can act as a translator from English to German, so the output from the decoder at the top of the diagram is the German translation, "ich mag NLP." The transformer is not made up of a single encoder and a single decoder, but rather a stack of six encoders and six decoders. Each of these parts can also be used independently, depending on the task. Encoder-decoder models are good for generative tasks such as translation or summarization. Examples of such encoder-decoder models are Facebook's BART model and Google's…
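The stacked encoder-decoder layout described above can be sketched with PyTorch's built-in `nn.Transformer` module, whose defaults match the paper's six encoder and six decoder layers. This is a minimal illustration of the shapes flowing through the two halves, not code from the course; the tensor sizes are made-up placeholders.

```python
import torch
import torch.nn as nn

# Defaults mirror "Attention Is All You Need": d_model=512, 8 heads,
# 6 encoder layers and 6 decoder layers.
model = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=6,
    num_decoder_layers=6,
    batch_first=True,
)

# Placeholder embeddings standing in for the source (English) and
# target (German) token sequences: (batch, sequence_length, d_model).
src = torch.rand(1, 3, 512)  # e.g. "I like NLP" -> 3 tokens (illustrative)
tgt = torch.rand(1, 4, 512)  # e.g. partial German output so far

out = model(src, tgt)
print(out.shape)                   # one vector per target position
print(len(model.encoder.layers))   # the stack of 6 encoders
print(len(model.decoder.layers))   # the stack of 6 decoders
```

In a real translation system, the `src` and `tgt` tensors would come from token embeddings plus positional encodings, and the decoder output would be projected onto the vocabulary to pick the next German token.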
