From the course: Generative AI: Working with Large Language Models

BIG-bench

- [Instructor] Now, one of the challenges with the existing benchmarks was that they were too narrow in scope, covering tasks like language understanding or summarization. It almost seemed like a research team would come up with some of these more basic tasks, and then a couple of months later, another research team would come up with a model that would ace them. What if there were benchmarks with some really challenging tasks? That's pretty much the background to BIG-bench, or the Beyond the Imitation Game Benchmark. A team of researchers from different institutions came up with over 200 tasks that humans perform well on but current state-of-the-art language models don't. They also included a team of human expert raters who performed all the tasks in order to provide a strong baseline, and they were allowed to use all available resources, including searching the internet. The tasks are really diverse…