We just released 🌸 BigCodeBench: testing LLMs on harder, more realistic coding tasks involving tool usage. 🛠

While benchmarks like HumanEval are saturating, even GPT-4o and DeepSeekCoder-V2 only land around 50% on BigCodeBench, while humans score 97%!

A few highlights 🚀:
- 🛠 tasks use diverse function calls from 139 popular Python libraries
- 🤓 complex, user-oriented instructions for each task
- 📊 verified examples and high test coverage
- 🙋‍♂️ comes in a standard function-completion form as well as an instruction version

Resources:
- 🤗📊 HF Leaderboard: https://lnkd.in/eqxnEAPE
- 🤗🗂️ HF Dataset: https://lnkd.in/esc8MRVD
- 🤗🔍 HF Data Viewer: https://lnkd.in/esp_ZaTC
- 💻 Code: https://lnkd.in/eTgzMWRv
- 📝 Paper: https://lnkd.in/e4MYx3CK
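If you want to poke at the tasks programmatically, here's a minimal sketch using the 🤗 `datasets` library. The dataset ID (`bigcode/bigcodebench`) and the field layout are assumptions based on this post; check the HF Dataset link above for the exact repo, splits, and columns.

```python
# Minimal sketch of pulling the benchmark tasks locally.
# The dataset ID is an assumption -- verify it against the HF Dataset
# link in the post before relying on it.
from datasets import load_dataset

dataset = load_dataset("bigcode/bigcodebench")
print(dataset)  # inspect the available splits and columns

# Each task should expose both variants mentioned above: a standard
# function-completion prompt and an instruction-style prompt.
first_split = next(iter(dataset.values()))
print(first_split[0])
```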
Love this! I have a few questions about it; would you be open to a quick call to discuss?
I'll keep this in mind
Super exciting work! There's definitely a need for harder/more realistic benchmarks than HumanEval, MBPP, etc.
WOWOWOW
This was much needed! Thanks Leandro and the Hugging Face team!
Awesome work led by Terry Yue Zhuo!