Another start to the month, another Hugging Face Accelerate release!
We've been cooking, let's talk about how 👨‍🍳
* There's a new profiler in town that can help you collect performance metrics during training and inference, and then visualize the traces in tools like Chrome's tracing UI. Check out the new docs (linked below) for more info, and see the first sketch after this list!
* Thanks to Stas Bekman we were able to track down, identify, and fix a slowdown during `import accelerate` that helped reduce our import times by over 60%! We've taken steps to ensure such slowdowns can't go unnoticed again, and we hope you enjoy being able to accelerate a bit faster!
* We've added support for more complex PyTorch DDP communication hooks, letting you customize how gradients are communicated across workers (new docs linked below; see the second sketch after this list).
* With XPU support now native to PyTorch, we've upstreamed our integration so you can switch to the native implementation right away (note: this requires PyTorch >= 2.4)
* And so, so much more
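To make the profiler bullet concrete, here's a minimal sketch assuming the `ProfileKwargs` handler described in the new docs; the toy model, optimizer, and batches are stand-ins:
```
import torch
from accelerate import Accelerator
from accelerate.utils import ProfileKwargs  # assumed per the new profiler docs

# Toy setup so the sketch runs end to end.
model = torch.nn.Linear(512, 512)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
batches = [torch.randn(64, 512) for _ in range(10)]

# Collect CPU activity (add "cuda" on GPU) and dump a Chrome-viewable trace.
profile_kwargs = ProfileKwargs(
    activities=["cpu"],
    output_trace_dir="profile_traces",  # open the trace in chrome://tracing or Perfetto
)
accelerator = Accelerator(kwargs_handlers=[profile_kwargs])
model, optimizer = accelerator.prepare(model, optimizer)

with accelerator.profile() as prof:
    for x in batches:
        loss = model(x).pow(2).mean()
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()

print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=5))
```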
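And for the DDP communication hooks bullet, a sketch of registering an fp16 gradient-compression hook, assuming the `DistributedDataParallelKwargs(comm_hook=...)` interface from the new docs:
```
import torch
from accelerate import Accelerator, DistributedDataParallelKwargs
from accelerate.utils import DDPCommunicationHookType  # assumed per the new docs

# Compress gradients to fp16 during all-reduce to cut communication volume.
ddp_kwargs = DistributedDataParallelKwargs(comm_hook=DDPCommunicationHookType.FP16)
accelerator = Accelerator(kwargs_handlers=[ddp_kwargs])

model = torch.nn.Linear(512, 512)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
# Under a multi-process launch, prepare() wraps the model in DDP with the hook set.
model, optimizer = accelerator.prepare(model, optimizer)
```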
Enjoy the release!
We worked on a mini-project to show how to run SD3 DreamBooth LoRA fine-tuning on a free-tier Colab Notebook 🌸
The project is educational and is meant to serve as a template. Only good vibes here please 🫡
👉 https://lnkd.in/g_znevg3 👈
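If you just want to try out a trained LoRA afterwards, here's a minimal inference sketch with diffusers (the LoRA repo id below is a hypothetical placeholder):
```
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("your-username/your-sd3-dreambooth-lora")  # hypothetical repo id

image = pipe("a photo of sks flower, watercolor style", num_inference_steps=28).images[0]
image.save("out.png")
```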
Enjoy!
We need more knowledge sharing about running ML infrastructure at scale!
Here's the mix of AWS instances we currently run our serverless Inference API on.
For context, the Inference API is the infra service that powers the widgets on Hugging Face Hub model pages; PRO users and Enterprise orgs can also use it programmatically.
64 g4dn.2xlarge
48 g5.12xlarge
48 g5.2xlarge
10 p4de.24xlarge
42 r6id.2xlarge
9 r7i.2xlarge
6 m6a.2xlarge (control plane and monitoring)
–––
Total = 227 instances
This is a thread for AI Infra aficionados 🤓 What mix of instances do you run?
MInference 1.0
Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
paper page: https://buff.ly/45RdSmP
The computational challenges of Large Language Model (LLM) inference remain a significant barrier to their widespread deployment, especially as prompt lengths continue to increase. Due to the quadratic complexity of the attention computation, it takes 30 minutes for an 8B LLM to process a prompt of 1M tokens (i.e., the pre-filling stage) on a single A100 GPU. Existing methods for speeding up prefilling often fail to maintain acceptable accuracy or efficiency when applied to long-context LLMs. To address this gap, we introduce MInference (Million-tokens Inference), a sparse calculation method designed to accelerate pre-filling of long-sequence processing. Specifically, we identify three unique patterns in long-context attention matrices (the A-shape, Vertical-Slash, and Block-Sparse) that can be leveraged for efficient sparse computation on GPUs. We determine the optimal pattern for each attention head offline and dynamically build sparse indices based on the assigned pattern during inference. With the pattern and sparse indices, we perform efficient sparse attention calculations via our optimized GPU kernels to significantly reduce the latency in the pre-filling stage of long-context LLMs. Our proposed technique can be directly applied to existing LLMs without any modifications to the pre-training setup or additional fine-tuning. By evaluating on a wide range of downstream tasks, including InfiniteBench, RULER, PG-19, and Needle In A Haystack, and models including LLaMA-3-1M, GLM4-1M, Yi-200K, Phi-3-128K, and Qwen2-128K, we demonstrate that MInference effectively reduces inference latency by up to 10x for pre-filling on an A100, while maintaining accuracy.
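For intuition, here's a toy, dense illustration of one such pattern (the A-shape: a few attention-sink columns plus a local causal window). The paper's actual contribution, per-head pattern selection and optimized sparse GPU kernels, is not implemented here, and the sink/window sizes are made up:
```
import torch

def a_shape_mask(seq_len: int, sink: int = 4, window: int = 64) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask: each query attends to the first
    `sink` tokens plus a local causal window."""
    q = torch.arange(seq_len).unsqueeze(1)
    k = torch.arange(seq_len).unsqueeze(0)
    causal = k <= q
    local = (q - k) < window
    sinks = k < sink
    return causal & (local | sinks)

mask = a_shape_mask(1024)
scores = torch.randn(1024, 1024)
scores = scores.masked_fill(~mask, float("-inf"))  # pattern applied densely for illustration
probs = torch.softmax(scores, dim=-1)
```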
RT-DETR is now supported in Hugging Face Transformers! 🙌
RT-DETR, short for “Real-Time DEtection TRansformer”, is a computer vision model developed at Peking University and Baidu, Inc. capable of real-time object detection. The authors claim better performance than YOLO models in both speed and accuracy. The model comes with an Apache 2.0 license, meaning people can freely use it for commercial applications. 🔥
RT-DETR is a follow-up to DETR, the model from AI at Meta that first successfully used Transformers for object detection. DETR has been in the Transformers library since 2020, and many improvements have since been made to speed up its convergence and inference. RT-DETR is a notable example, as it unlocks real-time inference at high accuracy!
Big congrats to Daniel Choi for contributing this model!
* Demo notebooks (fine-tuning + inference): https://lnkd.in/eA_WzsyE
* Demo Space: https://lnkd.in/ewzWTSHA
* Paper: https://lnkd.in/eR3Qg6dm
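A quick inference sketch with Transformers, assuming the PekingU/rtdetr_r50vd checkpoint on the Hub:
```
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("PekingU/rtdetr_r50vd")
model = AutoModelForObjectDetection.from_pretrained("PekingU/rtdetr_r50vd")

with torch.no_grad():
    outputs = model(**processor(images=image, return_tensors="pt"))

# Convert raw logits/boxes back to labels and pixel coordinates.
results = processor.post_process_object_detection(
    outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.5
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 2), box.tolist())
```
#ai #artificialintelligence #objectdetection #huggingface #computervision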
🤗 Last week, Gemma 2 was released. Since then, open-source implementations have been tuned to match the model's intended performance:
```
pip install -U transformers==4.42.3
```
We saw reports that open-source implementations (transformers, llama.cpp) were not on par with Google's own deployment (Google AI Studio).
Why is that? ✨
The first and most important aspect is that Google implemented soft-capping of logits within the attention.
This change is meaningful: we tested with and without soft-capping, and while benchmark metrics show little difference, we see very significant changes at long context lengths.
The current implementations of Flash Attention kernels do not allow for this soft-capping (yet), so we cannot take advantage of the Flash Attention speed gains without losing performance.
Conclusion: soft-capping is required, especially for the 27B model, but it affects speed optimizations.
This is particularly the case for inference but seems to matter less during fine-tuning. We'd be interested in seeing whether fine-tuning with FA2 on and soft-capping off still produces correct fine-tunes, as that would significantly accelerate training on most cards.
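For reference, logit soft-capping is just a smooth tanh squash of the attention scores before the softmax. A minimal sketch (the 50.0 cap mirrors the attention-logit value reported in the released Gemma 2 config):
```
import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Smoothly squashes values into (-cap, cap) while keeping gradients well-behaved.
    return cap * torch.tanh(logits / cap)

scores = torch.randn(1, 8, 128, 128) * 30          # (batch, heads, q_len, kv_len)
probs = torch.softmax(soft_cap(scores, 50.0), -1)  # cap applied before the softmax
```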
---
Secondly, the model was trained in bfloat16 and seems to perform best at that precision.
The 27B model is sensitive to precision, and running it in fp16 results in incorrect outputs.
We've checked and confirmed that using bitsandbytes (BNB) to run the checkpoints in 4-bit and 8-bit works correctly.
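A loading sketch for the 4-bit path with bitsandbytes, keeping compute in bfloat16 per the note above:
```
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # keep compute in bf16, not fp16
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-27b-it", quantization_config=quant_config, device_map="auto"
)
```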
---
If you're looking for presets to run with transformers, this is what we recommend for optimal performance (a loading sketch follows the list):
- Version v4.42.3
- Running with `attn_implementation='eager'` (so no FA/FA2)
- Running in bfloat16 to start with
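Put together, a loading sketch with those presets (using the 9B instruct checkpoint here; swap in the 27B one as needed):
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,    # bfloat16, not fp16
    attn_implementation="eager",   # soft-capping needs eager attention for now
    device_map="auto",
)
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```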
Scaling Synthetic Data Creation with 1,000,000,000 Personas
paper page: https://buff.ly/3XJ6Hek
We propose a novel persona-driven data synthesis methodology that leverages various perspectives within a large language model (LLM) to create diverse synthetic data. To fully exploit this methodology at scale, we introduce Persona Hub -- a collection of 1 billion diverse personas automatically curated from web data. These 1 billion personas (~13% of the world's total population), acting as distributed carriers of world knowledge, can tap into almost every perspective encapsulated within the LLM, thereby facilitating the creation of diverse synthetic data at scale for various scenarios. By showcasing Persona Hub's use cases in synthesizing high-quality mathematical and logical reasoning problems, instructions (i.e., user prompts), knowledge-rich texts, game NPCs and tools (functions) at scale, we demonstrate persona-driven data synthesis is versatile, scalable, flexible, and easy to use, potentially driving a paradigm shift in synthetic data creation and applications in practice, which may have a profound impact on LLM research and development.
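The core mechanic is simple to emulate: prepend a persona to a synthesis instruction. A toy sketch in the spirit of the paper (the personas and template wording are made up; the LLM call is left as a stub):
```
personas = [
    "a high-school physics teacher who loves rock climbing",
    "a freight logistics coordinator at a busy container port",
]

def make_synthesis_prompt(persona: str, task: str = "math problem") -> str:
    # Persona-driven prompt; the exact wording here is illustrative, not the paper's.
    return (
        f"Create a challenging {task} that the following person might pose, "
        f"grounded in their daily experience.\nPersona: {persona}"
    )

for persona in personas:
    prompt = make_synthesis_prompt(persona)
    # completion = llm.generate(prompt)  # any LLM client; stubbed out here
    print(prompt)
```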
🤯 DiffIR2VR-Zero: zero-shot video restoration and super-resolution using pre-trained image restoration diffusion models.
- Video denoising and up to 8x super-resolution
- Framework outperforms trained models in generalizing across diverse datasets and extreme degradations
- Compatible with any 2D restoration model
#superresolution #zeroshot #diffir2vr