Hugging Face’s Post

663,294 followers

Transformers v4.42 includes a new Transformer-based model capable of real-time object detection. See below for more info:

Niels Rogge

Machine Learning Engineer at ML6 & Hugging Face

RT-DETR is now supported in Hugging Face Transformers! 🙌

RT-DETR, short for “Real-Time DEtection TRansformer”, is a computer vision model developed at Peking University and Baidu, Inc. capable of real-time object detection. The authors claim better performance than YOLO models in both speed and accuracy. The model comes with an Apache 2.0 license, meaning people can freely use it for commercial applications. 🔥

RT-DETR is a follow-up work of DETR, a model developed by AI at Meta that successfully used Transformers for the first time for object detection. The latter has been in the Transformers library since 2020. After this, lots of improvements have been made to enable faster convergence and inference speed. RT-DETR is an important example of that as it unlocks real-time inference at high accuracy!

Big congrats to Daniel Choi for contributing this model!

* Demo notebooks (fine-tuning + inference): https://lnkd.in/eA_WzsyE
* Demo Space: https://lnkd.in/ewzWTSHA
* Paper: https://lnkd.in/eR3Qg6dm

#ai #artificialintelligence #objectdetection #huggingface #computervision
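For anyone who wants to try the model from Transformers, here is a minimal inference sketch. It is not taken from the linked notebooks. It assumes the `RTDetrImageProcessor` and `RTDetrForObjectDetection` classes that ship with Transformers v4.42, and the checkpoint name `PekingU/rtdetr_r50vd` is an assumption to verify on the Hub. The helper names `format_detections` and `detect_objects` are made up for illustration.

```python
from typing import Dict, List, Sequence


def format_detections(
    scores: Sequence[float],
    labels: Sequence[int],
    boxes: Sequence[Sequence[float]],
    id2label: Dict[int, str],
) -> List[str]:
    """Turn parallel score/label/box sequences into readable lines."""
    lines = []
    for score, label, box in zip(scores, labels, boxes):
        # Boxes are (x_min, y_min, x_max, y_max) in pixel coordinates.
        x_min, y_min, x_max, y_max = (round(float(v), 1) for v in box)
        lines.append(
            f"{id2label[int(label)]}: {float(score):.2f} "
            f"[{x_min}, {y_min}, {x_max}, {y_max}]"
        )
    return lines


def detect_objects(
    image,
    checkpoint: str = "PekingU/rtdetr_r50vd",  # assumed checkpoint name
    threshold: float = 0.5,
) -> List[str]:
    """Run RT-DETR on a PIL image and return formatted detections."""
    # Heavy imports stay local so the helper above works without them.
    import torch
    from transformers import RTDetrForObjectDetection, RTDetrImageProcessor

    processor = RTDetrImageProcessor.from_pretrained(checkpoint)
    model = RTDetrForObjectDetection.from_pretrained(checkpoint)

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, target_sizes=target_sizes, threshold=threshold
    )[0]
    return format_detections(
        result["scores"].tolist(),
        result["labels"].tolist(),
        result["boxes"].tolist(),
        model.config.id2label,
    )
```

To use it, open an image with Pillow and call `detect_objects(Image.open("street.jpg"))`, where `street.jpg` is a placeholder path. The first call downloads the weights, so it needs network access. The 0.5 threshold is an arbitrary starting point, not a recommended value.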

RT-DETR targets real-time object detection without giving up accuracy. Per the paper, it uses an efficient hybrid encoder to process multi-scale features, lets you tune inference speed by changing the number of decoder layers without retraining, and is benchmarked on GPU with TensorRT, where the authors report that it outperforms comparable real-time detectors in both speed and accuracy.

Houston Austin Muzamhindo

AI & Data at Investec UK | Founder at IQmates | Udemy Instructor | TEDxJHB 1830 Fellow

1w

Interesting. Need to read the paper. YOLO is CNN-based. During the X / Twitter debate between Elon and Yann, one of the points was about computer vision without CNNs (Elon claimed Tesla is doing this now, and those cars need very fast inference, which certain CNN architectures can deliver).

YOLOv10 was released recently. I checked the paper, and it doesn't include a comparison with this one. Do we know whether RT-DETR even outperforms YOLOv10? In any case, if the code is open source, it opens a new door for real-time detection. I will have something to read before going to sleep.

Durga Prasad Dhulipudi

AI/ML Enterprise Architect, Expert Geospatial and Aviation

1w

RT-DETR is faster than YOLOv8 and has better accuracy. Thanks for the object detection notebook. Do you have any similar examples for segmentation, or could you come up with one?

Great. Well done! And yet, a bit sad. YOLO was the last great bastion of CNN models. It saddens me in a way that this works. I guess now we'll really see them off.

Pablo Carmona Esparza

AI and Automation business solutions | LangChain developer | Zapier and Make IO ninja 🧑🏻💻🥷

1w

Exciting development, Hugging Face! The innovation in real-time object detection keeps raising the bar. Kudos to the team!

Anshu Bhola

IoT | AI/ML | Technologist | Architectures

1w

Wow... faster than YOLO.

Francesco Cozzolino

medium.com/@francesco.cozzolino | AI Solution Developer

1w

Great
