📣 Hey, heard about the new encoder-free VLM called EVE?
- Supports arbitrary image resolutions, similar to Fuyu-8B
- Performs better than Fuyu-8B on various benchmarks
- Trained on 35M publicly accessible data samples
- Suited to general-purpose VLM tasks, unlike Fuyu's UI niche
- MIT License 💯
The EVE paper benchmarks against Fuyu-8B, a similar encoder-free VLM. However, unlike Fuyu from Adept, EVE achieves state-of-the-art results while being transparent about its training data and training method. Very helpful for reproducibility and further advancement.
🚀 EVE model collection: https://lnkd.in/g7PSP5dW
💡 The best way to quickly explore the model is to clone the repository and run the provided Gradio demo locally: https://lnkd.in/gzkn8zGu
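"Encoder-free" here means image patches are projected straight into the LLM's token stream instead of passing through a fixed-input vision encoder, which is what makes arbitrary resolutions possible. A minimal sketch of that idea (function name, patch size, and shapes are illustrative assumptions, not EVE's actual code):

```python
import numpy as np

def patchify(image: np.ndarray, patch: int = 16) -> np.ndarray:
    """Split an HxWxC image into a sequence of flattened patches.

    Because the output is just a variable-length sequence, any
    resolution divisible by the patch size works; no fixed-size
    vision-encoder input is required.
    """
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "pad image to a multiple of patch size"
    # (H//p, p, W//p, p, C) -> (H//p, W//p, p, p, C) -> (num_patches, p*p*C)
    patches = image.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, patch * patch * c)

# Two different resolutions simply produce token sequences of different lengths.
tokens_a = patchify(np.zeros((224, 224, 3)))  # 14*14 = 196 patches
tokens_b = patchify(np.zeros((448, 336, 3)))  # 28*21 = 588 patches
print(tokens_a.shape, tokens_b.shape)
```

Each flattened patch would then go through a learned linear projection into the LLM's embedding space.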
Gradio’s Post
More Relevant Posts
-
🐍 Introducing 𝐂𝐨𝐛𝐫𝐚: a groundbreaking MLLM that takes efficiency to the next level 🚀
🧠 By integrating the Mamba LLM with the visual modality, Cobra achieves linear computational complexity without compromising performance.
⚡ Time to say goodbye to sluggish transformer MLLMs 😉
🤔 Want to experience the power of Cobra? Check out the live demo on Spaces!
🔍 Explore various modal fusion schemes and see how Cobra outperforms SOTA methods like LLaVA-Phi, TinyLLaVA, and MobileVLMv2
🚀 Inspired to build MLLM apps with gr.MultimodalTextbox? Start here: Gradio.dev
🏆 Cobra is fast and incredibly versatile!
🧩 Vision backbone combines DINOv2 and SigLIP
🧩 2.8B Mamba LLM
Cobra achieves remarkable performance on challenging tasks like overcoming visual illusions and judging spatial relationships.
🔬 Cobra is more effective than LLaVA while using fewer parameters
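The "linear complexity" claim is about sequence length: self-attention compares every token with every other token, so its cost grows quadratically, while a state-space model like Mamba does roughly constant work per token. A back-of-the-envelope comparison (the op counts are illustrative scaling proxies, not measured FLOPs):

```python
def attention_ops(n_tokens: int) -> int:
    # Self-attention forms an n x n score matrix: O(n^2) in sequence length.
    return n_tokens * n_tokens

def ssm_ops(n_tokens: int) -> int:
    # A state-space model processes tokens recurrently: O(n) in sequence length.
    return n_tokens

for n in (1_000, 4_000, 16_000):
    ratio = attention_ops(n) // ssm_ops(n)
    print(f"{n:>6} tokens: attention/SSM op ratio = {ratio:,}x")
```

The gap widens linearly with context length, which is why long multimodal inputs (many image tokens) are where a Mamba-based MLLM should pull ahead.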
-
There's a lot one can do with embeddings, from simple and effective similarity-based tasks to downstream tasks like classification. The best embeddings are here: https://lnkd.in/gPbeqzzX
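The simplest of those similarity-based tasks is comparing two embedding vectors with cosine similarity. A self-contained sketch with toy 3-d vectors (real text embeddings have hundreds of dimensions, and the values here are made up for illustration):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": semantically close texts should point in similar directions.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.15, 0.05]
car = [0.0, 0.2, 0.95]

print(cosine_similarity(cat, kitten))  # close to 1.0
print(cosine_similarity(cat, car))     # much lower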
-
Modify Feature QUOMO https://lnkd.in/gb2Hrgq2
-
Lead AI Engineer @ SymphonyAI | Machine Learning, Computer Vision, Industrial | I Help Companies and 100k+ People Define and Build AI Projects and Solutions
YOLOv10 just released! 🔥🔥 We have a new member and version in the YOLO family. For the nano version of YOLOv10 we are talking 1 ms per image (that's 1000 FPS) 🤯 I have not seen such a big jump in performance from any other version release before: better mAP on the COCO benchmark dataset and close to 2x lower latency compared to the other models. It's based on the Ultralytics framework, so we can use it easily in just a few lines of code for both training and inference. We are definitely going to work on some cool videos!
Key highlights 🔑
✅ NMS-free training: improved performance and reduced latency
✅ Spatial-channel decoupled downsampling for ops efficiency
✅ New compact inverted block (CIB)
✅ Holistic design: components optimized for efficiency and capability
✅ YOLOv10: a new generation for real-time object detection
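"NMS-free" means YOLOv10 drops the non-maximum suppression post-processing step that earlier YOLO versions need to prune near-duplicate boxes at inference time, which is where part of the latency win comes from. A minimal sketch of the classic greedy NMS step being eliminated (box format and thresholds are illustrative assumptions):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the best-scoring box, drop overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections of one object, plus a separate object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the lower-scoring duplicate is suppressed
```

Training the model to emit one box per object (via consistent dual assignments, per the paper) removes this whole loop from the inference path.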
-
SuperAGI users will soon be able to power their Autonomous Agents with Local LLMs in v0.0.14 🚀 👉🏼 Check out our video guide on how to set up and use open-source LLMs locally with SuperAGI here: https://lnkd.in/efUYyvCM
Setting up & using local LLMs in SuperAGI
-
A few days ago, one of my colleagues talked about this video and how this technology is changing the world, and I decided to see if I could replicate their work. I started from YOLOv8-pose and the DeepSORT algorithm to detect and estimate human pose and track individuals. The video is from YouTube. You can see the source code in my GitHub repository. I will build on this repo and use more advanced models and ideas; feel free to contribute. It is amazing to see what Ultralytics has done with YOLO, and it's even more exciting that it's all open-sourced.
original_video: https://lnkd.in/dWEdAeCv
source_code: https://lnkd.in/d_r-KbZ8
Ultralytics repo: https://lnkd.in/dK5sduTz
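The tracking half of such a pipeline boils down to associating each frame's fresh detections with existing tracks. DeepSORT does this with Kalman-filtered motion plus appearance embeddings, but the core matching idea can be sketched with plain IoU overlap (a toy greedy matcher under assumed box formats, not the actual DeepSORT code):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def associate(tracks, detections, min_iou=0.3):
    """Greedily match each track's last box to the best-overlapping detection."""
    matches, used = {}, set()
    for tid, tbox in tracks.items():
        best, best_iou = None, min_iou
        for d, dbox in enumerate(detections):
            overlap = iou(tbox, dbox)
            if d not in used and overlap > best_iou:
                best, best_iou = d, overlap
        if best is not None:
            matches[tid] = best
            used.add(best)
    return matches

tracks = {1: (10, 10, 50, 50), 2: (100, 100, 140, 140)}      # last known boxes
detections = [(102, 98, 142, 139), (11, 12, 51, 51)]         # same people, moved slightly
print(associate(tracks, detections))
```

Real DeepSORT replaces this greedy loop with Hungarian assignment and mixes in an appearance-similarity term so identities survive occlusions.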
-
This groundbreaking leap in YOLOv10 performance opens up a realm of possibilities for real-time object detection projects, with the improvement coming largely from spatial-channel decoupled downsampling. Imagine deploying it for smart-city surveillance, enhancing retail analytics, or even creating interactive AR experiences. The potential is limitless, and I can't wait to dive into the world of Computer Vision again. #MachineLearning #ObjectDetection #YOLOv10 #ComputerVision
-
Digital transformation | Innovation strategy | AI | Data culture | Bridging domains | IT is everywhere
Incredible how the YOLO family of object detection models (yes, this is NOT GenAI!) just keeps going 🚀, every time pushing its capabilities to new levels. Even this new 10th generation still manages to significantly enhance its performance metrics 👏 I clearly remember incorporating an early version (3-ish) in COWI's national streetview campaign and how it enabled transitioning to fully AI-automated GDPR blurring of faces and licence plates 🥳 Generative AI is without doubt taking all the headlines, but the newly coined "Traditional AI" is still the main value driver 🦾⚙️
It's great to see advancements in technology that emphasize transparency and reproducibility. Keep up the good work!