📣 New Text-to-Video research released: VADER
- Adapts large-scale T2V models to specific downstream tasks
- Uses pre-trained reward models instead of supervised fine-tuning
- Enables aesthetic video generation, better alignment between text & videos, and videos 3x longer than the lengths seen in training 🤯
✨ VADER: Video Diffusion Alignment via Reward Gradients
Model on @huggingface Hub: https://lnkd.in/gfs_E9FX
Code: https://lnkd.in/gRW_vBFV
We welcome community Gradio implementations for VADER-VideoCrafter, VADER-Open-Sora, and VADER-ModelScope. You can request GPU grants for your demo submissions. Start today: Gradio.dev
Gradio’s Post
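The core idea in the post — steering a generator by ascending gradients from a differentiable reward model, rather than by supervised fine-tuning — can be sketched with a toy numpy example. This is purely illustrative: the one-parameter "generator", the quadratic reward, and the learning rate are invented stand-ins, not VADER's actual models or training loop.

```python
import numpy as np

# Toy sketch of reward-gradient alignment (NOT VADER's real code):
# a trivial "generator" g(theta) = theta, and a differentiable reward
# R(x) = -(x - target)^2 that peaks at the preferred output.
target = 3.0

def reward(x):
    return -(x - target) ** 2

def reward_grad(x):
    return -2.0 * (x - target)   # analytic dR/dx

theta = 0.0                       # generator parameter, far from the optimum
lr = 0.1
for _ in range(100):
    x = theta                     # "generated sample"
    theta += lr * reward_grad(x)  # ascend the reward gradient through the sample

print(round(theta, 3))  # → 3.0: the generator drifts to the reward-preferred output
```

In the real setting the reward model scores decoded video frames and the gradient flows back through the diffusion model's parameters; the toy above only shows why no supervised target data is needed.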
-
We have round #2 of the swag giveaway + fun challenge from Intel! The grand prize is an Intel NUC + A770m! Here is what you need to do:
1. Visit the site below.
2. Try our segmentation notebooks (advanced users: try SAM too).
3. Create your own with OpenVINO.
4. Submit your work on GitHub!
Our grand-prize NUC has a pretty maxed-out spec, and you can use it for AI workloads, including LLMs, too!
Link: https://lnkd.in/gTz2XDXm
Thanks Kevin Hartman <3 #iamintel
-
Award-Winning Marketer, Podcast Host, Event MC, TV Host, Online Moderator, Virtual Emcee, Keynote Speaker, Meeting Facilitator and Journalist. My motto is ABC Always Be Connecting ⭐️
For our next talk at Ultralytics #YV23 #YoloVision, Soumik Rakshit, Machine Learning Engineer at Weights & Biases, gets bonus points for rocking a SpongeBob SquarePants background 🧽 Soumik holds the esteemed title of Google Developer Expert in #JAX and his expertise extends to open-source computer vision projects, with a focus on generative computing, image restoration, and computer graphics!
-
Check out this week's episode of #ElixirMix with Jonatan Kłosko #𝗘𝗠𝘅: Elixir, LiveBook, and NX: Innovations in Machine Learning Training and GPU Integration https://lnkd.in/gtzWNdk5
-
Computer Vision Engineer @Ultralytics | Solving Real-World Challenges🔎| Python | Published Research | Open Source Contributor | GitHub 🌟 | Daily Computer Vision LinkedIn Content 🚀 | Technical Writer VisionAI @Medium📝
YOLOv10 vs Ultralytics YOLOv8 🔥⚽
🦊 YOLOv10 offers NMS-free object detection. Is this true? I explored it and ran tests on the visual data shown in the demo, highlighting each algorithm's ability to handle the objects.
Experiment findings 😍
🦁 The postprocessing time of YOLOv10 is better than YOLOv8's, at approximately 0.0x seconds, because of its NMS-free architecture.
🦁 The inference and preprocessing times of YOLOv8 are better than YOLOv10's, which means YOLOv8 still leads in real-time processing.
🦁 Overall, YOLOv8's inference speed remains state of the art, while YOLOv10 becomes the leader in postprocessing speed. So, ideally, this can be considered a minor release rather than a major one. 🚀
Regarding the visual experiments, I noticed YOLOv10 suffers a lot, especially on zoomed-out and zoomed-in views and when objects are far from the camera. Some outcomes are shared in the comments below 👇
Learn more ➡ https://lnkd.in/dsp8i4Mr
#computervision #objectdetection #yolov10 #experimentation #researchanddevelopment
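For context on why an NMS-free head cuts postprocessing time: non-maximum suppression is a greedy per-image loop over candidate boxes that YOLOv10 skips entirely. A minimal pure-Python sketch of classic greedy NMS (box format, scores, and the 0.5 IoU threshold are illustrative choices, not Ultralytics' implementation):

```python
def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    # Greedy NMS: keep the highest-scoring box, drop boxes that overlap
    # it too much, then repeat with the remainder.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]: the near-duplicate box 1 is suppressed
```

This pairwise-overlap loop runs after inference on every frame, which is the cost an NMS-free architecture removes from the postprocessing budget.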
-
Nvidia released BigVGAN v2! 🎧
> Custom CUDA kernel for inference: fused upsampling + activation kernel, up to 3x faster inference on A100
> Improved discriminator and loss: a multi-scale sub-band CQT discriminator and a multi-scale mel spectrogram loss
> Larger training data: trained on datasets containing diverse audio types, including speech in multiple languages, environmental sounds, and instruments
> Permissive pre-trained model checkpoints supporting up to 44 kHz sampling rate and a 512x upsampling ratio
> Models & Demo on the Hub 🤗
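To give a feel for the "multi-scale" loss idea: compare reference and generated audio in the spectral domain at several FFT resolutions and average the distances. The numpy sketch below uses plain magnitude STFTs rather than mel spectrograms, and its window sizes and hop are illustrative assumptions — it is a toy stand-in for BigVGAN's loss, not its implementation:

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    # magnitude STFT via a Hann-windowed sliding FFT
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

def multiscale_spec_loss(ref, gen, sizes=(256, 512, 1024)):
    # L1 distance between magnitude spectrograms at several FFT sizes,
    # averaged over scales; small windows catch transients, large ones pitch.
    return float(np.mean([
        np.mean(np.abs(stft_mag(ref, n, n // 4) - stft_mag(gen, n, n // 4)))
        for n in sizes
    ]))

t = np.linspace(0, 1, 4096, endpoint=False)
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * np.random.default_rng(0).normal(size=t.size)
print(multiscale_spec_loss(clean, clean))        # 0.0 for identical audio
print(multiscale_spec_loss(clean, noisy) > 0.0)  # True: noise raises the loss
```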
-
https://lnkd.in/dJ53_RYC Interesting presentation this week by DeepLearning.AI on how to reduce the memory footprint needed to fine-tune a 7-billion-parameter Llama LLM so that the fine-tuning process fits on a single 16 GB GPU. Techniques explored: quantization, LoRA, QLoRA, and gradient accumulation. Includes a hands-on lab with an example Jupyter notebook.
Efficient Fine-Tuning for Llama-v2-7b on a Single GPU
https://www.youtube.com/
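Back-of-envelope arithmetic shows why these techniques matter. The figures below are rough illustrative assumptions (the adapter size in particular is hypothetical, not a number from the talk): full fine-tuning with Adam needs far more than 16 GB, while a 4-bit base model plus a small trainable LoRA adapter comes in well under it.

```python
# Rough memory arithmetic for fine-tuning a 7B-parameter model
# (illustrative assumptions, not exact figures from the presentation).
params = 7e9

fp16_weights_gb = params * 2 / 1e9           # 2 bytes per parameter
adam_fp32_states_gb = params * 8 / 1e9       # 2 optimizer moments, 4 bytes each
full_ft_gb = fp16_weights_gb + adam_fp32_states_gb  # gradients/activations extra

int4_weights_gb = params * 0.5 / 1e9         # 4-bit quantized base model (frozen)
lora_params = 40e6                           # hypothetical adapter size
lora_train_gb = lora_params * (2 + 8) / 1e9  # adapter weights + their optimizer

qlora_gb = int4_weights_gb + lora_train_gb

print(f"full fine-tune: >= {full_ft_gb:.0f} GB")  # ~70 GB before activations
print(f"QLoRA-style:    ~ {qlora_gb:.1f} GB")     # leaves headroom on a 16 GB GPU
```

Gradient accumulation then keeps activation memory down by trading batch size for extra forward/backward passes.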
-
🚀 LLMs & NLP Innovator | AI & Big Data Engineering Leader | Python Back-end Expert | 15+ Years in Tech | Speaker & Mentor
A look at #Nvidia's #BigVGAN v2, highlighting its enhanced features for audio processing. It details improvements such as a custom #CUDA kernel for faster inference on A100 #GPUs, an improved discriminator and loss function for better audio quality, and the inclusion of diverse audio types in the training data. It also mentions the availability of permissive pre-trained model checkpoints supporting higher sampling rates and upsampling ratios. #mldk #mldktech #mldkgenai https://mldk.tech