Introducing CogVideoX-2B🤯 An open-source text-to-video model, similar to Sora or Gen-3. Requires 18GB for inference.🔥
CogVideoX-2B - How to use it?
- Launch the Gradio app locally from ChatGLM's CogVideo GitHub repo: https://lnkd.in/gwEbT5kJ
- OR play with the demo on Hugging Face Spaces: https://lnkd.in/gMszB7Ew
Can we use video diffusion models for novel view synthesis, similar to Neural Radiance Fields or Gaussian Splats, without retraining? Pablo Vela built a Hugging Face Space using the Rerun Gradio plugin that explores exactly this question.
NVS-solver uses warped views and camera poses to modulate the video diffusion process. This allows consistent view generation in the single-view, multi-view, and monocular-video settings. Focusing on the single-view case, the pipeline looks as follows:
1. First, a camera trajectory is generated to specify which views we want the video diffusion network to synthesize.
2. Next, a monocular depth estimation network (in this case DepthAnythingV2) produces a depth map for the source image.
3. With the image, depth map, and camera trajectory in hand, forward warping places pixels from the source image into each destination view using bilinear splatting.
4. Finally, a video diffusion model (in this case Stable Video Diffusion) generates the views along the camera trajectory, modulating the score function with the scene priors (warped images and camera poses) as guidance. The critical point: this requires no retraining!
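To make step 3 concrete, here is a minimal NumPy sketch of forward warping with bilinear splatting. It assumes known camera intrinsics `K` and a relative pose `(R, t)` between source and target views; the function name and signature are illustrative, not NVS-solver's actual implementation:

```python
import numpy as np

def forward_warp(src, depth, K, R, t):
    """Warp src (H,W,3) into a target view given a per-pixel depth (H,W),
    intrinsics K (3,3), and a relative pose R (3,3), t (3,).
    Pixels are scattered into the target image with bilinear splatting."""
    H, W, _ = src.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3)
    # Unproject pixels to 3D camera coordinates, then apply the relative pose.
    pts = (np.linalg.inv(K) @ pix.T).T * depth.reshape(-1, 1)
    pts = pts @ R.T + t
    # Project the 3D points into the target view.
    proj = (K @ pts.T).T
    u = proj[:, 0] / proj[:, 2]
    v = proj[:, 1] / proj[:, 2]
    out = np.zeros_like(src, dtype=np.float64)
    wsum = np.zeros((H, W, 1))
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    fu, fv = u - u0, v - v0
    colors = src.reshape(-1, 3).astype(np.float64)
    # Splat each source pixel onto its 4 neighbouring target pixels.
    for du, dv, w in [(0, 0, (1 - fu) * (1 - fv)), (1, 0, fu * (1 - fv)),
                      (0, 1, (1 - fu) * fv), (1, 1, fu * fv)]:
        uu, vv = u0 + du, v0 + dv
        valid = (uu >= 0) & (uu < W) & (vv >= 0) & (vv < H)
        np.add.at(out, (vv[valid], uu[valid]), colors[valid] * w[valid, None])
        np.add.at(wsum, (vv[valid], uu[valid]), w[valid, None])
    # Normalize by accumulated weights; unfilled pixels stay black (holes).
    return out / np.maximum(wsum, 1e-8)
```

With an identity pose the warp reproduces the source image; with a real pose, the holes and stretching in the warped result are exactly what the diffusion model's guidance has to fill in.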
Give it a whirl and check out the links in the comments:
How to get Llama 3.1 8B running on Mac, 100% local, powered by llama.cpp 🔥
Two steps:
1. brew install llama.cpp
2. llama-cli --hf-repo reach-vb/Meta-Llama-3.1-8B-Instruct-Q6_K-GGUF \
--hf-file meta-llama-3.1-8b-instruct-q6_k.gguf \
-p "Hello?" --ctx-size 8192
It's a powerful model using ~6.5GB RAM. ⚡
That's it! 🤗
📢Introducing MeshAnything V2! It surpasses MeshAnything in both performance and efficiency! 💯 It efficiently produces high-quality, highly controllable 3D meshes for various 3D asset production pipelines: image-to-mesh, text-to-mesh, point-cloud-to-mesh, 3DGS-to-mesh & NeRF-to-mesh.
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization
🤗Demo + details👇
Official Gradio demo on Hugging Face Spaces: https://lnkd.in/g9CpPSHA
FLUX.1 [schnell] is blowing minds across the SD & Midjourney art communities.🤯 All attached images were generated with FLUX by the community, and they are unbelievable. Some of them actually look like real photographs!
- Try out the new SOTA text-to-Image model today with a Gradio demo on Hugging Face Spaces for free!!
- FLUX's prompt handling is way better than SD3's & Midjourney's.
- You might find it tough to believe these generated images are from FLUX and not Midjourney!
- There are a couple of examples of artists generating character sheets with similar-looking characters out of the box.
- Some examples show FLUX being used as a Jewelry Design tool!😍
- FLUX -> More realistic and more detailed images than SD3
Gradio demo on Hugging Face Spaces: https://lnkd.in/dQSXrRpr
Think you know the key to powerful AI? It's not just about fancy models; it's about the data.
Voxel51's first-ever Data-Centric AI Competition on Hugging Face Spaces challenges you to prove that data eats models for lunch!
The Challenge: Curate, Don't Just Compute
Your mission is to create a smaller, more efficient subset of our provided dataset (65,986 images, 43 object classes) that maintains or improves the performance of a YOLOv8m object detection model.
This means:
✅ Removing redundant or unhelpful data
✅ Fixing labeling errors
✅ Applying data augmentation
🚫 Don't use external data, add new annotations, or generate synthetic images.
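As one illustration of "removing redundant data", here is a tiny, hypothetical near-duplicate filter built on an average-hash. It is only a sketch; the FiftyOne tutorial provided with the competition covers more principled curation tools:

```python
import numpy as np

def average_hash(img, hash_size=8):
    """Tiny perceptual hash: block-average a grayscale image (H,W) down to
    hash_size x hash_size cells, then threshold each cell at the mean."""
    h, w = img.shape
    bh, bw = h // hash_size, w // hash_size
    small = img[:bh * hash_size, :bw * hash_size].reshape(
        hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    return (small > small.mean()).flatten()

def near_duplicates(images, max_hamming=4):
    """Return indices of images whose hash is within max_hamming bits of an
    earlier image's hash -- candidates for removal from the curated set."""
    seen, dupes = [], []
    for i, img in enumerate(images):
        h = average_hash(img)
        if any((h != s).sum() <= max_hamming for s in seen):
            dupes.append(i)
        else:
            seen.append(h)
    return dupes
```

Dropping near-duplicates shrinks N while barely moving mAP, which is exactly what the scoring metric rewards.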
🛠️Your Toolkit
• FiftyOne: An open-source tool for dataset curation and analysis (tutorial provided!)
• YOLOv8m model: From the Ultralytics Model Zoo
• Your ingenuity!
Winning Formula
We're looking for the sweet spot between dataset size and model performance.
Our scoring metric reflects this: Score = (mAP * log(N)) / N
Where:
mAP: Mean Average Precision on a hidden test set
N: Number of images in your curated dataset
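In code, the metric above looks like this; since log(N)/N shrinks as N grows, the same mAP on a smaller curated set always scores higher:

```python
import math

def score(map_value, n_images):
    """Competition metric: Score = (mAP * log(N)) / N.
    Rewards keeping mAP high with as few images as possible."""
    return map_value * math.log(n_images) / n_images

# Example: the same mAP on half the images roughly doubles the score.
full = score(0.60, 10_000)
half = score(0.60, 5_000)
```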
Prizes Worth Fighting For
🥇 1st Place: $1,000
🥈 2nd Place: Top Tier Community Swag Package
🥉 3rd Place: Mid-Tier Community Swag Package
Timeline
🚀 Launch: August 1, 2024
Submissions Open: September 1 - October 27, 2024
🏆 Winners Announced: November 6, 2024
Support System
We've got your back! Benefit from:
• Workshops: "Getting Started with FiftyOne" (August 21 & September 25)
• Check-in/Office Hours: September 6, 13, & 20
Why Participate?
• Level up your data curation skills
• Make a real-world impact on AI development
• Connect with the AI community
• Gain recognition for your expertise
#data #artificialintelligence #computervision #deeplearning
CatVTON: a simple and highly efficient virtual try-on diffusion model🤩 Lightweight (only 899.06M parameters), parameter-efficient training (only 49.57M trainable parameters), and inference in under 8GB of VRAM at 1024×768 resolution
😍 Gradio App locally:
To deploy the Gradio app on your machine, visit the CatVTON project and just run the app.py file; checkpoints will be downloaded automatically from Hugging Face.
Visit: https://lnkd.in/g-mbWPJY
In case you didn't already know: the awesome FLUX model from Black Forest Labs is supported in 🧨 Diffusers.
The model has about 12B parameters plus two text encoders, so it can be burdensome to run. We have put together a little gist guiding you on how to run FLUX with limited resources.
What a model!
Check it out here:
https://lnkd.in/gr3R9Eza
SF3D is super-fast image-to-3D. It generates a 3D mesh in 0.5 seconds🤯 The attached video is not sped up! Demo + key details 👇
SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement
- SF3D is from Stability AI and is based on their previous work TripoSR.
- 🤩 Bonus: You can also select an HDR environment to light up your 3D model.
- Great work by Mark Boss on the model development and release!🙌
🥳 You can launch a Gradio demo in just 3 steps from the SF3D repo: https://lnkd.in/gvpevyez
🤗 OR access the demo on Hugging Face Spaces: https://lnkd.in/gdUFdU5d