LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
speech-to-text
speech-to-speech
large-language-models
multimodal-large-language-models
speech-language-model
speech-interaction
Updated Sep 24, 2024 - Python