Today marks an inflection point for open-source AI with the launch of Meta's Llama 3.1 405B, the largest openly available foundation model, which rivals the best closed-source models and will rapidly accelerate the adoption of open-source AI among developers and enterprises. We are excited to partner with Meta to bring all the Llama 3.1 models (8B, 70B, 405B, and LlamaGuard) to Together Inference and Together Fine-tuning.

Together Inference delivers horizontal scalability with industry-leading performance: up to 80 tokens per second for Llama 3.1 405B and up to 400 tokens per second for Llama 3.1 8B, 1.9x to 4.5x faster than vLLM while maintaining full accuracy against Meta's reference implementation across all models. Together Turbo endpoints are available at $0.18 for 8B and $0.88 for 70B, a cost 17x lower than GPT-4o. This empowers developers and enterprises to build generative AI applications at production scale in their chosen environment: Together Cloud (serverless or dedicated endpoints) or private clouds. As the launch partner for the Llama 3.1 models, we're thrilled for customers to get the best performance, accuracy, and cost for their generative AI workloads on the Together Platform while keeping ownership of their models and their data secure.

Function calling is supported natively by each of the models, and JSON mode is available for the 8B and 70B models (coming soon for the 405B model).

Together Turbo endpoints empower businesses to prioritize performance, quality, and price without compromise. They provide the most accurate quantization available for Llama 3.1 models, closely matching full-precision FP16 models. These advancements make Together Inference the fastest engine for NVIDIA GPUs and the most cost-effective solution for building with Llama 3.1 at scale.

https://lnkd.in/gFwBNQhJ
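To make the JSON mode mentioned above concrete, here is a minimal sketch of how a request body for Together's OpenAI-compatible chat completions endpoint might look with JSON mode enabled. The endpoint URL, the model id, and the `response_format` field are assumptions based on the OpenAI-compatible API convention; check the Together documentation for the exact names before use.

```python
import json

# Assumed endpoint URL for Together's chat completions API (verify in docs).
TOGETHER_CHAT_URL = "https://api.together.xyz/v1/chat/completions"


def build_json_mode_request(
    prompt: str,
    # Assumed model id for the Llama 3.1 8B Turbo endpoint (verify in docs).
    model: str = "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
) -> dict:
    """Build a chat-completion request body with JSON mode enabled."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Answer only with valid JSON."},
            {"role": "user", "content": prompt},
        ],
        # response_format asks the endpoint to constrain output to valid JSON.
        "response_format": {"type": "json_object"},
        "max_tokens": 256,
    }


payload = build_json_mode_request("List three Llama 3.1 model sizes as a JSON array.")
# Actually sending the request needs an API key, e.g. with the requests library:
#   requests.post(TOGETHER_CHAT_URL,
#                 headers={"Authorization": f"Bearer {API_KEY}"},
#                 json=payload)
print(json.dumps(payload, indent=2))
```

The same payload shape works for the 70B Turbo endpoint by swapping the model id; per the post, JSON mode for the 405B model is coming soon.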