Key Moments
Building an open AI company - with Ce and Vipul of Together AI
Together AI builds open, decentralized AI systems, focusing on efficient training, inference, and novel architectures like State Space Models.
Key Insights
Together AI's core mission is to foster open and independent AI systems, providing a platform for open-source and user-owned models.
The company is building a decentralized cloud infrastructure, combining data centers globally rather than relying on hyperscalers.
Together AI prioritizes data quality, exemplified by their RedPajama datasets, offering flexible filtering and quality signals for users.
State Space Models (SSMs) are a significant research focus, promising more efficient training and inference, not just for long contexts but also for general improvements.
Optimizing inference is a multi-dimensional challenge, requiring a joint approach combining algorithms, model architecture, systems, and hardware.
The GPU market remains tight, emphasizing the value of efficient compute utilization and Together AI's decentralized infrastructure approach.
THE ORIGINS AND MISSION OF TOGETHER AI
Together AI was founded in June 2022 with a fundamental mission to build open and independent AI systems. The company views AI as one of the most consequential technologies of our time and aims to create a platform for open-source and user-owned models, differentiating itself from the proprietary platforms of large frontier model labs. This ethos of openness and decentralization is deeply embedded in their platform strategy and business model. They are actively building their own AI supercomputers through a disaggregated and decentralized network of data centers, choosing not to rely on traditional hyperscalers.
FROM APPLE'S ECOSYSTEM TO OPEN SOURCE ETHOS
Drawing from experience at Apple, where technology is often hidden behind polished products to create seamless user experiences, Together AI aims to bring a similar focus on developer experience to their open platform. The co-founders emphasize applying complex technology to simplify everyday tasks for developers. Past work with deep learning systems, including an open-domain Q&A system, highlighted the power and potential of these technologies, especially as models scale. The advent of large-scale models, driven by algorithms that improve with scale, marked a new era of computing, reinforcing the company's direction.
THE REDPAJAMA DATASET INITIATIVE
Recognizing the critical role of data in AI development, Together AI launched the RedPajama dataset. This initiative builds upon previous community efforts like C4 and LLaMA's data recipes, aiming to provide a high-quality, reproducible dataset for open model pre-training. RedPajama V1 was a best-effort reproduction of the LLaMA dataset, and V2 expanded significantly to 30 trillion tokens. V2 incorporated lessons learned from V1, focusing on modularity and data quality signals, offering 40 pre-computed quality signals to allow users to tailor datasets to specific applications, moving beyond a one-size-fits-all approach.
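The quality-signal approach can be sketched as a simple threshold filter over per-document signals. The signal names and cutoffs below are hypothetical stand-ins for illustration; RedPajama V2's actual signals have their own names and distributions:

```python
# Hypothetical quality-signal names and thresholds, for illustration only.
# RedPajama V2 ships ~40 pre-computed signals per document; users pick
# their own filtering recipe rather than a one-size-fits-all cut.
THRESHOLDS = {
    "doc_word_count": (50, None),          # at least 50 words
    "doc_frac_unique_words": (0.5, None),  # lexical-diversity floor
}

def passes(signals, thresholds):
    """Keep a document only if every signal falls inside its (lo, hi) range."""
    for name, (lo, hi) in thresholds.items():
        value = signals.get(name)
        if value is None:
            return False
        if lo is not None and value < lo:
            return False
        if hi is not None and value > hi:
            return False
    return True

# Toy documents with pre-computed signals attached.
docs = [
    {"text": "a long document", "quality_signals": {"doc_word_count": 120, "doc_frac_unique_words": 0.7}},
    {"text": "short spam", "quality_signals": {"doc_word_count": 10, "doc_frac_unique_words": 0.9}},
]
kept = [d for d in docs if passes(d["quality_signals"], THRESHOLDS)]
```

Because the signals are shipped with the data rather than baked into it, two users can derive very different training sets from the same 30-trillion-token pool by changing only the threshold table.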
ADVANCING MODEL ARCHITECTURES: STATE SPACE MODELS
Together AI is heavily investing in research, with a significant portion of its team dedicated to exploring novel model architectures, particularly State Space Models (SSMs). This research is driven by the limitations of current Transformer architectures in inference speed and cost, especially for long sequences. SSMs offer a path toward more efficient training and inference, reducing computational complexity and enabling longer context windows. Hybrid architectures such as StripedHyena combine SSM layers with attention, exploring how different components can best process information across a context, and promise advances beyond long-context capabilities alone.
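The efficiency argument can be seen in a minimal scalar state space recurrence: each step updates a fixed-size state, so processing a sequence costs time linear in its length with constant memory per step, versus attention's quadratic pairwise cost and a KV cache that grows with context. This toy sketch deliberately omits what makes real SSMs work (learned, input-dependent parameters, multi-dimensional states, parallel scans):

```python
def ssm_scan(u, a=0.9, b=1.0, c=1.0):
    """Minimal 1-D state space recurrence:
        x_t = a * x_{t-1} + b * u_t
        y_t = c * x_t
    Cost is O(sequence length) with O(1) carried state, unlike attention's
    O(n^2) pairwise interactions and growing KV cache."""
    x, ys = 0.0, []
    for ut in u:
        x = a * x + b * ut
        ys.append(c * x)
    return ys

# Impulse response decays geometrically: ~1, 0.9, 0.81, 0.729
ys = ssm_scan([1.0, 0.0, 0.0, 0.0])
```

During generation, only the scalar state `x` needs to be kept between tokens, which is why SSM-style layers are attractive for cheap long-context inference.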
OPTIMIZING AI INFERENCE AND INFRASTRUCTURE
Optimizing AI inference is a key focus, viewed as a multi-dimensional problem requiring advancements in algorithms, model architectures, systems, and hardware. Together AI employs a combination of approximately 50 'tricks and techniques' to enhance inference performance. Their decentralized infrastructure, utilizing thousands of GPUs (primarily H100s and A100s), aims for optimal compute utilization. The company is building a serverless platform to make AI development more accessible, allowing users to train, fine-tune, and run models without substantial upfront commitments, significantly lowering the barrier to entry for AI-driven applications.
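One concrete reason inference optimization spans architecture, systems, and hardware at once is KV-cache memory, which grows linearly with context length and batch size. A back-of-the-envelope sketch; the model configuration below is an illustrative assumption, not any specific model's published spec:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Per-batch KV-cache memory: 2 tensors (K and V) * layers * KV heads
    * head dimension * sequence length * batch size * bytes per element
    (2 bytes for fp16/bf16)."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Illustrative 70B-class configuration with grouped-query attention;
# these numbers are assumptions for the sketch.
gib = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                     seq_len=4096, batch=1) / 2**30  # 1.25 GiB
```

At larger batch sizes or longer contexts this cache competes directly with model weights for GPU memory, which is why architectural choices (fewer KV heads, SSM layers) and systems choices (paging, quantizing the cache) are tuned jointly rather than in isolation.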
THE CHALLENGES OF BENCHMARKING AND DATA ACCESSIBILITY
The discussion highlighted the complexities and potential pitfalls of AI benchmarking, emphasizing the need for standardized, independent, and transparent methodologies to avoid over-optimization and misaligned incentives. Together AI actively provides feedback to ensure benchmarks truly reflect technical progress. Furthermore, data accessibility remains a crucial issue, as the demand for diverse, high-quality data beyond publicly available internet data grows. The company believes a marketplace for data utilization and a balanced approach to data openness are necessary for the overall advancement of open AI.
THE FUTURE OF EMBEDDINGS AND TRAINING
Embeddings are considered a fundamental building block for many AI applications, including retrieval-augmented generation (RAG) and foundational model training. Together AI sees significant room for improvement in embedding quality and speed, enabling more accurate semantic understanding and closing the loop in iterative model development. On the training front, the company observes a continuous cycle of model refinement rather than one-off training runs, indicating sustained demand for robust training infrastructure and expertise, which Together AI aims to provide through its platform.
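At its core, embedding-based retrieval for RAG reduces to nearest-neighbor search under a similarity metric. A minimal sketch with toy 3-dimensional vectors standing in for real embedding-model outputs:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, corpus, k=2):
    """Rank (text, embedding) pairs by similarity to the query; return top-k texts."""
    scored = sorted(corpus, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Toy embeddings; a real pipeline would embed documents and queries
# with the same embedding model.
corpus = [
    ("doc about GPUs", [0.9, 0.1, 0.0]),
    ("doc about cooking", [0.0, 0.2, 0.9]),
    ("doc about datacenters", [0.8, 0.3, 0.1]),
]
top = retrieve([1.0, 0.2, 0.0], corpus, k=2)
```

Better embeddings sharpen the similarity ranking, and faster embedding inference shortens the index-build and query path, which is where the room for improvement noted above pays off directly.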
UNSOLVED QUESTIONS AND TOGETHER AI'S VISION
Looking ahead, Together AI is focused on developing a comprehensive framework for understanding the impact of advanced AI systems and is continuously hiring across all areas, from research to systems engineering. They believe in the power of compounding innovation across the entire AI stack. The co-founders are committed to their vision of open and decentralized AI, seeing their work as a dream job that aligns with their passion for advancing the field collectively, emphasizing that true innovation comes from combining expertise across diverse layers of the technology stack.
Mentioned in This Episode
●Products
●Software & Apps
●Companies
●Organizations
●Books
●Concepts
●People Referenced
Common Questions
What is Together AI's mission? Together AI's mission is to build open and independent AI systems. They aim to provide a platform for open-source models and user-owned AI, contrasting with closed platforms from large labs.
Mentioned in this video
A foundational model that generated excitement in the AI community, influencing the development of the RedPajama dataset.
A large dataset from Google, mentioned as an inspiration for the RedPajama dataset.
Text Generation Inference, a toolkit for deploying large language models, mentioned in the context of machine learning systems expertise.
An optimization technique for attention mechanisms in transformers, open-sourced and contributing to better AI models.
A tool from AI2 that provides a flexible format for quality signals, similar to Together AI's approach with RedPajama V2.
A Chinese embedding model that previously topped the MTEB chart, mentioned in the context of the rise of Chinese models in this domain.
An infrastructure as code tool, mentioned as an example of devops expertise sought by Together AI.
Possibly a reference to vLLM, an open-source LLM inference engine, mentioned in the context of machine learning systems expertise.
A collaborative spam filter created by one of the co-founders, mentioned as an early open-source project.
Amazon Web Services, mentioned for its revenue and as a benchmark for the scale of AI hyperscaler buildouts.
A container orchestration system, mentioned as an example of the systems expertise sought by Together AI.
NVIDIA's parallel computing platform and API, mentioned as an area of expertise sought by Together AI.
A state space model architecture that changed the perception of sub-quadratic architectures, highlighting efficiency beyond just long context.
A hybrid architecture model developed by Together AI, combining state space models with transformers to achieve high quality.
A startup focused on embeddings, mentioned as an example of new companies emerging in this specialized area.
Mentioned for its LSTM paper in 2016, which was used as a basis for an open-domain Q&A system developed at Apple.
Mentioned as a contrast to Together AI's open ethos, highlighting its focus on polished user experience and hidden technology.
Mentioned as a platform that shut down its API, contributing to data fragmentation.
Mentioned regarding its C4 dataset and its paper on Federated Learning.
The primary manufacturer of GPUs, discussed in the context of supply, demand, and market tightness for AI compute.
An AI company focused on building open and independent AI systems, offering a platform for open models and custom model development.
The first company founded by a co-founder, which built commercial products around open-source software.
The initial version of the RedPajama dataset, created to reproduce the data recipe from LLaMA's paper.
An updated version of the RedPajama dataset with 30 trillion tokens and an emphasis on data quality through modular filtering.
A dataset that performs deduplication over the RedPajama data, mentioned as an example of community building on RedPajama.
Analyst from SemiAnalysis whose inference market post was discussed, with some analysis errors noted.
Co-founder and CEO of Together AI, with a background in product and entrepreneurship.
Mentioned as someone who discussed with Vipul how to reduce the cost of building AI models, and introduced Vipul to Ce.
One of the co-founders of Together AI, focusing on the technical side and system architecture.