What is 'continual learning' in the context of AI?

Continual learning refers to AI systems that can continuously learn and improve over time from new data and user interactions, rather than being static models that need full retraining. This allows AI to adapt and become more accurate by incorporating feedback and real-world usage patterns.

How does Trajectory's platform work?

Trajectory's platform distills all data, expert traces, and agent interactions into a format called 'trajectory,' which is then used to create training environments. This enables self-serve optimization of AI models, making them faster, cheaper, and more accurate by learning from real-world signals.

What are the challenges in data curation for continual learning?

A key challenge is that companies store usage data differently, and signals like 'accept/reject' or 'thumbs up/down' are often too noisy. Trajectory focuses on capturing more nuanced signals, like user modifications to agent outputs or corrections made in workflows, which provide richer learning opportunities.
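As a rough illustration of why an edit carries more signal than a thumbs up/down, the sketch below compares an agent's output with the user's corrected version using Python's standard difflib module. The function name `edit_signal` and the returned fields are hypothetical and are not Trajectory's actual schema.

```python
import difflib

def edit_signal(agent_output: str, user_final: str) -> dict:
    """Summarize how a user changed an agent's output (hypothetical helper)."""
    agent_words = agent_output.split()
    final_words = user_final.split()
    matcher = difflib.SequenceMatcher(a=agent_words, b=final_words, autojunk=False)
    # Keep only the spans the user actually changed.
    changes = [
        {
            "op": op,
            "before": " ".join(agent_words[i1:i2]),
            "after": " ".join(final_words[j1:j2]),
        }
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]
    return {
        "accepted_unchanged": not changes,  # all a binary accept/reject would record
        "similarity": matcher.ratio(),      # 1.0 means the texts are identical
        "changes": changes,                 # where and how the user corrected the agent
    }

signal = edit_signal(
    "The contract terminates on 1 March 2025.",
    "The contract terminates on 31 March 2025.",
)
```

A binary signal would record only that the output was not accepted as-is; the `changes` list additionally pinpoints the corrected date.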

What is Self-Distillation Policy Optimization (SDPO)?

SDPO is an advancement in training algorithms for continual learning. Unlike traditional Reinforcement Learning (RL) that relies on a single reward signal, SDPO uses privileged information or 'hints' to refine a teacher model, allowing for more nuanced guidance and faster convergence by learning from actual text-based feedback.
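One plausible reading of this description, offered only as a sketch: the same model predicts the response tokens twice, once from the prompt alone (the "student") and once from the prompt plus a textual hint (the "teacher"), and training pulls the student toward the teacher token by token. The toy distributions and function names below are assumptions for illustration; the actual SDPO objective may differ.

```python
import math

def kl_divergence(teacher: list[float], student: list[float]) -> float:
    """KL(teacher || student) for two distributions over the same vocabulary."""
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)

def self_distillation_loss(teacher_dists, student_dists) -> float:
    """Mean per-token KL between hint-conditioned and plain predictions."""
    per_token = [kl_divergence(t, s) for t, s in zip(teacher_dists, student_dists)]
    return sum(per_token) / len(per_token)

# Toy next-token distributions over a 3-word vocabulary for a 2-token response.
# Student: the model given only the prompt.
student = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]
# Teacher: the same model given the prompt plus a hint, e.g. a user correction.
teacher = [[0.1, 0.8, 0.1], [0.4, 0.4, 0.2]]

loss = self_distillation_loss(teacher, student)
```

Unlike a single scalar reward for the whole response, this signal is per token: here only the first token, where the hint changed the teacher's prediction, contributes to the loss.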

What is the '5 days of Trajectory' launch event?

The '5 days of Trajectory' was a launch event where the company showcased their research and product advancements, including open-sourcing a training stack for continual learning and discussing future product development.

How is Trajectory's approach to AI training different from traditional methods?

Traditional AI training is often linear and research-focused. Trajectory is building a production-ready training stack with a systems-level mindset, enabling continuous, parallelized training runs and dynamic workload management, similar to advanced operating system schedulers.

What is Trajectory's long-term vision for AI?

Trajectory aims to move beyond AI-native companies to incumbent tech firms and eventually Fortune 500 companies. Their vision is for any product's interface to dynamically learn from its users, driving constant updates and improvements through true continual learning across models, harnesses, and memory layers.

Key Moments

⚡️Every product of the future will be a living system — Ronak Malde, Trajectory.ai

Latent Space Podcast

Science & Technology | 6 min read | 34 min video

Jun 21, 2026|886 views|21|2

TL;DR

AI agents make costly mistakes, and the corrections users supply are usually discarded. Trajectory.ai's platform captures these interactions in a novel 'trajectory' format and learns from them, promising a future where AI continually improves with user interaction.

Key Insights

The acquisition of Windsurf by Google DeepMind for $2 billion occurred just 35 days after the company's launch of its foundation model, SWE-1.

Trajectory.ai's platform distills user interactions and agent attempts into a 'trajectory,' a format encompassing all data needed for training, evaluation, and environment setup for AI agents.

Using their platform, Trajectory.ai partnered with NVIDIA and Harvey to train Nemotron 3 Super, improving legal workflow metrics by 20% and offering a drastically cheaper and faster alternative to frontier models.

The company's onboarding time for new customers has decreased from three months to under a week, enabling rapid model training and flywheel acceleration.

Trajectory.ai has open-sourced a training stack for continual learning in conjunction with SkyRL and Anyscale, demonstrating a 50% reduction in wall-clock time when two training jobs run concurrently.

Trajectory.ai's 'act two' involves making the platform legible to customers with user-friendly observability and evaluation tools, granting them direct control over model modification and deployment.

From coding agents to continual learning

Ronak Malde, CEO of Trajectory.ai, reflects on his journey co-founding Windsurf, an AI coding agent company that saw rapid innovation with the advent of models like Sonnet 3.5. A pivotal moment was witnessing an agent instantly create the game 2048, highlighting the 'magical' potential of early AI. This experience, coupled with a thesis that structured information like code would be key to AI's next leap, led him to focus on coding agents. Windsurf's success was amplified by developing their own foundation model, SWE-1, which leveraged extensive user signal from agent interactions to outperform existing models. This rapid growth culminated in a swift acquisition by Google DeepMind shortly after the model's success, a testament to the value of real-world user data over static benchmarks.

The rapid acquisition by Google DeepMind

Malde recounts the whirlwind experience of Windsurf's acquisition, which was initially rumored to be by OpenAI. Instead, on a Thursday morning, the team was summoned to a meeting where they learned it was Google DeepMind making the offer, with key figures like Demis Hassabis involved. The transition was incredibly fast; by Friday, the team was already onboarding at DeepMind. This rapid acqui-hire was driven by Google and DeepMind's recognition that owning stellar models wasn't enough; understanding how users interact with products and capturing that signal was crucial, especially as AI moved towards real-world applications. This experience, along with contributing to Gemini 3 and the Antigravity launch, underscored the power of immense compute and cutting-edge technology.

The limitations of static AI and the need for continual learning

Despite the advancements and a comfortable position at DeepMind, Malde felt compelled to address a fundamental limitation in current AI: its static nature. He observed that even powerful AI models, like those used in coding, healthcare, or legal fields, repeat the same mistakes. Corrections and user feedback are often disregarded, essentially wasting valuable data. Malde realized that the 'compounding loop' of learning from real-world usage, similar to what was beginning to emerge in products like Cursor, was the key to unlocking AI's true potential across all domains. This led him to leave DeepMind, declining his share of the $2 billion acquisition, to found Trajectory.ai with co-founders Michael and Arjun, who brought expertise from DeepMind (robotics, Gemini 1.5) and Apple (Vision Pro interaction models), all focusing on how AI interacts with the real world.

Trajectory.ai's platform and its 'trajectory' data format

Trajectory.ai addresses the static AI problem by building a platform for continual learning. Their core innovation is the 'trajectory' data format, which distills all necessary information from user interactions, agent attempts, and expert feedback into a unified structure. This format enables the creation of robust evaluations, judges, and training environments. The platform focuses on optimizing AI agents, models, and harnesses by capturing user signals, especially corrections and modifications to agent outputs. This is critical for fields like law, where exactness is paramount, unlike coding where partial success might be tolerated. By turning these real-world interactions into learnable data, Trajectory.ai empowers companies to continuously improve their AI agents.
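To make the idea of a single record feeding both evaluation and training concrete, here is a minimal sketch of what such a record could look like. The class and field names (`Trajectory`, `Step`, `user_correction`, and so on) are invented for illustration; the source does not describe the real format's schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Step:
    role: str      # e.g. "user", "agent", "tool", or "expert"
    content: str

@dataclass
class Trajectory:
    """Hypothetical record of one task attempt and any correction the user made."""
    task: str
    steps: list[Step] = field(default_factory=list)
    agent_output: str = ""
    user_correction: Optional[str] = None  # edited version, if the user changed it

    def to_eval_case(self) -> dict:
        """Build an evaluation case: the prompt plus the best known answer."""
        reference = (
            self.user_correction
            if self.user_correction is not None
            else self.agent_output
        )
        return {"prompt": self.task, "reference": reference}

    def to_training_pair(self) -> Optional[dict]:
        """Return a rejected/preferred pair only when the user corrected the agent."""
        if self.user_correction is None or self.user_correction == self.agent_output:
            return None
        return {
            "prompt": self.task,
            "rejected": self.agent_output,
            "preferred": self.user_correction,
        }

traj = Trajectory(
    task="Cite the governing clause.",
    steps=[Step("user", "Cite the governing clause."), Step("agent", "Clause 4.2")],
    agent_output="Clause 4.2",
    user_correction="Clause 4.3",
)
```

In this sketch a correction yields both an evaluation reference and a training pair, while an uncorrected attempt yields only an evaluation case.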

Improving legal workflows with Nemotron 3 Super

A key showcase of Trajectory.ai's platform is their partnership with Harvey, a legal AI company. Recognizing the need for sovereign intelligence in regulated industries, they collaborated with NVIDIA to train Nemotron 3 Super. The platform captured Harvey's expert legal workflows, including issue spotting, analysis, citation, and completeness, and the resulting model improved significantly across these metrics. Crucially, Nemotron 3 Super is described as drastically cheaper and faster than frontier models, a vital consideration for large-scale deployments. This demonstrates the platform's ability to enhance domain-specific AI performance while reducing costs, making advanced AI more accessible.

Accelerating customer onboarding and model training

The efficiency of Trajectory.ai's platform is highlighted by its decreasing onboarding and training times. Early engagements took up to three months to set up, akin to 'building the airplane while flying.' However, with iteration and platform maturity, they were able to train a model for Harvey in under a month. More recently, they onboarded a new customer and trained a functional model within a week. This rapid iteration cycle signifies the power of continual learning in practice, allowing companies to quickly benefit from smarter, continuously improving AI agents, setting a new standard for product launches in the AI space.

Advancements in continual learning algorithms and open-source contributions

Trajectory.ai is advancing continual learning through both algorithm development and open-source infrastructure. They have developed and scaled 'Self-Distillation Policy Optimization' (SDPO), an enhancement to Reinforcement Learning (RL) that uses privileged information or 'hints' to improve models rather than relying on a simple reward signal. This lets nuanced corrections and modifications from user interactions guide the agent directly. In collaboration with SkyRL and Anyscale, Trajectory.ai has also open-sourced a training stack for continual learning. The stack optimizes resource allocation for continuous training and sampling, which cuts wall-clock time in half for two concurrent jobs and maintains efficiency at eight or more.
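The halving for two concurrent jobs can be made plausible with a toy model. The sketch below assumes one sampling pool and one training pool, equal phase durations, and that each job must finish training a step before sampling the next. These are illustrative assumptions; this is not the scheduler in the open-sourced stack.

```python
def sequential_time(jobs: int, steps: int, sample_t: float, train_t: float) -> float:
    """Run jobs one after another; each step is a sampling phase then a training phase."""
    return jobs * steps * (sample_t + train_t)

def interleaved_time(jobs: int, steps: int, sample_t: float, train_t: float) -> float:
    """Greedy schedule sharing one sampling pool and one training pool (toy model)."""
    sampler_free = 0.0
    trainer_free = 0.0
    job_ready = [0.0] * jobs  # when each job has finished training its last step
    for _ in range(steps):
        for j in range(jobs):
            # A job samples once the sampler is free and its previous step is trained.
            sample_start = max(sampler_free, job_ready[j])
            sampler_free = sample_start + sample_t
            # It then trains once the trainer is free and its samples exist.
            train_start = max(trainer_free, sampler_free)
            trainer_free = train_start + train_t
            job_ready[j] = trainer_free
    return max(job_ready)

seq = sequential_time(jobs=2, steps=100, sample_t=1.0, train_t=1.0)
par = interleaved_time(jobs=2, steps=100, sample_t=1.0, train_t=1.0)
# Under these assumptions par / seq is close to 0.5: one job samples while the other trains.
```

In this model a single job leaves each pool idle half the time, so a second job fills the gaps almost for free. Scaling beyond two jobs would need more pools or a smarter allocation policy, which the toy model does not attempt to capture.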

The future vision: Legible, controllable, and transformative AI

Trajectory.ai's future roadmap includes making their platform 'legible' and controllable for customers. The next phase focuses on giving customers, such as Product Managers, clear observability into agent performance so they can identify strengths and weaknesses. This will let them directly modify models and agent behavior, so that the AI they run in production keeps getting smarter. Beyond AI-native companies, Trajectory.ai aims to partner with tech incumbents and eventually Fortune 500 companies, transforming knowledge work across all industries. This involves evolving beyond model training to also improving agent harnesses, skills, and memory layers, creating a holistic continual learning solution. Their ultimate goal is to enable any company's product to dynamically learn from its users and constantly update, making AI a truly living system.

Common Questions

Trajectory.ai is a company founded by Ronak Malde and former DeepMind and Apple colleagues. Their mission is to empower every company to adopt continual learning, enabling AI systems to dynamically improve based on real-world user interactions and data.

Topics

Reinforcement Learning, AI & Machine Learning, Technology & Innovation, Programming & Software, Continual Learning, AI Development, Foundation Models, Developer Tools, Data Curation, AI Platforms, Machine Learning Operations

Mentioned in this video

Companies

Trajectory.ai

Ronak Malde's new company focused on building a platform for continual learning across various industries.

DeepMind

An AI research lab acquired by Google, where Ronak Malde and his co-founders previously worked. It was the acquirer in the Windsurf deal.

Codeium

A company where Ronak Malde worked before Windsurf, involved in product engineering and model training.

Mercury

An early foundation model developed at Windsurf by Ronak Malde, used for capturing user signals and improving performance.

OpenAI

A prominent AI company that was initially rumored to be acquiring Windsurf, which was instead acquired by Google/DeepMind.

xAI

A company mentioned as doing a similar deal, highlighting the trend of AI companies focusing on real-world usage and continuous learning.

NVIDIA

A company that Trajectory partnered with to train models, specifically Nemotron 3 Super, for legal workflows.

Rogo

A company partnered with Trajectory, mentioned as an AI-native company.

Decagon

A company partnered with Trajectory, mentioned as an AI-native company.

Mor

A partner company of Trajectory, working with them on AI solutions.

DeepSeek

A Chinese AI lab whose open-source models were considered less promising before the release of DeepSeek V3.

Walmart

A Fortune 500 company mentioned as an example of where Trajectory aims to bring continual learning, observing user behavior to build tailored AI solutions.

Stripe

A company where a member of Trajectory's product team previously worked, focusing on backend SDKs.

Figma

A company where a member of Trajectory's product team previously worked, focusing on interface design.

Thinking Machines

A company mentioned as having similar infrastructure approaches to Trajectory regarding distributed learning.

Organizations

Latent Space

The podcast or platform hosting the interview with Ronak Malde.

Software & Apps

Windsurf

An AI company where Ronak Malde previously worked, focusing on coding agents and user signal capture, later acquired by Google/DeepMind.

Sonnet 3.5

An AI model that had recently been released around the time Windsurf was launched, which the team experimented with.

Gemini

A post-training project at Google that Ronak Malde had the option to join before opting for Windsurf.

SWE-1.5

A later iteration of the SWE-1 model, developed by Cognition.

Antigravity

A project launched by DeepMind after the Windsurf acquisition, to which Ronak Malde contributed.

Gemini 3

A model that Ronak Malde and his team contributed to at DeepMind after the acquisition.

Cursor

An AI coding assistant that exemplifies building models around user actions and real-world usage, a key concept for continual learning.

Harvey

A legal tech company that Trajectory is partnering with to train models for complex legal workflows using continual learning.

Nemotron 3 Super

A model trained by Trajectory in partnership with NVIDIA and Harvey, aiming to achieve state-of-the-art performance in legal workflows.

Clay

A company partnered with Trajectory, mentioned as an AI-native company.

DeepSeek V3

An advanced open-source AI model from China that significantly impressed the AI community.

Kimi

A large trillion-parameter model, mentioned in the context of advanced AI models currently existing, predominantly in China.

GLM

An advanced AI model from China mentioned alongside DeepSeek.

Notion

A tech incumbent company that Trajectory plans to work with in the future, representing a move beyond AI-native companies.

SkyRL

A lab from UC Berkeley that collaborated with Trajectory on open-sourcing a training stack for continual learning.

Slurm

A workload manager used for high-performance computing clusters, mentioned as a comparison for traditional training job management.

Concepts

AR/VR

Augmented Reality and Virtual Reality, areas where Ronak's co-founders worked on AI interaction models, highlighting the focus on AI interacting with the real world.
