AI Dev 26 x SF | Daniel Beutel: Flower SuperGrid Agents

DeepLearning.AI
Education | 7 min read | 31 min video
May 22, 2026 | 221 views

TL;DR

AI is moving from isolated data silos to collaborative networks. Public web data, the basis of most AI today, is less than 1% of the data that exists; collaborative networks can unlock the rest, but building such systems is complex.

Key Insights

1. Less than 1% of the world's data, an estimated 15 trillion tokens of public English data, is currently used for AI training and agent contexts, while an estimated 2,000 trillion tokens remain in private data silos.
2. Flower Labs has achieved several milestones, including running the first H100 GPU in space on the StarCloud 1 satellite and performing the first vision transformer training in space.
3. Flower SuperGrid simplifies the creation of decentralized AI systems, reducing what could be over 250 complex steps to a few clicks.
4. An example involving Dr. Nicholas Con at Dockport showed SuperGrid being used to analyze data from over 250,000 patients to improve primary care, without centralizing the data.
5. Project Kaya is a collaborative AI agent designed to run on SuperGrid, capable of messaging and collaborating with other agents across distributed data silos.
6. Flower's decentralized training pipeline, SuperGrid Frontier, has achieved up to a 1,000x reduction in communication costs for decentralized training runs.

The untapped potential of private data silos

AI development today relies heavily on publicly available data: an estimated 15 trillion tokens, representing nearly all high-quality English data on the web. This pool underpins most foundation model training and powers the web search capabilities of current AI agents, yet it amounts to less than 1% of the data that exists. An estimated 2,000 trillion tokens sit in private data silos of varying sizes, largely unused. Organizations currently try to reach this data by consolidating it into larger data lakes or by striking licensing deals, an approach limited by the difficulty and cost of acquiring and centralizing data. The speaker argues that this 'scaling up' approach is inefficient and that a new paradigm of 'scaling out' through collaborative networks is needed to unlock the data's potential and build more advanced AI systems.
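The "less than 1%" figure follows directly from the two token estimates quoted above. A quick check, using only those two numbers (the variable and function names are illustrative):

```python
# Token estimates quoted in the talk, in trillions of tokens.
PUBLIC_TOKENS_T = 15       # high-quality public English web data
PRIVATE_TOKENS_T = 2_000   # data held in private silos


def public_share(public: float, private: float) -> float:
    """Return the public fraction of all tokens (public + private)."""
    return public / (public + private)


share = public_share(PUBLIC_TOKENS_T, PRIVATE_TOKENS_T)
ratio = PRIVATE_TOKENS_T / PUBLIC_TOKENS_T

print(f"Public share of all tokens: {share:.2%}")      # 0.74%
print(f"Private tokens per public token: {ratio:.0f}")  # 133
```

On these estimates, private silos hold roughly 133 times as many tokens as the public pool.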

Flower Labs' journey and commitment to collaborative AI

Flower Labs' journey began in 2022 at the University of Cambridge with a focus on decentralized AI, challenging the dominant centralized model in which data is moved to the computation. The team advocates the opposite: move the computation to the data, so that data stays where it is while distributed compute and data sources are put to use. They prefer the term 'collaborative AI' because it emphasizes the joint use of resources by different parties. Beutel presented Flower as the industry standard in this space: it is Apache 2.0 licensed, has a community of over 7,000 Slack members and 180 contributors, and hosts the Flower AI Summit, described as the world's largest conference for federated AI systems. The framework is used by organizations across diverse sectors, from Bosch and Samsung to academic institutions and government bodies such as the US Air Force. This open-source foundation led to the establishment of Flower Labs as a startup in 2023, followed by entry into Y Combinator and the release of two DeepLearning.AI courses.

Flower SuperGrid: Simplifying decentralized AI infrastructure

Building decentralized AI systems has traditionally been a long, multi-step process. Daniel Beutel illustrated this by showing a list of over 80 steps, followed by a second list of more than 80, and then a third, all required before actual system development can begin. Flower Labs developed Flower SuperGrid to remove this obstacle, with the goal of making decentralized AI at least an order of magnitude easier to build. To get there, they re-architected the entire stack so that the underlying complexity is hidden from the user. Users go to FL.AI, create a new federation, invite collaborators (akin to creating a project on GitHub), and add 'super nodes', the individual nodes running on data silos. Once the super nodes are added, users have everything needed to run decentralized and distributed AI workloads. Starting a workload is as simple as typing a command like 'flower run' in the terminal and specifying the federation and location; the workload then runs across the entire network. The process shrinks from hundreds of steps to a few clicks, making decentralized AI accessible to a much wider audience.

Collaborative AI agents: Moving beyond isolated reasoning

Current AI agents are typically limited to public data reached via web search or, in more advanced cases, to a single organization's private data. That still leaves vast amounts of data out of reach. The core limitation is the lack of collaboration: just as a human working alone achieves less than a team, AI agents can achieve more by working together, which means messaging each other, sharing information, and receiving responses. In production environments, this collaboration requires governance, auditability, discovery, security, and permissioning layers, which is what Flower SuperGrid provides. The SuperGrid architecture consists of a coordinator (the 'superlink') and distributed 'super nodes' (organizations or data silos). The coordinating agent breaks a task down and sends messages to individual super nodes. These nodes are autonomous and can choose whether to participate. A super node that accepts a message processes it using its local data and sends back a result. Super node operators can also block messages that contain sensitive information from being sent externally, which protects data privacy. The aggregating agent then combines the results into a single answer. This distributed collaboration lets agents solve tasks that no single agent could.
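The message flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the SuperGrid API: the `SuperNode` and `Reply` classes, the `coordinate` function, the `mean_age` task, and the small-cohort rule used as the outbound check are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    node: str
    total: float   # sum over local records (an aggregate, not raw data)
    count: int     # number of local records used


@dataclass
class SuperNode:
    """Hypothetical stand-in for a node running next to one data silo."""
    name: str
    records: list[float]      # local data; never leaves the node
    allowed_tasks: set[str]   # tasks the operator has agreed to run
    min_cohort: int = 3       # block replies built from too few records

    def handle(self, task: str) -> Optional[Reply]:
        # The node is autonomous: it may decline the incoming message.
        if task not in self.allowed_tasks:
            return None
        # Outbound check (assumed policy): a tiny cohort could expose individuals.
        if len(self.records) < self.min_cohort:
            return None
        return Reply(self.name, sum(self.records), len(self.records))


def coordinate(task: str, nodes: list[SuperNode]) -> Optional[float]:
    """Send the task to every node and combine the replies that come back."""
    replies = [r for r in (node.handle(task) for node in nodes) if r is not None]
    if not replies:
        return None
    return sum(r.total for r in replies) / sum(r.count for r in replies)


nodes = [
    SuperNode("clinic-a", [40.0, 50.0, 60.0], {"mean_age"}),
    SuperNode("clinic-b", [30.0, 30.0, 30.0, 30.0], {"mean_age"}),
    SuperNode("clinic-c", [70.0, 80.0, 90.0], set()),   # declines the task
    SuperNode("clinic-d", [20.0], {"mean_age"}),        # reply blocked locally
]
global_mean = coordinate("mean_age", nodes)
print(global_mean)
```

Only sums and counts cross the node boundary, so the coordinator can compute a global mean over the participating silos without seeing any individual record.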

Project Kaya: A practical implementation of collaborative agents

To demonstrate these collaborative agent capabilities, Flower Labs has developed Project Kaya, a collaborative AI agent built to run on the SuperGrid platform. In the demo, Project Kaya first appears to be a standard agentic system, then becomes aware that it has access to the SuperGrid network. It reaches out to super nodes in different regions (San Francisco, Mumbai, Sydney, Seoul). Some nodes may be disconnected or may refuse the message; the system aggregates replies from those that participate, and the central agent processes the aggregated results to generate a final response. The example shows the core idea of decentralization: data never moves, and only learnings are shared, within a governed framework. Project Kaya is currently in early access, and interested parties can apply to the Flower Pilot Program to work with the technology.
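The demo's tolerance for disconnected or refusing nodes can be illustrated with a small fan-out sketch. It is not Project Kaya's implementation: `ask_node`, `fan_out`, `NodeRefused`, and the simulated node behaviours are hypothetical, and the network call is replaced by `asyncio.sleep`.

```python
import asyncio


class NodeRefused(Exception):
    """Raised when a node declines the message (hypothetical)."""


async def ask_node(region: str, behaviour: str) -> str:
    # Stand-in for a network request to one super node.
    if behaviour == "offline":
        await asyncio.sleep(10)    # does not answer within the timeout
    if behaviour == "refuse":
        raise NodeRefused(region)
    await asyncio.sleep(0.01)
    return f"summary from {region}"


async def fan_out(nodes: dict[str, str], timeout: float) -> dict[str, str]:
    """Query all nodes concurrently and keep only the successful replies."""

    async def guarded(region: str, behaviour: str) -> str:
        return await asyncio.wait_for(ask_node(region, behaviour), timeout)

    results = await asyncio.gather(
        *(guarded(region, behaviour) for region, behaviour in nodes.items()),
        return_exceptions=True,    # timeouts and refusals become values
    )
    return {
        region: result
        for region, result in zip(nodes, results)
        if not isinstance(result, BaseException)
    }


nodes = {
    "San Francisco": "ok",
    "Mumbai": "offline",
    "Sydney": "refuse",
    "Seoul": "ok",
}
replies = asyncio.run(fan_out(nodes, timeout=0.2))
print(sorted(replies))   # ['San Francisco', 'Seoul']
```

The central agent would then build its final answer from `replies` alone, so one offline or unwilling node does not block the whole request.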

Flower SuperGrid Frontier: Decentralized training for LLMs

The third building block for collaborative AI is a decentralized training pipeline. Once collaborative agents generate data traces across various locations, a system is needed to train models on this distributed data. Flower Labs brings years of decentralized training research, with publications at top conferences such as ICLR, including techniques such as decoupled embeddings for language model pre-training and Photon for federated pre-training. A recent announcement was the release of Lizzy 7B, an open-weight model built specifically for the UK market, which is reported to generalize well while offering better contextual performance on UK-specific queries. Lizzy 7B was trained using SuperGrid Frontier, Flower's decentralized training pipeline, which has achieved up to a 1,000x reduction in communication costs for decentralized training runs. The process works as follows: a model is initialized and distributed to participating super nodes; each node trains it locally on its private data and sends updated model parameters back to a coordinator; the coordinator aggregates these updates into a single, improved model. The cycle repeats until the model converges, yielding a model that has learned from diverse datasets without the data ever leaving its origin.
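The training cycle described above can be sketched with a toy one-parameter model. The summary does not say which aggregation rule SuperGrid Frontier uses; the sample-weighted average below (in the style of federated averaging) is an assumption, as are the function names, the data, and the hyperparameters.

```python
def local_train(w: float, data: list[tuple[float, float]],
                lr: float = 0.05, steps: int = 5) -> float:
    """Run a few gradient steps on one node's private data for y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def aggregate(updates: list[tuple[float, int]]) -> float:
    """Average the returned weights, weighted by local sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Private data per node; both follow y = 3 * x and never leave this dict entry.
silos = {
    "node-a": [(1.0, 3.0), (2.0, 6.0)],
    "node-b": [(1.0, 3.0), (3.0, 9.0), (4.0, 12.0)],
}

w_global = 0.0                                   # initialize the model
for round_idx in range(20):                      # repeat until converged
    updates = []
    for name, data in silos.items():
        w_local = local_train(w_global, data)    # distribute, then train locally
        updates.append((w_local, len(data)))     # send parameters back
    w_global = aggregate(updates)                # aggregate into one model

print(round(w_global, 3))   # 3.0
```

Only the parameter `w` and a sample count travel between the nodes and the coordinator; the `(x, y)` pairs are read only inside `local_train`.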

The future is a collaborative grid, mirroring electricity and computing

Daniel Beutel closed by reiterating the core message: the next frontier in AI is collaboration, moving from isolated data silos to interconnected, collaborative networks. He drew a parallel with electricity, which went from isolated generators to a unified, collaborative grid that enables seamless energy transactions. Computing, similarly, evolved from standalone mainframes to the interconnected internet and networks of personal computers. Flower Labs is building three layers to support this collaborative future: Flower SuperGrid as the decentralized AI platform, SuperGrid Frontier as the decentralized training pipeline, and FL Agents and Project Kaya as collaborative agents that draw on these distributed networks. In this vision, collaborative networks are the next chapter of AI development, enabling systems and capabilities that isolated approaches cannot deliver.

Common Questions

Flower Labs develops collaborative AI systems, focusing on enabling multiple parties to jointly use compute and data resources. They offer a decentralized AI platform, collaborative agents, and a decentralized training pipeline.
