Why AI Agents Need Context | Deep Dives with a16z

a16z
Science & Technology | 6 min read | 51 min video
Jun 2, 2026 | 7 views

TL;DR

AI agents require extensive context, forcing companies to centralize data. Vendors are locking down APIs, a move that, while seemingly protective, poses a risk to customer data access and future AI integration.

Key Insights

1. AI agents need context, which makes unified data platforms crucial, much as connecting ChatGPT to the internet was a breakthrough.

2. Vendors like SAP are implementing API policies that ban AI agent access to their systems, a reaction to perceived threats from AI.

3. The perceived threat of AI agents disintermediating SaaS vendors is compared to API openness debates in the 1990s, with historical precedent suggesting a continued trend towards openness.

4. Fivetran's 'Open Data Infrastructure' website benchmarks vendors on data access policies, flagging SAP and Salesforce's Slack as weak in this area (Salesforce itself has historically been good on data access).

5. George Fraser argues that 'data gravity' is fake: with efficient change data capture, the actual network traffic needed for data replication is minimal, not a massive movement of data.

6. Fivetran is using AI internally for its core business of building and maintaining connectors, and sees AI as an opportunity to improve quality and reliability rather than solely a threat.

AI agents demand centralized data for contextual understanding

The primary driver for consolidating data into a single location is AI agents' growing need for context. George Fraser, CEO of Fivetran, likens an agent without business data to ChatGPT before it was connected to the internet, when its limited knowledge restricted its utility. For AI agents to be effective in a business environment, they require access to comprehensive and up-to-date business data. Historically, companies centralized data for business intelligence and reporting; now, this same data foundation is essential for AI agents to understand and act upon business operations. Without this context, AI agents operate with a significant knowledge gap, like a model with a knowledge cut-off date that cannot access real-time information.

Vendors' defensive stance against AI agents

In response to the perceived threat of AI agents undermining their value, some Software as a Service (SaaS) vendors are adopting a defensive strategy by restricting data access. Fraser highlights SAP's recent API policy change, which explicitly bans AI agent access unless specifically approved by SAP, as a prime example of this 'lock-it-out' approach. While such policies might not immediately override existing contracts, they signal a broader trend of vendors attempting to control how their systems and data are accessed and utilized by AI. This reaction stems from concerns that agents, by accessing data directly, could replicate or substitute the functions of existing SaaS applications, thereby reducing the perceived value of those applications as primary interfaces.

The historical pattern of API openness repeats with AI

The current vendor reaction to AI agents accessing APIs echoes debates from the 1990s surrounding the opening of APIs. Historically, companies that embraced open APIs survived and thrived by allowing integration and customization, while those that remained closed often became legacy systems. Fraser argues that locking down APIs is a foolish strategy for vendors, as customers have relied on these open APIs for decades. While agents might access APIs in ways that substitute for human workflows, the underlying principle remains the same: open access fosters innovation and customer reliance. The concern that agents will disintermediate SaaS is also tempered by the observation that software costs are often a small percentage of overall business spend, suggesting AI will be used to improve core business functions rather than simply cutting seat licenses.

Centralized data platforms remain vital for AI

Despite the evolving landscape, the need for robust, centralized data platforms remains paramount. Fraser emphasizes that businesses must ensure they have a copy of all their data in a data lake they control. This is crucial for meaningful reporting, understanding business operations, and providing the necessary context for AI agents. Vendors who erect barriers to data access hinder this essential practice. While customers may find workarounds, these often come at significant cost and complexity. Fivetran, through its 'Open Data Infrastructure' initiative, scores vendors based on their data access policies, flagging those that impose egress charges, make complete data extraction difficult, or enforce restrictive terms of use. SAP is identified as a vendor where customer data access has historically been challenging; Salesforce has historically been good on data access, with Slack as the notable exception.

The myth of data gravity and efficient replication

George Fraser challenges the concept of 'data gravity,' which posits that the sheer volume of data makes it prohibitively expensive to move, thereby anchoring services to specific data locations. He argues that this is a misconception rooted in inefficient legacy data pipelines that copied entire datasets daily. With modern techniques like change data capture (CDC), only incremental changes are replicated, resulting in minimal network traffic even for massive datasets. Fivetran's own network dashboards demonstrate this, showing tiny data movement relative to the total data replicated for thousands of customers. This suggests that the cost and complexity associated with moving data are often overstated, and the focus should be on efficient replication rather than assuming data is immovably 'heavy'.
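
The difference between a daily full copy and change data capture can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions: the `Change` record, the in-memory "tables", and the row counts are all hypothetical and chosen for illustration, and they do not describe how Fivetran or any particular database implements CDC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One entry in a hypothetical change log: an upsert or a delete of a row."""
    op: str          # "upsert" or "delete"
    key: int
    value: str = ""


def full_copy(source: dict[int, str]) -> tuple[dict[int, str], int]:
    """Legacy approach: ship every row on each sync. Returns (replica, rows sent)."""
    return dict(source), len(source)


def apply_changes(replica: dict[int, str], changes: list[Change]) -> int:
    """CDC approach: ship only the logged changes. Returns rows sent."""
    for change in changes:
        if change.op == "upsert":
            replica[change.key] = change.value
        elif change.op == "delete":
            replica.pop(change.key, None)
        else:
            raise ValueError(f"unknown op: {change.op}")
    return len(changes)


# Source table with 100,000 rows (illustrative size, not a real workload).
source = {i: f"row-{i}" for i in range(100_000)}
replica, initial_rows = full_copy(source)  # one-time initial load

# One day's activity: 50 updates, 10 inserts, 5 deletes, each recorded in a log.
log: list[Change] = []
for i in range(50):
    source[i] = f"row-{i}-v2"
    log.append(Change("upsert", i, source[i]))
for i in range(100_000, 100_010):
    source[i] = f"row-{i}"
    log.append(Change("upsert", i, source[i]))
for i in range(200, 205):
    del source[i]
    log.append(Change("delete", i))

cdc_rows = apply_changes(replica, log)   # incremental sync
_, full_rows = full_copy(source)         # what a daily full copy would send

print(f"rows sent by full copy: {full_rows}")
print(f"rows sent by CDC:       {cdc_rows}")
print(f"replica matches source: {replica == source}")
```

In this toy setup the replica stays identical to the source while the incremental sync ships only the rows that changed, which is the intuition behind the claim that replication traffic is small relative to total data size.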

AI agents evolving towards distinct identities

The interaction model for AI agents is shifting from treating them as mere software seats or extensions of users to recognizing them as entities with distinct identities. Early approaches involved giving agents access to personal emails and API keys, but concerns about privacy and overwriting human workflows led to a new paradigm. The adoption of independent identities, complete with their own email addresses and phone numbers, allows agents to integrate into existing human workflows without compromising user data. While this can initially create more 'seats' or consumption, the long-term vision might lean towards more consolidated, unified agent systems, though the current intermediate form allows for easier integration into existing business processes.

Fivetran leverages AI to enhance core connector business

Contrary to the notion that AI might commoditize its services, Fivetran sees AI as a powerful tool to enhance its core business of building data replication connectors. The complexity of accurately replicating data from diverse systems of record is immense, often requiring significant human effort. Fivetran is increasingly using AI coding agents, akin to an 'infinite supply of junior engineers,' to identify and fix bugs, improve completeness, and accelerate the development of its 750+ connectors. While the deep intricacies of data replication still present challenges, AI is enabling Fivetran to push the boundaries of quality and reliability, and the company treats it as an opportunity rather than a direct threat to its existence.

The future of data infrastructure and the SaaS apocalypse

Despite predictions of a 'SaaS apocalypse' driven by AI-native companies, Fraser believes the threat is more nuanced. While AI may enable new companies to emerge and compete effectively, the fundamental need for established data infrastructure and platforms like Snowflake, Databricks, or BigQuery remains strong. AI is currently generating more demand for infrastructure rather than commoditizing it. The layer most susceptible to AI disruption is likely the highest abstraction layer, where user-friendly interfaces might be superseded by AI's ability to navigate more complex systems directly. The merger with dbt, a leader in data transformation, further strengthens Fivetran's position to provide the comprehensive data foundations essential for the AI era. Ultimately, large enterprises that have invested in modern data platforms likely already possess a suitable foundation for AI, which points to continuity over radical replacement.

Navigating Vendor Data Access in the Age of AI

Practical takeaways from this episode

Do This

Insist on having a copy of all your company data in a data lake you control.
Write language guaranteeing your own data access into your MSAs, especially for large contracts.
Understand that AI agents primarily use data for context, not usually for training models.
Recognize that open APIs have historically been crucial for SaaS vendor longevity.
Embrace AI to improve business operations rather than solely focusing on reducing software seat costs.
Leverage AI tools to build custom connectors and improve core business functions.
Consider modern data platforms like Snowflake, Databricks, or BigQuery as strong foundations for AI.

Avoid This

Overreact to policy memos from vendors like SAP regarding AI agent access.
Assume data gravity is real; egress charges are less of a barrier than perceived due to change data capture.
Rely solely on vendor-provided AI tools if they restrict data access.
Build unnecessary intermediate layers like MCPs if agents can call APIs and tools directly (though MCPs do solve practical problems).
Assume AI commoditizes infrastructure; it generally increases demand.
Think you need exotic new systems for AI data foundations; existing modern platforms are often sufficient.

Common Questions

Why do AI agents need context?

Without context, AI agents function like a pre-internet ChatGPT. Lacking access to a business's up-to-date internal data, their ability to answer questions and perform tasks is significantly limited, akin to having a knowledge cut-off.

Mentioned in this video

Software & Apps
ChatGPT

Mentioned as a reference point for AI agents needing internet connectivity and data context, comparing it to pre-internet versions.

PostgreSQL

Criticized as old technology with significant technical debt, with the speaker arguing that undergraduate projects produce better databases.

dbt

Mentioned as a company that merged with Fivetran. dbt is a tool used to organize and model data after it's centralized.

NetSuite

One of the SaaS systems from which Fivetran helps customers ingest data.

Open Data Infrastructure

A website and benchmark created by Fivetran to score vendors on their data access policies, evaluating egress charges, data completeness, and terms of use.

Slack

Mentioned as the exception to Salesforce's historically good record on data access: on Slack, Salesforce is described as 'terrible'. Also mentioned as a platform where AI agents might join teams.

BigQuery

Mentioned as a modern data platform that can serve as a good foundation for AI context.

AWS

Mentioned as a cloud vendor and as a platform where Fivetran's networking dashboard shows surprisingly low data movement.

GCP

Mentioned as a cloud platform where Fivetran's networking dashboard shows surprisingly low data movement.

Nanobot

A personal AI agent that the host transitioned to from OpenClaw and Nanoclaw, used for managing a tennis team.

Nanoclaw

An AI agent that the host used after OpenClaw, finding it complex. Debugging Nanoclaw led to the exploration of Nanobot. It's mentioned in the context of debugging efforts.

Selenium

Used for Python browser automation to interact with websites like the USTA site for the tennis team agent.

SQLite

Mentioned as a reference for a proof-of-concept OLTP SQL database project that uses S3 as its backing store.

S3

Mentioned as the backing store for a proof-of-concept OLTP SQL database project.

SQLMesh

Acquired by Fivetran, as part of Fivetran's stated acquisition strategy.

GPT-3

Mentioned as an early AI model that Fivetran has used to build data replication connectors.
