
Why every AI Engineer needs an AI Gateway (ft Portkey.ai CEO)

Latent Space Podcast
Science & Technology | 4 min read | 32 min video
Feb 5, 2025
TL;DR

AI Gateways streamline LLM integration with routing, observability, and guardrails for efficient, secure applications.

Key Insights

1. AI Gateways act as operational platforms for efficient LLM connection, enhancing cost, performance, and accuracy.

2. Routing capabilities are vital for complex AI systems, especially with reasoning models and agentic workflows.

3. Observability allows teams to monitor AI performance and accuracy through centralized metrics and dashboards.

4. Guardrails are crucial for preventing inaccurate or harmful AI responses, focusing on content, length, and sensitive data.

5. Human feedback is essential for AI improvement but often underutilized; effective collection links it to business metrics.

6. MCP (Model Context Protocol) simplifies LLM integration with various services by standardizing tool usage.

7. Open-source gateways can offer high throughput and low latency through lean, optimized architectures.

8. The evolution of AI demands operational-efficiency tooling, much as DevOps practices accelerated cloud adoption.

DEFINING THE AI GATEWAY AND ITS CORE FUNCTIONS

AI Gateways are presented as operational platforms designed to manage connections to Large Language Models (LLMs) more efficiently. They aim to improve cost, performance, and accuracy by abstracting away the complexities of building individual integrations with various AI services. Beyond a simple proxy, a gateway routes traffic to appropriate LLMs and provides added value through features like monitoring, guardrails, prompt management, and governance, simplifying the process for development teams.
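
The "single integration point" idea can be sketched in a few lines: callers hit one entry function, and the gateway dispatches to provider-specific handlers behind it. The provider names and stub handlers below are hypothetical, not any real gateway's API.

```python
# Illustrative sketch: one gateway entry point fanning out to
# provider-specific handlers, so application code never touches
# individual provider SDKs. All names here are made up.

def call_openai_stub(prompt: str) -> str:
    return f"[openai] {prompt}"

def call_anthropic_stub(prompt: str) -> str:
    return f"[anthropic] {prompt}"

PROVIDERS = {
    "openai": call_openai_stub,
    "anthropic": call_anthropic_stub,
}

def gateway(provider: str, prompt: str) -> str:
    """Single integration point: route the request to the right handler."""
    handler = PROVIDERS.get(provider)
    if handler is None:
        raise ValueError(f"unknown provider: {provider}")
    return handler(prompt)

print(gateway("openai", "hello"))  # [openai] hello
```

Monitoring, guardrails, and governance then become middleware wrapped around that one choke point, rather than logic duplicated in every integration.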

ROUTING CAPABILITIES AND COMPLEX AI SYSTEMS

While not always mandatory, routing capabilities are becoming increasingly important for complex AI systems. This is particularly evident with the rise of reasoning models where specific questions might be better handled by more powerful, albeit slower, models compared to simpler ones. Agentic systems also benefit significantly from routing, as different tasks within a multi-step agent can be directed to specialized LLMs, with the gateway managing authorization and other layers.
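
A routing policy can be as simple as a rule that sends hard questions to a slower reasoning model and everything else to a cheap, fast one. The model names and the length heuristic below are illustrative assumptions, not Portkey's actual policy.

```python
# Hypothetical routing rule: complex or explicitly flagged requests go
# to a more capable (slower, pricier) reasoning model; the rest go to
# a fast default. Thresholds and model names are invented.

def route(prompt: str, needs_reasoning: bool = False) -> str:
    """Return the model a request should be forwarded to."""
    if needs_reasoning or len(prompt) > 500:
        return "reasoning-model"  # more capable, higher latency and cost
    return "fast-model"           # good enough for routine tasks

print(route("What is 2+2?"))                               # fast-model
print(route("Prove this theorem.", needs_reasoning=True))  # reasoning-model
```

In an agentic system the same dispatch runs per step, so each sub-task of the agent can land on the model best suited to it.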

OBSERVABILITY FOR AI PERFORMANCE AND ACCURACY

Observability is a critical function of AI Gateways, enabling teams to monitor the performance and accuracy of their AI applications. By emitting metrics from a central point, gateways provide a consolidated view for dashboards, allowing teams to track improvements and identify issues. This is especially valuable for enterprises, where rate and budget limits can be enforced to prevent unexpected cost overruns due to rogue code or inefficient usage.
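
Because every request passes through the gateway, it is also the natural place to meter spend and refuse calls once a cap is hit. Here is a minimal sketch of that budget-limit idea; the costs and cap are made up.

```python
# Sketch of gateway-side budget enforcement: meter the cost of each
# request and block calls that would exceed a configured cap, instead
# of discovering the overrun on the monthly bill.

class BudgetGuard:
    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> bool:
        """Record a request's cost; return False if it would bust the budget."""
        if self.spent_usd + cost_usd > self.limit_usd:
            return False  # block the call rather than overrun
        self.spent_usd += cost_usd
        return True

guard = BudgetGuard(limit_usd=1.00)
assert guard.charge(0.60)       # allowed: $0.60 of $1.00 spent
assert not guard.charge(0.50)   # blocked: would exceed the cap
```

Rate limits work the same way, counting requests per window instead of dollars.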

IMPLEMENTING GUARDRAILS FOR SAFER AI RESPONSES

Guardrails are essential for preventing AI responses that are inaccurate, incomplete, or harmful. While demonstrations often focus on sensitive data redaction, practical implementations frequently involve simpler rules like detecting specific words or empty outputs to trigger orchestration. For commercial models, content filtering and moderation are often built-in, but for open-source models, developers need to build these guardrails themselves, potentially integrating with third-party providers.
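
The "simpler rules" mentioned above can be very small indeed: check for empty output (to trigger a retry) and redact obvious sensitive patterns before the response leaves the gateway. This is a deliberately naive sketch; the regex is illustrative and real redaction needs far more care.

```python
import re

# Minimal guardrail sketch: fail fast on empty model output so the
# orchestration layer can retry, and redact email addresses from
# whatever survives. The pattern is intentionally simple.

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

def apply_guardrails(output: str) -> str:
    if not output.strip():
        raise ValueError("empty model output; hand back to orchestration")
    return EMAIL.sub("[REDACTED]", output)

print(apply_guardrails("Contact me at jane@example.com please"))
# Contact me at [REDACTED] please
```

With open-source models these checks are the developer's responsibility, whereas commercial APIs often ship moderation built in.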

THE VALUE OF HUMAN FEEDBACK AND PRACTICAL COLLECTION

Human feedback is crucial for improving AI applications, yet it remains underutilized. Challenges include determining effective feedback mechanisms beyond simple thumbs up/down and integrating this feedback into production cycles. Successful implementations often link feedback to tangible business metrics, such as download rates for AI-generated videos, thereby closing the loop and providing actionable insights for continuous AI refinement.

MCP AND THE FUTURE OF AGENT CONNECTIVITY

Model Context Protocol (MCP) is emerging as a key technology for simplifying the integration of various services with LLMs, particularly for agents. It standardizes how LLMs interact with tools, making it easier to register and call services like Slack. The protocol's potential for two-way communication, where connected servers can also request completions back from the client's LLM, opens up new avenues for advanced agent capabilities, though its current transport relies on JSON-RPC messages exchanged over stdio.
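
Concretely, MCP traffic is JSON-RPC 2.0, and in the stdio transport each message is a line of JSON written to the server process. The tool name and arguments below are illustrative, not a real server's schema.

```python
import json

# Rough shape of an MCP tool invocation on the wire: a JSON-RPC 2.0
# request using the "tools/call" method. The Slack-style tool name and
# its arguments are invented for illustration.

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "slack_post_message",
        "arguments": {"channel": "#general", "text": "Build finished"},
    },
}

wire = json.dumps(request)  # written to the server's stdin, one message per line
print(json.loads(wire)["method"])  # tools/call
```

Because the framing is ordinary JSON-RPC, a gateway can log, authorize, or route these calls with the same machinery it already applies to LLM requests.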

OPERATIONAL EFFICIENCY AND THE PRODUCTION AI ERA

As AI adoption moves from proof-of-concept to production, the demand for operational efficiency tools is growing. This parallels the evolution of the cloud era, where DevOps practices and tools like Datadog and Cloudflare were essential for building applications efficiently. AI Gateways and similar platforms are poised to become critical infrastructure, helping teams maintain reliability, efficiency, and cost-effectiveness in production AI deployments.

STANDARDIZATION CHALLENGES IN RAPIDLY EVOLVING APIS

Standardization efforts, such as those within OpenTelemetry (OTel), are being explored for AI observability. However, the rapid evolution of LLM APIs, which can change monthly, presents a significant challenge to establishing stable, universally adopted standards. While gateways can produce logs in an OTel-compliant format, the lack of coherence and stability in underlying APIs makes it difficult for standards to keep pace, particularly for newer paradigms like agentic reasoning.
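
For a sense of what "OTel-compliant" telemetry for an LLM call looks like, a gateway can attach generative-AI attributes to a span. The attribute names below follow the OpenTelemetry GenAI semantic conventions, which are still experimental and may change, underscoring the stability problem described above.

```python
# Example span attributes for one LLM call, using names from the
# (experimental) OpenTelemetry GenAI semantic conventions. Values are
# invented; the keys themselves are subject to revision upstream.

llm_span_attributes = {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.usage.input_tokens": 120,
    "gen_ai.usage.output_tokens": 256,
}

# A dashboard can aggregate cost and latency per model from these fields.
total_tokens = (llm_span_attributes["gen_ai.usage.input_tokens"]
                + llm_span_attributes["gen_ai.usage.output_tokens"])
print(total_tokens)  # 376
```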

OBSERVABILITY FOR AGENTS AND MULTI-LLM WORKFLOWS

Observing agentic behavior, especially when agents fail, is more complex than monitoring simple LLM calls due to error multiplication across multiple steps. Effective observability requires not only tracing but also intuitive UIs to visualize agent flows. Furthermore, managing multi-LLM agents where different parts use diverse LLMs and services to complete tasks requires seamless integration, allowing agents to dynamically decide and utilize available resources facilitated by the gateway.

THE ROLE OF PROMPT MANAGEMENT AND GOVERNANCE

Beyond core functionalities like routing and observability, AI Gateways are incorporating features for prompt management and governance. This includes managing prompt templates directly within the gateway, simplifying updates and consistency. Governance aspects like audit logs and cost attribution are also becoming integral, allowing for internal chargebacks and better oversight, ensuring that costs are distributed fairly among different teams utilizing the AI resources.
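
Centralized prompt templates and cost attribution can be combined in one small dispatch layer: templates live in a single registry (so updates happen in one place) and every request carries a team tag for internal chargeback. All names and costs below are invented.

```python
from string import Template

# Sketch of gateway-side prompt management plus cost attribution:
# versioned templates in a central registry, and per-team spend
# accumulated on each call for internal chargebacks.

PROMPTS = {
    "summarize:v2": Template("Summarize the text in $n bullet points:\n$text"),
}

costs_by_team: dict[str, float] = {}

def run(prompt_id: str, team: str, cost_usd: float, **fields) -> str:
    costs_by_team[team] = costs_by_team.get(team, 0.0) + cost_usd
    return PROMPTS[prompt_id].substitute(**fields)

prompt = run("summarize:v2", team="growth", cost_usd=0.002, n=3, text="...")
print(costs_by_team)  # {'growth': 0.002}
```

Updating `summarize:v2` (or adding a `v3`) then changes behavior for every caller at once, without touching application code.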

Common Questions

What is an AI Gateway?

An AI Gateway is an operational platform that helps teams connect to Large Language Models (LLMs) more efficiently. It improves cost, performance, and accuracy by managing connections, routing traffic, and providing services like monitoring, guardrails, and governance, reducing the need for individual API connections.
