AI Dev 25 | Bilge Yücel: Building and Deploying Agentic Workflows with Haystack

DeepLearning.AI

Mar 27, 2025 | 3,113 views | 5 min read | 30 min video

TL;DR

Build and deploy AI agents with Haystack: modular components, tool integration, and production-ready deployment via Hayhooks and Deepset Studio.

Key Insights

1. AI agents require more than just LLMs; they need robust engineering for planning, tool integration, and workflow structuring.

2. Haystack is an open-source framework for building AI agents using modular components that can be connected into directed acyclic graphs (DAGs) called pipelines.

3. Custom components in Haystack are Python classes that can be easily integrated, providing flexibility for specific use cases and tools.

4. Haystack's agent component supports tool calling, enabling agents to interact with external services such as the GitHub API to perform actions.

5. Hayhooks deploys Haystack pipelines as REST APIs, simplifying production readiness and offering OpenAI-compatible endpoints.

6. Deepset Studio provides a visual, drag-and-drop environment for building, testing, and deploying Haystack pipelines, with options to export code.

UNDERSTANDING AI AGENTS AND HAYSTACK

AI agents are autonomous, LLM-based systems capable of planning actions to achieve a goal, potentially accessing memory and tools. Building such agents goes beyond simply using a large language model; it involves significant engineering to structure workflows and integrate functionalities. Haystack, an open-source LLM orchestration framework by Deepset, provides the necessary tools for Python developers to construct these real-world agentic AI systems. It utilizes a component-based architecture where individual components perform single functions and can be connected to form pipelines.

HAYSTACK'S COMPONENT AND PIPELINE ARCHITECTURE

The core of Haystack lies in its components and pipelines. Components are discrete, single-function units, such as creating embeddings, generating text, or retrieving data. Pipelines are directed acyclic graphs (DAGs) formed by connecting these components, offering flexibility in controlling data flow with branches and loops, which is crucial for agentic behavior. This modularity allows for easy swapping of components, like replacing one retriever with another, to experiment with different approaches or models. Custom components can also be created as simple Python classes with a decorator and a `run` method, easily pluggable into any pipeline.

DESIGNING A GITHUB ISSUE RESOLVER AGENT

To illustrate agent development, a GitHub issue resolver agent was proposed. This agent takes a GitHub issue URL, reads its comments, and generates instructions on how to solve the issue by understanding the codebase. The envisioned pipeline involves several components: one to fetch the issue and comments, another to format this information into a prompt for the LLM, and finally, an agent component. This agent component utilizes tool-calling capabilities, specifically with Anthropic's Claude 3.5 Sonnet model, and is equipped with tools for viewing the GitHub repository and writing comments back to the issue.

DEVELOPING CUSTOM COMPONENTS AND TOOLS

For the GitHub issue resolver agent, custom components were developed. The `IssueViewer` component retrieves issue descriptions and comments from a given URL, returning them as a list of documents. The `RepoViewer` component, designed to be used as an agent tool, recursively navigates a GitHub repository path, returning directory contents or file content as documents. The `IssueCommentor` component is responsible for posting the agent's generated comments back to the GitHub issue via its API. These components are Python classes decorated for Haystack integration and feature a `run` method for execution.

INTEGRATING AGENTS WITH TOOL-CALLING CAPABILITIES

Creating the agent component involves defining its "brain": a chat generator, specified here as an Anthropic chat generator using the Claude 3.5 Sonnet model. Components are turned into tools with `ComponentTool`, which automatically generates the necessary metadata, such as descriptions and parameter schemas, from the component's docstrings and signatures. The agent is then configured with a system prompt providing detailed instructions (e.g., to format comments in Markdown) and the list of tools. An `exit_conditions` parameter defines when the agent should stop, such as immediately after using the comment-writing tool.

BUILDING AND TESTING THE HAYSTACK PIPELINE

The defined components are assembled into a Haystack pipeline, a directed graph where edges represent the flow of data between nodes. The `IssueViewer`'s output connects to the `IssueBuilder`, whose prompt output connects to the agent. Testing the pipeline involves running it with input data, such as a GitHub issue URL, and the execution shows real-time rendering of component runs. In a live demo, the agent successfully analyzed a GitHub issue within the Haystack repository, identified the relevant file for a code change, and proposed two approaches for resolution, demonstrating its ability to navigate and understand the codebase.

DEPLOYING AGENTS WITH HAYHOOKS

Transitioning an agent from prototype to production requires deployment. Hayhooks is the Haystack tool for serving pipelines as REST APIs. It simplifies wrapping pipelines with custom logic and exposing them via HTTP endpoints, including OpenAI-compatible chat completion endpoints, which is useful for integrating with chat UIs. To deploy, a pipeline wrapper acts as the handler between the pipeline and the API endpoint. Running `hayhooks run` starts a server, and `hayhooks pipeline push` then deploys the pipeline. The deployment exposes Swagger API documentation, allowing interaction with the agent, such as submitting an issue URL and receiving proposed solutions as comments, all over web requests.

VISUAL DEVELOPMENT WITH DEEPSET STUDIO

For a more visual development experience, Deepset Studio offers a free, drag-and-drop environment for building pipelines with the open-source Haystack framework. Users can visually construct complex pipelines, test them within the platform, and collect feedback via interactive elements like thumbs up/down. Once satisfied, pipelines can be deployed directly on the platform or exported as YAML or Python code for deployment elsewhere, for example with Hayhooks. This approach streamlines development, allowing creators to focus on logic rather than extensive coding, and offers flexibility for integration into various deployment scenarios.

SCALABILITY AND CHALLENGES IN LARGE CODEBASES

A key question regarding agentic workflows, especially in large codebases, concerns how the LLM navigates and understands vast amounts of code. While agents can perform breadth-first searches, this can be time-consuming for deeply nested files. Potential solutions involve modularizing the codebase and instructing the LLM to focus on specific repository sections based on issue classification, suggesting that intelligent routing and selective code analysis are crucial for managing complexity and improving efficiency in large-scale projects. This remains a potential bottleneck for agents operating on extensive code repositories.

Building and Deploying Agents with Haystack

Practical takeaways from this episode

Do This

Define components with a single function and a `@component` decorator.
Connect components to form directed acyclic graphs (pipelines) for data flow control.
Convert Python components into tools for agents using the `ComponentTool` class.
Utilize Hayhooks to deploy Haystack pipelines as REST APIs.
Leverage Deepset Studio for a visual, drag-and-drop pipeline development experience.
Export developed pipelines from Deepset Studio as YAML or Python code.
Use prompt engineering and system messages to guide agent behavior.
Define exit conditions for agents to control their stopping behavior.

Avoid This

Do not expect agents to be purely LLM-based; significant engineering is required.
Do not overcomplicate the initial pipeline; start simple and iterate.
Do not rely solely on the LLM's understanding for complex codebases; consider modularity.
Do not forget to set up API keys for services like Anthropic and GitHub.

Common Questions

Q: What is an AI agent?

A: An agent is an LLM-based autonomous system capable of planning and executing actions to achieve a goal. It might have access to memory or tools to carry out its tasks.
