State-Of-The-Art Prompting For AI Agents

Y Combinator · Science & Technology · May 30, 2025 · 4 min read (32 min video)

Key Moments

TL;DR

Prompt engineering is crucial for AI agents: success hinges on detailed, structured instructions, metaprompting, and the founder's deep understanding of users.

Key Insights

1. Prompt engineering has evolved from a workaround to a critical component of effective AI agent interaction.

2. Detailed, structured prompts, including role assignment, task breakdown, and output formatting, are essential for AI agent performance.

3. Metaprompting, where prompts dynamically generate or improve themselves, is a powerful technique for enhancing AI capabilities.

4. The 'forward-deployed engineer' model, emphasizing deep user understanding and rapid software iteration, is key for founders in AI.

5. Evals (evaluations) are considered the true intellectual property, providing the context and data necessary for prompt improvement and AI success.

6. Different LLMs exhibit distinct 'personalities' and require varying levels of guidance, which affects how prompts should be constructed.

THE EVOLVING ROLE OF PROMPT ENGINEERING

Prompt engineering, initially seen as a temporary fix for interacting with large language models (LLMs), has become a fundamental aspect of AI development. It's likened to early-stage software development in 1995, where tools were nascent, and to managing human personnel, requiring clear communication for optimal decision-making. This evolution highlights the growing sophistication and necessity of precisely guiding AI systems.

DECONSTRUCTING ADVANCED PROMPTS

Effective prompts for AI agents are highly detailed and structured. They typically begin with defining the LLM's role, followed by a clear breakdown of the task, and a high-level plan with step-by-step instructions. Crucial elements include specifying desired output formats for integration with other systems and providing 'important things to keep in mind' to prevent deviation. The use of markdown-like formatting and XML-style tags further aids LLM comprehension and adherence to instructions, resembling programming more than natural language writing.
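The structure described above can be sketched as a small prompt builder. This is a minimal illustration, not a template from the talk: the role, task, plan, and tag names are invented for the example.

```python
# Hypothetical sketch of a structured agent prompt: role, task breakdown,
# high-level plan, output format, and reminders, assembled with
# markdown-style headers and XML-style tags for easier LLM parsing.

def build_agent_prompt(task: str, steps: list[str]) -> str:
    plan = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))
    return f"""# Role
You are a customer-support agent for an online retailer.

# Task
{task}

# Plan
{plan}

# Output format
Respond with XML so a downstream parser can read it:
<response>
  <answer>...</answer>
  <confidence>high | medium | low</confidence>
</response>

# Important things to keep in mind
- Never invent order details; ask for the order ID if it is missing.
- Stay within the steps of the plan above."""

prompt = build_agent_prompt(
    "Resolve the customer's refund request.",
    ["Identify the order", "Check refund eligibility", "Draft a reply"],
)
```

The point is that the result reads more like a program than natural prose: fixed sections, explicit structure, and a machine-readable output contract.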

THE POWER OF METAPROMPTING AND DYNAMIC GENERATION

Metaprompting, where an AI can generate or refine its own prompts, is a significant advancement. Techniques like 'prompt folding' allow a prompt to dynamically create specialized versions based on previous queries or user input. This self-improvement mechanism enables prompts to become more robust and tailored, especially when dealing with complex tasks or when manual prompt writing becomes inefficient, effectively turning prompt creation into an iterative, AI-assisted process.
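One way to picture metaprompting and 'prompt folding' is below. This is a hedged sketch: `call_llm` is a placeholder for whatever model client you use, and the wording of the metaprompt is invented for illustration.

```python
# Sketch of metaprompting: wrap an existing prompt, plus a failing input,
# in a request for an expert-prompt-engineer rewrite.

def make_metaprompt(current_prompt: str, failure_example: str) -> str:
    return f"""You are an expert prompt engineer.

Below is a prompt and an input on which it produced a bad result.
Rewrite the prompt so it handles cases like this, keeping its structure.

<prompt>
{current_prompt}
</prompt>

<failing_input>
{failure_example}
</failing_input>

Return only the improved prompt."""

def fold_prompt(current_prompt: str, failure_example: str, call_llm) -> str:
    # 'Prompt folding': the model's output becomes the next prompt version,
    # so prompt creation turns into an iterative, AI-assisted loop.
    return call_llm(make_metaprompt(current_prompt, failure_example))
```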

THE 'FORWARD-DEPLOYED ENGINEER' PARADIGM

The concept of a 'forward-deployed engineer' (FDE), originating from Palantir, is highly relevant for AI startup founders. FDEs embed themselves with users, deeply understanding their workflows and challenges in order to rapidly build and iterate on software solutions. This hands-on, empathetic approach, focusing on user needs and delivering tangible value quickly through demos, allows founders to outmaneuver larger, more established companies and to build the deep domain expertise that becomes their defensible moat.

THE CRITICAL IMPORTANCE OF EVALUATIONS (EVALS)

While prompts are important, evaluations (evals) are identified as the true crown jewels of AI companies. Evals provide the necessary context and objective measures to understand why a prompt is written a certain way and how to improve it. They are derived from deep, qualitative understanding of user needs and reward functions, often requiring founders to be intimately familiar with niche domains. This deep understanding, codified into evals, is what truly differentiates successful AI products and creates lasting competitive advantage.

NAVIGATING LLM PERSONALITIES AND RELIABILITY

Different LLMs exhibit distinct 'personalities,' impacting how they respond to prompts and rubrics. Some models are more rigid and strictly adhere to instructions, while others are more flexible and can reason through exceptions, similar to human employees. A critical aspect of ensuring reliability is building 'escape hatches' into prompts, instructing the AI to ask for clarification rather than hallucinate when information is insufficient. This requires careful prompt design to manage AI behavior and prevent undesirable outputs.
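An 'escape hatch' can be as simple as a clause appended to every prompt, plus a check for it in the application. The clause wording and JSON shape below are illustrative assumptions, not taken from the talk.

```python
import json

# Sketch of an 'escape hatch': the model is told to ask for clarification
# instead of guessing, and the app detects that response to route it
# (e.g. back to the user or to a human).

ESCAPE_HATCH = """
If you do not have enough information to answer, do NOT guess.
Instead respond with exactly:
{"status": "need_clarification", "question": "<what you need to know>"}
"""

def with_escape_hatch(prompt: str) -> str:
    return prompt.rstrip() + "\n" + ESCAPE_HATCH

def needs_clarification(reply: str) -> bool:
    """Detect the escape-hatch response so the caller can handle it."""
    try:
        data = json.loads(reply)
    except ValueError:
        return False
    return isinstance(data, dict) and data.get("status") == "need_clarification"
```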

STRUCTURING AI INTERACTIONS WITH SYSTEM, DEVELOPER, AND USER PROMPTS

An emerging architectural pattern involves structuring prompts into three layers: system, developer, and user prompts. The system prompt defines the core API and company operations, while the developer prompt adds specific customer context and nuances. The user prompt, if applicable, captures direct end-user requests. This layered approach helps in creating scalable, general-purpose AI products while allowing for necessary customization without turning into a bespoke consulting service for each client.
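The three-layer pattern maps naturally onto a chat-style message list. The role names follow the common system/developer/user convention; the contents are invented for illustration.

```python
# Sketch of the layered pattern: one company-wide system prompt, a
# per-customer developer prompt, and the end user's request.

SYSTEM_PROMPT = "You are the scheduling agent for Acme Inc."  # core product behavior

def build_messages(developer_context: str, user_request: str) -> list[dict]:
    return [
        {"role": "system", "content": SYSTEM_PROMPT},        # shared across all customers
        {"role": "developer", "content": developer_context},  # customer-specific nuances
        {"role": "user", "content": user_request},            # direct end-user request
    ]

msgs = build_messages(
    "This clinic closes at 5pm and never books Fridays.",
    "Can I get an appointment Friday afternoon?",
)
```

Only the developer layer changes per customer, which is what keeps the product general-purpose rather than bespoke.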

THE ROLE OF CONTINUOUS IMPROVEMENT AND DEBUGGING

Continuous improvement, akin to the Kaizen principle, is vital in prompt engineering. Users can leverage LLMs themselves to critique and refine existing prompts. This involves feeding a prompt into the LLM and asking it to suggest improvements or identify weaknesses. Furthermore, detailed debugging information, such as thinking traces and error reports within the LLM's output, is invaluable for developers to pinpoint issues and iteratively enhance prompt effectiveness, mirroring software development's test-driven approach.
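Both loops above can be sketched in a few lines: a critique request for an existing prompt, and a response format that reserves a debug field. The field name and wording are assumptions for illustration.

```python
# Sketch 1: ask an LLM to critique and improve an existing prompt.
def critique_request(prompt: str) -> str:
    return f"""Review the prompt below as an expert prompt engineer.
List its weaknesses (ambiguity, missing edge cases, unclear output
format), then propose a revised version.

<prompt>
{prompt}
</prompt>"""

# Sketch 2: a response format with a reserved debug field, so the model
# reports what confused it, mirroring software's test-driven feedback loop.
RESPONSE_FORMAT = """Respond as JSON:
{
  "answer": "...",
  "debug_info": "anything underspecified or confusing about this task"
}"""
```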

OPTIMIZING FOR PRODUCTION AND SCALABILITY

For production environments, especially where latency is critical, a common strategy involves using larger, more capable models for prompt refinement and then distilling those refined prompts into smaller, faster models. This process ensures high performance and acceptable response times, crucial for user experience in applications like voice AI. Additionally, using LLMs to automatically extract and ingest examples from customer data can streamline the process of customizing prompts for various clients.
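The tiering strategy can be sketched as two model slots: a large model for offline prompt refinement, a small one for live traffic. Model names and the `call_llm` client are placeholders, not real identifiers.

```python
# Sketch of distillation-style tiering: refine prompts offline with a
# capable (slow) model, serve with a fast (cheap) one.

REFINEMENT_MODEL = "big-model-v1"   # slow, capable: iterates on prompts offline
SERVING_MODEL = "small-model-v1"    # fast, cheap: runs the refined prompt live

def refine_offline(prompt: str, call_llm) -> str:
    """One refinement pass with the large model (placeholder client)."""
    return call_llm(REFINEMENT_MODEL, f"Improve this prompt:\n{prompt}")

def serve(refined_prompt: str, user_input: str, call_llm) -> str:
    """Answer live traffic with the small model and the refined prompt."""
    return call_llm(SERVING_MODEL, f"{refined_prompt}\n\nInput: {user_input}")
```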

THE CHALLENGE OF GENERALIZATION VS. SPECIALIZATION

A significant challenge for vertical AI agents is balancing flexibility for special-purpose logic with the need to avoid becoming a consulting firm. The concept of 'forking and merging' prompts across customers addresses this by defining which parts of a prompt are company-wide standards versus customer-specific. This allows for scalable solutions that can adapt to diverse customer workflows and preferences without requiring entirely new prompt development for each unique client.
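'Forking and merging' can be modeled as a shared base prompt with per-customer overrides, merged section by section. The section names and contents below are invented for the example.

```python
# Sketch of fork-and-merge: company-wide sections are the base; each
# customer overrides only the sections that differ for them.

BASE_SECTIONS = {
    "role": "You are a support agent for our product.",
    "policy": "Follow the standard refund policy.",
    "format": "Reply in plain text.",
}

def merge_prompt(base: dict, customer_overrides: dict) -> str:
    sections = {**base, **customer_overrides}  # customer-specific sections win
    return "\n\n".join(f"## {name}\n{text}" for name, text in sections.items())

# A customer forks only the policy section; role and format stay shared.
acme = merge_prompt(BASE_SECTIONS, {"policy": "Acme allows 90-day refunds."})
```

Changes to the base propagate to every customer, while each fork stays small, which is what keeps this scalable rather than consulting-shaped.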

Prompt Engineering Best Practices

Practical takeaways from this episode

Do This

Define clear roles for the LLM.
Break down tasks and outline high-level plans.
Use markdown-style formatting for structure.
Provide examples to help LLMs reason about complex tasks.
Consider using XML tags for better LLM parsing.
Implement an escape hatch for the LLM if information is insufficient.
Use 'debug info' in the response format for developer feedback.
For metaprompting, give the LLM the role of an expert prompt engineer.
Leverage LLM thinking traces (like Gemini's) for debugging prompts.
Focus on user needs and codify them into specific evals.
Founders should act as forward-deployed engineers, embedding with users.
Use well-refined prompts from larger models to improve smaller, faster models (distillation).
Provide LLMs with clear rubrics for scoring, but acknowledge exceptions.
Understand the distinct personalities and steerability of different models (e.g., Claude vs. Llama).

Avoid This

Don't allow LLMs to hallucinate when information is lacking; instruct them to ask for clarification.
Don't rely solely on sales tactics; demonstrate working software quickly.
Don't treat prompts as the sole 'crown jewel'; evals are crucial for understanding and improvement.
Avoid becoming a consulting company by building generic yet flexible prompts.
Don't assume LLMs will strictly adhere to rubrics; some offer more flexibility (e.g., Gemini 2.5 Pro).

Common Questions

What is metaprompting, and why is it powerful?

Metaprompting involves using an AI model to refine or generate prompts. It's powerful because it can dynamically create better versions of prompts, handle task complexity with examples, and improve output quality, much like a skilled prompt engineer.

