How does GStack's 'Office Hours' feature help improve startup ideas?

Office Hours is a GStack skill modeled after Y Combinator's partner office hours. It asks key reframing questions, analyzes market demand, identifies pain points, and helps refine a business model, working as a conversation with the AI that improves the initial idea.

Can GStack help with downloading tax documents like 1099s?

Yes, the video demonstrates how GStack can be used to build a tax app that automates the process of finding and downloading 1099 forms from your Gmail inbox using AI browser automation.

What is the difference between Claude Opus 4.6 and Codex in GStack?

Opus 4.6 is described as the 'ADHD CEO' – full of ideas but potentially unfocused. Codex is the 'autistic CTO' – more precise and better for executing difficult tasks under guidance. Together, they form a more capable AI engineering team.

How does GStack handle code review and bug catching?

GStack includes a 'review' skill that acts as a staff-level bug catching service, putting the work through paces to find potential issues. Additionally, it integrates browser automation tools like Playwright for automated QA and testing.

What are the benefits of using AI agents for software development?

AI agents, when structured into a team with roles and processes like GStack, can significantly accelerate software development. They allow for rapid iteration, design generation, and code production, drastically reducing the time and effort required compared to traditional methods.

How does GStack manage multiple projects and contributions?

GStack allows users to run multiple AI coding sessions in parallel across different projects or features. This enables simultaneous development of PRs and branches, and facilitates efficient evaluation of community contributions and bug fixes.

What is the 'ship tool' in GStack?

The ship tool is the final step in the GStack workflow before merging code. It ensures that a pull request is ready to be landed on the main branch, acting as a last check for readiness.

How to Make Claude Code Your AI Engineering Team

Y Combinator

Science & Technology | 7 min read | 22 min video

Apr 23, 2026 | 19,843 views | 1,030 | 89

TL;DR

An open-source tool called GStack leverages AI agents like Claude Code to automate software engineering tasks, significantly accelerating development but requiring careful management of AI 'skills' for effective team collaboration.

Key Insights

Garry Tan coded more in the past two months than in all of 2013, highlighting the accelerated development pace with AI coding tools.

GStack has achieved over 70,000 GitHub stars, surpassing established tools like Ruby on Rails, indicating strong community adoption.

The 'Office Hours' skill in GStack is a distilled version of Y Combinator's partner sessions, designed to pressure-test startup ideas with six forcing questions before coding begins.

GStack's 'design shotgun' feature, which uses OpenAI image generation, can produce multiple visual design options for a UI element in about 60 seconds.

The `/qa` and `/browse` tools, built around Playwright and Chromium, automate browser interactions, testing, and QA, addressing a bottleneck Garry experienced with AI-driven development.

Garry Tan currently manages 10-15 parallel Claude Code sessions and has approximately 400 open PRs to review, illustrating the scale of parallel development enabled by GStack.

The agent era demands a team-based approach to AI development

Garry Tan, president and CEO of Y Combinator, introduces GStack, an open-source toolkit that harnesses AI agents, specifically Claude Code, by treating them as a cohesive engineering team. Tan argues that building software in the 'agent era' requires a structure that mirrors human teamwork, with defined roles, processes, and review mechanisms. He notes that he has written more code in the past two months than in all of 2013, which illustrates how much AI has increased development velocity. GStack's core philosophy is 'thin harness, fat skills': a lightweight framework that manages specialized AI capabilities. The project has gained traction quickly, accumulating over 70,000 GitHub stars, more than Ruby on Rails, which underscores the interest in this AI-driven development paradigm.

Office hours: pressure-testing ideas like a YC partner session

One of GStack's foundational skills, 'Office Hours,' is modeled directly after the rigorous sessions YC partners conduct with startups. This skill is designed to critically evaluate a product idea before any code is written, employing six forcing questions to reframe the concept and identify potential weaknesses. Tan demonstrates this by simulating an office hours session for a tax app idea designed to extract 1099s from Gmail. The AI, in 'Gary mode,' reveals its reasoning process, showing how it searches for context, identifies pain points, and probes the viability of the idea. A key question it poses is, 'What's the strongest evidence you have that someone actually wants this?' This deep dive aims to uncover the true user need and business model potential, moving beyond a superficial solution to a more robust and scalable offering. The process simulates the personalized, adversarial feedback that founders receive at YC, helping to refine the idea and identify its core value proposition.

Navigating the 'adversarial review' and conceptualizing business models

The simulated office hours session prompts a deep dive into the startup idea, revealing its potential as a wedge strategy. While the initial idea focuses on aggregating tax documents, the AI points towards a more lucrative expansion: matchmaking and lead generation for tax preparers. This represents a classic business model evolution, moving from a low-margin service ($2-5 per year) to a potentially 10x higher revenue stream based on transaction percentages. The AI challenges the premise, questioning why existing solutions like TurboTax or Plaid aren't sufficient, and pushes for a more comprehensive understanding of the user's needs. The interaction highlights how AI can act as a critical thinking partner, uncovering hidden opportunities and potential pitfalls. Tan even notes that the process is so engaging and insightful that he might build the app for himself simply for the learning experience, emphasizing that GStack fosters a conversational and exploratory approach to development.

Browser automation as an unexpected solution

During the office hours session, a critical consideration emerges: how to access and download tax documents. While initial thoughts might involve complex integrations like Plaid, the AI proposes a more unconventional but effective approach using browser automation, integrated into GStack's 'browser' skill. This method involves the user logging into their account, after which the AI takes over, navigating to tax document portals, locating the 1099s, and downloading the PDFs. A key advantage is that the process occurs within the user's visible browser, not in the cloud, enhancing transparency and control. This solution bypasses the need for stored credentials or deeper API integrations, relying instead on simulating user actions. Tan expresses surprise and admiration for this approach, noting that such creative solutions might not have been considered even a few months prior, underscoring the evolving capabilities of AI in software development.
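
The flow described above can be sketched with Playwright's Python API. This is a minimal illustration under stated assumptions, not GStack's actual browser skill: the portal URL, the link-matching heuristic, and the names `looks_like_1099` and `download_1099s` are all hypothetical.

```python
import re
from pathlib import Path

# Hypothetical heuristic: a link is a candidate when "1099" appears
# as a standalone number (not inside a longer digit run).
FORM_PATTERN = re.compile(r"(?<!\d)1099(?!\d)")


def looks_like_1099(link_text: str, href: str) -> bool:
    """Return True when a link's text or target mentions a 1099 form."""
    return bool(FORM_PATTERN.search(link_text) or FORM_PATTERN.search(href))


def download_1099s(portal_url: str, out_dir: Path) -> list[Path]:
    """Open a visible browser, wait for the user to log in, then save matching files."""
    # Imported lazily so the pure helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    out_dir.mkdir(parents=True, exist_ok=True)
    saved: list[Path] = []
    with sync_playwright() as p:
        # headless=False keeps the window on the user's screen, not in the cloud.
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto(portal_url)
        # The user logs in by hand, so no credentials are stored by the script.
        input("Log in inside the browser window, then press Enter here... ")
        for link in page.locator("a").all():
            text = link.inner_text()
            href = link.get_attribute("href") or ""
            if not looks_like_1099(text, href):
                continue
            # Assumes clicking the link triggers a file download.
            with page.expect_download() as download_info:
                link.click()
            download = download_info.value
            target = out_dir / download.suggested_filename
            download.save_as(target)
            saved.append(target)
        browser.close()
    return saved
```

A real portal would likely need site-specific navigation before the links appear; the sketch only shows the shape of "user logs in, automation takes over" in a visible browser.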

Iterative design with 'design shotgun' and AI-generated previews

After the idea is refined in office hours, GStack offers tools for rapid design visualization. Tan demonstrates 'design shotgun,' a feature that uses OpenAI's Codex and image generation capabilities to produce multiple visual mockups of a user interface in approximately 60 seconds. For the tax app, it generates three design directions: a 'command center' version, a friendlier 'progress-oriented' view, and a 'split view' option. Tan reviews and scores the options, then selects 'Option B' for its user-friendly, card-based approach with progress indicators. Iterating quickly on visual design with AI-generated feedback shortens the design phase and lets developers lock in a direction before committing to extensive coding.

Automating QA and the role of Codex as the 'autistic CTO'

Tan distinguishes between the creative, idea-generating side of AI, which he likens to an 'ADHD CEO' (Claude Opus 4.6), and the meticulous, detail-oriented execution that coding requires. For the latter he advocates Codex, which he describes as the 'autistic CTO', essential for debugging and ensuring code quality. GStack reflects this with a 'review' skill that performs staff-level bug catching after code generation. Tan also built the `/qa` and `/browse` tools, which wrap Playwright and Chromium. They let AI agents perform complex browser interactions, take screenshots, run regression tests, and find real-world browser issues, mirroring human QA. This automation matters because Tan found himself spending excessive time on manual QA, which he calls the 'least fun part of software development' once AI handles most of the coding.
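
One simple way to picture the screenshot-based regression checks mentioned above is a baseline comparison. The sketch below is an assumption for illustration only and does not reflect how GStack's tools are implemented; the function name `check_screenshot` and the `.sha256` baseline files are hypothetical.

```python
import hashlib
from pathlib import Path


def check_screenshot(name: str, png_bytes: bytes, baseline_dir: Path) -> str:
    """Compare a screenshot against its stored baseline.

    Returns "new" when no baseline existed (and records one),
    "match" when the bytes are identical, and "changed" otherwise.
    """
    baseline_dir.mkdir(parents=True, exist_ok=True)
    baseline = baseline_dir / f"{name}.sha256"
    digest = hashlib.sha256(png_bytes).hexdigest()
    if not baseline.exists():
        # First run: store the digest so later runs have something to compare to.
        baseline.write_text(digest)
        return "new"
    return "match" if baseline.read_text().strip() == digest else "changed"
```

In Playwright's Python API, `page.screenshot()` returns the PNG as bytes, which could be passed in as `png_bytes`. Exact-byte comparison is brittle because small rendering differences change the hash, so a practical setup would more likely compare pixels with a tolerance.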

Scaling parallel development and managing a high volume of PRs

GStack is designed to support a highly parallel development workflow, which lets users like Tan manage a large volume of work. Tan says he runs 10 to 15 Claude Code sessions in parallel, often across multiple open-source projects. The result is a large backlog of open pull requests (PRs), currently around 400, which he reviews in waves. GStack's tools, including office hours, adversarial review, and automated QA, streamline evaluating and integrating community contributions. He emphasizes GStack's role in mitigating risk, particularly from supply chain attacks, by providing a layer of security and integrated review. The workflow shifts development from sequential to highly parallel, and Tan says he can handle 10, 15, 20, or even 50 PRs per day, depending on his meeting schedule.
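
The two ideas in this section, a cap on concurrent sessions and a backlog reviewed in waves, can be sketched in a few lines. This is a hedged illustration, not how GStack or Conductor schedule work: `waves`, `run_sessions`, and the placeholder "session" body are assumptions, and the wave size is arbitrary.

```python
import asyncio


def waves(items: list[int], size: int) -> list[list[int]]:
    """Split a backlog into consecutive review waves of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def run_sessions(tasks: list[str], max_parallel: int = 15) -> list[str]:
    """Run one stand-in session per task, never more than `max_parallel` at once."""
    gate = asyncio.Semaphore(max_parallel)

    async def session(task: str) -> str:
        async with gate:
            await asyncio.sleep(0)  # placeholder for real agent work
            return f"done: {task}"

    # gather preserves input order in its results.
    return list(await asyncio.gather(*(session(t) for t in tasks)))
```

For example, `waves(list(range(1, 401)), 50)` splits a 400-PR backlog into eight waves of 50, matching the upper end of the daily throughput Tan mentions.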

The future of software development: collapsing barriers

Garry Tan concludes by asserting that we are witnessing the most incredible time in history to build software, with the barriers to entry drastically reduced by AI tools like GStack. The core challenge is no longer the technical difficulty but rather identifying 'what to build.' He encourages developers to leverage these advancements, stating, 'It's time to let it rip. Go make something people want.' The GStack toolkit, available on GitHub, offers a practical gateway into this new era, providing AI-powered capabilities that mimic the rigorous product thinking and development processes found at Y Combinator, democratizing access to sophisticated engineering support.

GStack Workflow for Building with AI Agents

Practical takeaways from this episode

Do This

Utilize GStack's 'Office Hours' skill for initial product reframing and idea validation.

Leverage 'Gary Mode' to visualize AI reasoning traces and understand the model's thought process.

Employ 'Adversarial Review' to test and refine your design documents for potential issues.

Use 'Design Shotgun' for AI-generated visual design concepts.

Consider 'Auto Plan' for a streamlined process with default recommendations.

Integrate browser automation tools like Playwright for complex interactions and QA.

Monitor AI coding activity via Conductor, running multiple sessions in parallel.

Focus on supply chain security when evaluating community contributions.

Avoid This

Don't rely solely on the AI model's initial output without validation or review.

Avoid building without a clear understanding of user needs and market demand.

Do not assume AI-generated code is perfect; always use review and testing pipelines.

Don't neglect the importance of privacy, failure handling, and security measures.

Avoid manual QA if possible; automate it using tools like GStack's browser automation.

Common Questions

What is GStack?

GStack is an open-source tool that transforms AI coding models like Claude Code into an AI engineering team. It uses specialized 'skills' and a 'thin harness, fat skills' approach to manage AI agents, enabling them to perform complex software development tasks.

Topics

AI Tools, AI & Machine Learning, Technology & Innovation, Programming & Software, Code Generation, Developer Productivity, AI Coding Assistants, Agent-based Development, Software Engineering Workflows, AI-powered Startups

Mentioned in this video

Software & Apps

GStack

GStack is an open-source repository Garry built to function as an AI engineering team, using specialized skills and following a 'thin harness, fat skills' approach. It has gained significant traction on GitHub.

Codex

Codex is described as the 'autistic CTO' that complements Opus 4.6, suitable for when complex tasks require focused execution.

Plaid

Plaid connects to banks and is mentioned in the context of why existing solutions aren't fully solving the 1099 aggregation problem.

Ruby on Rails

GStack has more GitHub stars than Ruby on Rails, highlighting the rapid growth and interest in agent-based software development.

Conductor

Conductor is the platform through which GStack is accessed, and it provides features like quick start and Gary mode for visualizing AI reasoning.

Gmail

The tax app demo uses Gmail to find 1099 forms and other tax documents.

Playwright

Playwright is a tool wrapped by GStack at the CLI level to enable browser automation, including screenshots, complex interactions, and downloading media.

Chromium

Chromium is mentioned as the browser that GStack's CLI tool, built around Playwright, interacts with.

Opus 4.6

Opus 4.6 is described as the default Claude model, characterized as an 'ADHD CEO' with many ideas but needing the 'autistic CTO' (Codex) for when the going gets tough.

Bookface

Garry built the first version of Bookface, which is Y Combinator's internal social platform and knowledge base.

Products

TurboTax

TurboTax is mentioned as a service that has 1099 import features, but it's implied that these don't fully solve the user's problem.

Companies

Posterous

Garry co-founded Posterous, a microblogging platform that was sold to Twitter. He also mentions that building it with AI agents took a fraction of the time the original manual build did.

GitHub

GStack has accumulated more GitHub stars than Ruby on Rails, indicating its popularity and adoption.

Twitter

Posterous, a microblogging platform co-founded by Garry, was sold to Twitter.

Y Combinator

Garry, the speaker, is the president and CEO of Y Combinator; he built Bookface, the organization's internal social platform and knowledge base.

H&R Block

H&R Block is mentioned as a service that has 1099 import features, but it's implied that these don't fully solve the user's problem.

OpenAI

OpenAI's Codex is used by GStack for image generation tasks during the design phase.

People

Andrej Karpathy

Andrej Karpathy was mentioned as someone who noted that they were no longer manually writing code.