AI Dev 26 x SF | Manos Koukoumidis & Stefan Webb: VibeML: Build your AI model in hours, not months

DeepLearning.AI
Education | 6 min read | 26 min video
May 22, 2026 | 249 views | 4

TL;DR

Enterprises are shifting from renting generic AI to owning specialized models, but building them previously took months. VibeML aims to automate this, enabling engineers to create custom AI in hours.

Key Insights

1. Enterprises are moving from consuming generic AI APIs (like OpenAI, Anthropic, Google Gemini) to building and owning their specialized AI models, a shift that began gaining traction in late 2025.

2. Specialized AI models offer dramatically higher quality, lower costs (potentially 10-100x smaller and more efficient), improved privacy, and greater control compared to generic models.

3. Building custom AI models has historically been prohibitively difficult, requiring months of effort and deep AI expertise for each use case, leading to a desire for a 'model factory'.

4. VibeML is presented as a 'model factory' that enables engineers to build specialized AI models automatically from a prompt, with the process taking hours instead of months.

5. A healthcare provider using VibeML technology saw a 20% quality improvement and a 70% cost reduction by building a custom model for extracting information from medical records.

6. A study utilizing VibeML technology to evaluate Google AI overviews found that only 39% of claims were fully supported by the provided sources, highlighting a significant hallucination rate.

The enterprise AI landscape is rapidly evolving

The predominant trend in enterprise AI is a shift from renting generic intelligence to owning specialized intelligence. After the initial wave of generative AI in late 2022, enterprises largely relied on consuming APIs from major providers like OpenAI, Anthropic, and Google. Starting in late 2025, however, enterprises began building proprietary, specialized models at a faster pace than before. Several benefits drive this move: quality tailored to specific use cases, lower operational cost and latency, stronger privacy and security through control over data and deployment, full control over the AI roadmap, and the ability to build market differentiation and a durable competitive advantage.

Why specialized models outperform generic ones

Generic AI models, built to perform reasonably well across a wide array of tasks, are inherently optimized for nothing specific. This means they are not ideal for niche enterprise use cases. In contrast, a specialized model is fine-tuned for a particular application, leading to superior performance. For instance, Intercom reported building a model better than GPT-3 for code generation that was five times cheaper to operate. Similarly, Kensho announced a coding agent model superior to GPT-3 and ten times cheaper to run. These specialized models can be 10 to 100 times smaller and more efficient than their generic counterparts, resulting in substantial cost savings and performance gains. Furthermore, owning a custom model provides enterprises with control over their data and infrastructure, preventing reliance on third-party roadmaps and terms of service.

The historical challenges of building custom AI

Despite the clear advantages, building specialized AI models has been a significant hurdle for most enterprises. The primary reasons are the immense effort and time required. Organizations have reported that developing a single specialized model for one use case could take months, a process that would need to be repeated for every new model or update. This pace is unsustainable, especially when coupled with the challenge of maintaining and improving models in production. Many companies also admit to lacking sufficient in-house AI expertise or the personnel to dedicate to such resource-intensive projects. This created a need for a solution that could abstract away the complexity and accelerate the development cycle.

VibeML: Automating specialized AI development

VibeML is presented as the solution to these challenges, functioning as an automated 'model factory'. It aims to enable any engineer, regardless of deep AI expertise, to build specialized AI models quickly, ideally within hours rather than months. The platform guides users through a structured process starting with defining the task, much like prompting a general AI, but with the goal of building a custom AI model. For example, a prompt like 'Build a model to summarize key news items in news articles in bullet point format' initiates the process. VibeML then devises a plan for building this model, which includes defining evaluators (metrics for success), synthesizing test data, evaluating baseline models, creating training data, fine-tuning the model, and then re-evaluating. The platform handles much of this complexity, allowing users to focus on defining their needs.
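
The plan described above can be pictured as an ordered list of stages. The sketch below is purely illustrative: `PlanStep`, `build_plan`, and the stage names are hypothetical and are not VibeML's actual API. It only mirrors the order of steps given in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    """One stage of a hypothetical model-building plan."""
    name: str
    description: str


def build_plan(task_prompt: str) -> list[PlanStep]:
    """Return the stages in the order the summary lists them.

    Hypothetical helper: a real system would tailor the plan to the prompt.
    Here the prompt is only echoed into the first step's description.
    """
    return [
        PlanStep("define_evaluators", f"Choose success metrics for: {task_prompt}"),
        PlanStep("synthesize_test_data", "Generate held-out examples for evaluation"),
        PlanStep("evaluate_baselines", "Score candidate base models on the test data"),
        PlanStep("create_training_data", "Generate or import examples to train on"),
        PlanStep("fine_tune", "Adapt the chosen base model to the task"),
        PlanStep("re_evaluate", "Score the tuned model with the same evaluators"),
    ]


plan = build_plan(
    "Build a model to summarize key news items in news articles in bullet point format"
)
for number, step in enumerate(plan, start=1):
    print(f"{number}. {step.name}: {step.description}")
```

Keeping the evaluators fixed between the baseline and the re-evaluation stage is what makes the before/after scores comparable.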

The VibeML workflow and agent-driven process

The VibeML platform uses an agentic approach to guide users through model development. After an initial task definition, the agent proposes a plan detailing steps like setting up evaluators, synthesizing data, and fine-tuning. For instance, when building the news summarizer, the agent suggested evaluators like completeness, conciseness, and format adherence, and the user could add more, such as 'faithfulness' to prevent hallucinations. The platform can also synthesize training data if custom data isn't provided. It suggests baseline models and evaluates them, providing scores across the defined metrics. A key feature is the 'review failure modes' tool, which analyzes where the model performed poorly and allows for the generation of targeted training data to improve those specific areas. The process then moves to fine-tuning, offering options like LoRA (Low-Rank Adaptation) for parameter-efficient tuning. Finally, the platform quantifies the improvements post-fine-tuning, allowing for iterative cycles of improvement until desired metrics are met. The ultimate goal is to create a deployable model that offers higher quality and lower cost than generic alternatives.
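
To make the LoRA option mentioned above more concrete, here is a minimal pure-Python sketch of the standard low-rank update, W + (alpha / r) * B A, together with the trainable-parameter count it implies. The matrix sizes and helper names are assumptions chosen for illustration. They are not taken from the talk and say nothing about how VibeML configures fine-tuning.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)  # B is (d_out, r), A is (r, d_in)
    return full, lora


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha: float):
    """Compute W + (alpha / r) * B @ A, where r is the adapter rank."""
    rank = len(a)  # A has shape (r, d_in)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Illustrative sizes only (not from the talk).
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora, full // lora)  # 16777216 65536 256

w = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 base weight
b = [[1.0], [2.0]]            # trainable, shape (2, 1)
a = [[0.5, 0.5]]              # trainable, shape (1, 2)
w_eff = lora_effective_weight(w, b, a, alpha=1.0)
print(w_eff)  # [[1.5, 0.5], [1.0, 2.0]]
```

Only `a` and `b` would be updated during training. The base weight `w` stays frozen, which is why the approach is called parameter-efficient.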

Real-world impact and enterprise adoption

Companies are already using VibeML technology to achieve measurable results. A leading healthcare provider used it to build a custom agent for extracting information from medical records, achieving a 20% quality improvement and a 70% cost reduction compared to general-purpose AI models, which suffered from high latency, high cost, and low accuracy. In another instance, The New York Times used VibeML's technology in a study evaluating the faithfulness of Google AI Overviews. The resulting published article reported that only 39% of claims made in Gemini 3 overviews were fully supported by their cited sources, a substantial hallucination rate that VibeML's technology helped identify at scale. Other enterprises, including Microsoft, HP, and IBM, are also adopting VibeML.
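
A figure like "39% of claims fully supported" is a simple proportion over per-claim judgments. The sketch below shows that arithmetic under stated assumptions: the three-way label scheme and the `support_rates` helper are hypothetical, and the example labels are made up. None of it is data or methodology from the study itself.

```python
from collections import Counter

# Hypothetical label set; the study's actual labeling scheme is not described here.
LABELS = ("fully_supported", "partially_supported", "unsupported")


def support_rates(claim_labels: list[str]) -> dict[str, float]:
    """Return the fraction of claims carrying each label."""
    if not claim_labels:
        raise ValueError("need at least one labeled claim")
    counts = Counter(claim_labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(claim_labels)
    return {label: counts[label] / total for label in LABELS}


# Made-up labels for illustration only.
example = ["fully_supported"] * 2 + ["partially_supported"] * 2 + ["unsupported"]
rates = support_rates(example)
print(rates)
```

The hard part in practice is producing the per-claim labels reliably at scale. The aggregation itself is trivial.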

VibeML's impact on differentiation and IP

The ability to rapidly build and iterate on specialized AI models is crucial for enterprise differentiation. By owning proprietary models that are continuously improved in production, companies can build compounding intellectual property (IP) and establish a competitive moat that generic models cannot match. This proactive ownership of intelligence, rather than reactive prompting of rented services, is positioned as the key differentiator for future enterprise success. VibeML supports this by allowing engineers to create models that are not only better and cheaper for specific tasks but also evolve over time, outpacing competitors and generic AI advancements. This control and continuous improvement loop are vital for long-term market leadership.

Accessibility through open-source and enterprise platforms

VibeML offers both an enterprise platform and an open-source library, making its technology accessible to a broader audience. The open-source library, available on GitHub, has over 9,000 stars and active community contributions. The enterprise platform, launched recently, has drawn approximately 2,000 signups. This dual approach lets organizations explore automated AI development with the library and then scale to the enterprise platform. One example from the talk: a fine-tuned customer support model with only 0.8 billion parameters outperformed Anthropic's Opus, Sonnet, and Haiku models while running 100x faster and at lower cost, illustrating how this technology can produce highly efficient, specialized AI applications.

Building Custom AI Models with Oumi: Dos and Don'ts

Practical takeaways from this episode

Do This

Define your enterprise's specific problem and desired AI model use case.
Import your own data and domain expertise to build specialized models.
Choose to deploy your custom model on-premise, on-device, or in the cloud.
Monitor model performance in production and use insights for continuous improvement.
Leverage automated processes like those in Oumi to build models efficiently.
Consider metrics like faithfulness, conciseness, and completeness when evaluating models.
Utilize failure mode analysis to synthesize targeted training data.
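
The evaluation and failure-mode items in the list above can be sketched as a grouping step over per-example scores. This is a hedged illustration, not the platform's 'review failure modes' tool: the `review_failure_modes` function, the 0.5 threshold, and the sample scores are all hypothetical.

```python
from collections import defaultdict


def review_failure_modes(results, threshold=0.5):
    """Group failing example ids by evaluator.

    `results` is a list of dicts with an 'example_id' and a 'scores' mapping
    of evaluator name to a score in [0, 1]. Scores below `threshold` count
    as failures for that evaluator.
    """
    failures = defaultdict(list)
    for result in results:
        for evaluator, score in result["scores"].items():
            if score < threshold:
                failures[evaluator].append(result["example_id"])
    return dict(failures)


# Made-up evaluation results for illustration only.
results = [
    {"example_id": "a1", "scores": {"completeness": 0.9, "conciseness": 0.4, "faithfulness": 0.3}},
    {"example_id": "a2", "scores": {"completeness": 0.8, "conciseness": 0.7, "faithfulness": 0.2}},
    {"example_id": "a3", "scores": {"completeness": 0.3, "conciseness": 0.9, "faithfulness": 0.9}},
]

failures = review_failure_modes(results)
# The evaluator with the most failing examples is the first candidate
# for targeted training data.
worst = max(failures, key=lambda name: len(failures[name]))
print(worst, failures[worst])
```

The failing example ids for the weakest evaluator are the natural seeds for synthesizing more training examples of the same kind.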

Avoid This

Continue relying solely on generic, rented AI APIs for critical business functions.
Assume generic models are optimized for your specific production use cases.
Overlook the importance of data privacy and security when using third-party AI.
Be beholden to the roadmaps and terms of service of large AI providers.
Underestimate the value of ownership and differentiation through custom AI models.
Ignore the potential for significantly lower costs and latency with specialized models.
Underestimate the time and expertise required without an automated solution.

Comparison of Custom vs. Generic AI Performance

Data extracted from this episode

Metric | Generic AI (Example) | Custom AI (Oumi Example)
Quality/Accuracy | Lower (optimized for nothing) | Higher (optimized for an enterprise use case)
Cost | Higher | Lower (10x-100x more efficient)
Latency | Higher | Lower
Development Time | Months per model | Hours (with Oumi)
Differentiation/Moat | None (competitors can use same tools) | Compound IP and competitive advantage

Healthcare Provider AI Improvement with Oumi

Metric | Performance Change
Quality | +20%
Cost | -70%

AI Model Comparison for Supporting Customer Queries

Model | Accuracy | Parameters | Speed/Cost
Oumi Fine-tuned Model | Higher than Opus, Sonnet, Haiku | 0.8 Billion | 2 orders of magnitude faster and cheaper
Opus (Anthropic) | Baseline for comparison | N/A | N/A
Sonnet (Anthropic) | Baseline for comparison | N/A | N/A
Haiku (Anthropic) | Baseline for comparison | N/A | N/A

Model Performance in Calculating Hallucinations

Model | Performance
Custom AI Model (Oumi Technology) | Outperforms GPT-5.2 and Opus for this specific task
GPT-5.2 | Outperformed by custom AI
Opus | Outperformed by custom AI

Google AI Overviews Faithfulness Study (Gemini 3)

Metric | Result
Claims fully supported by sources | 39%

Common Questions

Enterprises are moving towards owning specialized AI models to gain higher quality, lower costs, improved latency, better privacy and security, full control over their tools, and to build unique competitive advantages that generic, rented models cannot offer.
