What were the main challenges enterprises faced when trying to build their own specialized AI models?

The primary challenges were the significant effort and time required (often months for a single model), the prohibitive cost and effort of keeping models updated, and a lack of internal expertise or personnel to build and maintain custom models.

What is Umei and how does it solve the challenges of building custom AI?

Umei is described as a 'model factory' that automates and streamlines the AI development lifecycle. It allows engineers to import data, define tasks, synthesize data, fine-tune models, and deploy them efficiently, reducing the time from months to hours and making custom AI accessible to more engineers.

How does the Umei platform guide the model development process?

Umei uses an agentic approach that devises a plan, starting with defining evaluation metrics (like completeness, conciseness, and faithfulness). It then synthesizes data, evaluates baseline models, creates training data, fine-tunes the model, and re-evaluates to measure improvements.

What are some real-world examples of companies using Umei technology?

A leading healthcare provider improved quality by 20% and reduced costs by 70% by building a custom model for medical records. The New York Times used Umei's technology to study and publish findings on hallucinations in Google AI overviews.

What is the benefit of using a fine-tuned Umei model for customer support?

A fine-tuned Umei model for customer support achieved higher accuracy than leading models like Anthropic's Opus, Sonnet, and Haiku, using a significantly smaller parameter count (0.8 billion). This makes it much faster and cheaper to run, suitable for local or on-device deployment.

What was the surprising finding from the study on Google AI overviews using Umei technology?

The study revealed that for Gemini 3, only 39% of the claims made in AI overviews were fully supported by the cited sources, indicating a significant potential for misinformation or unsupported statements.

What is the key takeaway regarding future enterprise success in AI?

The winners of the next era will be enterprises that own their specialized AI intelligence, continuously improve it in production, and build compounding IP, rather than those that merely consume APIs from generic model providers.

Key Moments

AI Dev 26 x SF | Manos Koukoumidis & Stefan Webb: VibeML: Build your AI model in hours, not months

DeepLearning.AI

Education | 6 min read | 26 min video

May 22, 2026 | 249 views

TL;DR

Enterprises are shifting from renting generic AI to owning specialized models, but building them previously took months. VibeML aims to automate this, enabling engineers to create custom AI in hours.

Key Insights

Enterprises are moving from consuming generic AI APIs (like OpenAI, Anthropic, Google Gemini) to building and owning their specialized AI models, a shift that began gaining traction in late 2025.

Specialized AI models offer dramatically higher quality, lower costs (the models can be 10-100x smaller and more efficient), improved privacy, and greater control compared to generic models.

Building custom AI models has historically been prohibitively difficult, requiring months of effort and deep AI expertise for each use case, leading to a desire for a 'model factory'.

VibeML is presented as a 'model factory' that enables engineers to build specialized AI models automatically from a prompt, with the process taking hours instead of months.

A healthcare provider using VibeML technology saw a 20% quality improvement and a 70% cost reduction by building a custom model for extracting information from medical records.

A study utilizing VibeML technology to evaluate Google AI overviews found that only 39% of claims were fully supported by the provided sources, highlighting a significant hallucination rate.

The enterprise AI landscape is rapidly evolving

The predominant trend in enterprise AI is a shift from renting generic intelligence to owning specialized intelligence. After the initial wave of generative AI in late 2022, enterprises largely consumed APIs from major providers such as OpenAI, Anthropic, and Google. Starting in late 2025, however, enterprises began building proprietary, specialized models at an accelerating pace. The move is driven by several benefits: higher quality tailored to specific use cases, much lower operating costs and latency, stronger privacy and security through control over where data and models are deployed, full control over the AI roadmap, and durable competitive differentiation.

Why specialized models outperform generic ones

Generic AI models, built to perform reasonably well across a wide array of tasks, are inherently optimized for nothing specific. This means they are not ideal for niche enterprise use cases. In contrast, a specialized model is fine-tuned for a particular application, leading to superior performance. For instance, Intercom reported building a model better than GPT-3 for code generation that was five times cheaper to operate. Similarly, Kensho announced a coding agent model superior to GPT-3 and ten times cheaper to run. These specialized models can be 10 to 100 times smaller and more efficient than their generic counterparts, resulting in substantial cost savings and performance gains. Furthermore, owning a custom model provides enterprises with control over their data and infrastructure, preventing reliance on third-party roadmaps and terms of service.

The historical challenges of building custom AI

Despite the clear advantages, building specialized AI models has been a significant hurdle for most enterprises. The primary reasons are the immense effort and time required. Organizations have reported that developing a single specialized model for one use case could take months, a process that would need to be repeated for every new model or update. This pace is unsustainable, especially when coupled with the challenge of maintaining and improving models in production. Many companies also admit to lacking sufficient in-house AI expertise or the personnel to dedicate to such resource-intensive projects. This created a need for a solution that could abstract away the complexity and accelerate the development cycle.

VibeML: Automating specialized AI development

VibeML is presented as the solution to these challenges, functioning as an automated 'model factory'. It aims to enable any engineer, regardless of deep AI expertise, to build specialized AI models quickly, ideally within hours rather than months. The platform guides users through a structured process starting with defining the task, much like prompting a general AI, but with the goal of building a custom AI model. For example, a prompt like 'Build a model to summarize key news items in news articles in bullet point format' initiates the process. VibeML then devises a plan for building this model, which includes defining evaluators (metrics for success), synthesizing test data, evaluating baseline models, creating training data, fine-tuning the model, and then re-evaluating. The platform handles much of this complexity, allowing users to focus on defining their needs.
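The "define evaluators, then score a baseline" steps can be illustrated with a minimal sketch. It assumes two rule-based evaluators for the bullet-point news summarizer; the function names, the 60-word budget, and the sample outputs are hypothetical and are not taken from the platform. Metrics such as completeness and faithfulness would typically need a judge model and are left out here.

```python
from statistics import mean


def format_adherence(summary: str) -> float:
    """Fraction of non-empty lines that are written as bullet points."""
    lines = [ln.strip() for ln in summary.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(ln.startswith(("- ", "* ")) for ln in lines) / len(lines)


def conciseness(summary: str, max_words: int = 60) -> float:
    """1.0 within the word budget, decaying linearly to 0.0 at twice the budget."""
    n_words = len(summary.split())
    if n_words <= max_words:
        return 1.0
    return max(0.0, 1.0 - (n_words - max_words) / max_words)


# Hypothetical evaluator registry: metric name -> scoring function.
EVALUATORS = {
    "format_adherence": format_adherence,
    "conciseness": conciseness,
}


def evaluate(outputs: list[str]) -> dict[str, float]:
    """Average each metric over a list of model outputs."""
    return {name: mean(fn(o) for o in outputs) for name, fn in EVALUATORS.items()}


# Illustrative baseline outputs (made up for this sketch).
baseline_outputs = [
    "- Council approves budget\n- Vote was 7 to 2",
    "The council met on Tuesday.\n- Budget approved",
]
scores = evaluate(baseline_outputs)
print(scores)
```

Running the same `evaluate` call on the outputs of the fine-tuned model gives a like-for-like comparison against the baseline scores.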

The VibeML workflow and agent-driven process

The VibeML platform uses an agentic approach to guide users through model development. After an initial task definition, the agent proposes a plan detailing steps like setting up evaluators, synthesizing data, and fine-tuning. For instance, when building the news summarizer, the agent suggested evaluators like completeness, conciseness, and format adherence, and the user could add more, such as 'faithfulness' to prevent hallucinations. The platform can also synthesize training data if custom data isn't provided. It suggests baseline models and evaluates them, providing scores across the defined metrics. A key feature is the 'review failure modes' tool, which analyzes where the model performed poorly and allows for the generation of targeted training data to improve those specific areas. The process then moves to fine-tuning, offering options like LoRA (Low-Rank Adaptation) for parameter-efficient tuning. Finally, the platform quantifies the improvements post-fine-tuning, allowing for iterative cycles of improvement until desired metrics are met. The ultimate goal is to create a deployable model that offers higher quality and lower cost than generic alternatives.
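To see why LoRA counts as parameter-efficient, it helps to count trainable parameters. LoRA freezes a weight matrix of shape d_out x d_in and trains two low-rank factors, B (d_out x r) and A (r x d_in), so the trainable count per adapted matrix is r * (d_out + d_in). The sketch below is plain arithmetic; the layer size and rank are illustrative assumptions, not values from the talk.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when fine-tuning the whole weight matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters updated by LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


# Hypothetical layer: a 1024 x 1024 projection adapted with rank 8.
d_out, d_in, rank = 1024, 1024, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"reduction: {full // lora}x")
```

A higher rank gives the adapter more capacity at the price of more trainable parameters, so the rank is usually treated as a tuning knob.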

Real-world impact and enterprise adoption

Companies are already using VibeML technology to achieve significant results. A leading healthcare provider used it to build a custom agent for extracting information from medical records, achieving a 20% quality improvement and a 70% cost reduction compared to general-purpose AI models, which suffered from high latency, high cost, and low accuracy. In another instance, The New York Times used VibeML's technology to conduct a study evaluating the faithfulness of Google AI overviews. The resulting article highlighted a surprising finding: only 39% of claims made in Gemini 3 overviews were fully supported by their cited sources, indicating a substantial hallucination rate that VibeML's technology helped identify at scale. Other leading enterprises like Microsoft, HP, and IBM are also adopting VibeML.
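A "fully supported" rate like the one in that study is, at its core, a fraction over claim-level judgments. The sketch below shows only that final aggregation step, assuming each claim has already been labeled against its cited source. The label names and the toy labels are hypothetical and are not the study's data.

```python
from collections import Counter


def fully_supported_rate(labels: list[str]) -> float:
    """Share of claims labeled 'supported' among all judged claims."""
    if not labels:
        raise ValueError("need at least one judged claim")
    counts = Counter(labels)
    return counts["supported"] / len(labels)


# Toy judgments for five claims (illustrative only).
toy_labels = ["supported", "partial", "supported", "unsupported", "supported"]
rate = fully_supported_rate(toy_labels)
print(f"{rate:.0%} of claims fully supported")
```

Counting only "supported" as a pass makes the metric strict: partially supported claims count against the model, which matches the "fully supported" wording above.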

VibeML's impact on differentiation and IP

The ability to rapidly build and iterate on specialized AI models is crucial for enterprise differentiation. By owning proprietary models that are continuously improved in production, companies can build compounding intellectual property (IP) and establish a competitive moat that generic models cannot match. This proactive ownership of intelligence, rather than reactive prompting of rented services, is positioned as the key differentiator for future enterprise success. VibeML supports this by allowing engineers to create models that are not only better and cheaper for specific tasks but also evolve over time, outpacing competitors and generic AI advancements. This control and continuous improvement loop are vital for long-term market leadership.

Accessibility through open-source and enterprise platforms

VibeML offers both an enterprise platform and an open-source library, making its technology accessible to a broader audience. The open-source library, available on GitHub, has over 9,000 stars and active community contributions. The recently launched enterprise platform has already drawn approximately 2,000 signups. This dual approach lets organizations explore automated AI development with the library and then scale up to the enterprise platform. One example illustrates the practical payoff: a fine-tuned customer support model with only 0.8 billion parameters outperformed top-tier LLMs from Anthropic (Opus, Sonnet, Haiku) while running about 100x faster and at lower cost.
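A rough calculation shows why a 0.8 billion parameter model is a plausible fit for local or on-device deployment. The sketch below estimates the memory needed for the weights alone at common numeric precisions; it ignores activations, the KV cache, and runtime overhead, and the precision choices are assumptions rather than details from the talk.

```python
def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 800_000_000  # 0.8 billion parameters, as cited for the support model

# Bytes per parameter at common precisions.
PRECISIONS = {"fp32": 4.0, "fp16": 2.0, "int4": 0.5}

footprint = {name: weight_memory_gb(NUM_PARAMS, b) for name, b in PRECISIONS.items()}
for name, gb in footprint.items():
    print(f"{name}: ~{gb:.1f} GB of weights")
```

At 16-bit precision the weights come to roughly 1.6 GB, which is within the memory of many laptops and recent phones.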

Building Custom AI Models with Umei: Dos and Don'ts

Practical takeaways from this episode

Do This

Define your enterprise's specific problem and desired AI model use case.

Import your own data and domain expertise to build specialized models.

Choose to deploy your custom model on-premise, on-device, or in the cloud.

Monitor model performance in production and use insights for continuous improvement.

Leverage automated processes like those in Umei to build models efficiently.

Consider metrics like faithfulness, conciseness, and completeness when evaluating models.

Utilize failure mode analysis to synthesize targeted training data.
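As a rough illustration of the last item, failure mode analysis can start by grouping evaluated examples by the metric on which they fell short, then targeting new training data at the largest group. This is a minimal sketch under stated assumptions: the record layout, the 0.7 threshold, and the scores are hypothetical and do not reflect how the platform implements its review tool.

```python
from collections import defaultdict


def failure_modes(results: list[dict], threshold: float = 0.7) -> dict[str, list[str]]:
    """Map each metric to the ids of examples that scored below the threshold."""
    modes = defaultdict(list)
    for record in results:
        for metric, score in record["scores"].items():
            if score < threshold:
                modes[metric].append(record["id"])
    return dict(modes)


# Illustrative evaluation records (made up for this sketch).
results = [
    {"id": "ex1", "scores": {"faithfulness": 0.9, "conciseness": 0.4}},
    {"id": "ex2", "scores": {"faithfulness": 0.5, "conciseness": 0.8}},
    {"id": "ex3", "scores": {"faithfulness": 0.6, "conciseness": 0.95}},
]

modes = failure_modes(results)
# The metric with the most failing examples is the first candidate
# for targeted training data.
worst_metric = max(modes, key=lambda m: len(modes[m]))
print(modes)
print("prioritize:", worst_metric)
```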

Avoid This

Continue relying solely on generic, rented AI APIs for critical business functions.

Assume generic models are optimized for your specific production use cases.

Overlook the importance of data privacy and security when using third-party AI.

Be beholden to the roadmaps and terms of service of large AI providers.

Underestimate the value of ownership and differentiation through custom AI models.

Ignore the potential for significantly lower costs and latency with specialized models.

Underestimate the time and expertise required without an automated solution.