Universal-3 Pro Technical Overview
Key Moments
Universal-3 Pro enables prompt-driven, customized transcripts with improved accuracy.
Key Insights
Prompting shapes transcript output, affecting style, formatting, context clues, entity accuracy, speaker attribution, and audio-event tagging.
Universal-3 Pro outperforms the prior model (Universal-2) even before prompts are applied, producing cleaner corrections and a more faithful rendering of meaning.
Verbatim vs. standard prompts produce markedly different results, adding hesitations and fillers when requested.
API support lets you specify a prompt alongside the Universal-3 Pro model to tailor transcripts to specific use cases.
Practical demonstrations show real-world differences on a sample file, illustrating how prompts influence accuracy and readability.
INTRODUCTION AND CONTEXT
Today’s overview starts with Ryan from AssemblyAI introducing Universal-3 Pro, the latest model in the company’s speech-to-text lineup. The model is notable because it accepts a text prompt alongside an audio file, enabling customized transcripts tuned to a user’s particular use case and customers. Alongside the announcement, AssemblyAI points to a prompt engineering guide that explains how prompts can shape output, including style, formatting, context cues, and speaker or event tagging. The demonstration uses a GitLab SEC growth data science staff meeting as the sample, paired with a comparison app that pits Universal-2 against Universal-3 Pro before any prompting.
NEW FEATURES AND CAPABILITIES OF UNIVERSAL-3 PRO
Universal-3 Pro extends the transcription task beyond raw words to produce outputs tailored by user prompts. The model supports a range of capabilities described in the guide: increasing or reducing disfluencies, altering style and formatting, adding context-aware clues, improving entity accuracy, and providing speaker attribution and audio-event tags. The guide also mentions code-switching, enabling cross-linguistic or domain-specific phrasing. The team emphasizes that many capabilities are still being documented and discovered, suggesting a living feature set. The core idea is to give customers control over the transcript’s tone, structure, and annotations to fit their workflows.
PROMPTING CAPABILITIES EXPLORED
To illustrate how prompting changes results, the team runs side-by-side comparisons using a single file. They show Universal-2 on the left and Universal-3 Pro on the right, initially without prompts to establish a baseline. Early observations highlight corrections to broken words, capitalization of proper nouns, and a clearer rendering of meaning, such as rephrasing a question about arrival time. The takeaway is that Universal-3 Pro produces more accurate, readable transcripts from the very first pass, even before any customized instructions are applied.
OUT-OF-THE-BOX PERFORMANCE VS PROMPTED
With the baseline established, the team tests a simple prompt and compares it to no prompt. The differences are subtle but noticeable: the prompt can improve word ordering, punctuation, and the handling of ambiguous phrases. They also point to the possibility of using more verbose prompts to drive specific behavior. The demonstration suggests that even lightweight prompts yield tangible gains in readability and correctness, indicating that prompt design matters as much as model choice.
VERBATIM PROMPTS AND DETAIL EXTRACTION
Next, they turn to a verbatim-style prompt to capture hesitations and fillers. By selecting a verbatim option, transcripts show more ums and hesitations, with the model displaying the stumbles in real time. The visualized transcript confirms that a single prompt choice can reframe how the audio is transcribed, shifting from a cleaner, summarized rendering to a detailed, verbatim record. This capability is particularly relevant for meeting minutes, legal depositions, and research notes, where exact phrasing and pauses matter.
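The cleaned-versus-verbatim difference can also be inspected programmatically. A minimal sketch, using illustrative transcript snippets (not actual Universal-3 Pro output), that diffs the two renderings and counts filler tokens:

```python
import difflib

# Illustrative snippets only; not actual model output.
standard = "So when do you think the report will be ready?"
verbatim = "So, um, when do you, uh, think the report will be ready?"

FILLERS = {"um", "uh", "er", "hmm"}

def filler_count(text: str) -> int:
    """Count filler tokens in a transcript string."""
    tokens = [t.strip(".,?!").lower() for t in text.split()]
    return sum(1 for t in tokens if t in FILLERS)

# A word-level diff shows exactly which hesitations the
# verbatim prompt preserved.
diff = list(difflib.ndiff(standard.split(), verbatim.split()))
added = [d[2:] for d in diff if d.startswith("+ ")]

print(filler_count(standard))   # 0
print(filler_count(verbatim))   # 2
print(added)
```

A check like this could help a team decide which prompt style best fits a given downstream task, for example flagging transcripts whose filler density suggests the wrong prompt was used.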
PROMPT STRATEGIES AND BEST PRACTICES
Beyond simple versus verbatim prompts, the team discusses best practices for prompt design. They warn that prompts influence context, emphasis, and even whether certain words are treated as disfluencies or essential terms. They point to the prompt engineering guide as a resource and encourage teams to experiment with different prompt styles to align transcripts with their downstream tasks. The underlying message is that prompting is a design lever that can dramatically alter the usefulness of the output.
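As a concrete starting point for that experimentation, a team might keep a small library of prompt styles to test against the same audio. The prompt texts below are illustrative, written from the behaviors described in this episode; they are not taken from AssemblyAI's guide:

```python
# Illustrative prompt styles for A/B testing transcript behavior.
# Wording is hypothetical; consult AssemblyAI's prompt engineering
# guide for recommended phrasing.
PROMPT_STYLES = {
    "clean": (
        "Produce a clean, readable transcript. Remove filler words "
        "and false starts; use standard punctuation and capitalization."
    ),
    "verbatim": (
        "Transcribe verbatim. Keep every um, uh, hesitation, "
        "repetition, and false start exactly as spoken."
    ),
    "domain": (
        "This is a data science staff meeting. Expect terms like "
        "MLOps, APAC, and GitLab; capitalize them correctly."
    ),
}

def get_prompt(style: str) -> str:
    """Look up a prompt style, failing loudly on unknown names."""
    if style not in PROMPT_STYLES:
        raise KeyError(f"Unknown prompt style: {style!r}")
    return PROMPT_STYLES[style]
```

Keeping prompts in one place like this makes it easy to rerun the same file under each style and compare the outputs side by side, as the demo does.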
USING THE API: PROMPT AND MODEL SELECTION
Operationalizing prompts happens through the API: the speech model parameter selects Universal-3 Pro, and the prompt parameter injects the instructions. The quick start guide is recommended for getting teams started, and the team invites users to experiment with prompts to understand the tradeoffs. This section makes clear that users can build custom transcription workflows, without training bespoke models, by combining model choice with prompt design.
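A minimal sketch of that workflow against AssemblyAI's transcript endpoint. The `/v2/transcript` endpoint and `speech_model` field exist in AssemblyAI's API, but the model identifier `"universal_3_pro"` and the `"prompt"` field name used below are assumptions for illustration; verify both against the quick start guide:

```python
import json
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"

def build_transcript_request(audio_url, prompt=None):
    """Build the JSON body for a transcript request.

    The model identifier "universal_3_pro" and the "prompt" field
    name are assumptions; check the docs for the current names.
    """
    body = {"audio_url": audio_url, "speech_model": "universal_3_pro"}
    if prompt:
        body["prompt"] = prompt
    return body

def submit(api_key, body):
    """POST the request; the response includes a transcript id to poll."""
    req = urllib.request.Request(
        f"{API_BASE}/transcript",
        data=json.dumps(body).encode(),
        headers={
            "authorization": api_key,
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_transcript_request(
    "https://example.com/staff-meeting.mp3",
    prompt="Transcribe verbatim, keeping hesitations and filler words.",
)
# submit("YOUR_API_KEY", body)  # network call; uncomment with a real key
```

Because the prompt is just another request field, switching between clean, verbatim, or domain-specific output is a one-line change rather than a new integration.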
HANDS-ON RESULTS: A PRACTICAL EXAMPLE
In the practical demonstration, the SEC growth data science meeting is used to show real differences. The speakers discuss acronyms like MLOps and APAC, and the demo notes how capitalization and entity recognition improve with the prompt. They also highlight how the plain transcript initially misinterpreted a phrase, while the prompt-adjusted version restored the intended meaning. The side-by-side comparison reinforces that Universal-3 Pro benefits from both stronger out-of-the-box accuracy and targeted prompting, enabling teams to tailor outputs for compliance, analytics, or customer-facing documentation.
IMPACT ON USE CASES AND QUALITY ATTRIBUTES
For organizations, the combination of prompt-driven control and improved baseline accuracy expands the set of viable use cases. Unknown terms, domain-specific entities, and speaker attribution can be tuned via prompts, while code-switching supports multilingual or cross-domain transcription. The model’s ability to tag audio events and maintain consistency across speakers can streamline downstream NLP tasks, indexing, and search. The example demonstrates improved handling of brand and product names and a more faithful rendering of meaning, helping transcripts meet legal, technical, or customer-service expectations.
FUTURE DIRECTIONS, DOCUMENTATION, AND COMMUNITY FEEDBACK
The presentation closes with a note that the prompt engineering guide continues to evolve and that additional capabilities are under development. AssemblyAI invites users to try Universal-3 Pro via the API, provide feedback, and help shape future documentation. The takeaway is that the platform aims to be a robust, customizable transcription tool adaptable to diverse industries, with transparent documentation, ongoing improvements, and an open channel for user-driven enhancements.
Common Questions
Universal-3 Pro is AssemblyAI's speech-to-text model that supports a text prompt input to customize output. Prompts can affect style, formatting, context, entity accuracy, speaker attribution, and other transcript attributes, enabling tailored results for different use cases.