Prompt Engineering Workshop: Universal-3 Pro

AssemblyAI
Science & Technology · 3 min read · 61 min video
Feb 19, 2026

Key Moments

TL;DR

Universal 3 Pro demo: promptable STT, multilingual, tagging, streaming.

Key Insights

1. Promptable transcription lets you steer transcripts with natural-language prompts, giving more control over style, clarity, and content.

2. Out of the box, Universal 3 Pro surpasses Universal 2 in accuracy; applying prompts further improves handling of context, disfluencies, and meaning.

3. Explicit prompts for disfluencies, hesitations, repetitions, stutters, and colloquialisms significantly shape how the transcript reflects natural speech.

4. Code-switching and language preservation are supported by targeted prompts; six languages are supported natively, plus API access to 99 languages, with Universal 4 on the roadmap.

5. Audio tagging, PII redaction, and diarization can all be driven by prompts; streaming speaker labeling is experimental but promising.

6. Practical workflows include evaluating prompts on your own data, balancing domain-specific vs. generic prompts, and understanding pricing (prompting adds a small per-hour cost).

INTRODUCTION AND CONTEXT

The session opens with introductions from Ryan, who leads AssemblyAI's customer-facing teams, and colleagues Zach and Griffin from the applied AI engineering team. They frame Universal 3 Pro as a promptable speech-to-text model that can customize transcripts via natural-language prompts. A live comparison tool is introduced to pit Universal 2 against Universal 3 Pro, using a GitLab meeting transcription as a baseline and then applying prompts to demonstrate improvements. The group emphasizes hands-on demos, live debugging, and leaving ample time for Q&A, with the event being recorded for later sharing.

BASELINE VS PROMPTING: SETTING THE SCENE

The left side of the demo shows Universal 3 Pro in its baseline form, while the right side introduces prompts to steer transcription. Early focus centers on preserving linguistic patterns, with a key realization: a generic instruction like 'disfluencies' is too vague. Through iterative prompting, they refine the instruction to spell out disfluencies as filler words, hesitations, repetitions, stutters, false starts, and colloquialisms. The result is a clearer, more context-aware transcript that better captures natural speech while remaining flexible enough to be more or less literal depending on needs.
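
The refinement described above can be sketched as plain prompt strings. The wording below paraphrases the iteration shown in the session and is purely illustrative, not the exact prompts used in the demo:

```python
# Illustrative only: these strings paraphrase the prompt iteration described
# in the session, not the demo's actual prompts.

VAGUE_PROMPT = "Transcribe disfluencies."  # too vague, per the session

# The refined version names each speech phenomenon explicitly.
DISFLUENCY_PHENOMENA = [
    "filler words",
    "hesitations",
    "repetitions",
    "stutters",
    "false starts",
    "colloquialisms",
]

def build_disfluency_prompt(phenomena: list[str]) -> str:
    """Compose an explicit prompt from a list of speech phenomena."""
    return (
        "Preserve the speaker's natural speech. Transcribe disfluencies, "
        "including " + ", ".join(phenomena) + "."
    )

REFINED_PROMPT = build_disfluency_prompt(DISFLUENCY_PHENOMENA)
```

The point of the helper is that an explicit enumeration generalizes better than a one-word instruction, which is exactly the lesson the demo draws.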

PROMPTING MECHANICS: HOW TO INSTRUCT THE MODEL

The presenters walk through constructing prompts, starting with mandatory instructions to preserve linguistic patterns and then adding an always-go-with-your-best-guess rule based on context. They show how stronger, authoritative prompts push the model to infer missing words rather than skip uncertain segments. The discussion also covers risks like overfitting prompts to a single file and the value of testing prompts against diverse datasets. Live commentary highlights how the model interprets speech patterns such as ums and uhhs as discourse signals rather than noise.
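
A minimal sketch of how such a prompt might be attached to a transcription request. The endpoint is AssemblyAI's real transcript endpoint, but the `speech_model` value and the `prompt` field name are assumptions inferred from the session, not confirmed parameter names; check the current API reference before relying on them:

```python
import json

# Sketch only: "universal_3_pro" and "prompt" are assumed names based on the
# session; consult AssemblyAI's API reference for the actual parameters.
API_URL = "https://api.assemblyai.com/v2/transcript"

def build_transcript_request(audio_url: str, prompt: str) -> dict:
    """Build the JSON body for a promptable transcription request."""
    return {
        "audio_url": audio_url,
        "speech_model": "universal_3_pro",  # assumed model identifier
        "prompt": prompt,                   # assumed open-field prompt parameter
    }

body = build_transcript_request(
    "https://example.com/meeting-recording.mp3",
    "Preserve linguistic patterns. Always transcribe with your best guess "
    "based on context; never skip uncertain segments.",
)
payload = json.dumps(body)
```

The authoritative phrasing ("always", "never skip") mirrors the session's observation that stronger prompts push the model to infer missing words rather than drop uncertain segments.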

MULTI-LANGUAGE CODE-SWITCHING AND PRESERVING LANGUAGES

A Miami corpus example demonstrates code-switching with Spanglish and the importance of preserving original languages and scripts. The team explains language support: six native languages (English, Spanish, French, German, Italian, Portuguese) plus API access to 99 languages; Universal 4 is in development for broader coverage. A key takeaway is instructing the model to keep code-switching intact rather than translating, which yields transcripts that reflect actual speaker behavior and multilingual contexts more faithfully.
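
The preservation instruction can be expressed as another prompt string; the wording below paraphrases the session's takeaway rather than quoting the demo, and the language sets simply restate the support figures mentioned above:

```python
# Illustrative prompt for code-switched audio, paraphrasing the session.
CODE_SWITCH_PROMPT = (
    "The audio contains code-switching between English and Spanish. "
    "Transcribe each utterance in the language actually spoken, keeping the "
    "original words and scripts. Do not translate mixed-language segments."
)

# Per the session: six native languages, with 99 reachable via the API.
NATIVE_LANGUAGES = {"English", "Spanish", "French", "German", "Italian", "Portuguese"}

def is_native(language: str) -> bool:
    """Check whether a language is one of the six natively supported ones."""
    return language in NATIVE_LANGUAGES
```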

AUDIO TAGGING, PII REDACTION, AND DIARIZATION

The workflow introduces audio tagging for non-speech events (coughs, laughter, noise, silence, unclear portions) as an experimental feature. A critical decision point is choosing between ‘unclear’ and ‘mask’ tagging, which interacts with organizational style guides and potential profanity handling. PII redaction remains available, with prompts guiding privacy considerations. Speaker diarization is described as experimental, with streaming speaker labeling on the roadmap. The session hints at future integration that fuses model-based speaker tags with native diarization for more robust separation across chunks.
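
The 'unclear' vs. 'mask' decision point could be captured in a small config builder. PII redaction is an established AssemblyAI feature, but the `audio_tagging` field and its two modes are assumptions based on the session's description of an experimental feature, not confirmed API parameters:

```python
# Sketch of a request fragment combining PII redaction with the experimental
# audio tagging described in the session. The "audio_tagging" field and its
# "unclear"/"mask" modes are assumed names, not confirmed API parameters.
def build_redaction_config(tag_mode: str = "unclear") -> dict:
    """Return request options for PII redaction plus audio tagging."""
    if tag_mode not in {"unclear", "mask"}:
        raise ValueError("tag_mode must be 'unclear' or 'mask'")
    return {
        "redact_pii": True,
        "redact_pii_policies": ["person_name", "phone_number"],  # example policies
        "audio_tagging": tag_mode,  # assumed field name
    }
```

Which mode to pick depends on your organizational style guide, as the session notes; 'mask' hides the content outright, while 'unclear' flags it for review.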

PRACTICAL USES, EVALUATION, AND PRICING

The panel discusses evaluating prompts on your own data, using domain-specific prompts (medical, legal, finance) or more generic prompts to adapt to unknown contexts. They distinguish between key terms prompting and open-field prompting: the features are mutually exclusive at the parameter level but can be combined by embedding key terms within a broader open prompt. A live note on streaming and pricing clarifies that prompting adds about five cents per hour on top of the base rate. The takeaway is to experiment, document results, and consult documentation for ongoing updates.
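
The per-hour figures quoted in the session (21¢ base, about 5¢ extra for prompting) make cost estimation a one-line calculation:

```python
# Worked cost estimate using the per-hour rates quoted in the session:
# 21¢/hour base for Universal 3 Pro, plus about 5¢/hour when prompting.
BASE_RATE_USD = 0.21
PROMPTING_SURCHARGE_USD = 0.05

def estimate_cost(audio_hours: float, prompted: bool) -> float:
    """Return the estimated transcription cost in USD, rounded to cents."""
    rate = BASE_RATE_USD + (PROMPTING_SURCHARGE_USD if prompted else 0.0)
    return round(audio_hours * rate, 2)

# For 100 hours of audio: 21.0 USD without prompting, 26.0 USD with it.
```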

Prompting Do's and Don'ts for Universal 3 Pro

Practical takeaways from this episode

Do This

Be explicit and authoritative in prompts (e.g., 'always transcribe with your best guess', 'prioritize medications').
Prompt the model to preserve code-switching and original languages; avoid translating mixed-language segments unless needed.
Enable and leverage audio tagging (laughter, coughs, silence) to enrich transcripts.
Consider PII redaction and 'unclear' outputs to handle sensitive data in transcripts.
Iterate prompts with the Prompt Repair Wizard and run small-scale evals before scaling.

Avoid This

Don't overfit prompts to a single file; it risks poor generalization across datasets.
Don’t rely solely on model judgments; validate with human review when necessary.
Avoid vague or soft prompts; use clear, task-relevant commands to improve accuracy.

Pricing: Universal 3 Pro with and without prompting

Data extracted from this episode

Model              Base price per hour   Prompting price per hour
Universal 3 Pro    21¢                   26¢

Common Questions

What is Universal 3 Pro?

Universal 3 Pro is a promptable speech-to-text model that can be customized with natural-language prompts to influence transcription output. The session compared baseline Universal 2 with Universal 3 Pro and demonstrated prompt-based improvements starting around 119 seconds into the video.
