Whisper
machine learning model for speech recognition and transcription
Videos Mentioning Whisper

The AI Agent Economy Is Here
Y Combinator
Speech-to-text model discussed as part of the transcription pipeline (Whisper V1).

OpenClaw Creator: Why 80% Of Apps Will Disappear
Y Combinator
OpenAI Whisper used for audio transcription.

Breaking down the OG GPT Paper by Alec Radford
Latent Space
Mentioned as a topic Amad has written blog posts about.

Personal AI Meetup - Bee, BasedHardware, LangChain LangFriend, Deepgram EmilyAI
Latent Space
An open-source speech-to-text model, mentioned in the context of challenges in getting it to work across different systems.
[Paper Club] Molmo + Pixmo + Whisper 3 Turbo - with Vibhu Sapra, Nathan Lambert, Amgadoz
Latent Space
A state-of-the-art automatic speech recognition (ASR) model from OpenAI, capable of transcription, translation, and speech detection. The discussion covers its architecture and the release of its Turbo version.

Building AGI in Real Time (OpenAI Dev Day 2024)
Latent Space
An AI model used for transcribing audio, specifically mentioned in the context of processing hour-long YouTube videos before multimodal capabilities were available.

Building AGI with OpenAI's Structured Outputs API
Latent Space
OpenAI's speech-to-text model, with discussions covering its API limitations (lack of diarization) and potential future improvements.

Gemini 1.5 and The Biggest Night in AI
AI Explained

Snipd: The AI Podcast App for Learning — with CEO Kevin Ben-Smith
Latent Space
An AI model for speech-to-text that became available after Snipd's initial launch, significantly improving transcription capabilities and enabling features like speaker diarization.

Lessons From 50 Of The Worlds Greatest Minds with Jake Humphrey | E59
The Diary Of A CEO
Here "Whisper" refers to Jake Humphrey's production company rather than the speech recognition model. He co-founded the company, which aims to uplift underrepresented talent in the TV industry.

From skeptic to true believer: How OpenClaw changed my life | Claire Vo
Lenny's Podcast
OpenAI's automatic speech recognition model, implied to be used by Telegram for transcribing voice notes into text for OpenClaw agents.

Mistral: Voxtral TTS, Forge, Leanstral, & Mistral 4 — w/ Pavan Kumar Reddy & Guillaume Lample
Latent Space
An ASR model known for its 30-second processing limit, which served as inspiration for Mistral's longer-form audio processing.

Your Ground Truth Is Wrong: Evaluating STT with truth files & semantic WER | AssemblyAI Workshop
AssemblyAI
An open-source speech-to-text model developed by OpenAI, used as a benchmark comparison.
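
For readers unfamiliar with the metric named in the last entry, the following is a minimal sketch of plain word error rate (WER), the usual basis for benchmark comparisons between speech-to-text models. It is not the "semantic WER" variant from the workshop, and it does not reflect any particular vendor's scoring code. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Normalization here is an illustrative assumption: lowercase and split
    on whitespace. Real evaluations usually also strip punctuation and
    normalize numbers, which changes the score.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    truth = "the cat sat on the mat"
    transcript = "the cat sat on a mat"
    print(f"WER: {word_error_rate(truth, transcript):.3f}")
```

Because the score is computed against a reference transcript, any mistake in that reference is counted as a model error, which is the concern the workshop title raises.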