Whisper

Software / App

Machine learning model for speech recognition and transcription

Mentioned in 13 videos

Videos Mentioning Whisper

The AI Agent Economy Is Here

Y Combinator

Speech-to-text model discussed as part of the transcription pipeline (Whisper V1).

OpenClaw Creator: Why 80% Of Apps Will Disappear

Y Combinator

OpenAI Whisper used for audio transcription.

Breaking down the OG GPT Paper by Alec Radford

Latent Space

Mentioned as a topic Amad has written blog posts about.

Personal AI Meetup - Bee, BasedHardware, LangChain LangFriend, Deepgram EmilyAI

Latent Space

An open-source speech-to-text model, mentioned in the context of challenges in getting it to work across different systems.

[Paper Club] Molmo + Pixmo + Whisper 3 Turbo - with Vibhu Sapra, Nathan Lambert, Amgadoz

Latent Space

A state-of-the-art automatic speech recognition (ASR) model from OpenAI, capable of transcription, translation, and speech detection. The discussion covers its architecture and the release of its Turbo version.

Building AGI in Real Time (OpenAI Dev Day 2024)

Latent Space

An AI model used for transcribing audio, specifically mentioned in the context of processing hour-long YouTube videos before multimodal capabilities were available.

Building AGI with OpenAI's Structured Outputs API

Latent Space

OpenAI's speech-to-text model, with discussions covering its API limitations (lack of diarization) and potential future improvements.

Gemini 1.5 and The Biggest Night in AI

AI Explained

Snipd: The AI Podcast App for Learning — with CEO Kevin Ben-Smith

Latent Space

An AI model for speech-to-text that became available after Snipd's initial launch, significantly improving transcription capabilities and enabling features like speaker diarization.

Lessons From 50 Of The Worlds Greatest Minds with Jake Humphrey | E59

The Diary Of A CEO

In this video, Whisper refers to Jake Humphrey's production company, which he co-founded and which aims to uplift underrepresented talent in the TV industry.

From skeptic to true believer: How OpenClaw changed my life | Claire Vo

Lenny's Podcast

OpenAI's automatic speech recognition model, implied to be used by Telegram for transcribing voice notes into text for OpenClaw agents.

Mistral: Voxtral TTS, Forge, Leanstral, & Mistral 4 — w/ Pavan Kumar Reddy & Guillaume Lample

Latent Space

An ASR model known for its 30-second processing limit, which served as inspiration for Mistral's longer-form audio processing.

Your Ground Truth Is Wrong: Evaluating STT with truth files & semantic WER | AssemblyAI Workshop

AssemblyAI

An open-source speech-to-text model developed by OpenAI, used as a benchmark comparison.