Hands-on · 2026-05-30

Whisper 101: how to transcribe any video for free

Whisper is one of my single biggest unlocks, and most people have no idea it's free.

Whisper is an open source speech-to-text model OpenAI released a few years ago. You download it, install it locally, and transcribe any video you want for free. No subscriptions. No per-minute fees.

01 / Free Forever
Local install means zero ongoing cost

The whole game is downloading it to your own machine. Once it's there, you own it. No API meter running, no monthly bill, no usage caps. Every video you ever want to transcribe costs you nothing but time.

02 / Video Becomes Context
Turn any recording into usable text

Your videos, other people's videos, a one-hour podcast — doesn't matter. Whisper converts it all into clean text you can feed anywhere. That transcript becomes context for your agents, tools, and workflows. Pipe it into anything downstream.

You can also transcribe yourself. Talk, get text, route it wherever you want. It's the simplest possible bridge between your voice and any system that reads text.
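
If you'd rather do this from Python than from the command line, here's a minimal sketch. It assumes the openai-whisper package from section 05 is already installed. The function names `transcribe_to_text` and `as_context` are mine, made up for illustration, and the bracketed wrapper is just one arbitrary way to label a transcript before handing it to an agent.

```python
def transcribe_to_text(media_path, model_name="turbo"):
    """Return the plain-text transcript of one audio or video file."""
    # Imported lazily so the rest of this file works without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(str(media_path))
    return result["text"].strip()


def as_context(transcript, source_label):
    """Wrap a transcript in a labeled block (hypothetical format, adjust freely)."""
    return f"[Transcript of {source_label}]\n{transcript}\n[End of transcript]"


# Example usage (needs Whisper and ffmpeg installed):
# text = transcribe_to_text("my-video.mp4")
# prompt = as_context(text, "my-video.mp4") + "\n\nSummarize the key points."
```

The wrapper is plain string formatting on purpose: whatever tool reads the text next only needs to know where the transcript starts and stops.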

03 / Stupid Fast
A one-hour video transcribes in under a minute

The speed is what makes "transcribe everything" realistic. With a capable GPU, short clips take a few seconds and long recordings barely register; on a CPU-only machine, expect noticeably longer runs. Either way, you stop rationing what you send through it because the cost, in every sense, is basically zero.
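
To check the speed on your own hardware, it helps to think in multiples of realtime. This is a small arithmetic sketch, not a benchmark: the function names are mine, and the 10x figure is an arbitrary example of a slower machine.

```python
def realtime_factor(audio_seconds, wall_seconds):
    """Seconds of audio processed per second of waiting."""
    return audio_seconds / wall_seconds


def wall_time_for(audio_seconds, factor):
    """Expected wait, in seconds, at a given realtime factor."""
    return audio_seconds / factor


ONE_HOUR = 60 * 60

# A one-hour video finishing in one minute means 60x realtime.
print(realtime_factor(ONE_HOUR, 60))      # 60.0

# At a hypothetical 10x realtime, the same hour takes six minutes.
print(wall_time_for(ONE_HOUR, 10) / 60)   # 6.0
```

Time one short clip, compute its factor, and you have a rough estimate for anything longer on the same machine.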

It also handles messy audio well. Background noise, low quality, disruptions — it still hits high accuracy on what it picks up.

04 / Chain It
Connect transcripts to your full stack

Whisper output is plain text, so it wires into everything. Feed transcripts into agents. Use them as input for tools you're already running. Chain it with a video rendering pipeline to go from raw footage to finished output without touching a timeline manually.
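
One way to wire this up, sketched under a few assumptions: you ran the whisper command with --output_format json, and the resulting file has a "segments" list whose items carry "start", "end", and "text". The helper names and the 1000-character chunk size are my own picks for illustration.

```python
import json
from pathlib import Path


def load_segments(json_path):
    """Read the segment list from a Whisper JSON output file."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return data["segments"]


def _merge(segments):
    """Collapse consecutive segments into one timestamped chunk."""
    return {
        "start": segments[0]["start"],
        "end": segments[-1]["end"],
        "text": " ".join(s["text"].strip() for s in segments),
    }


def chunk_segments(segments, max_chars=1000):
    """Group segments into chunks of roughly max_chars, keeping timestamps."""
    chunks, current, size = [], [], 0
    for seg in segments:
        text = seg["text"].strip()
        # Start a new chunk when adding this segment would exceed the budget.
        if current and size + len(text) > max_chars:
            chunks.append(_merge(current))
            current, size = [], 0
        current.append(seg)
        size += len(text) + 1  # +1 for the joining space
    if current:
        chunks.append(_merge(current))
    return chunks


# Example usage:
# for chunk in chunk_segments(load_segments("my-video.json")):
#     print(chunk["start"], chunk["end"], chunk["text"][:60])
```

Keeping the start and end times on each chunk means a downstream tool can point back to the exact moment in the video a passage came from.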

05 / Install it
Running in about 2 minutes

The plain-English version: Whisper is a small Python program. You set up three things once, then transcribing is a single command you reuse forever. The three pieces are Python, ffmpeg (the part that actually reads your video and audio files), and Whisper itself.

1. Install ffmpeg. Whisper does not decode audio on its own. It hands every file to ffmpeg first, so this one is required, not optional. Pick the line for your machine:

Mac (with Homebrew): brew install ffmpeg
Linux (Debian/Ubuntu): sudo apt update && sudo apt install ffmpeg
Windows (with Chocolatey): choco install ffmpeg

2. Install Whisper (needs Python 3.8 to 3.11):

pip install -U openai-whisper
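
If something doesn't work after these two steps, a quick preflight check usually finds the culprit. This is a sketch of my own, not part of Whisper: it checks the Python range quoted above, looks for ffmpeg on your PATH, and confirms the `whisper` module is importable. The function names are hypothetical, and newer Python versions may work fine even if this flags them.

```python
import importlib.util
import shutil
import sys


def python_supported(version_info=sys.version_info):
    """True if the interpreter falls in the 3.8 to 3.11 range quoted above."""
    return (3, 8) <= tuple(version_info[:2]) <= (3, 11)


def preflight(tools=("ffmpeg",), modules=("whisper",)):
    """Return a list of human-readable problems; empty means good to go."""
    problems = []
    if not python_supported():
        found = ".".join(str(n) for n in sys.version_info[:3])
        problems.append(f"Python {found} is outside the 3.8 to 3.11 range")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    for module in modules:
        if importlib.util.find_spec(module) is None:
            problems.append(f"Python module {module!r} is not installed")
    return problems


# Example usage:
# for problem in preflight():
#     print("fix:", problem)
```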

3. Transcribe anything:

whisper my-video.mp4 --model turbo

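turbo is the newer fast model; OpenAI's README lists it at roughly 8x the speed of the large model. Want lighter and English-only? Swap in --model small.en. That is the whole setup. Point it at any file and Whisper prints the transcript as it goes, then saves it in the folder you ran the command from, as a .txt file plus subtitle formats like .srt and .vtt.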
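
Once the single command works, running it over a whole folder is a small step. Here's a sketch that shells out to the same `whisper` command with its --output_dir and --output_format options. The helper names and the list of file extensions are my own assumptions, so trim or extend them to match your files.

```python
import subprocess
from pathlib import Path

# Extensions to pick up; an arbitrary starter list, not an official one.
MEDIA_SUFFIXES = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def build_command(media_path, output_dir, model="turbo", output_format="txt"):
    """Build the argument list for one whisper CLI run."""
    return [
        "whisper", str(media_path),
        "--model", model,
        "--output_dir", str(output_dir),
        "--output_format", output_format,
    ]


def find_media(folder):
    """List media files directly inside a folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_SUFFIXES
    )


def transcribe_folder(folder, output_dir, model="turbo"):
    """Run whisper on each media file; stops on the first failure."""
    for media in find_media(folder):
        subprocess.run(build_command(media, output_dir, model), check=True)


# Example usage (needs Whisper and ffmpeg installed):
# transcribe_folder("recordings", "transcripts")
```

Passing the command as a list rather than one string avoids shell quoting problems with filenames that contain spaces.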

06 / For captions
WhisperX gives you word-perfect timing

Plain Whisper timestamps a segment at a time (roughly a phrase or sentence), and those timestamps can drift by a few seconds. For captions, subtitles, or anything where each word has to land on the right frame, use WhisperX. It force-aligns the transcript to the audio to get a timestamp for each word, and it can label who is speaking.

pip install whisperx
whisperx my-video.mp4 --model large-v2
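
To show what word-level timing buys you, here's a sketch that turns a list of timed words into short SRT caption cues. The input shape is an assumption on my part: a list of dicts with "word", "start", and "end" in seconds, which is roughly what WhisperX's JSON output contains per segment. Words that come back without timestamps are skipped. The function names and the three-words-per-cue default are mine.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=3):
    """Group timed words into numbered SRT cues of up to max_words each."""
    # Skip any word that has no timing attached.
    timed = [w for w in words if "start" in w and "end" in w]
    cues = []
    for i in range(0, len(timed), max_words):
        group = timed[i:i + max_words]
        start = srt_timestamp(group[0]["start"])
        end = srt_timestamp(group[-1]["end"])
        text = " ".join(w["word"].strip() for w in group)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + "\n"


# Example usage:
# words = [{"word": "Hello", "start": 0.0, "end": 0.4},
#          {"word": "world", "start": 0.5, "end": 0.9}]
# print(words_to_srt(words, max_words=2))
```

Short cues like these are what give captions that punchy, word-by-word feel, and they only work when each word has its own start and end time.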

Don't want to think about it?
Paste this to Claude or ChatGPT and let it set everything up for you:

"Install OpenAI Whisper locally on my machine. Reference github.com/openai/whisper and walk me through ffmpeg plus pip step by step, then show me how to transcribe a video file. If I need word-level timestamps for captions, set up WhisperX (github.com/m-bain/whisperX) instead."

Cost: Completely free, forever, no meter running
Speed: One-hour video transcribed in under a minute
Utility: Any video becomes context for any downstream tool

Official docs: the full setup guide, model list, and command options are on OpenAI's official GitHub (github.com/openai/whisper).