VOICE TO TEXT
Record straight from your mic or upload an audio clip (up to 5 minutes, 50 MB). Get a timestamped transcript with speaker labels. Works in Chrome, Safari, Firefox, Edge — no signup.
Audio Input
Transcribe in the source language, or translate into another language in the same pass.
Transcript
Speaker labels · timestamps · 6 export formats
How It Works
Record or Upload
Hit Record to dictate straight from your mic (auto-stops at 5 min), or drag in an audio file up to 50 MB — MP3, WAV, M4A, OGG, FLAC, WebM.
Pick a Language
Keep the original language or auto-translate to any of 40+ languages: dictate in English, export in Spanish (or any other combination).
Export 6 Ways
TXT · Timestamped TXT · SRT · VTT · JSON · clickable Markdown. Speaker labels included automatically. No signup.
Everything You Need — In One Tool
Dual input modes in one tool. Most competitors only do one or the other.
Free automatic 'Speaker 1 / 2 / 3…' labels. Otter and Notta reserve this for paid tiers.
TXT · Timestamped TXT · SRT · VTT · JSON · clickable Markdown. Most tools give plain TXT only.
Cross-lingual output in the same pass. Dictate in English, export in Spanish (or any other language).
Inline search with highlight. Click any timestamp, the embedded player jumps to that second.
Works in Chrome, Safari, Firefox, Edge (most Web Speech tools work reliably only in Chrome). No account needed.
Why CopyRocket Beats Speechnotes, Dictation.io, Google Docs Voice Typing, Otter, and Notta
Most free voice-to-text tools use the browser's built-in Web Speech API. In practice that means they typically work reliably only in Chrome, with no speaker labels, no export formats beyond plain text, and accuracy that varies across devices. The paid alternatives (Otter, Notta) lock speaker diarization and multi-format export behind subscription tiers and require signup.
We use server-side Gemini 3.1 Flash Lite — the same AI model that powers YouTube auto-captions — so accuracy and features are consistent on every browser. Plus we ship live-mic recording AND file upload, speaker labels, and six export formats all on one free page.
What Makes Us Different
- Live mic + audio upload in one tool. Speechnotes is mic-only. Otter is upload-heavy. We give you both on the same page — pick your input, same output pipeline.
- Works in ALL browsers. Web Speech API tools (Dictation.io, SpeechTexter, voicetotextonline.com) typically work reliably only in Chrome. We transcribe server-side with Gemini, so Safari, Firefox, Edge, and mobile browsers all get identical quality.
- Speaker diarization on the free tier. Otter gates this. Notta gates this. We include it by default, which is great for interview clips, short meetings, and two-person conversations.
- 6 export formats in one click — TXT (plain), TXT with timestamps + speaker labels, SRT (for video editors), VTT (web captions), JSON (structured for developers), Markdown with clickable timestamps (unique).
- Live browser recording with the MediaRecorder API: waveform level meter, pause/resume, automatic stop at 5 min, and cross-browser MIME-type selection.
- Embedded audio player + click-to-jump. Click any timestamp in the transcript, the recorded/uploaded audio jumps to that second for instant verification.
- 40+ language cross-lingual output. Dictate in English, export in Spanish. Transcribe a Japanese voice memo into English. Same pass, same accuracy.
- Per-segment hover copy. Hover any line, click copy — grab just that quote without manually selecting.
- No signup, no email, no credit card. 3 free runs per browser session. Pro upgrade removes the cap.
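For developers curious how timestamped segments map onto the subtitle exports listed above, here is a minimal sketch. It is not CopyRocket's actual exporter: the `Segment` shape and the function names are assumptions made for illustration. What it does show is the one real difference between the two formats' timestamps: SRT separates milliseconds with a comma, WebVTT with a period.

```typescript
// Hypothetical segment shape, assumed for this sketch only.
interface Segment {
  start: number;   // seconds from the start of the audio, may be fractional
  end: number;     // seconds from the start of the audio
  speaker: string; // e.g. "Speaker 1"
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses "," and WebVTT uses ".".
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// SRT: numbered cues, each followed by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n` +
      `${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}\n` +
      `${seg.speaker}: ${seg.text}\n`)
    .join("\n");
}

// WebVTT: "WEBVTT" header, then cues; the speaker goes in a <v> voice tag.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) =>
    `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}\n` +
    `<v ${seg.speaker}>${seg.text}\n`);
  return ["WEBVTT\n", ...cues].join("\n");
}
```

Rounding to whole milliseconds before splitting into hours, minutes, and seconds avoids a cue such as 59.9996 s being printed with an out-of-range milliseconds field.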
Who This Is For
- Journalists — transcribe interview voice memos without a subscription; get speaker-labelled quotes in seconds.
- Podcasters — dictate show-note drafts or pull quotes from short segments.
- Writers and bloggers — dictate a 5-minute chunk instead of typing; export as Markdown ready to paste.
- Students — voice-note a study summary, lecture observation, or research idea and get it instantly text-ready.
- Coaches and therapists — capture session reflections. Speaker labels help when 2 voices appear in the recording.
- Lawyers and consultants — transcribe short client voice memos without sending audio to a subscription service.
- Anyone on Safari or Firefox — who's tired of being told “open this in Chrome” by Web Speech tools.
Technical Notes
Mic input is captured via the MediaRecorder API with opus-encoded WebM (or platform-best codec fallback). An analyser node drives the live waveform meter. Recording auto-stops at the 5-minute mark. Uploaded audio is validated client-side (size + duration via HTMLAudioElement metadata) before the request fires. Transcription is powered by Google Gemini 3.1 Flash Lite Preview through OpenRouter in JSON mode. Timestamps preserve sub-second precision. Speaker labels come from voice-based turn detection — not a pre-defined roster.
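The codec fallback and the client-side checks described above could look roughly like the sketch below. This is an illustration under stated assumptions, not the production code: the candidate list, its order, the constant names, the error messages, and the reading of 50 MB as 50 × 1024 × 1024 bytes are all assumed. The support check is passed in as a function so the logic can run outside a browser.

```typescript
// Assumed preference order: Opus in WebM first, then broader fallbacks.
const CANDIDATE_MIME_TYPES: string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Return the first supported candidate, or undefined so the caller can
// let the recorder fall back to its own default format.
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  for (const candidate of candidates) {
    if (isSupported(candidate)) return candidate;
  }
  return undefined;
}

// Limits taken from the page copy; the byte interpretation is an assumption.
const MAX_BYTES = 50 * 1024 * 1024;
const MAX_SECONDS = 5 * 60;

// Return an error message, or null when the file may be sent for transcription.
function validateUpload(sizeBytes: number, durationSeconds: number): string | null {
  if (sizeBytes > MAX_BYTES) return "File is larger than 50 MB.";
  // Some files report NaN or Infinity until their metadata can be read.
  if (!isFinite(durationSeconds)) return "Could not read the audio duration.";
  if (durationSeconds > MAX_SECONDS) return "Audio is longer than 5 minutes.";
  return null;
}
```

In a browser, the support check would typically be `(t) => MediaRecorder.isTypeSupported(t)`, and the duration would come from an audio element's `duration` property once its `loadedmetadata` event has fired.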
Unlimited Transcripts, Longer Recordings, Bulk Upload
CopyRocket Pro: unlimited runs, longer duration limits, bulk mode, and 50+ other AI tools.
Get Unlimited Access