Live Mic + Upload · ≤5 min · ≤50 MB · All Browsers

VOICE TO TEXT

Record straight from your mic or upload an audio clip (up to 5 minutes, 50 MB). Get a timestamped transcript with speaker labels. Works in Chrome, Safari, Firefox, Edge — no signup.

Audio Input

Ready to record
Auto-stops at 5 minutes

Transcribe in the source language or translate into another language.

Transcript

Record or upload audio to transcribe

Speaker labels · timestamps · 6 export formats

How It Works

1. Record or Upload

Hit Record to dictate straight from your mic (auto-stops at 5 min), or drag in an audio file up to 50 MB — MP3, WAV, M4A, OGG, FLAC, WebM.

2. Pick a Language

Keep the original language or auto-translate into any of 40+ languages: dictate in English, export in Spanish (or any other combination).

3. Export 6 Ways

TXT · Timestamped TXT · SRT · VTT · JSON · clickable Markdown. Speaker labels included automatically. No signup.

Everything You Need — In One Tool

Live mic + upload

Dual input modes in one tool. Most competitors only do one or the other.

Speaker diarization

Free automatic 'Speaker 1 / 2 / 3…' labels. Otter and Notta keep this behind a paywall.

6 export formats

TXT · Timestamped TXT · SRT · VTT · JSON · clickable Markdown. Most tools give plain TXT only.

40+ languages

Cross-lingual output in the same pass. Dictate in English, export in Spanish (or any other supported language).

Search + click-to-jump

Inline search with highlight. Click any timestamp and the embedded player jumps to that second.

All browsers, no signup

Works in Chrome, Safari, Firefox, Edge (Web Speech tools are Chrome-only). No account needed.

Why CopyRocket Beats Speechnotes, Dictation.io, Google Docs Voice Typing, Otter, and Notta

Most free voice-to-text tools use the browser's built-in Web Speech API. That means Chrome-only, no speaker labels, no export formats beyond plain text, and inconsistent accuracy across devices. The paid alternatives (Otter, Notta) lock speaker diarization and multi-format export behind subscription tiers and require signup.

We use server-side Gemini 3.1 Flash Lite — the same AI model that powers YouTube auto-captions — so accuracy and features are consistent on every browser. Plus we ship live-mic recording AND file upload, speaker labels, and six export formats all on one free page.

What Makes Us Different

  • Live mic + audio upload in one tool. Speechnotes is mic-only. Otter is upload-heavy. We give you both on the same page — pick your input, same output pipeline.
  • Works in ALL browsers. Web Speech API tools (Dictation.io, SpeechTexter, voicetotextonline.com) only work in Chrome. We use server-side Gemini — Safari, Firefox, Edge, mobile browsers all get identical quality.
  • Speaker diarization on the free tier. Otter gates this. Notta gates this. We include it by default, which is useful for interview clips, short meetings, and two-person conversations.
  • 6 export formats in one click — TXT (plain), TXT with timestamps + speaker labels, SRT (for video editors), VTT (web captions), JSON (structured for developers), Markdown with clickable timestamps (unique).
  • Live browser recording with MediaRecorder API — waveform level meter, pause/resume, automatic stop at 5 min, cross-browser compatible mime-type selection.
  • Embedded audio player + click-to-jump. Click any timestamp in the transcript and the recorded or uploaded audio jumps to that second for instant verification.
  • 40+ language cross-lingual output. Dictate in English, export in Spanish. Transcribe a Japanese voice memo in English. Same pass, same accuracy.
  • Per-segment hover copy. Hover any line, click copy — grab just that quote without manually selecting.
  • No signup, no email, no credit card. 3 free runs per browser session. Pro upgrade removes the cap.
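
As a rough illustration of the SRT and VTT exports listed above, here is a minimal TypeScript sketch. The `Segment` shape and the function names are assumptions made for this example, not the tool's actual schema or code. The sketch shows the one detail that most often trips people up: SRT puts a comma before the milliseconds, while WebVTT uses a period and needs a `WEBVTT` header.

```typescript
// Hypothetical segment shape; the tool's real JSON schema may differ.
interface Segment {
  start: number; // seconds from the start of the audio, may be fractional
  end: number;
  speaker: string; // e.g. "Speaker 1"
  text: string;
}

// Left-pads a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Formats seconds as HH:MM:SS<sep>mmm.
// SRT uses "," before the milliseconds, WebVTT uses ".".
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + msSeparator + pad(ms, 3);
}

// SRT: numbered cues separated by a blank line; the speaker is a plain text prefix.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n` +
      `${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}\n` +
      `${seg.speaker}: ${seg.text}\n`)
    .join("\n");
}

// WebVTT: "WEBVTT" header, then cues; the speaker goes in a <v ...> voice tag.
// Note: this sketch does not escape "&" or "<" in the cue text.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) =>
    `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}\n` +
    `<v ${seg.speaker}>${seg.text}\n`);
  return ["WEBVTT\n", ...cues].join("\n");
}
```

Calling `toSrt(segments)` or `toVtt(segments)` on the same array yields both caption files from one transcript, which is why a single transcription pass can feed several export formats.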

Who This Is For

  • Journalists — transcribe interview voice memos without a subscription; get speaker-labelled quotes in seconds.
  • Podcasters — dictate show-note drafts or pull quotes from short segments.
  • Writers and bloggers — dictate a 5-minute chunk instead of typing; export as Markdown ready to paste.
  • Students — voice-note a study summary, lecture observation, or research idea and get it instantly text-ready.
  • Coaches and therapists: capture session reflections. Speaker labels help when two voices appear in the recording.
  • Lawyers and consultants — transcribe short client voice memos without sending audio to a subscription service.
  • Anyone on Safari or Firefox who's tired of being told “open this in Chrome” by Web Speech tools.

Technical Notes

Mic input is captured via the MediaRecorder API with opus-encoded WebM (or platform-best codec fallback). An analyser node drives the live waveform meter. Recording auto-stops at the 5-minute mark. Uploaded audio is validated client-side (size + duration via HTMLAudioElement metadata) before the request fires. Transcription is powered by Google Gemini 3.1 Flash Lite Preview through OpenRouter in JSON mode. Timestamps preserve sub-second precision. Speaker labels come from voice-based turn detection — not a pre-defined roster.
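
The codec fallback and client-side validation described above could look roughly like the TypeScript sketch below. It is an illustration under stated assumptions, not the site's actual code: the candidate MIME list, the function names, and the reading of "50 MB" as 50 × 1024 × 1024 bytes are all hypothetical. The support check is passed in as a function so the logic can run outside a browser.

```typescript
// Candidate container/codec strings in order of preference.
// This list is an assumption; Opus in WebM comes first, as in the notes above.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
  "audio/ogg;codecs=opus",
];

// Returns the first candidate the predicate accepts, or undefined so the
// caller can let the browser fall back to its own default.
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return undefined;
}

// Limits taken from the page: 50 MB and 5 minutes.
// Treating 1 MB as 1024 * 1024 bytes is an assumption.
const MAX_BYTES = 50 * 1024 * 1024;
const MAX_SECONDS = 5 * 60;

interface UploadCheck {
  ok: boolean;
  reason?: string;
}

// Checks size and duration before any request is sent.
// A non-finite duration (NaN before metadata loads, or Infinity) is rejected.
function validateUpload(sizeBytes: number, durationSeconds: number): UploadCheck {
  if (sizeBytes > MAX_BYTES) {
    return { ok: false, reason: "File exceeds 50 MB" };
  }
  if (!Number.isFinite(durationSeconds)) {
    return { ok: false, reason: "Could not read audio duration" };
  }
  if (durationSeconds > MAX_SECONDS) {
    return { ok: false, reason: "Audio is longer than 5 minutes" };
  }
  return { ok: true };
}
```

In a browser, the predicate would typically be `(t) => MediaRecorder.isTypeSupported(t)`, and the duration would come from an audio element's `duration` property once its `loadedmetadata` event has fired.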

Unlimited Transcripts, Longer Recordings, Bulk Upload

CopyRocket Pro: unlimited runs, longer duration limits, bulk mode, and 50+ other AI tools.

Get Unlimited Access

Frequently Asked Questions