Question 1

What are the audio length and file size limits?

Accepted Answer

Up to 5 minutes of audio or 50 MB, whichever limit is reached first. That covers voice memos, interview clips, podcast segments, meeting notes, and short dictation sessions. For longer audio, use Audio to Text Converter or Podcast Summarizer.
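As a rough illustration of how the two limits combine, here is a minimal sketch of a pre-upload check. The function name `check_limits` and the reading of "50 MB" as 50 × 1024 × 1024 bytes are assumptions made for this example, not details confirmed by the tool.

```python
MAX_DURATION_SECONDS = 5 * 60        # 5 minutes, as stated in the answer
MAX_SIZE_BYTES = 50 * 1024 * 1024    # assumption: "50 MB" read as 50 MiB


def check_limits(duration_seconds: float, size_bytes: int) -> list[str]:
    """Return the limits a clip violates; an empty list means it is accepted.

    Hypothetical helper: both limits are checked independently, so a clip
    is rejected as soon as it exceeds either one.
    """
    problems = []
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("duration exceeds 5 minutes")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file exceeds 50 MB")
    return problems
```

A 2-minute, 3 MB voice memo passes both checks. A 6-minute clip is rejected on duration even if the file is tiny.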

Question 2

Can I record straight from my microphone?

Accepted Answer

Yes. Click Record, grant microphone permission, and speak. Recording happens in the browser through the MediaRecorder API, with a live waveform and timer while you record. Recording stops automatically at 5 minutes. Works on desktop Chrome, Safari, Firefox, and Edge, and on mobile browsers.

Question 3

Which audio formats can I upload?

Accepted Answer

MP3, WAV, M4A, OGG, FLAC, and WebM, along with other standard formats. Voice memos recorded on iPhone (M4A) or Android (M4A/AMR) work, and so does audio extracted from a video.
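If you want to filter files before uploading, a simple extension check along these lines may help. This is only a sketch: the set below contains just the formats named in the answer, the helper name `is_accepted_format` is hypothetical, and the service may accept more formats or inspect file contents rather than names.

```python
from pathlib import PurePath

# Formats named in the answer above; the real service may accept more.
ACCEPTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm", ".amr"}


def is_accepted_format(filename: str) -> bool:
    """Check the file extension (case-insensitive) against the known list."""
    return PurePath(filename).suffix.lower() in ACCEPTED_EXTENSIONS
```

An extension check is cheap but shallow. A renamed file would still pass, so treat it as a convenience filter only.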

Question 4

Does it work in Safari and Firefox?

Accepted Answer

Yes, in all modern browsers. Competitors like SpeechTexter and Dictation.io only work in Chrome because they rely on the browser's Web Speech API. We transcribe server-side with Gemini 3.1 Flash Lite, so accuracy and features are consistent across every browser.

Question 5

Does it include speaker labels?

Accepted Answer

Yes, with automatic speaker diarization. Each segment is labelled 'Speaker 1', 'Speaker 2', etc. based on the voice. It works well for 2-person interviews, short meetings, or panel clips. Otter and Notta put this feature behind a paywall.
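To show what labelled output can look like, here is a minimal sketch that turns diarized segments into a readable transcript. The input shape (a list of `(speaker_number, text)` pairs in time order) and the function name are assumptions for illustration. The diarization itself is done by the service and is not shown.

```python
def format_speaker_turns(segments: list[tuple[int, str]]) -> str:
    """Render (speaker_number, text) segments as 'Speaker N: ...' lines.

    Consecutive segments from the same speaker are merged into one turn.
    """
    turns: list[tuple[int, list[str]]] = []
    for speaker, text in segments:
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(text)        # same speaker keeps talking
        else:
            turns.append((speaker, [text]))  # a new turn starts
    return "\n".join(f"Speaker {spk}: {' '.join(parts)}" for spk, parts in turns)
```

Merging consecutive segments keeps a 2-person interview readable, since each line then corresponds to one uninterrupted turn.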

Question 6

Can I translate the transcript?

Accepted Answer

Yes. Pick from 40+ output languages before transcribing. Gemini transcribes and translates in a single pass.

Question 7

What export formats are available?

Accepted Answer

Six: TXT · Timestamped TXT · SRT · VTT · JSON · Markdown with clickable timestamps. Most competitors give plain TXT only.
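SRT and VTT differ mainly in timestamp punctuation and the file header. The sketch below shows one way timed segments could be rendered into both formats. The `Segment` type and function names are hypothetical, and this is not the tool's actual exporter.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the audio
    end: float
    text: str


def _timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = [
        f"{i}\n{_timestamp(s.start, ',')} --> {_timestamp(s.end, ',')}\n{s.text}"
        for i, s in enumerate(segments, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


def to_vtt(segments: list[Segment]) -> str:
    """WebVTT: 'WEBVTT' header, dot before the milliseconds."""
    cues = [
        f"{_timestamp(s.start, '.')} --> {_timestamp(s.end, '.')}\n{s.text}"
        for s in segments
    ]
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```

Both exporters share the same segment list, so adding another timed format is mostly a matter of writing a new renderer.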

Question 8

How accurate is it?

Accepted Answer

95%+ for clear speech using Gemini 3.1 Flash Lite. Background noise, overlapping speakers, and strong accents can reduce accuracy. For best results, record in a quiet room with the mic close to your mouth.
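The answer does not say how accuracy is measured. A common convention for speech recognition is word error rate (WER), with accuracy taken as 1 minus WER. Assuming that convention, you could check a transcript against a reference you typed yourself with a sketch like this (the function name is hypothetical):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes case-insensitive comparison and whitespace tokenization;
    punctuation is not normalized in this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Under this convention, one wrong word in a 20-word reference gives a WER of 0.05, which corresponds to 95% accuracy.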

Question 9

Is my audio saved?

Accepted Answer

No. Audio is sent to Gemini, transcribed, and discarded. We don't log or store your recordings. Privacy-first.

Question 10

How is this different from Google Docs Voice Typing, Dictation.io, Otter, or Notta?

Accepted Answer

(1) Works in all browsers (Web Speech tools are Chrome-only). (2) Dual input: live mic or audio upload (most tools offer one or the other). (3) Free speaker labels (Otter and Notta paywall them). (4) 6 export formats (most give TXT only). (5) Cross-lingual translation into 40+ languages. (6) No signup.

How It Works

Record or Upload

Pick a Language

Export 6 Ways

Everything You Need — In One Tool

Why CopyRocket Beats Speechnotes, Dictation.io, Google Docs Voice Typing, Otter, and Notta

What Makes Us Different

Who This Is For

Technical Notes

Unlimited Transcripts, Longer Recordings, Bulk Upload

Frequently Asked Questions