Voice activity detection
Voice activity detection (VAD) is a technique that identifies which parts of an audio stream contain speech and which contain only silence or background noise.
What it means
VAD answers a narrow question: is someone speaking right now? It does not transcribe the audio or identify the speaker. Software uses it to skip silent stretches, to know when to start and stop processing, and to feed cleaner audio into transcription.
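As a rough illustration of that yes/no question, the sketch below labels each short frame of audio as speech or non-speech by comparing its energy to a fixed threshold. This is the simplest possible approach, not a description of any particular product's detector; production systems usually use statistical or neural models. The function names, the 20 ms frame length, and the 0.02 threshold are all assumptions chosen for the example.

```python
import math

def frame_rms(samples):
    """Root-mean-square amplitude of one frame of float samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_speech_per_frame(samples, sample_rate=16000, frame_ms=20, threshold=0.02):
    """Return one boolean per full frame: True if the frame's energy
    reaches the (assumed) threshold, False otherwise.

    A trailing partial frame is ignored.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_rms(frame) >= threshold)
    return flags

# Usage: one silent frame, one frame of a 440 Hz tone standing in for speech,
# then another silent frame.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(320)]
flags = is_speech_per_frame(silence + tone + silence)
```

A fixed energy threshold treats any loud sound as speech, so it is only a starting point. It does show the shape of the output, though: a stream of per-frame decisions and nothing more.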
In a transcription pipeline, VAD makes a useful first pass. By trimming long silences and non-speech audio before the recognition model runs, it cuts processing time and reduces spurious output from background noise.
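A minimal sketch of that trimming step, assuming some detector has already produced one speech/non-speech flag per 20 ms frame: it merges the flags into time ranges worth sending to the recognizer. The function name and the `hangover_frames` and `min_speech_frames` parameters are hypothetical and exist only for illustration. The hangover bridges brief pauses inside a sentence, and the minimum length drops isolated blips that are more likely noise than speech.

```python
def speech_segments(flags, frame_ms=20, hangover_frames=5, min_speech_frames=3):
    """Turn per-frame speech flags into a list of (start_ms, end_ms) segments.

    Gaps of up to `hangover_frames` non-speech frames are kept inside a
    segment; segments shorter than `min_speech_frames` are discarded.
    """
    segments = []
    start = None       # index of the first frame of the open segment
    silence_run = 0    # consecutive non-speech frames since last speech

    for i, flag in enumerate(flags):
        if flag:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run > hangover_frames:
                # Close the segment just after its last speech frame.
                end = i - silence_run + 1
                if end - start >= min_speech_frames:
                    segments.append((start * frame_ms, end * frame_ms))
                start = None
                silence_run = 0

    # Close a segment that is still open when the audio ends.
    if start is not None:
        end = len(flags) - silence_run
        if end - start >= min_speech_frames:
            segments.append((start * frame_ms, end * frame_ms))
    return segments

# Usage: 200 ms of silence, speech with a brief 40 ms pause, a long silence,
# then a single stray frame that is too short to count as speech.
flags = [False] * 10 + [True] * 5 + [False] * 2 + [True] * 4 \
        + [False] * 20 + [True] * 1 + [False] * 3
segments = speech_segments(flags)  # [(200, 420)]
```

Only the returned ranges would then be passed on for recognition, which is where the savings in processing time and the reduction in noise-driven output come from.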
How this relates to Autorec
Autorec's transcription step works on the recorded audio of your call. Detecting where speech actually occurs helps the on-device transcription stay efficient and keeps room noise from turning into stray text in the transcript.
Try Autorec
A local-first meeting recorder for Linux and Windows. It auto-detects your calls, records to your own disk, and transcribes on your machine. One-time €20, with a free tier to start.