Transcription & AI
Autorec transcribes recordings locally using whisper.cpp and optionally generates AI summaries via any OpenAI-compatible API.
How Transcription Works
- After a recording finishes, autorec extracts the audio track
- The audio is processed by the selected Whisper model entirely on your machine
- Two output files are created alongside the video:
.txt— plain text transcript.srt— subtitle file with timestamps
No audio or video data leaves your computer during transcription.
Whisper Models
Models are downloaded on first use and stored in ~/.local/share/autorec/models/ (Linux) or %LOCALAPPDATA%\autorec\models\ (Windows).
| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| tiny | ~75 MB | Fastest | Basic | Quick notes, low-power machines |
| base | ~142 MB | Fast | Good | Default — recommended for most users |
| small | ~466 MB | Moderate | Better | When accuracy matters more than speed |
| medium | ~1.5 GB | Slow | High | Non-English languages, difficult audio |
| large | ~3 GB | Slowest | Best | Maximum accuracy, powerful hardware |
Downloading Models
- Open Settings from the tray menu
- Go to the Transcription section
- Select a model size
- Click Download — the model downloads once and is reused for all future transcriptions
AI Summaries
AI summaries use a cloud API to generate a title and summary from the transcript text. Only the text is sent — no audio or video.
Setup
- Open Settings > AI Summaries
- Enter your API endpoint (e.g.,
https://api.openai.com/v1) - Enter your API key
- Choose a model (e.g.,
gpt-4o-mini) - Enable auto-summarize
Compatible Services
Any service with an OpenAI-compatible chat completions endpoint works:
- OpenAI —
https://api.openai.com/v1 - OpenRouter —
https://openrouter.ai/api/v1 - Local models (Ollama, LM Studio, etc.) — use your local endpoint
What Gets Generated
For each transcribed recording, autorec generates:
- Title — a short, descriptive title for the meeting
- Summary — a concise summary of key points discussed
Both appear in the video library and the video detail view, making it easy to find the meeting you need without rewatching.