Audio
Summaries aren't just read aloud. They're rewritten as audio scripts by the same AI provider you use for summarizing, using a configurable audio mode that controls the persona and structure. The script adapts to your cognitive traits and tone setting. The result sounds like a natural spoken piece, not a screen reader.
How audio works
Three things happen when you press a:
- Your summary is sent to the AI provider (same one used for summarizing).
- The AI rewrites it as a spoken script using your chosen audio mode. Each mode has a distinct persona, structure, and delivery style.
- The script is sent to a TTS voice (Edge TTS or OpenAI) for synthesis and playback.
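The three steps above can be sketched as a small pipeline. This is an illustrative outline only, not tldr's actual implementation: `AudioRequest`, `generate_audio`, and the two stand-in functions are hypothetical names, and the real AI and TTS calls are replaced by fakes so the flow can be run on its own.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AudioRequest:
    """Hypothetical container for what the audio step needs."""
    summary: str
    audio_mode: str = "podcast"      # default mode listed in this page
    tts_provider: str = "edge-tts"   # default TTS provider listed in this page


def generate_audio(
    request: AudioRequest,
    rewrite: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> bytes:
    # Steps 1 and 2: the summary goes to the AI provider, which returns
    # a spoken script shaped by the chosen audio mode.
    script = rewrite(request.summary, request.audio_mode)
    # Step 3: the script goes to the TTS provider for synthesis.
    return synthesize(script, request.tts_provider)


# Stand-ins for the real AI and TTS calls (assumptions for illustration).
def fake_rewrite(summary: str, mode: str) -> str:
    return f"[{mode}] {summary}"


def fake_synthesize(script: str, provider: str) -> bytes:
    return f"{provider}:{script}".encode("utf-8")


audio = generate_audio(
    AudioRequest("LLMs predict tokens."), fake_rewrite, fake_synthesize
)
```

Passing the rewrite and synthesis steps in as functions keeps the sketch independent of any particular provider.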
Audio modes
Audio modes control how the script is structured and delivered:
| Mode | Persona | Best for |
|---|---|---|
| podcast (default) | Conversational host with hooks and transitions | General listening |
| briefing | Analyst delivering concise, numbered facts | Daily catch-ups |
| lecture | Patient teacher building understanding progressively | Deep learning |
| storyteller | Narrator weaving a compelling narrative | Story-driven content |
| study-buddy | Study partner with quizzes and mnemonics | Exam prep, retention |
| calm | Gentle, soothing narrator | Relaxed / bedtime listening |
tldr config set audio-mode briefing # Change default
tldr --audio-mode lecture "https://..." # Override for one run
Built-in presets bundle an audio mode with matching style, tone, and voice settings. See Presets.
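The relationship between the saved default and the per-run flag can be pictured as a simple precedence rule: flag first, then config, then the built-in default. The sketch below assumes that ordering from the two commands above. `resolve_audio_mode` is a hypothetical helper, and rejecting unknown modes with a `ValueError` is an assumption rather than documented tldr behavior.

```python
from typing import Optional

# Mode names taken from the table above.
AUDIO_MODES = {
    "podcast", "briefing", "lecture", "storyteller", "study-buddy", "calm",
}
DEFAULT_AUDIO_MODE = "podcast"


def resolve_audio_mode(cli_flag: Optional[str], config: dict) -> str:
    """Pick the mode for one run: CLI flag, then config, then default."""
    mode = cli_flag or config.get("audio-mode") or DEFAULT_AUDIO_MODE
    if mode not in AUDIO_MODES:
        # Assumed behavior: the real CLI may handle bad values differently.
        raise ValueError(f"unknown audio mode: {mode!r}")
    return mode
```

For example, with `audio-mode` set to `briefing` in config, `resolve_audio_mode("lecture", {"audio-mode": "briefing"})` picks `lecture` for that run only.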
Cognitive traits
The rewrite also adapts to your cognitive traits. These work with any audio mode:
| Trait | How the audio script changes |
|---|---|
| ADHD | Leads with a hook. High energy. Mini-takeaway per segment. |
| Dyslexia | Short punchy sentences. Key terms repeated naturally. |
| Autism | Direct and precise. No idioms or implied meanings. |
| ESL | Common vocabulary. Specialized terms explained inline. |
| Visual thinker | Spatial language. Word pictures. Narrative structure. |
Traits stack: enable several at once with tldr preset edit. Your tone setting (casual, professional, academic, eli5) also shapes the script.
Listening to a summary
- Summary appears. A bordered Audio panel shows [a] listen · [w] save + audio.
- Press a. A spinner shows "Generating audio..." then audio plays.
- Press s to stop playback.
- Press a again. Cached audio replays instantly, with no regeneration.
Saving with audio
Why save audio? So you can listen later (on a commute or at the gym) or revisit a summary without regenerating the audio.
Press w instead of Enter. Saves both summary.md and audio.mp3. Press Enter to save the summary only.
After saving, you stay on the result view, so you can still copy, chat, re-listen, or re-summarize. The footer shows "Saved" and [q] exits with a single tap (no confirmation is needed because nothing will be lost).
TTS Providers
tldr supports two TTS providers:
| Provider | Cost | Quality | Setup |
|---|---|---|---|
| Edge TTS (default) | Free | Good (Microsoft Neural voices) | None |
| OpenAI TTS | Paid (per-character) | High quality | Requires OPENAI_API_KEY |
# Switch TTS provider
tldr config set tts-provider openai
tldr config set tts-provider edge-tts # back to default
You can also change the TTS provider in the preset editor (tldr preset edit / /config).
TTS Model
When using OpenAI TTS, you can choose which model to use:
tldr config set tts-model tts-1-hd
| Model | Cost | Notes |
|---|---|---|
| tts-1 (default) | ~$0.01/summary | Faster, lower cost |
| tts-1-hd | ~$0.02/summary | Higher audio quality |
| gpt-4o-mini-tts | ~$0.01/summary | Newest, supports instructions |
Voices
Each TTS provider has its own set of voices.
Edge TTS voices:
| Voice | ID | Style |
|---|---|---|
| Jenny (default) | en-US-JennyNeural | Friendly, warm |
| Guy | en-US-GuyNeural | Professional, clear |
| Aria | en-US-AriaNeural | Positive, conversational |
| Sonia | en-GB-SoniaNeural | Clear, British |
| Natasha | en-AU-NatashaNeural | Bright, Australian |
OpenAI TTS voices:
| Voice | ID | Style |
|---|---|---|
| Alloy (default) | alloy | Neutral, balanced |
| Echo | echo | Warm, engaging |
| Fable | fable | Expressive, British |
| Onyx | onyx | Deep, authoritative |
| Nova | nova | Friendly, upbeat |
| Shimmer | shimmer | Clear, gentle |
# Set voice via CLI
tldr config set voice en-US-GuyNeural # edge-tts voice
tldr config set voice nova # openai voice
When you switch TTS providers, the voice automatically resets to the new provider's default if your current voice doesn't belong to the new provider.
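The reset rule for voices can be sketched as follows. This is a minimal illustration under stated assumptions, not tldr's code: `VOICES` and `switch_provider` are hypothetical names, the voice IDs are copied from the tables above, and the first ID in each list stands for that provider's default.

```python
# Voice IDs from the tables above; the first entry is the provider default.
VOICES = {
    "edge-tts": [
        "en-US-JennyNeural",
        "en-US-GuyNeural",
        "en-US-AriaNeural",
        "en-GB-SoniaNeural",
        "en-AU-NatashaNeural",
    ],
    "openai": ["alloy", "echo", "fable", "onyx", "nova", "shimmer"],
}


def switch_provider(config: dict, new_provider: str) -> dict:
    """Return a copy of config using new_provider, resetting the voice
    only when the current voice does not belong to that provider."""
    voices = VOICES[new_provider]
    updated = {**config, "tts-provider": new_provider}
    if updated.get("voice") not in voices:
        updated["voice"] = voices[0]
    return updated
```

Under this sketch, switching from Edge TTS with `en-US-GuyNeural` to OpenAI yields `alloy`, while a voice that already belongs to the target provider is left untouched.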
Speed, Pitch & Volume
# Set via preset editor
tldr preset edit
# Or via CLI
tldr config set tts-speed 1.25 # 25% faster
tldr config set pitch low # deeper, warmer voice
tldr config set pitch high # brighter, more energetic
tldr config set volume loud # more presence
tldr config set volume quiet # softer
Speed works with both providers. Pitch and volume presets apply only to Edge TTS; they are silently ignored when using OpenAI TTS.
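One way to picture "silently ignored" is a filter that drops Edge-only settings before a request goes to any other provider. The helper below is a hypothetical sketch; the setting keys mirror the config names used above, and the filtering approach itself is an assumption about how such behavior could be implemented.

```python
# Settings that, per this page, only Edge TTS honors.
EDGE_ONLY_SETTINGS = {"pitch", "volume"}


def effective_tts_settings(provider: str, settings: dict) -> dict:
    """Return the settings that actually take effect for a provider."""
    if provider == "edge-tts":
        return dict(settings)
    # Other providers: drop Edge-only keys without raising an error.
    return {k: v for k, v in settings.items() if k not in EDGE_ONLY_SETTINGS}
```

With `{"tts-speed": 1.25, "pitch": "low", "volume": "loud"}`, Edge TTS keeps all three values, while OpenAI keeps only `tts-speed`.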
Session output
Audio files are saved alongside summaries in the session output directory:
~/Documents/tldr/
  2026-02-14-how-llms-work/
    summary.md
    audio.mp3
Change the output directory:
tldr config set output-dir ~/summaries
Keyboard shortcuts
| Key | Action |
|---|---|
| a | Generate and play audio |
| s | Stop audio playback |
| Enter | Save summary |
| w | Save with audio |
| q | Exit (single tap after saving, double tap to discard an unsaved summary) |
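The exit rule for q can be expressed as a tiny decision function. This is only a sketch of the behavior described in the table; `should_exit` is a hypothetical name, and counting consecutive presses is an assumed way of modeling the double tap.

```python
def should_exit(saved: bool, quit_presses: int) -> bool:
    """Decide whether the app exits after the latest `q` press.

    saved: True once the summary has been saved.
    quit_presses: consecutive `q` presses so far, including this one.
    """
    # Nothing can be lost after a save, so one press is enough;
    # otherwise a second press confirms discarding the summary.
    required = 1 if saved else 2
    return quit_presses >= required
```

The second press acts as the confirmation step, so unsaved work is never discarded by a single stray keypress.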