Audio

Summaries aren't just read aloud. They're rewritten as audio scripts by the same AI provider you use for summarizing, using a configurable audio mode that controls the persona and structure. The script adapts to your cognitive traits and tone setting. The result sounds like a natural spoken piece, not a screen reader.

How audio works

Three things happen when you press a:

  1. Your summary is sent to the AI provider (same one used for summarizing).
  2. The AI rewrites it as a spoken script using your chosen audio mode. Each mode has a distinct persona, structure, and delivery style.
  3. The script is sent to a TTS voice (Edge TTS or OpenAI) for synthesis and playback.

Audio modes

Audio modes control how the script is structured and delivered:

| Mode | Persona | Best for |
| --- | --- | --- |
| podcast (default) | Conversational host with hooks and transitions | General listening |
| briefing | Analyst delivering concise, numbered facts | Daily catch-ups |
| lecture | Patient teacher building understanding progressively | Deep learning |
| storyteller | Narrator weaving a compelling narrative | Story-driven content |
| study-buddy | Study partner with quizzes and mnemonics | Exam prep, retention |
| calm | Gentle, soothing narrator | Relaxed / bedtime listening |

```bash
tldr config set audio-mode briefing           # Change default
tldr --audio-mode lecture "https://..."        # Override for one run
```
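
A mistyped mode name is easy to miss in a shell script. The sketch below checks a name against the six modes in the table before calling tldr config set audio-mode. The helper names is_audio_mode and set_audio_mode are hypothetical, and how tldr itself reacts to an unknown mode is not documented here.

```bash
# Audio modes listed in the table above.
AUDIO_MODES="podcast briefing lecture storyteller study-buddy calm"

# Succeeds (exit 0) only if $1 is one of the documented modes.
is_audio_mode() {
  local candidate="$1" mode
  for mode in $AUDIO_MODES; do
    [ "$mode" = "$candidate" ] && return 0
  done
  return 1
}

# Hypothetical wrapper: change the default mode only after the name checks out.
set_audio_mode() {
  if is_audio_mode "$1"; then
    tldr config set audio-mode "$1"
  else
    echo "unknown audio mode: $1" >&2
    return 2
  fi
}
```

After sourcing these functions, set_audio_mode briefing runs the documented command, while set_audio_mode radio prints an error and returns 2 without touching the configuration.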

Built-in presets bundle an audio mode with matching style, tone, and voice settings; see Presets.

Cognitive traits

The rewrite also adapts to your cognitive traits. These work with any audio mode:

| Trait | How the audio script changes |
| --- | --- |
| ADHD | Leads with a hook. High energy. Mini-takeaway per segment. |
| Dyslexia | Short punchy sentences. Key terms repeated naturally. |
| Autism | Direct and precise. No idioms or implied meanings. |
| ESL | Common vocabulary. Specialized terms explained inline. |
| Visual thinker | Spatial language. Word pictures. Narrative structure. |

Traits stack: you can enable several at once with tldr preset edit. Your tone setting (casual, professional, academic, eli5) also shapes the script.

Listening to a summary

  1. Summary appears. A bordered Audio panel shows [a] listen · [w] save + audio.
  2. Press a. A spinner shows "Generating audio..." then audio plays.
  3. Press s to stop playback.
  4. Press a again. Cached audio replays instantly, with no regeneration.

Saving with audio

Why save audio? Listen later โ€” on a commute, at the gym, or to revisit without regenerating.

Press w instead of Enter to save both summary.md and audio.mp3. Press Enter to save the summary only.

After saving, you stay on the result view, so you can still copy, chat, re-listen, or re-summarize. The footer shows "Saved" and [q] exits with a single tap (no confirmation needed since nothing will be lost).

TTS Providers

tldr supports two TTS providers:

| Provider | Cost | Quality | Setup |
| --- | --- | --- | --- |
| Edge TTS (default) | Free | Good (Microsoft Neural voices) | None |
| OpenAI TTS | Paid (per-character) | High quality | Requires OPENAI_API_KEY |

```bash
# Switch TTS provider
tldr config set tts-provider openai
tldr config set tts-provider edge-tts   # back to default
```

You can also change the TTS provider in the preset editor (tldr preset edit or /config).

TTS Model

When using OpenAI TTS, you can choose which model to use:

```bash
tldr config set tts-model tts-1-hd
```

| Model | Cost | Notes |
| --- | --- | --- |
| tts-1 (default) | ~$0.01/summary | Faster, lower cost |
| tts-1-hd | ~$0.02/summary | Higher audio quality |
| gpt-4o-mini-tts | ~$0.01/summary | Newest, supports instructions |
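
To get a feel for what a month of listening might cost, the per-summary figures above can be multiplied out. This is a rough sketch only: the table's numbers are approximate, real billing is per character, and the function names cost_per_summary and estimate_cost are hypothetical helpers, not part of tldr.

```bash
# Approximate per-summary cost in USD, copied from the table above.
cost_per_summary() {
  case "$1" in
    tts-1)           echo 0.01 ;;
    tts-1-hd)        echo 0.02 ;;
    gpt-4o-mini-tts) echo 0.01 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# estimate_cost MODEL COUNT
# Prints a rough total in USD with two decimals.
estimate_cost() {
  local unit
  unit=$(cost_per_summary "$1") || return 1
  awk -v u="$unit" -v n="$2" 'BEGIN { printf "%.2f\n", u * n }'
}
```

For example, estimate_cost tts-1-hd 30 prints 0.60, which suggests that even the higher-quality model stays well under a dollar for one summary a day.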

Voices

Each TTS provider has its own set of voices.

Edge TTS voices:

| Voice | ID | Style |
| --- | --- | --- |
| Jenny (default) | en-US-JennyNeural | Friendly, warm |
| Guy | en-US-GuyNeural | Professional, clear |
| Aria | en-US-AriaNeural | Positive, conversational |
| Sonia | en-GB-SoniaNeural | Clear, British |
| Natasha | en-AU-NatashaNeural | Bright, Australian |

OpenAI TTS voices:

| Voice | ID | Style |
| --- | --- | --- |
| Alloy (default) | alloy | Neutral, balanced |
| Echo | echo | Warm, engaging |
| Fable | fable | Expressive, British |
| Onyx | onyx | Deep, authoritative |
| Nova | nova | Friendly, upbeat |
| Shimmer | shimmer | Clear, gentle |

```bash
# Set voice via CLI
tldr config set voice en-US-GuyNeural       # edge-tts voice
tldr config set voice nova                   # openai voice
```

When you switch TTS providers, the voice automatically resets to the new provider's default if your current voice doesn't belong to the new provider.
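
The reset rule can be modelled in a few lines of shell. The sketch below is not tldr's implementation: it only knows the voices listed in the two tables above, and the function names default_voice, voice_belongs_to, and resolve_voice are hypothetical.

```bash
# Default voice per provider, from the tables above.
default_voice() {
  case "$1" in
    edge-tts) echo "en-US-JennyNeural" ;;
    openai)   echo "alloy" ;;
    *)        return 1 ;;
  esac
}

# Succeeds if voice $2 is listed for provider $1 (documented voices only).
voice_belongs_to() {
  case "$1:$2" in
    edge-tts:en-US-JennyNeural|edge-tts:en-US-GuyNeural|edge-tts:en-US-AriaNeural) return 0 ;;
    edge-tts:en-GB-SoniaNeural|edge-tts:en-AU-NatashaNeural) return 0 ;;
    openai:alloy|openai:echo|openai:fable|openai:onyx|openai:nova|openai:shimmer) return 0 ;;
    *) return 1 ;;
  esac
}

# resolve_voice NEW_PROVIDER CURRENT_VOICE
# Prints the voice in effect after the switch: the current voice if it
# belongs to the new provider, otherwise that provider's default.
resolve_voice() {
  if voice_belongs_to "$1" "$2"; then
    echo "$2"
  else
    default_voice "$1"
  fi
}
```

Under this model, switching to openai while en-US-GuyNeural is selected yields alloy, whereas switching to openai with nova already selected keeps nova.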

Speed, Pitch & Volume

```bash
# Set via preset editor
tldr preset edit

# Or via CLI
tldr config set tts-speed 1.25        # 25% faster
tldr config set pitch low             # deeper, warmer voice
tldr config set pitch high            # brighter, more energetic
tldr config set volume loud           # more presence
tldr config set volume quiet          # softer
```

Speed works with both providers. Pitch and volume presets only apply to Edge TTS; they are silently ignored when using OpenAI TTS.
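
Because ignored settings produce no warning, a script that switches providers may want its own check. The following is a minimal sketch of the rule stated above; setting_applies and warn_if_ignored are hypothetical helpers, and the provider names are the values used with tts-provider.

```bash
# setting_applies PROVIDER SETTING
# Exit 0 if the setting affects playback for that provider, per the note above.
# Settings not covered by this section are treated as "does not apply".
setting_applies() {
  case "$2" in
    tts-speed)    return 0 ;;
    pitch|volume) [ "$1" = "edge-tts" ] ;;
    *)            return 1 ;;
  esac
}

# Hypothetical helper: warn before setting a value that would be ignored.
warn_if_ignored() {
  if ! setting_applies "$1" "$2"; then
    echo "note: $2 has no effect with provider $1" >&2
    return 1
  fi
}
```

A wrapper could call warn_if_ignored openai pitch before running tldr config set pitch low, and skip the command when the check fails.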

Session output

Audio files are saved alongside summaries in the session output directory:

~/Documents/tldr/
  2026-02-14-how-llms-work/
    summary.md
    audio.mp3

Change the output directory:

```bash
tldr config set output-dir ~/summaries
```
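
Since each session is a folder containing summary.md and, optionally, audio.mp3, plain shell is enough to find the sessions you can listen to. The sketch below assumes the layout shown above; list_audio_sessions is a hypothetical helper, and its default path mirrors the example directory rather than reading tldr's configuration.

```bash
# list_audio_sessions [DIR]
# Prints the name of each session folder under DIR that contains audio.mp3.
# DIR defaults to the example location shown above.
list_audio_sessions() {
  local root="${1:-$HOME/Documents/tldr}"
  local dir
  for dir in "$root"/*/; do
    # Skip folders saved with Enter (summary only).
    [ -f "${dir}audio.mp3" ] && basename "$dir"
  done
  return 0
}
```

Running list_audio_sessions ~/summaries after changing the output directory lists only the sessions saved with w.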

Keyboard shortcuts

| Key | Action |
| --- | --- |
| a | Generate and play audio |
| s | Stop audio playback |
| Enter | Save summary |
| w | Save with audio |
| q | Exit (single-tap after save, double-tap to discard unsaved) |

Released under the MIT License.