{"slug":"llm-operator-fundamentals","versions":[{"slug":"llm-operator-fundamentals","version":"0.1.0","updated_at":"2026-04-19T00:00:00.000Z","saved_at":"2026-04-22T11:29:26.905Z","change_summary":"Initial public version.","summary":"The three knobs that actually change what an LLM system ships — temperature, attention design, and prompt caching — explained at the operator level, not the research level.","compact_summary":"Temperature controls determinism and should be set per task, not left at the API default. Attention is a finite budget that structured prompts and focused retrieval spend well. Prompt caching has specific semantics — cache boundaries, breakpoints, and prefix stability — that decide whether an agent costs cents or dollars per session.","kind":"essay","confidence":"high","estimated_tokens":1643,"word_count":1217,"content_hash":"f7744ac7ec2f0cdce7558e8f3d4a2b922d922213772241f63ccc8ba0687f6b1f"}]}