{"query":"LLM Operator Fundamentals: Temperature, Attention, Caching","count":7,"results":[{"slug":"llm-operator-fundamentals","title":"LLM Operator Fundamentals: Temperature, Attention, Caching","kind":"essay","summary":"The three knobs that actually change what an LLM system ships — temperature, attention design, and prompt caching — explained at the operator level, not the research level.","compact_summary":"Temperature controls determinism and should be set per task, not left at the API default. Attention is a finite budget that structured prompts and focused retrieval spend well. Prompt caching has specific semantics — cache boundaries, breakpoints, and prefix stability — that decide whether an agent costs cents or dollars per session.","confidence":"high","updated_at":"2026-04-19T00:00:00.000Z","score":300,"match_fields":["title","summary","compact_summary","tags","key_claims","section_map","body"]},{"slug":"llm-inference-mental-model","title":"LLM Inference: A Working Mental Model","kind":"essay","summary":"A compact mental model for how LLM inference actually runs — two phases, a growing memory system, and a scheduler deciding who gets GPU time next — so that cost, latency, and hardware decisions stop being magic.","compact_summary":"Inference is a loop with two phases. Prefill is compute-bound, decode is memory-bandwidth-bound. KV cache removes redundant recomputation but grows with context and users. Continuous batching, PagedAttention, and composed parallelism are what made production serving economical. Agent workloads are decode-dominated and gain most from prefix caching and batching.","confidence":"high","updated_at":"2026-04-19T00:00:00.000Z","score":110,"match_fields":["title","summary","compact_summary","tags","key_claims","section_map","body"]},{"slug":"agent-seo-and-discovery","title":"Agent SEO and Discovery","kind":"essay","summary":"The next discovery problem is not only how humans find a page, but how agents discover, trust, and route to it.","compact_summary":"Agent SEO is about more than indexing: it is the practice of making pages discoverable, inspectable, trustworthy, and easy to route through for machine readers using discovery files, search endpoints, compact summaries, and explicit metadata.","confidence":"high","updated_at":"2026-04-10T00:00:00.000Z","score":28,"match_fields":["tags","key_claims","body"]},{"slug":"memory-consolidation-and-sleep-loops","title":"Memory Consolidation and Sleep Loops for AI","kind":"essay","summary":"Why AI memory systems should consolidate, compress, and strengthen knowledge over time instead of storing everything statically, and how periodic sleep loops make that practical.","compact_summary":"Static retrieval is not memory. Real memory consolidates: it reviews what changed, promotes what matters, compresses what does not need detail, and emits a record of what changed and why. AI systems should aspire to that lifecycle.","confidence":"medium","updated_at":"2026-04-11T00:00:00.000Z","score":4,"match_fields":["body"]},{"slug":"context-lifecycle-for-ai-systems","title":"Context Lifecycle for AI Systems","kind":"essay","summary":"Why good AI systems should not treat context as one giant blob, and why summary, consolidation, and drill-down layers matter.","compact_summary":"Context should behave more like a lifecycle than a dump: short-term working state, compact summaries, longer-term memory, and periodic consolidation each serve different jobs and should not be collapsed into one context window.","confidence":"high","updated_at":"2026-04-10T00:00:00.000Z","score":4,"match_fields":["body"]},{"slug":"changelog","title":"civ.build Changelog","kind":"changelog","summary":"Public product and content changes for the agent-readable surface.","compact_summary":"The site moved from a minimal manifesto prototype toward a richer public knowledge contract with compact retrieval, discovery files, search, version traces, and structured trust metadata.","confidence":"high","updated_at":"2026-04-10T00:00:00.000Z","score":4,"match_fields":["body"]},{"slug":"index","title":"civ.build","kind":"manifesto","summary":"A public knowledge contract where serious pages are written once, rendered for humans, and exposed to agents through explicit compact and full retrieval layers.","compact_summary":"civ.build is not a generic AI site. It is a markdown-first publishing surface where pages expose summaries, trust signals, provenance, and queryable endpoints for both human and agent readers.","confidence":"high","updated_at":"2026-04-10T00:00:00.000Z","score":4,"match_fields":["body"]}]}