System Architecture

CiteMap runs a three-stage pipeline (Miner, Engine, Broadcaster) designed to maximize signal and minimize noise in the content it serves to LLMs.

1. The Miner (Ingest)

The Miner runs either as a headless browser service (Playwright) for URL input or as a static file parser for local Markdown.

  • Input: URL list or Local Markdown directory.
  • Operation: It traverses the content, identifying "Integration Nodes" (e.g., "How to use X with Y").
  • Output: Raw Content Blocks.
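
As a rough illustration of the static-parser path, the sketch below walks a Markdown directory and collects sections whose heading looks like "How to use X with Y". The names `RawContentBlock`, `mine_markdown`, and `mine_directory`, and the heading heuristic itself, are assumptions made for this example and not CiteMap's actual API. The sketch also ignores fenced code blocks, so a `#` comment inside a fence would be read as a heading.

```python
import re
from dataclasses import dataclass
from pathlib import Path

# Any ATX-style Markdown heading, capturing its text.
HEADING = re.compile(r"^#{1,6}\s+(.+?)\s*$")
# Hypothetical heuristic for an "Integration Node" heading.
INTEGRATION = re.compile(r"^how to use .+ with .+$", re.IGNORECASE)


@dataclass
class RawContentBlock:
    source: str  # file the block came from
    title: str   # heading text of the integration node
    body: str    # text between this heading and the next heading


def mine_markdown(text: str, source: str) -> list[RawContentBlock]:
    """Collect sections whose heading matches the integration heuristic."""
    blocks: list[RawContentBlock] = []
    title = None
    body: list[str] = []
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            # Any heading closes the section currently being collected.
            if title is not None:
                blocks.append(RawContentBlock(source, title, "\n".join(body).strip()))
            text_part = heading.group(1)
            title = text_part if INTEGRATION.match(text_part) else None
            body = []
        elif title is not None:
            body.append(line)
    if title is not None:
        blocks.append(RawContentBlock(source, title, "\n".join(body).strip()))
    return blocks


def mine_directory(root: Path) -> list[RawContentBlock]:
    """Run the miner over every .md file under root."""
    blocks: list[RawContentBlock] = []
    for path in sorted(root.rglob("*.md")):
        blocks.extend(mine_markdown(path.read_text(encoding="utf-8"), str(path)))
    return blocks
```

A URL-based Miner would replace the file read with a rendered page fetch and apply the same kind of filter to the resulting text.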

2. The Engine (Optimize)

The Engine processes the raw blocks into a Semantic Manifest.

  • Token Optimization: Strips HTML tags, CSS classes, and non-content DOM elements.
  • Vectorization: (Optional) Embeds the content for semantic drift detection.
  • Output: manifest.json.
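
The token-optimization step can be sketched with Python's standard `html.parser` module. The example drops all tags and attributes (including CSS classes) and skips the contents of non-content elements. The list of skipped tags, the function names, and the manifest layout (`version`, `entries`, `title`, `content`) are assumptions for illustration; the real manifest.json schema is not specified here. The optional vectorization step is left out.

```python
import json
from html.parser import HTMLParser

# Assumed set of elements that carry no documentation content.
NON_CONTENT_TAGS = {"script", "style", "nav", "header", "footer", "aside", "noscript"}


class TextExtractor(HTMLParser):
    """Keeps visible text only; tags and attributes are discarded."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a non-content element
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in NON_CONTENT_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in NON_CONTENT_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def strip_html(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def build_manifest(raw_blocks: list[dict]) -> str:
    """Turn raw blocks ({'title', 'html'}) into a JSON manifest string."""
    entries = [
        {"title": block["title"], "content": strip_html(block["html"])}
        for block in raw_blocks
    ]
    return json.dumps({"version": 1, "entries": entries}, indent=2)
```

Writing the returned string to manifest.json would complete the stage. Joining text chunks with single spaces flattens paragraph breaks, which is acceptable for a sketch but loses structure a production Engine would likely want to keep.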

3. The Broadcaster (Serve)

The final stage serves the content to agents.

  • /llms.txt: The "robots.txt for agents".
  • /contexts/*.md: The actual Answer Surfaces.
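
One way to picture the Broadcaster is as a static-site writer: it emits one Markdown file per manifest entry under contexts/ and an llms.txt index that links to them. The function names, the slug scheme, and the exact index layout below are assumptions for this sketch, loosely modeled on the Markdown-with-links shape commonly used for llms.txt files, and may differ from what CiteMap actually generates.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def broadcast(entries: list[dict], out_dir: Path, site_name: str = "CiteMap") -> Path:
    """Write contexts/<slug>.md for each entry plus an llms.txt index.

    Each entry is a dict with 'title' and 'content' keys (assumed layout).
    Returns the path of the generated llms.txt.
    """
    contexts = out_dir / "contexts"
    contexts.mkdir(parents=True, exist_ok=True)

    index_lines = [f"# {site_name}", "", "## Contexts", ""]
    for entry in entries:
        slug = slugify(entry["title"])
        page = f"# {entry['title']}\n\n{entry['content']}\n"
        (contexts / f"{slug}.md").write_text(page, encoding="utf-8")
        index_lines.append(f"- [{entry['title']}](/contexts/{slug}.md)")

    llms_path = out_dir / "llms.txt"
    llms_path.write_text("\n".join(index_lines) + "\n", encoding="utf-8")
    return llms_path
```

Because the output is plain files, any static file server can expose it; for local testing, `python -m http.server --directory <out_dir>` is enough.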