Channels

The structured channel (function-calling over the fact store), the narrative channel (hybrid retrieve → listwise rerank → cited synthesis), and the composite path that runs both and merges.

RAGSpine answers two fundamentally different kinds of question with two fundamentally different mechanisms, and the agent's router decides which one — or both — a question needs. This page describes the internal pipeline of each channel; for the why and the routing decision, see the Dual-channel concept.

Structured channel

“What’s the number?” — function-calling over a fact table. Deterministic; the value is never synthesized by the model.

Narrative channel

“Why / what happened?” — hybrid retrieval over document chunks, optional rerank, then LLM synthesis with forced citations.

Composite

Both at once — run the structured path, then append a narrative attribution section, and concatenate sources.

Structured channel

The structured channel's only fact-producing primitive is the query_metric tool (agent/query_tools.py), driven through the tool-use loop in agent/agent.py (_run_tool_loop, capped at MAX_TOOL_ITERATIONS = 5). Its execution function normalizes every parameter through the glossary, queries the fact_metric store, and returns one of three statuses — never a guess:

found. The exact row exists. Returns the value, unit, every controlled dimension code, and full lineage. The agent then rebuilds the answer line from this value rather than trusting the model's prose for the number.

{
  "status": "found",
  "value": 1320,
  "unit": "USD_M",
  "metric_code": "REVENUE",
  "entity": "ACME_CN",
  "period_type": "FY",
  "period": "2024",
  "channel": "TOTAL",
  "source": { "doc": "ACME_FY2024_Review.pptx", "locator": "slide=2,table=1,row=REVENUE,col=FY2024" }
}

not_found. Every parameter normalized, but the fact table has no matching row. There is no interpolation and no inference; the agent rewrites this result into an honest refusal.

{
  "status": "not_found",
  "normalized": { "metric_code": "REVENUE", "entity": "ACME_CN", "period": "2025", "channel": "TOTAL" }
}

unrecognized_param. A parameter could not be normalized to a controlled code; the glossary returns None rather than guessing. The response names the offending parameter and its raw value.

{ "status": "unrecognized_param", "param": "entity", "raw": "some unknown company" }

The glossary normalizers (normalize_metric / normalize_entity / normalize_period) return None on anything they don't recognize, and that None becomes unrecognized_param — it is never coerced into a best-guess code. A present-but-unknown channel is passed through literally, so a typo'd channel yields not_found rather than a silent fallback to TOTAL.
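The three-status contract can be sketched as follows. This is a minimal illustration, not the code in agent/query_tools.py: the glossary tables, the in-memory fact table, the function signature, and the `param` names are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical glossary tables and fact rows, for illustration only.
METRICS = {"revenue": "REVENUE", "sales": "REVENUE"}
ENTITIES = {"acme china": "ACME_CN"}
FACTS = {
    ("REVENUE", "ACME_CN", "2024", "TOTAL"): {"value": 1320, "unit": "USD_M"},
}


def lookup(raw: str, table: dict) -> Optional[str]:
    """Return the controlled code, or None when the term is unknown."""
    return table.get(raw.strip().lower())


def normalize_period(raw: str) -> Optional[str]:
    """This sketch accepts only a four-digit year."""
    text = raw.strip()
    return text if len(text) == 4 and text.isdigit() else None


def query_metric(metric: str, entity: str, period: str, channel: str = "TOTAL") -> dict:
    checks = (
        ("metric", "metric_code", metric, lambda r: lookup(r, METRICS)),
        ("entity", "entity", entity, lambda r: lookup(r, ENTITIES)),
        ("period", "period", period, normalize_period),
    )
    normalized = {}
    for param, key, raw, normalizer in checks:
        code = normalizer(raw)
        if code is None:
            # Unknown term: name the parameter, never substitute a best guess.
            return {"status": "unrecognized_param", "param": param, "raw": raw}
        normalized[key] = code
    # The channel is passed through literally, so a typo cannot fall back to TOTAL.
    normalized["channel"] = channel
    row = FACTS.get(tuple(normalized[k] for k in ("metric_code", "entity", "period", "channel")))
    if row is None:
        return {"status": "not_found", "normalized": normalized}
    return {"status": "found", **row, **normalized}
```

In this sketch a misspelled channel such as "TOTL" simply misses the fact table and yields not_found, which mirrors the pass-through behavior described above.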

When the user explicitly lists several metrics / entities / periods, the structured side expands into sub-tasks executed deterministically without the LLM (_multi_subtask_answer) — and for exactly two comparable periods, the agent computes the difference itself.
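A sketch of the two-period difference, computed in plain code from two sub-task results. The helper name `period_difference`, the result shape, and the definition of "comparable" used here (same metric, entity, period type, and unit) are assumptions, not the behavior of _multi_subtask_answer.

```python
from typing import Optional


def period_difference(results: list) -> Optional[dict]:
    """Return later-minus-earlier for exactly two comparable found rows."""
    if len(results) != 2 or any(r.get("status") != "found" for r in results):
        return None
    earlier, later = sorted(results, key=lambda r: r["period"])
    same_series = all(
        earlier.get(k) == later.get(k)
        for k in ("metric_code", "entity", "period_type", "unit")
    )
    if not same_series or earlier["period"] == later["period"]:
        return None
    return {
        "from": earlier["period"],
        "to": later["period"],
        "delta": later["value"] - earlier["value"],  # arithmetic done in code, not by the model
        "unit": earlier["unit"],
    }
```

The point of the design is that the subtraction happens outside the LLM, so the difference is as deterministic as the two values it is built from.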

Narrative channel

When the route is narrative, the agent calls an injected NarrativeRetriever (_run_narrative). The default chain (retrieval/) runs in a fixed order:

Metadata pre-filter. Before any scoring, candidate chunks are filtered by metadata (topic / entity / geography / period / language). For the versioned chunk store this filter is pushed down to storage (ChunkStore.iter_chunks, active versions only).

Glossary multi-query rewrite. GlossaryQueryRewriter (a QueryRewriter Protocol impl) expands the query into variants — substituting recognized terms with their controlled code, then with sibling synonyms sharing that code. Pure-rule, zero LLM.
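A rule-based rewrite of this kind might look like the sketch below. The glossary contents and the `rewrite` function are hypothetical, and plain substring matching stands in for whatever term recognition GlossaryQueryRewriter actually performs.

```python
# Hypothetical glossary: surface term -> controlled code.
GLOSSARY = {"revenue": "REVENUE", "sales": "REVENUE", "turnover": "REVENUE"}


def rewrite(query: str) -> list:
    """Expand a query into variants: original, controlled code, sibling synonyms."""
    variants = [query]
    lowered = query.lower()
    for term, code in GLOSSARY.items():
        start = lowered.find(term)
        if start < 0:
            continue
        end = start + len(term)
        siblings = sorted(t for t, c in GLOSSARY.items() if c == code and t != term)
        # First the controlled code, then every other term sharing that code.
        for replacement in [code] + siblings:
            candidate = query[:start] + replacement + query[end:]
            if candidate not in variants:
                variants.append(candidate)
    return variants
```

Each variant is then retrieved independently, so a document that says "turnover" can still be found by a query that says "revenue".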

Hybrid retrieve. For each rewritten query, CJK-aware Okapi BM25 (unigram + adjacent bigram tokenization) produces a ranking; if and only if an EmbeddingBackend is injected, a cosine-similarity vector ranking is added too. By default no backend is injected, so retrieval is pure BM25.
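The unigram plus adjacent bigram tokenization can be illustrated as below. This is an assumed scheme, not the project's tokenizer: it treats only the CJK Unified Ideographs range as CJK and keeps lowercase ASCII word runs whole.

```python
import re

# One alternation: a run of CJK ideographs, or a run of ASCII letters/digits.
_RUN = re.compile(r"[\u4e00-\u9fff]+|[a-z0-9]+")


def tokenize(text: str) -> list:
    tokens = []
    for match in _RUN.finditer(text.lower()):
        run = match.group()
        if "\u4e00" <= run[0] <= "\u9fff":
            tokens.extend(run)  # every ideograph as a unigram
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))  # adjacent bigrams
        else:
            tokens.append(run)  # non-CJK words stay whole
    return tokens
```

Bigrams give BM25 something closer to word-level matching in text that has no whitespace between words, while unigrams keep single-character terms retrievable.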

RRF fusion. All rankings are merged with Reciprocal Rank Fusion (rrf_fuse) at the standard constant DEFAULT_RRF_K = 60, sorted by (-fused_score, chunk_id) for a deterministic tie-break, and truncated to top_k (default 50).
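Reciprocal Rank Fusion scores each chunk as the sum of 1 / (k + rank) over every ranking it appears in. The sketch below follows the constant, sort key, and truncation described above; the function signature and the 1-based rank are assumptions about rrf_fuse rather than its confirmed interface.

```python
DEFAULT_RRF_K = 60


def rrf_fuse(rankings, k=DEFAULT_RRF_K, top_k=50):
    """Fuse ranked lists of chunk ids into one list of (chunk_id, score)."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; chunk_id breaks ties so the order is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]
```

Because the score depends only on ranks, BM25 scores and cosine similarities never need to be put on a common scale before merging.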

Listwise rerank. An optional LLM listwise judge (listwise_rerank, behind the ListwiseJudge Protocol) reorders the top candidates. On any judge exception or invalid output it falls back to the RRF order — it never raises.
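The never-raises contract can be sketched like this. The judge is modeled as a plain callable, and "invalid output" is assumed to mean anything that is not a permutation of the unique candidate ids; the real ListwiseJudge Protocol and its validation rules may differ.

```python
def listwise_rerank(query, candidates, judge):
    """Return the judge's order, or the incoming RRF order on any failure."""
    fallback = list(candidates)
    try:
        order = list(judge(query, list(candidates)))
        # Valid only if the judge returned exactly the same ids, each once.
        valid = len(order) == len(fallback) and set(order) == set(fallback)
    except Exception:
        return fallback
    return order if valid else fallback
```

Validation sits inside the try block on purpose, so malformed judge output (for example unhashable items) also falls back instead of raising.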

Synthesize with citations. The LLM answers only from the supplied snippets, and the agent force-appends any source document name the model failed to cite.
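Forced citation can be approximated with a post-processing step like the one below. The function name, the substring check for "already cited", and the "Sources:" label are all hypothetical choices for this sketch.

```python
def force_cite(answer: str, source_docs: list) -> str:
    """Append any source document name that the answer text does not mention."""
    missing = []
    for doc in source_docs:
        if doc not in answer and doc not in missing:
            missing.append(doc)
    if not missing:
        return answer
    return answer.rstrip() + "\n\nSources: " + ", ".join(missing)
```

The step runs after synthesis and does not depend on the model's cooperation, so every snippet's document appears in the output even when the model omits it.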

RESTRICTED isolation. Sensitivity-RESTRICTED content is filtered at two exits before it can reach a prompt: the rerank layer keeps RESTRICTED chunks out of the judge prompt, and the link adapter (NarrativeIndexRetriever.retrieve) physically drops them from the returned snippets before synthesis. Both checks use the same case-insensitive RESTRICTED test — see RESTRICTED isolation.
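One shared predicate applied at both exits keeps the two checks from drifting apart. In this sketch the chunk is a dict and the field name `sensitivity` is an assumption; only the case-insensitive comparison comes from the description above.

```python
def is_restricted(chunk: dict) -> bool:
    """Case-insensitive test for the RESTRICTED sensitivity label."""
    return str(chunk.get("sensitivity") or "").upper() == "RESTRICTED"


def drop_restricted(chunks: list) -> list:
    """Remove RESTRICTED chunks before they can be placed in any prompt."""
    return [chunk for chunk in chunks if not is_restricted(chunk)]
```

Calling `drop_restricted` both before building the judge prompt and again before synthesis means a chunk that slips past one layer is still stopped by the other.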

If no retriever is wired, or retrieval returns nothing, the narrative channel degrades honestly ("not retrieved / not yet wired") rather than inventing an answer.

Composite path — run both, compare, merge

For a composite question (a recognized metric plus a narrative cue, e.g. "why did revenue fall?"), the agent runs the structured path first, then runs the narrative path and appends its answer under the literal 归因分析 ("attribution analysis") header:

<structured answer, with the number(s) + lineage>

归因分析：
<narrative answer, synthesized from cited snippets>

Sources from both channels are concatenated. The two channels never blur into a single "ask the model" path — the structured side remains the only thing allowed to produce a numeric fact, and it does so without trusting model prose.
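The merge itself is plain string and list concatenation. In the sketch below each channel result is assumed to be a dict with `text` and `sources` keys, which is an illustrative shape rather than the agent's actual return type.

```python
ATTRIBUTION_HEADER = "归因分析："  # literal header, "attribution analysis"


def merge_composite(structured: dict, narrative: dict) -> dict:
    """Append the narrative answer under the header and concatenate sources."""
    text = f"{structured['text']}\n\n{ATTRIBUTION_HEADER}\n{narrative['text']}"
    # The structured text is reused verbatim; the number is never re-generated here.
    return {"text": text, "sources": structured["sources"] + narrative["sources"]}
```

Nothing in the merge rewrites the structured answer, which is what keeps the numeric fact under the fact store's control.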

At a glance

| | Structured | Narrative | Composite |
| --- | --- | --- | --- |
| Answers | "what's the number" | "why / what happened" | both |
| Backing store | `fact_metric` (numeric) | versioned chunk store | both |
| Mechanism | `query_metric` function-calling | hybrid retrieve → rerank → synthesize | run both, append `归因分析` |
| Model trust | none; value from fact store | prose trusted, citations forced | per-path |
| Failure mode | `not_found` / `unrecognized_param` | honest degrade if unwired/empty | per-path |
