Extension points
Every external dependency is a typed Protocol seam — LLM provider, embeddings, listwise judge, OCR, narrative retriever, task queue. Implement the Protocol and inject it; the core imports zero SDKs and runs offline with MockProvider.
RAGSpine's pluggability is not a plugin registry — it is plain structural typing. Each
external dependency is a Python Protocol; the core depends on the abstraction, never on a
vendor SDK. Adding a provider, vector store, reranker, or OCR engine touches one new
file, and every heavy SDK is lazy-imported so the core imports cleanly and runs fully
offline with the deterministic MockProvider.
All seams below are @runtime_checkable Protocols. Your implementation does not need to subclass
anything; it only needs the right method signatures. mypy --strict covers the core, and the
anthropic / openai / sentence-transformers / paddleocr SDKs are imported only inside the
concrete implementations, never in the seam.
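As a minimal sketch of what structural typing means here, the example below uses a made-up Greeter Protocol (not one of RAGSpine's seams) to show that a class satisfies a @runtime_checkable Protocol purely by having the right method. All names in it are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Greeter(Protocol):
    """Hypothetical seam: anything with a greet() method satisfies it."""

    def greet(self, name: str) -> str: ...


class PlainGreeter:
    """Does not subclass Greeter; it only provides the method."""

    def greet(self, name: str) -> str:
        return f"hello, {name}"


class NotAGreeter:
    """Lacks greet(), so it does not satisfy the Protocol."""


def welcome(greeter: Greeter, name: str) -> str:
    # The caller depends on the Protocol, never on a concrete class.
    return greeter.greet(name)


# isinstance() against a runtime_checkable Protocol checks method presence
# only, not signatures; a static checker such as mypy verifies signatures.
is_match = isinstance(PlainGreeter(), Greeter)
is_mismatch = isinstance(NotAGreeter(), Greeter)
message = welcome(PlainGreeter(), "seam")
```

Because the runtime check only looks for the method name, signature mismatches are caught by the type checker rather than by isinstance().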
The seams
LLMProvider (agent/llm_provider.py): send system + messages + tools, get a unified response.
EmbeddingBackend (retrieval/lexical/retrieval.py): batch texts to vectors for the injectable vector channel.
ListwiseJudge (retrieval/rerank/listwise_rerank.py): listwise rerank, returning candidate indices best-first.
OcrBackend (extraction/extractors/pdf_scanned_extractor.py): recognize a single scanned page image.
NarrativeRetriever (agent/agent.py): the narrative retrieval seam injected into the orchestrator.
TaskQueue (service/tasks/task_queue.py): async job queue (FakeQueue in tests, RQQueue in prod).
LLMProvider
src/ragspine/agent/llm_provider.py defines create_message, the single method the agent's
tool-use loop drives. AnthropicProvider lazy-imports the anthropic SDK; MockProvider needs
neither an API key nor network access.
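To illustrate how an offline provider can satisfy this seam, here is a rough sketch of a scripted provider. It is not the real MockProvider. ProviderResponse and LLMProvider are local stand-ins that mirror only the create_message signature and the three response fields used elsewhere on this page; the tool_calls element type and ScriptedProvider itself are assumptions.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class ProviderResponse:
    """Local stand-in mirroring the fields used on this page (an assumption)."""

    text: str
    tool_calls: list[dict[str, object]] = field(default_factory=list)
    stop_reason: str = "end_turn"


class LLMProvider(Protocol):
    """Stand-in for the seam: one keyword-only create_message method."""

    def create_message(
        self,
        *,
        system: str,
        messages: list[dict[str, object]],
        tools: list[dict[str, object]],
    ) -> ProviderResponse: ...


class ScriptedProvider:
    """Hypothetical offline provider: replays canned replies in order."""

    def __init__(self, replies: list[str]) -> None:
        self._replies = list(replies)
        self.calls = 0

    def create_message(
        self,
        *,
        system: str,
        messages: list[dict[str, object]],
        tools: list[dict[str, object]],
    ) -> ProviderResponse:
        # No key, no network: the reply depends only on the call count.
        # Once the script runs out, the last reply is repeated.
        reply = self._replies[min(self.calls, len(self._replies) - 1)]
        self.calls += 1
        return ProviderResponse(text=reply)


provider: LLMProvider = ScriptedProvider(["first", "second"])
first = provider.create_message(system="s", messages=[], tools=[])
second = provider.create_message(system="s", messages=[], tools=[])
third = provider.create_message(system="s", messages=[], tools=[])
```

A scripted provider like this keeps agent-loop tests deterministic, since the output never depends on a model.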
EmbeddingBackend
src/ragspine/retrieval/lexical/retrieval.py is the dependency-injection point for the
vector channel. By default no backend is injected and retrieval is pure BM25; inject an
EmbeddingBackend to add a vector channel.
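As a sketch of what an injectable backend could look like, the example below implements a toy hashed bag-of-words embedder that runs offline. The method name embed, its signature, and the HashingEmbedder class are assumptions made for illustration; check the Protocol in retrieval.py for the real method.

```python
import hashlib
import math
from typing import Protocol


class EmbeddingBackend(Protocol):
    """Stand-in seam; the method name `embed` is an assumption."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class HashingEmbedder:
    """Hypothetical offline backend: hashed bag-of-words, L2-normalised."""

    def __init__(self, dim: int = 64) -> None:
        self._dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self._dim
            for token in text.lower().split():
                # A stable hash picks the bucket, so output is deterministic.
                digest = hashlib.sha256(token.encode("utf-8")).digest()
                vec[int.from_bytes(digest[:4], "big") % self._dim] += 1.0
            norm = math.sqrt(sum(v * v for v in vec)) or 1.0
            vectors.append([v / norm for v in vec])
        return vectors


def cosine(a: list[float], b: list[float]) -> float:
    # Vectors from HashingEmbedder are unit-length (or all-zero), so the
    # dot product equals the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


backend: EmbeddingBackend = HashingEmbedder()
query_vec, reordered_vec = backend.embed(["revenue growth", "growth revenue"])
empty_vec = backend.embed([""])[0]
```

A toy backend like this is only useful for wiring tests; a real one would wrap an embedding model and import its SDK lazily.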
ListwiseJudge
src/ragspine/retrieval/rerank/listwise_rerank.py is the optional LLM listwise reranker
seam. The real implementation is Claude (via build_listwise_prompt /
parse_listwise_response); tests use a deterministic fake. If no judge is injected, results
keep their RRF (reciprocal rank fusion) order.
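The optional-judge behaviour can be sketched as follows. The method name rank, the rerank helper, and the LongestFirstJudge fake are hypothetical, and the permutation guard is an added assumption rather than documented behaviour.

```python
from typing import Optional, Protocol


class ListwiseJudge(Protocol):
    """Stand-in seam; the method name `rank` is an assumption."""

    def rank(self, query: str, candidates: list[str]) -> list[int]: ...


class LongestFirstJudge:
    """Hypothetical deterministic fake: longer candidates rank higher."""

    def rank(self, query: str, candidates: list[str]) -> list[int]:
        # Return candidate indices best-first; ties keep the original order.
        return sorted(
            range(len(candidates)), key=lambda i: (-len(candidates[i]), i)
        )


def rerank(
    query: str, fused: list[str], judge: Optional[ListwiseJudge] = None
) -> list[str]:
    """Reorder `fused` (already in RRF order) if a judge is injected."""
    if judge is None:
        return list(fused)  # no judge: keep the fused order
    order = judge.rank(query, fused)
    # Assumed safety net: ignore a ranking that is not a permutation
    # of the candidate indices.
    if sorted(order) != list(range(len(fused))):
        return list(fused)
    return [fused[i] for i in order]


fused = ["bb", "a", "cccc"]
without_judge = rerank("q", fused)
with_judge = rerank("q", fused, LongestFirstJudge())
```

Passing indices rather than reordered documents keeps the judge's output easy to validate before it is applied.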
OcrBackend
src/ragspine/extraction/extractors/pdf_scanned_extractor.py — the scanned-PDF OCR/VLM
seam. The real backend (PaddleOCR) runs on Ubuntu + GPU; logic tests use a fake so the
render → map → threshold → review flow is fully testable on a GPU-less machine.
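To show why a fake backend makes the threshold and review steps testable without a GPU, here is a rough sketch. The recognize method name, the OcrResult shape, the triage helper, and the 0.8 threshold are all assumptions for illustration, not the extractor's actual API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class OcrResult:
    """Hypothetical result shape: recognised text plus a 0..1 confidence."""

    text: str
    confidence: float


class OcrBackend(Protocol):
    """Stand-in seam; the method name `recognize` is an assumption."""

    def recognize(self, page_image: bytes) -> OcrResult: ...


class FakeOcr:
    """Deterministic fake: looks the page bytes up in a canned table."""

    def __init__(self, canned: dict[bytes, OcrResult]) -> None:
        self._canned = canned

    def recognize(self, page_image: bytes) -> OcrResult:
        # Unknown pages come back empty with zero confidence.
        return self._canned.get(page_image, OcrResult(text="", confidence=0.0))


def triage(
    pages: list[bytes], ocr: OcrBackend, threshold: float = 0.8
) -> tuple[dict[int, str], list[int]]:
    """Split pages into accepted text and page indices flagged for review."""
    accepted: dict[int, str] = {}
    review: list[int] = []
    for index, image in enumerate(pages):
        result = ocr.recognize(image)
        if result.confidence >= threshold:
            accepted[index] = result.text
        else:
            review.append(index)
    return accepted, review


ocr = FakeOcr({b"p0": OcrResult("Total: 42", 0.97), b"p1": OcrResult("T0ta1", 0.41)})
accepted, review = triage([b"p0", b"p1", b"p2"], ocr)
```

Here the first page is accepted, while the low-confidence and unknown pages are routed to review.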
NarrativeRetriever
src/ragspine/agent/agent.py — the narrative retrieval implementation injected into
answer_question. Duck-typed; when omitted, the narrative path degrades honestly.
TaskQueue
src/ragspine/service/tasks/task_queue.py — the async job queue. RQQueue (RQ + Redis)
for production, FakeQueue (synchronous inline) for tests. rq / redis are lazy-imported
inside RQQueue only.
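The inline-versus-Redis split and the lazy import can be sketched like this. The enqueue signature, InlineQueue, and RedisBackedQueue are hypothetical stand-ins for the real TaskQueue, FakeQueue, and RQQueue, and the Redis URL is a placeholder; the rq and redis calls shown (Queue(connection=...), Queue.enqueue, Job.id, Redis.from_url) are the libraries' standard entry points.

```python
from typing import Any, Callable, Protocol


class TaskQueue(Protocol):
    """Stand-in seam; `enqueue` returning a job id is an assumption."""

    def enqueue(self, func: Callable[..., Any], *args: Any) -> str: ...


class InlineQueue:
    """Hypothetical FakeQueue-style queue: runs the job synchronously."""

    def __init__(self) -> None:
        self.results: dict[str, Any] = {}

    def enqueue(self, func: Callable[..., Any], *args: Any) -> str:
        job_id = f"job-{len(self.results) + 1}"
        self.results[job_id] = func(*args)  # executed inline, no worker
        return job_id


class RedisBackedQueue:
    """Hypothetical RQQueue-style queue: rq/redis are imported on first use."""

    def __init__(self, url: str = "redis://localhost:6379/0") -> None:
        self._url = url
        self._queue: Any = None

    def enqueue(self, func: Callable[..., Any], *args: Any) -> str:
        if self._queue is None:
            # Lazy import: importing this module or constructing this class
            # does not require rq or redis to be installed.
            from redis import Redis
            from rq import Queue

            self._queue = Queue(connection=Redis.from_url(self._url))
        return self._queue.enqueue(func, *args).id


def word_count(text: str) -> int:
    return len(text.split())


queue = InlineQueue()
job_id = queue.enqueue(word_count, "ingest this report")
```

Because the imports sit inside enqueue, a test suite that only uses the inline queue never touches Redis.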
Implement a Protocol and inject it
Because the seams are structural, you implement the method(s) and pass the instance in — no
registration, no base class. Here is an OpenAI-backed LLMProvider, grounded in the real
create_message signature and ProviderResponse shape:
from openai import OpenAI  # the SDK is imported only in this file, never in the core

from ragspine.agent.llm_provider import ProviderResponse


class OpenAIProvider:
    """A custom LLMProvider: no subclassing, just the create_message method."""

    def __init__(self, model: str = "gpt-4o") -> None:
        self._client = OpenAI()  # reads OPENAI_API_KEY from the environment
        self._model = model

    def create_message(
        self,
        *,
        system: str,
        messages: list[dict[str, object]],
        tools: list[dict[str, object]],
    ) -> ProviderResponse:
        # Text-only sketch: `tools` is not forwarded and no ToolCall objects are
        # built, so tool_calls is always empty. Adapt the message and tool
        # formats to the OpenAI schema as needed.
        resp = self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "system", "content": system}, *messages],
        )
        text = resp.choices[0].message.content or ""
        return ProviderResponse(text=text, tool_calls=[], stop_reason="end_turn")

Inject it exactly where MockProvider would go:
from ragspine.agent.agent import answer_question
from ragspine.storage.fact_store import FactStore

from my_openai_provider import OpenAIProvider

store = FactStore("data/fact_metric.db")
store.init_schema()
result = answer_question("...", store, OpenAIProvider())

The structured channel's anti-fabrication guard does not trust provider prose for the number: a found fact is deterministically rendered from the fact value, and a no-fact result is rewritten to "not found" regardless of model output. Swapping the provider cannot defeat the guard.
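The idea behind the guard can be sketched in a few lines. The render_answer function, its parameters, and the output format are hypothetical and only illustrate the principle; they are not RAGSpine's implementation.

```python
from typing import Optional


def render_answer(
    metric: str, fact_value: Optional[float], provider_text: str
) -> str:
    """Hypothetical guard: the number comes from the fact, never from prose."""
    # provider_text is deliberately never read when producing the number.
    if fact_value is None:
        # No stored fact: the provider's wording is discarded entirely.
        return f"{metric}: not found"
    # A stored fact: render the value deterministically from the store.
    return f"{metric}: {fact_value:g}"


hallucinated = render_answer("revenue", None, "Revenue was 9.9 billion.")
grounded = render_answer("revenue", 12.5, "Revenue was 9.9 billion.")
```

In both calls the provider claims 9.9 billion, yet neither result contains that figure, which is why the choice of provider does not affect the reported number.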