Skip to content

Search is only available in production builds. Try building and previewing the site to test it out locally.

HTTP API Reference

Base URL

All endpoints are served from the Arfniia Router instance, typically http://<host>:5525.
The API is split into control plane endpoints for router configuration and runtime endpoints for inference, learning, and observability.

Control Plane

POST /v1/routers

  • Creates a router definition. Payload must satisfy the ArfniiaRouter schema: router name, at least one entry in base_models or provisioned_throughputs, an embedding model, feedback configuration, feedback_cost_weights that sum to 1, and optional training overrides.
  • Returns the stored router definition or 409 if the name already exists.
{
"name": "advanced-reasoning",
"base_models": [
"anthropic.claude-3-5-haiku-20241022-v1:0",
"us.anthropic.claude-sonnet-4-20250514-v1:0"
],
"embedding": "amazon.titan-embed-text-v2:0",
"training": {
"num_of_steps": 5,
"batch_size": 16,
"context_cache_similarity": 0.95,
"exploration_level": "low"
},
"feedback": {
"goal": "max",
"min_value": 0,
"max_value": 1
},
"feedback_cost_weights": [1.0, 0.0]
}

GET /v1/routers/{name}

  • Fetches the stored router definition. Returns 404 if the router is missing.

PATCH /v1/routers/{name}

  • Partially updates an existing router. Any supplied fields are merged with the current definition; unspecified fields keep their existing values. Name changes are rejected with 400.
  • Returns the updated definition and refreshes in-memory caches.

DELETE /v1/routers/{name}

  • Removes a router and clears associated caches. Returns 204 on success or 404 if not found.

Runtime Inference

POST /v1/chat/completions

  • Accepts the standard OpenAI-compatible chat completions payload. When model equals a router name, the router embeds the request, selects a base model (or provisioned throughput), invokes Bedrock, and schedules learning.
  • When model matches a Bedrock model ID instead of a router name, the call is proxied to Bedrock without learning.
  • Optional request headers:
    • X-Arfniia-Disable-Learning: truthy value (true, 1, yes, on) skips learning for this call.
    • X-Arfniia-Episode-Id, X-Arfniia-Episode-Start, X-Arfniia-Episode-End: mark episodic rollouts; omit when each prompt is independent.
    • X-Arfniia-Feature-*: attach runtime features (see custom features). Headers are case-insensitive; values are parsed as floats, booleans, or categorical tokens.
  • Response mirrors the upstream LLM payload (id, choices, usage, etc.). 500 is returned if routing or downstream inference fails.

GET /v1/routers/{router_name}/explanations/{response_id}

  • Retrieves the explanation blob saved for the given router response (chosen_model, Q-value deltas, cache membership). Returns 404 when the response has no stored explanation.

Feedback APIs

PUT /v1/feedbacks/{router_name}/sparse/{feedback_value}

  • Stores delayed or aggregated KPI feedback (e.g., conversion rate). The latest value is mapped to router_name/sparse and mirrored under an aggregate key.

PUT /v1/feedbacks/{router_name}/{feedback_name}/{feedback_value}

  • Records immediate feedback keyed by feedback_name (commonly the response id). Use this endpoint to reward or penalize individual responses.

GET /v1/feedbacks/{router_name}

  • Returns the most recent feedback bundle for the router, combining sparse and per-response entries. Responds with 404 if no feedback has been recorded.

Observability

GET /metrics

  • Exposes Prometheus metrics for router latency, downstream LLM usage, token counts, exploration rates, and learning statistics.

Error Handling

  • 400: validation error (missing model, unsupported Bedrock identifier, weight sum mismatch).
  • 404: router or feedback record not found.
  • 409: router name conflict on create.
  • 500: unexpected runtime or downstream provider failure.

Authentication

Arfniia Router relies on the surrounding network perimeter. If you need authentication, terminate TLS and enforce headers at your ingress; the router itself does not ship with built-in auth.