Episodic Routing
Why Episodes Matter
Some applications evaluate success only after a series of turns: tool-executing agents, troubleshooting flows, tutoring sessions, or coding assistants that gather requirements before delivering a solution. Episodic routing lets Arfniia accumulate the intermediate observations and assign the final reward to the whole trajectory.
Headers to Set
Add the following headers when calling /v1/chat/completions:
- X-Arfniia-Episode-Id: stable identifier for the session (UUID recommended).
- X-Arfniia-Episode-Start: truthy value on the first turn.
- X-Arfniia-Episode-End: truthy value on the last turn.
Intermediate turns only need the episode ID. The router drops these headers before invoking Bedrock.
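As a minimal sketch of the header rules above, the helper below builds the header set for a single turn. The function name `episode_headers` and the choice of the string "true" as the truthy value are illustrative assumptions; only the three header names come from this page.

```python
import uuid


def episode_headers(episode_id: str, *, start: bool = False, end: bool = False) -> dict[str, str]:
    """Build the episodic routing headers for one turn (hypothetical helper)."""
    # Every turn of an episode carries the episode ID.
    headers = {"X-Arfniia-Episode-Id": episode_id}
    # "true" is assumed to count as a truthy value; the table on this page uses it.
    if start:
        headers["X-Arfniia-Episode-Start"] = "true"
    if end:
        headers["X-Arfniia-Episode-End"] = "true"
    return headers


episode_id = str(uuid.uuid4())  # UUID, as recommended above
first = episode_headers(episode_id, start=True)
middle = episode_headers(episode_id)
last = episode_headers(episode_id, end=True)
```

The resulting dictionaries can be merged into whatever headers your HTTP client already sends to /v1/chat/completions.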
Example Timeline
| Turn | Headers | Notes |
|---|---|---|
| 1 | X-Arfniia-Episode-Id=123, X-Arfniia-Episode-Start=true | Capture initial prompt and action |
| 2 | X-Arfniia-Episode-Id=123 | Feedback from turn 1 is pending |
| 3 | X-Arfniia-Episode-Id=123, X-Arfniia-Episode-End=true | Final answer generated |
After the user or system submits feedback (either a sparse KPI or per-response feedback), the learner stitches the cached observations together and updates the policy.
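The timeline can be reproduced programmatically. The sketch below yields the headers for each turn of an episode with a known length; the generator name `timeline` is hypothetical. For a one-turn episode it sets both markers on the same request, which is an assumption this page does not confirm.

```python
from typing import Iterator


def timeline(episode_id: str, n_turns: int) -> Iterator[tuple[int, dict[str, str]]]:
    """Yield (turn number, headers) for each turn of an episode (hypothetical helper)."""
    if n_turns < 1:
        raise ValueError("an episode needs at least one turn")
    for turn in range(1, n_turns + 1):
        headers = {"X-Arfniia-Episode-Id": episode_id}
        if turn == 1:
            headers["X-Arfniia-Episode-Start"] = "true"
        # Assumption: a single-turn episode carries both Start and End.
        if turn == n_turns:
            headers["X-Arfniia-Episode-End"] = "true"
        yield turn, headers


# Print the header set for each turn of a three-turn episode.
for turn, headers in timeline("123", 3):
    print(turn, headers)
```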
Providing Feedback
- Send a single sparse KPI once the episode outcome is known: PUT /v1/feedbacks/<router>/sparse/<value>.
- Optionally submit per-turn feedback keyed by response IDs if you want faster corrections.
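A sparse KPI call might be prepared as follows, using only the Python standard library. The base URL, the router name `support-router`, and the assumption that the value is a plain number are all placeholders; only the path pattern comes from this page. The request is built but not sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical deployment address


def sparse_feedback_request(router: str, value: float, base_url: str = BASE_URL) -> Request:
    """Build a PUT request for /v1/feedbacks/<router>/sparse/<value> (hypothetical helper)."""
    # Percent-encode the router name so it is safe as a single path segment.
    url = f"{base_url}/v1/feedbacks/{quote(router, safe='')}/sparse/{value}"
    return Request(url, method="PUT")


req = sparse_feedback_request("support-router", 0.8)
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the URL construction easy to check before anything reaches the router.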
Operational Tips
- Keep episode IDs unique per router; stale caches are cleared automatically when Episode-Start is received.
- If the process aborts early, send Episode-End with the latest turn so the replay buffer is flushed.
- Combine episodic signals with custom features (e.g. user sentiment or dialog depth) to give the learner richer state.
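One way to avoid forgetting Episode-End on an early abort is to wrap the episode in a context manager. The sketch below is an illustration, not part of Arfniia: the `Episode` class and the injected `send` callable are hypothetical. It also assumes that re-sending the latest turn's payload with the end header is an acceptable way to close the episode, which this page does not specify.

```python
from typing import Callable, Optional

# send(payload, headers) stands in for your HTTP client call (hypothetical).
Send = Callable[[dict, dict[str, str]], None]


class Episode:
    """Tracks one episode and closes it if the block exits early (illustrative only)."""

    def __init__(self, episode_id: str, send: Send) -> None:
        self.episode_id = episode_id
        self._send = send
        self._started = False
        self._ended = False
        self._last_payload: Optional[dict] = None

    def turn(self, payload: dict, *, last: bool = False) -> None:
        headers = {"X-Arfniia-Episode-Id": self.episode_id}
        if not self._started:
            headers["X-Arfniia-Episode-Start"] = "true"
        if last:
            headers["X-Arfniia-Episode-End"] = "true"
        self._send(payload, headers)
        # State is updated only after a successful send.
        self._started = True
        self._ended = last
        self._last_payload = payload

    def __enter__(self) -> "Episode":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Assumption: repeating the latest payload with Episode-End closes the episode.
        if self._started and not self._ended:
            assert self._last_payload is not None
            self._send(
                self._last_payload,
                {"X-Arfniia-Episode-Id": self.episode_id, "X-Arfniia-Episode-End": "true"},
            )
            self._ended = True
        return False  # never swallow the original exception


sent: list[tuple[dict, dict[str, str]]] = []
try:
    with Episode("123", lambda payload, headers: sent.append((payload, headers))) as ep:
        ep.turn({"turn": 1})
        raise RuntimeError("tool crashed")  # simulated early abort
except RuntimeError:
    pass
```

In the usage above, `sent` ends up holding the opening turn followed by a closing call that carries the end header.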