Episodic Routing
Why Episodes Matter
Some applications evaluate success only after a series of turns: tool-executing agents, troubleshooting flows, tutoring sessions, or coding assistants that gather requirements before delivering a solution. Episodic routing lets Arfniia accumulate the intermediate observations and assign the final reward to the whole trajectory.
Headers to Set
Add the following headers when calling /v1/chat/completions:
- X-Arfniia-Episode-Id: stable identifier for the session (UUID recommended).
- X-Arfniia-Episode-Start: truthy value on the first turn.
- X-Arfniia-Episode-End: truthy value on the last turn.
Intermediate turns only need the episode ID. The router drops these headers before invoking Bedrock.
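The header rules above can be sketched as a small helper. This is a minimal illustration, not part of any Arfniia client: the function name `episode_headers` and its `first`/`last` flags are hypothetical, and the string "true" is assumed to count as a truthy value.

```python
import uuid


def episode_headers(episode_id: str, first: bool = False, last: bool = False) -> dict[str, str]:
    """Build the episodic routing headers for a single turn.

    Hypothetical helper: only the header names come from the documentation;
    the value "true" is an assumed truthy encoding.
    """
    headers = {"X-Arfniia-Episode-Id": episode_id}
    if first:
        headers["X-Arfniia-Episode-Start"] = "true"
    if last:
        headers["X-Arfniia-Episode-End"] = "true"
    return headers


# One episode of three turns: start flag, ID only, end flag.
episode_id = str(uuid.uuid4())
for turn, (first, last) in enumerate([(True, False), (False, False), (False, True)], start=1):
    print(turn, episode_headers(episode_id, first=first, last=last))
```

Merge the returned dictionary into whatever headers your HTTP client already sends to /v1/chat/completions.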
Example Timeline
Turn | Headers | Notes |
---|---|---|
1 | X-Arfniia-Episode-Id=123 X-Arfniia-Episode-Start=true | Capture initial prompt and action |
2 | X-Arfniia-Episode-Id=123 | Feedback from turn 1 is pending |
3 | X-Arfniia-Episode-Id=123 X-Arfniia-Episode-End=true | Final answer generated |
After the user or system submits feedback (either sparse KPI or per-response), the learner stitches the cached observations together and updates the policy.
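To make the stitching step concrete, here is a toy model of the idea. It is not Arfniia's learner: the `EpisodeBuffer` class, its method names, and the choice to give every turn the same final reward are all assumptions made for illustration.

```python
from collections import defaultdict


class EpisodeBuffer:
    """Toy cache of per-turn observations, keyed by episode ID (illustrative only)."""

    def __init__(self):
        self._turns = defaultdict(list)

    def record(self, episode_id, observation, action):
        # Called once per turn while feedback is still pending.
        self._turns[episode_id].append((observation, action))

    def stitch(self, episode_id, reward):
        # Once feedback arrives, attach the final reward to every cached turn
        # and drop the episode from the cache.
        turns = self._turns.pop(episode_id, [])
        return [(obs, act, reward) for obs, act in turns]


buffer = EpisodeBuffer()
buffer.record("123", "initial prompt", "model-a")
buffer.record("123", "follow-up", "model-b")
buffer.record("123", "final question", "model-a")
trajectory = buffer.stitch("123", reward=0.8)
print(len(trajectory))
```

A real learner may weight turns differently (for example by discounting earlier turns); equal credit is used here only to keep the sketch short.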
Providing Feedback
- Send a single sparse KPI once the episode outcome is known: PUT /v1/feedbacks/<router>/sparse/<value>.
- Optionally submit per-turn feedback keyed by response IDs if you want faster corrections.
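As a sketch of the sparse KPI call, the snippet below builds the PUT request with the Python standard library. The helper name `sparse_feedback_request`, the base URL, and the router name are placeholders; only the path shape comes from the endpoint above, and any authentication your deployment requires is not shown.

```python
import urllib.request
from urllib.parse import quote


def sparse_feedback_request(base_url: str, router: str, value: float) -> urllib.request.Request:
    """Build (but do not send) the sparse feedback request for one router."""
    path = f"/v1/feedbacks/{quote(router, safe='')}/sparse/{value}"
    return urllib.request.Request(base_url.rstrip("/") + path, method="PUT")


# Placeholder host and router name; substitute your own deployment values.
req = sparse_feedback_request("http://localhost:8080", "my-router", 0.75)
print(req.get_method(), req.full_url)

# To actually send it once the episode outcome is known:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```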
Operational Tips
- Keep episode IDs unique per router; stale caches are cleared automatically when Episode-Start is received.
- If the process aborts early, send Episode-End with the latest turn so the replay buffer is flushed.
- Combine episodic signals with custom features (e.g. user sentiment or dialog depth) to give the learner richer state.
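One way to make sure Episode-End is always sent on an early abort is to wrap the episode in a context manager. The sketch below is one interpretation of that tip: the `Episode` class, the `episode` function, and the `send(payload, headers)` callable are hypothetical, and resending the latest payload with the end header is an assumption about what "with the latest turn" means in your setup.

```python
import uuid
from contextlib import contextmanager


class Episode:
    """Tracks start/end state for one episode (hypothetical client-side helper)."""

    def __init__(self, send):
        self.id = str(uuid.uuid4())
        self._send = send          # callable(payload, headers) supplied by the caller
        self.started = False
        self.closed = False
        self.last_payload = None

    def turn(self, payload, last=False):
        headers = {"X-Arfniia-Episode-Id": self.id}
        if not self.started:
            headers["X-Arfniia-Episode-Start"] = "true"
            self.started = True
        if last:
            headers["X-Arfniia-Episode-End"] = "true"
            self.closed = True
        self.last_payload = payload
        return self._send(payload, headers)


@contextmanager
def episode(send):
    ep = Episode(send)
    try:
        yield ep
    finally:
        # On early exit (including exceptions), close the episode by
        # resending the latest payload with the end header set.
        if ep.started and not ep.closed:
            ep.turn(ep.last_payload, last=True)
```

Usage would look like `with episode(my_send) as ep: ep.turn(request_body)`, where `my_send` posts the payload to /v1/chat/completions with the given headers. If resending a turn is too costly in your deployment, a dedicated lightweight closing request may be a better fit.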