Current AI agent systems operate primarily as stateless executors: they do not retain procedural experience across tasks. I propose five desirable properties for experience-driven agent systems and present OrKa Brain, an open-source prototype that implements a procedural skill memory loop (learn, persist, retrieve, apply, feedback, decay) within a YAML-based LLM agent orchestration framework. I evaluate the system on a 30-task benchmark across two tracks (cross-domain transfer and same-domain accumulation) using an LLM-as-judge evaluation protocol. Results show a consistent but modest advantage for the Brain-augmented condition: 63.3% pairwise win rate, with the strongest signal in perceived trustworthiness (19/28 wins). Absolute rubric deltas remain small (+0.10 overall on a 10-point scale), revealing a ceiling effect: the underlying LLM already possesses the procedural knowledge the Brain recalls. The current implementation uses rule-based keyword extraction rather than semantic understanding, and the benchmark carries significant confounds (unequal pipeline lengths, single model, single run). I report both the positive signals and the negative ones, identify the bottlenecks, and outline the architectural slots designed for progressive upgrade.
Marco Somma studied this question.
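To make the six-stage loop named in the abstract (learn, persist, retrieve, apply, feedback, decay) concrete, here is a minimal Python sketch. It is not OrKa Brain's actual API: every name in it (`SkillMemory`, `Skill`, `extract_keywords`, the stopword list, the Jaccard-times-confidence score, and the default step and decay values) is a hypothetical choice made for illustration. The only detail taken from the abstract is that keyword extraction is rule-based rather than semantic.

```python
from __future__ import annotations

import json
import re
from dataclasses import asdict, dataclass

# Hypothetical stopword list; a real system would likely use a larger one.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "for", "in", "on", "with"}


def extract_keywords(text: str) -> set[str]:
    """Rule-based keyword extraction: lowercase word tokens minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


@dataclass
class Skill:
    name: str
    steps: list[str]
    keywords: list[str]
    confidence: float = 0.5  # assumed starting value


class SkillMemory:
    """Illustrative procedural skill store; not the OrKa Brain implementation."""

    def __init__(self) -> None:
        self.skills: dict[str, Skill] = {}

    def learn(self, name: str, task_description: str, steps: list[str]) -> Skill:
        """Learn: record the steps that solved a task, indexed by its keywords."""
        skill = Skill(name, list(steps), sorted(extract_keywords(task_description)))
        self.skills[name] = skill
        return skill

    def persist(self) -> str:
        """Persist: serialize all skills to a JSON string."""
        return json.dumps([asdict(s) for s in self.skills.values()])

    @classmethod
    def load(cls, payload: str) -> SkillMemory:
        """Rebuild a memory from the JSON produced by persist()."""
        memory = cls()
        for item in json.loads(payload):
            memory.skills[item["name"]] = Skill(**item)
        return memory

    def retrieve(self, task_description: str, min_score: float = 0.1) -> Skill | None:
        """Retrieve: best skill by keyword Jaccard overlap weighted by confidence."""
        query = extract_keywords(task_description)
        best, best_score = None, min_score
        for skill in self.skills.values():
            stored = set(skill.keywords)
            union = query | stored
            overlap = len(query & stored) / len(union) if union else 0.0
            score = overlap * skill.confidence
            if score > best_score:
                best, best_score = skill, score
        return best

    def apply(self, skill: Skill, task_description: str) -> str:
        """Apply: prepend the recalled procedure to the prompt for the new task."""
        steps = "\n".join(f"{i}. {s}" for i, s in enumerate(skill.steps, 1))
        return f"Task: {task_description}\nPreviously successful procedure:\n{steps}"

    def feedback(self, name: str, success: bool, step: float = 0.1) -> None:
        """Feedback: raise or lower confidence, clamped to [0, 1]."""
        skill = self.skills[name]
        delta = step if success else -step
        skill.confidence = min(1.0, max(0.0, skill.confidence + delta))

    def decay(self, factor: float = 0.9) -> None:
        """Decay: shrink every skill's confidence so unused skills fade."""
        for skill in self.skills.values():
            skill.confidence *= factor
```

In this sketch a skill learned from "Debug a failing data pipeline" would be retrieved for a later task such as "debug failing pipeline job" because the two share most keywords, while an unrelated task returns `None`. Matching on surface keywords alone is exactly the limitation the abstract flags: two tasks that need the same procedure but are phrased differently would not match.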