kusp
AI systems that learn what to remember
Adaptive retrieval research — per-agent Thompson Sampling for RAG systems. No human feedback. No task labels. Each agent finds its own balance.
Adaptive Retrieval Weight Learning via Thompson Sampling
Each agent learns its own retrieval balance through self-assessment of recency, importance, and relevance. The approach avoids reward hacking by evaluating retrieval quality rather than task output. In tests spanning 2,200+ episodes, it reduced token usage by 12–20%.
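As a rough illustration of the idea, the sketch below shows one way a single agent could run Thompson Sampling over a few candidate weightings of recency, importance, and relevance. It is not the paper's implementation. The preset names, the Beta-Bernoulli model, the `AgentBandit` class, and the simulated self-assessment signal are all assumptions made for this example.

```python
import random
from dataclasses import dataclass, field

# Hypothetical weight presets over (recency, importance, relevance).
# The arms actually used in the research are not listed on this page.
PRESETS = {
    "balanced": (1 / 3, 1 / 3, 1 / 3),
    "recency_heavy": (0.6, 0.2, 0.2),
    "importance_heavy": (0.2, 0.6, 0.2),
    "relevance_heavy": (0.2, 0.2, 0.6),
}


def score(memory: dict, weights: tuple) -> float:
    """Weighted sum of a memory's recency, importance, and relevance."""
    w_rec, w_imp, w_rel = weights
    return (w_rec * memory["recency"]
            + w_imp * memory["importance"]
            + w_rel * memory["relevance"])


@dataclass
class AgentBandit:
    """One agent's Beta posterior per preset (assumed Beta-Bernoulli model)."""
    rng: random.Random
    alpha: dict = field(default_factory=lambda: {k: 1.0 for k in PRESETS})
    beta: dict = field(default_factory=lambda: {k: 1.0 for k in PRESETS})

    def choose(self) -> str:
        # Thompson Sampling: draw once from each posterior, play the best draw.
        draws = {k: self.rng.betavariate(self.alpha[k], self.beta[k])
                 for k in PRESETS}
        return max(draws, key=draws.get)

    def update(self, preset: str, good_retrieval: bool) -> None:
        # The reward is a judgment of retrieval quality, not of task output.
        if good_retrieval:
            self.alpha[preset] += 1.0
        else:
            self.beta[preset] += 1.0


def simulate(true_quality: dict, episodes: int, seed: int = 0):
    """Run one agent against simulated self-assessments.

    true_quality maps each preset to the (hidden) probability that the
    agent judges a retrieval made with that preset as good.
    """
    rng = random.Random(seed)
    agent = AgentBandit(rng=rng)
    pulls = {k: 0 for k in PRESETS}
    for _ in range(episodes):
        preset = agent.choose()
        pulls[preset] += 1
        agent.update(preset, rng.random() < true_quality[preset])
    return agent, pulls
```

Calling `simulate({"balanced": 0.3, "recency_heavy": 0.3, "importance_heavy": 0.3, "relevance_heavy": 0.9}, episodes=2000)` returns the agent and its pull counts; over enough episodes the pulls should concentrate on whichever preset the agent rates highest. Each agent keeps its own posteriors, so different agents can settle on different balances.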
Read paper
Experiment Framework
Full code, synthetic datasets, eval harnesses, and reproduction guides. 238 tests, 80% coverage. Apache 2.0 + CC BY 4.0.
View on GitHub
Writing
Research Dispatches
Field notes from 2,200+ agent experiments. Failed approaches, surprising findings, and what actually works.
Read on Substack
About
Alfonso DiRocco is an independent AI researcher exploring how agents learn to manage their own memory. His work applies Thompson Sampling to retrieval-augmented generation — letting each agent discover its optimal search strategy through self-assessment rather than human feedback.
Connect on LinkedIn