
Added SDM Activations to paper list. #1

Open
allenschmaltz wants to merge 1 commit into AgentMemoryWorld:main from allenschmaltz:sdm-activation-memory

Conversation

@allenschmaltz

Thanks for putting this repo together!

I updated the README with a work that may be of interest, as it provides another dimension to consider that is largely distinct from (but potentially complementary to) the other papers: an updatable memory in service of estimating predictive uncertainty, with a particular (unique) emphasis on modeling epistemic uncertainty:

"Similarity-Distance-Magnitude Activations". 16 Sep 2025. https://arxiv.org/abs/2509.12760

This is also relevant to one of the thematic areas for future work stated in the survey: "9.4 Life-Long Personalization and Trustworthy Memory".

SDM activations and estimators provide robust estimates of the predictive uncertainty and interpretability-by-exemplar, the latter of which can also be viewed as a type of instance-wise data valuation. The SDM activation is a general component for neural networks, but it is particularly useful as a final-layer activation; for agents, it is relevant for conditional branching decisions during test-time search.
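To make the agent-facing use concrete, here is a minimal sketch of uncertainty-gated branching in a test-time search loop. The function name, the `estimate_uncertainty` callable, and the threshold are illustrative assumptions for this comment, not an API from the SDM codebase or paper.

```python
def choose_branch(candidates, estimate_uncertainty, threshold=0.2):
    """Split candidate actions by an uncertainty estimate.

    Candidates whose estimated uncertainty is at or below the threshold
    are committed to; the rest are expanded further in the search graph.
    (Hypothetical sketch: `estimate_uncertainty` stands in for an
    SDM-style uncertainty estimator.)
    """
    commit, expand = [], []
    for c in candidates:
        (commit if estimate_uncertainty(c) <= threshold else expand).append(c)
    return commit, expand
```

In practice, the uncertainty estimate would come from the SDM estimator over the current memory, so edits to memory directly change which branches the agent expands.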

This cuts the plane a bit differently than the other papers, but as a first approximation using the existing taxonomy, we can consider this episodic memory in feature space (i.e., that of the hidden representation states, conditional on the output prediction).

By design, the SDM estimator is a mix of parametric and non-parametric components. The argmax prediction does not change with modifications to the support set (a.k.a., memory), but changes to the support set do impact the uncertainty estimates (by changing the Similarity and Distance estimates). Thus, "fast-moving updates" by adding/deleting/modifying the support set (and otherwise holding the parameters fixed) can impact test-time search (by means of conditional branching decisions in a test-time search graph), with "slower-moving updates" achievable by running SGD on the parametric components (i.e., the CNN and linear layer) when recalibrating across all of the available data. (For context, using this informal terminology, "slowest-moving updates" in this sense would be updating the parameters of the underlying model itself.)
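As a toy illustration of that update behavior (not the actual SDM estimator from the paper; the class and its details are hypothetical), here is a minimal mixed parametric/non-parametric estimator in which the argmax comes from a fixed linear head while the uncertainty signal comes from an editable support set:

```python
import numpy as np

class ToyMixedEstimator:
    """Toy sketch: argmax from a fixed parametric head; uncertainty
    from similarity/distance to an updatable support set ("memory").
    Editing the support set changes the uncertainty estimate but not
    the argmax prediction, mirroring the "fast-moving updates" above.
    """

    def __init__(self, weights, support_features, support_labels):
        self.W = weights  # parametric head, held fixed between recalibrations
        self.support = list(zip(support_features, support_labels))

    def predict(self, x):
        # Parametric argmax: unaffected by support-set edits.
        return int(np.argmax(self.W @ x))

    def uncertainty(self, x, k=3):
        # Non-parametric signal: mean distance to the nearest support
        # exemplars sharing the predicted label; shifts as memory shifts.
        pred = self.predict(x)
        same = [f for f, y in self.support if y == pred]
        if not same:
            return float("inf")  # no supporting evidence in memory
        dists = sorted(float(np.linalg.norm(f - x)) for f in same)
        return float(np.mean(dists[:k]))

    def add_exemplar(self, f, y):
        # "Fast-moving update": edit memory; parameters stay fixed.
        self.support.append((f, y))
```

Adding a nearby same-label exemplar lowers the uncertainty estimate for a query while leaving its predicted class unchanged; the "slower-moving updates" in the comment would correspond to re-fitting `W` (and the feature extractor) by SGD.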

For the GitHub link, I used the main codebase, which also includes an applied implementation as an MCP Server that may be of interest to this audience for dropping into agent pipelines. (Alternatively, the repo with the replication scripts and data for the paper is linked in the arXiv paper.)

