Hugging Face Daily Papers · 4 min read

AutoMem: Automated Learning of Memory as a Cognitive Skill

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

arxiv:2607.01224

Published on Jul 1, 2026 · Submitted by Shengguang Wu on Jul 3, 2026
Authors: Shengguang Wu, Hao Zhu, Yuhui Zhang, Xiaohan Wang, Serena Yeung-Levy

Abstract

Memory management in large language models is treated as a trainable skill through a framework that automates both memory structure optimization and proficiency enhancement, leading to significant performance improvements in long-horizon tasks.

Memory expertise is a learned skill: knowing what to encode, when to retrieve, and how to organize knowledge, a capacity known in cognitive science as metamemory. We bring this perspective to LLMs by treating memory management as a trainable skill. We promote file-system operations to first-class memory actions alongside task actions, letting the model itself decide how to manage its memory. This memory skill improves along two axes: the structure that supports it (prompts, file schemas, action vocabulary) and the proficiency of the model exercising it. Both axes resist manual optimization: episodes in long-horizon tasks run for thousands of steps, and a single memory mistake can stay hidden long before its effects surface, making human review of full trajectories impractical. We introduce AutoMem, a framework that automates both axes. In the first loop, a strong LLM reviews complete agent trajectories and iteratively revises the memory structure that shapes how the agent interacts with its memory files. In the second loop, the agent's own good memory decisions are identified from many episodes and used as a training signal to sharpen the model's memory proficiency directly. Across three procedurally generated long-horizon games (Crafter, MiniHack, and NetHack), optimizing memory alone, without modifying the model's task-action behavior, improved the base agent's performance by roughly 2x to 4x, making a 32B open-weight model competitive with frontier systems such as Claude Opus 4.5 and Gemini 3.1 Pro Thinking. Our results show that memory management is an independently learnable skill and a high-leverage objective that yields large gains on long-horizon tasks.
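
To make the abstract's two ideas concrete, here is a minimal Python sketch, not the authors' implementation. It shows (1) memory actions modeled as file-system operations that sit in the same action space as task actions, and (2) one crude way to collect memory decisions from many episodes as candidate training examples. Every name here (`TaskAction`, `MemoryAction`, `MemoryStore`, `step`, `select_memory_examples`) is hypothetical, and ranking episodes by final score is an assumption made for illustration: the abstract does not say how good memory decisions are identified.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class TaskAction:
    """An action meant for the game environment (hypothetical type)."""
    name: str


@dataclass
class MemoryAction:
    """A file-system operation on the agent's memory files (hypothetical type)."""
    op: str            # "write", "append", or "read"
    path: str
    content: str = ""


@dataclass
class MemoryStore:
    """In-memory stand-in for a directory of memory files."""
    files: Dict[str, str] = field(default_factory=dict)

    def apply(self, action: MemoryAction) -> str:
        """Execute one memory operation and return its observation text."""
        if action.op == "write":
            self.files[action.path] = action.content
            return f"wrote {action.path}"
        if action.op == "append":
            self.files[action.path] = self.files.get(action.path, "") + action.content
            return f"appended to {action.path}"
        if action.op == "read":
            return self.files.get(action.path, "")
        raise ValueError(f"unknown memory op: {action.op}")


Action = Union[TaskAction, MemoryAction]


def step(store: MemoryStore, action: Action) -> Tuple[str, str]:
    """Route one agent action and return (kind, observation)."""
    if isinstance(action, MemoryAction):
        return "memory", store.apply(action)
    # A real agent would forward this to the environment; the sketch only echoes it.
    return "task", f"env received {action.name}"


def select_memory_examples(
    episodes: List[Tuple[float, List[Action]]],
    top_fraction: float = 0.5,
) -> List[MemoryAction]:
    """Collect memory actions from the highest-scoring episodes.

    ASSUMPTION: episode score is used as a proxy for "good memory decisions".
    The source does not describe the actual selection criterion.
    """
    ranked = sorted(episodes, key=lambda ep: ep[0], reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return [
        action
        for _, actions in ranked[:keep]
        for action in actions
        if isinstance(action, MemoryAction)
    ]
```

Keeping both action types behind a single `step` function mirrors the abstract's framing that the model itself chooses, at each step, whether to act in the game or to manage its memory. The selection function keeps only memory actions, which matches the claim that memory is optimized without modifying task-action behavior.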

Community

Paper submitter about 4 hours ago

On long-horizon tasks, an LLM agent has to manage a lot of memory — what to write down, what to look up. We turn managing memory into a skill the model learns, and improve it automatically. An open 32B model approaches frontier-level performance on three procedurally generated long-horizon games: Crafter, MiniHack, and NetHack.

Code & demos: https://autolearnmem.github.io/


Get this paper in your agent:

hf papers read 2607.01224
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
