Hugging Face Daily Papers · 8 min read

Evolution Fine-Tuning: Learning to Discover Across 371 Optimization Tasks

Mirrored from Hugging Face Daily Papers for archival readability. Support the source by reading on the original site.

Authors: Young-Jun Lee, Seungone Kim, Minki Kang, Alistair Cheong Liang Chuen, Zerui Chen, Seungho Han, Taehee Jung, Dongyeop Kang
Organization: Minnesota NLP
Project page: https://open-galapagos.github.io/evolution_finetuning/
Code: https://github.com/Open-Galapagos/evolution-fine-tuning
arxiv:2606.29082

Published on Jun 27 · Submitted by Young-Jun Lee on Jul 1

Abstract

Evolution Fine-Tuning (EFT) trains large language models on evolutionary search trajectories so that they acquire cross-task problem-solving capabilities, improving performance on held-out optimization tasks, including open mathematical problems.

Would experience designing faster GPU kernels also help close in on a long-standing open mathematical conjecture? Large Language Models (LLMs) integrated into evolutionary search have recently produced state-of-the-art solutions on optimization tasks, including open mathematical conjectures, GPU kernel design, scientific law discovery, and combinatorial puzzles. To achieve this, prior work applied search scaffolds to one target task at a time, so every new problem is approached from scratch and the experience accumulated during search is discarded once the model finishes its attempt. This leaves the capability of iteratively evolving a solution (e.g., knowing which part to mutate and how, deciding when to backtrack) entirely in the scaffold rather than in the model itself. Whether the model itself could acquire this capability and reuse it across different tasks has been largely unexamined. To address this, we introduce Evolution Fine-Tuning (EFT), a mid-training paradigm that teaches LLMs to evolve solutions across tasks by converting evolutionary search trajectories into supervision. We construct Finch Collection, a 156K-trajectory dataset spanning 10 domains and 371 optimization tasks, and fine-tune open-source LLMs from 2B to 9B parameters. Empirically, EFT confers cross-task generalization: across 22 held-out tasks, our models surpass their base counterparts by 10.22% on average. Furthermore, when paired with test-time RL, our model matches state-of-the-art performance on two circle-packing tasks and outperforms its base-model counterpart on the Erdős minimum-overlap problem. EFT thus serves as a "practice phase" for general-purpose discovery agents that do not solve new problems from scratch.
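
The phrase "converting evolutionary search trajectories into supervision" can be made concrete with a toy sketch. The code below is not the authors' pipeline: the objective `score`, the Gaussian `mutate` step (standing in for an LLM-proposed edit), the prompt wording, and the choice to keep only improving steps are all assumptions made for illustration. It runs a simple hill-climbing style search, logs every parent/child pair, and turns the improving pairs into prompt/completion examples of the kind a fine-tuning run could consume.

```python
import random
from dataclasses import dataclass


@dataclass
class Step:
    """One search step: a parent solution and the child proposed from it."""
    parent: list
    parent_score: float
    child: list
    child_score: float


def score(x):
    # Hypothetical toy objective: higher is better, maximum of 0 at the origin.
    return -sum(v * v for v in x)


def mutate(x, rng, sigma=0.3):
    # Gaussian perturbation; in an LLM-driven search this would be a model edit.
    return [v + rng.gauss(0.0, sigma) for v in x]


def evolve(dim=3, generations=50, seed=0):
    """Run a minimal (1+1)-style search and record every proposal."""
    rng = random.Random(seed)
    best = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
    best_score = score(best)
    trajectory = []
    for _ in range(generations):
        child = mutate(best, rng)
        child_score = score(child)
        trajectory.append(Step(best, best_score, child, child_score))
        if child_score > best_score:  # keep the child only if it improves
            best, best_score = child, child_score
    return trajectory, best_score


def to_supervision(trajectory):
    """Turn improving steps into prompt/completion pairs (an assumed format)."""
    examples = []
    for s in trajectory:
        if s.child_score > s.parent_score:
            prompt = (
                f"Current solution: {s.parent}\n"
                f"Score: {s.parent_score:.4f}\n"
                "Propose an improved solution."
            )
            examples.append({"prompt": prompt, "completion": str(s.child)})
    return examples


if __name__ == "__main__":
    trajectory, final_score = evolve()
    examples = to_supervision(trajectory)
    print(f"{len(trajectory)} steps, {len(examples)} training examples")
```

Filtering to improving steps is only one possible design; a real dataset might also keep failed proposals or backtracking decisions, since the abstract names "deciding when to backtrack" as part of the capability being taught.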

Community

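Automated message from the Librarian Bot: the following similar papers were recommended by the Semantic Scholar API.

* PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents (2026), https://huggingface.co/papers/2605.07039
* BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution (2026), https://huggingface.co/papers/2606.01286
* What Do Evolutionary Coding Agents Evolve? (2026), https://huggingface.co/papers/2605.20086
* GPU Forecasters: Language Models as Selective Surrogates for Kernel Runtime Optimization (2026), https://huggingface.co/papers/2605.31464
* LLM4Branch: Large Language Model for Discovering Efficient Branching Policies of Integer Programs (2026), https://huggingface.co/papers/2605.10401
* Evolutionary Multi-Task Optimization for LLM-Guided Program Discovery (2026), https://huggingface.co/papers/2605.22613
* Strategy-Aware Optimization Modeling with Reasoning LLMs (2026), https://huggingface.co/papers/2605.02545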


Get this paper in your agent:

hf papers read 2606.29082
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

