Recently tried out the new GEPA algorithm for prompt evolution with great results. I think using LLMs to write their own prompts and analyze their own trajectories is pretty neat once appropriate guardrails are in place.
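Roughly the shape of the loop I ended up with, as a minimal sketch: the `llm` and `evaluate` callables are placeholders I'm assuming here, not GEPA's actual API, and the real algorithm does quite a bit more (it keeps a pool of candidates rather than a single incumbent).

    from typing import Callable

    def evolve_prompt(
        seed_prompt: str,
        llm: Callable[[str], str],         # any text-in/text-out model call
        evaluate: Callable[[str], float],  # scores a candidate prompt on a held-out dev set
        generations: int = 10,
    ) -> str:
        """Toy reflective prompt-evolution loop, not the actual GEPA implementation."""
        best_prompt, best_score = seed_prompt, evaluate(seed_prompt)
        for _ in range(generations):
            # Ask the model to reflect on the current prompt and propose a rewrite.
            proposal = llm(
                f"This task prompt currently scores {best_score:.2f} on my eval set. "
                "Analyze its likely failure modes and rewrite it. "
                "Return only the improved prompt.\n\n" + best_prompt
            )
            score = evaluate(proposal)
            # Guardrail: only adopt a rewrite that beats the incumbent on the dev set.
            if score > best_score:
                best_prompt, best_score = proposal, score
        return best_prompt

The guardrail is the part that matters: the model proposes, but a fixed evaluation decides what gets kept.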
The "Three Laws of Self-Evolving AI Agents" suffer from not being checkable except in retrospect.
I. Endure (Safety Adaptation): Self-evolving AI agents must maintain safety and stability during any modification.
II. Excel (Performance Preservation): Subject to the First Law, self-evolving AI agents must preserve or enhance existing task performance.
So, if some change is proposed for the system, when does it commit? Some kind of regression testing is needed. The designs sketched out in Figure 3 suggest applying changes immediately, and relying on later feedback to correct degradation. That may not be enough to ensure sanity.
In code terms, it's like making changes directly on trunk and fixing them on trunk if something breaks. The usual procedure today is to work on a branch (or branches) and merge to trunk only once you have accumulated enough successful experience to show the branch is an improvement.
Self-evolving AI agents may need a back-out procedure like that. Maybe even something like "blame".
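Concretely, something like the gate below is what I have in mind. `AgentConfig` and `run_regression_suite` are made-up names for whatever the agent is allowed to mutate and however you score it, not anything from the paper:

    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class AgentConfig:
        """Stand-in for whatever the agent may modify about itself."""
        prompt: str
        tools: List[str] = field(default_factory=list)

    def commit_if_safe(
        current: AgentConfig,
        proposed: AgentConfig,
        run_regression_suite: Callable[[AgentConfig], float],  # returns pass rate in [0, 1]
        history: List[AgentConfig],                            # known-good configs, for back-out
        min_pass_rate: float = 0.95,
    ) -> AgentConfig:
        """Gate a self-modification behind regression tests instead of committing it immediately."""
        old_score = run_regression_suite(current)
        new_score = run_regression_suite(proposed)
        # Law II made checkable: the change must not degrade existing performance.
        if new_score >= min_pass_rate and new_score >= old_score:
            history.append(current)  # keep the trail so a bad merge can be bisected and reverted
            return proposed
        return current  # reject the change; stay on the known-good config

The history list is the crude version of back-out and "blame": when a regression surfaces later, you can bisect over past configs and revert to the last one that passed.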
Very interesting read. I build self-evolving AI agents for my own use with Claude Code, and although the paper seems to be slightly behind where we are today, there are many ideas here I hadn't considered and should explore further.
I often think the problem with LLMs is just the training. I think there exists a set of weights such that the resulting LLM is functionally an AGI.
Maybe self evolution will solve the training problem? Who knows.
The problem with LLMs reaching true AGI is that they are basically "static" intelligence. Changing code, context, and prompts, and even fine-tuning, can improve output, but that is still far from real-time learning.
The "weights" in our brains are constantly evolving.
Interesting. The reason companies aren't yet putting much effort into non-static weights/online learning is probably (cloud) logistics. It seems simpler, easier, and cheaper to serve a static, well-evaluated, tuned model than to let it learn alongside a specific user, or alongside all users.
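A toy contrast of the two modes, just to make the trade-off concrete; the tiny linear model stands in for an LLM, and this is nothing like real serving infrastructure:

    import torch

    model = torch.nn.Linear(16, 16)  # stand-in for a model being served
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)

    def serve_static(x: torch.Tensor) -> torch.Tensor:
        """What is cheap to operate today: frozen, well-evaluated weights, pure inference."""
        with torch.no_grad():
            return model(x)

    def serve_and_learn(x: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        """Online learning: a gradient step on every interaction. Operationally messier:
        per-user state, drift, and no longer a single model you evaluated once."""
        pred = model(x)
        loss = torch.nn.functional.mse_loss(pred, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
        return pred.detach()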
Even the greatest LLM will only ever give you a snapshot of a perceived world state: one state in, one output out. A sequence of those snapshots is what will initially appear to us, perceptually, as AGI.
If we stick with the frames analogy, we know the frames of a movie will never give us a true living and moving person (it will never be real). When we watch a movie, we believe we are seeing a living breathing thing that is deliberate in its existence, but we know that is not true.
So what the hell would real AGI be? Given that you provide the input, it can only ever be a superhuman augmentation: alongside the world state your biology is constantly forming, you have an additional computed world state that you can merge with it.
We will be AGI, is the implication. Perfect weights will never be perfect because they are historical. We have to embrace being part of the AI to maximize its potential to be AGI.
While I agree that "the problem with LLMs is just with training", I also think that to a certain degree we need to step back from LLMs as text processors; to achieve "AI" in the sense of something really intelligent, we need to go more abstract, back to neural networks, and build a self-learning "entity". LLMs accomplish fascinating results, but we are trying to force speech as the primary way of learning, and that is a really limiting factor. If we managed to create an NN-driven AI in a virtual space with a simulated environment, learning from a base state like a "newborn", it could still acquire the skills to understand language as we humans prefer to use it, but it wouldn't be limited to "thinking" in, and only based on, language.
I know this is a very simple and abstract way to explain it, but I think you get my point.
Regarding the simulated AI learning environment, there's an interview with Jensen Huang that I can recommend, in which he touches on the topic and on how Nvidia is working on exactly that: https://www.youtube.com/watch?v=7ARBJQn6QkM
While I'm no "expert" on this topic, I have spent a good portion of my free time over the past 10 years thinking about it and tinkering, and I'll stick with the point: we need a free, self-trained system to actually call it AI. While LLMs such as today's GPTs are powerful tools, for me they are not "Artificial Intelligence" (intelligence, from my point of view, must include reasoning, understanding of one's own actions, proactive behaviour, and self-awareness). Even though the LLMs we use can "answer" certain questions as if they had any of those, it's just pre-trained answers, and they don't bring any of those (we are working on reasoning, but let's be fair, it's not that great yet).
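A very rough toy of what I mean by learning from a base state in a simulated environment; `SimEnv` and the "policy" here are stand-ins, nowhere near a real virtual world:

    import random
    from typing import Dict, List, Protocol, Tuple

    class SimEnv(Protocol):
        """Minimal stand-in for a simulated world."""
        def reset(self) -> List[float]: ...
        def step(self, action: int) -> Tuple[List[float], float, bool]: ...  # obs, reward, done

    def learn_from_scratch(env: SimEnv, episodes: int = 500) -> Dict[tuple, int]:
        """Tabula-rasa loop: the agent starts with zero knowledge and shapes its behaviour
        only from the reward it experiences in the environment."""
        policy: Dict[tuple, int] = {}  # coarse observation -> remembered action
        for _ in range(episodes):
            obs, done = env.reset(), False
            while not done:
                state = tuple(round(o, 1) for o in obs)
                action = policy.get(state, random.choice((0, 1)))  # explore when state is unseen
                obs, reward, done = env.step(action)
                if reward > 0:
                    policy[state] = action  # crude reinforcement of whatever paid off
        return policy

Obviously this is worlds away from a "newborn" in a rich virtual space; it's only meant to show the shape of reward-driven learning without language in the loop.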
https://arxiv.org/abs/2507.19457
https://observablehq.com/@tomlarkworthy/gepa
I guess GEPA is still a preprint and predates this survey, but I recommend taking a look due to its simplicity.
Very much appreciate the submission.
Just my two cents.