21 comments:
This is cute and retro! But I think training only on GoldenEye undersells the concept a bit, since their world model inherits the N64-era graphics from GoldenEye, which automatically makes it look dated.
If they retrained the same model on real video data, they could potentially get a multiplayer world with quite realistic-looking graphics (see https://wayve.ai/wp-content/uploads/2025/11/Ex-2-GAIA-3.mp4, https://wayve.ai/wp-content/uploads/2025/11/Ex-3-GAIA-3.mp4).
Maybe for Agora-2 :)
They don't want to push the envelope too far in the first release, and GoldenEye-level graphics are more forgiving of flaky textures and wonky physics. Not to say that what it is isn't already groundbreaking.
Very cool work, the learned world state is a smart way of getting consistent generation across all the views (and not having the map vanish when you 180 like some other models). Multi-agent is such an interesting field, because it's clear that humanity benefits from distributed intelligence, but I don't think MARL has really had a big breakthrough like AlphaGo or RLVR for single-agent RL.
Two thoughts about where this could go: first, the internal world state would need to be learned to transfer to real-life robotics, since you can't query the internals of a game engine in training. Second, an enormous challenge for many of these world models is going to be truly unbounded environmental interactivity - Agora is still mostly about a few agents interacting in a static environment. Learning interaction will be hard, because the interactions in games are intentionally added in, by hand. But we (human learners) acquire a strong model for environmental interaction very efficiently, which is part of what helps us generalise so effectively.
Hmm. Neat (especially prowl, as an idea). But... I don't see anything beyond the game. I might be a little cautious, but there is no way for me to test any of it (and I was actually setting up an old MMO this weekend to see how well an agent can survive within a rigid ecosystem). Is it just intended for pure researchers or something?
Unlike LLMs, which made it into the public view, I have a hard time seeing these world simulation models doing the same.
I'm not sure how to imagine their use in education or gaming, but it's clear that they have real potential for use in military programs.
It's nightmarish to think these could be trained on shooting game footage and thrown into real-life scenarios in some form or another.
World models are there for planning capabilities and data efficiency in training; they are an old and general idea (model-based RL). You just see them in video games etc. because these are easier cases.
So much of tech progress links back to gaming, it’s astounding
gaming is all about simulation after all
Well now I'm imagining Optimus robots teabagging the enemy.
Yeah, it's obvious that this will be used to pilot drones.
Oh, not at all! It will be used to train drone pilot AIs.
Is there a little bit more on this in terms of evaluation or is this rather a Show-HN post?
Be careful when transposing game-learned behaviors into real life.
Underwhelming demo. Also, the controls are terrible, but the real GoldenEye was also underwhelming, with bad controls, if you had played Quake II.
I played the game - the inputs feel like trash... I'm not convinced this is the correct direction to generate games. We should probably only be generating scripts and assets to plug into game engines, rather than relying on GenAI for the actual engine.
Can't wait for a fully immersive game with npcs
good step in a direction
Super cool!
we have this before gta-6
I'm confused about how the world itself works, or would work, at scale. Take that Minecraft clip on the link: say I jump down a cave first, and the model decides the cave goes off to the left. A few hours later I come back, and this time the model decides the cave bends to the right?
I don't think it's a great approach for an actual game. In this case they're demonstrating that the model can create a shared space that's consistent for many players/users, for a short time. But, like you say, it would probably not endure.
Although it does put me in mind of playing Daggerfall years ago, which had randomly generated dungeons that didn't always work. You could sometimes get trapped in architecture, or jump from a height with no way to climb back up again. There was a reason Bethesda moved away from that.