Highlight

WorldMem enables long-term consistent world simulation with a memory mechanism. We can control agents to explore diverse, consistent worlds through an expansive action space, shaping the environment by placing objects such as a pumpkin light or simply roaming freely. Most importantly, when we look back after exploring for a while, the objects we placed are still there, and the light has melted the surrounding snow, indicating the passage of time!

Initial View

Revisited View

Initial View

Revisited View

w/o Memory

Initial View

Revisited View

w/ Memory

Compare with Ground Truth

By conditioning on the memory bank, our framework accurately generates diverse and dynamic worlds that remain consistent with past states.

Generated Results

Ground Truth

Generated Results

Ground Truth

Generated Results

Ground Truth

Generated Results

Ground Truth

Interact with the World

We can interact with the world by placing hay in the desert or planting wheat in the plains. These changes are recorded and reproduced when the location is revisited. Over time, we can also observe transformations such as the wheat growing.

Initial View

Revisited View

Place Hay in Desert

Initial View

Revisited View

Plant Wheat in Plains

Without timestamps as a condition, the model struggles to distinguish between memory units representing the same location at different time points, leading to incorrect generations. In contrast, with time conditioning, the model effectively aligns with the updated world state, ensuring consistent outputs.

Initial View

Revisited View

w/o Time Condition

Initial View

Revisited View

w/ Time Condition
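The effect of time conditioning can be illustrated with a toy retrieval example. The sketch below is not WorldMem's implementation: `MemoryUnit`, `pose_distance`, `retrieve`, and the `time_weight` parameter are hypothetical names introduced only for illustration, and a hand-written scoring rule stands in for whatever the model learns. It shows why two memory units stored at the same pose are indistinguishable without a timestamp, and how adding a time term lets the more recent state win.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryUnit:
    """Hypothetical memory entry: a stored frame plus the state it was seen in."""
    frame_id: str
    position: tuple      # (x, y, z) location where the frame was observed
    yaw_deg: float       # viewing direction in degrees
    timestamp: int       # step index at which the frame was stored


def pose_distance(unit, position, yaw_deg):
    """Combine positional distance with a normalized angular difference."""
    d_pos = math.dist(unit.position, position)
    # Wrap the yaw difference into [-180, 180] before normalizing to [0, 1].
    d_yaw = abs((unit.yaw_deg - yaw_deg + 180.0) % 360.0 - 180.0) / 180.0
    return d_pos + d_yaw


def retrieve(bank, position, yaw_deg, now=None, time_weight=0.0):
    """Return the best-matching unit; lower score is better.

    With now=None the score depends on pose only, so units recorded at the
    same pose tie and the earliest stored one is returned.
    """
    def score(unit):
        s = pose_distance(unit, position, yaw_deg)
        if now is not None:
            s += time_weight * (now - unit.timestamp)  # penalize stale units
        return s

    return min(bank, key=score)


# Same location and view, stored before and after wheat was planted.
bank = [
    MemoryUnit("plains_empty", (0.0, 0.0, 0.0), 90.0, timestamp=0),
    MemoryUnit("plains_wheat", (0.0, 0.0, 0.0), 90.0, timestamp=50),
]

without_time = retrieve(bank, (0.0, 0.0, 0.0), 90.0)
with_time = retrieve(bank, (0.0, 0.0, 0.0), 90.0, now=60, time_weight=0.01)
```

In this toy setting, pose-only retrieval returns the stale `plains_empty` unit because both entries score identically, while the time-aware call returns `plains_wheat`. Preferring the most recent unit is an assumption of this sketch; the point is only that a timestamp gives the retrieval step a signal that pose alone cannot provide.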

Real Scenes

In the 360-degree consistency test, our approach with memory returns to the original location without losing previously generated details.

Initial View

Revisited View

w/o Memory

Initial View

Revisited View

w/ Memory

Our method also generates consistent results along customized trajectories.

Initial View

Revisited View

Initial View

Revisited View

BibTeX

@misc{xiao2025worldmemlongtermconsistentworld,
  title={WORLDMEM: Long-term Consistent World Simulation with Memory},
  author={Zeqi Xiao and Yushi Lan and Yifan Zhou and Wenqi Ouyang and Shuai Yang and Yanhong Zeng and Xingang Pan},
  year={2025},
  eprint={2504.12369},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2504.12369},
}