STARBench

Evaluating Open-World, Memory-Augmented Object Search in Dynamic Environments

3 Task Families

Visible, interactive, and common-sense object search

5 Task Types

Object descriptions based on object class, properties, space, and time

STARBench is a VirtualHome-based benchmark for evaluating open-world object search in dynamic environments. Agents are expected to use observations accumulated in the past to find objects in the present, in response to requests such as:

  • “Bring me a book.”
  • “Bring me a robotics textbook.”
  • “Bring me a book from the shelf.”
  • “Bring me the book that is usually on the study desk.”
  • “Bring me the book that was on the study desk yesterday evening.”
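
To make the request styles above concrete, here is a minimal Python sketch of how such requests could be matched against a memory of past observations. It is not STARBench's API: the `Observation` and `Query` classes, the `candidates` function, the integer timesteps, and the toy memory are all hypothetical names and data introduced only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Observation:
    """One remembered sighting of an object (hypothetical schema)."""
    time: int              # assumed discrete timestep
    object_id: str
    object_class: str
    properties: frozenset
    location: str


@dataclass(frozen=True)
class Query:
    """A request constrained by class, properties, space, and time."""
    object_class: str
    properties: frozenset = frozenset()
    location: Optional[str] = None
    time_range: Optional[Tuple[int, int]] = None  # inclusive (start, end)


def candidates(memory, query):
    """Return matching object ids, most frequently matching first."""
    counts = Counter()
    for obs in memory:
        if obs.object_class != query.object_class:
            continue
        if not query.properties <= obs.properties:
            continue
        if query.location is not None and obs.location != query.location:
            continue
        if query.time_range is not None:
            start, end = query.time_range
            if not (start <= obs.time <= end):
                continue
        counts[obs.object_id] += 1
    return [object_id for object_id, _ in counts.most_common()]


# Toy memory: two books moving between a desk and a shelf over three timesteps.
MEMORY = [
    Observation(1, "book_1", "book", frozenset({"robotics"}), "study desk"),
    Observation(2, "book_1", "book", frozenset({"robotics"}), "study desk"),
    Observation(2, "book_2", "book", frozenset({"novel"}), "shelf"),
    Observation(3, "book_2", "book", frozenset({"novel"}), "study desk"),
    Observation(3, "book_1", "book", frozenset({"robotics"}), "shelf"),
]

by_property = candidates(MEMORY, Query("book", properties=frozenset({"robotics"})))
usually_on_desk = candidates(MEMORY, Query("book", location="study desk"))
on_desk_at_t3 = candidates(MEMORY, Query("book", location="study desk", time_range=(3, 3)))
```

In this toy memory, ranking by how often an object matched is a rough stand-in for "usually", while a time window stands in for "yesterday evening". A real agent would still have to search the current environment, since the object may have moved after its last observation.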

To get started, check out the code and instructions here: STARBench

Overview

BibTeX

@misc{chen2025searchingspacetimeunified,
      title={Searching in Space and Time: Unified Memory-Action Loops for Open-World Object Retrieval}, 
      author={Taijing Chen and Sateesh Kumar and Junhong Xu and Georgios Pavlakos and Joydeep Biswas and Roberto Martín-Martín},
      year={2025},
      eprint={2511.14004},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2511.14004}, 
}