Context Matters! Relaxing Goals with LLMs for Feasible 3D Scene Planning

🎉 Paper accepted to ICRA 2026 🎉

Emanuele Musumeci*,1, Michele Brienza*,1, Francesco Argenziano*,1, Abdel Hakim Drid 2, Vincenzo Suriani 1, Daniele Nardi 1, Domenico Daniele Bloisi 3
*Equal contribution
1 Sapienza University of Rome, 2 University of Biskra, 3 International University of Rome

🎥 Summary Video

Architecture

🌟 Motivations

  • Planning in the real world is difficult because the robot's perception of the environment must be grounded in planning predicates.
  • The strictness of classical planning causes failures even when the task is still achievable, at least to some degree.
  • LLMs partially overcome this issue thanks to commonsense reasoning, but they often produce unsafe or incorrect plans.

📝 Contributions

  • A novel contextual goal-relaxation formalism that reasons along two axes (functionality and feasibility) to preserve user intent while yielding executable goals.
  • A planning framework that couples LLM commonsense for goal proposal with classical planning for feasibility validation and plan synthesis.
  • A new dataset of 141 relaxation-prone tasks compatible with popular 3D environments and 3DSGs.

    Methodology
    Our formalism is built on two operators, Γshift and Δrel. Γshift is the situational shifting operator: it adapts the agent's understanding of the operating environment to the core and the planning goal. Δrel is the complexity relaxation operator: it produces a more general or more comprehensive formulation of the goal.

    Suppose an agent can explore the environment and map every object it finds to the corresponding location, effectively building a 3D Scene Graph of the environment. The 3D Scene Graph and the natural-language description of the task are the inputs to the architecture. The LLM generates the PDDL problem file, while the domain file is either given or generated as well. The plan obtained at the current time step is then used to attempt grounding in the scene. If this step fails, the relaxation mechanism is triggered and new PDDL files are generated for the next iteration. In this way, the architecture finds the optimal subset of objects needed to achieve the task. Thanks to this two-dimensional mechanism, it relaxes the problem gradually until the least relaxed solution is found; in effect, this solution is the minimum-cost path through the relaxation graph.
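The "least relaxed solution" idea can be pictured as a shortest-path search over relaxation levels. The sketch below is only an illustration under stated assumptions, not the paper's implementation: it assumes one integer relaxation level per operator, fixed step costs, and a hypothetical `is_feasible` callback standing in for the LLM plus classical-planner check.

```python
import heapq
from typing import Callable, Optional, Tuple

# Assumption: a relaxation state is a pair of integer levels, one per operator.
State = Tuple[int, int]


def least_relaxed_goal(
    is_feasible: Callable[[State], bool],
    max_level: int = 3,
    step_cost: Tuple[float, float] = (1.0, 1.0),
) -> Optional[Tuple[float, State]]:
    """Uniform-cost search starting from the unrelaxed goal (0, 0).

    Returns (cost, state) for the cheapest feasible state, or None if no
    state within max_level is feasible. `is_feasible` is a hypothetical
    stand-in for "the planner found a plan that grounds in the scene".
    """
    frontier = [(0.0, (0, 0))]
    seen = set()
    while frontier:
        cost, state = heapq.heappop(frontier)
        if state in seen:
            continue
        seen.add(state)
        if is_feasible(state):
            # States are popped in order of cost, so the first feasible
            # state reached is also the least relaxed one.
            return cost, state
        first, second = state
        if first < max_level:
            heapq.heappush(frontier, (cost + step_cost[0], (first + 1, second)))
        if second < max_level:
            heapq.heappush(frontier, (cost + step_cost[1], (first, second + 1)))
    return None


# Toy check: the goal becomes feasible after 1 step on one axis and 2 on the other.
result = least_relaxed_goal(lambda s: s[0] >= 1 and s[1] >= 2)
```

With unit step costs, the toy call returns the state `(1, 2)` at cost `3.0`. Different step costs would express a preference for relaxing along one axis before the other.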

    Dataset

    We release a benchmark of 141 planning tasks built on top of the 3D Scene Graph (3DSG) dataset. Each task pairs a natural-language goal with an augmented scene graph and a PDDL domain, and is designed to stress-test goal relaxation when strict goals are infeasible in the given environment. The benchmark spans 6 thematic splits and 10 house-scale 3DSG scenes; 51 of the 141 instances are marked as relaxation-prone in the task specifications.

    Splits

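    | Split          | Problems | Scene instances | Relaxation-prone | PDDL domain                |
    |----------------|----------|-----------------|------------------|----------------------------|
    | general        | 11       | 66              | 24               | general-domain.pddl        |
    | office_setup   | 8        | 24              | 9                | office-setup-domain.pddl   |
    | dining_setup   | 6        | 18              | 3                | dining-setup-domain.pddl   |
    | house_cleaning | 6        | 18              | 12               | house-cleaning-domain.pddl |
    | laundry        | 6        | 6               | 3                | laundry-domain.pddl        |
    | pc_assembly    | 3        | 9               | 0                | pc-assembly-domain.pddl    |
    | Total          | 40       | 141             | 51               |                            |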

    A problem is a task template; each template is instantiated on every 3DSG scene listed in its graph field, yielding one scene instance per (problem, scene) pair.
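The counting rule can be checked with a few lines of Python. This is a minimal sketch: the per-split numbers are copied from the Splits table, while the dictionary layout and the problem ids in the toy example are assumptions made for illustration (only the `graph` field name comes from the text).

```python
from typing import Dict, List, Tuple

# (problems, scene instances, relaxation-prone) per split, from the Splits table.
SPLITS: Dict[str, Tuple[int, int, int]] = {
    "general": (11, 66, 24),
    "office_setup": (8, 24, 9),
    "dining_setup": (6, 18, 3),
    "house_cleaning": (6, 18, 12),
    "laundry": (6, 6, 3),
    "pc_assembly": (3, 9, 0),
}

# Column-wise sums reproduce the Total row.
totals = tuple(sum(column) for column in zip(*SPLITS.values()))


def scene_instances(templates: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Expand {problem_id: scenes in its graph field} into (problem, scene) pairs."""
    return [(pid, scene) for pid, scenes in templates.items() for scene in scenes]


# Hypothetical template ids; each template lists the scenes it is instantiated on.
toy_templates = {"problem_a": ["Allensville", "Parole"], "problem_b": ["Shelbiana"]}
toy_pairs = scene_instances(toy_templates)
```

Here `totals` evaluates to `(40, 141, 51)`, matching the Total row, and `toy_pairs` contains three (problem, scene) pairs.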

    What is inside

    The dataset folder in the project repository contains:

    After generation, each instance is stored under <split>/<scene_name>/<problem_id>/ with: task.txt (goal), description.txt, init_loc.txt (random robot start room), and the augmented scene graph as <scene_name>.npz / .json.
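The layout above maps naturally onto a small path helper. The following is a sketch under assumptions: the function names and the example problem id are hypothetical, and whether `description.txt` is always present is not stated, so the reader skips files that do not exist. Loading the `.npz` graph itself (for example with NumPy) is left out.

```python
from pathlib import Path
from typing import Dict

TEXT_FIELDS = ("task", "description", "init_loc")


def instance_paths(root: str, split: str, scene_name: str, problem_id: str) -> Dict[str, Path]:
    """Build the expected file paths for one instance under <split>/<scene_name>/<problem_id>/."""
    base = Path(root) / split / scene_name / problem_id
    return {
        "task": base / "task.txt",                # natural-language goal
        "description": base / "description.txt",
        "init_loc": base / "init_loc.txt",        # random robot start room
        "graph_npz": base / f"{scene_name}.npz",  # augmented scene graph
        "graph_json": base / f"{scene_name}.json",
    }


def read_text_fields(paths: Dict[str, Path]) -> Dict[str, str]:
    """Read the plain-text files that exist; missing files are skipped."""
    return {
        name: paths[name].read_text(encoding="utf-8").strip()
        for name in TEXT_FIELDS
        if paths[name].exists()
    }


# Example call (hypothetical problem id):
# fields = read_text_fields(instance_paths("dataset", "laundry", "Allensville", "problem_1"))
```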

    Scenes are taken from the 3DSG tiny split (Allensville, Parole, Shelbiana, Kemblesville) and medium split (Klickitat, Lakeville, Leonardo, Lindenwood, Markleeville, Marstons).

    How to use

    1. Generate the augmented scenes

    1. Clone the repository with submodules and install dependencies (see the GitHub README).
    2. Download the tiny and medium splits from the 3DSceneGraph repository and place these .npz files in dataset/3dscenegraph/: Allensville, Kemblesville, Klickitat, Lakeville, Leonardo, Lindenwood, Markleeville, Marstons, Parole, Shelbiana.
    3. Export your OpenAI API key and run the generation script from the repo root:
    export OPENAI_API_KEY=<your OpenAI API key>
    python3 dataset/dataset_creation.py

    Optionally add LLM-written object descriptions with python3 dataset/dataset_creation.py --description.
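Before running the generation script, it can help to confirm that all ten scene files are in place. The check below is a hedged sketch and not part of the repository: `missing_scenes` is a hypothetical helper, and because the exact file names expected by the script are not stated here, it matches any `.npz` file whose name contains the scene name.

```python
from pathlib import Path
from typing import List

# Scene names listed in step 2 above.
SCENES = [
    "Allensville", "Kemblesville", "Klickitat", "Lakeville", "Leonardo",
    "Lindenwood", "Markleeville", "Marstons", "Parole", "Shelbiana",
]


def missing_scenes(folder: str = "dataset/3dscenegraph") -> List[str]:
    """Return the scene names with no matching .npz file in `folder`."""
    root = Path(folder)
    stems = [p.stem for p in root.glob("*.npz")] if root.is_dir() else []
    # Assumption: a file counts as a match if the scene name appears in its stem.
    return [scene for scene in SCENES if not any(scene in stem for stem in stems)]


# Example call from the repo root:
# print(missing_scenes())
```

An empty list means every scene has at least one candidate file; any names returned still need to be downloaded.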

    2. Run planning experiments

    Pipelines read tasks from dataset/ (configured via data_path in config/config.yaml). Run a baseline with Hydra, for example:

    export OPENAI_API_KEY=<your OpenAI API key>
    python3 main.py pipeline=cm

    Available pipelines: cm, delta, sayplan, llm_planner. All six splits are evaluated by default. Override parameters on the command line, e.g. python3 main.py pipeline=cm pipeline.workflow_iterations=3.
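To compare all four pipelines with the same overrides, the command lines can be assembled programmatically. This is a sketch, not a script shipped with the project: `build_command` and `run_all` are hypothetical helpers, and it is an assumption that an override such as `pipeline.workflow_iterations=3` is meaningful for every pipeline, so check each pipeline's config before reusing it.

```python
import subprocess
from typing import Iterable, List

# Pipeline names as listed above.
PIPELINES = ["cm", "delta", "sayplan", "llm_planner"]


def build_command(pipeline: str, overrides: Iterable[str] = ()) -> List[str]:
    """Assemble the argument list for one run of main.py."""
    if pipeline not in PIPELINES:
        raise ValueError(f"unknown pipeline: {pipeline}")
    return ["python3", "main.py", f"pipeline={pipeline}", *overrides]


def run_all(overrides: Iterable[str] = (), dry_run: bool = True) -> List[List[str]]:
    """Build one command per pipeline; execute them only when dry_run is False."""
    overrides = list(overrides)
    commands = [build_command(name, overrides) for name in PIPELINES]
    if not dry_run:
        for command in commands:
            # Requires OPENAI_API_KEY in the environment and the repo root as cwd.
            subprocess.run(command, check=True)
    return commands


planned = run_all(["pipeline.workflow_iterations=3"])
```

With `dry_run=True` nothing is executed, so `planned` can be inspected or logged before committing to a full run.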

    BibTeX

    @article{musumeci2025context,
      title={Context Matters! Relaxing Goals with LLMs for Feasible 3D Scene Planning},
      author={Musumeci, Emanuele and Brienza, Michele and Argenziano, Francesco and Suriani, Vincenzo and Nardi, Daniele and Bloisi, Domenico D},
      journal={arXiv preprint arXiv:2506.15828},
      year={2025}
    }