Multi-agent Planning using Visual Language Models

ECAI 2024

Task: Pick up the statue and turn on the lamp next to the couch (trial_T20190909_193045_208933).


Ground truth:
STEP 1: Walk down the hallway into the living room and turn right, walk past the couch and turn left towards the brown coffee table.
STEP 2: Pick up the brown statue behind the plant in the gold pot.
STEP 3: Turn around and head to the small brown coffee table next to the couch with the lamp.
STEP 4: Turn the lamp on that's on the small table next to the couch.

Single Agent with table:
Turn around and walk to the SideTable at position (-0.589, 0.000, 1.227).
Pick up the Statue from the SideTable.
Turn around and walk to the Sofa at position (0.057, -0.002, -1.386).
Turn on the DeskLamp next to the Sofa at position (1.657, 0.654, -1.605).

Multi Agent with table:
Walk to the SideTable at position (-1.276, 1.000, 1.327).
Pick up the statue from the SideTable.
Walk to the DeskLamp at position (1.657, 0.654, -1.605).
Turn on the DeskLamp next to the couch.

Single Agent with image:
1. Walk to the statue on the cabinet to the right of the window.
2. Pick up the statue from the cabinet.
3. Turn to face the lamp on the tripod stand to the left of the couch.
4. Walk to the lamp on the tripod stand.
5. Turn on the lamp next to the couch.

Multi Agent with image:
1. Walk to the cabinet that is next to the wall on the right side of the room.
2. Reach for the statue that is on the cabinet.
3. Pick up the statue from the cabinet.
4. Walk to the table that is next to the couch on the left side of the room.
5. Locate the lamp that is on the table.
6. Reach for the switch on the lamp.
7. Turn on the lamp next to the couch.

Architecture