Multi-agent Planning using Visual Language Models

ECAI 2024

Move a tea pot to the lamp, turn on the lamp (trial_T20190907_161326_928347)

Ground truth:
STEP 1: Move across the room to the wood TV stand
STEP 2: Pick up the vase that is on the shelf in the middle of the TV stand
STEP 3: Carry the vase to the floor lamp to the left of the chair
STEP 4: Turn on the lamp

Single Agent with table:
Turn around and walk to the dining table.
Take the tea pot from the dining table.
Turn around and walk to the floor lamp.
Place the tea pot next to the floor lamp.
Turn the floor lamp on.

Multi Agent with table:
Turn around and walk to the kitchen counter.
Take the teapot from the kitchen counter.
Turn around and walk to the floor lamp.
Place the teapot near the floor lamp.
Turn around and walk to the light switch.
Flip the light switch to turn on the floor lamp.

Single Agent with image:
1. Walk to the tea pot on the dining table in the center of the room.
2. Pick up the tea pot from the dining table.
3. Turn towards the lamp hanging above the dining table.
4. Place the tea pot near the base of the lamp on the dining table.
5. Walk to the light switch on the wall to the right of the white door.
6. Turn on the lamp using the light switch.

Multi Agent with image:
1. Stand up from your current position.
2. Walk to the ottoman at the position directly in front of you.
3. Walk beyond the ottoman to reach the dining area.
4. Walk to the dining table in the room beyond the couch.
5. Pick up the teapot from the dining table.
6. Inspect the lamp above the dining table to locate the switch.
7. If the switch is on the lamp, reach up to the lamp above the dining table and turn it on.
8. If the switch is on the wall, walk to the nearest wall switch in the dining area and turn on the lamp.
9. Place the teapot near the base of the lamp on the dining table.

Architecture