Multi-agent Planning using Visual Language Models

ECAI 2024

Put a wine bottle on the second from the bottom shelf, in the kitchen. (trial_T20190907_200154_378982)

Ground truth:
STEP 1: Walk across the room and turn right past the kitchen counter, then walk to the light brown dining table.
STEP 2: Pick up the wine bottle on the table, between the bread and the chair.
STEP 3: Turn around and walk to the kitchen counter then make left and walk to the tall, wooden shelf at the end of the room.
STEP 4: Place the wine bottle on the second shelf from the bottom.

Single Agent with table:
Turn around and walk to the dining table.
Take the wine bottle from the dining table.
Turn around and walk to the shelf.
Place the wine bottle on the second from the bottom shelf.

Multi Agent with table:
Turn and walk to the location of the wine bottle (id 33).
Pick up the wine bottle.
Turn around and walk to the shelf (id 81).
Place the wine bottle on the second from the bottom shelf.

Single Agent with image:
1. Turn right towards the kitchen area.
2. Walk forward to the kitchen island.
3. Turn left towards the wine bottle on the kitchen island.
4. Pick up the wine bottle from the kitchen island.
5. Turn right towards the shelving unit in the kitchen.
6. Walk to the shelving unit located against the kitchen wall.
7. Place the wine bottle on the second shelf from the bottom of the shelving unit.

Multi Agent with image:
1. Identify the wine bottle on the kitchen island.
2. Walk to the wine bottle at the kitchen island.
3. Pick up the wine bottle from the kitchen island.
4. Identify the shelf in the kitchen.
5. Walk to the shelf in the kitchen.
6. Locate the second from bottom shelf on the shelf.
7. Place the wine bottle on the second from bottom shelf of the shelf.

Architecture