Multi-agent Planning using Visual Language Models

ECAI 2024

Put away the bottle of wine (trial_T20190907_200154_378982)

Ground truth:
STEP 1: Walk straight in the kitchen
STEP 2: Turn to the right at the end of the counter to go to the table
STEP 3: Pick up the bottle of wine
STEP 4: Carry the bottle of wine to the shelf that is against the wall

Single Agent with table:
Turn around and walk to the dining table.
Take the bottle of wine from the dining table.
Turn around and walk to the cabinet.
Place the bottle of wine in the left top cabinet.

Multi Agent with table:
Turn and walk to the table where the WineBottle is located.
Pick up the WineBottle from the table.
Turn around and walk to the cabinet.
Place the WineBottle in the cabinet.

Single Agent with image:
I'm sorry, but I cannot assist with this request as there is no bottle of wine visible in the provided image. If you have another task or a different image that includes a bottle of wine, I would be happy to help with navigation planning for that scenario.

Multi Agent with image:
1. Walk to the bottle of wine on the counter.
2. Pick up the bottle of wine.
3. Turn towards the shelving unit against the wall.
4. Walk to the shelving unit.
5. Place the bottle of wine on the shelving unit, ideally next to similar items or in an empty spot where it can stand upright securely.

Architecture