Multi-agent Planning using Visual Language Models

ECAI 2024

move the cell phone to the night stand (trial_T20190909_100946_496614)

Ground truth:
STEP 1: Turn towards the bed
STEP 2: pick up the cell phone sitting on the bed
STEP 3: move over towards the nightstand next to the bed
STEP 4: place the cell phone on the nightstand

Single Agent with table:
Turn around and walk to the bed.
Pick up the cell phone from the bed.
Turn around and walk to the side table.
Place the cell phone on the side table.

Multi Agent with table:
Turn and walk to the cell phone at position (1.640, 0.418, 0.585).
Pick up the cell phone.
Turn and walk to the SideTable at position (2.243, -0.001, 1.331).
Place the cell phone on the SideTable.

Single Agent with image:
1. Walk to the cell phone located on the dresser near the window.
2. Pick up the cell phone.
3. Turn towards the nightstand next to the bed.
4. Walk to the nightstand.
5. Place the cell phone on the nightstand.

Multi Agent with image:
1. **Locate the Dresser**: Since the cell phone is initially on the dresser, walk to the dresser which is positioned to the right of the bed.
2. **Pick Up Cell Phone**: Once at the dresser, pick up the cell phone which is on top of the dresser among other items like a laptop, books, and a glass.
3. **Walk to the Night Stand**: After picking up the cell phone, turn towards the left side of the bed where the night stand is located.
4. **Place Cell Phone on Night Stand**: Upon reaching the night stand, place the cell phone on it. Ensure it is next to the alarm clock and the lamp, which are already on the night stand.

Architecture