The Collaborative Building Task
We define the Collaborative Building Task as a two-player game between an Architect (A) and a Builder (B). A is given a target structure (Target) and must instruct B, via a text chat interface, to build a copy of Target in a given build region. A and B can communicate back and forth via chat throughout the game (e.g. to resolve confusion or to correct B's mistakes). B has access to an inventory of 120 blocks in six colors, which it can place and remove. A can observe B and move around in its world, allowing it to provide instructions from varying perspectives, but A cannot move blocks and remains invisible to B. The task is complete when the structure built by B (Built) matches Target, up to translations within the horizontal plane and rotations about the vertical axis. Built must also lie completely within the boundaries of the predefined build region.
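The completion criterion can be illustrated with a minimal sketch. It is not the check used in our platform. It assumes that blocks sit on an integer grid, that y is the vertical axis, and that only rotations in multiples of 90 degrees are relevant. The names `Block`, `matches`, `within_region` and `is_complete`, and the inclusive region bounds, are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical block representation: (x, y, z, color), with y as the vertical axis.
Block = Tuple[int, int, int, str]


def rotate_y(blocks: Iterable[Block], quarter_turns: int) -> Set[Block]:
    """Rotate blocks about the vertical axis by quarter_turns * 90 degrees."""
    out = set(blocks)
    for _ in range(quarter_turns % 4):
        # A 90-degree rotation in the horizontal plane maps (x, z) to (-z, x).
        out = {(-z, y, x, c) for (x, y, z, c) in out}
    return out


def normalize(blocks: Iterable[Block]) -> FrozenSet[Block]:
    """Shift blocks horizontally so that the smallest x and z are 0 (y is untouched)."""
    blocks = set(blocks)
    if not blocks:
        return frozenset()
    min_x = min(b[0] for b in blocks)
    min_z = min(b[2] for b in blocks)
    return frozenset((x - min_x, y, z - min_z, c) for (x, y, z, c) in blocks)


def matches(built: Iterable[Block], target: Iterable[Block]) -> bool:
    """True if built equals target up to horizontal translation and 90-degree rotation."""
    built = set(built)
    target_norm = normalize(target)
    return any(normalize(rotate_y(built, k)) == target_norm for k in range(4))


def within_region(built, x_range, y_range, z_range) -> bool:
    """True if every block lies inside the (inclusive) bounds of the build region."""
    return all(
        x_range[0] <= x <= x_range[1]
        and y_range[0] <= y <= y_range[1]
        and z_range[0] <= z <= z_range[1]
        for (x, y, z, _) in built
    )


def is_complete(built, target, x_range, y_range, z_range) -> bool:
    """Task is complete when Built lies in the region and matches Target."""
    built = set(built)
    return within_region(built, x_range, y_range, z_range) and matches(built, target)
```

Normalizing only x and z means a structure built at the wrong height does not count as a match, which is consistent with allowing translations only within the horizontal plane.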
Dataset Examples
The following sequence shows intermittent screenshots from an instance of the Collaborative Building Task:
Implementation
We implement this task within Malmo, a version of Minecraft for AI research that also includes an API for creating agents. We have built a data collection platform and used it to collect the Minecraft Dialogue Corpus, which consists of 509 human-human written dialogues, screenshots and complete game logs for this task.
For more details on the task, the human-human dialogue dataset we created, baseline models, etc., please refer to our ACL 2019 paper on this work (supplementary here).
The Minecraft Dialogue Corpus
NB: Code and data will be made available shortly
As part of the task, we record the game log as well as screenshots.
The human-human dialogue data we collected can be accessed here. There are two zip files:
- The Minecraft Dialogue Corpus -- with screenshots.zip
- The Minecraft Dialogue Corpus -- no screenshots.zip
The former also contains the screenshots taken during the Collaborative Building Task. If you do not need them, we recommend downloading the zip file without screenshots, as it is significantly smaller.
This document in the same Google Drive folder describes the data we collect and the data format used for our game logs.
This file contains the data splits we use for modeling purposes. These splits were done across target structures. There are three sets in it: train (target structures used in training data), test (target structures used in test data) and val (target structures used in validation data). Hence, all of the corresponding dialogue data collected for a certain target structure goes into the data split which the structure has been assigned to.
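Splitting across target structures can be sketched as follows. This is an illustration only and does not reflect the actual format of the splits file or the game logs: the dictionary layout, the `"target"` field and the function name `assign_dialogues` are all hypothetical.

```python
from typing import Dict, List


def assign_dialogues(
    dialogues: List[dict], splits: Dict[str, List[str]]
) -> Dict[str, List[dict]]:
    """Place each dialogue in the split that its target structure belongs to.

    `splits` maps a split name (e.g. "train") to a list of target structure
    names; each dialogue is a dict with a hypothetical "target" field.
    """
    # Invert the mapping, checking that no structure appears in two splits.
    structure_to_split: Dict[str, str] = {}
    for split_name, structures in splits.items():
        for structure in structures:
            if structure in structure_to_split:
                raise ValueError(f"{structure} is assigned to more than one split")
            structure_to_split[structure] = split_name

    # Every dialogue follows its target structure into that structure's split.
    assigned: Dict[str, List[dict]] = {name: [] for name in splits}
    for dialogue in dialogues:
        assigned[structure_to_split[dialogue["target"]]].append(dialogue)
    return assigned
```

Because assignment is keyed on the target structure, no structure seen during training can appear in the validation or test data.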
The Architect utterance generation subtask
Although the Minecraft Dialogue Corpus was motivated by our ultimate goal of building agents that can successfully play an entire collaborative building game as Architect or Builder, we first consider the task of Architect utterance generation: given access to the entire game state context leading up to a certain point in a human-human game at which the human Architect spoke next, we aim to generate a suitable Architect utterance.
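As a rough sketch of how this subtask could be framed as data, each Architect utterance in a game yields one example: the context is everything that happened before it, and the target is the utterance itself. The event schema below (`"type"`, `"speaker"`, `"text"`) and the function name `architect_samples` are hypothetical and do not reflect our game log format or our baseline code.

```python
from typing import List, Tuple


def architect_samples(events: List[dict]) -> List[Tuple[List[dict], str]]:
    """Return (context, utterance) pairs, one per Architect chat event.

    `events` is a chronologically ordered list of game events. The context
    for an utterance is the list of all events that precede it.
    """
    samples = []
    for i, event in enumerate(events):
        if event.get("type") == "chat" and event.get("speaker") == "Architect":
            samples.append((events[:i], event["text"]))
    return samples


# Toy game: two Architect utterances, so two examples are produced.
game = [
    {"type": "chat", "speaker": "Architect", "text": "place a red block"},
    {"type": "action", "speaker": "Builder", "text": "place red 0 1 0"},
    {"type": "chat", "speaker": "Builder", "text": "like this?"},
    {"type": "chat", "speaker": "Architect", "text": "yes, now a blue one on top"},
]
samples = architect_samples(game)
```

In this toy game the second example's context contains the first Architect utterance, the Builder's action and the Builder's question.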
Our Systems
You will be able to try out the following systems for the Collaborative Building Task:
- Data collection (to collect human-human dialogue data)
- Creating target structures (to create new target structures for the task)
- Baseline models we developed for the Architect utterance generation subtask
Installation Instructions
Instructions for data collection and creating target structures will be made available shortly
- Clone the following repo, which hosts our machine learning modeling code for the Architect system: https://github.com/prashant-jayan21/minecraft-dialogue-models
- To run the baseline models for the Architect utterance generation subtask, follow this
Citing our work
If you use this work, please cite:
@inproceedings{narayan-chen-etal-2019-collaborative,
title = "Collaborative Dialogue in {M}inecraft",
author = "Narayan-Chen, Anjali and
Jayannavar, Prashant and
Hockenmaier, Julia",
booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2019",
address = "Florence, Italy",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/P19-1537",
pages = "5405--5415",
}
Acknowledgements
This work was supported by Contract W911NF-15-1-0461 with the US Defense Advanced Research Projects Agency (DARPA) Communicating with Computers Program and the Army Research Office (ARO). Approved for Public Release, Distribution Unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government.