This repo provides code for GraphEQA, a novel approach for utilizing 3D scene graphs for embodied question answering (EQA), introduced in the paper GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering.
If you find GraphEQA relevant or useful for your research, please use the following citation:
```bibtex
@misc{saxena2024grapheqausing3dsemantic,
  title={GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering},
  author={Saumya Saxena and Blake Buchanan and Chris Paxton and Bingqing Chen and Narunas Vaskevicius and Luigi Palmieri and Jonathan Francis and Oliver Kroemer},
  year={2024},
  eprint={2412.14480},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2412.14480},
}
```

This work was in part supported by the National Science Foundation under Grant No. CMMI-1925130 and in part by the EU Horizon 2020 research and innovation program under grant agreement No. 101017274 (DARKO). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.
Below are instructions for how to set up a workspace to run GraphEQA. We provide instructions for both Docker and a local setup if you happen to be running Ubuntu 20.04.
Owners and collaborators of this repo do not claim to have developed anything original to Hydra or any other MIT SPARK Lab tools.
- Install Docker.
- Run:

  ```bash
  git clone https://github.com/SaumyaSaxena/graph_eqa.git
  ```

  GraphEQA is also pip-installable should you prefer installing it in a Python environment (see below).
There are two Docker images for GraphEQA to support simulation-based experiments in Habitat-Sim and embodied experiments on the Hello Robot Stretch platform.
Run the following script to build the docker image locally:

```bash
./docker/docker_build_with_habitat.sh
```

If you have a reasonably fast internet connection, it may be faster to pull the image (around 30 GB) directly from Docker Hub:

```bash
docker pull blakerbuchanan/grapheqa_for_habitat:0.0.1
```

The following script instantiates a container based on that image:

```bash
./docker/docker_run_habitat.sh
```

If the docker container is already running and you want to attach to it from another terminal instance, run:

```bash
docker exec -it grapheqa-for-habitat bash
```

You will need to download the HM3D dataset and set up grapheqa_habitat.yaml to point to the directories on your system that contain the HM3D dataset, the Explore-EQA dataset, etc. (see below).
Run the following script to build the docker image locally:

```bash
./docker/docker_build.sh
```

If you have a reasonably fast internet connection, it may be faster to pull the image directly from Docker Hub:

```bash
docker pull blakerbuchanan/grapheqa_for_stretch:0.0.1
```

The following script will run a container:

```bash
./docker/docker_run.sh
```

If the docker container is already running and you want to attach to it from another terminal instance, run:

```bash
docker exec -it grapheqa-for-stretch bash
```

This set of instructions is only for local Ubuntu 20.04 installations.
- Install our fork of Hydra following the instructions at this link. Verify that you are on the `grapheqa` branch.
- If you do not have conda, install it. Then create and activate a conda environment:

  ```bash
  conda create -n grapheqa python=3.10
  conda activate grapheqa
  ```

- Follow the instructions for installing the Hydra Python bindings inside the conda environment created above. Before installing, be sure to source the Hydra catkin workspace; otherwise the installation of the Python bindings will fail.
- Install Habitat Simulator in the `grapheqa` conda environment.
The HM3D dataset along with semantic annotations can be downloaded from here. Follow the instructions here to download the train/val scenes, semantic annotations, and configs together. The train set is used for HM-EQA and the val set for OpenEQA. The data folder should look as follows:
```
hm3d/train
├─ hm3d-train-semantic-annots-v0.2
|  ├─ ...
├─ 00366-fxbzYAGkrtm
|  ├─ fxbzYAGkrtm.basis.glb
|  ├─ fxbzYAGkrtm.basis.navmesh
|  ├─ fxbzYAGkrtm.semantic.glb
|  ├─ fxbzYAGkrtm.semantic.txt
├─ ...
```

For the Explore-EQA question dataset, navigate to this repo and download questions.csv and scene_init_poses.csv into a directory in your workspace. The OpenEQA question dataset, open-eqa-v0.json, can be downloaded from here; this is the only file from that repo you will need. Place it in the same directory as the previous .csv files.
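As a sanity check after downloading, a short script can confirm each scene directory contains the four per-scene files shown in the layout above. This is a minimal sketch, not part of the GraphEQA codebase; the function names and the data-root argument are illustrative only.

```python
# Sketch: verify the expected HM3D per-scene file layout described above.
# The scene-directory naming convention (e.g. "00366-fxbzYAGkrtm") comes
# from the tree shown above; paths are placeholders for your own download.
from pathlib import Path


def expected_scene_files(scene_dir_name: str) -> list[str]:
    """Return the four per-scene files, e.g. for '00366-fxbzYAGkrtm'."""
    scene_id = scene_dir_name.split("-", 1)[1]  # drop the numeric prefix
    return [
        f"{scene_id}.basis.glb",
        f"{scene_id}.basis.navmesh",
        f"{scene_id}.semantic.glb",
        f"{scene_id}.semantic.txt",
    ]


def missing_files(data_root: str, scene_dir_name: str) -> list[str]:
    """List any expected files absent from data_root/scene_dir_name."""
    scene_dir = Path(data_root) / scene_dir_name
    return [
        f for f in expected_scene_files(scene_dir_name)
        if not (scene_dir / f).exists()
    ]
```

Running `missing_files("/path/to/hm3d/train", "00366-fxbzYAGkrtm")` should return an empty list if that scene downloaded correctly.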
If running GraphEQA on the Stretch RE2 platform, follow the installation instructions at our fork of stretch_ai found here.
Both of the above docker containers come with GraphEQA installed by default. However, the installed copy may not have the latest commits; run `git pull` to ensure it is up to date. If you wish to develop the graph_eqa package, clone and install it in the `grapheqa` conda environment:

```bash
git clone git@github.com:SaumyaSaxena/graph_eqa.git
cd graph_eqa
pip install -e .
```

The OpenAI API requires an API key. Add the following line to your .bashrc:
```bash
export OPENAI_API_KEY=<YOUR_OPENAI_KEY>
```
If using Google's Gemini, add the following line to your .bashrc:
```bash
export GOOGLE_API_KEY=<YOUR_GOOGLE_KEY>
```
If using Anthropic's Claude, add the following line to your .bashrc:

```bash
export ANTHROPIC_API_KEY=<YOUR_ANTHROPIC_KEY>
```
If using Llama's Maverick, set up a together.ai account and API key, then add the following line to your .bashrc:

```bash
export TOGETHER_API_KEY=<YOUR_TOGETHER_KEY>
```
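Before launching an experiment, it can save a failed run to confirm that the key for your chosen backend is actually visible to the process. This is a small illustrative helper, not part of GraphEQA; the backend names in the mapping are assumptions, while the environment-variable names match the exports above.

```python
# Sketch: check that the API key for a given VLM backend is set.
# Backend labels ("openai", "gemini", ...) are illustrative; the
# environment-variable names match the .bashrc exports above.
import os

KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
    "together": "TOGETHER_API_KEY",
}


def key_is_set(backend: str) -> bool:
    """True if the env var for `backend` exists and is non-empty."""
    return bool(os.environ.get(KEY_VARS[backend], "").strip())
```

Remember to `source ~/.bashrc` (or open a new terminal) after editing the file, or the exports will not be visible.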
Update paths to the Explore-EQA data: change the question_data_path and init_pose_data_path fields in grapheqa_habitat.yaml to point to the explore-eqa_semnav/data/questions.csv and explore-eqa_semnav/data/scene_init_poses.csv files, respectively, wherever you downloaded them.

Update paths to the HM3D data: change the scene_data_path and semantic_annot_data_path fields in grapheqa_habitat.yaml to correspond to the directories where you downloaded the HM3D data.
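Putting the two steps above together, the relevant portion of grapheqa_habitat.yaml might look like the sketch below. The field names come from the instructions above; the example paths are placeholders you must replace with your own directories.

```yaml
# Example paths only; replace with your local directories.
scene_data_path: /data/hm3d/train
semantic_annot_data_path: /data/hm3d/train/hm3d-train-semantic-annots-v0.2
question_data_path: /data/explore-eqa_semnav/data/questions.csv
init_pose_data_path: /data/explore-eqa_semnav/data/scene_init_poses.csv
```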
To run GraphEQA with Habitat Sim on the HM-EQA dataset, run:

```bash
python scripts/run_vlm_planner_hmeqa_habitat.py -cf grapheqa_hmeqa_habitat
```

Results will be saved in the graph_eqa/outputs directory.
Similarly, to run GraphEQA with Habitat Sim on the OpenEQA dataset, run:

```bash
python scripts/run_vlm_planner_openeqa_habitat.py -cf grapheqa_openeqa_habitat
```

To run GraphEQA on Hello Robot's Stretch platform, you will need to run the server on the Stretch robot following the instructions at this fork. Once you have successfully launched the server, open a terminal on your computer (the client side) and run:

```bash
cd graph_eqa
python scripts/run_vlm_planner_eqa_stretch.py -cf grapheqa_stretch
```