MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion
This work was presented at the 19th IEEE International Conference on Automatic Face and Gesture Recognition (FG2025), where it received the Best Student Paper Award.
Authors: Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, Takahiro Komamizu, Ichiro Ide
This repository contains the implementation of MultiTSF on the MultiSensor-Home dataset.
- Download dataset: https://huggingface.co/datasets/thanhhff/MultiSensor-Home1/
A simple way to download the dataset:
```bash
# Make sure hf CLI is installed: pip install -U "huggingface_hub[cli]"
hf download thanhhff/MultiSensor-Home1 --repo-type=dataset --local-dir dataset
```
The Python code was developed and tested in the environment specified in `requirements.txt`.
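For example, a typical setup with a virtual environment (assuming Python 3 and `pip` are available):

```bash
# Create an isolated environment and install the pinned dependencies.
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```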
Experiments on the MultiSensor-Home dataset were conducted on four NVIDIA A100 GPUs, each with 32 GB of memory.
You can adjust the `batch_size` parameter in the code to accommodate GPUs with less memory.
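For instance, to locate where `batch_size` is set (a generic search, not a documented entry point of this repository):

```bash
# Find every place the batch size is configured in the Python sources.
grep -rn "batch_size" --include="*.py" .
```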
Download the MultiSensor-Home dataset and place it in the `dataset/MultiSensor-Home` directory.
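For example, the CLI command above can target the expected directory directly (a sketch, assuming the dataset files should sit at the top level of `dataset/MultiSensor-Home`):

```bash
# Download the dataset straight into the directory the code expects.
hf download thanhhff/MultiSensor-Home1 --repo-type=dataset --local-dir dataset/MultiSensor-Home
```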
To train the model, execute the following command:
```bash
bash ./scripts/train.sh
```
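To pin the run to specific GPUs, the standard CUDA device mask can be used (assuming the script uses all GPUs visible to the process):

```bash
# Restrict training to four GPUs, matching the setup described above.
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/train.sh
```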
To perform inference, use the following command:
```bash
bash ./scripts/infer.sh
```
If you use the MultiSensor-Home dataset or this code, please cite:

```bibtex
@inproceedings{nguyen2025multisensor,
  author    = {Trung Thanh Nguyen and Yasutomo Kawanishi and Vijay John and Takahiro Komamizu and Ichiro Ide},
  title     = {MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion},
  booktitle = {Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition},
  year      = {2025},
  note      = {Best Student Paper Award}
}
```
This work was partly supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Numbers JP21H03519 and JP24H00733.