Here is our code repository to reproduce the basic results of the article “FIJO”: a French Insurance Soft Skill Detection Dataset.
The following details the steps needed to fetch our dataset and reproduce our results. Since each step may require several commands and/or command-line arguments, we provide a Makefile to make reproduction easier.
- Download our dataset with the following command:
make download-dataset
or download it manually here and unzip it into the data/ directory at the root of the repository.
- Once the dataset is downloaded, you can reproduce our dataset statistics as well as the results for each of our models with a single command:
- To reproduce dataset stats, run the following command:
make reproduce-stats
- To reproduce our bi-LSTM model results, run:
make reproduce-biLstm [device=0] [local_logging=False] [remote_logging=False]
- To reproduce our CamemBERT frozen model results, run:
make reproduce-camembertFrozen [device=0] [local_logging=False] [remote_logging=False]
- To reproduce our CamemBERT unfrozen model results, run:
make reproduce-camembertUnfrozen [device=0] [local_logging=False] [remote_logging=False]
- To reproduce our CamemBERT unfrozen warmup model results, run:
make reproduce-camembertUnfrozenWarmup [device=0] [local_logging=False] [remote_logging=False]
N.B.: The last four commands accept three optional arguments:
- device: indicates which GPU device to use, if any. Defaults to 0.
- local_logging: boolean flag indicating whether to log model weights and metrics locally. Bear in mind that the CamemBERT-based models have quite a high memory footprint. Defaults to False.
- remote_logging: boolean flag indicating whether to log model metrics remotely using Weights & Biases. If True, you must be logged in to a Weights & Biases account locally. Defaults to False.
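As an illustration of overriding these defaults, here is a minimal sketch that assembles one of the commands above with explicit values. The values are only examples, and it assumes that device is a GPU index and that the boolean flags are spelled True/False as in the defaults shown above.

```shell
# Illustrative values only; the argument names are the three optional arguments described above.
target="reproduce-camembertFrozen"
device=1               # assumed to be a GPU index (the default is 0)
local_logging=True     # keep model weights and metrics on disk
remote_logging=False   # leave Weights & Biases logging disabled

# Assemble the command and print it; run the printed line from the repository root.
cmd="make ${target} device=${device} local_logging=${local_logging} remote_logging=${remote_logging}"
echo "${cmd}"
```

Any argument left out of the command keeps its default value.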
The installation of all the dependencies is handled automatically.
If you wish to run the python/pip commands manually, or if you're encountering problems with make, you can check out the annotated Makefile.
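If make is available, one way to see the underlying commands without reading the whole Makefile is make's dry-run flag -n, which prints a target's recipe instead of executing it. The sketch below assumes it is run from the repository root; reproduce-stats is just one of the targets listed above, and any other target can be substituted.

```shell
# Print the commands behind a target without running them (-n is make's dry-run flag).
target="reproduce-stats"

if [ -f Makefile ]; then
    make -n "${target}"
else
    # Not in the repository root: nothing to inspect here.
    echo "No Makefile found; run this from the root of the repository." >&2
fi
```

The printed commands can then be copied and run one at a time, which helps narrow down which step is failing.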
@article{
title = {{"FIJO": a French Insurance Soft Skill Detection Dataset}},
author = {Beauchemin, David and Laumonier, Julien and Le Ster, Yvan and Yassine, Marouane},
year={2022},
eprint={arXiv: 2204.05208}
}