CCF-dataset

This repository contains the code to reproduce the basic results of the article “FIJO”: a French Insurance Soft Skill Detection Dataset.

The steps below describe how to fetch our dataset and reproduce our results. Since each step may require several commands and/or command-line arguments, we provide a Makefile to make reproduction easier.

Steps

Download our dataset with the following command:
```
make download-dataset
```
or download it manually here and unzip it into the data/ directory at the root of the repository.
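For the manual route, the unzip step can be sketched in Python as follows. This is a minimal sketch, not part of the repository: the function name `extract_dataset` and the archive file name are hypothetical, and it assumes the archive does not already contain a top-level data/ folder.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive_path, repo_root="."):
    """Unzip a manually downloaded archive into <repo_root>/data/.

    Hypothetical helper: the repository's own download script may
    lay the files out differently.
    """
    data_dir = Path(repo_root) / "data"
    # Create data/ if it does not exist yet.
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(data_dir)
    return data_dir


# Example call (the file name is an assumption):
# extract_dataset("fijo.zip")
```

If the archive turns out to contain its own data/ folder, extracting into the repository root instead would avoid a nested data/data/ directory.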
Once the dataset is downloaded, you can reproduce our dataset statistics and the results of each of our models, each with a single command:
- To reproduce the dataset statistics, run the following command:
```
make reproduce-stats
```
- To reproduce our bi-LSTM model results, run:
```
make reproduce-biLstm [device=0] [local_logging=False] [remote_logging=False]
```
- To reproduce our CamemBERT frozen model results, run:
```
make reproduce-camembertFrozen [device=0] [local_logging=False] [remote_logging=False]
```
- To reproduce our CamemBERT unfrozen model results, run:
```
make reproduce-camembertUnfrozen [device=0] [local_logging=False] [remote_logging=False]
```
- To reproduce our CamemBERT unfrozen warmup model results, run:
```
make reproduce-camembertUnfrozenWarmup [device=0] [local_logging=False] [remote_logging=False]
```
N.B.: The last four commands accept three optional arguments:
- device: indicates which GPU device to use, if any. By default 0.
- local_logging: boolean flag indicating whether to log model weights and metrics locally. Bear in mind that the CamemBERT-based models have quite a high memory footprint. By default False.
- remote_logging: boolean flag indicating whether to log model metrics remotely using Weights & Biases. If True, you must be logged in to a Weights & Biases account locally. By default False.
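To show how the optional arguments combine with the four model targets, here is a minimal Python sketch that assembles the corresponding make invocations. The target names and argument names come from the commands above; the helper names `build_make_command` and `reproduce_all`, and the `dry_run` switch, are hypothetical and not part of the repository.

```python
import subprocess

# Make targets listed above for the four model experiments.
MODEL_TARGETS = [
    "reproduce-biLstm",
    "reproduce-camembertFrozen",
    "reproduce-camembertUnfrozen",
    "reproduce-camembertUnfrozenWarmup",
]


def build_make_command(target, device=0, local_logging=False, remote_logging=False):
    """Return the argument list for one target with its optional arguments."""
    return [
        "make",
        target,
        f"device={device}",
        f"local_logging={local_logging}",
        f"remote_logging={remote_logging}",
    ]


def reproduce_all(device=0, local_logging=False, remote_logging=False, dry_run=True):
    """Build the command for every model target; run them unless dry_run is set."""
    commands = [
        build_make_command(target, device, local_logging, remote_logging)
        for target in MODEL_TARGETS
    ]
    for command in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(command))
        else:
            # Stop at the first failing target.
            subprocess.run(command, check=True)
    return commands
```

Calling `reproduce_all(device=1, remote_logging=True)` prints the four commands without running them; passing `dry_run=False` executes them one after another from the repository root.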

The installation of all the dependencies is handled automatically.

If you wish to run the python/pip commands manually, or if you're encountering problems with make, you can check out the annotated Makefile.

Cite this article

@article{
  title = {{"FIJO": a French Insurance Soft Skill Detection Dataset}},
  author = {Beauchemin, David and Laumonier, Julien and Le Ster, Yvan and Yassine, Marouane},
  year = {2022},
  eprint = {arXiv: 2204.05208}
}

