The aim of this project is to program a neural language model using a multi-layer perceptron. This language model takes as input the plunges of
In this folder you will find the following folders and files:
/config: This project contains a configuration system. This file is the basis. See the section on how to run the code for a better understanding. It contains:config.yaml: the configuration file for the basic AI model. It also contains a description of all the parameters.utils.py: python script to run the configuration system.
/data: contains the data and the embedding learned from Word2Vec. Contains alsosplit_data.pyin order to have a validation data./generate: contains the input and ouput file to the generation./logs: contains all experiments (FROZEN, SCRATCH, ADAPT -> see report), with models configuration, train logs, weights, learning curves, .../report: contains final report (pdf and tex version)./src: containts the following python code:data.py: load data.genere.py: process the generation from the input file with an experiement.loss.py: definition of perplexity loss.metrics.py: compute metrics.model.py: load model according the configuration.test.py: run a test on an experiement.train.py: train a new experiment.
/utils: contains usefull python script.main.py: python code that centralizes training, testing and generation.README.md: this file;requierements.txt: list of all python packages with their versions.tp_mlp.pdf: project subject.
To run the code you need python (We use python 3.9.13). You can run the following code to install all packages in the correct versions:
pip install -r requirements.txt
To run the program, simply execute the main.py file. However, there are several modes.
To do this, you need to choose a .yaml configuration file to set all the training parameters. By default, the code will use the config/configs.yaml file. The code will create a folder: 'name' in logs to store all the training information, such as a copy of the configuration used, the loss and metrics values at each epoch, the learning curves and the model weights.
To run a training session, enter the following command:
python main.py --mode trainIf you want use a specific configuration, you can add --config <path to the configuration>
To run a test, you have to choose the your experiment, and run this line:
python main.py --mode test --path <path to the experiment>If you want generate a text from a input, you can write your input in generate\input.txt and you can run a generation according to an experiment with:
python main.py --mode generate --path <path to your experiment>Then, the model will generate a text and save it in generate\output_<name of experiment>.