EPTX: a Faithful Explainer for STory-form Algebra

This is a repository for the paper EPTX.

Prerequisites

You need Python 3.8+ to run this repository.
Install libraries using: pip install -r requirements.txt

How to run training/evaluation

Download dataset files from here, and extract it.
Modify dataset path specified in the './run.sh' and execute the file.
Or, run python script files

Training & Hyperparameter searching

usage: train_model.py [-h] --name NAME --dataset DATASET [--seed SEED]
                      --experiment-dir EXPERIMENT_DIR [--beam-expl BEAM_EXPL]
                      [--beam-expr BEAM_EXPR]
                      [--window-size WINDOW_SIZE [WINDOW_SIZE ...]]
                      [--max-iter MAX_ITER]
                      [--stop-conditions [STOP_CONDITIONS [STOP_CONDITIONS ...]]]
                      [--model {EPT,EPTX,SWAN_A,SWAN_B,SWAN_P1} [{EPT,EPTX,SWAN_A,SWAN_B,SWAN_P1} ...]]
                      [--encoder ENCODER] [--equation-hidden EQUATION_HIDDEN]
                      [--equation-intermediate EQUATION_INTERMEDIATE]
                      [--equation-layer EQUATION_LAYER [EQUATION_LAYER ...]]
                      [--equation-head EQUATION_HEAD]
                      [--explanation-shuffle EXPLANATION_SHUFFLE]
                      [--log-path LOG_PATH] [--num-cpu NUM_CPU]
                      [--num-gpu NUM_GPU] [--opt-lr OPT_LR [OPT_LR ...]]
                      [--opt-beta1 OPT_BETA1] [--opt-beta2 OPT_BETA2]
                      [--opt-eps OPT_EPS] [--opt-grad-clip OPT_GRAD_CLIP]
                      [--opt-warmup OPT_WARMUP [OPT_WARMUP ...]]
                      [--batch-size BATCH_SIZE]

optional arguments:
  -h, --help            show this help message and exit

Dataset & Evaluation:
  --name NAME, -name NAME
  --dataset DATASET, -data DATASET
  --seed SEED, -seed SEED
  --experiment-dir EXPERIMENT_DIR, -exp EXPERIMENT_DIR
  --beam-expl BEAM_EXPL, -beamX BEAM_EXPL
  --beam-expr BEAM_EXPR, -beamQ BEAM_EXPR
  --window-size WINDOW_SIZE [WINDOW_SIZE ...], -win WINDOW_SIZE [WINDOW_SIZE ...]
  --max-iter MAX_ITER, -iter MAX_ITER
  --stop-conditions [STOP_CONDITIONS [STOP_CONDITIONS ...]], -stop [STOP_CONDITIONS [STOP_CONDITIONS ...]]

Model:
  --model {EPT,EPTX,EPTX_U,EPTX_F,EPTX_P1} [{EPT,EPTX,EPTX_U,EPTX_F,EPTX_P1} ...], -model {EPT,EPTX,EPTX_U,EPTX_F,EPTX_P1} [{EPT,EPTX,EPTX_U,EPTX_F,EPTX_P1} ...]
  --encoder ENCODER, -enc ENCODER
  --equation-hidden EQUATION_HIDDEN, -eqnH EQUATION_HIDDEN
  --equation-intermediate EQUATION_INTERMEDIATE, -eqnI EQUATION_INTERMEDIATE
  --equation-layer EQUATION_LAYER [EQUATION_LAYER ...], -eqnL EQUATION_LAYER [EQUATION_LAYER ...]
  --equation-head EQUATION_HEAD, -eqnA EQUATION_HEAD
  --explanation-shuffle EXPLANATION_SHUFFLE, -expS EXPLANATION_SHUFFLE

Logger setup:
  --log-path LOG_PATH, -log LOG_PATH

Worker setup:
  --num-cpu NUM_CPU, -cpu NUM_CPU
  --num-gpu NUM_GPU, -gpu NUM_GPU

Optimization setup:
  --opt-lr OPT_LR [OPT_LR ...], -lr OPT_LR [OPT_LR ...]
  --opt-beta1 OPT_BETA1, -beta1 OPT_BETA1
  --opt-beta2 OPT_BETA2, -beta2 OPT_BETA2
  --opt-eps OPT_EPS, -eps OPT_EPS
  --opt-grad-clip OPT_GRAD_CLIP, -clip OPT_GRAD_CLIP
  --opt-warmup OPT_WARMUP [OPT_WARMUP ...], -warmup OPT_WARMUP [OPT_WARMUP ...]
  --batch-size BATCH_SIZE, -bsz BATCH_SIZE

Training folds based on hyper-parameter search results

usage: train_fold.py [-h] --name NAME --experiment-dir EXPERIMENT_DIR
                     [EXPERIMENT_DIR ...] --model-config MODEL_CONFIG
                     [MODEL_CONFIG ...] [--log-path LOG_PATH]

optional arguments:
  -h, --help            show this help message and exit

Dataset & Evaluation:
  --name NAME, -name NAME
  --experiment-dir EXPERIMENT_DIR [EXPERIMENT_DIR ...], -exp EXPERIMENT_DIR [EXPERIMENT_DIR ...]
  --model-config MODEL_CONFIG [MODEL_CONFIG ...], -model MODEL_CONFIG [MODEL_CONFIG ...]

Logger setup:
  --log-path LOG_PATH, -log LOG_PATH

Evaluating models

usage: test_model.py [-h] --model MODEL [MODEL ...] --dataset DATASET
                     --experiment-dir EXPERIMENT_DIR [--seed SEED]
                     [--faithfulness {sufficiency,comprehensiveness,error_propagation_corr,error_propagation_tree} [{sufficiency,comprehensiveness,error_propagation_corr,error_propagation_tree} ...]]
                     [--bootstrap-trials BOOTSTRAP_TRIALS]
                     [--sample-size SAMPLE_SIZE]
                     [--repeating-counts REPEATING_COUNTS [REPEATING_COUNTS ...]]
                     [--digressing-counts DIGRESSING_COUNTS [DIGRESSING_COUNTS ...]]
                     [--perturbation-samples PERTURBATION_SAMPLES]
                     [--num-cpu NUM_CPU] [--num-gpu NUM_GPU]

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL [MODEL ...], -model MODEL [MODEL ...]
  --dataset DATASET, -data DATASET
  --experiment-dir EXPERIMENT_DIR, -exp EXPERIMENT_DIR
  --seed SEED, -seed SEED
  --faithfulness {sufficiency,comprehensiveness,error_propagation_corr,error_propagation_tree} [{sufficiency,comprehensiveness,error_propagation_corr,error_propagation_tree} ...], -faith {sufficiency,comprehensiveness,error_propagation_corr,error_propagation_tree} [{sufficiency,comprehensiveness,error_propagation_corr,error_propagation_tree} ...]
  --bootstrap-trials BOOTSTRAP_TRIALS, -ntr BOOTSTRAP_TRIALS
  --sample-size SAMPLE_SIZE, -smp SAMPLE_SIZE
  --repeating-counts REPEATING_COUNTS [REPEATING_COUNTS ...], -rep REPEATING_COUNTS [REPEATING_COUNTS ...]
  --digressing-counts DIGRESSING_COUNTS [DIGRESSING_COUNTS ...], -dig DIGRESSING_COUNTS [DIGRESSING_COUNTS ...]
  --perturbation-samples PERTURBATION_SAMPLES, -ptr PERTURBATION_SAMPLES
  --num-cpu NUM_CPU, -cpu NUM_CPU
  --num-gpu NUM_GPU, -gpu NUM_GPU

To use BLEURT, please download BLEURT-D6-20.zip from the BLEURT repository and unzip it under 'resource/bleurt'.

File structure (expected)

/train_model.py Python script file for hyperparameter searching or training model
/train_model_debug.py Python script file for debug purpose
/train_fold.py Python script file for training cross-validation folds based on the result of train_model.py
/test_model.py Python script file for evaluating model
/run.sh Shell script for training models across datasets
/common python package for constants or common functionalities
- /common/const python package for constants
  - /common/const/model.py model-specific key names
  - /common/const/operand.py definition of constant operands, index of operands, and how to represent it in the text.
  - /common/const/operator.py definition of operators, its computational procedure, and how to represent it in the text.
  - /common/const/pad.py defines PAD_ID and *_INF values
- /common/data python package for data classes
  - /common/data/__init__.py Example-level data class and its information data class
  - /common/data/base.py Base abstract classes for data representation
  - /common/data/explanation.py Explanation-related data classes
  - /common/data/encoded.py Data class for context vectors
  - /common/data/equation.py Equation-related data classes
  - /common/data/label.py Data class for target labels
  - /common/data/prediction.py Data class for predicted result
  - /common/data/text.py Data class for text input labels
- /common/seed python package for evaluating SEED dataset
  - /common/seed/metric.py Metric for explanation plausibility
  - /common/seed/parse.py Parser function for infix equation
  - /common/seed/pattern.py Frequently used regex patterns, like numbers or fractions.
  - /common/seed/solve.py Solver class that evaluates the output equation, and related functions/constants.
- /common/torch python package for adopting PyTorch
  - /common/torch/loss.py Defines Cross-entropy loss with label smoothing
  - /common/torch/util.py Various utility function for models
- /common/dataset.py Class for build minibatch and other dataset-specific things
- /common/tester.py Class for evaluating correctness & plausibility
- /common/trial.py Utility for building trial directory names (for Ray)
/experiment python package for evaluation experiments
- /experiment/__init__.py Defines experimet loader
- /experiment/base.py Abstrat class for evaluation experiment
- /experiment/endtask.py Classes for error propagation analyses
- /experiment/faithfulness.py Classes for faithfulness experiment
/model python package for model classes
- /model/__init__.py Defines model loader
- /model/base python package for defining base classes
  - /model/base/beamsearch.py Utility for beam search
  - /model/base/chkpt.py Abstract class for checkpointing a model
  - /model/base/layernorm.py Class for LayerNorm with pad masked
  - /model/base/posenc.py Defines position encoding
  - /model/base/util.py Defines utility function for defining EPT
- /model/ept python package for defining EPT (EMNLP 2020)
  - /model/ept/__init__.py Defines EPT model
  - /model/ept/attention.py Implements Multi-head Attention and Transformer Decoder layer
  - /model/ept/decoder.py Defines decoder part of EPT, except operand-context pointer
  - /model/ept/encoder.py Defines encoder part of EPT
- /model/eptx python package for defining EPTX (the proposed model)
  - /model/eptx/__init__.py Defines EPTX model (based on EPT)
  - /model/eptx/explain.py Defines explanation decoder part of EPTX, except pointer-generator head.
  - /model/eptx/pg_head.py Defines pointer-generator head for explanation decoder.
/learner python package for defining learning algorithm
- /learner/__init__.py Defines Supervised Trainer class
- /learner/const.py Defines constants for learner-specific options
- /learner/scheduler.py Defines various warm-up scheduling methods
- /learner/util.py Defines utility for defining Trainer class
/resource Directory for storing resources: Please download this folder from here.
- /resource/dataset/pen.json the PEN dataset
- /resource/experiments Directory containing split specifications
  - /resource/experiments/pen Directory containing splits of PEN dataset
  - /resource/experiments/draw Directory containing splits of DRAW-1K dataset
  - /resource/experiments/alg514-fold* Directory containing splits of ALG514-fold* dataset
  - /resource/experiments/mawps-fold* Directory containing splits of MAWPS-fold* dataset
  - Each directory has train, test, and/or dev file.

Citation

Whenever you use this code for any academic purpose, please cite the paper below.

@inproceedings{kim-etal-2022-ept,
    title = "{EPT}-{X}: An Expression-Pointer Transformer model that generates e{X}planations for numbers",
    author = "Kim, Bugeun  and
      Ki, Kyung Seo  and
      Rhim, Sangkyu  and
      Gweon, Gahgene",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.305",
    doi = "10.18653/v1/2022.acl-long.305",
    pages = "4442--4458",
    abstract = "In this paper, we propose a neural model EPT-X (Expression-Pointer Transformer with Explanations), which utilizes natural language explanations to solve an algebraic word problem. To enhance the explainability of the encoding process of a neural model, EPT-X adopts the concepts of plausibility and faithfulness which are drawn from math word problem solving strategies by humans. A plausible explanation is one that includes contextual information for the numbers and variables that appear in a given math word problem. A faithful explanation is one that accurately represents the reasoning process behind the model{'}s solution equation. The EPT-X model yields an average baseline performance of 69.59{\%} on our PEN dataset and produces explanations with quality that is comparable to human output. The contribution of this work is two-fold. (1) EPT-X model: An explainable neural model that sets a baseline for algebraic word problem solving task, in terms of model{'}s correctness, plausibility, and faithfulness. (2) New dataset: We release a novel dataset PEN (Problems with Explanations for Numbers), which expands the existing datasets by attaching explanations to each number/variable.",
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

EPTX: a Faithful Explainer for STory-form Algebra

Prerequisites

How to run training/evaluation

Training & Hyperparameter searching

Training folds based on hyper-parameter search results

Evaluating models

File structure (expected)

Citation

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
common		common
experiment		experiment
learner		learner
model		model
README.md		README.md
compute_gpu_fraction.py		compute_gpu_fraction.py
requirements.txt		requirements.txt
run.sh		run.sh
system_spec.sh		system_spec.sh
test_model.py		test_model.py
train_fold.py		train_fold.py
train_model.py		train_model.py
train_model_debug.py		train_model_debug.py

snucclab/ept-x

Folders and files

Latest commit

History

Repository files navigation

EPTX: a Faithful Explainer for STory-form Algebra

Prerequisites

How to run training/evaluation

Training & Hyperparameter searching

Training folds based on hyper-parameter search results

Evaluating models

File structure (expected)

Citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages