Code4Freedom

  • Kseniia Rydannykh
  • Manuela Posso Baena
  • Anh Minh Do
  • Huu Dang (Tony) Hoang
  • Maria Paula Gonzalez Vasquez

AIMS Hackathon - Explainable Review Pipeline

This repository contains the solution developed during the AIMS Hackathon – AI Against Modern Slavery.
It was created by Team Code4Freedom Australia for Challenge 2: AI Model Optimisation & Explainability, and includes the results, code, and experiments produced as part of the hackathon.

1. Problem Statement

Under the Modern Slavery Acts, large companies are required to submit annual modern slavery statements.
These statements are reviewed by expert organisations, and each company receives a rating/grade.

Accuracy in grading is critical:

  • A lower grade can lead to negative consequences for the company, including investor pressure, reputational risks, and regulatory scrutiny.
  • Reviewing organisations must grade without errors to maintain trust.
  • To ensure reliability, each statement is currently reviewed by three independent reviewers.

At present, only statements from large companies are being assessed. However, to scale the process to more organisations, AI-assisted solutions are required.

Existing platforms such as AIMS.au and AIMSCheck provide strong technical foundations — sentence-level annotations, cross-jurisdictional evaluations, and token-level explainability — but two challenges remain:

  1. Models sometimes hallucinate explanations, making predictions that do not follow human-like logic.
  2. Reviewing organisations need visibility into how confident the model is in its predictions.

2. Objective

To address these issues, our project:

  1. Designs a rationale-based pipeline that makes model reasoning more transparent and aligned with human logic.
  2. Introduces an analysis of confidence scores, giving reviewers an additional layer of trust in automated assessments.

3. Solution / Data Use Case Description

EXPLAINABLE AI FOR MODERN SLAVERY REVIEWS

To improve explainability and trust in the model, we designed and implemented a solution with two main components: a rationale-based pipeline and a confidence analysis pipeline, described in detail below.

3.1 Rationale-based pipeline

Idea & Architecture

The baseline models for modern slavery statements sometimes produced hallucinated explanations, highlighting tokens that did not match human logic. To address this, we designed a two-head transformer model: one head predicts the sentence category, while the second highlights the words that justify the prediction.

The rationale head was supervised with pseudo-rationales (automatically generated keywords) to teach the model not only what category to assign, but also why.

This allows predictions to be explainable and aligned with human reviewers’ reasoning.

Pipeline Steps

  1. Pseudo-rationale generation – extract compact rationales for supervision.

Notebook: 1-create-pseudo-rationales.ipynb

The pseudo-rationales were generated as follows:

  • Extracted category-specific keywords with TF-IDF,
  • Kept only high-coverage words across examples,
  • Limited to ≤5 tokens per category and removed duplicates.

During the project, we explored several approaches for rationale generation, including LLM-based methods and TF-IDF. The latter proved to be faster and more cost-efficient for processing a large dataset while still providing strong results. A detailed description and the outcomes of these experiments can be found in generate-rationales-eperiments.
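A minimal sketch of this TF-IDF approach is shown below. It is illustrative only: the `text`/`category` column names, the coverage threshold, and the vectorizer settings are assumptions, not the exact notebook code.

```python
# Sketch: generate pseudo-rationales per category with TF-IDF (assumed column names and thresholds).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

def pseudo_rationales(df, text_col="text", label_col="category",
                      max_tokens=5, min_coverage=0.05):
    """Return up to `max_tokens` high-coverage keywords per category."""
    rationales = {}
    for category, group in df.groupby(label_col):
        vectorizer = TfidfVectorizer(stop_words="english", max_features=2000)
        tfidf = vectorizer.fit_transform(group[text_col].tolist())
        terms = vectorizer.get_feature_names_out()
        # Rank terms by mean TF-IDF weight within the category.
        scores = tfidf.mean(axis=0).A1
        ranked = sorted(zip(terms, scores), key=lambda x: -x[1])
        keywords = []
        for term, _ in ranked:
            # Keep only terms that appear in a reasonable share of the category's examples.
            coverage = group[text_col].str.contains(term, case=False, regex=False).mean()
            if coverage >= min_coverage and term not in keywords:
                keywords.append(term)
            if len(keywords) == max_tokens:
                break
        rationales[category] = keywords
    return rationales
```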

2. Two-head model training – fine-tune microsoft/deberta-v3-xsmall with two heads.

Notebook: 2-train-main-classifier.ipynb

Here we trained a two-head transformer model for multi-label classification of modern slavery statements.

  • Backbone: DeBERTa, fine-tuned (2 unfrozen layers + classifier).
  • Head 1 – Categories: predicts the sentence category.
  • Head 2 – Rationales: highlights words supporting the predictions.

During training, the model optimizes a combined loss:

$$ \text{Loss} = \text{Loss}_{categories} + \alpha \cdot \text{Loss}_{rationales} $$

where α controls the trade-off between accurate category predictions and meaningful rationale extraction.

To ensure computational feasibility, the training dataset was downsampled to 100,000 sentences using stratified sampling across all categories. The model was trained for only 3 epochs due to resource limitations, but it showed promising results with steadily improving training performance.
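A simplified sketch of this two-head setup and its combined loss is shown below. The head shapes, pooling choice, α value, and the 11-label count (taken from the compliance labels mentioned in Section 3.2) are illustrative assumptions, and the partial layer freezing used in the notebook is omitted for brevity.

```python
# Sketch: two-head DeBERTa model with a combined category + rationale loss.
import torch
import torch.nn as nn
from transformers import AutoModel

class TwoHeadClassifier(nn.Module):
    def __init__(self, backbone="microsoft/deberta-v3-xsmall",
                 num_categories=11, alpha=0.5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(backbone)
        hidden = self.encoder.config.hidden_size
        self.category_head = nn.Linear(hidden, num_categories)   # Head 1: sentence categories
        self.rationale_head = nn.Linear(hidden, 1)                # Head 2: token-level rationales
        self.alpha = alpha
        self.cat_loss = nn.BCEWithLogitsLoss()
        self.tok_loss = nn.BCEWithLogitsLoss(reduction="none")

    def forward(self, input_ids, attention_mask,
                category_labels=None, rationale_labels=None):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        category_logits = self.category_head(states[:, 0])        # [CLS]-style pooling
        rationale_logits = self.rationale_head(states).squeeze(-1)

        loss = None
        if category_labels is not None and rationale_labels is not None:
            loss_cat = self.cat_loss(category_logits, category_labels.float())
            # Mask out padding tokens before averaging the rationale loss.
            mask = attention_mask.float()
            tok = self.tok_loss(rationale_logits, rationale_labels.float())
            loss_rat = (tok * mask).sum() / mask.sum()
            loss = loss_cat + self.alpha * loss_rat                # Loss = Loss_cat + α · Loss_rat
        return {"loss": loss,
                "category_logits": category_logits,
                "rationale_logits": rationale_logits}
```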

Results & Examples

The notebook below runs the trained model on test data and shows its outputs.

Notebook: 3-classifier-model-inference.ipynb

Below you can see examples where the model not only predicts categories but also provides highlighted rationales directly in the text.

(Figure: model-output, showing predicted categories with highlighted rationale tokens)

This approach makes predictions more interpretable by:

  • Highlighting the exact words in a sentence that support the decision,
  • Providing human-readable rationales aligned with categories,
  • Improving trust and transparency of the model.
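As a hedged illustration, inference with rationale highlighting could look like the sketch below. It reuses the hypothetical `TwoHeadClassifier` from the training sketch; the thresholds and the bold-token rendering are assumptions, not the notebook's exact output format.

```python
# Sketch: classify one sentence and mark the tokens the rationale head selects.
import torch
from transformers import AutoTokenizer

def predict_with_rationales(model, tokenizer, sentence, category_names,
                            cat_threshold=0.5, rat_threshold=0.5):
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(inputs["input_ids"], inputs["attention_mask"])

    cat_probs = torch.sigmoid(out["category_logits"])[0]
    rat_probs = torch.sigmoid(out["rationale_logits"])[0]

    predicted = [name for name, p in zip(category_names, cat_probs) if p >= cat_threshold]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    highlighted = " ".join(f"**{t}**" if p >= rat_threshold else t
                           for t, p in zip(tokens, rat_probs))
    return predicted, highlighted

# Example usage (model weights and category names are placeholders):
# tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-xsmall")
# labels, text = predict_with_rationales(model, tokenizer,
#                                        "We updated our supplier code of conduct.", CATEGORIES)
```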

3.2 Confidence analysis pipeline

Notebook : 4-explore-confidence.ipynb

Idea & Architecture

The confidence pipeline is designed to provide reviewers with a measure of how certain the model is about its predictions.

Instead of returning only a binary decision (e.g., “Policies covered” vs. “not covered”), the model also outputs a confidence score between 0 and 1.

These scores allow reviewers to distinguish between high-certainty outputs (trustworthy, can be automated) and low-certainty outputs (flagged for manual review).

Pipeline Steps

  • Prediction: The classifier (based on DistilBERT/AimsDistilModel) outputs raw logits for 11 compliance labels.
  • Probability Conversion: Logits are transformed into probabilities using sigmoid/softmax functions.
  • Calibration: Confidence scores are calibrated (e.g., Platt scaling or isotonic regression) to improve reliability.
  • Aggregation: Scores are aggregated at sentence, section, and document levels.
  • Export: Results are output in JSON format, including both the prediction and the associated confidence score.
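A compact sketch of these steps is given below. The held-out calibration data, the choice of isotonic regression (one of the two calibration options named above), and the JSON layout are illustrative assumptions.

```python
# Sketch: convert logits to calibrated confidence scores and export them as JSON.
import json
import numpy as np
from sklearn.isotonic import IsotonicRegression

def calibrate_and_export(logits, val_logits, val_labels, label_names,
                         out_path="confidence.json"):
    """logits: (n_sentences, n_labels) raw model outputs for one document."""
    probs = 1.0 / (1.0 + np.exp(-logits))          # sigmoid per label
    val_probs = 1.0 / (1.0 + np.exp(-val_logits))

    results = []
    for j, name in enumerate(label_names):
        # Calibrate each label on held-out validation predictions.
        calibrator = IsotonicRegression(out_of_bounds="clip")
        calibrator.fit(val_probs[:, j], val_labels[:, j])
        calibrated = calibrator.predict(probs[:, j])
        results.append({
            "label": name,
            "document_confidence": float(calibrated.mean()),   # simple document-level aggregation
            "sentence_confidence": calibrated.round(3).tolist(),
        })

    with open(out_path, "w") as f:
        json.dump(results, f, indent=2)
    return results
```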

Results & Examples

Distribution Example: Confidence scores across sections (e.g., Remediation) show how certain the model is about missing vs. covered reporting elements.

(Figure: probability distribution across sections)

Sentence-level Example:

(Figure: sentence-level confidence scores)

In practice:

  • High-confidence (≥0.8) → reviewers can trust the output.
  • Low-confidence (<0.5) → flagged for manual review.
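A tiny sketch of that triage rule is shown below; the middle 0.5–0.8 band is not specified in this README, so routing it to a spot-check queue is an assumption.

```python
# Sketch: route predictions by calibrated confidence (thresholds from the bullets above).
def triage(confidence: float) -> str:
    if confidence >= 0.8:
        return "auto-accept"     # high confidence: reviewers can trust the output
    if confidence < 0.5:
        return "manual-review"   # low confidence: flagged for a human reviewer
    return "spot-check"          # middle band: assumed secondary queue, not specified in the README
```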

This pipeline improves trust and transparency, helping reviewers prioritize their time, reduce errors, and scale assessments without compromising reliability.

Conclusion

This project introduces two complementary pipelines:

  • Rationale-based pipeline – improves explainability and aligns predictions with human logic.
  • Confidence analysis pipeline – provides transparency on model reliability.

Together, these approaches form a foundation for scaling automated grading of modern slavery statements to a larger set of organisations while maintaining trust.

4. Pitch

Youtube pitch video: https://youtu.be/KoE29AXD-js

Presentation: Presentation-CODE4FREEDOM-AUS.pdf

5. Datasets

Location: /datasets

6. Project Code

Location: /project-code

7. Additional docs (Optional)

Location: /docs

Presentation-CODE4FREEDOM-AUS.pdf

8. Declaration of Intellectual Property

This project builds on the open research of Project AIMS (AI against Modern Slavery) by Mila and QUT.
GitHub repository: ai4h_aims-au.

Disclaimers

Computational Resources & Comparative Results

  • Describe here the resources used in developing your solution (e.g. GPUs, etc).

No Claims About Companies

This repository and its accompanying models, datasets, metrics, dashboards, and comparative analyses are provided strictly for research and demonstration purposes.

Any comparisons, rankings, or assessments of companies or organizations are exploratory in nature. They may be affected by incomplete data, modeling limitations, or methodological choices. These results must not be used to make factual, legal, or reputational claims about any entity without independent expert review and validation.

Do not use this repository’s contents to make public statements or claims about specific companies, organizations, or individuals.

Terms and Conditions

By submitting this solution to the AIMS Hackathon, our team acknowledges and agrees to abide by the Event’s Terms and Conditions.
