Skip to content
/ TFG Public

This repository contains the artifacts produced for my Bachelor's Degree Thesis.

License

Notifications You must be signed in to change notification settings

jaestevan/TFG

Repository files navigation

Benchmarking Large Language Models toward reasoning fairness and unanticipated bias

This repository contains the artifacts used towards my Bachelor's Degree Thesis, presented June 16th, 2025.

TL;DR: Check this file run-huggingface-bbqdataset-qascorer.ipynb (If you want more details, keep reading).

Main Publication

Published under CC BY-NC-ND at UOC's public repository O2 here:

Estevan Estevan, José Antonio. 2025. "Benchmarking Large Language Models toward reasoning fairness and unanticipated bias". Universitat Oberta de Catalunya (UOC). https://hdl.handle.net/10609/153037

Main components

Python Risk Identification Tool for generative AI (PyRIT)

A hand-built Bias Benchmark for Question Answering (BBQ)

Plain Python files

  • bbq_dataset.py · PyRIT valid dataset class representing BBQ data. This class imports the dataset from the original BBQ data files.

  • pyrit_tuning.py · Inherited PyRIT class to modify how PyRIT QuestionAnswerScorer generates question prompts.

Notebooks

Data

Models

These are the models used for this experiment:

Model Developer Size (params) Training data (tokens) Release date License
Phi 3 mini 4k instruct Microsoft 3.8B 3.3T Dec. 2024 MIT
GPT-4o mini Open AI ~8B N/A July 2024 N/A
Gemma 3 4b it Google 675M 4T Mar. 2025 Gemma
SmolLM 360M instruct Hugging Face 360M 600B July 2024 Apache

About

This repository contains the artifacts produced for my Bachelor's Degree Thesis.

Topics

Resources

License

Stars

Watchers

Forks