
Project CLAIRE: A Benchmark Study on the Trade-off Between Factual Retention and Linguistic Coherence in Continual Learning

This repository contains the code, methodology, and results of Project CLAIRE, an independent research initiative that empirically demonstrates a fundamental, previously under-documented trade-off between factual retention and linguistic coherence in continually trained Large Language Models.


The Core Finding: A Surprising Victor

Our central discovery is that for long-term, multi-task continual learning, structural preservation methods (PEFT/LoRA) are significantly more effective at memory retention than even the most sophisticated active rehearsal strategies.

In a grueling five-task "gauntlet" benchmark, the PEFT-Regularization Only method—a strategy with no memory replay whatsoever—emerged as the undisputed champion of memory retention.
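
For context, the sketch below shows what a "structural preservation" setup of this kind typically looks like with the Hugging Face peft library: the base model weights stay frozen and only small low-rank adapter matrices are trained on each new task. The hyperparameters and target modules here are illustrative assumptions, not the exact configuration used in this study.

```python
# Minimal sketch of structural preservation via LoRA adapters.
# The base Llama-3.1-8B-Instruct weights stay frozen; only the small
# low-rank adapter matrices are updated on each new task.
# Hyperparameters below are illustrative, not this study's exact settings.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

lora_config = LoraConfig(
    r=16,                                 # rank of the adapter matrices
    lora_alpha=32,                        # scaling factor for adapter updates
    target_modules=["q_proj", "v_proj"],  # attach adapters to attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```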

Calibrated Benchmark Results

Key Quantitative Insights

Our calibrated, multi-stage benchmark produced several non-obvious, quantitative results:

  • Structural Preservation Dominates: PEFT-Regularization Only was the most effective method, retaining 23.70% of its factual knowledge after being trained on four subsequent new tasks. This significantly outperformed all replay-based full fine-tuning methods.

  • Random Replay is Actively Harmful: Standard Random Replay was the worst-performing strategy, retaining only 15.00% of facts and causing a catastrophic ~25% drop in linguistic coherence relative to the other methods, indicating that naive rehearsal can be more damaging than no rehearsal at all.

  • Intelligent Curation is Effective but Insufficient: Our novel Interference-Aware Replay (CLAIRE), which intelligently selects at-risk memories, systematically outperformed Random Replay (19.20% vs. 15.00%). However, it still trailed the PEFT-based methods, indicating that even a superior replay strategy cannot, by itself, overcome the fundamental instability of full fine-tuning.

  • The Coherence Collapse: All methods involving full fine-tuning suffered severe degradation of linguistic coherence over the full sequence. Only the PEFT-based methods (PEFT-Only and CLAIRE+) maintained a high degree of linguistic stability, indicating that the PEFT architecture is the primary guardian of a model's ability to generate sane, structured language.

Methodology in Brief

  • Model: meta-llama/Llama-3.1-8B-Instruct
  • Benchmark: A five-task sequential learning "gauntlet" using a controlled, synthetic corpus of 500 unique facts.
  • Novel Components:
    1. Interference-Aware Teacher: An AI curator that uses semantic embeddings to identify and prioritize the most "at-risk" memories for rehearsal (see the sketch after this list).
    2. LLM-as-a-Judge Evaluator: A calibrated AI judge used to score each model's memory of all past tasks on both Factual Correctness (semantic equivalence) and Linguistic Coherence.
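
To make the Interference-Aware Teacher concrete, the sketch below shows one way to rank past facts by interference risk using sentence-embedding similarity: facts semantically closest to the incoming task's training data are treated as most likely to be overwritten and are prioritized for rehearsal. The embedding model, scoring rule, and replay budget are illustrative assumptions, not the study's exact implementation.

```python
# Sketch of interference-aware replay selection (illustrative assumptions,
# not this repository's exact implementation).
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def select_at_risk_facts(past_facts, new_task_examples, replay_budget=50):
    """Return the past facts most semantically similar to the new task's data."""
    past_emb = embedder.encode(past_facts, normalize_embeddings=True)
    new_emb = embedder.encode(new_task_examples, normalize_embeddings=True)

    # Cosine similarity (embeddings are L2-normalized, so a dot product suffices).
    similarity = past_emb @ new_emb.T          # shape: (n_past, n_new)
    risk_score = similarity.max(axis=1)        # worst-case interference per fact

    # Highest-risk facts first, trimmed to the rehearsal budget.
    top_idx = np.argsort(risk_score)[::-1][:replay_budget]
    return [past_facts[i] for i in top_idx]

# Example: rehearse the 50 past facts most at risk from the upcoming task.
# replay_batch = select_at_risk_facts(all_past_facts, next_task_training_texts)
```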

Conclusion: A New Theory of the Case

We propose a new framework for understanding continual learning: it is a fight against two distinct failure modes, Factual Forgetting and Coherence Collapse.

Our results show that PEFT/LoRA is the most powerful tool for preventing Coherence Collapse and serves as the most robust baseline against Factual Forgetting. Active rehearsal strategies like our novel CLAIRE method, while demonstrably better than naive approaches, should be viewed as a secondary tool.

Future research should not focus on replay as a primary solution, but rather on finding the optimal, synergistic balance between the structural armor of PEFT and the targeted, active reminders of an intelligent rehearsal system.

How to Reproduce

  1. Clone this repository: git clone https://github.com/Yash3561/Project_CLAIRE.git
  2. Create a Python virtual environment and run pip install -r requirements.txt.
  3. Create a .env file and add your HF_TOKEN='your_token_here'.
  4. For running on an HPC with a Slurm scheduler, copy the provided template: cp run_gauntlet.slurm.template run_gauntlet.slurm
  5. Edit run_gauntlet.slurm and replace the placeholder values (YOUR_ACCOUNT_NAME, etc.) with your specific HPC configuration.
  6. Submit the job to the scheduler: sbatch run_gauntlet.slurm.
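
The same steps, consolidated as a shell session. The virtual-environment commands are one common way to satisfy step 2; replace the token and Slurm placeholders with your own values.

```bash
git clone https://github.com/Yash3561/Project_CLAIRE.git
cd Project_CLAIRE

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

echo "HF_TOKEN='your_token_here'" > .env   # Hugging Face access token

cp run_gauntlet.slurm.template run_gauntlet.slurm
# Edit run_gauntlet.slurm: replace YOUR_ACCOUNT_NAME and the other placeholders
sbatch run_gauntlet.slurm
```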
