In the rapidly evolving landscape of artificial intelligence, large language models have become indispensable tools, demonstrating remarkable proficiency across a wide array of tasks. However, their ability to memorise training data poses significant risks, particularly concerning the exposure of private or copyrighted information. While most current defences focus on mitigating memorisation during the pre-training phase, the phenomenon during fine-tuning, especially for domain adaptation and instruction tuning, remains poorly understood.
A recent study by Dean L. Slack and Noura Al Moubayed sheds light on this critical issue. The researchers fine-tuned models from the Pythia, Llama3, and Mistral families, ranging from 1.4 billion to 70 billion parameters, on common evaluation datasets. Their findings reveal that memorisation increases dramatically within the first few epochs of training, often well before validation perplexity or evaluation performance is optimised. This early surge in memorisation underscores the need for proactive measures to address the issue during the fine-tuning stage.
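To make the notion of memorisation concrete, the sketch below shows one common way to probe whether a fine-tuned model reproduces its training continuations verbatim when prompted with their prefixes. The 32-token prefix/suffix split and greedy decoding are illustrative assumptions, not necessarily the measurement protocol used in the study.

```python
# Illustrative sketch: probing verbatim memorisation of fine-tuning data.
# The prefix/suffix lengths and greedy decoding are assumptions made for
# illustration, not necessarily the protocol used in the study.
import torch

def verbatim_memorisation_rate(model, tokenizer, train_texts,
                               prefix_tokens=32, suffix_tokens=32):
    """Fraction of training examples whose continuation the model reproduces
    exactly when prompted with the preceding prefix (greedy decoding)."""
    model.eval()
    hits, evaluated = 0, 0
    for text in train_texts:
        ids = tokenizer(text, return_tensors="pt").input_ids[0]
        if ids.size(0) < prefix_tokens + suffix_tokens:
            continue  # skip examples too short to split into prefix + suffix
        prefix = ids[:prefix_tokens].unsqueeze(0).to(model.device)
        target = ids[prefix_tokens:prefix_tokens + suffix_tokens]
        with torch.no_grad():
            out = model.generate(prefix, max_new_tokens=suffix_tokens,
                                 do_sample=False)
        # generate() returns the prompt followed by the new tokens
        continuation = out[0, prefix_tokens:prefix_tokens + suffix_tokens].cpu()
        hits += int(torch.equal(continuation, target))
        evaluated += 1
    return hits / max(1, evaluated)
```

Tracking this rate after each epoch, alongside validation perplexity, is one way to observe the early surge in memorisation the authors describe.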
The study introduces a simple yet effective n-gram memorisation score whose rise reliably precedes verbatim memorisation, making it an early-warning signal. Using this score as an early-stopping criterion, the researchers achieved a significant reduction in memorisation with minimal impact on downstream performance, offering a practical and scalable way to limit memorisation risk during fine-tuning.
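The paper's exact score definition is not reproduced here, but a minimal sketch of the idea, computing n-gram overlap between model generations and their training references and stopping training once it crosses a threshold, might look like the following. The n-gram size, threshold, and patience values are illustrative assumptions.

```python
# Minimal sketch of an n-gram memorisation score and an early-stopping rule
# built on it. The n-gram size, threshold, and patience are assumptions; the
# study's exact score definition may differ.
def ngram_overlap(generated_tokens, reference_tokens, n=4):
    """Fraction of n-grams in the generated continuation that also occur
    in the reference (training) continuation."""
    def ngrams(seq):
        return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}
    gen, ref = ngrams(generated_tokens), ngrams(reference_tokens)
    return len(gen & ref) / max(1, len(gen))

class MemorisationEarlyStopper:
    """Stop fine-tuning once the mean n-gram memorisation score over a
    monitored sample of training examples exceeds a threshold."""
    def __init__(self, threshold=0.5, patience=1):
        self.threshold = threshold
        self.patience = patience
        self.violations = 0

    def should_stop(self, mean_ngram_score):
        # count consecutive epochs above the threshold
        self.violations = self.violations + 1 if mean_ngram_score > self.threshold else 0
        return self.violations >= self.patience
```

Because the n-gram score rises before full verbatim reproduction appears, halting on this signal stops training earlier than a verbatim check would, which is what gives the approach its headroom.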
Furthermore, the researchers developed an n-gram-aware loss regulariser that reduces memorisation by up to 40% across all tested model families, while incurring smaller evaluation-performance trade-offs than existing mitigation strategies. Together, the findings clarify how memorisation develops during language model fine-tuning and offer actionable strategies to enhance data security and privacy.
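The authors' regulariser is not specified in this summary, so the following is only one plausible instantiation under stated assumptions, not the paper's formulation: down-weight the token-level loss at positions the model already reproduces as part of a matching n-gram, so further training does not push those spans towards verbatim recall. The names `n` and `memorised_weight` are hypothetical parameters.

```python
# Hedged sketch of an n-gram-aware loss regulariser: token positions whose
# surrounding n-gram is already reproduced by the model's greedy predictions
# are down-weighted. One plausible instantiation, not the authors' exact method.
import torch
import torch.nn.functional as F

def ngram_aware_loss(logits, labels, n=4, memorised_weight=0.1, ignore_index=-100):
    """
    logits: (batch, seq_len, vocab) model outputs
    labels: (batch, seq_len) target token ids, already aligned with logits
    """
    per_token = F.cross_entropy(
        logits.transpose(1, 2), labels, ignore_index=ignore_index, reduction="none"
    )  # (batch, seq_len)

    with torch.no_grad():
        greedy = logits.argmax(dim=-1)                       # (batch, seq_len)
        match = (greedy == labels) & (labels != ignore_index)
        # Run length of consecutive greedy matches ending at each position;
        # a position counts as "memorised" if that run reaches n tokens.
        runs = torch.zeros_like(match, dtype=torch.float)
        run = torch.zeros(match.size(0), device=match.device)
        for t in range(match.size(1)):
            run = (run + 1) * match[:, t].float()
            runs[:, t] = run
        memorised = runs >= n

    weights = torch.where(memorised,
                          torch.full_like(per_token, memorised_weight),
                          torch.ones_like(per_token))
    valid = (labels != ignore_index).float()
    return (per_token * weights * valid).sum() / valid.sum().clamp(min=1)
```

The design intent, under these assumptions, is that ordinary learning on unmemorised tokens is untouched, while the gradient signal on already-memorised spans is attenuated rather than removed outright.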
The implications of this research are far-reaching for the defence and security sectors, where the integrity and confidentiality of data are paramount. By integrating these findings into their practices, organisations can better protect sensitive information while leveraging the advanced capabilities of large language models. The study not only advances our understanding of memorisation in AI but also equips practitioners with the tools needed to navigate this complex challenge effectively. Read the original research paper here.

