LLM Output Drift: Cross-Provider Validation & Mitigation for Financial Workflows
Published 10 Nov 2025 · arXiv
Key Points
- Smaller LLMs (Granite-3-8B, Qwen2.5-7B) achieve 100% output consistency at temperature T=0.0
- Larger models (GPT-OSS-120B) exhibit significant output drift, which undermines auditability
- Study covers 5 model architectures (7B-120B parameters) on regulated financial tasks
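The consistency figures above can be illustrated with a minimal sketch. It assumes consistency is measured as the share of repeated runs that return the most common output for the same prompt; the paper's exact metric is not given in this summary, so that definition is an assumption. The `generate` callable is a hypothetical stand-in for any model client configured at T=0.0.

```python
from collections import Counter
from typing import Callable, Sequence


def normalize(text: str) -> str:
    """Collapse whitespace so trivial formatting differences do not count as drift."""
    return " ".join(text.split())


def consistency_rate(outputs: Sequence[str]) -> float:
    """Fraction of runs whose normalized output equals the most common output.

    Assumed definition for illustration; not taken from the paper.
    """
    if not outputs:
        raise ValueError("need at least one output")
    counts = Counter(normalize(o) for o in outputs)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(outputs)


def measure_drift(generate: Callable[[str], str], prompt: str, runs: int = 16) -> float:
    """Call a hypothetical model client `runs` times and return 1 - consistency."""
    outputs = [generate(prompt) for _ in range(runs)]
    return 1.0 - consistency_rate(outputs)
```

Under this definition a model that returns the same text on every run scores a consistency of 1.0 (drift of 0.0), while any divergence between runs lowers the score.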
Implications
Output drift in large models poses compliance risks for reconciliations, regulatory reporting, and client communications requiring consistent results.
Action Required
Financial institutions should prioritise smaller, more consistent models for regulated workflows requiring audit trails.