Heavy-tailed Information-Theoretic Generalization Bounds with Applications to LLM Safety Alignment
Speaker: 张慧铭 (北京航空航天大学 / Beihang University)
Time: January 11, 2026, 16:30
Venue: 海韵园实验楼 S106
Abstract:
Classical information-theoretic generalization bounds, which link generalization error to the mutual information between an algorithm's input and output, typically rely on sub-Gaussian assumptions or finite moment generating functions (MGFs). These assumptions, however, are often violated in heavy-tailed scenarios such as adversarial training, reinforcement learning with rare high-reward events, and financial modeling. In this work, we bridge this gap by establishing a comprehensive framework for generalization under heavy-tailed sub-Weibull regimes. We show that standard KL-divergence bounds become vacuous in these settings because of the unboundedness of extreme events. To overcome this, we introduce a novel decorrelation lemma based on Rényi divergence and a generalized Young-type inequality, which circumvents the need for MGFs. Combining these tools with a refined chaining technique on the space of measures, we derive Dudley-type generalization bounds that depend explicitly on the tail parameter and the Rényi information. We additionally establish new maximal inequalities and information-theoretic generalization bounds under sub-Weibull conditions on the loss. The talk also explores applications of these results to large language models (LLMs), providing tail-adaptive reward guarantees for Reinforcement Learning from Human Feedback (RLHF) in LLM alignment, mitigating catastrophic Goodhart effects where KL regularization fails.
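For readers new to the setting, the following is a brief background sketch using standard definitions; it is illustrative context, not material quoted from the talk. A random variable $X$ is sub-Weibull with tail parameter $\theta > 0$ if, for some scale constant $K > 0$,

```latex
\[
  \Pr\bigl(|X| \ge t\bigr) \;\le\; 2\exp\!\bigl(-(t/K)^{1/\theta}\bigr),
  \qquad t \ge 0.
\]
% \theta = 1/2 recovers the sub-Gaussian case and \theta = 1 the
% sub-exponential case; for \theta > 1 the tails are heavier than
% exponential, so the MGF  E[e^{\lambda X}] = \infty  for every
% \lambda > 0 and MGF-based (Chernoff / Donsker--Varadhan) arguments
% break down.  The R\'enyi divergence of order \alpha > 1,
\[
  D_\alpha(P \,\|\, Q)
  \;=\; \frac{1}{\alpha - 1}
        \log \mathbb{E}_Q\!\left[\Bigl(\tfrac{dP}{dQ}\Bigr)^{\alpha}\right],
\]
% which recovers the KL divergence as \alpha \downarrow 1, is the
% information measure that replaces mutual information in the
% MGF-free bounds described in the abstract.
```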
Speaker Bio:
张慧铭 is an Associate Professor (pre-tenure) and master's supervisor at the Institute of Artificial Intelligence, Beihang University, and an adjunct doctoral supervisor at the School of Mathematical Sciences, Beihang University. He was a Macao Fellow (濠江学者) postdoctoral researcher at the University of Macau (2020-2022) and received his Ph.D. in Statistics from Peking University (2016-2020). His research interests include robust machine learning, statistical theory for AI (generalization error, non-asymptotic/small-sample theory), high-dimensional probability and statistics, functional data, subsampling estimation, and Lévy processes. He has published 30 SCI papers, including in top AI and automation journals (JMLR, IEEE-TAC), top statistics journals (JASA, Biometrika), the leading actuarial journal IME, and the Nature portfolio journal Scientific Reports, with over 900 Google Scholar citations. He has served as a reviewer for Mathematical Reviews and as a referee for leading journals in probability, statistics, AI, and machine learning (AOS, AOAP, JASA, JMLR, IEEE-TSP).
Contact: 陈俊彤
