ICLR '26
Denoising Neural Reranker for Recommender Systems
Wenyu Mao, Shuchang Liu, Hailan Yang, Xiaobei Wang, Xiaoyu Yang, Xu Gao, Xiang Li, Lantao Hu, Han Li, Kun Gai, An Zhang, Xiang Wang
Neural Reranking
Sequential RecSys
LLM for RecSys
ICLR '26
Quantile Advantage Estimation: Stabilizing RLVR for LLM Reasoning
Junkang Wu, Kexin Huang, Jiancan Wu, An Zhang, Xiang Wang, Xiangnan He
LLM Reasoning
LLM Alignment
ICLR '26
Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents
Yaorui Shi, Yuxin Chen, Siyuan Wang, Sihang Li, Hengxing Cai, Qi Gu, Xiang Wang, An Zhang
Agentic Systems
Long-context Agents
LLM Reasoning
ICLR '26
GuardAlign: Test-time Safety Alignment in Multimodal Large Language Models
Xingyu Zhu, Beier Zhu, Junfeng Fang, Shuo Wang, Yin Zhang, Xiang Wang, Xiangnan He
Safety Alignment
Multimodal Safety
ICLR '26
AlphaAlign: Incentivizing Safety Alignment with Extremely Simplified Reinforcement Learning
Yi Zhang, An Zhang, XiuYu Zhang, Leheng Sheng, Yuxin Chen, Zhenkai Liang, Xiang Wang
Safety Alignment
LLM Alignment
ICLR '26
AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint
Leheng Sheng, Changshuo Shen, Weixiang Zhao, Junfeng Fang, Xiaohao Liu, Zhenkai Liang, Xiang Wang, An Zhang, Tat-Seng Chua
Safety Alignment
Backdoor Defense
ICLR '26
Beyond Magnitude: Leveraging Direction of RLVR Updates for LLM Reasoning
Kexin Huang, Haoming Meng, Junkang Wu, Jinda Lu, Chiyu Ma, Ziqian Chen, Xue Wang, Bolin Ding, Jiancan Wu, Xiang Wang, Xiangnan He, Guoyin Wang, Jingren Zhou
LLM Reasoning
LLM Alignment
ICLR '26
DualEdit: Mitigating Safety Fallback in LLM Backdoor Editing via Affirmation-Refusal Regulation
Houcheng Jiang, Zetong Zhao, Junfeng Fang, Haokai Ma, Ruipeng Wang, Xiang Wang, Xiangnan He, Yang Deng
Backdoor Defense
Knowledge Editing