Research
I have a general interest in deep learning, natural language processing, and robust machine
learning. Recently, my research has focused on large language models (LLMs), LLM reasoning, LLM-based
agents, and LLM alignment.
|
Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision
Zhiheng Xi, Dingwen Yang, Jixuan Huang, Jiafu Tang, Guanyu Li, Yiwen Ding, Wei He, Boyang Hong, Shihan Dou, Wenyu Zhan, Xiao Wang, Rui Zheng, Tao Ji, Xiaowei Shi, Yitao Zhai, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Zuxuan Wu, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Yu-Gang Jiang
Preprint. Nov, 2024
project page / codes / paper / dataset
|
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling
Yiwen Ding*, Zhiheng Xi* (Co-first Author), Wei He, Zhuoyuan Li, Yitao Zhai, Xiaowei Shi, Xunliang Cai, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Nov, 2024
codes / paper
|
Distill Visual Chart Reasoning Ability from LLMs to MLLMs
Wei He*, Zhiheng Xi* (Co-first Author), Wanxu Zhao*, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Oct, 2024
codes / paper / dataset
|
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
Preprint. June, 2024
project page / codes and platform / paper / dataset / benchmark / model
|
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
Enyu Zhou*, Guodong Zheng*, Binghai Wang*, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Oct, 2024
codes / paper
|
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang
ICML 2024; CIPS-LMG 2024 Outstanding Poster
codes / paper
|
The Rise and Potential of Large Language Model Based Agents: A Survey
Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Qin Liu, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui
SCIENCE CHINA Information Sciences (SCIS), Cover Paper of SCIS Volume 68, Number 2, February 2025.
project page / paper
|
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Jia Liu, Tao Gui, Qi Zhang, Xuanjing Huang
EMNLP 2023 Findings.
codes / paper
|
Connectivity Patterns are Task Embeddings
Zhiheng Xi, Rui Zheng, Yuansen Zhang, Xuanjing Huang, Zhongyu Wei, Minlong Peng, Mingming Sun, Qi Zhang, Tao Gui
ACL 2023 Findings.
codes / paper
|
Efficient Adversarial Training with Robust Early-Bird Tickets
Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
EMNLP 2022.
codes / paper
|
Characterizing the Impacts of Instances on Robustness
Rui Zheng*, Zhiheng Xi* (Co-first Author), Qin Liu, Wenbin Lai, Tao Gui, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan, Weifeng Ge
ACL 2023 Findings.
codes / paper
|
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang
ICLR 2024 (Spotlight).
codes / paper
|
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Mar, 2024.
project page / paper
|
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
Preprint. Jan, 2024.
codes / paper
|
Secrets of RLHF in Large Language Models Part I: PPO
Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang
Preprint. July, 2023.
codes / paper
|
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang
COLING 2024.
paper
|
Miscellanea
- I'm passionate about FPS games, including Counter-Strike (CS:GO / CS2) and CrossFire (CF).
- I love watching soccer and am a big fan of Mourinho.
- I also love watching basketball, and my favorite player is Kevin Durant.
|