Zhiheng Xi

I am a third-year PhD student in the Fudan NLP Group at the School of Computer Science, Fudan University, where I work on natural language processing (NLP) and deep learning (DL). I am advised by Prof. Tao Gui, Prof. Qi Zhang, and Prof. Xuanjing Huang. Previously, I received my bachelor's degree from Nanjing University, where I was advised by Prof. Jia Liu. I am currently honored to be interning at Shanghai AI Lab. In 2021, I completed a fantastic internship at Microsoft Azure, advised by Jiaye Wu and Hang Zhang. My research is supported by the CIE-Tencent Doctoral Research Incentive Project (the inaugural Chinese Institute of Electronics and Tencent doctoral student research incentive program, Hunyuan large model track).

Email  /  Google Scholar  /  GitHub  /  Twitter  /  Zhihu


Research

I have a general interest in deep learning, natural language processing, and robust machine learning. Recently, my research has focused on large language models (LLMs), LLM reasoning, LLM-based agents, and LLM alignment.

Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision
Zhiheng Xi, Dingwen Yang, Jixuan Huang, Jiafu Tang, Guanyu Li, Yiwen Ding, Wei He, Boyang Hong, Shihan Dou, Wenyu Zhan, Xiao Wang, Rui Zheng, Tao Ji, Xiaowei Shi, Yitao Zhai, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Zuxuan Wu, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Yu-Gang Jiang
Preprint. Nov, 2024
project page / codes / paper / dataset
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling
Yiwen Ding*, Zhiheng Xi* (Co-first Author), Wei He, Zhuoyuan Li, Yitao Zhai, Xiaowei Shi, Xunliang Cai, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Nov, 2024
codes / paper
Distill Visual Chart Reasoning Ability from LLMs to MLLMs
Wei He*, Zhiheng Xi* (Co-first Author), Wanxu Zhao*, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Oct, 2024
codes / paper / dataset
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
Preprint. June, 2024
project page / codes and platform / paper / dataset / benchmark / model
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
Enyu Zhou*, Guodong Zheng*, Binghai Wang*, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Oct, 2024
codes / paper
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang
ICML 2024; CIPS-LMG 2024 Outstanding Poster
codes / paper
The Rise and Potential of Large Language Model Based Agents: A Survey
Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Qin Liu, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui
SCIENCE CHINA Information Sciences (SCIS), Cover Paper of SCIS Volume 68, Number 2, February 2025.
project page / paper
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Jia Liu, Tao Gui, Qi Zhang, Xuanjing Huang
EMNLP 2023 Findings.
codes / paper
Connectivity Patterns are Task Embeddings
Zhiheng Xi, Rui Zheng, Yuansen Zhang, Xuanjing Huang, Zhongyu Wei, Minlong Peng, Mingming Sun, Qi Zhang, Tao Gui
ACL 2023 Findings.
codes / paper
Efficient Adversarial Training with Robust Early-Bird Tickets
Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
EMNLP 2022.
codes / paper
Characterizing the Impacts of Instances on Robustness
Rui Zheng*, Zhiheng Xi* (Co-first Author), Qin Liu, Wenbin Lai, Tao Gui, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan, Weifeng Ge
ACL 2023 Findings.
codes / paper
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang
ICLR 2024 (Spotlight).
codes / paper
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang
Preprint. Mar, 2024.
project page / paper
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
Preprint. Jan, 2024.
codes / paper
Secrets of RLHF in Large Language Models Part I: PPO
Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang
Preprint. July, 2023.
codes / paper
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang
COLING 2024.
paper

Miscellanea

  • I'm passionate about FPS games, including Counter-Strike: Global Offensive (CS:GO / CS2) and CrossFire (CF).
  • I love watching soccer and am a big fan of Mourinho.
  • I also love watching basketball games and my favorite player is Kevin Durant.



Last updated: Nov 2024
Thanks to Jon Barron for this fantastic template!