| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| The Rise and Potential of Large Language Model Based Agents: A Survey | Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ... | arXiv preprint arXiv:2309.07864, 2023 | 573 | 2023 |
| Secrets of RLHF in Large Language Models Part I: PPO | R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Q Liu, ... | arXiv preprint arXiv:2307.04964, 2023 | 106* | 2023 |
| Secrets of RLHF in Large Language Models Part II: Reward Modeling | B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ... | arXiv preprint arXiv:2401.06080, 2024 | 65* | 2024 |
| LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | S Dou, E Zhou, Y Liu, S Gao, W Shen, L Xiong, Y Zhou, X Wang, Z Xi, ... | ACL 2024, 1932–1945 | 58* | 2024 |
| Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement | Z Xi, S Jin, Y Zhou, R Zheng, S Gao, T Gui, Q Zhang, X Huang | EMNLP 2023 (Findings), 11383–11406 | 32 | 2023 |
| Robust Lottery Tickets for Pre-trained Language Models | R Zheng, R Bao, Y Zhou, D Liang, S Wang, W Wu, T Gui, Q Zhang, ... | ACL 2022, 2211–2224 | 20 | 2022 |
| Poly-Visual-Expert Vision-Language Models | X Fan, T Ji, S Li, S Jin, S Song, J Wang, B Hong, L Chen, G Zheng, ... | COLM 2024 | 7* | 2024 |
| Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning | Z Xi, W Chen, B Hong, S Jin, R Zheng, W He, Y Ding, S Liu, X Guo, ... | ICML 2024 | 7 | 2024 |
| StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback | S Dou, Y Liu, H Jia, L Xiong, E Zhou, J Shan, C Huang, W Shen, X Fan, ... | ACL 2024, 4571–4585 | 5 | 2024 |
| What's Wrong with Your Code Generated by Large Language Models? An Extensive Study | S Dou, H Jia, S Wu, H Zheng, W Zhou, M Wu, M Chai, J Fan, C Huang, ... | arXiv preprint arXiv:2407.06153, 2024 | 5 | 2024 |
| Improving Generalization of Alignment with Human Preferences through Group Invariant Learning | R Zheng, W Shen, Y Hua, W Lai, S Dou, Y Zhou, Z Xi, X Wang, H Huang, ... | ICLR 2024 | 3 | 2024 |
| Delve into PPO: Implementation Matters for Stable RLHF | R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Y Zhou, ... | NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following | 2 | 2023 |
| Detecting Adversarial Samples through Sharpness of Loss Landscape | R Zheng, S Dou, Y Zhou, Q Liu, T Gui, Q Zhang, Z Wei, XJ Huang, ... | ACL 2023 (Findings), 11282–11298 | 2 | 2023 |
| CC2Vec: Combining Typed Tokens with Contrastive Learning for Effective Code Clone Detection | S Dou, Y Wu, H Jia, Y Zhou, Y Liu, Y Liu | Proceedings of the ACM on Software Engineering 1 (FSE), 1564–1584 | 1 | 2024 |
| Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning | L Chen, R Zheng, B Wang, S Jin, C Huang, J Ye, Z Zhang, Y Zhou, Z Xi, ... | EMNLP 2024, 15270–15283 | | 2024 |
| Reward Modeling Requires Automatic Adjustment Based on Data Quality | B Wang, R Zheng, L Chen, Z Xi, W Shen, Y Zhou, D Yan, T Gui, Q Zhang, ... | EMNLP 2024 (Findings), 4041–4064 | | 2024 |
| ORTicket: Let One Robust BERT Ticket Transfer across Different Tasks | Y Zhou, W Chen, R Zheng, Z Xi, T Gui, Q Zhang, XJ Huang | COLING 2024, 12527–12538 | | 2024 |
| Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals | R Zheng, Y Zhou, Z Xi, T Gui, Q Zhang, X Huang | COLING 2024, 15410–15421 | | 2024 |

\* Citation counts marked with an asterisk may include citations to different articles merged under the same entry.