Towards Principled Disentanglement for Domain Generalization. H Zhang, YF Zhang, W Liu, A Weller, B Schölkopf, EP Xing. CVPR 2022. Cited by 129.
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark. A Pan, CJ Shern, A Zou, N Li, S Basart, T Woodside, J Ng, H Zhang, et al. ICML 2023. Cited by 120.
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models. H Zhang, BL Edelman, D Francati, D Venturi, G Ateniese, B Barak. ICML 2024. Cited by 50.
DataComp-LM: In search of the next generation of training sets for language models. J Li, A Fang, G Smyrnis, M Ivgi, M Jordan, S Gadre, H Bansal, E Guha, et al. arXiv:2406.11794, 2024. Cited by 46*.
Iterative Graph Self-Distillation. H Zhang, S Lin, W Liu, P Zhou, J Tang, X Liang, EP Xing. TKDE 2023. Cited by 43.
Towards Interpretable Natural Language Understanding with Explanations as Latent Variables. W Zhou, J Hu, H Zhang, X Liang, M Sun, C Xiong, J Tang. NeurIPS 2020. Cited by 40.
Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation. YF Zhang, H Zhang, ZC Lipton, LE Li, EP Xing. TMLR 2023. Cited by 37*.
Improved Logical Reasoning of Language Models via Differentiable Symbolic Programming. H Zhang, Z Li, J Huang, M Naik, E Xing. ACL Findings 2023. Cited by 29.
Toward Learning Human-aligned Cross-domain Robust Models by Countering Misaligned Features. H Wang, Z Huang, H Zhang, E Xing. UAI 2022. Cited by 16.
A Study on the Calibration of In-context Learning. H Zhang, YF Zhang, Y Yu, D Madeka, D Foster, E Xing, H Lakkaraju, et al. NAACL 2024. Cited by 12.
Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems. Z Qi, H Zhang, E Xing, S Kakade, H Lakkaraju. arXiv:2402.17840, 2024. Cited by 11.
Eliminating Position Bias of Language Models: A Mechanistic Approach. Z Wang, H Zhang, X Li, KH Huang, C Han, S Ji, SM Kakade, H Peng, H Ji. arXiv:2407.01100, 2024. Cited by 5.
Evaluating Step-by-Step Reasoning through Symbolic Verification. YF Zhang, H Zhang, LE Li, E Xing. NAACL Findings 2024. Cited by 4*.
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training. D Brandfonbrener, H Zhang, A Kirsch, JR Schwarz, S Kakade. arXiv:2406.10670, 2024. Cited by 3.
A Closer Look at the Calibration of Differentially Private Learners. H Zhang, X Li, P Sen, S Roukos, T Hashimoto. arXiv:2210.08248, 2022. Cited by 3.
Stochastic Neural Networks with Infinite Width are Deterministic. L Ziyin, H Zhang, X Meng, Y Lu, E Xing, M Ueda. arXiv:2201.12724, 2022. Cited by 3.
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models. Y Song, H Zhang, C Eisenach, S Kakade, D Foster, U Ghai. arXiv:2412.02674, 2024.
How Does Critical Batch Size Scale in Pre-training? H Zhang, D Morwani, N Vyas, J Wu, D Zou, U Ghai, D Foster, S Kakade. arXiv:2410.21676, 2024.