Youliang Yuan 袁尤良😄
PhD student in Computer Science, The Chinese University of Hong Kong (Shenzhen)
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Y Yuan, W Jiao, W Wang, J Huang, P He*, S Shi, Z Tu
ICLR 2024, 2023
All Languages Matter: On the Multilingual Safety of Large Language Models
W Wang, Z Tu, C Chen, Y Yuan, J Huang, W Jiao, MR Lyu
ACL 2024 Findings, 2023
On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs
J Huang, W Wang, EJ Li, MH Lam, S Ren, Y Yuan, W Jiao, Z Tu, MR Lyu
ICLR 2024 (Oral), 2023
A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models
Y Wan, W Wang, Y Yang, Y Yuan, J Huang, P He, W Jiao, MR Lyu
arXiv preprint arXiv:2401.00757, 2024
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
J Huang, EJ Li, MH Lam, T Liang, W Wang, Y Yuan, W Jiao, X Wang, Z Tu, ...
arXiv preprint arXiv:2403.11807, 2024
New Job, New Gender? Measuring the Social Bias in Image Generation Models
W Wang, H Bai, J Huang, Y Wan, Y Yuan, H Qiu, N Peng, MR Lyu
MM 2024 (Oral), 2024
The Earth Is Flat? Unveiling Factual Errors in Large Language Models
W Wang, J Shi, Z Tu, Y Yuan, J Huang, W Jiao, MR Lyu
arXiv preprint arXiv:2401.00761, 2024
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Y Yuan, W Jiao, W Wang, J Huang, J Xu, T Liang, P He, Z Tu
arXiv preprint arXiv:2407.09121, 2024
Does ChatGPT Know That It Does Not Know? Evaluating the Black-Box Calibration of ChatGPT
Y Yuan, W Wang, Q Guo, Y Xiong, C Shen, P He
COLING 2024 (Oral), 5191-5201, 2024