Akbir Khan
Anthropic; University College London
Verified email at cantab.ac.uk
Title
Cited by
Year
The Goldilocks of Pragmatic Understanding: Fine-tuning Strategy Matters for Implicature Resolution by LLMs
L Ruis, A Khan, S Biderman, S Hooker, T Rocktäschel, E Grefenstette
NeurIPS (Oral), 2024
Cited by 71*, 2024
Debating with More Persuasive LLMs Leads to More Truthful Answers
A Khan, J Hughes, D Valentine, L Ruis, K Sachan, A Radhakrishnan, ...
ICML (Best Paper), 2024
Cited by 67, 2024
JaxMARL: Multi-Agent RL Environments and Algorithms in JAX
A Rutherford, B Ellis, M Gallici, J Cook, A Lupu, G Ingvarsson, T Willi, ...
AAMAS, 2444-2446, 2024
Cited by 49*, 2024
MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
M Samvelyan, A Khan, M Dennis, M Jiang, J Parker-Holder, J Foerster, ...
ICLR, 2023
Cited by 32, 2023
Detecting Anomalous Application Messages in Telecommunication Networks
D Shah, J Hopkins, A Khan, J Lakha
US Patent App. 16/647,166, 2021
Cited by 24, 2021
Considering Racial Bias as a Problem of Transfer Learning
A Khan, M Mahmoud
2019 IEEE Winter Applications of Computer Vision (Oral), 100-106, 2019
Cited by 15*, 2019
Chatarena: Multi-Agent Language Game Environments for Large Language Models
Y Wu, Z Jiang, A Khan, Y Fu, L Ruis, E Grefenstette, T Rocktäschel
GitHub repository, 2023
Cited by 14, 2023
Language Models Learn to Mislead Humans via RLHF
J Wen, R Zhong, A Khan, E Perez, J Steinhardt, M Huang, SR Bowman, ...
arXiv preprint arXiv:2409.12822, 2024
Cited by 12, 2024
Alignment Faking in Large Language Models
R Greenblatt, C Denison, B Wright, F Roger, M MacDiarmid, S Marks, ...
arXiv preprint arXiv:2412.14093, 2024
Cited by 6, 2024
Scaling Opponent Shaping to High Dimensional Games
A Khan, T Willi, N Kwan, A Tacchetti, C Lu, E Grefenstette, T Rocktäschel, ...
AAMAS 23, 2023
Cited by 5, 2023
Balrog: Benchmarking Agentic LLM and VLM Reasoning on Games
D Paglieri, B Cupiał, S Coward, U Piterbarg, M Wolczyk, A Khan, ...
arXiv preprint arXiv:2411.13543, 2024
Cited by 3, 2024
Leading the Pack: N-player Opponent Shaping
A Souly, T Willi, A Khan, R Kirk, C Lu, E Grefenstette, T Rocktäschel
Multi-Agent Security Workshop NeurIPS'23, 2023
Cited by 3, 2023
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats
J Wen, V Hebbar, C Larson, A Bhatt, A Radhakrishnan, M Sharma, ...
arXiv preprint arXiv:2411.17693, 2024
2024
Shell Games: Control Protocols for Adversarial AI Agents
A Bhatt, C Rushing, A Kaufman, V Georgiev, T Tracy, A Khan, B Shlegeris
Melting Pot Contest: Charting the Future of Generalized Cooperative Intelligence
R Trivedi, A Khan, J Clifton, L Hammond, EA Duéñez-Guzmán, ...
The Thirty-eight Conference on Neural Information Processing Systems …
The Concordia Contest: Advancing the Cooperative Intelligence of Language Agents
C Smith, R Trivedi, J Clifton, L Hammond, A Khan, S Vezhnevets, ...
NeurIPS 2024 Competition Track