Iason Gabriel
Staff Research Scientist, DeepMind
Scaling language models: Methods, analysis & insights from training Gopher
JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ...
arXiv preprint arXiv:2112.11446, 2021
Ethical and social risks of harm from language models
L Weidinger, J Mellor, M Rauh, C Griffin, J Uesato, PS Huang, M Cheng, ...
arXiv preprint arXiv:2112.04359, 2021
Artificial intelligence, values, and alignment
I Gabriel
Minds and machines 30 (3), 411-437, 2020
Taxonomy of risks posed by language models
L Weidinger, J Uesato, M Rauh, C Griffin, PS Huang, J Mellor, A Glaese, ...
Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022
Improving alignment of dialogue agents via targeted human judgements
A Glaese, N McAleese, M Trębacz, J Aslanides, V Firoiu, T Ewalds, ...
arXiv preprint arXiv:2209.14375, 2022
Effective altruism and its critics
I Gabriel
Journal of Applied Philosophy 34 (4), 457-473, 2017
Power to the people? Opportunities and challenges for participatory AI
A Birhane, W Isaac, V Prabhakaran, M Diaz, MC Elish, I Gabriel, ...
Proceedings of the 2nd ACM Conference on Equity and Access in Algorithms …, 2022
Alignment of language agents
Z Kenton, T Everitt, L Weidinger, I Gabriel, V Mikulik, G Irving
arXiv preprint arXiv:2103.14659, 2021
Model evaluation for extreme risks
T Shevlane, S Farquhar, B Garfinkel, M Phuong, J Whittlestone, J Leung, ...
arXiv preprint arXiv:2305.15324, 2023
In conversation with artificial intelligence: aligning language models with human values
A Kasirzadeh, I Gabriel
Philosophy & Technology 36 (2), 27, 2023
Toward a theory of justice for artificial intelligence
I Gabriel
Daedalus 151 (2), 218-231, 2022
Sociotechnical safety evaluation of generative AI systems
L Weidinger, M Rauh, N Marchal, A Manzini, LA Hendricks, ...
arXiv preprint arXiv:2310.11986, 2023
The Challenge of Value Alignment
I Gabriel, V Ghazavi
The Oxford Handbook of Digital Ethics, 2022
A human rights-based approach to responsible AI
V Prabhakaran, M Mitchell, T Gebru, I Gabriel
arXiv preprint arXiv:2210.02667, 2022
Characteristics of harmful text: Towards rigorous benchmarking of language models
M Rauh, J Mellor, J Uesato, PS Huang, J Welbl, L Weidinger, S Dathathri, ...
Advances in Neural Information Processing Systems 35, 24720-24739, 2022
Beyond privacy trade-offs with structured transparency
A Trask, E Bluemke, B Garfinkel, CG Cuervas-Mons, A Dafoe
arXiv preprint arXiv:2012.08347, 2020
Using the Veil of Ignorance to align AI systems with principles of justice
L Weidinger, KR McKee, R Everett, S Huang, TO Zhu, MJ Chadwick, ...
Proceedings of the National Academy of Sciences 120 (18), e2213709120, 2023
Permissible secrets
H Lazenby, I Gabriel
The Philosophical Quarterly 68 (271), 265-285, 2018
Effective Altruism, Global Poverty, and Systemic Change
I Gabriel, B McElwee
Effective Altruism, 99-114, 2019
Representation in AI evaluations
AS Bergman, LA Hendricks, M Rauh, B Wu, W Agnew, M Kunesch, I Duan, ...
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023