Ian Osband

Cited by

	All	Since 2019
Citations	7684	6849
h-index	25	25
i10-index	29	28

Citations per year

Year	Citations
2015	27
2016	74
2017	222
2018	467
2019	751
2020	1154
2021	1359
2022	1472
2023	1533
2024	578

Co-authors

Benjamin Van Roy	Stanford University	Verified email at stanford.edu
Zheng Wen	Google DeepMind	Verified email at google.com
Vikranth Dwaracherla	DeepMind	Verified email at google.com
Xiuyuan Lu	Google DeepMind	Verified email at google.com
Daniel Russo	Columbia University	Verified email at gsb.columbia.edu
Morteza Ibrahimi	Stanford University	Verified email at stanford.edu
Brendan O'Donoghue	Stanford University, Google DeepMind	Verified email at alumni.stanford.edu
Mohammad Gheshlaghi Azar	Cohere	Verified email at google.com
Todd Hester	Waymo	Verified email at waymo.com
Bilal Piot	Google Deepmind	Verified email at google.com
Olivier Pietquin	Cohere | ex Google DeepMind (On leave - Professor at University of Lille)	Verified email at univ-lille.fr
Tom Schaul	Senior Staff Scientist, DeepMind	Verified email at nyu.edu
Rémi Munos	DeepMind	Verified email at inria.fr
Alexander Pritzel	Deepmind	Verified email at google.com
Marc Lanctot	Research Scientist, Google DeepMind	Verified email at google.com

Ian Osband

OpenAI

Verified email at openai.com

Reinforcement Learning


Title	Cited by	Year
Deep exploration via bootstrapped DQN; I Osband, C Blundell, A Pritzel, B Van Roy; Advances in neural information processing systems 29, 2016	1405	2016
Deep Q-learning from demonstrations; T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...; Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	1173	2018
A tutorial on Thompson sampling; DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen; Foundations and Trends® in Machine Learning 11 (1), 1-96, 2018	1063	2018
Minimax regret bounds for reinforcement learning; MG Azar, I Osband, R Munos; International conference on machine learning, 263-272, 2017	781	2017
Randomized prior functions for deep reinforcement learning; I Osband, J Aslanides, A Cassirer; Advances in Neural Information Processing Systems 31, 2018	396	2018
Deep Exploration via Randomized Value Functions; I Osband; https://searchworks.stanford.edu/view/11891201, 2016	323	2016
Generalization and exploration via randomized value functions; I Osband, B Van Roy, Z Wen; International Conference on Machine Learning, 2377-2386, 2016	320	2016
Why is posterior sampling better than optimism for reinforcement learning?; I Osband, B Van Roy; International conference on machine learning, 2701-2710, 2017	256	2017
The uncertainty Bellman equation and exploration; B O’Donoghue, I Osband, R Munos, V Mnih; International conference on machine learning, 3836-3845, 2018	209	2018
Model-based reinforcement learning and the eluder dimension; I Osband, B Van Roy; Advances in Neural Information Processing Systems 27, 2014	182	2014
Behaviour suite for reinforcement learning; I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...; arXiv preprint arXiv:1908.03568, 2019	175	2019
Learning from demonstrations for real world reinforcement learning; T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ...; arXiv preprint arXiv:1704.03732, 2017	174	2017
Risk versus Uncertainty in Deep Learning: Bayes, Bootstrap and the Dangers of Dropout; I Osband; http://bayesiandeeplearning.org/papers/BDL_4.pdf	163*
Deep learning for time series modeling; E Busseti, I Osband, S Wong; Technical report, Stanford University, 1-5, 2012	136	2012
Near-optimal reinforcement learning in factored MDPs; I Osband, B Van Roy; Advances in Neural Information Processing Systems 27, 2014	122	2014
On lower bounds for regret in reinforcement learning; I Osband, B Van Roy; arXiv preprint arXiv:1608.02732, 2016	108	2016
Bootstrapped Thompson sampling and deep exploration; I Osband, B Van Roy; arXiv preprint arXiv:1507.00300, 2015	99	2015
(More) efficient reinforcement learning via posterior sampling; I Osband, D Russo, B Van Roy; Advances in Neural Information Processing Systems 26, 2013	94	2013
Meta-learning of sequential strategies; PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...; arXiv preprint arXiv:1905.03030, 2019	82	2019
Epistemic neural networks; I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ...; Advances in Neural Information Processing Systems 36, 2024	79	2024
