Flamingo: a visual language model for few-shot learning JB Alayrac, J Donahue, P Luc, A Miech, I Barr, Y Hasson, K Lenc, ... Advances in neural information processing systems 35, 23716-23736, 2022 | 4100 | 2022 |
Emergent abilities of large language models J Wei, Y Tay, R Bommasani, C Raffel, B Zoph, S Borgeaud, D Yogatama, ... arXiv preprint arXiv:2206.07682, 2022 | 3623* | 2022 |
Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 3543 | 2023 |
Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 2795* | 2022 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ... arXiv preprint arXiv:2403.05530, 2024 | 1388 | 2024 |
Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 1344* | 2021 |
Improving language models by retrieving from trillions of tokens S Borgeaud, A Mensch, J Hoffmann, T Cai, E Rutherford, K Millican, ... arXiv preprint arXiv:2112.04426, 2021 | 1171 | 2021 |
Gemma: Open models based on Gemini research and technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 1132 | 2024 |
Perceiver IO: A general architecture for structured inputs & outputs A Jaegle, S Borgeaud, JB Alayrac, C Doersch, C Ionescu, D Ding, ... arXiv preprint arXiv:2107.14795, 2021 | 665 | 2021 |
Gemma 2: Improving open language models at a practical size G Team, M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, ... arXiv preprint arXiv:2408.00118, 2024 | 560 | 2024 |
Accelerating large language model decoding with speculative sampling C Chen, S Borgeaud, G Irving, JB Lespiau, L Sifre, J Jumper arXiv preprint arXiv:2302.01318, 2023 | 336 | 2023 |
OpenSpiel: A framework for reinforcement learning in games M Lanctot, E Lockhart, JB Lespiau, V Zambaldi, S Upadhyay, J Pérolat, ... arXiv preprint arXiv:1908.09453, 2019 | 332 | 2019 |
Unsupervised learning of object keypoints for perception and control TD Kulkarni, A Gupta, C Ionescu, S Borgeaud, M Reynolds, A Zisserman, ... Advances in neural information processing systems 32, 2019 | 229 | 2019 |
General-purpose, long-context autoregressive modeling with Perceiver AR C Hawthorne, A Jaegle, C Cangea, S Borgeaud, C Nash, M Malinowski, ... International Conference on Machine Learning, 8535-8558, 2022 | 75 | 2022 |
Unified scaling laws for routed language models A Clark, D de Las Casas, A Guy, A Mensch, M Paganini, J Hoffmann, ... International Conference on Machine Learning, 4057-4086, 2022 | 72 | 2022 |
Emergent abilities of large language models J Wei, Y Tay, R Bommasani, C Raffel, B Zoph, S Borgeaud, D Yogatama, ... arXiv preprint arXiv:2206.07682, 2023 | 70 | 2023 |
Gemini: A family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2024 | 57 | 2024 |
Gemini: A family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2024 | 50 | 2024 |
Gemini: A family of highly capable multimodal models R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... CoRR abs/2312.11805, 2023. doi: 10.48550/arXiv.2312.11805 | 48 | |
Spriteworld: A flexible, configurable reinforcement learning environment N Watters, L Matthey, S Borgeaud, R Kabra, A Lerchner | 19 | 2019 |