Antoine Yang
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Just ask: Learning to answer questions from millions of narrated videos
A Yang, A Miech, J Sivic, I Laptev, C Schmid
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by: 230 | Year: 2021
NAS evaluation is frustratingly hard
A Yang, PM Esperança, FM Carlucci
International Conference on Learning Representations, 2020
Cited by: 191 | Year: 2020
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
A Yang, A Miech, J Sivic, I Laptev, C Schmid
Advances in Neural Information Processing Systems 35, 124-141, 2022
Cited by: 125 | Year: 2022
Vid2seq: Large-scale pretraining of a visual language model for dense video captioning
A Yang, A Nagrani, PH Seo, A Miech, J Pont-Tuset, I Laptev, J Sivic, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 79 | Year: 2023
TubeDETR: Spatio-Temporal Video Grounding with Transformers
A Yang, A Miech, J Sivic, I Laptev, C Schmid
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 75 | Year: 2022
MANAS: multi-agent neural architecture search
V Lopes, FM Carlucci, P Esperanca, M Singh, A Yang, V Gabillon, H Xu, ...
Machine Learning, 1-24, 2023
Cited by: 24* | Year: 2023
Learning to Answer Visual Questions from Web Videos
A Yang, A Miech, J Sivic, I Laptev, C Schmid
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022
Cited by: 21 | Year: 2022
Just ask: Learning to answer questions from millions of narrated videos
A Yang, A Miech, J Sivic, I Laptev, C Schmid
2021 IEEE/CVF International Conference on Computer Vision (ICCV), 1666-1677, 2021
Cited by: 9 | Year: 2021
VidChapters-7M: Video Chapters at Scale
A Yang, A Nagrani, I Laptev, J Sivic, C Schmid
Advances in Neural Information Processing Systems 36, 2023
Cited by: 7 | Year: 2023
Covr: Learning composed video retrieval from web video captions
L Ventura, A Yang, C Schmid, G Varol
Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5270-5279, 2024
Cited by: 6 | Year: 2024
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by: 3 | Year: 2024
Learning Visual Language Models for Video Understanding
A Yang
Ecole Normale Superieure de Paris-ENS Paris, 2023
Year: 2023
VidChapters-7M: Video Chapters at Scale Supplementary Material
A Yang, A Nagrani, I Laptev, J Sivic, C Schmid
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models Supplementary Material
A Yang, A Miech, J Sivic, I Laptev, C Schmid
TubeDETR: Spatio-Temporal Video Grounding with Transformers Supplementary Material
A Yang, A Miech, J Sivic, I Laptev, C Schmid
Just Ask: Learning to Answer Questions from Millions of Narrated Videos Supplementary Material
A Yang, A Miech, J Sivic, I Laptev, C Schmid