Open-vocabulary object detection via vision and language knowledge distillation X Gu, TY Lin, W Kuo, Y Cui International Conference on Learning Representations (ICLR), 2022 | 1045 | 2022 |
Scaling open-vocabulary image segmentation with image-level labels G Ghiasi, X Gu, Y Cui, TY Lin European Conference on Computer Vision, 540-557, 2022 | 445 | 2022 |
HPLFlowNet: Hierarchical permutohedral lattice FlowNet for scene flow estimation on large-scale point clouds X Gu, Y Wang, C Wu, YJ Lee, P Wang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 252 | 2019 |
F-VLM: Open-vocabulary object detection upon frozen vision and language models W Kuo, Y Cui, X Gu, AJ Piergiovanni, A Angelova arXiv preprint arXiv:2209.15639, 2022 | 191 | 2022 |
Language Model Beats Diffusion--Tokenizer is Key to Visual Generation L Yu, J Lezama, NB Gundavarapu, L Versari, K Sohn, D Minnen, Y Cheng, ... arXiv preprint arXiv:2310.05737, 2023 | 178 | 2023 |
VideoPoet: A large language model for zero-shot video generation D Kondratyuk, L Yu, X Gu, J Lezama, J Huang, R Hornung, H Adam, ... The Forty-first International Conference on Machine Learning, 2024 | 165 | 2024 |
Photorealistic video generation with diffusion models A Gupta, L Yu, K Sohn, X Gu, M Hahn, FF Li, I Essa, L Jiang, J Lezama European Conference on Computer Vision, 393-411, 2025 | 118 | 2025 |
Interspecies knowledge transfer for facial keypoint detection M Rashid, X Gu, YJ Lee Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2017 | 66 | 2017 |
Password-conditioned Anonymization and Deanonymization with Face Identity Transformers X Gu, W Luo, MS Ryoo, YJ Lee European Conference on Computer Vision (ECCV), 2020 | 58 | 2020 |
A simple zero-shot prompt weighting technique to improve prompt ensembling in text-image models JU Allingham, J Ren, MW Dusenberry, X Gu, Y Cui, D Tran, JZ Liu, ... International Conference on Machine Learning, 547-568, 2023 | 37 | 2023 |
DaTaSeg: Taming a universal multi-dataset multi-task segmentation model X Gu, Y Cui, J Huang, A Rashwan, X Yang, X Zhou, G Ghiasi, W Kuo, ... Advances in Neural Information Processing Systems 36, 2024 | 23 | 2024 |
Pixel-Aligned Language Model J Xu, X Zhou, S Yan, X Gu, A Arnab, C Sun, X Wang, C Schmid Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 23 | 2024 |
CLIP as RNN: Segment countless visual concepts without training endeavor S Sun, R Li, P Torr, X Gu, S Li Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 22 | 2024 |
A revisit on deep hashings for large-scale content based image retrieval D Cai, X Gu, C Wang arXiv preprint arXiv:1711.06016, 2017 | 16 | 2017 |
PolyMaX: General dense prediction with mask transformer X Yang, L Yuan, K Wilber, A Sharma, X Gu, S Qiao, S Debats, H Wang, ... Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2024 | 11 | 2024 |
Open-vocabulary image segmentation G Ghiasi, X Gu, Y Cui, TY Lin arXiv preprint arXiv:2112.12143, 2021 | 6 | 2021 |
Open-vocabulary temporal action detection with off-the-shelf image-text features V Rathod, B Seybold, S Vijayanarasimhan, A Myers, X Gu, V Birodkar, ... arXiv preprint arXiv:2212.10596, 2022 | 5 | 2022 |
Explore deep graph generation X Gu | 1 | 2019 |
Human or robot X Gu, S Shi | 1 | 2017 |
Language-Guided Image Tokenization for Generation K Zha, L Yu, A Fathi, DA Ross, C Schmid, D Katabi, X Gu arXiv preprint arXiv:2412.05796, 2024 | | 2024 |