Gao Peng
Dynamic fusion with intra- and inter-modality attention flow for visual question answering
P Gao, Z Jiang, H You, P Lu, SCH Hoi, X Wang, H Li
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019
End-to-end object detection with adaptive clustering transformer
M Zheng, P Gao, R Zhang, K Li, X Wang, H Li, H Dong
arXiv preprint arXiv:2011.09315, 2020
Question-guided hybrid convolution for visual question answering
P Gao, H Li, S Li, P Lu, Y Li, SCH Hoi, X Wang
Proceedings of the European Conference on Computer Vision (ECCV), 469-485, 2018
Fast convergence of DETR with spatially modulated co-attention
P Gao, M Zheng, X Wang, J Dai, H Li
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Multi-modality latent interaction network for visual question answering
P Gao, H You, Z Zhang, X Wang, H Li
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Video object detection with locally-weighted deformable neighbors
Z Jiang, P Gao, C Guo, Q Zhang, S Xiang, C Pan
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 8529-8536, 2019
Container: Context aggregation network
P Gao, J Lu, H Li, R Mottaghi, A Kembhavi
arXiv preprint arXiv:2106.01401, 2021
Learning where to focus for efficient video object detection
Z Jiang, Y Liu, C Yang, J Liu, P Gao, Q Zhang, S Xiang, C Pan
European conference on computer vision, 18-34, 2020
Dynamic graph representation learning for video dialog via multi-modal shuffled transformers
S Geng, P Gao, M Chatterjee, C Hori, J Le Roux, Y Zhang, H Li, A Cherian
Proceedings of the AAAI Conference on Artificial Intelligence 35 (2), 1415-1423, 2021
CLIP-Adapter: Better vision-language models with feature adapters
P Gao, S Geng, R Zhang, T Ma, R Fang, Y Zhang, H Li, Y Qiao
arXiv preprint arXiv:2110.04544, 2021
Dual-stream network for visual recognition
M Mao, R Zhang, H Zheng, T Ma, Y Peng, E Ding, B Zhang, S Han
Advances in Neural Information Processing Systems 34, 25346-25358, 2021
Contrastive visual-linguistic pretraining
L Shi, K Shuang, S Geng, P Su, Z Jiang, P Gao, Z Fu, G de Melo, S Su
arXiv preprint arXiv:2007.13135, 2020
Tip-Adapter: Training-free CLIP-Adapter for better vision-language modeling
R Zhang, R Fang, P Gao, W Zhang, K Li, J Dai, Y Qiao, H Li
arXiv preprint arXiv:2111.03930, 2021
UniFormer: Unifying convolution and self-attention for visual recognition
K Li, Y Wang, J Zhang, P Gao, G Song, Y Liu, H Li, Y Qiao
arXiv preprint arXiv:2201.09450, 2022
UniFormer: Unified transformer for efficient spatiotemporal representation learning
K Li, Y Wang, P Gao, G Song, Y Liu, H Li, Y Qiao
arXiv preprint arXiv:2201.04676, 2022
Accurate and efficient image super-resolution via global-local adjusting dense network
X Zhang, P Gao, S Liu, K Zhao, G Li, L Yin, CW Chen
IEEE Transactions on Multimedia 23, 1924-1937, 2020
Character matters: Video story understanding with character-aware relations
S Geng, J Zhang, Z Fu, P Gao, H Zhang, G de Melo
arXiv preprint arXiv:2005.08646, 2020
Oriented object detection with transformer
T Ma, M Mao, H Zheng, P Gao, X Wang, S Han, E Ding, B Zhang, ...
arXiv preprint arXiv:2106.03146, 2021
RomeBERT: Robust training of multi-exit BERT
S Geng, P Gao, Z Fu, Y Zhang
arXiv preprint arXiv:2101.09755, 2021
Multi-layer content interaction through quaternion product for visual question answering
L Shi, S Geng, K Shuang, C Hori, S Liu, P Gao, S Su
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020