UniVL: A unified video and language pre-training model for multimodal understanding and generation H Luo, L Ji, B Shi, H Huang, N Duan, T Li, J Li, T Bharti, M Zhou arXiv preprint arXiv:2002.06353, 2020 | 471 | 2020 |
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui, W Tong, K Hu, J Luo, Z Ma, ... arXiv preprint arXiv:2404.16821, 2024 | 162 | 2024 |
Multi-modal sensor fusion for auto driving perception: A survey K Huang, B Shi, X Li, X Li, S Huang, Y Li arXiv preprint arXiv:2202.02703, 2022 | 126 | 2022 |
Drive like a human: Rethinking autonomous driving with large language models D Fu, X Li, L Wen, M Dou, P Cai, B Shi, Y Qiao Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2024 | 115 | 2024 |
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models L Wen, D Fu, X Li, X Cai, T Ma, P Cai, M Dou, B Shi, L He, Y Qiao The Twelfth International Conference on Learning Representations (ICLR), 2024 | 100 | 2024 |
LoGoNet: Towards accurate 3D object detection with local-to-global cross-modal fusion X Li, T Ma, Y Hou, B Shi, Y Yang, Y Liu, X Wu, Q Chen, Y Li, Y Qiao, L He Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 96 | 2023 |
Knowledge Aware Semantic Concept Expansion for Image-Text Matching. B Shi, L Ji, P Lu, Z Niu, N Duan Proceedings of the Twenty-Eighth International Joint Conference on …, 2019 | 79 | 2019 |
Dense procedure captioning in narrated instructional videos B Shi, L Ji, Y Liang, N Duan, P Chen, Z Niu, M Zhou Proceedings of the 57th annual meeting of the association for computational …, 2019 | 78 | 2019 |
Microsoft concept graph: Mining semantic concepts for short text understanding L Ji, Y Wang, B Shi, D Zhang, Z Wang, J Yan Data Intelligence 1 (3), 238-270, 2019 | 65 | 2019 |
StreetSurf: Extending multi-view implicit surface reconstruction to street views J Guo, N Deng, X Li, Y Bai, B Shi, C Wang, C Ding, D Wang, Y Li arXiv preprint arXiv:2306.04988, 2023 | 56 | 2023 |
On the road with GPT-4V(ision): Early explorations of visual-language model on autonomous driving L Wen, X Yang, D Fu, X Wang, P Cai, X Li, T Ma, Y Li, L Xu, D Shang, ... arXiv preprint arXiv:2311.05332, 2023 | 53 | 2023 |
Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection X Li, B Shi, Y Hou, X Wu, T Ma, Y Li, L He Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel …, 2022 | 46 | 2022 |
Multi-sensor Fusion and Cooperative Perception for Autonomous Driving: A Review C Xiang, C Feng, X Xie, B Shi, H Lu, Y Lv, M Yang, Z Niu IEEE Intelligent Transportation Systems Magazine, 2023 | 36 | 2023 |
Uni3D: A unified baseline for multi-dataset 3D object detection B Zhang, J Yuan, B Shi, T Chen, Y Li, Y Qiao Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 36 | 2023 |
Bi3D: Bi-domain active learning for cross-domain 3D object detection J Yuan, B Zhang, X Yan, T Chen, B Shi, Y Li, Y Qiao Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 29 | 2023 |
Learning semantic concepts and temporal alignment for narrated video procedural captioning B Shi, L Ji, Z Niu, N Duan, M Zhou, X Chen Proceedings of the 28th ACM international conference on multimedia, 4355-4363, 2020 | 26 | 2020 |
A benchmark for structured procedural knowledge extraction from cooking videos FF Xu, L Ji, B Shi, J Du, G Neubig, Y Bisk, N Duan arXiv preprint arXiv:2005.00706, 2020 | 24 | 2020 |
ChartX & ChartVLM: A versatile benchmark and foundation model for complicated chart reasoning R Xia, B Zhang, H Ye, X Yan, Q Liu, H Zhou, Z Chen, M Dou, B Shi, J Yan, ... arXiv preprint arXiv:2402.12185, 2024 | 21 | 2024 |
DetZero: Rethinking offboard 3D object detection with long-term sequential point clouds T Ma, X Yang, H Zhou, X Li, B Shi, J Liu, Y Yang, Z Liu, L He, Y Qiao, Y Li, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 20 | 2023 |
Is Sora a world simulator? A comprehensive survey on general world models and beyond Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou, Y Wang, B Shi, K Wang, ... arXiv preprint arXiv:2405.03520, 2024 | 19 | 2024 |