Botian Shi
Shanghai Artificial Intelligence Laboratory
Verified email at pjlab.org.cn
Title
Cited by
Year
UniVL: A unified video and language pre-training model for multimodal understanding and generation
H Luo, L Ji, B Shi, H Huang, N Duan, T Li, J Li, T Bharti, M Zhou
arXiv preprint arXiv:2002.06353, 2020
Cited by: 471; Year: 2020
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui, W Tong, K Hu, J Luo, Z Ma, ...
arXiv preprint arXiv:2404.16821, 2024
Cited by: 162; Year: 2024
Multi-modal sensor fusion for auto driving perception: A survey
K Huang, B Shi, X Li, X Li, S Huang, Y Li
arXiv preprint arXiv:2202.02703, 2022
Cited by: 126; Year: 2022
Drive like a human: Rethinking autonomous driving with large language models
D Fu, X Li, L Wen, M Dou, P Cai, B Shi, Y Qiao
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2024
Cited by: 115; Year: 2024
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models
L Wen, D Fu, X Li, X Cai, T Ma, P Cai, M Dou, B Shi, L He, Y Qiao
The Twelfth International Conference on Learning Representations (ICLR), 2024
Cited by: 100; Year: 2024
LoGoNet: Towards accurate 3D object detection with local-to-global cross-modal fusion
X Li, T Ma, Y Hou, B Shi, Y Yang, Y Liu, X Wu, Q Chen, Y Li, Y Qiao, L He
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 96; Year: 2023
Knowledge Aware Semantic Concept Expansion for Image-Text Matching
B Shi, L Ji, P Lu, Z Niu, N Duan
Proceedings of the Twenty-Eighth International Joint Conference on …, 2019
Cited by: 79; Year: 2019
Dense procedure captioning in narrated instructional videos
B Shi, L Ji, Y Liang, N Duan, P Chen, Z Niu, M Zhou
Proceedings of the 57th Annual Meeting of the Association for Computational …, 2019
Cited by: 78; Year: 2019
Microsoft concept graph: Mining semantic concepts for short text understanding
L Ji, Y Wang, B Shi, D Zhang, Z Wang, J Yan
Data Intelligence 1 (3), 238-270, 2019
Cited by: 65; Year: 2019
StreetSurf: Extending multi-view implicit surface reconstruction to street views
J Guo, N Deng, X Li, Y Bai, B Shi, C Wang, C Ding, D Wang, Y Li
arXiv preprint arXiv:2306.04988, 2023
Cited by: 56; Year: 2023
On the road with GPT-4V(ision): Early explorations of visual-language model on autonomous driving
L Wen, X Yang, D Fu, X Wang, P Cai, X Li, T Ma, Y Li, L Xu, D Shang, ...
arXiv preprint arXiv:2311.05332, 2023
Cited by: 53; Year: 2023
Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection
X Li, B Shi, Y Hou, X Wu, T Ma, Y Li, L He
Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel …, 2022
Cited by: 46; Year: 2022
Multi-sensor Fusion and Cooperative Perception for Autonomous Driving: A Review
C Xiang, C Feng, X Xie, B Shi, H Lu, Y Lv, M Yang, Z Niu
IEEE Intelligent Transportation Systems Magazine, 2023
Cited by: 36; Year: 2023
Uni3D: A unified baseline for multi-dataset 3D object detection
B Zhang, J Yuan, B Shi, T Chen, Y Li, Y Qiao
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 36; Year: 2023
Bi3D: Bi-domain active learning for cross-domain 3D object detection
J Yuan, B Zhang, X Yan, T Chen, B Shi, Y Li, Y Qiao
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 29; Year: 2023
Learning semantic concepts and temporal alignment for narrated video procedural captioning
B Shi, L Ji, Z Niu, N Duan, M Zhou, X Chen
Proceedings of the 28th ACM International Conference on Multimedia, 4355-4363, 2020
Cited by: 26; Year: 2020
A benchmark for structured procedural knowledge extraction from cooking videos
FF Xu, L Ji, B Shi, J Du, G Neubig, Y Bisk, N Duan
arXiv preprint arXiv:2005.00706, 2020
Cited by: 24; Year: 2020
ChartX & ChartVLM: A versatile benchmark and foundation model for complicated chart reasoning
R Xia, B Zhang, H Ye, X Yan, Q Liu, H Zhou, Z Chen, M Dou, B Shi, J Yan, ...
arXiv preprint arXiv:2402.12185, 2024
Cited by: 21; Year: 2024
DetZero: Rethinking offboard 3D object detection with long-term sequential point clouds
T Ma, X Yang, H Zhou, X Li, B Shi, J Liu, Y Yang, Z Liu, L He, Y Qiao, Y Li, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 20; Year: 2023
Is Sora a world simulator? A comprehensive survey on general world models and beyond
Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou, Y Wang, B Shi, K Wang, ...
arXiv preprint arXiv:2405.03520, 2024
Cited by: 19; Year: 2024