Truly proximal policy optimization Y Wang, H He, C Wen, X Tan Uncertainty in Artificial Intelligence, 2019 | 166 | 2019 |
Trust region-guided proximal policy optimization Y Wang, H He, X Tan, Y Gan Advances in Neural Information Processing Systems, 2019 | 67 | 2019 |
Mindstorms in Natural Language-Based Societies of Mind M Zhuge, H Liu, F Faccio, DR Ashley, R Csordás, A Gopalakrishnan, ... NeurIPSW (Best Paper Award), 2023 | 60 | 2023 |
SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning X Yao, C Wen, Y Wang, X Tan AAAI Conference on Artificial Intelligence, 2019 | 51 | 2019 |
Pornographic image recognition via weighted multiple instance learning X Jin, Y Wang, X Tan IEEE Transactions on Cybernetics 49 (12), 4412-4420, 2018 | 45 | 2018 |
Pornographic image recognition by strongly-supervised deep multiple instance learning Y Wang, X Jin, X Tan 2016 IEEE International Conference on Image Processing (ICIP), 4418-4422, 2016 | 32 | 2016 |
A cooperative-competitive multi-agent framework for auto-bidding in online advertising C Wen, M Xu, Z Zhang, Z Zheng, Y Wang, X Liu, Y Rong, D Xie, X Tan, ... Proceedings of the Fifteenth ACM International Conference on Web Search and …, 2022 | 13 | 2022 |
Guiding online reinforcement learning with action-free offline pretraining D Zhu, Y Wang, J Schmidhuber, M Elhoseiny arXiv preprint arXiv:2301.12876, 2023 | 9 | 2023 |
Learning to identify critical states for reinforcement learning from videos H Liu, M Zhuge, B Li, Y Wang, F Faccio, B Ghanem, J Schmidhuber Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 9 | 2023 |
Deep Recurrent Belief Propagation Network for POMDPs Y Wang, X Tan Proceedings of the AAAI Conference on Artificial Intelligence 35 (11), 10236 …, 2021 | 9 | 2021 |
ACRM: Attention cascade R-CNN with Mix-NMS for metallic surface defect detection J Fang, X Tan, Y Wang 2020 25th International Conference on Pattern Recognition (ICPR), 423-430, 2021 | 8 | 2021 |
Alleviating the estimation bias of deep deterministic policy gradient via co-regularization Y Li, YH Wang, YZ Gan, XY Tan Pattern Recognition 131, 108872, 2022 | 7 | 2022 |
Greedy Multi-Step Off-Policy Reinforcement Learning Y Wang, X Tan Deep Reinforcement Learning Workshop, NeurIPS 2020, 2020 | 3 | 2020 |
Highway reinforcement learning Y Wang, M Strupl, F Faccio, Q Wu, H Liu, M Grudzień, X Tan, ... arXiv preprint arXiv:2405.18289, 2024 | 2 | 2024 |
Highway Value Iteration Networks Y Wang, W Li, F Faccio, Q Wu, J Schmidhuber arXiv preprint arXiv:2406.03485, 2024 | 1 | 2024 |
Greedy-Step Off-Policy Reinforcement Learning Y Wang, Q Wu, P He, X Tan arXiv preprint arXiv:2102.11717, 2021 | 1 | 2021 |
Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning Y Wang, Q Wu, W Li, DR Ashley, F Faccio, C Huang, J Schmidhuber arXiv preprint arXiv:2406.08404, 2024 | | 2024 |