Yuhui Wang
Title
Cited by
Year
Truly proximal policy optimization
Y Wang, H He, C Wen, X Tan
Uncertainty in Artificial Intelligence, 2019
Cited by 166 · 2019
Trust region-guided proximal policy optimization
Y Wang, H He, X Tan, Y Gan
Advances in Neural Information Processing Systems, 2019
Cited by 67 · 2019
Mindstorms in Natural Language-Based Societies of Mind
M Zhuge, H Liu, F Faccio, DR Ashley, R Csordás, A Gopalakrishnan, ...
NeurIPSW (Best Paper Award), 2023
Cited by 60 · 2023
SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning
X Yao, C Wen, Y Wang, X Tan
AAAI Conference on Artificial Intelligence, 2019
Cited by 51 · 2019
Pornographic image recognition via weighted multiple instance learning
X Jin, Y Wang, X Tan
IEEE transactions on cybernetics 49 (12), 4412-4420, 2018
Cited by 45 · 2018
Pornographic image recognition by strongly-supervised deep multiple instance learning
Y Wang, X Jin, X Tan
2016 IEEE International Conference on Image Processing (ICIP), 4418-4422, 2016
Cited by 32 · 2016
A cooperative-competitive multi-agent framework for auto-bidding in online advertising
C Wen, M Xu, Z Zhang, Z Zheng, Y Wang, X Liu, Y Rong, D Xie, X Tan, ...
Proceedings of the Fifteenth ACM International Conference on Web Search and …, 2022
Cited by 13 · 2022
Guiding online reinforcement learning with action-free offline pretraining
D Zhu, Y Wang, J Schmidhuber, M Elhoseiny
arXiv preprint arXiv:2301.12876, 2023
Cited by 9 · 2023
Learning to identify critical states for reinforcement learning from videos
H Liu, M Zhuge, B Li, Y Wang, F Faccio, B Ghanem, J Schmidhuber
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by 9 · 2023
Deep Recurrent Belief Propagation Network for POMDPs
Y Wang, X Tan
Proceedings of the AAAI Conference on Artificial Intelligence 35 (11), 10236 …, 2021
Cited by 9 · 2021
ACRM: Attention cascade R-CNN with Mix-NMS for metallic surface defect detection
J Fang, X Tan, Y Wang
2020 25th International Conference on Pattern Recognition (ICPR), 423-430, 2021
Cited by 8 · 2021
Alleviating the estimation bias of deep deterministic policy gradient via co-regularization
Y Li, YH Wang, YZ Gan, XY Tan
Pattern Recognition 131, 108872, 2022
Cited by 7 · 2022
Greedy Multi-Step Off-Policy Reinforcement Learning
Y Wang, X Tan
Deep Reinforcement Learning Workshop, NeurIPS 2020, 2020
Cited by 3 · 2020
Highway reinforcement learning
Y Wang, M Strupl, F Faccio, Q Wu, H Liu, M Grudzień, X Tan, ...
arXiv preprint arXiv:2405.18289, 2024
Cited by 2 · 2024
Highway Value Iteration Networks
Y Wang, W Li, F Faccio, Q Wu, J Schmidhuber
arXiv preprint arXiv:2406.03485, 2024
Cited by 1 · 2024
Greedy-Step Off-Policy Reinforcement Learning
Y Wang, Q Wu, P He, X Tan
arXiv preprint arXiv:2102.11717, 2021
Cited by 1 · 2021
Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning
Y Wang, Q Wu, W Li, DR Ashley, F Faccio, C Huang, J Schmidhuber
arXiv preprint arXiv:2406.08404, 2024
2024