Yinlam Chow
Title
Cited by
Year
A Lyapunov-based approach to safe reinforcement learning
Y Chow, O Nachum, E Duenez-Guzman, M Ghavamzadeh
Advances in Neural Information Processing Systems 31, 2018
Cited by: 619; year: 2018
Risk-constrained reinforcement learning with percentile risk criteria
Y Chow, M Ghavamzadeh, L Janson, M Pavone
Journal of Machine Learning Research 18 (167), 1-51, 2018
Cited by: 609; year: 2018
Algorithms for CVaR optimization in MDPs
Y Chow, M Ghavamzadeh
Advances in Neural Information Processing Systems 27, 2014
Cited by: 394; year: 2014
Risk-sensitive and robust decision-making: a CVaR optimization approach
Y Chow, A Tamar, S Mannor, M Pavone
Advances in Neural Information Processing Systems 28, 2015
Cited by: 392; year: 2015
DualDICE: Behavior-agnostic estimation of discounted stationary distribution corrections
O Nachum, Y Chow, B Dai, L Li
Advances in Neural Information Processing Systems 32, 2019
Cited by: 370; year: 2019
Lyapunov-based safe policy optimization for continuous control
Y Chow, O Nachum, A Faust, E Duenez-Guzman, M Ghavamzadeh
arXiv preprint arXiv:1901.10031, 2019
Cited by: 288; year: 2019
More robust doubly robust off-policy evaluation
M Farajtabar, Y Chow, M Ghavamzadeh
International Conference on Machine Learning, 1447-1456, 2018
Cited by: 288; year: 2018
AlgaeDICE: Policy gradient from arbitrary experience
O Nachum, B Dai, I Kostrikov, Y Chow, L Li, D Schuurmans
arXiv preprint arXiv:1912.02074, 2019
Cited by: 259; year: 2019
Safe policy improvement by minimizing robust baseline regret
M Ghavamzadeh, M Petrik, Y Chow
Advances in Neural Information Processing Systems 29, 2016
Cited by: 161; year: 2016
Policy gradient for coherent risk measures
A Tamar, Y Chow, M Ghavamzadeh, S Mannor
Advances in Neural Information Processing Systems 28, 2015
Cited by: 150; year: 2015
CoinDICE: Off-policy confidence interval estimation
B Dai, O Nachum, Y Chow, L Li, C Szepesvári, D Schuurmans
Advances in Neural Information Processing Systems 33, 9398-9411, 2020
Cited by: 91; year: 2020
Sequential decision making with coherent risk
A Tamar, Y Chow, M Ghavamzadeh, S Mannor
IEEE Transactions on Automatic Control 62 (7), 3323-3338, 2016
Cited by: 86; year: 2016
A framework for time-consistent, risk-sensitive model predictive control: Theory and algorithms
S Singh, Y Chow, A Majumdar, M Pavone
IEEE Transactions on Automatic Control 64 (7), 2905-2912, 2018
Cited by: 71; year: 2018
Online modified greedy algorithm for storage control under uncertainty
J Qin, Y Chow, J Yang, R Rajagopal
IEEE Transactions on Power Systems 31 (3), 1729-1743, 2015
Cited by: 65; year: 2015
CAQL: Continuous action Q-learning
M Ryu, Y Chow, R Anderson, C Tjandraatmadja, C Boutilier
arXiv preprint arXiv:1909.12397, 2019
Cited by: 58; year: 2019
Latent bandits revisited
J Hong, B Kveton, M Zaheer, Y Chow, A Ahmed, C Boutilier
Advances in Neural Information Processing Systems 33, 13423-13433, 2020
Cited by: 56; year: 2020
Weighted SGD for Regression with Randomized Preconditioning
J Yang, YL Chow, C Ré, MW Mahoney
Journal of Machine Learning Research 18 (211), 1-43, 2018
Cited by: 53; year: 2018
Distributed online modified greedy algorithm for networked storage operation under uncertainty
J Qin, Y Chow, J Yang, R Rajagopal
IEEE Transactions on Smart Grid 7 (2), 1106-1118, 2015
Cited by: 46; year: 2015
A framework for time-consistent, risk-averse model predictive control: Theory and algorithms
YL Chow, M Pavone
2014 American Control Conference, 4204-4211, 2014
Cited by: 45; year: 2014
Efficient risk-averse reinforcement learning
I Greenberg, Y Chow, M Ghavamzadeh, S Mannor
Advances in Neural Information Processing Systems 35, 32639-32652, 2022
Cited by: 43; year: 2022