Asaf Cassel
School of Computer Science, Tel Aviv University
Verified email at mail.tau.ac.il
Title
Cited by
Year
A General Approach to Multi-Armed Bandits Under Risk Criteria
A Cassel, S Mannor, A Zeevi
Proceedings of the 31st Conference On Learning Theory 75, 1295--1306, 2018
Cited by 94, 2018
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
A Cassel, A Cohen, T Koren
Proceedings of the 37th International Conference on Machine Learning 119 …, 2020
Cited by 76, 2020
Bandit linear control
A Cassel, T Koren
Advances in Neural Information Processing Systems 33, 8872-8882, 2020
Cited by 21, 2020
Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with $\sqrt{T}$ Regret
AB Cassel, T Koren
International Conference on Machine Learning, 1304-1313, 2021
Cited by 14, 2021
A general framework for bandit problems beyond cumulative objectives
A Cassel, S Mannor, A Zeevi
Mathematics of Operations Research 48 (4), 2196-2232, 2023
Cited by 5, 2023
Rate-optimal online convex optimization in adaptive linear control
AB Cassel, A Peled-Cohen, T Koren
Advances in Neural Information Processing Systems 35, 7410-7422, 2022
Cited by 5, 2022
Efficient online linear control with stochastic convex costs and unknown dynamics
AB Cassel, A Cohen, T Koren
Conference on Learning Theory, 3589-3604, 2022
Cited by 5, 2022
Multi-turn Reinforcement Learning from Preference Human Feedback
L Shani, A Rosenberg, A Cassel, O Lang, D Calandriello, A Zipori, ...
arXiv preprint arXiv:2405.14655, 2024
Cited by 3, 2024
Efficient rate optimal regret for adversarial contextual MDPs using online function approximation
O Levy, A Cohen, A Cassel, Y Mansour
International Conference on Machine Learning, 19287-19314, 2023
Cited by 3, 2023
Eluder-based Regret for Stochastic Contextual MDPs
O Levy, A Cassel, A Cohen, Y Mansour
arXiv preprint arXiv:2211.14932, 2022
Cited by 3, 2022
A General Framework for Bandit Problems Beyond Cumulative Objectives
A Cassel, S Mannor, A Zeevi
arXiv preprint arXiv:1806.01380, 2018
Cited by 3, 2018
Near-optimal regret in linear MDPs with aggregate bandit feedback
A Cassel, H Luo, A Rosenberg, D Sotnikov
arXiv preprint arXiv:2405.07637, 2024
Cited by 2, 2024
The Pendulum Arrangement: Maximizing the Escape Time of Heterogeneous Random Walks
A Cassel, S Mannor, G Tennenholtz
arXiv preprint arXiv:2007.13232, 2020
Cited by 1, 2020
Batch Ensemble for Variance Dependent Regret in Stochastic Bandits
A Cassel, O Levy, Y Mansour
arXiv preprint arXiv:2409.08570, 2024
2024
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
A Cassel, A Rosenberg
arXiv preprint arXiv:2407.03065, 2024
2024
Counterfactual Optimism: Rate Optimal Regret for Stochastic Contextual MDPs
O Levy, AB Cassel, A Cohen, Y Mansour
CoRR, 2022
2022