Asaf Cassel
School of Computer Science, Tel Aviv University
Verified email at mail.tau.ac.il
Title
Cited by
Year
A General Approach to Multi-Armed Bandits Under Risk Criteria
A Cassel, S Mannor, A Zeevi
Proceedings of the 31st Conference On Learning Theory 75, 1295--1306, 2018
Cited by 99, 2018
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
A Cassel, A Cohen, T Koren
Proceedings of the 37th International Conference on Machine Learning 119 …, 2020
Cited by 80, 2020
Bandit linear control
A Cassel, T Koren
Advances in Neural Information Processing Systems 33, 8872-8882, 2020
Cited by 24, 2020
Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with $\sqrt{T}$ Regret
AB Cassel, T Koren
International Conference on Machine Learning, 1304-1313, 2021
Cited by 17, 2021
Multi-turn reinforcement learning with preference human feedback
L Shani, A Rosenberg, A Cassel, O Lang, D Calandriello, A Zipori, ...
Advances in Neural Information Processing Systems 37, 118953-118993, 2025
Cited by 14, 2025
A general framework for bandit problems beyond cumulative objectives
A Cassel, S Mannor, A Zeevi
Mathematics of Operations Research 48 (4), 2196-2232, 2023
Cited by 6, 2023
Rate-optimal online convex optimization in adaptive linear control
AB Cassel, A Peled-Cohen, T Koren
Advances in Neural Information Processing Systems 35, 7410-7422, 2022
Cited by 6, 2022
Efficient online linear control with stochastic convex costs and unknown dynamics
AB Cassel, A Cohen, T Koren
Conference on Learning Theory, 3589-3604, 2022
Cited by 6, 2022
Efficient rate optimal regret for adversarial contextual MDPs using online function approximation
O Levy, A Cohen, A Cassel, Y Mansour
International Conference on Machine Learning, 19287-19314, 2023
Cited by 5, 2023
Near-optimal regret in linear MDPs with aggregate bandit feedback
A Cassel, H Luo, A Rosenberg, D Sotnikov
arXiv preprint arXiv:2405.07637, 2024
Cited by 4, 2024
Eluder-based regret for stochastic contextual MDPs
O Levy, A Cassel, A Cohen, Y Mansour
arXiv preprint arXiv:2211.14932, 2022
Cited by 4, 2022
A General Framework for Bandit Problems Beyond Cumulative Objectives
A Cassel, S Mannor, A Zeevi
arXiv preprint arXiv:1806.01380, 2018
Cited by 3, 2018
Batch Ensemble for Variance Dependent Regret in Stochastic Bandits
A Cassel, O Levy, Y Mansour
arXiv preprint arXiv:2409.08570, 2024
Cited by 1, 2024
The Pendulum Arrangement: Maximizing the Escape Time of Heterogeneous Random Walks
A Cassel, S Mannor, G Tennenholtz
arXiv preprint arXiv:2007.13232, 2020
Cited by 1, 2020
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
A Cassel, A Rosenberg
arXiv preprint arXiv:2407.03065, 2024
2024
Counterfactual Optimism: Rate Optimal Regret for Stochastic Contextual MDPs
O Levy, AB Cassel, A Cohen, Y Mansour
CoRR, 2022
2022