Samy Jelassi
Verified email at fas.harvard.edu
Title
Cited by
Year
A momentumized, adaptive, dual averaged gradient method
A Defazio, S Jelassi
Journal of Machine Learning Research 23 (144), 1-34, 2022
Cited by 75*, 2022
Global convergence of neuron birth-death dynamics
G Rotskoff, S Jelassi, J Bruna, E Vanden-Eijnden
International Conference on Machine Learning, 2019
Cited by 63*, 2019
Vision transformers provably learn spatial structure
S Jelassi, M Sander, Y Li
Advances in Neural Information Processing Systems 35, 37822-37836, 2022
Cited by 61, 2022
A mean-field analysis of two-player zero-sum games
C Domingo-Enrich, S Jelassi, A Mensch, G Rotskoff, J Bruna
Advances in Neural Information Processing Systems 33, 20215-20226, 2020
Cited by 57, 2020
A permutation-equivariant neural network architecture for auction design
J Rahme, S Jelassi, J Bruna, SM Weinberg
Proceedings of the AAAI Conference on Artificial Intelligence 35 (6), 5664-5672, 2021
Cited by 48, 2021
Auction learning as a two-player game
J Rahme, S Jelassi, SM Weinberg
arXiv preprint arXiv:2006.05684, 2020
Cited by 47, 2020
Towards understanding how momentum improves generalization in deep learning
S Jelassi, Y Li
International Conference on Machine Learning, 9965-10040, 2022
Cited by 32, 2022
Smoothed analysis of the low-rank approach for smooth semidefinite programs
T Pumir, S Jelassi, N Boumal
Advances in Neural Information Processing Systems 31, 2018
Cited by 28, 2018
Repeat after me: Transformers are better than state space models at copying
S Jelassi, D Brandfonbrener, SM Kakade, E Malach
arXiv preprint arXiv:2402.01032, 2024
Cited by 23, 2024
Length generalization in arithmetic transformers
S Jelassi, S d'Ascoli, C Domingo-Enrich, Y Wu, Y Li, F Charton
arXiv preprint arXiv:2306.15400, 2023
Cited by 21, 2023
Towards closing the gap between the theory and practice of SVRG
O Sebbouh, N Gazagnadou, S Jelassi, F Bach, R Gower
Advances in Neural Information Processing Systems 32, 2019
Cited by 20, 2019
Dissecting adaptive methods in GANs
S Jelassi, D Dobre, A Mensch, Y Li, G Gidel
arXiv preprint arXiv:2210.04319, 2022
Cited by 18*, 2022
Depth separation beyond radial functions
L Venturi, S Jelassi, T Ozuch, J Bruna
Journal of Machine Learning Research 23 (122), 1-56, 2022
Cited by 17, 2022
Extra-gradient with player sampling for faster convergence in n-player games
S Jelassi, C Domingo-Enrich, D Scieur, A Mensch, J Bruna
International Conference on Machine Learning, 4736-4745, 2020
Cited by 13*, 2020
Depth Dependence of μP Learning Rates in ReLU MLPs
S Jelassi, B Hanin, Z Ji, SJ Reddi, S Bhojanapalli, S Kumar
arXiv preprint arXiv:2305.07810, 2023
Cited by 3, 2023
Universal Length Generalization with Turing Programs
K Hou, D Brandfonbrener, S Kakade, S Jelassi, E Malach
arXiv preprint arXiv:2407.03310, 2024
2024
How Does Overparameterization Affect Features?
AC Duzgun, S Jelassi, Y Li
arXiv preprint arXiv:2407.00968, 2024
2024
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
K Li, S Jelassi, H Zhang, S Kakade, M Wattenberg, D Brandfonbrener
arXiv preprint arXiv:2402.14688, 2024
2024
Algorithmic and Architectural Implicit Biases in Deep Learning
S Jelassi
Princeton University, 2023
2023