Levent Sagun
Facebook AI Research
Verified email at fb.com
Title
Cited by
Year
Entropy-SGD: Biasing gradient descent into wide valleys
P Chaudhari, A Choromanska, S Soatto, Y LeCun, C Baldassi, C Borgs, ...
arXiv preprint arXiv:1611.01838, 2016
Cited by: 377; Year: 2016
SearchQA: A new Q&A dataset augmented with context from a search engine
M Dunn, L Sagun, M Higgins, VU Guney, V Cirik, K Cho
arXiv preprint arXiv:1704.05179, 2017
Cited by: 193; Year: 2017
Empirical analysis of the Hessian of over-parametrized neural networks
L Sagun, U Evci, VU Guney, Y Dauphin, L Bottou
arXiv preprint arXiv:1706.04454, 2017
Cited by: 157; Year: 2017
Eigenvalues of the Hessian in deep learning: Singularity and beyond
L Sagun, L Bottou, Y LeCun
arXiv preprint arXiv:1611.07476, 2016
Cited by: 104*; Year: 2016
A jamming transition from under- to over-parametrization affects generalization in deep learning
S Spigler, M Geiger, S d’Ascoli, L Sagun, G Biroli, M Wyart
Journal of Physics A: Mathematical and Theoretical 52 (47), 474001, 2019
Cited by: 76*; Year: 2019
Scaling description of generalization with number of parameters in deep learning
M Geiger, A Jacot, S Spigler, F Gabriel, L Sagun, S d’Ascoli, G Biroli, ...
Journal of Statistical Mechanics: Theory and Experiment 2020 (2), 023401, 2020
Cited by: 75; Year: 2020
Energy landscapes for machine learning
AJ Ballard, R Das, S Martiniani, D Mehta, L Sagun, JD Stevenson, ...
Physical Chemistry Chemical Physics 19 (20), 12585-12603, 2017
Cited by: 74; Year: 2017
Jamming transition as a paradigm to understand the loss landscape of deep neural networks
M Geiger, S Spigler, S d'Ascoli, L Sagun, M Baity-Jesi, G Biroli, M Wyart
Physical Review E 100 (1), 012115, 2019
Cited by: 70; Year: 2019
A tail-index analysis of stochastic gradient noise in deep neural networks
U Simsekli, L Sagun, M Gurbuzbalaban
International Conference on Machine Learning, 5827-5837, 2019
Cited by: 65; Year: 2019
Comparing dynamics: Deep neural networks versus glassy systems
M Baity-Jesi, L Sagun, M Geiger, S Spigler, GB Arous, C Cammarota, ...
International Conference on Machine Learning, 314-323, 2018
Cited by: 60; Year: 2018
Explorations on high dimensional landscapes
L Sagun, VU Guney, GB Arous, Y LeCun
arXiv preprint arXiv:1412.6615, 2014
Cited by: 50; Year: 2014
Early Predictability of Asylum Court Decisions
M Dunn, H Sirin, L Sagun, D Chen
Cited by: 22*; Year: 2017
Triple descent and the two kinds of overfitting: Where & why do they appear?
S d'Ascoli, L Sagun, G Biroli
arXiv preprint arXiv:2006.03509, 2020
Cited by: 10; Year: 2020
Universal halting times in optimization and machine learning
L Sagun, T Trogdon, Y LeCun
arXiv preprint arXiv:1511.06444, 2015
Cited by: 10*; Year: 2015
Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias
S d'Ascoli, L Sagun, J Bruna, G Biroli
arXiv preprint arXiv:1906.06766, 2019
Cited by: 7; Year: 2019
Easing non-convex optimization with neural networks
D Lopez-Paz, L Sagun
Cited by: 3; Year: 2018
On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
U Şimşekli, M Gürbüzbalaban, TH Nguyen, G Richard, L Sagun
arXiv preprint arXiv:1912.00018, 2019
Cited by: 1; Year: 2019
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
S d'Ascoli, H Touvron, M Leavitt, A Morcos, G Biroli, L Sagun
arXiv preprint arXiv:2103.10697, 2021
Year: 2021
More data or more parameters? Investigating the effect of data structure on generalization
S d'Ascoli, M Gabrié, L Sagun, G Biroli
arXiv preprint arXiv:2103.05524, 2021
Year: 2021
Post-Workshop Report on Science meets Engineering in Deep Learning, NeurIPS 2019, Vancouver
L Sagun, C Gulcehre, A Romero, N Rostemzadeh, SS Mannelli
arXiv preprint arXiv:2007.13483, 2020
Year: 2020