Ego4D: Around the world in 3,000 hours of egocentric video K Grauman, A Westbury, E Byrne, Z Chavis, A Furnari, R Girdhar, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 973 | 2022 |
Voicebox: Text-guided multilingual universal speech generation at scale M Le, A Vyas, B Shi, B Karrer, L Sari, R Moritz, M Williamson, V Manohar, ... Advances in neural information processing systems 36, 2024 | 232 | 2024 |
A Multi-View Approach to Audio-Visual Speaker Verification L Sarı, K Singh, J Zhou, L Torresani, N Singhal, Y Saraf ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 52 | 2021 |
Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR L Sarı, N Moritz, T Hori, J Le Roux ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 46 | 2020 |
Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions C Liu, M Picheny, L Sarı, P Chitkara, A Xiao, X Zhang, M Chou, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 45 | 2022 |
Counterfactually fair automatic speech recognition L Sarı, M Hasegawa-Johnson, CD Yoo IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3515-3525, 2021 | 26 | 2021 |
Self-Supervised Representations for Singing Voice Conversion T Jayashankar, J Wu, L Sari, D Kant, V Manohar, Q He ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 19 | 2023 |
Fusion of LVCSR and posteriorgram based keyword search L Sarı, B Gündoğdu, M Saraçlar Sixteenth Annual Conference of the International Speech Communication …, 2015 | 19 | 2015 |
Pre-training of Speaker Embeddings for Low-latency Speaker Change Detection in Broadcast News L Sari, S Thomas, M Hasegawa-Johnson, M Picheny 2019 IEEE International Conference on Acoustics, Speech and Signal …, 2019 | 17 | 2019 |
Training Spoken Language Understanding Systems with Non-Parallel Speech and Text L Sarı, S Thomas, M Hasegawa-Johnson ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 16 | 2020 |
Template-based keyword search with pseudo posteriorgrams B Gündoğdu, L Sarı, G Çetinkaya, M Saraçlar Signal Processing and Communication Application Conference (SIU), 2016 24th …, 2016 | 14 | 2016 |
Auxiliary Networks for Joint Speaker Adaptation and Speaker Change Detection L Sari, M Hasegawa-Johnson, S Thomas IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 324-333, 2021 | 12 | 2021 |
Seamless equal accuracy ratio for inclusive CTC speech recognition H Gao, X Wang, S Kang, R Mina, D Issa, J Harvill, L Sari, ... Speech Communication 136, 76-83, 2022 | 11 | 2022 |
Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks L Sari, S Thomas, M Hasegawa-Johnson Interspeech, 769-773, 2019 | 9 | 2019 |
ELISA system description for LoReHLT 2017 L Cheung, T Gowda, U Hermjakob, N Liu, J May, A Mayn, ... Proc. Low Resource Human Lang. Technol, 51-59, 2017 | 9 | 2017 |
Texture defect detection using independent vector analysis in wavelet domain L Sari, A Ertüzün 2014 22nd International Conference on Pattern Recognition, 1639-1644, 2014 | 9 | 2014 |
Worldly Wise (WoW)-Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering K Ramnath, L Sari, M Hasegawa-Johnson, C Yoo Proceedings of the 2021 Conference of the North American Chapter of the …, 2021 | 8 | 2021 |
Posteriorgram based approaches in keyword search L Sarı, B Gündoğdu, M Saraçlar 2015 23rd Signal Processing and Communications Applications Conference (SIU …, 2015 | 6 | 2015 |
Towards Selection of Text-to-speech Data to Augment ASR Training S Liu, L Sarı, C Wu, G Keren, Y Shangguan, J Mahadeokar, O Kalinli arXiv preprint arXiv:2306.00998, 2023 | 4 | 2023 |
Synthetic Cross-accent Data Augmentation for Automatic Speech Recognition P Klumpp, P Chitkara, L Sarı, P Serai, J Wu, IE Veliche, R Huang, Q He arXiv preprint arXiv:2303.00802, 2023 | 4 | 2023 |