Speech emotion recognition with co-attention based multi-level acoustic information | H Zou, Y Si, C Chen, D Rajan, ES Chng | ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 86 | 2022 |
Interactive audio-text representation for automated audio captioning with contrastive learning | C Chen, N Hou, Y Hu, H Zou, X Qi, ES Chng | arXiv preprint arXiv:2203.15526, 2022 | 20 | 2022 |
Self-critical sequence training for automatic speech recognition | C Chen, Y Hu, N Hou, X Qi, H Zou, ES Chng | ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 17 | 2022 |
Leveraging modality-specific representations for audio-visual speech recognition via reinforcement learning | C Chen, Y Hu, Q Zhang, H Zou, B Zhu, ES Chng | Proceedings of the AAAI Conference on Artificial Intelligence 37 (11), 12607 …, 2023 | 16 | 2023 |
Unifying speech enhancement and separation with gradient modulation for end-to-end noise-robust speech separation | Y Hu, C Chen, H Zou, X Zhong, ES Chng | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 11 | 2023 |
Edge-GAN: Edge conditioned multi-view face image generation | H Zou, KE Ak, AA Kassim | 2020 IEEE International Conference on Image Processing (ICIP), 2401-2405, 2020 | 9 | 2020 |
Unsupervised noise adaptation using data simulation | C Chen, Y Hu, H Zou, L Sun, ES Chng | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 8 | 2023 |
MIR-GAN: Refining frame-level modality-invariant representations with adversarial network for audio-visual speech recognition | Y Hu, C Chen, R Li, H Zou, ES Chng | arXiv preprint arXiv:2306.10567, 2023 | 4 | 2023 |
UniS-MMC: Multimodal classification via unimodality-supervised multimodal contrastive learning | H Zou, M Shen, C Chen, Y Hu, D Rajan, ES Chng | arXiv preprint arXiv:2305.09299, 2023 | 4 | 2023 |
Cross-modal global interaction and local alignment for audio-visual speech recognition | Y Hu, R Li, C Chen, H Zou, Q Zhu, ES Chng | arXiv preprint arXiv:2305.09212, 2023 | 4 | 2023 |
Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection | H Zou, M Shen, Y Hu, C Chen, ES Chng, D Rajan | arXiv preprint arXiv:2401.05746, 2024 | | 2024 |
Towards Balanced Active Learning for Multimodal Classification | M Shen, Y Huang, J Yin, H Zou, D Rajan, S See | Proceedings of the 31st ACM International Conference on Multimedia, 3434-3445, 2023 | | 2023 |