Hang Chen
Title, Cited by, Year
The first multimodal information based speech processing (MISP) challenge: Data, tasks, baselines and results
H Chen, H Zhou, J Du, CH Lee, J Chen, S Watanabe, SM Siniscalchi, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 42, Year: 2022
Audio-visual speech recognition in MISP2021 challenge: Dataset release and deep analysis
H Chen, J Du, Y Dai, CH Lee, SM Siniscalchi, S Watanabe, ...
Proceedings of the Annual Conference of the International Speech …, 2022
Cited by: 24, Year: 2022
Deep neural network based regression approach for acoustic echo cancellation
Q Lei, H Chen, J Hou, L Chen, L Dai
Proceedings of the 2019 4th International Conference on Multimedia Systems …, 2019
Cited by: 21, Year: 2019
Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement
H Chen, J Du, Y Hu, LR Dai, BC Yin, CH Lee
Neural Networks 143, 171-182, 2021
Cited by: 18, Year: 2021
The multimodal information based speech processing (MISP) 2022 challenge: Audio-visual diarization and recognition
Z Wang, S Wu, H Chen, MK He, J Du, CH Lee, J Chen, S Watanabe, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 14, Year: 2023
The USTC-NERCSLIP systems for the CHiME-7 DASR challenge
R Wang, M He, J Du, H Zhou, S Niu, H Chen, Y Yue, G Yang, S Wu, L Sun, ...
arXiv preprint arXiv:2308.14638, 2023
Cited by: 6, Year: 2023
Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries
H Chen, J Du, Y Hu, LR Dai, BC Yin, CH Lee
Interspeech, 3001-3005, 2021
Cited by: 6, Year: 2021
Semi-supervised multi-channel speaker diarization with cross-channel attention
S Wu, J Du, MK He, S Niu, H Chen, H Tang, CH Lee
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by: 4, Year: 2023
Improving audio-visual speech recognition by lip-subword correlation based visual pre-training and cross-modal fusion encoder
Y Dai, H Chen, J Du, X Ding, N Ding, F Jiang, CH Lee
2023 IEEE International Conference on Multimedia and Expo (ICME), 2627-2632, 2023
Cited by: 4, Year: 2023
The multimodal information based speech processing (MISP) 2023 challenge: Audio-visual target speaker extraction
S Wu, C Wang, H Chen, Y Dai, C Zhang, R Wang, H Lan, J Du, CH Lee, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 3, Year: 2024
Incorporating visual information reconstruction into progressive learning for optimizing audio-visual speech enhancement
CY Zhang, H Chen, J Du, BC Yin, J Pan, CH Lee
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3, Year: 2023
Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments
H Zhou, J Du, H Chen, Z Jing, S Xiong, CH Lee
Interspeech, 341-345, 2021
Cited by: 3, Year: 2021
Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion
Y Jiang, H Chen, J Du, Q Wang, CH Lee
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 2, Year: 2023
Grammar-supervised end-to-end speech recognition with part-of-speech tagging and dependency parsing
G Wan, T Mao, J Zhang, H Chen, J Gao, Z Ye
Applied Sciences 13 (7), 4243, 2023
Cited by: 2, Year: 2023
Deep learning based audio-visual multi-speaker DOA estimation using permutation-free loss function
Q Wang, H Chen, Y Jiang, Z Wang, Y Wang, J Du, CH Lee
2022 13th International Symposium on Chinese Spoken Language Processing …, 2022
Cited by: 2, Year: 2022
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023
H Wang, Y Xi, H Chen, J Du, Y Song, Q Wang, H Zhou, C Wang, J Ma, ...
Proceedings of the 31st ACM International Conference on Multimedia, 9531-9535, 2023
Cited by: 1, Year: 2023
Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge
H Chen, S Wu, Y Dai, Z Wang, J Du, CH Lee, J Chen, S Watanabe, ...
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 1, Year: 2023
Multi-Task Joint Learning for Embedding Aware Audio-Visual Speech Enhancement
C Wang, H Chen, J Du, B Yin, J Pan
2022 13th International Symposium on Chinese Spoken Language Processing …, 2022
Cited by: 1, Year: 2022
Collaborative Viseme Subword and End-to-end Modeling for Word-level Lip Reading
H Chen, Q Wang, J Du, GS Wan, SF Xiong, BC Yin, J Pan, CH Lee
IEEE Transactions on Multimedia, 2024
Year: 2024
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition
Y Dai, H Chen, J Du, R Wang, S Chen, J Ma, H Wang, CH Lee
arXiv preprint arXiv:2403.04245, 2024
Year: 2024