An Introduction to Balanced Multimodal Learning

Talk by Di Hu

Nov 02, 2023 Thursday

Abstract:

“The whole is greater than the sum of its parts” is a fascinating phenomenon that cognitive neuroscientists discovered in the cells of the superior colliculus of the brain: the response to a combined visual, auditory, and somatosensory stimulus is greater than the sum of the responses to each of the three stimuli presented alone. In multimodal machine learning, we often introduce additional modalities to improve performance on tasks that were originally uni-modal, such as RGB-D scene recognition, audiovisual speech recognition, and RGB-optical flow action recognition. However, recent research has found that existing joint learning paradigms tend to ignore the heterogeneous characteristics of different modalities. This results in a serious imbalance in how the model utilizes each modality: only a specific modality is fully learned, which hinders the potential of multimodal learning and may even lead to the disastrous consequence of “1+1<1”. In this talk, I will start with the differences among multimodal data in terms of model architecture, optimization, learning objective, etc., and introduce recent work on balanced multimodal learning. I will then discuss how to improve the quality of multimodal cooperation from a theoretical viewpoint.

Time:

Nov 02, 2023 Thursday

16:30-17:20

Location:

RmE1-101, GZ Campus

Zoom:

628 334 1826 (PW: 234567)

Bilibili Live:

ID: 30748067

Speaker Bio:


Prof. Di Hu

Tenure-track Associate Professor, Gaoling School of Artificial Intelligence, Renmin University of China

Di Hu is a tenure-track Associate Professor at the Gaoling School of Artificial Intelligence, Renmin University of China. His research interests include multimodal perception and learning. He has published more than 30 peer-reviewed papers in top conferences and journals, including TPAMI, NeurIPS, CVPR, ICCV, and ECCV. He has served as a PC/Senior PC member of several top-tier conferences and co-organized several tutorials at top-tier conferences. Di is the recipient of the Outstanding Doctoral Dissertation Award from the Chinese Association for Artificial Intelligence and of the ACM XI’AN Doctoral Dissertation Award. He is sponsored by the Young Elite Scientists Sponsorship Program by CAST and was awarded the 2022 WuWenJun AI Excellent Young Scientist award.