Planting a Seed of Vision in Large Language Model

Talk by Yixiao GE
An Introduction to Balanced Multimodal Learning

Talk by Di HU
On Robustness of Vision Models in the 3D World

Talk by Yinpeng DONG
Learning Universal Representations Across Tasks and Domains

Talk by Wei-Hong LI
Towards Efficient Video Understanding

Talk by Xiaojun CHANG
The Rise of Vision-Language Foundation Models: Methods, Evaluation and Applications

Talk by Tiancheng ZHAO
RHOS: Robot, Human, Object, and Scene

Talk by Yong-lu LI
Recent Progress of Generative AI in HUAWEI Noah’s Ark Lab

Talk by Enze XIE
Developing an Internet and Blockchain Emulator for Research and Education

Talk by Wenliang DU
Cross-modal Pretraining for Open-set Detection and Segmentation

Talk by Xiaodan LIANG