Recent Progress of Generative AI in HUAWEI Noah’s Ark Lab

Talk by Enze XIE

Sep 22, 2023 Friday

Abstract:

Recently, research on Generative AI has gained significant momentum. Generative AI refers to the use of AI to create new content, such as text, images, music, audio, and videos. Current research in Generative AI primarily focuses on (1) Diffusion Models, mainly used for generating visual modalities, e.g., DALL-E 2 and Stable Diffusion, and (2) Large Language Models, primarily used for text generation, e.g., ChatGPT and LLaMA. This talk aims to provide some insights into my research work in Generative AI conducted at Huawei Noah’s Ark Lab over the past year.

The following works will be discussed:

1. DDP: Modeling perception tasks using Diffusion Models.

2. DiffFit: Parameter-efficient fine-tuning of large-scale pretrained diffusion models.

3. Make-A-Protagonist: A general video editing algorithm that ensembles multiple expert models.

4. DiT-3D: Transformer architecture design for 3D generation models.

5. PHP: A novel prompting method to enhance the mathematical reasoning ability of large language models.

Time:

September 22nd, 2023, Friday

11:00 – 11:50

Location:

Rm134, E1

Zoom:

628 334 1826 (PW: 234567)

Bilibili Live:

ID: 30748067

Speaker Bio:

Enze Xie

Researcher, AI Theory Lab, Huawei Noah’s Ark Lab

Dr. Enze Xie is currently a researcher at the AI Theory Lab of Huawei Noah’s Ark Lab (Hong Kong). He obtained his PhD from HKU MMLab in 2022, under the supervision of Prof. Ping Luo and Prof. Wenping Wang. His current research focuses on (1) AIGC, e.g., diffusion models for 2D/3D/video generation, and (2) large language models (LLMs), e.g., LLM long-chain reasoning and AI4Math theorem proving. Dr. Xie has published over 20 papers in top conferences and journals, including TPAMI, CVPR, ICCV, ICML, ICLR, NeurIPS, and ACL, 8 of them as first author. His research papers have been cited over 8000 times on Google Scholar, with two papers individually cited over 1500 times. Four of his papers have been selected among the top 10 influential papers at CVPR 2020, ICCV 2021, NeurIPS 2021, and ECCV 2022. His representative work, SegFormer, was presented at the NVIDIA GTC conference and was praised by NVIDIA as visionary research; it has been widely applied in major companies’ products worldwide, in areas such as autonomous driving and medical AI. Dr. Xie serves as a reviewer for several top conferences and journals, including TPAMI, IJCV, CVPR, and NeurIPS. He received the NVIDIA Fellowship Finalist Award 2022 (15 candidates worldwide) and the Outstanding Paper Award at the World Artificial Intelligence Conference (WAIC) 2023 (10 papers worldwide).