
Anyi Rao

Postdoctoral Scholar
Computer Science Department
Stanford University
Email: anyirao [at] stanford.edu

Profiles: Google Scholar, GitHub, LinkedIn

Bio

Anyi Rao is a Postdoctoral Scholar at Stanford with Maneesh Agrawala. He studies reliable, steerable, and explainable human-centered AI for creativity, multimodality, and film, focusing on intelligent media editing and creation and on semantic and cinematic analysis. He aims to build connections between AI and humans for collaborative intelligence and to unleash human creativity and productivity. His works include ControlNet, AnimateDiff, MovieNet, Virtual Studio, Shoot360, and CityNeRF, and have received a Marr Prize (ICCV Best Paper Award). These works have been widely used in industry, including at Amazon Prime Video, Netflix, Tencent, and more.
He received his Ph.D. from MMLab at the Chinese University of Hong Kong in 2022, advised by Dahua Lin and Bolei Zhou. He has research experience at Meta Reality Lab, the Vector Institute, the University of Toronto, and the University of Hong Kong. If you have exciting ideas and insights on the research above, please fill in this form or drop me an email. Let's push it forward together.

News

  • 2024-02: Cinematic Behavior Transfer is accepted to CVPR 2024. Our efforts on Intelligent Cinematography 🎬 Virtual Film Studio include our SIGGRAPH Virtual Dynamic Storyboard, CVPR Cinematic Behavior Transfer and ECCV Multi-camera Editing.
  • 2024-02: Serving on the ACM UIST 2024 organizing committee as Registration Co-Chair.
  • 2024-01: We are organizing the Fourth Workshop on AI for Creative Visual Content Generation, Editing and Understanding at CVPR 2024. Researchers, artists, video directors, and entrepreneurs from academia (Stanford, Berkeley), industry (Adobe, Netflix, Meta), and more will share and spark ideas together in Seattle! Please follow our Twitter for more information!
  • 2023-11: AnimateDiff is online and has been updated with SparseCtrl capability.
  • 2023-10: 🧑‍🎨 ControlNet receives the 🏆 Best Paper Award (Marr Prize) at ICCV 2023. [V1] [V1.1] [A1111 WebUI]
  • 2023-08: Three papers are accepted to ICCV 2023 (Oral), UIST 2023, and ACM MM 2023.
  • 2023-06: We are organizing 🍿 AI ShortFest, the inaugural AI film festival, jointly with ICCV 2023 in Paris, France.
  • 2022-07: Two papers on 👷 City-Super Research, CityNeRF and Shoot360, are accepted to ECCV 2022 and SIGGRAPH 2022, respectively.
  • 2021-05: Our CVPR 2020 work SceneSeg is set as the baseline for the ACM Multimedia 2021 Grand Challenge: Tencent Ads Algorithm Competition. Participate for a chance to win the USD 100,000 first prize.
  • 2020-07: MovieNet is online with an easy-to-use toolkit as a part of OpenMMLab.
  • 2023-06: We are organizing the Third Workshop on AI for Creative Video Editing and Understanding at ICCV 2023 in Paris.
  • 2023-04: Two papers are accepted to AAAI 2023 (Oral) and IJCAI 2023.
  • 2022-10: We are organizing the Second Workshop on AI for Creative Video Editing and Understanding at ECCV 2022.
  • 2022-03: Two papers are accepted to CVPR 2022 and IEEE Transactions on Multimedia.
  • 2021-09: We are organizing the First Workshop on AI for Creative Video Editing and Understanding during ICCV 2021.
  • 2021-07: Two papers are accepted to ICCV 2021 and IEEE Transactions on Multimedia.
  • 2020-07: Three papers are accepted to ECCV 2020.
  • 2020-02: One paper is accepted to CVPR 2020. Also appears at LUV 2020 (15-min talk) and Sight and Sound 2020 (5-min talk).
  • 2020-01: HotFlip is included in AllenNLP and TextAttack.

Selected Publications [Full List]

    Cinematic Behavior Transfer via NeRF-based Differentiable Filming
    Anyi Rao*, Xuekun Jiang*, Jingbo Wang, Dahua Lin, Bo Dai
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [Paper] [Webpage]

    AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
    Yuwei Guo, Ceyuan Yang, Anyi Rao, Zhengyang Liang, Yaohui Wang, Yu Qiao, Maneesh Agrawala, Dahua Lin, Bo Dai
    International Conference on Learning Representations (ICLR), 2024 (Spotlight)
    [Paper] [Webpage]

    ControlNet: Adding Conditional Control to Text-to-Image Diffusion Models
    Lvmin Zhang, Anyi Rao, Maneesh Agrawala
    IEEE/CVF International Conference on Computer Vision (ICCV), 2023 Best Paper Award (Marr Prize)
    [Paper] [Webpage] [Supplements] [V1] [V1.1] [A1111 WebUI]

    Automated Conversion of Music Videos into Lyric Videos
    Jiaju Ma, Anyi Rao, Li-Yi Wei, Rubaiat Habib Kazi, Hijung Valentina Shin, Maneesh Agrawala
    User Interface Software and Technology (UIST), 2023
    [Paper] [Webpage]

    Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production
    Anyi Rao*, Xuekun Jiang*, Yuwei Guo, Linning Xu, Lei Yang, Libiao Jin, Dahua Lin, Bo Dai
    ACM Special Interest Group on Computer Graphics and Interactive Techniques Conference (SIGGRAPH) Poster, 2023
    [Paper] [Webpage]

    Shoot360: Normal View Video Creation from City Panorama Footage
    Anyi Rao, Linning Xu, Dahua Lin
    ACM Special Interest Group on Computer Graphics and Interactive Techniques Conference (SIGGRAPH), 2022
    [Paper] [Webpage]

    Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows
    Anyi Rao, Xuekun Jiang, Sichen Wang, Yuwei Guo, Zihao Liu, Bo Dai, Long Pang, Xiaoyu Wu, Dahua Lin, Libiao Jin
    European Conference on Computer Vision Workshops (ECCVW), 2022
    [Paper] [Webpage]

    A Coarse-to-Fine Framework for Automatic Video Unscreen
    Anyi Rao, Linning Xu, Zhizhong Li, Qingqiu Huang, Zhanghui Kuang, Wayne Zhang, Dahua Lin
    IEEE Transactions on Multimedia (TMM), 2022
    [Paper] [Webpage]

    BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering
    Also known as CityNeRF: Building NeRF at City Scale
    Yuanbo Xiangli*, Linning Xu*, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, Dahua Lin
    European Conference on Computer Vision (ECCV), 2022
    [Paper] [Webpage]

    Jointly Learning the Attributes and Composition of Shots for Boundary Detection in Videos
    Xuekun Jiang, Libiao Jin, Anyi Rao+(corresponding), Linning Xu, Dahua Lin
    IEEE Transactions on Multimedia (TMM), 2021
    [Paper] [Webpage]

    A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
    Anyi Rao, Linning Xu, Yu Xiong, Guodong Xu, Qingqiu Huang, Bolei Zhou, Dahua Lin
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
    LUV 2020 and Sight and Sound 2020 workshops (Oral)
    [Paper] [Webpage]

    A Unified Framework for Shot Type Classification Based on Subject Centric Lens
    Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin
    European Conference on Computer Vision (ECCV), 2020
    Video Turing Test 2020 Workshop (Oral)
    [Paper] [Webpage]

    Online Multi-modal Person Search in Videos
    Jiayue Xia, Anyi Rao+(corresponding), Linning Xu, Qingqiu Huang, Dahua Lin
    European Conference on Computer Vision (ECCV), 2020
    [Paper] [Webpage]

    MovieNet: A Holistic Dataset for Movie Understanding
    Qingqiu Huang, Yu Xiong, Anyi Rao, Jiaze Wang, Dahua Lin
    European Conference on Computer Vision (ECCV), 2020 (Spotlight)
    [Paper] [Webpage]

    HotFlip: White-Box Adversarial Examples for Text Classification
    Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou
    Annual Meeting of the Association for Computational Linguistics (ACL), 2018
    Included in several open-source NLP research libraries: AllenNLP, TextAttack, and OpenAttack
    [Paper] [Webpage] [AllenNLP] [OpenAttack]

    Awards and Grants

  • Best Paper Award (Marr Prize) by International Conference on Computer Vision (ICCV), 2023
  • Magic Grant by The Brown Institute for Media Innovation, 2023
  • Amazon Prime Video Gift Funding, 2023
  • Pika and KAUST Grant for ECCV Workshop Organization, 2023
  • KAUST Grant for ECCV Workshop Organization, 2022
  • Adobe Grant for ICCV Workshop Organization, 2021
  • Hong Kong PhD Fellowship, 2021
  • Most Influential Paper by Paper Digest, 2021
  • National Scholarship awarded by the China Ministry of Education, the highest honor in China, 2015
  • SenseTime Scholarship, awarded to 30 students out of all AI major undergraduate students in China, 2017
  • Provincial Merit Student awarded by the Jiangsu Province, the highest honor in the province, 2017
  • Nanjing University Top-Grade Scholarship, the highest honor in the university, 2018
  • Gold Medal in Invitational National Mathematical Olympiad, 2013
  • Nanjing University Outstanding Student Leader Award, 2015
  • Nanjing University Outstanding Student Award, 2016
  • Nanjing University Top Volunteer Excellence Award, 2015
  • Zhenggang Scholarship, top 40 students in Nanjing University, 2016
  • Zhenggang Jingying Scholarship, 2017
  • Nanjing University People Scholarship, 2016
  • Nanjing University People Scholarship, 2017
  • World ranking 32nd in 2016 Calculus World Cup, 2016
  • Meritorious winner prize in the 2016 National Mathematical Contest in Modeling, 2016
  • Best paper in the 2014 University Electronics Design Contest, 2014

    Talks

  • Symposium on AI Technologies and Their Implications: Intelligent Tools to Support Human Creativity in Video Production, 2024
  • Adobe: Intelligent Tools to Support Creative Video Production, 2024
  • CCF: Collaborative Intelligent Tools to Support Video Production, 2024
  • Art School of UTK: Creative Video Understanding, Editing and Generation, 2024
  • Netflix: Controllable Visual Content Generation to Unleash Creativity and Productivity, 2023
  • Bay Area Vision Day: Human-centred Intelligent Video Creation and Editing, 2023
  • ICCV: Creative Video Editing and Understanding, 2023
  • Meta: Multimodal Representation Learning, 2022
  • ECCV: Temporal and Contextual Transformer for Multi-Camera Editing, 2022
  • Film School of HKBU: Intelligent Video Analysis and Creation for Film, 2021
  • CVPR: A Local-to-Global Approach to Multi-modal Movie Scene Segmentation, 2020
  • ECCV: A Unified Framework for Shot Type Classification Based on Subject Centric Lens, 2020

    Professional Activities

  • Key organizer: AI ShortFest film festival in Paris, France, 2023.
  • Leading/key organizer: Workshop on AI for Creative Video Editing and Understanding at CVPR 2024, ICCV 2023, ECCV 2022, and ICCV 2021.
  • Program Committee Member: over 30 times for CVPR, ICCV, ECCV, ACCV, SIGGRAPH, SIGGRAPH Asia, CHI, UIST, MM, NeurIPS, ICML, ICLR, AAAI, and IJCAI
  • Journal Reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Multimedia (TMM), IEEE Transactions on Visualization and Computer Graphics (TVCG), IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), ACM Transactions on Graphics (TOG), Transactions on Machine Learning Research (TMLR), International Journal of Computer Vision (IJCV)
  • Judge: The 3rd International Artificial Intelligence Fair

    Experiences

  • Research Intern at Meta Reality Lab
  • Research Intern at Shanghai Artificial Intelligence Laboratory
  • Research Intern at SenseTime Research
  • Visitor at the University of Toronto and Vector Institute
  • Research Assistant at the Advanced Integration and Mining Lab, Eugene, OR, USA
  • Research Intern at University of Hong Kong, Hong Kong S.A.R.

    Patents

    A Video Generation Method, CN202210699177.X
    A Video Editing Method and Related Program Products, CN202210691662.2
    A Video Editing Method, CN202010694551.1
    A Video Classification Method, CN202010694811.1
    An Image Processing Method and Related Products, CN202010450801.3
    A Zero-shot Action Recognition Method, CN202110821209.4
    A Layout Generation Method, CN202111128

    Teaching Experience

    Course Instructor, CCF Advanced Disciplines Lectures, Spring 2024
    Head TA, IERG 4160, Image and Video Processing (graduate level), Fall 2019
    TA, IERG 3180, Microcontrollers and Embedded Systems Laboratory, Spring 2020

    Miscellaneous

    Undergrad


    Undergrad Academic

    He ranked first by GPA in every semester of his undergraduate studies at Nanjing University, with an overall GPA of 3.96/4.00 and a rank of 1/183. He finished the major curriculum in two years and took a number of online courses. [Whole]

    Undergrad Research Beginning

    Robust Training with Word-level Adversity for NLP
    Sep. 2017 - Apr. 2018   Advanced Integration and Mining Lab (AIM), Eugene, OR, United States of America
    Advisor: Prof. Dejing Dou (Director, Head of Baidu Big Data Lab) and Prof. Daniel Lowd [The Register]

    Automatic Music Accompaniment Using Probabilistic Machine Learning
    Jul. 2017 - Aug. 2017   The University of Hong Kong, Hong Kong S.A.R.
    Advisor: Prof. Francis Lau (Associate Dean) [Arxiv]

    Real-time 3D Surface Reconstruction Using Lidar (Light Detection And Ranging)
    Aug. 2016 - Sep. 2017   Visual Sensing and Graphics Lab (VISG Lab), Nanjing University
    Supervisor: Prof. Sidan Du (Director) [Report] [Video]

    Undergrad Course Projects

    Computer Vision: 3D Human Pose Estimation from a Single Image [Presentation]
    Convex Optimization: Road Car Flow Prediction [Report]
    Probability and Stochastic Process: Monte Carlo for Multidimensional Integrals [Report]
    Microcomputers and Interface Techniques: x86 Assembly Language Programming [Report]
    Signal Processing: Single-Photon Detector Design [Report] [Presentation]

    Volunteer Experience


    Co-Founder of a Children Care Volunteer Program     Sep. 2015 - Dec. 2015
    Co-founded a psychological consulting program to promote left-behind children's growth and education. Volunteered to teach left-behind children Math and English in a junior high school located in the remote, underdeveloped Xiushui county. Recognized as a key team leader in the successful Warm One Hundred Campaign, which raised money for left-behind children. Our group received an excellence award from the China Foundation for Poverty Alleviation.

    Vice President of a Young Volunteers Association at Nanjing University     Jun. 2015 - Jun. 2016
    Organized and participated in over 100 out-of-school and 20 in-school activities involving over 1,000 volunteers. Our association received a volunteer association excellence award.

    Campus Ambassador of Huawei     Aug. 2017 - Dec. 2017

    Student Volunteer of International Conference on Computer Vision (ICCV)     Dec. 2019


    Hobbies
    Love: 🌊🥥🏖️✈️🎬
    Travel: 🇨🇳🇭🇰🇺🇸🇯🇵🇲🇴🇬🇧🇸🇬🇰🇷🇦🇪🇳🇱🇧🇪🇩🇪🇫🇷🇹🇭🇨🇦🇨🇭🇮🇹
    Undergrad: Department Young Volunteers Association Vice President, Bronze medal at University 55th Sports Competition, University Student Choir Member
    Languages: Native Mandarin, full proficiency in English, a bit of Japanese and Cantonese