Ray Anyi Rao

Ph.D. Candidate
Multimedia Laboratory
The Chinese University of Hong Kong
Hong Kong, China
Email: anyirao [at] link.cuhk.edu.hk

Google Scholar | GitHub | LinkedIn

Short Bio

Ray Anyi Rao is a Ph.D. candidate at MMLab at The Chinese University of Hong Kong, advised by Dahua Lin and Bolei Zhou. He received his B.S. from the EE Department of Nanjing University in 2018, ranking 1/183. He studies human-centered AI for multimodality and creativity, focusing on intelligent video editing and creation and on video semantic and cinematic analysis, with the aim of building connections between AI and humans for collaborative intelligence.

News

  • 2022-07: One paper is accepted to ECCV 2022.
  • 2022-05: One paper is accepted to SIGGRAPH 2022.
  • 2022-04: We are organizing The Second Workshop on AI for Creative Video Editing and Understanding at ECCV 2022. Researchers, artists, and entrepreneurs from academia (Stanford, Berkeley), industry (Adobe, Netflix, Meta), and beyond will share and spark ideas together. Please follow our Twitter for more information!
  • 2022-03: Two papers are accepted to CVPR 2022 and IEEE Transactions on Multimedia.
  • 2021-09: We are organizing The First Workshop on AI for Creative Video Editing and Understanding during ICCV 2021.
  • 2021-07: Two papers are accepted to ICCV 2021 and IEEE Transactions on Multimedia.
  • 2021-05: Our CVPR 2020 work SceneSeg is set as the baseline for the 2021 Tencent Advertising Algorithm Competition and ACM Multimedia 2021 Grand Challenge Track 1 Video Ads Content Structuring. Join the competition for a chance to win the first prize of USD 100,000.
  • 2020-07: MovieNet is online with an easy-to-use toolkit as a part of OpenMMLab.
  • 2020-07: Three papers are accepted to ECCV 2020.
  • 2020-02: One paper is accepted to CVPR 2020. Also appears at LUV 2020 (15-min talk) and Sight and Sound 2020 (5-min talk).
  • 2020-01: HotFlip is included in AllenNLP and TextAttack.

    Publication

    Shoot360: Normal View Video Creation from City Panorama Footage
    Anyi Rao, Linning Xu, Dahua Lin
    ACM Special Interest Group on Computer Graphics and Interactive Techniques Conference (SIGGRAPH), 2022
    [Paper] [Webpage]

    BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering
    Also known as CityNeRF: Building NeRF at City Scale
    Yuanbo Xiangli*, Linning Xu*, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, Dahua Lin
    European Conference on Computer Vision (ECCV), 2022
    [Paper] [Webpage]

    A Coarse-to-Fine Framework for Automatic Video Unscreen
    Anyi Rao, Linning Xu, Zhizhong Li, Qingqiu Huang, Zhanghui Kuang, Wayne Zhang, Dahua Lin
    IEEE Transactions on Multimedia (TMM), 2022
    [Paper] [Webpage]

    BlockPlanner: City Block Generation with Vectorized Graph Representation
    Linning Xu*, Yuanbo Xiangli*, Anyi Rao, Nanxuan Zhao, Bo Dai, Ziwei Liu, Dahua Lin
    IEEE/CVF International Conference on Computer Vision (ICCV), 2021
    [Paper] [Webpage]

    Jointly Learning the Attributes and Composition of Shots for Boundary Detection in Videos
    Xuekun Jiang, Libiao Jin, Anyi Rao+(corresponding), Linning Xu, Dahua Lin
    IEEE Transactions on Multimedia (TMM), 2021
    [Paper] [Webpage]

    A Unified Framework for Shot Type Classification Based on Subject Centric Lens
    Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin
    European Conference on Computer Vision (ECCV), 2020
    Also appears at Video Turing Test 2020 (5-min talk)
    [Paper] [Webpage]

    MovieNet: A Holistic Dataset for Movie Understanding
    Qingqiu Huang, Yu Xiong, Anyi Rao, Jiaze Wang, Dahua Lin
    European Conference on Computer Vision (ECCV), 2020 (Spotlight)
    [Paper] [Webpage]

    A Local-to-Global Approach to Multi-modal Movie Scene Segmentation
    Anyi Rao, Linning Xu, Yu Xiong, Guodong Xu, Qingqiu Huang, Bolei Zhou, Dahua Lin
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
    Also appears at LUV 2020 (15-min talk) and Sight and Sound 2020 (5-min talk)
    [Paper] [Webpage]

    Online Multi-modal Person Search in Videos
    Jiayue Xia, Anyi Rao+(corresponding), Linning Xu, Qingqiu Huang, Dahua Lin
    European Conference on Computer Vision (ECCV), 2020
    [Paper] [Webpage]

    HotFlip: White-Box Adversarial Examples for Text Classification
    Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou
    Annual Meeting of the Association for Computational Linguistics (ACL), 2018
    Included in several open-source NLP research libraries: AllenNLP, TextAttack, and OpenAttack
    [Paper] [Webpage] [AllenNLP] [OpenAttack]

    Automatic Music Accompaniment
    Anyi Rao, Francis Lau
    arXiv, 2018
    [Paper] [Webpage]

    Experience

  • Research Intern at Shanghai Artificial Intelligence Laboratory
  • Research Intern at SenseTime Research
  • Visitor at the University of Toronto and Vector Institute
  • Research Assistant at the Advanced Integration and Mining Lab, Eugene, OR, USA
  • Research Intern at University of Hong Kong, Hong Kong S.A.R.

    Awards

  • Hong Kong PhD Fellowship (2018)
  • Most Influential Paper by Paper Digest (2021)
  • National Scholarship awarded by the China Ministry of Education, the highest honor in China (2015)
  • Provincial Merit Student awarded by the Jiangsu Government, the highest honor in the province (2017)
  • Nanjing University Top-Grade Scholarship, the highest honor in the university (2018)
  • SenseTime Scholarship, awarded to 30 students out of all AI major undergraduate students in China (2017)
  • Gold Medal in Invitational National Mathematical Olympiad (2013)
  • Nanjing University Outstanding Student Leader Award (2015)
  • Nanjing University Outstanding Student Award (2016)
  • Nanjing University Top Volunteer Excellence Award (2015)
  • Zhenggang Scholarship, top 40 students in Nanjing University (2016)
  • Zhenggang Jingying Scholarship (2017)
  • Nanjing University People Scholarship (2016)
  • Nanjing University People Scholarship (2017)
  • World ranking 32nd in 2016 Calculus World Cup (2016)
  • Meritorious winner prize in the 2016 National Mathematical Contest in Modeling (2016)
  • Best paper in the 2014 University Electronics Design Contest (2014)

    Early Stage Research and Academic

    Early Stage Research

    Multi-modal Video Analysis and Understanding
    Aug. 2018 - Aug. 2020   MMLab, Hong Kong S.A.R.
    Advisors: Prof. Dahua Lin (Director) and Prof. Bolei Zhou (Innovators Under 35)

  • Cinematic style analysis in videos, with one paper published at ECCV 2020.
  • Multi-modal story and plot understanding in movies, with one paper published at ECCV 2020.
  • Scene understanding in movies, with one paper published at ECCV 2020 and one at CVPR 2020.

    Robust Training with Word-level Adversity for NLP
    Sep. 2017 - Apr. 2018   Advanced Integration and Mining Lab (AIM), Eugene, OR, United States of America
    Advisors: Prof. Dejing Dou (Director, Head of Baidu Big Data Lab) and Prof. Daniel Lowd (Director of Graduate Studies)

  • Proposed an efficient, gradient-based approach for generating word-level adversarial examples, used to train more robust models.
  • Evaluated the method on a wide range of sentence-level classification tasks; with adversarial training, it achieved excellent performance on benchmarks.
  • This work on adversarial examples for NLP was featured in an article in The Register.

    Undergrad Academic

    He ranked first in GPA every semester of his undergraduate studies at Nanjing University, with an overall GPA of 3.96/4.00 and a rank of 1/183. He completed the major curriculum in two years and took a number of online courses. [Whole]

    Undergrad Research Beginning

    Automatic Music Accompaniment Using Probabilistic Machine Learning
    Jul. 2017 - Aug. 2017   The University of Hong Kong, Hong Kong S.A.R.
    Advisor: Prof. Francis Lau (Associate Dean)

  • Proposed a fast decoding algorithm that handles performance errors and reduces computational complexity from O(n^2) to O(n). It runs in real time on scores of practical length.
  • Constructed a comprehensive system and developed a parallel Hidden Markov Model for score following.
  • Developed a new free open-source Windows-based automatic music follower and accompanist.

    Real-time 3D Surface Reconstruction Using Lidar (Light Detection And Ranging)
    Aug. 2016 - Sep. 2017   Visual Sensing and Graphics Lab (VISG Lab), Nanjing University
    Supervisor: Prof. Sidan Du (Director)

  • Proposed a novel line-of-sight algorithm for real-time surface reconstruction, achieving state-of-the-art results.
  • Employed a new surface lattice data structure in the implicit surface update for memory efficiency.
  • Presented a real-time 3D reconstruction pipeline for large-scale Lidar point clouds.
  • Implemented parallel computation to speed up implicit surface updates, and motion estimation and mapping to register point clouds. [Report] [Video]

    Undergrad Course Projects

  • Computer Vision: 3D Human Pose Estimation from a Single Image [Presentation] Reduced ambiguities in 3D pose estimation using sparse coding, and applied human-portion constraints to formulate a minimization problem.
  • Convex Optimization: Road Car Flow Prediction [Report] Predicted car motion under road-condition constraints using a Markov decision process and an estimated OD matrix, and improved performance across different traffic loads and vehicle types.
  • Probability and Stochastic Process: Monte Carlo for Multidimensional Integrals [Report]
  • Machine Learning: Spam Filtering System Construction
  • Microcomputers and Interface Techniques: x86 Assembly Language Programming [Report]
  • Signal Processing: Single-Photon Detector Design [Report] [Presentation] Designed a 64-channel low-noise pre-amplifier using a symmetric structure with 100 times less noise, and drew its 8-layer circuit board.

    Volunteer and Leadership Experience

    Co-Founder of a Children Care Volunteer Program     Sep. 2015 - Dec. 2015
    Co-founded a psychological consulting program to promote left-behind children's growth and education. Volunteered to teach left-behind children Math and English in a junior high school located in the remote, underdeveloped Xiushui county. Recognized as a key team leader in the successful Warm One Hundred Campaign, which raised money for left-behind children. Our group received an excellence award from the China Foundation for Poverty Alleviation.

    Vice President of a Young Volunteers Association at Nanjing University     Jun. 2015 - Jun. 2016
    Organized and participated in over 100 out-of-school and 20 in-school activities involving over 1,000 volunteers. Our association received a volunteer association excellence award.

    Research Intern Group Leader of JCET [Media Report]     Jul. 2015 - Aug. 2015

    Campus Ambassador of Huawei     Aug. 2017 - Dec. 2017

    Student Volunteer of International Conference on Computer Vision (ICCV)     Dec. 2019


    Miscellaneous

    Professional Activities
    Co-organizer: Workshop on AI for Creative Video Editing and Understanding at ECCV 2022, ICCV 2021
    Program Committee Member: CVPR, ICCV, NeurIPS, ICML, ICLR, AAAI, ECCV
    Journal Reviewer: IEEE Transactions on Multimedia, IEEE Transactions on Circuits and Systems for Video Technology
    Judge: The 3rd International Artificial Intelligence Fair

    Teaching Experience
    Head TA, IERG 4160, Image and Video Processing (graduate level), Fall 2019
    TA, IERG 3180, Microcontrollers and Embedded Systems Laboratory, Spring 2020

    Patents
    A Video Generation Method, CN202210699177.X
    A Video Editing Method and Related Program Products, CN202210691662.2
    A Video Editing Method, CN202010694551.1
    A Video Classification Method, CN202010694811.1
    An Image Processing Method and Related Products, CN202010450801.3
    A Zero-shot Action Recognition Method, CN202110821209.4
    A Layout Generation Method, CN202111128490.X

    Hobbies
    Love: 🌊🥥🏄‍♂️✈️🎬
    Travel: 🇨🇳🇭🇰🇺🇸🇯🇵🇲🇴🇬🇧🇸🇬🇰🇷🇦🇪🇳🇱🇧🇪🇩🇪🇫🇷
    Undergrad: Nanjing University Student Choir Member (Joyful Snowflakes written by Chih-mo Hsu), The NJU EE Department Young Volunteers Association Vice President, Bronze medal at Nanjing University 55th Sports Meeting
    Language: A bit of Japanese and Cantonese, Native Mandarin, Full proficiency in English