(Mark) Haoyue Dai    

I'm currently a junior majoring in CS in the IEEE Honor Class at Shanghai Jiao Tong University (SJTU), graduating in June 2021. I'm also a member of the Zhiyuan Honor Program of Engineering, and I was an exchange student at the University of Washington.

My research interests center around deep learning, computer vision, natural language processing, and AI interpretability. Currently I'm in Prof. Pengtao Xie's group at the University of California San Diego (UCSD), working on an automatic generator of paper illustration figures. Before that, I worked on explainable AI under the supervision of Prof. Quanshi Zhang at the John Hopcroft Center, where I explored new interpretation-analysis methods to transform black-box networks into semantically explainable, unsupervised hierarchical models. I also have broader interests in natural language processing, image processing, optimization, and deep reinforcement learning.

I am looking for exciting research internships for summer 2020, where I can apply my machine learning insights and passion for the greater good.

Email  /  CV  /  GitHub

Selected Projects
What do CNN neurons learn: Visualization & Clustering
Haoyue Dai
supervised by Prof. John Hopcroft, Fall 2019
paper / slides / code

In this paper, we address the problem of interpreting a CNN from the aspects of the input image's focus and preference, and the neurons' domination, activation, and contribution to a concrete final prediction. Specifically, we use two techniques, visualization and clustering, to tackle these problems. Visualization refers to gradient descent on the input image's pixels, and in the clustering section two algorithms are proposed to cluster over image categories and network neurons, respectively. Experiments and quantitative analyses demonstrate the effectiveness of the two methods in answering the question: what do neurons learn?
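The pixel-space gradient idea can be sketched as follows. This is a minimal NumPy illustration, not the paper's actual CNN or code: it uses a hypothetical toy linear "neuron" (activation = ⟨weight, image⟩) with an L2 penalty, and iteratively updates the image pixels to maximize the activation.

```python
import numpy as np

def visualize_neuron(weight, steps=200, lr=0.1):
    """Gradient ascent in pixel space: find an input that maximizes
    a neuron's activation. Here the neuron is a toy linear unit,
    objective = <weight, img> - 0.5 * ||img||^2, so the penalized
    optimum is img == weight."""
    rng = np.random.default_rng(0)
    img = rng.normal(0.0, 0.01, size=weight.shape)  # small random start
    for _ in range(steps):
        grad = weight - img  # gradient of the penalized objective
        img += lr * grad     # ascend in pixel space
    return img

w = np.array([1.0, -2.0, 0.5])
img = visualize_neuron(w)
```

With a real CNN one would instead backpropagate the chosen neuron's activation to the input tensor and take the same ascent steps; the fixed point here simply makes the toy version easy to check.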

Interpretation of Speech Recognition ConvNets
Die Zhang, Haoyue Dai, Xinzhe Cao, Da Huo
supervised by Prof. Quanshi Zhang, Spring 2019
project page / code

By analyzing mainstream speech recognition algorithms, we aim to build a model that quantitatively characterizes the "importance" of different parts of a voice signal. Further, based on the interpretable model, we would like to streamline the convolutional structure used in speech recognition and adjust the intermediate layers while maintaining accuracy. I proposed a novel method to disperse the voice spectrum along the frequency domain, designed the backpropagation of the dispersion and implemented an accelerated parallel algorithm, reconstructed the letter-based gated ConvNet framework wav2letter, and developed a series of well-packaged utilities such as IFFT speech regeneration, noise coverage, mask separation, and parameter visualization.

Realtime Traffic Cone Detection
Haoyue Dai
applied on SJTU Racing Team's driverless car, Spring 2019
dataset / code

This project focuses on real-time traffic cone detection and has been deployed on the SJTU Racing Team's driverless car, which performed well at Formula Student China (FSC) 2019. I first annotated a traffic cone dataset, then fine-tuned a YOLOv3 network, reaching 98% mAP, 98% recall, 99% precision, and 60 fps on a 1080Ti. Acceleration under the resource constraints of a TX2, along with integration with SLAM, was designed for practical racing.

Poem Inspire: An Image-Poem Coupled Search & Generation Engine
Haoyue Dai, Zhongye Wang, Jingyu Li, Haoping Chen
jointly supervised by Prof. Ya Zhang and Prof. Dazhi He, Fall 2018
report / demo / code

This is an integrated poem engine with features including search, exhibition, imagination, recommendation, and image-poem generation. I built a lexical semantic prediction and expansion model that clusters synonyms between classical and modern Chinese; independently developed a recurrent neural network model that generates classical Chinese quatrains from images, with expansion from keywords; and fine-tuned a deep coupled visual-poetic embedding model via multi-adversarial training to generate modern Chinese poems from images. Image features are first extracted by convolutional networks and then used as poetic clues to generate poems against two discriminative networks.


thanks jon!