Welcome
I am currently a Ph.D. student in Computer Engineering at the Information Sciences Institute (ISI) and the ECE department of the University of Southern California (USC), advised by Prof. Stephen Crago. My dissertation research focuses on Efficient and Trustworthy Distributed EdgeAI Systems in the Era of Large Language Models (LLMs).
Before joining USC, I completed my Master's degree in Electronic Engineering at Nanjing University under Prof. Zhongfeng Wang (IEEE Fellow), and my Bachelor's degree in Physics from the Kuang Yaming Honors School at Nanjing University.
My research interests span Efficient Deep Learning Algorithms, Machine Learning Systems, and Distributed Edge Computing. I am passionate about creating practical and efficient deep learning models and systems suitable for real-world applications. I have published papers in top AI conferences, accumulating 90 citations (Google Scholar Profile).
🔥 News
- 2025.04: 🎉🎉 Our paper FedPaI has been accepted by ICIC 2025 (Oral).
- 2024.12: 🎉🎉 Presented MoQ at the NeurIPS 2024 MLNCP workshop.
- 2024.05: 🎉🎉 Presented Efficient and Trustworthy Distributed EdgeAI System at the MLSys Young Professionals Symposium.
- 2023.06: 🎉🎉 Our papers QuantPipe and BEBERT were published at ICASSP 2023.
📝 Publications

FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
Haonan Wang, Z Liu, K Hoshino, T Zhang, JP Walters, SP Crago
Introduced pruning at initialization for federated learning, significantly reducing computation and communication overhead.
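
For context, here is a minimal sketch of the general pruning-at-initialization idea that FedPaI builds on: a SNIP-style saliency score computed on a toy single-client model. The model, data, and 90% sparsity target are illustrative assumptions, not the FedPaI algorithm itself.

```python
# Minimal sketch of pruning at initialization (SNIP-style saliency), single client.
# The toy model, random data, and 90% sparsity target are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))

# One forward/backward pass at initialization to score each weight.
loss = F.cross_entropy(model(x), y)
loss.backward()

# Saliency = |weight * gradient|; keep only the top 10% of weights globally.
weights = [p for p in model.parameters() if p.dim() > 1]
scores = torch.cat([(p * p.grad).abs().flatten() for p in weights])
k = int(0.10 * scores.numel())
threshold = torch.topk(scores, k).values.min()

# Build binary masks once; they stay fixed for the rest of training, so only
# the surviving weights need to be trained and communicated by each client.
masks = [((p * p.grad).abs() >= threshold).float() for p in weights]
with torch.no_grad():
    for p, m in zip(weights, masks):
        p.mul_(m)

kept = sum(int(m.sum()) for m in masks)
print(f"kept {kept / scores.numel():.1%} of weights (target sparsity 90%)")
```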

MoQ: Mixture-of-format Activation Quantization for Communication-efficient AI Inference System
Haonan Wang, Z Liu, C Fang, JP Walters, SP Crago
Proposed a mixed-format quantization method for AI inference, enhancing communication efficiency for edge/cloud deployments.
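
A rough sketch of the underlying idea, choosing a quantization format per activation tensor before sending it over the network. The INT8 vs. per-block-scaled format pair and the dynamic-range heuristic below are illustrative assumptions, not the MoQ design.

```python
# Illustrative sketch: pick a per-tensor activation format (plain INT8 vs. a
# simple per-block-scaled format) based on dynamic range before transmission.
# The 32x range heuristic and block size of 32 are assumptions for illustration.
import torch

def quantize_int8(x: torch.Tensor):
    scale = x.abs().max() / 127.0 + 1e-12
    q = torch.clamp((x / scale).round(), -128, 127).to(torch.int8)
    return q, scale  # ~8 bits/element plus a single fp32 scale

def quantize_block(x: torch.Tensor, block: int = 32):
    xb = x.flatten().reshape(-1, block)
    scale = xb.abs().amax(dim=1, keepdim=True) / 127.0 + 1e-12  # one scale per block
    q = torch.clamp((xb / scale).round(), -128, 127).to(torch.int8)
    return q, scale  # handles outlier-heavy tensors better

def choose_format(x: torch.Tensor) -> str:
    # Heuristic: wide dynamic range -> per-block scaling, otherwise plain INT8.
    ratio = x.abs().max() / (x.abs().mean() + 1e-8)
    return "block" if ratio > 32 else "int8"

acts = torch.randn(4, 1024)          # pretend these are layer activations
fmt = choose_format(acts)
q, s = quantize_block(acts) if fmt == "block" else quantize_int8(acts)
print(fmt, q.shape, q.element_size() * q.numel(), "bytes payload")
```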

QuantPipe: Adaptive Post-Training Quantization for Distributed Transformer Pipelines in Dynamic Edge Environments
Haonan Wang, C Imes, S Kundu, PA Beerel, SP Crago, JP Walters
Developed adaptive post-training quantization for transformer models in dynamic distributed edge environments.
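
A minimal sketch of the adaptive angle, assuming the bit-width of inter-stage activations is picked from the currently measured link bandwidth. The thresholds and the plain uniform quantizer are illustrative, not QuantPipe's calibration scheme.

```python
# Sketch: adapt the bit-width of activations sent between pipeline stages to the
# measured link bandwidth. Thresholds and the uniform quantizer are illustrative.
import torch

def uniform_quantize(x: torch.Tensor, bits: int):
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max() / qmax + 1e-12
    return torch.clamp((x / scale).round(), -qmax - 1, qmax), scale

def pick_bits(bandwidth_mbps: float) -> int:
    # Lower bandwidth -> more aggressive quantization of inter-stage tensors.
    if bandwidth_mbps > 100:
        return 16
    if bandwidth_mbps > 20:
        return 8
    return 4

hidden = torch.randn(8, 128, 768)          # activations leaving pipeline stage 1
for bw in (500.0, 50.0, 5.0):              # simulated bandwidth measurements
    bits = pick_bits(bw)
    q, scale = uniform_quantize(hidden, bits)
    dequant = q * scale                     # stage 2 reconstructs before computing
    err = (dequant - hidden).abs().mean().item()
    print(f"{bw:6.1f} Mbps -> {bits:2d}-bit, mean abs error {err:.4f}")
```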

BEBERT: Efficient and Robust Binary Ensemble BERT
J Tian, C Fang, Haonan Wang, Z Wang
Created an efficient and robust binary ensemble BERT, significantly reducing computational overhead.
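
A toy sketch of the two ingredients, XNOR-Net-style weight binarization and logit averaging over an ensemble, with a small classifier standing in for BERT. This is purely illustrative and not the BEBERT training recipe.

```python
# Sketch: 1-bit (sign) weight binarization with a per-layer scaling factor, plus
# averaging logits over ensemble members. The tiny classifier stands in for BERT.
import torch
import torch.nn as nn

class BinaryLinear(nn.Linear):
    def forward(self, x):
        # Binarize weights to {-1, +1}, scaled by their mean magnitude.
        alpha = self.weight.abs().mean()
        w_bin = torch.sign(self.weight) * alpha
        return nn.functional.linear(x, w_bin, self.bias)

ensemble = [nn.Sequential(BinaryLinear(768, 256), nn.ReLU(), BinaryLinear(256, 2))
            for _ in range(3)]

x = torch.randn(4, 768)                     # pretend these are [CLS] embeddings
with torch.no_grad():
    logits = torch.stack([m(x) for m in ensemble]).mean(dim=0)  # average members
print(logits.argmax(dim=-1))
```
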
- Accelerating 3D CNN using 3D Fast Fourier Transform, C Fang, L He, Haonan Wang, J Wei, Z Wang, ISCAS 2021
- Temporal Residual Feature Learning for 3D CNN Action Recognition, Haonan Wang, Y Mei, J Lin, Z Wang, SiPS 2020
- Design Light-weight 3D CNN for Video Recognition, Haonan Wang, J Lin, Z Wang, arXiv 2019
- A Low-latency Sparse-Winograd Accelerator for CNNs, Haonan Wang, W Liu, T Xu, J Lin, Z Wang, ICASSP 2019
- Efficient Reconfigurable Hardware Core for CNNs, Haonan Wang, J Lin, Y Xie, B Yuan, Z Wang, Asilomar 2018
💻 Internships
- 2024.06 - 2024.09, Research Scientist Intern, Microsoft Azure Hardware Systems & Infrastructure, USA
  - Explored scaling laws of quantized LLMs
  - Designed and trained BitNet models from 14M to 1B parameters
  - Implemented quantized LLMs using micro-scaling quantization
- 2019.01 - 2020.05, Machine Learning Engineer Intern, Windorise Tech. Co., China
  - Developed FPGA-based sparse CNN accelerators
  - Designed efficient 3D CNN algorithms for action recognition tasks
🎖 Honors and Awards
- 2024 MLSys YPS Poster Session Presenter
- 2024 KESTON and ISI Exploratory Research Grants ($100k)
- 2022 Best Poster Award, USC MHI ECE Research Festival
- 2020 USC Ph.D. Fellowship
- 2019 Best Poster Award & Travel Grant, Singapore AI Summer Workshop
- 2018 AI Scholarship, Nanjing University
📖 Education
- 2020.09 - now, Ph.D. in Computer Engineering, Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, USA.
- 2017.09 - 2020.06, M.Sc. in Electronic Engineering, School of Electronic Science and Engineering, Nanjing University, China.
- 2013.09 - 2017.06, B.Sc. in Physics, Kuang Yaming Honors School, Nanjing University, China.
💬 Invited Talks
- 2024.05, MLSys Young Professionals Symposium, Efficient and Trustworthy Distributed EdgeAI System.