Welcome
I am currently a Ph.D. student in Computer Engineering at the Information Sciences Institute (ISI) and the ECE department of the University of Southern California (USC), advised by Prof. Stephen Crago. My dissertation research focuses on Efficient and Trustworthy Distributed EdgeAI Systems in the Era of Large Language Models (LLMs).
Before joining USC, I completed my Master's degree in Electronic Engineering at Nanjing University under Prof. Zhongfeng Wang (IEEE Fellow), and my Bachelor's degree in Physics from the Kuang Yaming Honors School at Nanjing University.
My research interests span Efficient Deep Learning Algorithms, Machine Learning Systems, and Distributed Edge Computing. I am passionate about creating practical and efficient deep learning models and systems suitable for real-world applications. I have published papers in top AI conferences, accumulating 90 citations (Google Scholar Profile).
🔥 News
- 2025.04: 🎉🎉 Our paper FedPaI has been accepted by ICIC 2025 (Oral).
- 2024.12: 🎉🎉 Presented MoQ at the NeurIPS 2024 MLNCP workshop.
- 2024.05: 🎉🎉 Presented Efficient and Trustworthy Distributed EdgeAI System at the MLSys Young Professionals Symposium.
- 2023.06: 🎉🎉 Our papers QuantPipe and BEBERT were published at ICASSP 2023.
📝 Publications

FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
Haonan Wang, Z Liu, K Hoshino, T Zhang, JP Walters, SP Crago
Introduced pruning at initialization for federated learning, significantly reducing computation and communication overhead.
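
For context, here is a minimal sketch of the general pruning-at-initialization idea that FedPaI builds on: a SNIP-style saliency score computed on a toy single-client model. The model, data, and 90% sparsity target are illustrative assumptions, not the FedPaI algorithm itself.

```python
# Minimal sketch of pruning at initialization (SNIP-style saliency), single client.
# The toy model, random data, and 90% sparsity target are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))

# One forward/backward pass at initialization to score each weight.
loss = F.cross_entropy(model(x), y)
loss.backward()

# Saliency = |weight * gradient|; keep only the top 10% of weights globally.
weights = [p for p in model.parameters() if p.dim() > 1]
scores = torch.cat([(p * p.grad).abs().flatten() for p in weights])
k = int(0.10 * scores.numel())
threshold = torch.topk(scores, k).values.min()

# Build binary masks once; they stay fixed for the rest of training, so only
# the surviving weights need to be trained and communicated by each client.
masks = [((p * p.grad).abs() >= threshold).float() for p in weights]
with torch.no_grad():
    for p, m in zip(weights, masks):
        p.mul_(m)

kept = sum(int(m.sum()) for m in masks)
print(f"kept {kept / scores.numel():.1%} of weights (target sparsity 90%)")
```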

MoQ: Mixture-of-format Activation Quantization for Communication-efficient AI Inference System
Haonan Wang, Z Liu, C Fang, JP Walters, SP Crago
Proposed a mixed-format quantization method for AI inference, enhancing communication efficiency for edge/cloud deployments.
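
A rough sketch of the underlying idea, choosing a quantization format per activation tensor before sending it over the network. The INT8 vs. per-block-scaled format pair and the dynamic-range heuristic below are illustrative assumptions, not the MoQ design.

```python
# Illustrative sketch: pick a per-tensor activation format (plain INT8 vs. a
# simple per-block-scaled format) based on dynamic range before transmission.
# The 32x range heuristic and block size of 32 are assumptions for illustration.
import torch

def quantize_int8(x: torch.Tensor):
    scale = x.abs().max() / 127.0 + 1e-12
    q = torch.clamp((x / scale).round(), -128, 127).to(torch.int8)
    return q, scale  # ~8 bits/element plus a single fp32 scale

def quantize_block(x: torch.Tensor, block: int = 32):
    xb = x.flatten().reshape(-1, block)
    scale = xb.abs().amax(dim=1, keepdim=True) / 127.0 + 1e-12  # one scale per block
    q = torch.clamp((xb / scale).round(), -128, 127).to(torch.int8)
    return q, scale  # handles outlier-heavy tensors better

def choose_format(x: torch.Tensor) -> str:
    # Heuristic: wide dynamic range -> per-block scaling, otherwise plain INT8.
    ratio = x.abs().max() / (x.abs().mean() + 1e-8)
    return "block" if ratio > 32 else "int8"

acts = torch.randn(4, 1024)          # pretend these are layer activations
fmt = choose_format(acts)
q, s = quantize_block(acts) if fmt == "block" else quantize_int8(acts)
print(fmt, q.shape, q.element_size() * q.numel(), "bytes payload")
```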

QuantPipe: Adaptive Post-Training Quantization for Distributed Transformer Pipelines in Dynamic Edge Environments
Haonan Wang, C Imes, S Kundu, PA Beerel, SP Crago, JP Walters
Developed adaptive post-training quantization for transformer models in dynamic distributed edge environments.
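
A minimal sketch of the adaptive angle, assuming the bit-width of inter-stage activations is picked from the currently measured link bandwidth. The thresholds and the plain uniform quantizer are illustrative, not QuantPipe's calibration scheme.

```python
# Sketch: adapt the bit-width of activations sent between pipeline stages to the
# measured link bandwidth. Thresholds and the uniform quantizer are illustrative.
import torch

def uniform_quantize(x: torch.Tensor, bits: int):
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max() / qmax + 1e-12
    return torch.clamp((x / scale).round(), -qmax - 1, qmax), scale

def pick_bits(bandwidth_mbps: float) -> int:
    # Lower bandwidth -> more aggressive quantization of inter-stage tensors.
    if bandwidth_mbps > 100:
        return 16
    if bandwidth_mbps > 20:
        return 8
    return 4

hidden = torch.randn(8, 128, 768)          # activations leaving pipeline stage 1
for bw in (500.0, 50.0, 5.0):              # simulated bandwidth measurements
    bits = pick_bits(bw)
    q, scale = uniform_quantize(hidden, bits)
    dequant = q * scale                     # stage 2 reconstructs before computing
    err = (dequant - hidden).abs().mean().item()
    print(f"{bw:6.1f} Mbps -> {bits:2d}-bit, mean abs error {err:.4f}")
```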

BEBERT: Efficient and Robust Binary Ensemble BERT
J Tian, C Fang, Haonan Wang, Z Wang
Created an efficient and robust binary ensemble BERT, significantly reducing computational overhead.
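
A toy sketch of the two ingredients, XNOR-Net-style weight binarization and logit averaging over an ensemble, with a small classifier standing in for BERT. This is purely illustrative and not the BEBERT training recipe.

```python
# Sketch: 1-bit (sign) weight binarization with a per-layer scaling factor, plus
# averaging logits over ensemble members. The tiny classifier stands in for BERT.
import torch
import torch.nn as nn

class BinaryLinear(nn.Linear):
    def forward(self, x):
        # Binarize weights to {-1, +1}, scaled by their mean magnitude.
        alpha = self.weight.abs().mean()
        w_bin = torch.sign(self.weight) * alpha
        return nn.functional.linear(x, w_bin, self.bias)

ensemble = [nn.Sequential(BinaryLinear(768, 256), nn.ReLU(), BinaryLinear(256, 2))
            for _ in range(3)]

x = torch.randn(4, 768)                     # pretend these are [CLS] embeddings
with torch.no_grad():
    logits = torch.stack([m(x) for m in ensemble]).mean(dim=0)  # average members
print(logits.argmax(dim=-1))
```
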
- Accelerating 3D CNN using 3D Fast Fourier Transform, C Fang, L He, Haonan Wang, J Wei, Z Wang, ISCAS 2021
- Temporal Residual Feature Learning for 3D CNN Action Recognition, Haonan Wang, Y Mei, J Lin, Z Wang, SiPS 2020
- Design Light-weight 3D CNN for Video Recognition, Haonan Wang, J Lin, Z Wang, arXiv 2019
- A Low-latency Sparse-Winograd Accelerator for CNNs, Haonan Wang, W Liu, T Xu, J Lin, Z Wang, ICASSP 2019
- Efficient Reconfigurable Hardware Core for CNNs, Haonan Wang, J Lin, Y Xie, B Yuan, Z Wang, Asilomar 2018
💻 Internships
- 2024.06 - 2024.09, Research Scientist Intern, Microsoft Azure Hardware Systems & Infrastructure, USA
  - Explored scaling laws of quantized LLMs
  - Designed and trained BitNet models from 14M to 1B parameters
  - Implemented quantized LLMs using micro-scaling quantization
- 2019.01 - 2020.05, Machine Learning Engineer Intern, Windorise Tech. Co., China
  - Developed FPGA-based sparse CNN accelerators
  - Designed efficient 3D CNN algorithms for action recognition tasks
🎖 Honors and Awards
- 2024 MLSys YPS Poster Session Presenter
- 2024 KESTON and ISI Exploratory Research Grants ($100k)
- 2022 Best Poster Award, USC MHI ECE Research Festival
- 2020 USC Ph.D. Fellowship
- 2019 Best Poster Award & Travel Grant, Singapore AI Summer Workshop
- 2018 AI Scholarship, Nanjing University
📖 Education
- 2020.09 - now, Ph.D. in Computer Engineering, Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, USA.
- 2017.09 - 2020.06, M.Sc. in Electronic Engineering, School of Electronic Science and Engineering, Nanjing University, China.
- 2013.09 - 2017.06, B.Sc. in Physics, Kuang Yaming Honors School, Nanjing University, China.
💬 Invited Talks
- 2024.05, MLSys Young Professionals Symposium, Efficient and Trustworthy Distributed EdgeAI System.