QSpec: Speculative decoding with complementary quantization schemes
Jan 1, 2025
Juntao Zhao (equal contribution), Wenhao Lu (equal contribution), Sheng Wang (equal contribution), Lingpeng Kong, Chuan Wu

Abstract
QSpec proposes speculative decoding with complementary quantization schemes to achieve lossless inference acceleration in low-resource scenarios.
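The draft-then-verify idea behind speculative decoding can be illustrated with a minimal sketch. This is not QSpec's implementation: the abstract does not say how the quantization schemes are assigned, so `toy_draft` and `toy_target` below are hypothetical stand-ins for a cheaper, less accurate pass and a more accurate pass. The sketch assumes greedy decoding, and it only shows why the output is lossless: every emitted token is the one the accurate model would have produced.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain autoregressive baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(draft: NextToken, target: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Greedy draft-then-verify loop; the result always matches the target's output."""
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # Draft phase: the cheap model proposes up to k tokens.
        n = min(k, limit - len(out))
        proposal: List[int] = []
        for _ in range(n):
            proposal.append(draft(out + proposal))
        # Verify phase: keep proposed tokens while they match the target.
        # A real system would score all n positions in one batched pass;
        # this sketch calls the target once per position for clarity.
        for tok in proposal:
            expected = target(out)
            out.append(expected)      # accepted token, or the target's correction
            if expected != tok:
                break                 # discard the rest of the proposal
    return out


# Hypothetical toy models (assumptions for illustration only).
def toy_target(prefix: List[int]) -> int:
    return (sum(prefix) * 31 + 7) % 50


def toy_draft(prefix: List[int]) -> int:
    # Agrees with the target except when the prefix length is a multiple of 3.
    tok = toy_target(prefix)
    return tok if len(prefix) % 3 else (tok + 1) % 50
```

Calling `speculative_decode(toy_draft, toy_target, [1, 2, 3], 10)` returns the same sequence as `greedy_decode(toy_target, [1, 2, 3], 10)`. Any speedup in practice comes from the draft pass being cheaper and the verification being batched, neither of which this toy measures.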
Publication
In The 2025 Conference on Empirical Methods in Natural Language Processing

Authors
Sheng Wang
(Forence)
PhD Graduate in Computer Science
Sheng Wang is a PhD graduate from The University of Hong Kong, supervised by Prof. Chuan Wu and Prof. Lingpeng Kong.
His research focuses on agents, LLM super-alignment, and data synthesis. He has published 14+ papers in top-tier
conferences, including NeurIPS 2025 (Spotlight), ICLR 2025, ACL 2024/2025, and EMNLP 2025.