QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection

Zhang, Yifan; Dong, Zhen; Yang, Huanrui; Lu, Ming; Tseng, Cheng-Ching; Du, Yuan; Keutzer, Kurt; Du, Li; Zhang, Shanghang

[Submitted on 21 Aug 2023]

Abstract: Multi-view 3D detection based on BEV (bird's-eye-view) has recently achieved significant improvements. However, the huge memory consumption of state-of-the-art models makes them hard to deploy on vehicles, and their non-trivial latency affects the real-time perception of streaming applications. Although quantization is widely used to lighten models, we show in our paper that directly applying quantization to BEV tasks will 1) make training unstable, and 2) lead to intolerable performance degradation. To solve these issues, our method QD-BEV introduces a novel view-guided distillation (VGD) objective, which stabilizes quantization-aware training (QAT) while enhancing model performance by leveraging both image features and BEV features. Our experiments show that QD-BEV achieves similar or even better accuracy than previous methods with significant efficiency gains. On the nuScenes dataset, the 4-bit weight and 6-bit activation quantized QD-BEV-Tiny model achieves 37.2% NDS with a model size of only 15.8 MB, outperforming BEVFormer-Tiny by 1.8% with 8x model compression. The Small and Base variants of QD-BEV also perform strongly, achieving 47.9% NDS (28.2 MB) and 50.9% NDS (32.9 MB), respectively.
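
To make the "4-bit weight and 6-bit activation" setting concrete, the following is a minimal sketch of symmetric uniform fake quantization, a common building block of quantization-aware training. It is an illustration only: the function name `fake_quantize`, the per-tensor max-abs scale, and the sample values are assumptions and are not taken from the QD-BEV paper, whose actual quantizer may differ.

```python
def fake_quantize(values, num_bits):
    """Round floats to a signed num_bits integer grid, then map back to floats.

    Hypothetical helper for illustration; uses a single per-tensor scale
    derived from the largest absolute value (an assumption, not the paper's).
    """
    qmax = 2 ** (num_bits - 1) - 1           # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)                   # nothing to scale
    scale = max_abs / qmax                    # float step between adjacent levels
    quantized = []
    for v in values:
        q = round(v / scale)                  # nearest integer level
        q = max(-qmax, min(qmax, q))          # clamp to the representable range
        quantized.append(q * scale)           # dequantize back to float
    return quantized


# Made-up sample tensors, quantized at the bit-widths named in the abstract.
weights = [0.72, -0.31, 0.05, 0.0, -0.66]
activations = [1.9, 0.4, -0.2, 3.1]
w4 = fake_quantize(weights, num_bits=4)       # at most 15 distinct levels
a6 = fake_quantize(activations, num_bits=6)   # at most 63 distinct levels
print(w4)
print(a6)
```

Each value moves by at most half a quantization step, so lower bit-widths trade accuracy for size. Storing weights in 4 bits instead of 32-bit floats takes one eighth of the space, which is consistent with the 8x compression reported above, although the paper's exact size accounting is not given in this abstract.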

Comments:	ICCV 2023 Accept
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2308.10515 [cs.CV]
	(or arXiv:2308.10515v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.10515

Submission history

From: Yifan Zhang
[v1] Mon, 21 Aug 2023 07:06:49 UTC (23,348 KB)
