What is Holding Back Convnets for Detection?

Pepik, Bojan; Benenson, Rodrigo; Ritschel, Tobias; Schiele, Bernt

arXiv:1508.02844 (cs)

[Submitted on 12 Aug 2015 (v1), last revised 18 Aug 2015 (this version, v2)]

Abstract: Convolutional neural networks have recently shown excellent results in general object detection and many other tasks. Although very effective, they involve many user-defined design choices. In this paper we aim to better understand these choices by inspecting two key aspects: "what did the network learn?" and "what can the network learn?". We exploit new annotations (Pascal3D+) to enable a new empirical analysis of the R-CNN detector. Contrary to common belief, our results indicate that existing state-of-the-art convnet architectures are not invariant to various appearance factors. In fact, all considered networks have similar weak points, which cannot be mitigated by simply increasing the training data (architectural changes are needed). We show that overall performance can improve when using image renderings for data augmentation. We report the best known results on the Pascal3D+ detection and viewpoint estimation tasks.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1508.02844 [cs.CV]
	(or arXiv:1508.02844v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1508.02844

Submission history

From: Bojan Pepik
[v1] Wed, 12 Aug 2015 08:22:04 UTC (3,552 KB)
[v2] Tue, 18 Aug 2015 13:54:22 UTC (3,536 KB)
