Abstract
Compared with RGB salient object detection (SOD) methods, RGB-D SOD models perform better in many challenging scenarios by leveraging the spatial information embedded in depth maps. However, existing RGB-D SOD models tend to ignore modality-specific characteristics and fuse multi-modality features by simple element-wise addition or multiplication. As a result, they may produce noise-degraded saliency maps when given inaccurate or blurred depth images. In addition, many models adopt a U-shaped architecture to integrate multi-level features layer by layer. Although low-level features are gradually polished in this way, little attention has been paid to enhancing high-level features, which may lead to suboptimal results. In this paper, we propose a novel network named CFIDNet to tackle these problems. Specifically, we design a feature-enhanced module to excavate informative depth cues from depth images and to enhance the RGB features by exploiting the complementary information between the RGB and depth modalities. We also propose a feature refinement module that exploits multi-scale complementary information between multi-level features and polishes these features through residual connections. A cascaded feature interaction decoder (CFID) is then proposed to refine multi-level features iteratively. Equipped with these modules, CFIDNet is capable of segmenting salient objects accurately. Experimental results on 7 widely used benchmark datasets show that CFIDNet achieves highly competitive performance against 15 state-of-the-art models in terms of 8 evaluation metrics. Our source code will be publicly available at https://github.com/clelouch/CFIDNet.
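To make the fusion issue concrete, the following is a minimal sketch, not the CFIDNet feature-enhanced module itself. It contrasts the simple element-wise addition and multiplication mentioned above with a hypothetical gated residual fusion, in which a confidence gate scales the depth contribution before it is added to the RGB features. The function names, the scalar `gate_logit`, and the use of flat Python lists in place of feature maps are all assumptions made for illustration; in a real network the gate would be learned from the features.

```python
import math


def fuse_add(rgb, depth):
    """Element-wise addition: depth noise is passed through at full strength."""
    return [r + d for r, d in zip(rgb, depth)]


def fuse_mul(rgb, depth):
    """Element-wise multiplication: unreliable depth rescales every RGB response."""
    return [r * d for r, d in zip(rgb, depth)]


def sigmoid(x):
    """Logistic function mapping a real-valued logit to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_gated(rgb, depth, gate_logit):
    """Hypothetical gated residual fusion (an illustration only).

    The RGB features are kept through a residual path, and the depth
    features are scaled by a gate in (0, 1) before being added. A strongly
    negative gate_logit suppresses unreliable depth almost entirely.
    """
    gate = sigmoid(gate_logit)
    return [r + gate * d for r, d in zip(rgb, depth)]
```

With a strongly negative `gate_logit`, `fuse_gated` returns values close to the RGB input, whereas `fuse_add` always lets the depth values through unchanged. This is one simple way to see why fixed element-wise fusion is sensitive to inaccurate depth.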
Acknowledgements
This work was supported by the National Natural Science Foundation of China (under Grant 51807003).
Ethics declarations
Conflict of interest
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.
About this article
Cite this article
Chen, T., Hu, X., Xiao, J. et al. CFIDNet: cascaded feature interaction decoder for RGB-D salient object detection. Neural Comput & Applic 34, 7547–7563 (2022). https://doi.org/10.1007/s00521-021-06845-3
DOI: https://doi.org/10.1007/s00521-021-06845-3