
Semantic context based coincidental correct test cases detection for fault localization

Automated Software Engineering

Abstract

Fault localization (FL) aims to identify the potentially faulty statements responsible for program failures by analyzing runtime information, so the input code coverage matrix plays a crucial role. However, the effectiveness of FL is compromised by the presence of coincidental correct test cases (CCTC) in the coverage matrix: test cases that execute faulty code but do not result in program failures. Many existing methods identify CCTC through cluster analysis, but they face three problems. First, determining the optimal number of clusters is difficult. Second, detection effectiveness depends heavily on the initial centroid selection. Third, the many fault-irrelevant statements in the raw coverage matrix introduce substantial noise into CCTC detection. To overcome these challenges, we propose SCD4FL, a semantic context-based CCTC detection method that enhances the coverage matrix for fault localization. SCD4FL implements two key ideas: (1) it uses the intersection of execution slices to construct a semantic context from the raw coverage matrix, reducing noise during CCTC detection; (2) it employs an expert-knowledge-based K-nearest neighbors (KNN) algorithm to detect CCTC, eliminating the need to determine the cluster number and initial centroids. To evaluate SCD4FL, we conducted extensive experiments on 420 faulty versions of nine benchmarks using six state-of-the-art fault localization methods and two representative CCTC detection methods. The experimental results validate the effectiveness of our method in enhancing the performance of the six fault localization methods and the two CCTC detection methods; for example, the RNN method is improved by 53.09% under the MFR metric.
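The two ideas in the abstract can be illustrated with a minimal sketch. It is not the SCD4FL implementation: the abstract does not specify how execution slices are computed or what expert knowledge the KNN step uses, so the sketch assumes that each failing test's coverage row stands in for its execution slice, that Hamming distance measures similarity, and that a passing test is flagged when most of its k nearest neighbors are failing tests. The function names and the toy matrix are hypothetical.

```python
import numpy as np

def semantic_context(coverage, failed):
    """Indices of statements executed by every failing test.

    Assumption: a failing test's coverage row stands in for its
    execution slice, so the context is the intersection of those rows.
    """
    return np.flatnonzero(coverage[failed].all(axis=0))

def detect_cctc(coverage, failed, k=3):
    """Flag passing tests whose k nearest neighbors are mostly failing.

    Distances are Hamming distances over the context columns only.
    The majority-vote rule is an illustrative stand-in for the
    expert-knowledge-based KNN described in the abstract.
    """
    ctx = semantic_context(coverage, failed)
    reduced = coverage[:, ctx]
    flagged = []
    for i in np.flatnonzero(~failed):
        dist = (reduced != reduced[i]).sum(axis=1).astype(float)
        dist[i] = np.inf  # a test is not its own neighbor
        neighbors = np.argsort(dist, kind="stable")[:k]
        if failed[neighbors].sum() * 2 > k:
            flagged.append(int(i))
    return ctx, flagged

# Toy coverage matrix: rows are tests, columns are statements.
coverage = np.array([
    [1, 1, 1, 0, 1, 0],  # t0, fails
    [1, 0, 1, 1, 1, 0],  # t1, fails
    [1, 1, 1, 0, 1, 1],  # t2, passes but follows the failing path
    [1, 1, 0, 0, 0, 1],  # t3, passes
    [1, 0, 0, 1, 0, 0],  # t4, passes
    [0, 1, 0, 1, 0, 1],  # t5, passes
])
failed = np.array([True, True, False, False, False, False])

context, suspected = detect_cctc(coverage, failed, k=3)
print(context.tolist())  # statements kept in the context
print(suspected)         # passing tests suspected to be CCTC
```

Restricting the distance computation to the context columns is what removes the noise from fault-irrelevant statements in this sketch, and a neighbor vote needs only k, with no cluster count or initial centroids to choose.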


Notes

  1. https://github.com/HuJGithub/SCD4FL

  2. https://github.com/rjust/defects4j

  3. https://repairbenchmarks.cs.umass.edu/

  4. https://sir.csc.ncsu.edu/portal/index.php

  5. https://gzoltar.com/


Author information

Corresponding author

Correspondence to Jian Hu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is partially supported by the National Natural Science Foundation of China (No. 61902421).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hu, J. Semantic context based coincidental correct test cases detection for fault localization. Autom Softw Eng 31, 68 (2024). https://doi.org/10.1007/s10515-024-00466-5

  • DOI: https://doi.org/10.1007/s10515-024-00466-5
