The Effect of Dual Hyperparameter Optimization on Software Vulnerability Prediction Models

Deepali Bassi; Hardeep Singh

doi:10.37190/e-Inf230102

EISEJ

ISSN (electronic): 2084-4840
ISSN (print): 1897-7979
ISBN: 978-83-7493-305-6
DOI: 10.37190/e-inf
DOI (before 2020): 10.5277/e-informatica

Our journal works under a Creative Commons Attribution 4.0 International License.


The journal is published under the auspices of the Software Engineering Section of the Committee on Informatics of the Polish Academy of Sciences and Wrocław University of Science and Technology.

The Journal was co-financed under the program “Development of scientific journals” from the funds of the Minister of Education and Science, contract no RCN/SP/0219/2021/1. Now, the Journal is financially supported by the Department of Artificial Intelligence, Wroclaw University of Science and Technology.

2023
[1] Deepali Bassi and Hardeep Singh, "The Effect of Dual Hyperparameter Optimization on Software Vulnerability Prediction Models", In e-Informatica Software Engineering Journal, vol. 17, no. 1, pp. 230102, 2023. DOI: 10.37190/e-Inf230102.

Authors

Deepali Bassi, Hardeep Singh

Abstract

Background: Prediction of software vulnerabilities is a major concern in the field of software security. Many researchers have worked to construct software vulnerability prediction (SVP) models. Machine learning aids in building effective SVP models, and data balancing (resampling) techniques combined with optimal hyperparameters can improve their performance. Previous studies have shown the impact of hyperparameter optimization (HPO) on machine learning algorithms and on data balancing techniques.

Aim: The current study aims to analyze the impact of dual hyperparameter optimization on metrics-based SVP models.

Method: This paper proposes a methodology that uses the Python framework Optuna to optimize the hyperparameters of both the machine learners and the data balancing techniques. In the experiments, we compared six combinations of five machine learners and five resampling techniques, using default parameters and optimized hyperparameters.

Results: The Wilcoxon signed-rank test with the Bonferroni correction was applied, and it showed that dual HPO performs better than HPO on learners alone and HPO on data balancers alone. Furthermore, the paper assessed the impact of data complexity measures and concludes that HPO does not improve performance on datasets that exhibit high overlap.
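A paired comparison of this kind can be sketched with SciPy as shown below. The scores are hypothetical placeholders and not results from the paper, and the number of comparisons (two) and the significance level of 0.05 are assumptions made for the illustration.

```python
from scipy.stats import wilcoxon

# Hypothetical per-dataset scores (placeholders, NOT results from the paper).
dual_hpo = [0.81, 0.78, 0.85, 0.74, 0.90, 0.69, 0.83, 0.77]
comparisons = {
    "HPO on learners": [0.80, 0.76, 0.82, 0.70, 0.85, 0.63, 0.76, 0.69],
    "HPO on data balancers": [0.795, 0.755, 0.815, 0.695, 0.845, 0.625, 0.755, 0.685],
}

# Bonferroni correction: divide the significance level by the number of tests.
alpha = 0.05  # assumed significance level
adjusted_alpha = alpha / len(comparisons)

results = {}
for name, baseline in comparisons.items():
    # Paired, two-sided Wilcoxon signed-rank test on the score differences.
    statistic, p_value = wilcoxon(dual_hpo, baseline)
    results[name] = (p_value, p_value < adjusted_alpha)

for name, (p_value, significant) in results.items():
    print(f"dual HPO vs {name}: p={p_value:.4f}, significant={significant}")
```

A difference is reported as significant only when its p-value falls below the corrected threshold, which guards against false positives when several pairwise tests are run on the same data.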

Conclusion: The experimental analysis shows that dual HPO is 64% effective in enhancing the productivity of SVP models.

Keywords

software vulnerability, hyperparameter optimization, machine learning algorithm, data balancing techniques, data complexity measures

