e-Informatica Software Engineering Journal Software Change Prediction: A Systematic Review and Future Guidelines

Software Change Prediction: A Systematic Review and Future Guidelines

[1] Ruchika Malhotra and Megha Khanna, "Software Change Prediction: A Systematic Review and Future Guidelines", in e-Informatica Software Engineering Journal, vol. 13, no. 1, pp. 227–259, 2019. DOI: 10.5277/e-Inf190107.



Ruchika Malhotra, Megha Khanna


Background: Several studies have emphasized the importance of Software Change Prediction (SCP). Numerous prediction models in the literature claim to effectively predict change-prone classes in software products. These models help software managers optimize resource usage and develop good-quality, easily maintainable products.

Aim: There is an urgent need to compare and assess these SCP models in order to evaluate their effectiveness. The advancements and pitfalls in the SCP domain also need to be assessed to guide researchers and practitioners.

Method: To fulfill the above-stated aims, we conduct an extensive literature review of 38 primary SCP studies published from January 2000 to June 2019.

Results: The review analyzes the different sets of predictors, experimental settings, data analysis techniques, statistical tests, and threats involved in the studies that develop SCP models.

Conclusion: The review also provides future guidelines to researchers in the SCP domain, including exploring methods for dealing with imbalanced training data and evaluating search-based algorithms and ensembles of algorithms for SCP.


Keywords: change-proneness, machine learning, software quality, systematic review


