e-Informatica Software Engineering Journal ECLogger: Cross-Project Catch-Block Logging Prediction Using Ensemble of Classifiers


Sangeeta Lal and Neetu Sardana and Ashish Sureka


Background: Software developers insert log statements in the source code to record program execution information. However, optimizing the number of log statements in the source code is challenging. Machine learning based within-project logging prediction tools, proposed in previous studies, may not be suitable for new or small software projects. For such software projects, we can use cross-project logging prediction. Aim: The aim of the study presented here is to investigate cross-project logging prediction methods and techniques. Method: The proposed method is ECLogger, a novel, ensemble-based, cross-project, catch-block logging prediction model. Nine base classifiers were used and combined using ensemble techniques. The performance of ECLogger was evaluated on three open-source Java projects: Tomcat, CloudStack and Hadoop. Results: ECLogger Bagging, ECLogger AverageVote, and ECLogger MajorityVote show a considerable improvement in the average Logged F-measure (LF) on 3, 5, and 4 source -> target project pairs, respectively, compared to the baseline classifiers. ECLogger AverageVote performs best and shows improvements of 3.12% (average LF) and 6.08% (average ACC, Accuracy). Conclusion: Classifiers based on ensemble techniques such as bagging, average vote, and majority vote outperform the baseline classifier. Overall, the ECLogger AverageVote model performs best. The results show that the CloudStack project is more generalizable than the other projects.
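To make the two voting schemes named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each base classifier outputs a probability that a catch block is logged, that average vote thresholds the mean probability at 0.5, that majority vote counts per-classifier decisions at the same threshold, and that the Logged F-measure is the F1 score of the "logged" class. The function names, the threshold, and the toy probabilities are all hypothetical.

```python
from typing import List, Sequence


def average_vote(probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """probs[c][i] is classifier c's estimated probability that catch block i is logged.

    Predict "logged" (1) when the mean probability across classifiers reaches the threshold.
    """
    n_classifiers = len(probs)
    labels = []
    for i in range(len(probs[0])):
        mean_p = sum(p[i] for p in probs) / n_classifiers
        labels.append(1 if mean_p >= threshold else 0)
    return labels


def majority_vote(probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Predict "logged" (1) when more than half of the classifiers individually say so."""
    n_classifiers = len(probs)
    labels = []
    for i in range(len(probs[0])):
        votes = sum(1 for p in probs if p[i] >= threshold)
        labels.append(1 if 2 * votes > n_classifiers else 0)
    return labels


def logged_f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score of the positive ("logged") class; assumed definition of LF."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical output of three base classifiers, trained on a source project,
# for four catch blocks of a target project.
probs = [
    [0.90, 0.40, 0.20, 0.60],
    [0.80, 0.45, 0.30, 0.40],
    [0.30, 0.90, 0.10, 0.40],
]
y_true = [1, 1, 0, 0]  # hypothetical ground truth: 1 = logged, 0 = not logged

avg_pred = average_vote(probs)
maj_pred = majority_vote(probs)
print(avg_pred, logged_f_measure(y_true, avg_pred))
print(maj_pred, logged_f_measure(y_true, maj_pred))
```

In this toy input the second catch block shows how the two schemes can disagree: one very confident classifier pulls the mean probability above the threshold, while a simple vote count ignores that confidence.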


Sangeeta Lal, Neetu Sardana, Ashish Sureka, "ECLogger: Cross-Project Catch-Block Logging Prediction Using Ensemble of Classifiers", In e-Informatica Software Engineering Journal, vol. 11, iss. 1, pp. 9-40, 2017.
