UTB Publications
UTB Publication Activity Repository

Evaluating subset selection methods for use case points estimation

DSpace/Manakin Repository



dc.title Evaluating subset selection methods for use case points estimation en
dc.contributor.author Šilhavý, Radek
dc.contributor.author Šilhavý, Petr
dc.contributor.author Prokopová, Zdenka
dc.relation.ispartof Information and Software Technology
dc.identifier.issn 0950-5849
dc.date.issued 2018
utb.relation.volume 97
dc.citation.spage 1
dc.citation.epage 9
dc.type article
dc.language.iso en
dc.publisher Elsevier Science BV
dc.identifier.doi 10.1016/j.infsof.2017.12.009
dc.relation.uri https://www.sciencedirect.com/science/article/pii/S0950584917305153
dc.subject Software Development Effort Estimation en
dc.subject Software size estimation en
dc.subject Clustering techniques en
dc.subject Spectral Clustering en
dc.subject K-means en
dc.subject Moving Window en
dc.subject Use Case Points en
dc.description.abstract When the Use Case Points method is used for software effort estimation, users face low model accuracy, which limits its practical application. This study investigates how subset selection methods affect the prediction accuracy of Multiple Linear Regression models obtained by the stepwise approach. K-means, Spectral Clustering, the Gaussian Mixture Model and Moving Window are evaluated as subset selection techniques. The methods were compared on several evaluation criteria and the differences were then statistically tested. The evaluation was performed on two independent datasets, which differ in project type and size. Both were split by the hold-out method. When clustering was used, the training sets were clustered into 3 classes and an independent regression model was created for each class. These models were later used to predict the testing sets. When Moving Window was used, window sizes of 5, 10 and 15 were tested. The results show that clustering techniques decrease prediction errors significantly when compared to Use Case Points or the moving window method. Spectral Clustering was selected as the best-performing solution, because it achieves a Sum of Squared Errors reduction of 32% for the first dataset and 98% for the second dataset. For the second dataset, the Mean Absolute Percentage Error is less than 1% for Spectral Clustering, 9% for moving window and 27% for Use Case Points. For the first dataset, prediction errors are significantly higher: 53% for Spectral Clustering, compared with 165% for Use Case Points. The study concludes that subset selection techniques significantly improve the prediction ability of linear regression models used for software development effort prediction, and that the clustering methods perform better than the moving window method. en
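The procedure summarised in the abstract (hold-out split, clustering the training set into 3 classes, one regression model per class, then scoring by Sum of Squared Errors and Mean Absolute Percentage Error) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single size predictor instead of stepwise multiple regression, plain k-means instead of Spectral Clustering, a global regression line as the comparison baseline instead of the Use Case Points estimate, and synthetic noise-free data. All function names and values are hypothetical.

```python
from statistics import mean

def kmeans_1d(values, k=3, iters=50):
    """Plain Lloyd's k-means on scalars; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids over the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b

def sse(actual, predicted):
    """Sum of Squared Errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))

def make_projects():
    """Synthetic (size, effort) pairs from three hypothetical project groups."""
    projects = [(float(x), 5.0 * x + 20.0) for x in range(10, 30)]
    projects += [(float(x), 20.0 * x - 300.0) for x in range(100, 120)]
    projects += [(float(x), 8.0 * x + 1000.0) for x in range(300, 320)]
    return projects

# Hold-out split: every fourth project is kept for testing.
projects = make_projects()
test = projects[::4]
train = [p for i, p in enumerate(projects) if i % 4 != 0]

# Cluster the training set into 3 classes and fit one model per class.
train_x = [x for x, _ in train]
centroids, labels = kmeans_1d(train_x, k=3)
models = []
for c in range(3):
    xs = [x for (x, _), l in zip(train, labels) if l == c]
    ys = [y for (_, y), l in zip(train, labels) if l == c]
    models.append(fit_line(xs, ys))

def predict_clustered(x):
    """Route a test project to the nearest centroid and use that class's model."""
    c = min(range(3), key=lambda i: abs(x - centroids[i]))
    a, b = models[c]
    return a + b * x

# Illustrative baseline: one regression line over the whole training set.
ga, gb = fit_line(train_x, [y for _, y in train])

actual = [y for _, y in test]
clustered_pred = [predict_clustered(x) for x, _ in test]
global_pred = [ga + gb * x for x, _ in test]
clustered_sse = sse(actual, clustered_pred)
global_sse = sse(actual, global_pred)

print("SSE  clustered:", clustered_sse, "global:", global_sse)
print("MAPE clustered:", mape(actual, clustered_pred), "global:", mape(actual, global_pred))
```

Because the synthetic groups follow different linear relationships, the per-class models fit them far better than the single global line. The percentages reported in the abstract come from the study's real datasets and are not reproduced by this sketch.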
utb.faculty Faculty of Applied Informatics
dc.identifier.uri http://hdl.handle.net/10563/1007858
utb.identifier.obdid 43878637
utb.identifier.scopus 2-s2.0-85039969351
utb.identifier.wok 000428008600001
utb.source j-wok
dc.date.accessioned 2018-04-23T15:01:48Z
dc.date.available 2018-04-23T15:01:48Z
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.access openAccess
utb.contributor.internalauthor Šilhavý, Radek
utb.contributor.internalauthor Šilhavý, Petr
utb.contributor.internalauthor Prokopová, Zdenka
utb.fulltext.affiliation Radek Silhavy *, Petr Silhavy, Zdenka Prokopova Tomas Bata University in Zlin, Faculty of Applied Informatics, Nad Stranemi 4511, Zlin 76001, Czech Republic * Corresponding author. E-mail addresses: radek@silhavy.cz, rsilhavy@fai.utb.cz (R. Silhavy).
utb.fulltext.dates Received 30 June 2017 Received in revised form 18 December 2017 Accepted 21 December 2017 Available online 28 December 2017
utb.fulltext.sponsorship -
utb.wos.affiliation [Silhavy, Radek; Silhavy, Petr; Prokopova, Zdenka] Tomas Bata Univ Zlin, Fac Appl Infomat, Nad Stranemi 4511, Zlin 76001, Czech Republic
utb.scopus.affiliation Tomas Bata University in Zlin, Faculty of Applied Informatics, Nad Stranemi 4511, Zlin, Czech Republic
utb.fulltext.projects -

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International