Mining Top-K high utility itemset using bio-inspired algorithms

Pham, Ngoc Nam; Komínková Oplatková, Zuzana; Huynh, Minh Huy; Vo, Bay

dc.title	Mining Top-K high utility itemset using bio-inspired algorithms	en
dc.contributor.author	Pham, Ngoc Nam
dc.contributor.author	Komínková Oplatková, Zuzana
dc.contributor.author	Huynh, Minh Huy
dc.contributor.author	Vo, Bay
dc.relation.ispartof	2022 IEEE Workshop on Complexity in Engineering, COMPENG 2022
dc.identifier.issn	2688-2582
dc.identifier.issn	2688-2566
dc.identifier.isbn	978-1-7281-7124-1
dc.date.issued	2022
dc.event.title	2022 IEEE Workshop on Complexity in Engineering, COMPENG 2022
dc.event.location	Florence
utb.event.state-en	Italy
utb.event.state-cs	Itálie
dc.event.sdate	2022-07-18
dc.event.edate	2022-07-20
dc.type	conferenceObject
dc.language.iso	en
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.identifier.doi	10.1109/COMPENG50184.2022.9905433
dc.relation.uri	https://ieeexplore.ieee.org/document/9905433
dc.relation.uri	https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9905433
dc.subject	bio-inspired algorithm	en
dc.subject	top-k high utility itemset mining	en
dc.subject	binary particle swarm optimization	en
dc.description.abstract	High utility itemset (HUI) mining is an important research problem in knowledge discovery and data mining. Many algorithms for Top-K HUI mining have been proposed. However, the principal issue with these algorithms is that they must keep potential top-k patterns in memory at all times, and they depend on automatically raising the minimum utility threshold as HUIs are found. Consequently, the performance of existing exact algorithms for Top-K HUI mining tends to degrade as the database size and the number of distinct items in the database grow. To address this issue, we propose a Binary Particle Swarm Optimization (BPSO) based algorithm for mining Top-K HUIs effectively, namely TKO-BPSO (Top-K high utility itemset mining in One phase based on Binary Particle Swarm Optimization). The main idea of TKO-BPSO is not only to use a one-phase model and the Raising the threshold by the Utility of Candidates (RUC) strategy to effectively raise the border thresholds for pruning the search space, but also to adopt the sigmoid function in the particle update process. This might significantly reduce the combinatorial problem of traditional HUIM as the database size and the number of distinct items grow. Consequently, it outperforms existing exact algorithms for mining Top-K HUIs because it efficiently overcomes the problem of the vast number of candidates. Extensive experiments on several publicly available real and synthetic datasets show that the proposed algorithm achieves better runtime than existing state-of-the-art algorithms and can significantly reduce the combinatorial problem and memory usage.	en
utb.faculty	Faculty of Applied Informatics
dc.identifier.uri	http://hdl.handle.net/10563/1011256
utb.identifier.obdid	43884093
utb.identifier.scopus	2-s2.0-85141086200
utb.identifier.wok	001427034900006
utb.source	d-scopus
dc.date.accessioned	2023-01-06T08:03:59Z
dc.date.available	2023-01-06T08:03:59Z
dc.description.sponsorship	IGA/CebiaTech/022/001; Technology Agency of the Czech Republic, TACR: FW01010381
utb.contributor.internalauthor	Pham, Ngoc Nam
utb.contributor.internalauthor	Komínková Oplatková, Zuzana
utb.contributor.internalauthor	Huynh, Minh Huy
utb.fulltext.affiliation	Nam Ngoc Pham Faculty of Applied Informatics Tomas Bata University Zlín, Czech Republic npham@utb.cz Huy Minh Huynh Faculty of Applied Informatics Tomas Bata University Zlín, Czech Republic huynh@utb.cz Zuzana Komínková Oplatková Faculty of Applied Informatics Tomas Bata University Zlín, Czech Republic oplatkova@utb.cz Bay Vo* HUTECH University Ho Chi Minh City, Vietnam vd.bay@hutech.edu.vn *Corresponding author
utb.fulltext.dates	-
utb.fulltext.references	[1] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules,” in Proceedings of the 20th ACM International Conference on Very Large Data Bases, vol. 1215. Citeseer, 1994, pp. 487–499. [2] W. Gan et al., “A survey of utility-oriented pattern mining,” IEEE Transactions on Knowledge and Data Engineering, vol. 33, no. 4, pp. 1306–1327, 2021. [3] M. Liu and J. Qu, “Mining high utility itemsets without candidate generation,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 55–64, 2012. [4] H. Ryang, U. Yun, and K. Ryu, “Discovering high utility itemsets with multiple minimum supports,” Intell. Data Anal., vol. 18, no. 6, pp. 1027–1047, 2014. [5] M. Liu and J. Qu, “Mining high utility itemsets without candidate generation,” in Proc. ACM Int. Conf. Inf. Knowl. Manag., 2012, pp. 55–64. [6] Y. Liu, W. Liao, and A. N. Choudhary. 2005. A two-phase algorithm for fast discovery of high utility itemsets. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining. 689–695. [7] S. Krishnamoorthy, “Pruning strategies for mining high utility itemsets,” Expert Systems with Applications, vol. 42, no. 5, pp. 2371–2381, 2015. [8] V. S. Tseng et al., “An efficient algorithm for high utility itemset mining,” in Proceedings of the International Conference on Knowledge Discovery and Data Mining. 253–262. [9] P. Fournier-Viger, C. Wu, S. Zida, and V. S. Tseng. 2014. FHM: Faster high utility itemset mining using estimated utility co-occurrence pruning. In Proceedings of the International Symposium on Foundations of Intelligent Systems. 83–92. [10] S. Zida et al., “A highly efficient algorithm for high utility itemset mining,” in Proceedings of the Mexican International Conference on Artificial Intelligence. 530–546, 2015. [11] S. Ventura and J. M. Luna. 2016. Pattern Mining with Evolutionary Algorithms. Springer. [12] Kannimuthu, S., Premalatha, K., 2014. Discovery of high utility itemsets using genetic algorithm with ranked mutation. Appl. Artif. Intell. 28 (4), 337–359. [13] Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948. [14] Kennedy, J., Eberhart, R., 1997. A discrete binary version of particle swarm algorithm. In: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, pp. 4104–4108. [15] J. C.-W. Lin et al., “Mining high-utility itemsets based on particle swarm optimization,” Eng. Appl. Artif. Intell., vol. 55, pp. 320–330, Oct. 2016. [16] V. S. Tseng et al., “Efficient algorithms for mining high utility itemsets from transactional databases,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 8, pp. 1772–1786, Aug. 2013. [17] W. Song and C. Huang, “Mining High Utility Itemsets Using Bio-Inspired Algorithms: A Diverse Optimal Value Framework,” IEEE Access, vol. 6, pp. 19568–19582, 2018. [18] N. N. Pham, “Mining high average utility pattern using bio-inspired algorithm: student research abstract,” in Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, April 2022, pp. 445–449. [19] Wei Song, Chaomin Huang. Mining High Average-Utility Itemsets Based on Particle Swarm Optimization, UK, 2018. [20] P. Fournier-Viger et al., “The SPMF open-source data mining library version 2,” in Proceedings of the 19th European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 36–40, 2016. [21] V. S. Tseng et al., “Efficient Algorithms for Mining TopK High Utility Itemsets,” in IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 1, pp. 54–67, 1 Jan. 2016. [22] Huynh, H.M. et al., “Sequential Pattern Mining Using IDLists,” in Proceedings of the 12th International Conference, ICCCI 2020, Da Nang, Vietnam, Nov 30 – Dec 3, 2020. [23] Huynh, H.M., Nguyen, L.T.T., Pham, N.N. et al., “An efficient method for mining sequential patterns with indices,” Knowledge-Based Systems 239:107946, December 2021. [24] Q. H. Duong, B. Liao, P. Fournier Viger, and T. L. Dam, “An efficient algorithm for mining the top-k high utility itemsets using novel threshold raising and pruning strategies,” Knowledge-Based Systems, vol. 104, pp. 106–122, 2016.
utb.fulltext.sponsorship	This work was supported by the Technology Agency of the Czech Republic under project no. FW01010381, by the Internal Grant Agency of Tomas Bata University under project no. IGA/CebiaTech/022/001, and by the resources of A.I.Lab at the Faculty of Applied Informatics, Tomas Bata University in Zlín.
utb.wos.affiliation	[Nam Ngoc Pham; Oplatkova, Zuzana Kominkova; Huy Minh Huynh] Tomas Bata Univ, Fac Appl Informat, Zlin, Czech Republic; [Bay Vo] HUTECH Univ, Ho Chi Minh City, Vietnam
utb.scopus.affiliation	Pham N.N., Tomas Bata University, Faculty of Applied Informatics, Zlín, Czech Republic; Kominkova Oplatkova Z., Tomas Bata University, Faculty of Applied Informatics, Zlín, Czech Republic; Huynh H.M., Tomas Bata University, Faculty of Applied Informatics, Zlín, Czech Republic; Vo B., Hutech University, Ho Chi Minh City, Viet Nam
utb.fulltext.projects	TAČR FW01010381
utb.fulltext.projects	IGA/CebiaTech/022/001
utb.fulltext.faculty	Faculty of Applied Informatics
utb.fulltext.faculty	Faculty of Applied Informatics
utb.fulltext.faculty	Faculty of Applied Informatics
utb.fulltext.ou	-
utb.fulltext.ou	-
utb.fulltext.ou	-