UTB Publications
UTB Publication Activity Repository

Text-based feature selection using binary particle swarm optimization for sentiment analysis


dc.title Text-based feature selection using binary particle swarm optimization for sentiment analysis en
dc.contributor.author Botchway, Raphael Kwaku
dc.contributor.author Yadav, Vinod
dc.contributor.author Komínková Oplatková, Zuzana
dc.contributor.author Oplatková, Zuzana
dc.contributor.author Šenkeřík, Roman
dc.relation.ispartof International Conference on Electrical, Computer, and Energy Technologies, ICECET 2022
dc.identifier.isbn 978-1-6654-7087-2
dc.date.issued 2022
dc.event.title 2022 IEEE International Conference on Electrical, Computer, and Energy Technologies, ICECET 2022
dc.event.location Praha
utb.event.state-en Czech Republic
utb.event.state-cs Česká republika
dc.event.sdate 2022-07-20
dc.event.edate 2022-07-22
dc.type conferenceObject
dc.language.iso en
dc.publisher Institute of Electrical and Electronics Engineers Inc.
dc.identifier.doi 10.1109/ICECET55527.2022.9872823
dc.relation.uri https://ieeexplore.ieee.org/document/9872823
dc.relation.uri https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9872823
dc.subject BPSO en
dc.subject classification en
dc.subject feature selection en
dc.subject optimization en
dc.subject sentiment analysis en
dc.description.abstract The upsurge in social media data driven by the proliferation of Web 2.0 applications has spurred scholarly work in the sentiment analysis (SA) domain in recent years. Sentiment analysis, usually treated as a text classification task in Natural Language Processing (NLP), classifies the views, attitudes, and feelings that people express about a particular organization or entity. This unstructured textual data can be pre-processed and represented as feature vectors, which then serve as input to a machine learning algorithm for sentiment classification. In this process, feature selection, which is a binary problem, becomes an essential component of the SA exercise. We present a metaheuristic-based approach that selects an optimal feature subset with the binary particle swarm optimization (BPSO) algorithm, with a view to improving sentiment classification accuracy on the sentiment labelled sentences benchmark dataset. K-Nearest Neighbours (k-NN), Naïve Bayes (NB), and Support Vector Machine (SVM) classifiers were employed as baseline classifiers. Before sentiment classification, BPSO is used to select the optimal text feature subset from the data. We then train SVM, NB, and k-NN on the sentiment labelled sentences benchmark dataset using the selected feature subset. The experimental results show impressive performance for the proposed approach to text feature selection and sentiment classification compared to the baseline classifiers. © 2022 IEEE. en
utb.faculty Faculty of Applied Informatics
dc.identifier.uri http://hdl.handle.net/10563/1011149
utb.identifier.obdid 43883933
utb.identifier.scopus 2-s2.0-85138940468
utb.source d-scopus
dc.date.accessioned 2022-10-18T12:15:14Z
dc.date.available 2022-10-18T12:15:14Z
dc.description.sponsorship Tomas Bata University in Zlin, TBU: IGA/CebiaTech/2021/001
utb.contributor.internalauthor Botchway, Raphael Kwaku
utb.contributor.internalauthor Yadav, Vinod
utb.contributor.internalauthor Komínková Oplatková, Zuzana
utb.contributor.internalauthor Oplatková, Zuzana
utb.contributor.internalauthor Šenkeřík, Roman
utb.fulltext.affiliation Raphael Kwaku Botchway Faculty of Applied Informatics Tomas Bata University in Zlin Nam T.G. Masaryka 5555, 760 01 Zlin, Czech Republic botchway@utb.cz ralph.botchway@gmail.com Vinod Yadav Faculty of Applied Informatics Tomas Bata University in Zlin Nam T.G. Masaryka 5555, 760 01 Zlin, Czech Republic vyadav@utb.cz Zuzana Oplatková Komínková Faculty of Applied Informatics Tomas Bata University in Zlin Nam T.G. Masaryka 5555, 760 01 Zlin Czech Republic oplatkova@utb.cz Roman Senkerik Faculty of Applied Informatics Tomas Bata University in Zlin Nam T.G. Masaryka 5555, 760 01 Zlin, Czech Republic senkerik@utb.cz
utb.fulltext.dates Date Added to IEEE Xplore: 09 September 2022
utb.fulltext.references [1] DiMaggio, P., Hargittai, E., Neuman, W. R., & Robinson, J. P. (2001). Social implications of the Internet. Annual review of sociology, 27(1), 307-336. [2] Wang, C., & Zhang, P. (2012). The evolution of social commerce: The people, management, technology, and information dimensions. Communications of the association for information systems, 31(5), 5. [3] Kumar, A., & Jaiswal, A. (2020). Systematic literature review of sentiment analysis on Twitter using soft computing techniques. Concurrency and Computation: Practice and Experience, 32(1), e5107. [4] Liu, B. (2011). Opinion mining and sentiment analysis. In Web Data Mining (pp. 459-526). Springer, Berlin, Heidelberg. [5] Liu, H., & Motoda, H. (2012). Feature selection for knowledge discovery and data mining (Vol. 454). Springer Science & Business Media. [6] Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial intelligence, 97(1-2), 273-324. [7] Feldman, R., & Sanger, J. (2007). The text mining handbook: advanced approaches in analyzing unstructured data. Cambridge university press. [8] Math, N. S. J., & Ivanovi, M. (2008). Text mining: bag-of-words document representation machine learning with textual data. October, 38(3), 227-234. [9] Botchway, R. K., Jibril, A. B., Oplatková, Z. K., & Chovancová, M. (2020). Deductions from a Sub-Saharan African Bank’s Tweets: A sentiment analysis approach. Cogent Economics & Finance, 8(1), 1776006. [10] Burnap, P., & Williams, M. L. (2015). Cyber hate speech on twitter: An application of machine classification and statistical modeling for policy and decision making. Policy & internet, 7(2), 223-242. [11] Saravanan, T. M., & Tamilarasi, A. (2016). Effective sentiment analysis for opinion mining using artificial bee colony optimization. Research Journal of Applied Sciences, Engineering and Technology, 12(8), 828-840. [12] Pandey, A. C., Kulhari, A., & Shukla, D. S. (2022). Enhancing sentiment analysis using Roulette wheel selection based cuckoo search clustering method. Journal of Ambient Intelligence and Humanized Computing, 13(1), 1-29. [13] Zhang, L., Shan, L., & Wang, J. (2017). Optimal feature selection using distance-based discrete firefly algorithm with mutual information criterion. Neural Computing and Applications, 28(9), 2795-2808. [14] Kennedy, J., & Eberhart, R. (1995, November). Particle swarm optimization. In Proceedings of ICNN'95-international conference on neural networks (Vol. 4, pp. 1942-1948). IEEE. [15] Kennedy, J., & Eberhart, R. C. (1997, October). A discrete binary version of the particle swarm algorithm. In 1997 IEEE International conference on systems, man, and cybernetics. Computational cybernetics and simulation (Vol. 5, pp. 4104-4108). IEEE. [16] Liu, H., & Zhao, Z. (2012). Manipulating data and dimension reduction methods: Feature selection. In Computational Complexity: theory, techniques, and applications (pp. 1790-1800). Springer New York. [17] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. the Journal of machine Learning research, 12, 2825-2830. [18] Miranda, L. J. (2018). PySwarms: a research toolkit for Particle Swarm Optimization in Python. Journal of Open Source Software, 3(21), 433. [19] Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Information retrieval, 1(1), 69-90. [20] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. [21] Kotzias, D., Denil, M., De Freitas, N., & Smyth, P. (2015, August). From group to individual labels using deep features. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 597-606). [22] Vieira, S. M., Mendonça, L. F., Farinha, G. J., & Sousa, J. M. (2013). Modified binary PSO for feature selection using SVM applied to mortality prediction of septic patients. Applied Soft Computing, 13(8), 3494-3504. [23] Botchway, R. K., Jibril, A. B., Kwarteng, M. A., Chovancova, M., & Oplatková, Z. K. (2019, September). A review of social media posts from UniCredit bank in Europe: a sentiment analysis approach. In Proceedings of the 3rd international conference on business and information Management (pp. 74-79).
utb.fulltext.sponsorship Funding Agency: IGA/CebiaTech/2021/001 Funding for this research was sourced from the Internal Grant Agency of Tomas Bata University in Zlin under project no. IGA/CebiaTech/2021/001. The work was further supported by the resources of the A.I lab at the Faculty of Applied Informatics, Tomas Bata University in Zlin (ailab.fai.utb.cz).
utb.scopus.affiliation Faculty of Applied Informatics, Tomas Bata University in Zlin, Nam T.G. Masaryka 5555, Zlin, 76001, Czech Republic
utb.fulltext.projects IGA/CebiaTech/2021/001
utb.fulltext.faculty Faculty of Applied Informatics
utb.fulltext.faculty Faculty of Applied Informatics
utb.fulltext.faculty Faculty of Applied Informatics
utb.fulltext.faculty Faculty of Applied Informatics
utb.fulltext.ou -
utb.fulltext.ou -
utb.fulltext.ou -
utb.fulltext.ou -
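
Illustrative note on the abstract: the wrapper-style feature selection it describes (BPSO proposes binary feature masks, and a classifier scores each mask) can be sketched roughly as below. This is a minimal, standard-library-only sketch and not the authors' implementation. The toy data, the leave-one-out 1-nearest-neighbour fitness, the function names (`make_toy_data`, `loo_1nn_accuracy`, `bpso_select`), and all hyperparameters (`w`, `c1`, `c2`, swarm size, velocity clamp) are assumptions made for illustration. The record's reference list cites scikit-learn and PySwarms, but their use and settings in the paper are not stated here.

```python
import math
import random


def make_toy_data(n_samples=40, n_features=6, seed=1):
    """Hypothetical toy data: only feature 0 carries the class label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.random() for _ in range(n_features)]  # noise features
        row[0] = label + 0.1 * rng.random()              # informative feature
        X.append(row)
        y.append(label)
    return X, y


def loo_1nn_accuracy(X, y, mask):
    """Fitness (assumed): leave-one-out 1-NN accuracy on the selected columns."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            dist = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def bpso_select(X, y, n_particles=10, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO with a sigmoid transfer function; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [loo_1nn_accuracy(X, y, p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid does not saturate
                vel[i][j] = v
                # The velocity sets the probability that bit j is 1.
                pos[i][j] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            fit = loo_1nn_accuracy(X, y, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, fitness = bpso_select(X, y)
    print("selected mask:", mask, "fitness:", fitness)
```

In a text setting, the columns of `X` would be bag-of-words or TF-IDF features, and the fitness could be swapped for cross-validated accuracy of an SVM, NB, or k-NN classifier. Which fitness function the paper uses is not given in this record.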