Social Media Mining with Fuzzy Text Matching: A Knowledge Extraction on Tourism After COVID-19 Pandemic

Ida Bagus Putra Manuaba, I Wayan Budi Sentana, I Nyoman Gede Arya Astawa, I Wayan Suasnawa, I Putu Bagus Arya Pradnyana

Abstract


Social media mining is an emerging technique for analyzing data to extract valuable knowledge across various domains. However, traditional text matching techniques such as exact matching are not always suitable for social media data, which can contain spelling mistakes, abbreviations, and variations in word usage. Fuzzy matching is a text matching technique that tolerates such variations and identifies similarities between two texts even when they differ in spelling or phrasing. Existing research makes only limited use of fuzzy matching in social media mining for tourism recovery analysis. This research seeks to bridge that gap by applying fuzzy matching to social media data related to COVID-19 and tourism recovery, in order to extract insights into the impact of the pandemic on tourism recovery. We manually retrieved 19,462 Twitter records and differentiated the data sources using four parameters that indicate the impact of COVID-19 on the tourism industry: the economy, restrictions, government policies, and vaccination. We conducted text mining analysis on the 7,352 words collected and identified 25 highly recommended words that indicated COVID-19 recovery from a tourism perspective. We set aside the four words representing the tourism perspective as a dataset for fuzzy matching. We then matched this inbound dataset against the 7,352 words collected in the text mining process. The matching process yielded 18 words representing COVID-19 recovery from a tourism perspective.
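To make the matching step concrete, the following is a minimal sketch of fuzzy word matching, not the authors' implementation. The abstract does not say which similarity measure or threshold was used, so the use of Python's standard-library `difflib.SequenceMatcher`, the `0.85` threshold, the function names `similarity` and `fuzzy_match`, and the sample words are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_match(seeds, vocabulary, threshold=0.85):
    """Map each seed word to the vocabulary words that resemble it.

    seeds      -- small list of reference words (hypothetical examples here)
    vocabulary -- words extracted from the tweets by text mining
    threshold  -- minimum similarity for a match (assumed value)
    """
    matches = {}
    for seed in seeds:
        scored = [(word, similarity(seed, word)) for word in vocabulary]
        hits = [(word, score) for word, score in scored if score >= threshold]
        # Best matches first.
        matches[seed] = sorted(hits, key=lambda pair: pair[1], reverse=True)
    return matches


if __name__ == "__main__":
    # Hypothetical seed words and a tiny vocabulary with misspellings.
    seeds = ["vaccination", "travel"]
    vocabulary = ["vaccinaton", "trvel", "economy"]
    for seed, hits in fuzzy_match(seeds, vocabulary).items():
        print(seed, [word for word, _ in hits])
```

Unlike exact matching, this approach pairs a misspelled token such as "vaccinaton" with its reference word, while unrelated words fall below the threshold. In practice the threshold would need tuning against the actual tweet vocabulary.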





DOI: http://dx.doi.org/10.17977/um018v5i22022p143-149



Copyright (c) 2023 Knowledge Engineering and Data Science

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
