INTEGRATING DATA SCIENCE AND PREDICTIVE MODELING FOR DETECTING INCONSISTENT HOTEL REVIEWS

AUTHOR(S)
Milena Nikolić1,2*, Miloš Stojanović2, Marina Marjanović3

ABSTRACT
With the hotel industry's growing dependence on online reviews, identifying and removing inconsistent or misleading feedback is essential to the credibility of review platforms. This paper presents a predictive modeling approach for detecting inconsistent hotel reviews that combines sentiment analysis, correlation assessment, and feature engineering. Our methodology extracts sentiment scores from review texts, titles, and tags using the VADER sentiment analysis tool, which is particularly suited to informal, user-generated content. By analyzing the correlations between these sentiment scores and the numerical ratings provided by reviewers, we identify mismatches that indicate potential inconsistencies. To improve detection accuracy, we apply criteria based on sentiment mismatches and correlation thresholds. To classify reviews, we employ the XGBoost algorithm, known for its strong performance on structured data, and tune its hyperparameters with RandomizedSearchCV to achieve higher precision. This technique filters out inconsistent reviews and provides insights for improving the reliability of online feedback systems. Our results emphasize the value of data science and predictive modeling in ensuring the integrity of review data, ultimately enabling consumers to make better-informed decisions.
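
The mismatch idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each review already has a VADER compound score in [-1, 1], uses the ±0.05 cut-offs commonly recommended for VADER, and maps star ratings to polarity with a hypothetical rule (4 or 5 stars positive, 1 or 2 stars negative). The function names and the sample data are illustrative only.

```python
from math import sqrt

# Assumed cut-offs for VADER compound scores (commonly recommended values).
POSITIVE_THRESHOLD = 0.05
NEGATIVE_THRESHOLD = -0.05


def sentiment_label(compound):
    """Map a compound sentiment score in [-1, 1] to a polarity label."""
    if compound >= POSITIVE_THRESHOLD:
        return "positive"
    if compound <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"


def rating_label(rating):
    """Map a 1-5 star rating to a polarity label (hypothetical rule)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def is_inconsistent(compound, rating):
    """Flag a review whose text polarity is the opposite of its rating polarity."""
    pair = {sentiment_label(compound), rating_label(rating)}
    return pair == {"positive", "negative"}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative (compound score, star rating) pairs; not data from the paper.
reviews = [(0.92, 5), (-0.80, 1), (0.85, 1), (-0.70, 5), (0.10, 3)]

flags = [is_inconsistent(score, stars) for score, stars in reviews]
score_rating_correlation = pearson(
    [score for score, _ in reviews],
    [stars for _, stars in reviews],
)
```

In a fuller pipeline, flags like these could serve as labels or features for a classifier such as XGBoost, with the correlation between scores and ratings used to choose which sentiment sources (text, title, tags) to trust; the exact criteria and thresholds used in the paper are not stated in the abstract.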

DOI
http://www.doi.org/10.70456/DHXA1258

DOWNLOAD
https://unitechsp.tugab.bg/images/2024/4-CST/s4_p104_v1.pdf

How to cite this article:
Milena Nikolić, Miloš Stojanović, Marina Marjanović, INTEGRATING DATA SCIENCE AND PREDICTIVE MODELING FOR DETECTING INCONSISTENT HOTEL REVIEWS, UNITECH – SELECTED PAPERS - 2024