Rapidly growing online platforms such as Yelp and Amazon generate a large volume of reviews every moment. Consumers have gradually developed the habit of reading earlier reviews before deciding whether to buy a product or which one to choose. Online reviews play a vital part in determining consumers’ purchase choices in e-commerce, yet many of them are intentionally created to confuse or mislead potential consumers. Moreover, driven by product reputation and merchants’ profits, more and more spam reviews are being inserted into online platforms. Such reviews can be positive, negative, or neutral, but they share a common feature: they mislead consumers or damage product reputations. In the past decade, many researchers have studied spam review detection using statistical or deep learning methods on various datasets. In view of that, this article first introduces the task of online spam review detection and gives a common definition of spam reviews. Second, we comprehensively summarize the existing methods and available datasets. Third, we summarize the existing network-based approaches to this task and propose some directions for future research.
Keywords: Spam review detection, Machine learning, Graph convolution network, Deep learning