Tripadvisor Restaurant 2020 (TripR-2020) is a dataset of reviews from Tripadvisor about restaurants located in the city of London.
Sentiment analysis at the aspect level is known as aspect-based sentiment analysis (ABSA). It determines the opinion expressed about each entity, and about each aspect of that entity, whether mentioned explicitly or implicitly in the review. We show an example review that expresses a positive opinion about the ambiance and the food of a restaurant, but a negative opinion about its price:
The TripR-2020 dataset provides the reviews of 1,428 experts evaluating 78 restaurants. Not all experts evaluate all the restaurants. The dataset contains 8,306 reviews. Each review is described by the following attributes:
- Concerning the restaurant: name, identifier code, location name, and location identifier code.
- Concerning the expert: name, identifier code, and location name.
- Concerning the review: title, body, date, general rating, food rating, service rating, and value rating.
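The attribute groups above could be modelled as a single record type. The following is a minimal sketch only: the field names, the Python types, and the choice to make some fields optional are assumptions made for illustration, not the column names or schema of the distributed files, and the sample values are placeholders rather than data from TripR-2020.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """One review record; all field names here are hypothetical."""

    # Concerning the restaurant
    restaurant_name: str
    restaurant_id: str
    restaurant_location: str
    restaurant_location_id: str
    # Concerning the expert
    expert_name: str
    expert_id: str
    expert_location: Optional[str]  # assumed optional
    # Concerning the review
    title: str
    body: str
    date: str  # kept as text; the actual date format is not assumed
    general_rating: int
    food_rating: Optional[int]  # sub-ratings assumed optional
    service_rating: Optional[int]
    value_rating: Optional[int]


# Placeholder values, not taken from the dataset.
example = Review(
    restaurant_name="<restaurant name>",
    restaurant_id="<restaurant id>",
    restaurant_location="London",
    restaurant_location_id="<location id>",
    expert_name="<expert name>",
    expert_id="<expert id>",
    expert_location=None,
    title="<review title>",
    body="<review body>",
    date="<review date>",
    general_rating=5,
    food_rating=None,
    service_rating=None,
    value_rating=None,
)
print(example.restaurant_location)
```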
A reduced version of the TripR-2020 dataset has been used to evaluate the Sentiment Analysis based Multi-person Multi-criteria Decision Making (SA-MpMcDM) methodology. The SA-MpMcDM methodology, which is under review, incorporates sentiment analysis so that decision making models can consider expert evaluations written in natural language. Its aim is to overcome a limitation of traditional decision making models, which can only accept expert evaluations expressed through pre-defined numerical or linguistic terms.
The reduced TripR-2020 dataset is composed of 4 restaurants evaluated by 6 experts. Each of the 6 experts evaluates all 4 restaurants, which gives a total of 24 reviews and avoids any loss of information. This reduced dataset is annotated following the official SemEval-2016 annotation guidelines. We show the main features of the annotation:
|Num. Pos. Opinions||149|
|Num. Neg. Opinions||26|
|Aspect categories||Restaurant, Food, Service, Drinks, Ambience and Location|
|Polarity values||positive, negative and neutral|
The 24 reviews from the dataset are split into 168 sentences. We annotated 185 opinions, of which 149 are positive, 26 are negative and the remaining 10 are neutral. Each sentence can be annotated with one or more aspect categories (Restaurant, Food, Service, Drinks, Ambience and Location).
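The counts above can be checked with a few lines of arithmetic. This is only a sanity check on the figures stated in the text; the variable names are illustrative.

```python
# Figures stated in the description of the reduced dataset.
total_opinions = 185
positive = 149
negative = 26
sentences = 168

# Neutral opinions are whatever is left after positive and negative.
neutral = total_opinions - positive - negative
print(neutral)  # 10

# Share of each polarity, as a percentage of all annotated opinions.
shares = {
    "positive": round(100 * positive / total_opinions, 1),
    "negative": round(100 * negative / total_opinions, 1),
    "neutral": round(100 * neutral / total_opinions, 1),
}
print(shares)  # {'positive': 80.5, 'negative': 14.1, 'neutral': 5.4}

# Average number of annotated opinions per sentence.
print(round(total_opinions / sentences, 2))  # 1.1
```

The distribution is heavily skewed towards positive opinions, which is worth keeping in mind when using the reduced dataset to evaluate a polarity classifier.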
We show an example of annotation of a review from the TripR-2020 dataset:
|Sentence||Aspect term||Aspect category||Polarity|
|I have been coming for years.||implicit||Restaurant||positive|
|Always good atmosphere and fun people watching.||atmosphere||Ambience||positive|
|The food is always good and quick.||food||Food||positive|
The review is split into sentences. The first column contains the text of the sentence. The second column contains the aspect term discussed in the sentence; the aspect is implicit when the aspect term is not explicitly mentioned in the sentence. The third column is the aspect category to which the aspect term is assigned. The fourth column is the sentiment polarity expressed about that aspect category.
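The four-column annotation described above could be held in memory as follows. This is a sketch under assumptions: the `Opinion` type and its field names are hypothetical and do not reflect the file format of the released annotations (which follow the SemEval-2016 guidelines); only the three example rows come from the text.

```python
from collections import Counter
from typing import NamedTuple


class Opinion(NamedTuple):
    """One annotated opinion; field names are illustrative."""

    sentence: str
    aspect_term: str  # "implicit" when no term appears in the sentence
    category: str
    polarity: str


# The three annotated sentences of the example review.
opinions = [
    Opinion("I have been coming for years.",
            "implicit", "Restaurant", "positive"),
    Opinion("Always good atmosphere and fun people watching.",
            "atmosphere", "Ambience", "positive"),
    Opinion("The food is always good and quick.",
            "food", "Food", "positive"),
]

# Count opinions per polarity and collect those with implicit aspects.
polarity_counts = Counter(o.polarity for o in opinions)
implicit = [o for o in opinions if o.aspect_term == "implicit"]

print(polarity_counts)  # Counter({'positive': 3})
print([o.category for o in implicit])  # ['Restaurant']
```

A flat list of opinions, rather than one record per sentence, keeps the sketch valid for sentences that carry more than one annotated aspect category.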
The TripR-2020 dataset can be used for research in:
- Extraction of implicit and explicit aspects.
- Classification of aspect categories.
- Opinion classification at the aspect level.
Furthermore, because it is oriented towards restaurant evaluation, it can be used for research related to Decision Making (DM).
TripR-2020 has been used to evaluate the SA-MpMcDM methodology, which combines opinion analysis techniques and deep learning with a DM model to obtain the evaluators' opinion on each criterion (aspect category). The work is under review.
The article in which TripR-2020 is presented is under review.
The available versions of TripR-2020 can be downloaded from the following repository: