TY - JOUR
T1 - Python data odyssey
T2 - Mining user feedback from google play store
AU - Yasin, Affan
AU - Fatima, Rubia
AU - Ghazi, Ahmad Nauman
AU - Wei, Ziqi
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/6
Y1 - 2024/6
N2 - Context: The Google Play Store is widely recognized as one of the largest platforms for downloading applications, both free and paid. Every day, millions of users visit this marketplace and share their views through star ratings, comments, suggestions, and feedback. These comments and feedback constitute a valuable resource for organizations, competitors, and emerging companies seeking to expand their market presence. They provide insights into app deficiencies, suggestions for new features, identified issues, and potential enhancements. Unlocking the potential of this repository of suggestions holds significant value. Objective: This study sought to gather and analyze user reviews from the Google Play Store for leading game apps. The primary aim was to construct a dataset for subsequent analysis utilizing requirements engineering, machine learning, and competitive assessment. Methodology: The authors employed a Python-based web scraping method to extract more than 429,000 reviews from the Google Play pages of selected apps. The scraped data encompassed reviewer names (removed for privacy), ratings, and the textual content of the reviews. Results: The outcome was a dataset comprising the extracted user reviews, ratings, and associated metadata. More than 429,000 reviews were acquired through the scraping process for popular apps such as Subway Surfers, Candy Crush Saga, and PUBG Mobile, among others. This dataset serves as an educational resource for instructors training students in data analysis, and it also gives practitioners the opportunity for in-depth examination of historical data from top apps.
AB - Context: The Google Play Store is widely recognized as one of the largest platforms for downloading applications, both free and paid. Every day, millions of users visit this marketplace and share their views through star ratings, comments, suggestions, and feedback. These comments and feedback constitute a valuable resource for organizations, competitors, and emerging companies seeking to expand their market presence. They provide insights into app deficiencies, suggestions for new features, identified issues, and potential enhancements. Unlocking the potential of this repository of suggestions holds significant value. Objective: This study sought to gather and analyze user reviews from the Google Play Store for leading game apps. The primary aim was to construct a dataset for subsequent analysis utilizing requirements engineering, machine learning, and competitive assessment. Methodology: The authors employed a Python-based web scraping method to extract more than 429,000 reviews from the Google Play pages of selected apps. The scraped data encompassed reviewer names (removed for privacy), ratings, and the textual content of the reviews. Results: The outcome was a dataset comprising the extracted user reviews, ratings, and associated metadata. More than 429,000 reviews were acquired through the scraping process for popular apps such as Subway Surfers, Candy Crush Saga, and PUBG Mobile, among others. This dataset serves as an educational resource for instructors training students in data analysis, and it also gives practitioners the opportunity for in-depth examination of historical data from top apps.
KW - App reviews
KW - Crowd-sourced data
KW - Data mining
KW - NLP
KW - User reviews
UR - http://www.scopus.com/inward/record.url?scp=85192669911&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2024.110499
DO - 10.1016/j.dib.2024.110499
M3 - Article
AN - SCOPUS:85192669911
SN - 2352-3409
VL - 54
JO - Data in Brief
JF - Data in Brief
M1 - 110499
ER -