Abstract
According to the China Internet Network Information Centre (CNNIC), most internet users around the world spend 10-16 hours per week online. For effective advertising and social information publishing on the internet, extracting commercial value from users' online behaviour has become a new challenge compared with traditional recommendation systems. In this paper, we propose a novel system, the 'online commercial intention (OCI) detection system', which uses users' global web browsing history to predict the products they are likely to purchase on an online shopping platform. A 'commercial keyword dictionary (KD)', which captures the relationship between user queries and product categories, is first built by analysing the click distribution of billion-scale queries on the shopping platform. The footprints of millions of internet users are gathered and the raw page contents are crawled. Keywords in these pages are extracted using an N-gram algorithm, and their commercial probabilities are estimated from query frequency (QF), inverse category frequency (ICF), and other measures. The page OCI is estimated by merging the KD matrices of its commercial keywords. To improve the coherence and accuracy of the predicted categories, we provide a category similarity model that measures the distance between the top N categories. The experimental results show that category prediction accuracy reaches 86% under manual evaluation.
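
The page-level step described above (extract N-gram keywords, look them up in the KD, merge their category distributions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the QF or ICF formulas, so the weight below is an IDF-style stand-in, and the dictionary entries, category names, and function names (`extract_ngrams`, `icf`, `page_oci`) are hypothetical rather than taken from the paper.

```python
import math
import re
from collections import defaultdict

# Hypothetical keyword dictionary (KD): keyword -> {category: P(category | keyword)}.
# Values are illustrative only, not data from the paper.
KEYWORD_DICT = {
    "running shoes": {"Sports": 0.8, "Fashion": 0.2},
    "shoes": {"Fashion": 0.6, "Sports": 0.4},
    "marathon": {"Sports": 1.0},
}
ALL_CATEGORIES = {"Sports", "Fashion", "Electronics"}


def extract_ngrams(text, max_n=2):
    """Return all word n-grams of length 1..max_n, shortest first."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams


def icf(keyword):
    """Assumed IDF-style weight: keywords tied to fewer categories count more."""
    n_categories = len(KEYWORD_DICT[keyword])
    return math.log(len(ALL_CATEGORIES) / n_categories) + 1.0


def page_oci(text):
    """Merge the KD rows of every dictionary keyword found in the page text.

    Returns a normalised {category: score} dict, or {} when the page
    contains no dictionary keyword.
    """
    scores = defaultdict(float)
    for gram in extract_ngrams(text):
        if gram in KEYWORD_DICT:
            weight = icf(gram)
            for category, prob in KEYWORD_DICT[gram].items():
                scores[category] += weight * prob
    total = sum(scores.values())
    if total == 0:
        return {}
    return {category: score / total for category, score in scores.items()}
```

Calling `page_oci("Best running shoes for your first marathon")` ranks the hypothetical `Sports` category first, because `marathon` maps to a single category and so receives the largest weight. A repeated keyword contributes once per occurrence, which acts as a crude term-frequency factor; the system described in the abstract may weight occurrences differently.
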
Original language | English |
---|---|
Pages (from-to) | 176-185 |
Number of pages | 10 |
Journal | International Journal of Computational Science and Engineering |
Volume | 12 |
Issue number | 2-3 |
Publication status | Published - 2016 |
Externally published | Yes |
Keywords
- Category similarity model
- Commercial keyword dictionary
- Commercial probabilities
- Large-scale data
- OCI
- Online commercial intention
- Product categories
- User profile
- User-online behaviour