Abstract
An ontology is a structured, hierarchical way of describing domain knowledge. Research on ontological engineering has yielded fruitful results, but existing methods share a common drawback: they require significant manual work to generate an ontology, which limits their usefulness in practice. In this paper, we propose a computational model that combines data mining, Natural Language Processing (NLP), WordNet and a novel class-based n-gram model for automatic ontology discovery and recognition from existing patent documents. A pre-built ontology library was constructed by gathering knowledge from engineering textbooks and dictionaries. A data set of engineering patent claims was then split into training (80%) and validation (20%) subsets. The pre-built library and WordNet were used to generate class labels for constructing class-based n-gram models during training. Holdout validation showed an average accuracy of 87.26% across all validation samples.
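As a rough illustration of what a class-based n-gram model does, the sketch below uses the standard class-based bigram factorization, P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), where c_i is the class label of word w_i. This is not the paper's model: the `WORD_CLASS` lookup, the class names, and the toy sentences are all hypothetical stand-ins for labels that the abstract says are derived from the pre-built library and WordNet.

```python
from collections import Counter, defaultdict

# Hypothetical word-to-class lookup (made-up entries). In the paper, class
# labels are generated from the pre-built ontology library and WordNet.
WORD_CLASS = {
    "gear": "COMPONENT", "shaft": "COMPONENT", "housing": "COMPONENT",
    "rotates": "ACTION", "supports": "ACTION",
}

def to_class(word):
    """Map a word to its class label, falling back to a catch-all class."""
    return WORD_CLASS.get(word, "OTHER")

def train(sentences):
    """Count class bigrams, class histories, and per-class word emissions."""
    class_bigrams = Counter()
    class_history = Counter()
    emissions = defaultdict(Counter)
    for sent in sentences:
        classes = ["<s>"] + [to_class(w) for w in sent]
        for w, c in zip(sent, classes[1:]):
            emissions[c][w] += 1
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_history[prev] += 1
    return class_bigrams, class_history, emissions

def word_prob(model, prev_word, word):
    """Unsmoothed estimate of P(word | prev_word) via the class factorization."""
    class_bigrams, class_history, emissions = model
    prev_c = "<s>" if prev_word is None else to_class(prev_word)
    c = to_class(word)
    class_total = sum(emissions[c].values())
    if class_history[prev_c] == 0 or class_total == 0:
        return 0.0
    p_class = class_bigrams[(prev_c, c)] / class_history[prev_c]
    p_word = emissions[c][word] / class_total
    return p_class * p_word

# Toy training data (hypothetical, not patent claims from the paper).
model = train([
    ["gear", "rotates", "shaft"],
    ["shaft", "supports", "housing"],
])
print(word_prob(model, "gear", "supports"))
```

Because probabilities are shared across a class, the word pair "gear supports" receives a non-zero estimate even though it never occurs in the toy data. That kind of generalization is the usual motivation for class-based models when training text is sparse.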
| Original language | English |
|---|---|
| Pages (from-to) | 142-172 |
| Number of pages | 31 |
| Journal | International Journal of Product Development |
| Volume | 20 |
| Issue number | 2 |
| Publication status | Published - 2015 |
Keywords
- N-gram language model
- Natural language processing
- Ontological engineering
Title
Automatic ontology generation from patents using a pre-built library, WordNet and a class-based n-gram model