Understanding Negative Sampling in Knowledge Graph Embedding

Jing Qian*, Gangmin Li, Katie Atkinson, Yong Yue

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Knowledge graph embedding (KGE) projects the entities and relations of a knowledge graph (KG) into a low-dimensional vector space, and has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained by discriminating positive samples from negative ones. Since most KGs store only positive samples for space efficiency, negative sampling plays a crucial role in encoding the triples of a KG. The quality of the generated negative samples directly affects the performance of the learnt knowledge representation in a myriad of downstream tasks, such as recommendation, link prediction and node classification. We summarize current negative sampling approaches in KGE into three categories: static distribution-based, dynamic distribution-based and custom cluster-based. Based on this categorization, we discuss the most prevalent existing approaches and their characteristics. We hope that this review can provide guidelines for new thinking about negative sampling in KGE.
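
To make the first of the abstract's three categories concrete: static distribution-based approaches corrupt the head or tail of a positive triple with an entity drawn from a fixed distribution, uniform sampling being the simplest case. The following is a minimal, hypothetical Python sketch of uniform negative sampling; the function name `uniform_negative_sample` and its inputs (`entities`, the KG's entity set; `known_triples`, the observed positives) are illustrative assumptions, not taken from the paper.

```python
import random

def uniform_negative_sample(triple, entities, known_triples):
    """Corrupt the head or tail of a positive triple with an entity
    drawn uniformly at random (static distribution-based sampling)."""
    head, relation, tail = triple
    while True:
        candidate = random.choice(entities)
        # Corrupt the head or the tail with equal probability.
        corrupted = (
            (candidate, relation, tail)
            if random.random() < 0.5
            else (head, relation, candidate)
        )
        # Reject candidates that are actually known positive triples,
        # so the result is a genuine negative sample.
        if corrupted not in known_triples:
            return corrupted

# Toy usage: one corrupted triple per positive triple.
entities = ["Paris", "France", "Berlin", "Germany"]
positives = {("Paris", "capital_of", "France"),
             ("Berlin", "capital_of", "Germany")}
for t in positives:
    print(t, "->", uniform_negative_sample(t, entities, positives))
```

Dynamic distribution-based methods (e.g., the GAN-based approaches hinted at by the keywords) replace the fixed uniform distribution above with one that adapts during training, so the rejection loop stays the same while the candidate-drawing step changes.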
Original language: English
Pages (from-to): 71
Number of pages: 11
Journal: International Journal of Artificial Intelligence and Applications
Volume: 12
Issue number: 1
Publication status: Published - Jan 2021

Keywords

  • Negative Sampling
  • Knowledge Graph Embedding
  • Generative Adversarial Network
