Kernel triplet loss for image-text retrieval

Zhengxin Pan; Fangyu Wu; Bailing Zhang

doi:10.1002/cav.2093

Zhengxin Pan, Fangyu Wu^*, Bailing Zhang

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Triplet loss is widely used as the objective function in image-text retrieval tasks. However, because all triplets are treated equally, triplet loss suffers from slow convergence and other performance shortcomings. In this article, we address these problems by weighting triplets according to the relative similarities among the training samples. Specifically, we present three weighting functions that assign an appropriate weight to each selected informative triplet in order to accelerate convergence. We evaluate our approach on two widely used benchmark datasets, Flickr30k and MSCOCO, where it outperforms previous methods.
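The abstract does not spell out the three weighting functions, so the following is only a minimal sketch of the general idea of weighting informative triplets by relative similarity. The exponential weight, the `temperature` parameter, and all function names are assumptions made for illustration; they are not taken from the article.

```python
import math


def exp_weight(s_pos, s_neg, temperature=0.1):
    """Hypothetical weighting function (an assumption, not the article's).

    Negatives whose similarity is close to or above the positive's
    similarity receive larger weights.
    """
    return math.exp((s_neg - s_pos) / temperature)


def weighted_triplet_loss(sim, margin=0.2, weight_fn=exp_weight):
    """Bidirectional hinge triplet loss with per-triplet weights.

    sim is a square matrix (list of lists) of image-text similarities,
    where sim[i][i] is the matching pair and every off-diagonal entry
    is a negative. Only triplets that violate the margin contribute.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        s_pos = sim[i][i]
        for j in range(n):
            if j == i:
                continue
            # Image i as anchor with text j as negative, then
            # text i as anchor with image j as negative.
            for s_neg in (sim[i][j], sim[j][i]):
                violation = margin + s_neg - s_pos
                if violation > 0:
                    total += weight_fn(s_pos, s_neg) * violation
    return total / n


def uniform_weight(s_pos, s_neg):
    """Recovers the unweighted triplet loss for comparison."""
    return 1.0


if __name__ == "__main__":
    hard = [[0.5, 0.6], [0.4, 0.5]]
    print(weighted_triplet_loss(hard, weight_fn=uniform_weight))
    print(weighted_triplet_loss(hard))
```

Passing `uniform_weight` treats all violating triplets equally, which is the baseline behaviour the abstract criticises; swapping in a similarity-dependent weight lets harder negatives dominate the loss.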

Original language	English
Article number	e2093
Journal	Computer Animation and Virtual Worlds
Volume	33
Issue number	3-4
DOIs	https://doi.org/10.1002/cav.2093
Publication status	Published - 13 Jun 2022
Externally published	Yes

Keywords

deep metric learning
image-text retrieval
kernel triplet loss
weighting scheme

Cite this

@article{6c111f10caae47778ea564aa0d0df442,

title = "Kernel triplet loss for image-text retrieval",

abstract = "Triplet loss is widely used as the objective function in image-text retrieval tasks. However, as all the triplets are treated equally, triplet loss has a bottleneck problem of slow convergence and other unsatisfactory performances. In this article, we propose solutions by appropriately weighting triplets according to the relative similarities among the training samples. Specifically, we present three weighting functions to assign an appropriate weight for the selected informative triplets to accelerate the convergence. We evaluate our approach on two widely used benchmark datasets: Flickr30k and MSCOCO, with results outperforming the previous methods, which demonstrates its superiority.",

keywords = "deep metric learning, image-text retrieval, kernel triplet loss, weighting scheme",

author = "Zhengxin Pan and Fangyu Wu and Bailing Zhang",

note = "Funding Information: The article is supported by Ningbo 2025 Key Scientific Research Programs, Grant/Award Number: 2019B10128; Zhejiang Provincial Philosophy and Social Sciences Planning Project, Grant/Award Number: 22JCXK08Z. Publisher Copyright: {\textcopyright} 2022 John Wiley \& Sons, Ltd.",

year = "2022",

month = jun,

day = "13",

doi = "10.1002/cav.2093",

language = "English",

volume = "33",

journal = "Computer Animation and Virtual Worlds",

issn = "1546-4261",

number = "3-4",

}

TY - JOUR

T1 - Kernel triplet loss for image-text retrieval

AU - Pan, Zhengxin

AU - Wu, Fangyu

AU - Zhang, Bailing

N1 - Funding Information: The article is supported by Ningbo 2025 Key Scientific Research Programs, Grant/Award Number: 2019B10128; Zhejiang Provincial Philosophy and Social Sciences Planning Project, Grant/Award Number: 22JCXK08Z. Publisher Copyright: © 2022 John Wiley & Sons, Ltd.

PY - 2022/6/13

Y1 - 2022/6/13

AB - Triplet loss is widely used as the objective function in image-text retrieval tasks. However, as all the triplets are treated equally, triplet loss has a bottleneck problem of slow convergence and other unsatisfactory performances. In this article, we propose solutions by appropriately weighting triplets according to the relative similarities among the training samples. Specifically, we present three weighting functions to assign an appropriate weight for the selected informative triplets to accelerate the convergence. We evaluate our approach on two widely used benchmark datasets: Flickr30k and MSCOCO, with results outperforming the previous methods, which demonstrates its superiority.

KW - deep metric learning

KW - image-text retrieval

KW - kernel triplet loss

KW - weighting scheme

UR - http://www.scopus.com/inward/record.url?scp=85131719083&partnerID=8YFLogxK

U2 - 10.1002/cav.2093

DO - 10.1002/cav.2093

M3 - Article

AN - SCOPUS:85131719083

SN - 1546-4261

VL - 33

JO - Computer Animation and Virtual Worlds

JF - Computer Animation and Virtual Worlds

IS - 3-4

M1 - e2093

ER -
