Abstract
Triplet loss is widely used as the objective function in image-text retrieval tasks. However, because it treats all triplets equally, triplet loss suffers from slow convergence and suboptimal retrieval performance. In this article, we address these problems by weighting triplets according to the relative similarities among the training samples. Specifically, we present three weighting functions that assign an appropriate weight to each selected informative triplet, which accelerates convergence. We evaluate our approach on two widely used benchmark datasets, Flickr30k and MSCOCO, where it outperforms previous methods.
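The general idea of weighting triplets by relative similarity can be illustrated with a minimal sketch. The abstract does not specify the three weighting functions, so the sigmoid weight below, the function name `weighted_triplet_loss`, and the `margin` and `temperature` parameters are all assumptions chosen for illustration, not the article's method. The sketch assumes a square similarity matrix whose diagonal holds the matched image-text pairs.

```python
import numpy as np


def weighted_triplet_loss(sim, margin=0.2, temperature=10.0):
    """Hinge-based triplet loss with a per-triplet weight.

    sim[i, j] is the similarity between image i and caption j;
    matched pairs are assumed to lie on the diagonal.
    The sigmoid weight is a hypothetical stand-in, NOT one of the
    article's three weighting functions.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    pos = np.diag(sim)                # similarity of each matched pair
    off_diag = ~np.eye(n, dtype=bool)  # excludes the positives themselves

    # Margin violation of every (anchor, positive, negative) triplet:
    # image -> text compares against the anchor image's own caption (rows),
    # text -> image compares against the anchor caption's own image (columns).
    violations = (
        margin + sim - pos[:, None],
        margin + sim - pos[None, :],
    )

    total = 0.0
    for viol in violations:
        hinge = np.clip(viol, 0.0, None) * off_diag
        # Assumed weighting: a negative that is relatively more similar
        # than the positive (larger violation) gets a larger weight.
        weight = 1.0 / (1.0 + np.exp(-temperature * viol))
        total += float((weight * hinge).sum())
    return total / n


if __name__ == "__main__":
    # One hard negative: caption 1 scores higher than the matched caption 0.
    print(weighted_triplet_loss([[0.5, 0.6], [0.1, 0.9]]))
```

In this sketch, triplets that already satisfy the margin contribute nothing, and the remaining ones are scaled smoothly rather than treated equally, which is the contrast with plain triplet loss that the abstract describes.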
Original language | English |
---|---|
Article number | e2093 |
Journal | Computer Animation and Virtual Worlds |
Volume | 33 |
Issue number | 3-4 |
Publication status | Published - 13 Jun 2022 |
Externally published | Yes |
Keywords
- deep metric learning
- image-text retrieval
- kernel triplet loss
- weighting scheme