A Training-free Synthetic Data Selection Method for Semantic Segmentation

Hao Tang, Siyue Yu, Jian Pang, Bingfeng Zhang*

*Corresponding author for this work

    Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

    Abstract

    Training semantic segmenters with synthetic data has attracted great attention because such data are easy to obtain and available in huge quantities. Most previous methods focus on producing large-scale synthetic image-annotation samples and then training the segmenter on all of them. However, poor-quality samples are unavoidable in such a pipeline, and training on them damages the training process. In this paper, we propose a training-free Synthetic Data Selection (SDS) strategy that uses CLIP to select high-quality samples for building a reliable synthetic dataset. Specifically, given massive synthetic image-annotation pairs, we first design a Perturbation-based CLIP Similarity (PCS) to measure the reliability of each synthetic image, thereby removing samples with low-quality images. We then propose a class-balance Annotation Similarity Filter (ASF), which compares the synthetic annotation with the response of CLIP to remove samples with low-quality annotations. Experimental results show that our method reduces the data size by half, while the trained segmenter achieves higher performance.
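    The two-stage selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the PCS or ASF scores are computed, so the sketch takes both as precomputed numbers. The `Sample` fields, the `image_threshold` and `keep_ratio` parameters, and the per-class top-fraction rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    name: str
    cls: str                 # class associated with the synthetic annotation (assumed)
    image_score: float       # stand-in for a PCS-style image reliability score
    annotation_score: float  # stand-in for annotation vs. CLIP-response agreement


def select_samples(samples, image_threshold=0.5, keep_ratio=0.5):
    """Hypothetical two-stage filter: image quality first, then annotations."""
    # Stage 1: drop samples whose image score falls below the threshold.
    reliable = [s for s in samples if s.image_score >= image_threshold]

    # Stage 2: group by class so that every class keeps the same fraction
    # of its samples (one possible reading of "class-balance").
    by_class = defaultdict(list)
    for s in reliable:
        by_class[s.cls].append(s)

    selected = []
    for group in by_class.values():
        # Keep the best-scoring annotations within each class.
        ranked = sorted(group, key=lambda s: s.annotation_score, reverse=True)
        n_keep = math.ceil(len(ranked) * keep_ratio)
        selected.extend(ranked[:n_keep])
    return selected
```

    Ranking within each class, rather than applying one global cutoff, keeps classes with generally lower scores from being filtered out entirely. Whether the paper balances classes this way is not stated in the abstract.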

    Original language: English
    Title of host publication: Special Track on AI Alignment
    Editors: Toby Walsh, Julie Shah, Zico Kolter
    Publisher: Association for the Advancement of Artificial Intelligence
    Pages: 7229-7237
    Number of pages: 9
    Edition: 7
    ISBN (Electronic): 157735897X, 9781577358978
    Publication status: Published - 11 Apr 2025
    Event: 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States
    Duration: 25 Feb 2025 to 4 Mar 2025

    Publication series

    Name: Proceedings of the AAAI Conference on Artificial Intelligence
    Number: 7
    Volume: 39
    ISSN (Print): 2159-5399
    ISSN (Electronic): 2374-3468

    Conference

    Conference: 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
    Country/Territory: United States
    City: Philadelphia
    Period: 25/02/25 to 4/03/25
