Segment Anything Also Detect Anything

EasyChair Preprint 9952, version 2

6 pages • Date: April 11, 2023

Abstract

The field of natural language processing (NLP) has been revolutionised by the emergence of large language models (LLMs), which have demonstrated impressive capabilities in zero-shot and few-shot tasks, as well as in more complex tasks such as mathematical problem-solving and commonsense reasoning, owing to their massive training corpora and intensive training computation. The emergence of large vision models such as the Segment Anything Model (SAM) is likewise transforming computer vision (CV) tasks. In this paper, we propose using SAM to guide semi-automated data annotation for domain-specific object detection. Turning to visual data augmentation, we propose the High Fine Grain Fill-in Augmentation (HFGFA) method, which generates synthetic images at a finer granularity and substantially mitigates the data imbalance and small-object problems. Early experimental validation suggests that this approach can improve model generalisation. Finally, we discuss open-world object detection, where we expect the advent of SAM to greatly advance related research.
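The abstract does not spell out how segmentation output becomes detection labels, so the following is only a minimal sketch of one plausible step in such a semi-automated pipeline: deriving a draft bounding box from each binary mask and handing it to a human annotator for confirmation. The mask representation (nested lists of 0/1), the function names `mask_to_box` and `propose_annotations`, and the output dictionary layout are all assumptions made for illustration; they are not taken from the paper or from the SAM library.

```python
from typing import Dict, List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates (assumed convention)
Box = Tuple[int, int, int, int]
Mask = List[List[int]]  # hypothetical mask format: rows of 0/1 values


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tightest box around the nonzero pixels, or None for an empty mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), rows[0], max(cols), rows[-1])


def propose_annotations(masks: List[Mask], label: str = "object") -> List[Dict]:
    """Turn masks into draft detection labels for an annotator to confirm or correct."""
    proposals = []
    for mask in masks:
        box = mask_to_box(mask)
        if box is not None:  # skip masks that contain no foreground pixels
            proposals.append({"label": label, "bbox": box})
    return proposals


if __name__ == "__main__":
    example = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
    ]
    print(propose_annotations([example]))
```

In a setup like this, the annotator's work would shift from drawing boxes to reviewing proposals and assigning class labels, which is where the "semi-automated" saving would come from.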

Keyphrases: SAM, data augmentation, large language models, object detection

Links:

https://easychair.org/publications/preprint/T1rc

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:9952,
  author    = {Rongsheng Wang and Yaofei Duan and Yukun Li},
  title     = {Segment Anything Also Detect Anything},
  howpublished = {EasyChair Preprint 9952},
  year      = {EasyChair, 2023}}
