OWL-ViT
This post is based on "Simple Open-Vocabulary Object Detection with Vision Transformers", published by Google Research in 2022.
The main contributions of the paper are:
A simple and strong vision-language model for object detection.
Thanks to contrastive pre-training on large-scale image-text data, the model can perform open-vocabulary object detection.
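To make the open-vocabulary idea concrete, here is a minimal sketch, assuming that each detected object and each text query has already been mapped into a shared embedding space by contrastively pre-trained encoders. The function names (`cosine_similarity`, `classify_objects`) and the toy vectors are hypothetical and only for illustration; they are not the paper's code or any library's API. The point is that class labels are not a fixed list: any text prompt can act as a class, and an object is assigned to the prompt whose embedding is most similar.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_objects(object_embeddings, query_embeddings, labels):
    """Assign each object embedding the text label with the highest similarity."""
    results = []
    for obj in object_embeddings:
        scores = [cosine_similarity(obj, q) for q in query_embeddings]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((labels[best], scores[best]))
    return results


# Toy, hand-written vectors standing in for encoder outputs (assumption).
labels = ["a photo of a cat", "a photo of a dog"]
query_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
object_embeddings = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]]

predictions = classify_objects(object_embeddings, query_embeddings, labels)
for label, score in predictions:
    print(f"{label}: {score:.3f}")
```

Because the labels are plain strings, swapping in a new category only requires embedding a new prompt; nothing in the matching step depends on a fixed set of classes.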
How to train