# CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation
This is our official implementation of CAT-Seg🐱!
[[arXiv](#)] [[Project](#)]<br>
by [Seokju Cho](https://seokju-cho.github.io/)\*, [Heeseong Shin](https://github.com/hsshin98)\*, [Sunghwan Hong](https://sunghwanhong.github.io), Seungjun An, Seungjun Lee, [Anurag Arnab](https://anuragarnab.github.io), [Paul Hongsuck Seo](https://phseo.github.io), [Seungryong Kim](https://cvlab.korea.ac.kr)
## Introduction
![](assets/fig1.png)
We introduce cost aggregation to open-vocabulary semantic segmentation, which jointly aggregates both image and text modalities within the matching cost.
## Installation
Create a conda environment and install the required packages:
```bash
conda create --name catseg python=3.8
conda activate catseg
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
```
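After installation, a quick sanity check can confirm that the environment picked up the pinned versions. The snippet below is a minimal sketch and not part of the official instructions; `check_install` is a hypothetical helper name, and it assumes `python` resolves to the interpreter of the activated `catseg` environment. With the pins above, one would expect it to report torch 1.10.1 and torchvision 0.11.2.

```bash
# Sketch only: check_install is a hypothetical helper, not shipped with the repository.
# It prints the installed torch/torchvision versions and whether CUDA is visible.
check_install() {
  if python -c "import torch, torchvision" 2>/dev/null; then
    python -c "import torch, torchvision; print('torch', torch.__version__); print('torchvision', torchvision.__version__); print('cuda available:', torch.cuda.is_available())"
  else
    # Reached when python is missing or either package fails to import.
    echo "torch or torchvision is not importable in the current environment"
  fi
}

check_install
```

If the fallback message appears, the usual cause is that the `catseg` environment is not activated in the current shell.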
## Data Preparation
## Training
### Preparation
you have to blah
### Training script
```bash
python train.py --config configs/eval/{a847 | pc459 | a150 | pc59 | pas20 | pas20b}.yaml
```
## Evaluation
```bash
python eval.py --config configs/eval/{a847 | pc459 | a150 | pc59 | pas20 | pas20b}.yaml
```
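The `{a847 | pc459 | ...}` notation in the command above stands for picking one config name; it is not shell brace expansion and will not run if pasted literally. To evaluate on every benchmark in turn, a loop along the following lines may help. This is a sketch under stated assumptions: `run_all_evals` and the `DRY_RUN` switch are hypothetical names introduced here, and `eval.py` is assumed to accept exactly the invocation shown above.

```bash
# Benchmark names copied from the config list in the command above.
benchmarks=(a847 pc459 a150 pc59 pas20 pas20b)

# run_all_evals and DRY_RUN are hypothetical names used only in this sketch.
# DRY_RUN defaults to 1, so the commands are printed rather than executed.
run_all_evals() {
  local name cmd
  for name in "${benchmarks[@]}"; do
    cmd=(python eval.py --config "configs/eval/${name}.yaml")
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "${cmd[*]}"          # show the command that would be run
    else
      "${cmd[@]}" || return 1   # stop at the first failing evaluation
    fi
  done
}

run_all_evals
```

Once the printed commands look right, setting `DRY_RUN=0` before calling `run_all_evals` runs the evaluations sequentially.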
## Citing CAT-Seg🐱 :pray:
```BibTeX
@article{liang2022open,
title={Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP},
author={Liang, Feng and Wu, Bichen and Dai, Xiaoliang and Li, Kunpeng and Zhao, Yinan and Zhang, Hang and Zhang, Peizhao and Vajda, Peter and Marculescu, Diana},
journal={arXiv preprint arXiv:2210.04150},
year={2022}
}
```