---
license: apache-2.0
datasets:
- detection-datasets/coco
language:
- en
library_name: diffusers
tags:
- pytorch
- controlnet
- image-colorization
- image-to-image
pipeline_tag: image-to-image
---

# Model Card for ColorizeNet

This model is a ControlNet trained to perform image colorization from black-and-white images.

## Model Details

### Model Description

ColorizeNet is an image colorization model based on ControlNet, trained on top of the pre-trained Stable Diffusion v2.1 model released by Stability AI.

- **Finetuned from model:** [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1)

### Model Sources

- **Repository:** [rensortino/ColorizeNet](https://github.com/rensortino/ColorizeNet)

## Usage

### Training Data

The model was trained on the [detection-datasets/coco](https://huggingface.co/datasets/detection-datasets/coco) dataset: every image was converted to grayscale and used as the conditioning input for the ControlNet.
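
For reference, a conditioning image can be produced with a plain grayscale conversion. The snippet below is a minimal sketch using OpenCV, with hypothetical file paths; the exact preprocessing used during training is defined in the ColorizeNet repository.

```python
import cv2

# Minimal sketch (hypothetical paths): turn a COCO image into a
# 3-channel grayscale image suitable for conditioning the ControlNet
image = cv2.imread("coco/train2017/000000000009.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)  # back to (H, W, 3)
cv2.imwrite("coco_grayscale/000000000009.jpg", gray)
```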

### Run the model

Instantiate the model and load its configuration and weights:

```python
import random

import cv2
import einops
import numpy as np
import torch
from pytorch_lightning import seed_everything

from utils.data import HWC3, apply_color, resize_image
from utils.ddim import DDIMSampler
from utils.model import create_model, load_state_dict

# Build the ControlNet model from its config and load the trained weights
model = create_model('./models/cldm_v21.yaml').cpu()
model.load_state_dict(load_state_dict(
    'lightning_logs/version_6/checkpoints/colorizenet-sd21.ckpt', location='cuda'))
model = model.cuda()

# Wrap the model in a DDIM sampler for inference
ddim_sampler = DDIMSampler(model)
```

Read the image to be colorized:

```python
# Load the grayscale image, ensure it has 3 channels (H, W, 3), and resize it
input_image = cv2.imread("sample_data/sample1_bw.jpg")
input_image = HWC3(input_image)
img = resize_image(input_image, resolution=512)
H, W, C = img.shape

# Normalize to [0, 1], batch the conditioning image, and reorder to (B, C, H, W)
num_samples = 1
control = torch.from_numpy(img.copy()).float().cuda() / 255.0
control = torch.stack([control for _ in range(num_samples)], dim=0)
control = einops.rearrange(control, 'b h w c -> b c h w').clone()
```

Prepare the inputs and the sampling parameters:

```python
# Fix the seed for reproducible results
seed = 1294574436
seed_everything(seed)

prompt = "Colorize this image"
n_prompt = ""
guess_mode = False
strength = 1.0    # weight of the ControlNet conditioning
eta = 0.0         # DDIM eta (0.0 = deterministic sampling)
ddim_steps = 20   # number of denoising steps
scale = 9.0       # classifier-free guidance scale

# Conditional and unconditional inputs for classifier-free guidance
cond = {"c_concat": [control], "c_crossattn": [
    model.get_learned_conditioning([prompt] * num_samples)]}
un_cond = {"c_concat": None if guess_mode else [control], "c_crossattn": [
    model.get_learned_conditioning([n_prompt] * num_samples)]}

# Latent shape: 4 channels at 1/8 of the image resolution
shape = (4, H // 8, W // 8)

# One conditioning scale per ControlNet block (decayed in guess mode)
model.control_scales = [strength * (0.825 ** float(12 - i)) for i in range(13)] if guess_mode else (
    [strength] * 13)
```

Sample and post-process the results:

```python
# Run DDIM sampling, conditioned on the grayscale image
samples, intermediates = ddim_sampler.sample(ddim_steps, num_samples,
                                             shape, cond, verbose=False, eta=eta,
                                             unconditional_guidance_scale=scale,
                                             unconditional_conditioning=un_cond)

# Decode the latents and map the images from [-1, 1] to [0, 255]
x_samples = model.decode_first_stage(samples)
x_samples = (einops.rearrange(x_samples, 'b c h w -> b h w c')
             * 127.5 + 127.5).cpu().numpy().clip(0, 255).astype(np.uint8)

# Transfer the predicted colors onto the original input image
results = [x_samples[i] for i in range(num_samples)]
colored_results = [apply_color(img, result) for result in results]
```
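
`colored_results` holds standard `uint8` arrays, so the outputs can be written straight to disk. A minimal sketch, assuming an OpenCV-compatible channel order and a hypothetical output path:

```python
# Minimal sketch: save each colorized sample (hypothetical output path)
for i, colored in enumerate(colored_results):
    cv2.imwrite(f"outputs/sample1_colorized_{i}.png", colored)
```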

## Results

BW Input             |  Colorized
:-------------------------:|:-------------------------:
![image](docs/sample1_bw.jpg) | ![image](docs/sample1.png)
![image](docs/sample2_bw.jpg) | ![image](docs/sample2.png)
![image](docs/sample3_bw.jpg) | ![image](docs/sample3.png)
![image](docs/sample4_bw.jpg) | ![image](docs/sample4.png)
![image](docs/sample5_bw.jpg) | ![image](docs/sample5.png)
![image](docs/sample6_bw.jpg) | ![image](docs/sample6.png)