DinoVd'eau is a fine-tuned version of facebook/dinov2-large. It achieves the following results on the test set:

Explained variance: 0.4014
Loss: 0.3578
MAE: 0.1288
MSE: 0.0378
R2: 0.4008
RMSE: 0.1943
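
The figures above are regression-style metrics. The card does not say which library computed them, so the following is only a minimal sketch of the usual definitions for a single output; the helper name `regression_metrics` and the toy values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, R2 and explained variance for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R2 compares the squared error with the spread of the targets around their mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot

    # Explained variance uses the variance of the errors, so a constant bias is not penalised.
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    explained_variance = 1.0 - var_err / (ss_tot / n)

    return {
        "mae": mae,
        "mse": mse,
        "rmse": rmse,
        "r2": r2,
        "explained_variance": explained_variance,
    }


# Toy values, not taken from the model's predictions.
metrics = regression_metrics([0.0, 1.0, 0.0, 1.0], [0.1, 0.8, 0.3, 0.6])
```

Under these definitions RMSE is the square root of MSE, which agrees with the reported values up to rounding: the square root of 0.0378 is about 0.1944. Explained variance is never lower than R2, which also matches the reported pair (0.4014 and 0.4008).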

Model description

DinoVd'eau is a model built on top of the DINOv2 model for underwater multilabel image classification. The classification head combines linear, ReLU, batch normalization, and dropout layers.

The source code for training the model can be found in this Git repository.

Developed by: lombardata, credits to César Leblanc and Victor Illien

Intended uses & limitations

You can use the raw model to classify diverse marine species, encompassing coral morphotype classes taken from the Global Coral Reef Monitoring Network (GCRMN), habitat classes, and seagrass species.
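
In a multilabel setting each class is scored independently, so one image can receive several labels at once. The sketch below shows one common way to turn raw scores into labels. It assumes the model returns one logit per class in the order of the class table in this card, and that a threshold of 0.5 is used; the card confirms neither, and `predict_labels` and the example logits are hypothetical.

```python
import math

# Class names copied from the class table in this card; the output order is an assumption.
CLASS_NAMES = [
    "Acropore_branched",
    "Acropore_digitised",
    "Acropore_tabular",
    "Algae",
    "Dead_coral",
    "Fish",
    "Millepore",
    "No_acropore_encrusting",
    "No_acropore_massive",
    "No_acropore_sub_massive",
    "Rock",
    "Rubble",
    "Sand",
]


def predict_labels(logits, threshold=0.5):
    """Apply a per-class sigmoid and keep every class whose probability reaches the threshold."""
    probabilities = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [name for name, p in zip(CLASS_NAMES, probabilities) if p >= threshold]


# Made-up logits for a single image, for illustration only.
example_logits = [-2.0, -1.5, -3.0, 1.2, 0.4, -0.7, -2.2, -1.0, -0.3, -1.8, 2.1, -0.1, 0.9]
labels = predict_labels(example_logits)
```

Because the sigmoid is applied per class rather than a softmax across classes, the probabilities need not sum to one, and the threshold can be tuned per class if some classes are rarer than others.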

Training and evaluation data

Details on the number of images for each class are given in the following table:

Class	train	val	test	Total
Acropore_branched	1956	651	652	3259
Acropore_digitised	1717	576	576	2869
Acropore_tabular	1105	384	379	1868
Algae	11092	3677	3674	18443
Dead_coral	5888	1952	1959	9799
Fish	3453	1157	1157	5767
Millepore	1760	690	693	3143
No_acropore_encrusting	2707	974	999	4680
No_acropore_massive	6487	2158	2167	10812
No_acropore_sub_massive	5015	1776	1776	8567
Rock	11176	3725	3725	18626
Rubble	10689	3563	3563	17815
Sand	11168	3723	3723	18614

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Number of Epochs: 100
Learning Rate: 0.001
Train Batch Size: 64
Eval Batch Size: 64
Optimizer: Adam
LR Scheduler Type: ReduceLROnPlateau with a patience of 5 epochs and a factor of 0.1
Freeze Encoder: Yes
Data Augmentation: Yes
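
The scheduler setting above means the learning rate is multiplied by 0.1 once the validation loss has failed to improve for more than 5 consecutive epochs. As a rough illustration, here is a simplified pure-Python version of that rule. It is not the training code: PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau` also applies an improvement threshold and an optional cooldown, both omitted here, and the function name and toy losses are hypothetical.

```python
def reduce_on_plateau(val_losses, initial_lr=0.001, factor=0.1, patience=5):
    """Return the learning rate in effect for each epoch under a simplified plateau rule."""
    lr = initial_lr
    best = float("inf")
    bad_epochs = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)  # rate used during this epoch
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        # Reduce only after more than `patience` epochs without improvement.
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
    return schedule


# Toy validation losses: two improvements, then six epochs without improvement.
toy_losses = [0.40, 0.38, 0.39, 0.39, 0.39, 0.39, 0.39, 0.39, 0.37]
schedule = reduce_on_plateau(toy_losses)
```

With these toy losses the rate stays at 0.001 for the first eight epochs and drops to roughly 0.0001 for the ninth. Repeated multiplication by 0.1 in floating point is also why very small rates can print with trailing digits rather than as exact powers of ten.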

Data Augmentation

Data were augmented using the following transformations:

Train Transforms

PreProcess: No additional parameters
Resize: probability=1.00
RandomHorizontalFlip: probability=0.25
RandomVerticalFlip: probability=0.25
ColorJiggle: probability=0.25
RandomPerspective: probability=0.25
Normalize: probability=1.00

Val Transforms

PreProcess: No additional parameters
Resize: probability=1.00
Normalize: probability=1.00

Training results

Epoch	Explained Variance	Validation Loss	MAE	MSE	R2	RMSE	Learning Rate
1	0.28	0.386	0.157	0.046	0.262	0.215	0.001
2	0.321	0.376	0.147	0.044	0.312	0.21	0.001
3	0.339	0.372	0.145	0.043	0.332	0.206	0.001
4	0.357	0.367	0.14	0.041	0.355	0.202	0.001
5	0.349	0.369	0.139	0.042	0.343	0.205	0.001
6	0.359	0.367	0.141	0.041	0.355	0.202	0.001
7	0.35	0.368	0.141	0.042	0.346	0.204	0.001
8	0.364	0.366	0.139	0.041	0.36	0.201	0.001
9	0.361	0.366	0.134	0.041	0.355	0.202	0.001
10	0.356	0.367	0.138	0.041	0.353	0.202	0.001
11	0.357	0.367	0.137	0.041	0.355	0.202	0.001
12	0.36	0.366	0.14	0.041	0.359	0.202	0.001
13	0.37	0.363	0.136	0.04	0.37	0.199	0.001
14	0.363	0.367	0.142	0.041	0.356	0.202	0.001
15	0.364	0.364	0.14	0.04	0.362	0.201	0.001
16	0.372	0.364	0.136	0.04	0.369	0.2	0.001
17	0.373	0.367	0.141	0.041	0.362	0.202	0.001
18	0.371	0.363	0.137	0.04	0.37	0.2	0.001
19	0.373	0.363	0.135	0.04	0.372	0.199	0.001
20	0.362	0.365	0.135	0.041	0.359	0.201	0.001
21	0.363	0.367	0.136	0.041	0.358	0.202	0.001
22	0.37	0.365	0.137	0.04	0.368	0.2	0.001
23	0.374	0.363	0.136	0.04	0.37	0.2	0.001
24	0.376	0.363	0.139	0.04	0.373	0.199	0.001
25	0.373	0.364	0.138	0.04	0.37	0.2	0.001
26	0.384	0.361	0.133	0.039	0.382	0.198	0.0001
27	0.388	0.36	0.135	0.039	0.386	0.197	0.0001
28	0.39	0.359	0.134	0.038	0.389	0.196	0.0001
29	0.391	0.36	0.135	0.038	0.389	0.196	0.0001
30	0.389	0.36	0.135	0.039	0.388	0.197	0.0001
31	0.392	0.359	0.132	0.038	0.391	0.196	0.0001
32	0.393	0.358	0.133	0.038	0.393	0.196	0.0001
33	0.395	0.358	0.131	0.038	0.395	0.195	0.0001
34	0.397	0.358	0.132	0.038	0.395	0.195	0.0001
35	0.395	0.358	0.132	0.038	0.395	0.195	0.0001
36	0.39	0.359	0.135	0.039	0.39	0.196	0.0001
37	0.397	0.358	0.131	0.038	0.397	0.195	0.0001
38	0.394	0.358	0.133	0.038	0.392	0.196	0.0001
39	0.397	0.358	0.131	0.038	0.396	0.195	0.0001
40	0.4	0.357	0.133	0.038	0.398	0.195	0.0001
41	0.399	0.358	0.132	0.038	0.396	0.195	0.0001
42	0.399	0.357	0.133	0.038	0.397	0.195	0.0001
43	0.402	0.357	0.133	0.038	0.401	0.194	0.0001
44	0.403	0.357	0.131	0.038	0.401	0.194	0.0001
45	0.403	0.357	0.132	0.038	0.402	0.194	0.0001
46	0.401	0.357	0.13	0.038	0.4	0.194	0.0001
47	0.4	0.357	0.129	0.038	0.397	0.195	0.0001
48	0.404	0.356	0.13	0.038	0.402	0.194	0.0001
49	0.402	0.357	0.131	0.038	0.401	0.194	0.0001
50	0.401	0.357	0.132	0.038	0.4	0.194	0.0001
51	0.402	0.358	0.134	0.038	0.396	0.195	0.0001
52	0.405	0.356	0.131	0.037	0.404	0.194	0.0001
53	0.405	0.357	0.131	0.038	0.403	0.194	0.0001
54	0.402	0.357	0.132	0.038	0.401	0.194	0.0001
55	0.405	0.356	0.129	0.038	0.403	0.194	0.0001
56	0.405	0.357	0.128	0.038	0.402	0.194	0.0001
57	0.405	0.356	0.129	0.038	0.403	0.194	0.0001
58	0.406	0.356	0.13	0.038	0.404	0.194	0.0001
59	0.406	0.356	0.129	0.037	0.405	0.194	1e-05
60	0.408	0.356	0.13	0.037	0.406	0.193	1e-05
61	0.407	0.355	0.13	0.037	0.407	0.193	1e-05
62	0.406	0.356	0.132	0.038	0.404	0.194	1e-05
63	0.409	0.356	0.129	0.037	0.408	0.193	1e-05
64	0.409	0.355	0.13	0.037	0.408	0.193	1e-05
65	0.406	0.356	0.131	0.038	0.405	0.194	1e-05
66	0.409	0.355	0.13	0.037	0.408	0.193	1e-05
67	0.408	0.355	0.13	0.037	0.408	0.193	1e-05
68	0.407	0.356	0.13	0.037	0.406	0.193	1e-05
69	0.409	0.355	0.13	0.037	0.408	0.193	1e-05
70	0.409	0.356	0.131	0.037	0.407	0.193	1e-05
71	0.407	0.356	0.13	0.037	0.407	0.193	1e-05
72	0.408	0.356	0.13	0.037	0.407	0.193	1e-05
73	0.409	0.355	0.13	0.037	0.408	0.193	1e-06
74	0.409	0.355	0.128	0.037	0.409	0.193	1e-06
75	0.406	0.356	0.13	0.037	0.405	0.194	1e-06
76	0.408	0.356	0.128	0.037	0.406	0.193	1e-06
77	0.405	0.356	0.132	0.038	0.404	0.194	1e-06
78	0.409	0.355	0.131	0.037	0.409	0.193	1e-06
79	0.402	0.357	0.131	0.038	0.4	0.195	1e-06
80	0.406	0.356	0.131	0.037	0.405	0.194	1e-06
81	0.409	0.356	0.131	0.037	0.408	0.193	1e-07
82	0.409	0.356	0.131	0.037	0.407	0.193	1e-07
83	0.41	0.356	0.13	0.037	0.407	0.193	1e-07
84	0.408	0.356	0.131	0.037	0.406	0.193	1e-07

CO2 Emissions

The estimated CO2 emissions for training this model are documented below:

Emissions: 0.22861184690098074 grams of CO2
Source: Code Carbon
Training Type: fine-tuning
Geographical Location: Brest, France
Hardware Used: NVIDIA Tesla V100 PCIe 32 GB

Framework Versions

Transformers: 4.41.1
Pytorch: 2.3.0+cu121
Datasets: 2.19.1
Tokenizers: 0.19.1
