dennisjooo committed on
Commit
9e21d9b
1 Parent(s): 170a155

Update README.md

  1. README.md +9 -5
README.md CHANGED
@@ -36,10 +36,13 @@ model-index:
36
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
37
  should probably proofread and complete it, then remove this comment. -->
38
 
39
- # emotion_classification
40
 
41
  This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k)
42
  on the [FastJobs/Visual_Emotional_Analysis](https://huggingface.co/datasets/FastJobs/Visual_Emotional_Analysis) dataset.
 
 
 
43
  It achieves the following results on the evaluation set:
44
  - Loss: 1.1031
45
  - Accuracy: 0.6312
@@ -49,19 +52,20 @@ It achieves the following results on the evaluation set:
49
  ## Model description
50
 
51
  The Vision Transformer base version trained on ImageNet-21K released by Google.
52
- Further details can be found on their [repo]((https://huggingface.co/google/vit-base-patch16-224-in21k))
53
 
54
  ## Training and evaluation data
55
 
56
  ### Data Split
57
 
58
- Used a 4:1 ratio for training and development sets and a seed of 42.
 
59
 
60
  ### Pre-processing Augmentation
61
 
62
  The main pre-processing phase for both training and evaluation includes:
63
- - Resizing to (224, 224, 3) because it uses ImageNet images to train the original model
64
- - Normalizing images using a mean and standard deviation of [0.5, 0.5, 0.5]
65
 
66
  Other than the aforementioned pre-processing, the training set was augmented using:
67
  - Random horizontal & vertical flip
 
36
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
37
  should probably proofread and complete it, then remove this comment. -->
38
 
39
+ # Emotion Classification
40
 
41
  This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k)
42
  on the [FastJobs/Visual_Emotional_Analysis](https://huggingface.co/datasets/FastJobs/Visual_Emotional_Analysis) dataset.
43
+
44
+ In theory, the accuracy for a random guess on this dataset is 0.1429.
45
+
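The 0.1429 figure matches 1/7, the expected accuracy of uniform random guessing over seven equally likely classes. A minimal sketch of that arithmetic, assuming seven balanced classes (the class count is inferred from the quoted figure and is not stated in this README):

```python
# Assumption: 7 balanced classes, inferred from the quoted 0.1429 baseline.
NUM_CLASSES = 7


def random_guess_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


baseline = random_guess_accuracy(NUM_CLASSES)
reported = 0.6312  # evaluation accuracy quoted in the model card

print(f"random-guess baseline: {baseline:.4f}")
print(f"reported accuracy is {reported / baseline:.2f}x the baseline")
```

If the classes are imbalanced, a majority-class predictor would be a stronger baseline than uniform guessing.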
46
  It achieves the following results on the evaluation set:
47
  - Loss: 1.1031
48
  - Accuracy: 0.6312
 
52
  ## Model description
53
 
54
  The Vision Transformer base version trained on ImageNet-21K released by Google.
55
+ Further details can be found on their [repo](https://huggingface.co/google/vit-base-patch16-224-in21k).
56
 
57
  ## Training and evaluation data
58
 
59
  ### Data Split
60
 
61
+ Used a 4:1 ratio for training and development sets and a random seed of 42.
62
+ Also used a seed of 42 for batching the data, completely unrelated lol.
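A 4:1 split with a fixed seed can be sketched as follows. This is only an illustration using the standard library: the README does not say which library performed the split, and `split_indices` and the sample count of 100 are hypothetical.

```python
import random


def split_indices(n_samples, train_parts=4, dev_parts=1, seed=42):
    """Shuffle sample indices with a fixed seed and split them train:dev."""
    rng = random.Random(seed)  # local RNG so the global state is untouched
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = n_samples * train_parts // (train_parts + dev_parts)
    return indices[:n_train], indices[n_train:]


# Hypothetical dataset size, chosen only to make the ratio easy to read.
train_idx, dev_idx = split_indices(100)
print(len(train_idx), len(dev_idx))
```

Calling the function again with the same seed returns the same partition, which is what makes the split reproducible.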
63
 
64
  ### Pre-processing Augmentation
65
 
66
  The main pre-processing phase for both training and evaluation includes:
67
+ - Bilinear interpolation to resize the image to (224, 224, 3) because it uses ImageNet images to train the original model
68
+ - Normalizing images using a mean and standard deviation of [0.5, 0.5, 0.5] just like the original model
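The normalization step above can be illustrated on a single pixel. This sketch assumes 8-bit pixel values are first rescaled to [0, 1] before the mean and standard deviation are applied; the resizing step is omitted, and `normalize_pixel` is a hypothetical helper rather than part of the original training code.

```python
MEAN = [0.5, 0.5, 0.5]  # per-channel mean from the model card
STD = [0.5, 0.5, 0.5]   # per-channel standard deviation from the model card


def normalize_pixel(rgb):
    """Rescale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    # Assumption: rescaling by 1/255 happens before normalization.
    return [(c / 255.0 - m) / s for c, m, s in zip(rgb, MEAN, STD)]


black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
print(black, white)
```

With a mean and standard deviation of 0.5, the [0, 1] range is mapped linearly onto [-1, 1].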
69
 
70
  Other than the aforementioned pre-processing, the training set was augmented using:
71
  - Random horizontal & vertical flip