raghavbali committed on
Commit
25093e1
1 Parent(s): b248c85

update model card

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -12,21 +12,22 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
- # gpt2-finetuned-headliner
+ # GPT2 Fine Tuned Headline Generator
 
- This model is a fine-tuned version of [openai-community/gpt2-medium](https://huggingface.co/openai-community/gpt2-medium) on an unknown dataset.
+ - This model is trained on the [harvard/abcnews-dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL) to generate news headlines.
+ - This model is a fine-tuned version of [openai-community/gpt2-medium](https://huggingface.co/openai-community/gpt2-medium) on an unknown dataset.
 
  ## Model description
 
- More information needed
+ The model is fine-tuned for 2 epochs on 4k training samples from the abcnews dataset. This enables the model to generate news-headline-like text given a simple prompt.
 
  ## Intended uses & limitations
 
- More information needed
+ This model is for learning purposes only. The model easily hallucinates names of people, locations, and other artifacts and incidents.
 
  ## Training and evaluation data
 
- More information needed
+ The model leverages 2k test samples for evaluation.
 
  ## Training procedure
 
@@ -43,7 +44,8 @@ The following hyperparameters were used during training:
  - num_epochs: 2
 
  ### Training results
-
+ The final output after 2 epochs is as follows:
+ TrainOutput(global_step=130, training_loss=5.044873604407678, metrics={'train_runtime': 140.587, 'train_samples_per_second': 59.166, 'train_steps_per_second': 0.925, 'total_flos': 248723096358912.0, 'train_loss': 5.044873604407678, 'epoch': 2.0})
 
 
  ### Framework versions