metadata
base_model: ybelkada/flan-t5-xl-sharded-bf16
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flanT5-xl-3.2
    results: []

flanT5-xl-3.2

This model is a fine-tuned version of ybelkada/flan-t5-xl-sharded-bf16. The training dataset was not recorded in this card. At the final checkpoint (epoch 10, step 3620) it achieves the following results on the evaluation set:

  • Loss: 0.6816
  • Rouge1: 32.8295
  • Rouge2: 24.633
  • Rougel: 29.5824
  • Rougelsum: 29.842
  • Gen Len: 10.9596
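
Since the card does not yet say how to run the model, here is a minimal loading sketch using the standard Transformers seq2seq classes. The repository id `devvanshhh/flanT5-xl-3.2`, the `generate_text` helper, the bfloat16 dtype, and the `max_new_tokens` value are all assumptions for illustration; the card does not document a prompt format, so the input is passed through as plain text.

```python
MODEL_ID = "devvanshhh/flanT5-xl-3.2"  # assumed Hub path; replace with the real one


def generate_text(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 32) -> str:
    """Load the checkpoint and generate one output string for `text` (hypothetical helper)."""
    # Imported here so the heavy dependencies are only needed when the function is called.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: bfloat16 mirrors the bf16 base checkpoint; drop the argument to load in float32.
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Your input text here")` downloads the weights on first use. The reported mean generation length of roughly 11 tokens suggests that short outputs are typical, so a small `max_new_tokens` is probably sufficient.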

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
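
To make the schedule concrete, the sketch below computes the learning rate that a linear schedule with warmup would produce at a given optimizer step. It is a plain-Python illustration, not the training code itself: the total of 3620 steps is taken from the training results table (10 epochs of 362 steps), and `linear_lr` is a hypothetical helper name.

```python
PEAK_LR = 3e-05       # learning_rate from the list above
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 3620    # assumption: 10 epochs x 362 steps, per the training results table


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        scale = step / max(1, WARMUP_STEPS)
    else:
        scale = max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return PEAK_LR * scale


for step in (0, 250, 500, 2060, 3620):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate climbs to its 3e-05 peak at step 500, partway through the second epoch, and then decays linearly to zero at the last step.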

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| No log        | 1.0   | 362  | 4.1881          | 14.6341 | 9.0532  | 12.5623 | 12.7062   | 15.8012 |
| 19.4437       | 2.0   | 724  | 0.8038          | 31.6983 | 24.0636 | 28.4549 | 28.672    | 10.6522 |
| 0.8703        | 3.0   | 1086 | 0.7598          | 32.6624 | 24.6635 | 29.339  | 29.5778   | 10.5311 |
| 0.8703        | 4.0   | 1448 | 0.7359          | 32.6045 | 24.52   | 29.2079 | 29.466    | 10.6304 |
| 0.7965        | 5.0   | 1810 | 0.7155          | 33.1775 | 25.1312 | 29.924  | 30.1659   | 10.5901 |
| 0.7601        | 6.0   | 2172 | 0.7023          | 32.5547 | 24.3195 | 29.2416 | 29.5173   | 10.9099 |
| 0.7475        | 7.0   | 2534 | 0.6923          | 33.0802 | 24.8653 | 29.769  | 30.0683   | 10.7640 |
| 0.7475        | 8.0   | 2896 | 0.6858          | 32.6578 | 24.333  | 29.3174 | 29.6478   | 11.0435 |
| 0.7287        | 9.0   | 3258 | 0.6827          | 32.9542 | 24.7132 | 29.6381 | 29.928    | 10.9193 |
| 0.7215        | 10.0  | 3620 | 0.6816          | 32.8295 | 24.633  | 29.5824 | 29.842    | 10.9596 |
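
The "No log" entry and the repeated values in the Training Loss column are probably a logging artifact rather than a training anomaly. The sketch below assumes the training loss was logged every 500 steps (the Trainer default; the card does not state `logging_steps`) and that each evaluation row shows the most recent logged value. `last_logged_step` is a hypothetical helper.

```python
LOGGING_STEPS = 500     # assumption: the Trainer default, not stated in this card
STEPS_PER_EPOCH = 362   # from the Step column
NUM_EPOCHS = 10


def last_logged_step(eval_step: int, logging_steps: int = LOGGING_STEPS):
    """Most recent logging step at or before `eval_step`, or None if nothing was logged yet."""
    logged = (eval_step // logging_steps) * logging_steps
    return logged if logged > 0 else None


for epoch in range(1, NUM_EPOCHS + 1):
    eval_step = epoch * STEPS_PER_EPOCH
    logged = last_logged_step(eval_step)
    shown = "No log" if logged is None else f"loss logged at step {logged}"
    print(f"epoch {epoch:>2} (step {eval_step}): {shown}")
```

Under that assumption, epochs 3 and 4 both fall back to the step-1000 log and epochs 7 and 8 to the step-2500 log, which matches the duplicated 0.8703 and 0.7475 entries. The large 19.4437 value would then be the loss logged at step 500, covering the early warmup phase.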

Framework versions

  • Transformers 4.36.0.dev0
  • PyTorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0