---
license: apache-2.0
base_model: google/flan-t5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flan_vary_merged_5800_1
  results: []
---

# flan_vary_merged_5800_1

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1597
- Rouge1: 66.8856
- Rouge2: 55.6869
- Rougel: 63.8241
- Rougelsum: 66.7005
- Gen Len: 16.3392

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 10

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 11.8095       | 0.35  | 200  | 0.5275          | 38.2792 | 29.3331 | 37.9276 | 38.1283   | 8.0624  |
| 0.4481        | 0.7   | 400  | 0.3046          | 64.4437 | 52.3632 | 62.0225 | 64.2515   | 16.4262 |
| 0.3616        | 1.05  | 600  | 0.2656          | 64.9871 | 53.1185 | 62.4919 | 64.739    | 16.4279 |
| 0.2944        | 1.41  | 800  | 0.2412          | 65.2117 | 53.5512 | 62.6779 | 64.9318   | 16.4464 |
| 0.264         | 1.76  | 1000 | 0.2295          | 65.5748 | 54.0948 | 62.9803 | 65.3339   | 16.3866 |
| 0.2571        | 2.11  | 1200 | 0.2223          | 65.7216 | 53.793  | 62.9877 | 65.491    | 16.1898 |
| 0.2364        | 2.46  | 1400 | 0.2164          | 65.5444 | 53.9296 | 62.9975 | 65.3055   | 16.3172 |
| 0.2293        | 2.81  | 1600 | 0.2029          | 65.7977 | 54.3067 | 63.1851 | 65.5544   | 16.1766 |
| 0.2129        | 3.16  | 1800 | 0.2006          | 65.8342 | 53.9105 | 63.163  | 65.6175   | 16.1757 |
| 0.2184        | 3.51  | 2000 | 0.1931          | 65.1608 | 53.7707 | 62.6719 | 64.9743   | 16.1547 |
| 0.1952        | 3.87  | 2200 | 0.1873          | 66.3361 | 54.8382 | 63.2054 | 66.0954   | 16.3155 |
| 0.1992        | 4.22  | 2400 | 0.1847          | 66.316  | 55.0379 | 63.5154 | 66.0694   | 16.3594 |
| 0.1873        | 4.57  | 2600 | 0.1811          | 66.4999 | 55.263  | 63.8319 | 66.2513   | 16.3146 |
| 0.1839        | 4.92  | 2800 | 0.1783          | 66.0055 | 54.3406 | 62.9554 | 65.7387   | 16.3304 |
| 0.1748        | 5.27  | 3000 | 0.1777          | 66.1592 | 54.8048 | 63.407  | 66.0067   | 16.3348 |
| 0.1844        | 5.62  | 3200 | 0.1736          | 66.7642 | 55.3404 | 63.7069 | 66.5324   | 16.2996 |
| 0.1745        | 5.98  | 3400 | 0.1698          | 66.3946 | 55.1716 | 63.5596 | 66.1663   | 16.3216 |
| 0.1739        | 6.33  | 3600 | 0.1678          | 66.4472 | 55.1785 | 63.602  | 66.2704   | 16.3049 |
| 0.1633        | 6.68  | 3800 | 0.1680          | 66.6666 | 55.4584 | 63.8058 | 66.4708   | 16.3445 |
| 0.1659        | 7.03  | 4000 | 0.1682          | 66.6592 | 55.3712 | 63.5841 | 66.4587   | 16.2953 |
| 0.1557        | 7.38  | 4200 | 0.1634          | 66.876  | 55.423  | 63.8431 | 66.5569   | 16.2434 |
| 0.158         | 7.73  | 4400 | 0.1622          | 66.6165 | 55.2948 | 63.5996 | 66.4314   | 16.3849 |
| 0.1647        | 8.08  | 4600 | 0.1622          | 66.7592 | 55.5552 | 63.7194 | 66.5229   | 16.2794 |
| 0.1579        | 8.44  | 4800 | 0.1614          | 66.7889 | 55.5768 | 63.8266 | 66.5511   | 16.3181 |
| 0.1526        | 8.79  | 5000 | 0.1610          | 66.7516 | 55.5383 | 63.6509 | 66.5754   | 16.261  |
| 0.1506        | 9.14  | 5200 | 0.1608          | 66.9266 | 55.6277 | 63.7712 | 66.6668   | 16.3445 |
| 0.1502        | 9.49  | 5400 | 0.1604          | 66.9759 | 55.6586 | 63.8856 | 66.7849   | 16.3251 |
| 0.158         | 9.84  | 5600 | 0.1597          | 66.8856 | 55.6869 | 63.8241 | 66.7005   | 16.3392 |

### Framework versions

- Transformers 4.34.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.14.0
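## How to use

Since this is a seq2seq fine-tune of `google/flan-t5-base` (ROUGE metrics and a ~16-token average generation length suggest a summarization-style task), inference follows the standard Transformers seq2seq pattern. A minimal sketch is below; note that the full Hub path of this checkpoint (e.g. a `username/flan_vary_merged_5800_1` prefix) is not stated in this card, so the base model id is used as a placeholder, and the `summarize:` prompt prefix is an assumption based on the metrics rather than a documented property of the training data.

```python
# Minimal inference sketch with the Transformers library.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder: replace with the actual Hub path of this fine-tuned checkpoint.
model_id = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The "summarize:" prefix is an assumption; adjust to match the training prompts.
text = "summarize: The quick brown fox jumped over the lazy dog near the riverbank."
inputs = tokenizer(text, return_tensors="pt")

# Average Gen Len during evaluation was ~16 tokens, so a small cap is reasonable.
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```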