---
license: llama2
base_model: meta-llama/Llama-2-7b-hf
tags:
  - generated_from_trainer
datasets:
  - tyzhu/lmind_nq_train6000_eval6489_v1_qa
metrics:
  - accuracy
model-index:
  - name: lmind_nq_train6000_eval6489_v1_qa_3e-5_lora2
    results:
      - task:
          name: Causal Language Modeling
          type: text-generation
        dataset:
          name: tyzhu/lmind_nq_train6000_eval6489_v1_qa
          type: tyzhu/lmind_nq_train6000_eval6489_v1_qa
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.5965641025641025
---

# lmind_nq_train6000_eval6489_v1_qa_3e-5_lora2

This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the [tyzhu/lmind_nq_train6000_eval6489_v1_qa](https://huggingface.co/datasets/tyzhu/lmind_nq_train6000_eval6489_v1_qa) dataset. It achieves the following results on the evaluation set:

- Loss: 2.4443
- Accuracy: 0.5966
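
Given the `lora2` suffix and the `generated_from_trainer` tag, this repo most likely holds a PEFT LoRA adapter rather than full model weights. A minimal loading sketch, assuming a standard PEFT adapter layout (the prompt format below is a guess, not taken from this card):

```python
# Sketch: load the adapter on top of the base model. Assumes this repo
# contains a standard PEFT LoRA adapter; the prompt format is a guess.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "tyzhu/lmind_nq_train6000_eval6489_v1_qa_3e-5_lora2"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Question: who wrote the pledge of allegiance?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```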

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
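
The dataset named in this card is public on the Hub and can be pulled directly for inspection; a minimal sketch (split and column names are assumptions, as the card does not document them):

```python
from datasets import load_dataset

# Sketch: pull the dataset named in this card for inspection.
# Split/column names are assumptions; the card does not document them.
ds = load_dataset("tyzhu/lmind_nq_train6000_eval6489_v1_qa")
print(ds)              # available splits and columns
print(ds["train"][0])  # one example, assuming a "train" split
```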

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 3e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 50.0
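
For reference, a sketch mapping the values above onto `transformers.TrainingArguments`: 2 per-device examples × 4 GPUs × 4 accumulation steps gives the listed total train batch size of 32. `output_dir` and anything not listed above are placeholders, not taken from the card:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="lmind_nq_train6000_eval6489_v1_qa_3e-5_lora2",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=2,  # x 4 GPUs x 4 accum steps = 32 total
    per_device_eval_batch_size=2,   # x 4 GPUs = 8 total
    gradient_accumulation_steps=4,
    seed=42,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,
    num_train_epochs=50.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```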

### Training results

| Training Loss | Epoch | Step | Accuracy | Validation Loss |
|:-------------:|:-----:|:----:|:--------:|:---------------:|
| 2.0369        | 1.0   | 187  | 0.6128   | 1.2953          |
| 1.2821        | 2.0   | 375  | 0.6146   | 1.2741          |
| 1.1987        | 3.0   | 562  | 0.6162   | 1.2715          |
| 1.066         | 4.0   | 750  | 0.6151   | 1.3011          |
| 0.9381        | 5.0   | 937  | 0.6126   | 1.3728          |
| 0.8238        | 6.0   | 1125 | 0.6091   | 1.4599          |
| 0.7289        | 7.0   | 1312 | 0.6064   | 1.5455          |
| 0.6559        | 8.0   | 1500 | 0.6026   | 1.6359          |
| 0.5733        | 9.0   | 1687 | 0.6006   | 1.7149          |
| 0.5336        | 10.0  | 1875 | 0.5989   | 1.8006          |
| 0.5116        | 11.0  | 2062 | 0.5982   | 1.8851          |
| 0.4934        | 12.0  | 2250 | 0.5982   | 1.9262          |
| 0.4823        | 13.0  | 2437 | 0.5974   | 1.9413          |
| 0.47          | 14.0  | 2625 | 0.5967   | 2.0121          |
| 0.4661        | 15.0  | 2812 | 0.5968   | 2.0250          |
| 0.462         | 16.0  | 3000 | 0.5990   | 1.9805          |
| 0.4357        | 17.0  | 3187 | 0.5976   | 2.0656          |
| 0.4348        | 18.0  | 3375 | 0.5979   | 2.0308          |
| 0.4331        | 19.0  | 3562 | 0.5990   | 2.0629          |
| 0.4341        | 20.0  | 3750 | 0.5983   | 2.0815          |
| 0.434         | 21.0  | 3937 | 0.5968   | 2.1253          |
| 0.4335        | 22.0  | 4125 | 0.5971   | 2.1789          |
| 0.4346        | 23.0  | 4312 | 0.5952   | 2.1455          |
| 0.4326        | 24.0  | 4500 | 0.5971   | 2.1990          |
| 0.4139        | 25.0  | 4687 | 0.5976   | 2.1890          |
| 0.4139        | 26.0  | 4875 | 0.5968   | 2.1939          |
| 0.4162        | 27.0  | 5062 | 0.5965   | 2.2190          |
| 0.4177        | 28.0  | 5250 | 0.5955   | 2.2781          |
| 0.4173        | 29.0  | 5437 | 0.5976   | 2.2681          |
| 0.4187        | 30.0  | 5625 | 0.5959   | 2.2996          |
| 0.4199        | 31.0  | 5812 | 0.5981   | 2.2395          |
| 0.4213        | 32.0  | 6000 | 0.5957   | 2.2991          |
| 0.4015        | 33.0  | 6187 | 0.5952   | 2.3223          |
| 0.4058        | 34.0  | 6375 | 0.5957   | 2.3266          |
| 0.4056        | 35.0  | 6562 | 0.5946   | 2.3779          |
| 0.4078        | 36.0  | 6750 | 0.5951   | 2.3453          |
| 0.4097        | 37.0  | 6937 | 0.5965   | 2.3379          |
| 0.4105        | 38.0  | 7125 | 0.5969   | 2.3624          |
| 0.4116        | 39.0  | 7312 | 0.5962   | 2.3846          |
| 0.4121        | 40.0  | 7500 | 0.5945   | 2.3748          |
| 0.3973        | 41.0  | 7687 | 0.5956   | 2.3797          |
| 0.3985        | 42.0  | 7875 | 0.5967   | 2.3599          |
| 0.4014        | 43.0  | 8062 | 0.5971   | 2.3475          |
| 0.4032        | 44.0  | 8250 | 0.5987   | 2.3937          |
| 0.4028        | 45.0  | 8437 | 0.5967   | 2.3863          |
| 0.4027        | 46.0  | 8625 | 0.5956   | 2.4195          |
| 0.4046        | 47.0  | 8812 | 0.5970   | 2.3832          |
| 0.4067        | 48.0  | 9000 | 0.5973   | 2.3805          |
| 0.3923        | 49.0  | 9187 | 0.5957   | 2.4460          |
| 0.3949        | 49.87 | 9350 | 0.5966   | 2.4443          |

### Framework versions

- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.14.1