
lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3_1e-4_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3 dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7443
  • Accuracy: 0.6446
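
For reference, if this loss is the mean token-level cross-entropy that the Hugging Face Trainer reports by default (an assumption; the card does not say how it was computed), it converts to perplexity via exp(loss):

```python
# Hedged: convert the reported evaluation loss to perplexity,
# assuming it is a mean token-level cross-entropy.
import math

eval_loss = 2.7443
print(math.exp(eval_loss))  # ~15.6
```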

Model description

More information needed. Going by the name alone, the `_lora2` suffix suggests this repository holds a LoRA (PEFT) adapter for meta-llama/Llama-2-7b-hf rather than full model weights, and the `1e-4` in the name matches the learning rate in the training configuration below.
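
If that reading is right, the adapter should load with the PEFT library. A minimal, unverified sketch (the repository contents are not confirmed, and access to the gated Llama-2 base model is required):

```python
# Hedged sketch: load this repo as a PEFT/LoRA adapter on top of Llama-2-7b.
# Assumes the repo contains adapter_config.json and adapter weights.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3_1e-4_lora2"

model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.float16,  # half precision to fit a single GPU
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# The prompt format is a guess; the card does not document how QA pairs
# were formatted during training.
prompt = "Question: who wrote the play Romeo and Juliet?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```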

Intended uses & limitations

More information needed

Training and evaluation data

More information needed. Judging from the dataset identifier, training used 6,000 examples and evaluation used 6,489 examples of a recite-only QA variant of Natural Questions (tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3).
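
A quick way to inspect the data, assuming the dataset is public on the Hub (split names and fields are not documented here):

```python
# Hedged sketch: load the dataset named in this card and print one example.
from datasets import load_dataset

ds = load_dataset("tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3")
print(ds)                         # available splits and row counts
first_split = list(ds.keys())[0]
print(ds[first_split][0])         # one raw example
```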

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
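
A hedged reconstruction of the corresponding TrainingArguments under Transformers 4.34.0. The output path is hypothetical, and the LoRA-specific settings (rank, alpha, target modules) are not documented in this card, so only the Trainer-level options are shown:

```python
# Hedged reconstruction of the training configuration from the list above.
# Effective batch size: 2 per device x 4 GPUs x 4 accumulation steps = 32.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lora2_output",  # hypothetical; the real path is not given
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=50.0,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,  # listed in the card; ignored by the plain "constant" scheduler
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```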

Training results

| Training Loss | Epoch | Step | Accuracy | Validation Loss |
|:-------------:|:-----:|:----:|:--------:|:---------------:|
| 1.3346 | 1.0 | 187 | 0.6671 | 1.2008 |
| 1.1638 | 2.0 | 375 | 0.6681 | 1.1940 |
| 1.0624 | 3.0 | 562 | 0.6676 | 1.2021 |
| 0.9457 | 4.0 | 750 | 0.6663 | 1.2319 |
| 0.8373 | 5.0 | 937 | 0.6639 | 1.2842 |
| 0.7159 | 6.0 | 1125 | 0.6607 | 1.3518 |
| 0.5964 | 7.0 | 1312 | 0.6582 | 1.4532 |
| 0.4861 | 8.0 | 1500 | 0.6549 | 1.5512 |
| 0.3754 | 9.0 | 1687 | 0.6529 | 1.6544 |
| 0.2938 | 10.0 | 1875 | 0.6505 | 1.7852 |
| 0.2268 | 11.0 | 2062 | 0.6490 | 1.9338 |
| 0.1792 | 12.0 | 2250 | 0.6479 | 2.0116 |
| 0.1418 | 13.0 | 2437 | 0.6470 | 2.1431 |
| 0.1171 | 14.0 | 2625 | 0.6447 | 2.2358 |
| 0.1038 | 15.0 | 2812 | 0.6461 | 2.3164 |
| 0.0958 | 16.0 | 3000 | 0.6452 | 2.3597 |
| 0.0848 | 17.0 | 3187 | 0.6453 | 2.4430 |
| 0.0804 | 18.0 | 3375 | 0.6441 | 2.4833 |
| 0.0786 | 19.0 | 3562 | 0.6439 | 2.4723 |
| 0.0786 | 20.0 | 3750 | 0.6437 | 2.5403 |
| 0.0792 | 21.0 | 3937 | 0.6441 | 2.4761 |
| 0.0792 | 22.0 | 4125 | 0.6447 | 2.5409 |
| 0.0781 | 23.0 | 4312 | 0.6449 | 2.5628 |
| 0.0766 | 24.0 | 4500 | 0.6446 | 2.5601 |
| 0.0709 | 25.0 | 4687 | 0.6453 | 2.5480 |
| 0.07 | 26.0 | 4875 | 0.6455 | 2.6145 |
| 0.0704 | 27.0 | 5062 | 0.6437 | 2.6258 |
| 0.073 | 28.0 | 5250 | 0.6449 | 2.5735 |
| 0.0738 | 29.0 | 5437 | 0.6441 | 2.6097 |
| 0.0727 | 30.0 | 5625 | 0.6427 | 2.5475 |
| 0.0727 | 31.0 | 5812 | 0.6435 | 2.6130 |
| 0.0715 | 32.0 | 6000 | 0.6441 | 2.6316 |
| 0.0679 | 33.0 | 6187 | 0.6442 | 2.5900 |
| 0.0684 | 34.0 | 6375 | 0.6445 | 2.6209 |
| 0.0676 | 35.0 | 6562 | 0.6452 | 2.6090 |
| 0.068 | 36.0 | 6750 | 0.6451 | 2.6729 |
| 0.0682 | 37.0 | 6937 | 0.6456 | 2.6381 |
| 0.0695 | 38.0 | 7125 | 0.6441 | 2.7113 |
| 0.07 | 39.0 | 7312 | 0.6438 | 2.6791 |
| 0.0709 | 40.0 | 7500 | 0.6444 | 2.6901 |
| 0.0662 | 41.0 | 7687 | 0.6455 | 2.6341 |
| 0.0664 | 42.0 | 7875 | 0.6451 | 2.7369 |
| 0.0658 | 43.0 | 8062 | 0.6452 | 2.6964 |
| 0.0677 | 44.0 | 8250 | 0.6442 | 2.6634 |
| 0.0668 | 45.0 | 8437 | 0.6436 | 2.7614 |
| 0.0657 | 46.0 | 8625 | 0.6446 | 2.7360 |
| 0.0656 | 47.0 | 8812 | 0.6441 | 2.7653 |
| 0.0658 | 48.0 | 9000 | 0.6453 | 2.7756 |
| 0.0626 | 49.0 | 9187 | 0.6464 | 2.7578 |
| 0.0666 | 49.87 | 9350 | 0.6446 | 2.7443 |

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1

Model tree for tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3_1e-4_lora2

Finetuned from meta-llama/Llama-2-7b-hf

Dataset used to train tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3_1e-4_lora2

  • tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3

Evaluation results

  • Accuracy on tyzhu/lmind_nq_train6000_eval6489_v1_reciteonly_qa_v3 (self-reported): 0.645