
lmind_nq_train6000_eval6489_v1_qa_3e-5_lora2

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on the tyzhu/lmind_nq_train6000_eval6489_v1_qa dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4443
  • Accuracy: 0.5966

Model description

More information needed

Intended uses & limitations

More information needed
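
Pending fuller documentation, the sketch below shows one plausible way to load the model for inference. It assumes the repository holds a PEFT LoRA adapter on top of meta-llama/Llama-2-7b-hf (implied by the "lora2" suffix but not confirmed by the card), and the QA prompt format shown is a guess.

```python
# Minimal inference sketch, assuming this repo contains a PEFT LoRA adapter
# for meta-llama/Llama-2-7b-hf (suggested by the model name, not confirmed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Attach the fine-tuned adapter weights from this repository.
model = PeftModel.from_pretrained(
    base, "tyzhu/lmind_nq_train6000_eval6489_v1_qa_3e-5_lora2"
)
model.eval()

# The prompt format is an assumption; the card does not document one.
prompt = "Question: who wrote the play Hamlet?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```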

Training and evaluation data

Per the summary above, the model was fine-tuned on the tyzhu/lmind_nq_train6000_eval6489_v1_qa dataset and evaluated on its held-out split; the dataset name suggests 6000 training and 6489 evaluation examples from a Natural Questions-style QA set.
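
For a quick look at the data, the splits can be loaded with the datasets library; the split names and columns are whatever the dataset repo defines (not documented on this card):

```python
# Inspect the dataset used for fine-tuning and evaluation.
from datasets import load_dataset

ds = load_dataset("tyzhu/lmind_nq_train6000_eval6489_v1_qa")
print(ds)  # shows available splits, column names, and example counts
```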

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 50.0
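
These values map onto transformers.TrainingArguments as in the sketch below. This is a reconstruction for reference, not the original training script, and the LoRA adapter configuration (rank, alpha, target modules) is not reported on this card.

```python
# Reconstruction sketch of the listed hyperparameters as TrainingArguments.
# Per-device batch size 2 x 4 GPUs x gradient accumulation 4 = total batch 32.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lmind_nq_train6000_eval6489_v1_qa_3e-5_lora2",  # hypothetical path
    learning_rate=3e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    seed=42,
    num_train_epochs=50.0,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,
    adam_beta1=0.9,               # Adam betas and epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch table below
)
```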

Training results

| Training Loss | Epoch | Step | Accuracy | Validation Loss |
|:-------------:|:-----:|:----:|:--------:|:---------------:|
| 2.0369 | 1.0 | 187 | 0.6128 | 1.2953 |
| 1.2821 | 2.0 | 375 | 0.6146 | 1.2741 |
| 1.1987 | 3.0 | 562 | 0.6162 | 1.2715 |
| 1.066 | 4.0 | 750 | 0.6151 | 1.3011 |
| 0.9381 | 5.0 | 937 | 0.6126 | 1.3728 |
| 0.8238 | 6.0 | 1125 | 0.6091 | 1.4599 |
| 0.7289 | 7.0 | 1312 | 0.6064 | 1.5455 |
| 0.6559 | 8.0 | 1500 | 0.6026 | 1.6359 |
| 0.5733 | 9.0 | 1687 | 0.6006 | 1.7149 |
| 0.5336 | 10.0 | 1875 | 0.5989 | 1.8006 |
| 0.5116 | 11.0 | 2062 | 0.5982 | 1.8851 |
| 0.4934 | 12.0 | 2250 | 0.5982 | 1.9262 |
| 0.4823 | 13.0 | 2437 | 0.5974 | 1.9413 |
| 0.47 | 14.0 | 2625 | 0.5967 | 2.0121 |
| 0.4661 | 15.0 | 2812 | 0.5968 | 2.0250 |
| 0.462 | 16.0 | 3000 | 0.5990 | 1.9805 |
| 0.4357 | 17.0 | 3187 | 0.5976 | 2.0656 |
| 0.4348 | 18.0 | 3375 | 0.5979 | 2.0308 |
| 0.4331 | 19.0 | 3562 | 0.5990 | 2.0629 |
| 0.4341 | 20.0 | 3750 | 0.5983 | 2.0815 |
| 0.434 | 21.0 | 3937 | 0.5968 | 2.1253 |
| 0.4335 | 22.0 | 4125 | 0.5971 | 2.1789 |
| 0.4346 | 23.0 | 4312 | 0.5952 | 2.1455 |
| 0.4326 | 24.0 | 4500 | 0.5971 | 2.1990 |
| 0.4139 | 25.0 | 4687 | 0.5976 | 2.1890 |
| 0.4139 | 26.0 | 4875 | 0.5968 | 2.1939 |
| 0.4162 | 27.0 | 5062 | 0.5965 | 2.2190 |
| 0.4177 | 28.0 | 5250 | 0.5955 | 2.2781 |
| 0.4173 | 29.0 | 5437 | 0.5976 | 2.2681 |
| 0.4187 | 30.0 | 5625 | 0.5959 | 2.2996 |
| 0.4199 | 31.0 | 5812 | 0.5981 | 2.2395 |
| 0.4213 | 32.0 | 6000 | 0.5957 | 2.2991 |
| 0.4015 | 33.0 | 6187 | 0.5952 | 2.3223 |
| 0.4058 | 34.0 | 6375 | 0.5957 | 2.3266 |
| 0.4056 | 35.0 | 6562 | 0.5946 | 2.3779 |
| 0.4078 | 36.0 | 6750 | 0.5951 | 2.3453 |
| 0.4097 | 37.0 | 6937 | 0.5965 | 2.3379 |
| 0.4105 | 38.0 | 7125 | 0.5969 | 2.3624 |
| 0.4116 | 39.0 | 7312 | 0.5962 | 2.3846 |
| 0.4121 | 40.0 | 7500 | 0.5945 | 2.3748 |
| 0.3973 | 41.0 | 7687 | 0.5956 | 2.3797 |
| 0.3985 | 42.0 | 7875 | 0.5967 | 2.3599 |
| 0.4014 | 43.0 | 8062 | 0.5971 | 2.3475 |
| 0.4032 | 44.0 | 8250 | 0.5987 | 2.3937 |
| 0.4028 | 45.0 | 8437 | 0.5967 | 2.3863 |
| 0.4027 | 46.0 | 8625 | 0.5956 | 2.4195 |
| 0.4046 | 47.0 | 8812 | 0.5970 | 2.3832 |
| 0.4067 | 48.0 | 9000 | 0.5973 | 2.3805 |
| 0.3923 | 49.0 | 9187 | 0.5957 | 2.4460 |
| 0.3949 | 49.87 | 9350 | 0.5966 | 2.4443 |

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.14.1

Evaluation results

  • Accuracy on tyzhu/lmind_nq_train6000_eval6489_v1_qa: 0.597 (self-reported)