license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.3
tags:
  - generated_from_trainer
model-index:
  - name: Mistral-7B-Instruct-v0.3-lora-commonsense
    results: []


Mistral-7B-Instruct-v0.3-lora-commonsense

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3. The training dataset is not documented in this card. It achieves the following result on the evaluation set:

  • Loss: 0.6863
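For readers who prefer perplexity, the evaluation loss can be converted with a single exponential. This is a minimal sketch, assuming the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math

# Evaluation loss reported above.
eval_loss = 0.6863

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the value comes out just under 2.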

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3
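
The relationship between these values can be sketched in a few lines of plain Python. The batch-size product reproduces the reported total of 128. The schedule function shows the usual linear-warmup-then-cosine-decay shape. Two values here are assumptions rather than facts from this card: `num_devices = 1` (inferred only because 16 × 8 already equals 128) and `total_steps = 4000` (a hypothetical placeholder, since the card does not state the total number of optimizer steps).

```python
import math

per_device_batch = 16   # train_batch_size
grad_accum = 8          # gradient_accumulation_steps
num_devices = 1         # assumption: not stated in the card

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_devices


def cosine_lr(step, base_lr=1e-4, warmup_steps=100, total_steps=4000):
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to base_lr, then half-cosine decay to 0.
    total_steps is a hypothetical value chosen for illustration.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for s in (0, 50, 100, 2050, 4000):
    print(s, cosine_lr(s))
```

The learning rate peaks at 0.0001 when warmup ends at step 100 and falls to half that value midway through the decay phase.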

Training results

Training Loss   Epoch    Step   Validation Loss
0.8738          0.1503    200   0.8158
0.8589          0.3006    400   0.7939
0.8589          0.4510    600   0.7800
0.8589          0.6013    800   0.7725
0.8305          0.7516   1000   0.7650
0.8331          0.9019   1200   0.7506
0.7808          1.0522   1400   0.7438
0.7781          1.2026   1600   0.7350
0.7647          1.3529   1800   0.7252
0.7651          1.5032   2000   0.7228
0.7522          1.6535   2200   0.7099
0.7587          1.8038   2400   0.6997
0.7383          1.9542   2600   0.6932
0.7071          2.1045   2800   0.6949
0.6919          2.2548   3000   0.6899
0.7136          2.4051   3200   0.6884
0.6912          2.5554   3400   0.6878
0.6889          2.7057   3600   0.6867
0.6862          2.8561   3800   0.6863
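
A few quantities can be read off this log with simple arithmetic. The sketch below copies the step, epoch, and validation-loss columns, then estimates the number of optimizer steps per epoch, finds the best checkpoint, and lists the evaluations where validation loss rose. The training-set size is only a rough estimate: it assumes every optimizer step consumed a full batch of 128 examples, which the card does not confirm.

```python
# (step, epoch, validation_loss) copied from the training results log.
rows = [
    (200, 0.1503, 0.8158), (400, 0.3006, 0.7939), (600, 0.4510, 0.7800),
    (800, 0.6013, 0.7725), (1000, 0.7516, 0.7650), (1200, 0.9019, 0.7506),
    (1400, 1.0522, 0.7438), (1600, 1.2026, 0.7350), (1800, 1.3529, 0.7252),
    (2000, 1.5032, 0.7228), (2200, 1.6535, 0.7099), (2400, 1.8038, 0.6997),
    (2600, 1.9542, 0.6932), (2800, 2.1045, 0.6949), (3000, 2.2548, 0.6899),
    (3200, 2.4051, 0.6884), (3400, 2.5554, 0.6878), (3600, 2.7057, 0.6867),
    (3800, 2.8561, 0.6863),
]

# Optimizer steps per epoch, estimated from the last logged row.
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

# Assumption: each step used the full total_train_batch_size of 128.
approx_train_examples = steps_per_epoch * 128

# Evaluation with the lowest validation loss.
best_step, _, best_val = min(rows, key=lambda r: r[2])

# Steps at which validation loss rose relative to the previous evaluation.
regressions = [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[2] > prev[2]]

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"approx. training examples ~ {approx_train_examples:,.0f}")
print(f"best validation loss {best_val} at step {best_step}")
print(f"validation loss rose at steps: {regressions}")
```

The lowest validation loss is the final logged one, which matches the evaluation loss reported at the top of the card.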

Framework versions

  • Transformers 4.42.3
  • PyTorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1