---
library_name: transformers
base_model: meta-llama/Llama-2-7b-chat-hf
language:
- vi
---

# Vietnamese Fine-tuned Llama-2-7b-chat-hf

This repository contains a Vietnamese-tuned version of the `Llama-2-7b-chat-hf` model, fine-tuned on Vietnamese data with LoRA (Low-Rank Adaptation).

## Model Details

This model is a fine-tuned version of Llama-2-7b-chat-hf, adapted for improved performance on Vietnamese language tasks. LoRA fine-tuning adapts the model to Vietnamese data efficiently while retaining much of the base model's general knowledge and capabilities.

### Model Description

- **Developed by:** [Daniel Du](https://github.com/danghoangnhan)
- **Model type:** Large Language Model
- **Language(s) (NLP):** Vietnamese
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)

### Direct Use

The repository holds a LoRA adapter, so it is loaded on top of the base model using the Hugging Face Transformers and PEFT libraries:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel, PeftConfig

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Load the LoRA configuration and attach the adapter weights to the base model
peft_model_id = "CallMeMrFern/Llama-2-7b-chat-hf_vn"
config = PeftConfig.from_pretrained(peft_model_id)
model = PeftModel.from_pretrained(base_model, peft_model_id)

# Load the tokenizer (the adapter reuses the base model's tokenizer)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Example usage; the prompt means "Hello, how is the weather today?"
input_text = "Xin chào, hôm nay thời tiết thế nào?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

- This model is fine-tuned specifically for Vietnamese and may not perform as well on other languages.
- The model inherits the limitations of the base Llama-2-7b-chat-hf model.
- Performance may vary depending on the specific task and domain.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

### Training Data

Dataset: `alpaca_translate_GPT_35_10_20k.json` (Vietnamese translation of the Alpaca dataset)

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

### Model Architecture and Objective

[More Information Needed]

## Citation

If you use this model in your research, please cite:

```
@misc{vietnamese_llama2_7b_chat,
  author = {[Your Name]},
  title = {Vietnamese Fine-tuned Llama-2-7b-chat-hf},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://huggingface.co/CallMeMrFern/Llama-2-7b-chat-hf_vn}}
}
```

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32

### Framework versions

- PEFT 0.6.3.dev0
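
As a rough illustration of why LoRA is parameter-efficient, the sketch below counts the weights an adapter adds when only `q_proj` and `v_proj` are targeted, as in this model. The layer dimensions are those of Llama-2-7b (hidden size 4096, 32 decoder layers). The rank is an assumption: the value used for this model is not recorded in this card, so `ASSUMED_RANK = 8` is only a placeholder, and `lora_param_count` is a helper name introduced here for illustration.

```python
# Back-of-the-envelope count of trainable LoRA parameters for Llama-2-7b.

HIDDEN_SIZE = 4096                      # Llama-2-7b hidden dimension
NUM_LAYERS = 32                         # Llama-2-7b decoder layers
TARGET_MODULES = ["q_proj", "v_proj"]   # target modules listed in this card


def lora_param_count(rank, d_in=HIDDEN_SIZE, d_out=HIDDEN_SIZE,
                     num_layers=NUM_LAYERS, num_modules=len(TARGET_MODULES)):
    """Return the number of weights LoRA adds for a given rank.

    Each adapted projection gets two low-rank matrices:
    A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    per_module = rank * d_in + d_out * rank
    return per_module * num_modules * num_layers


ASSUMED_RANK = 8  # assumption: the rank actually used is not stated
trainable = lora_param_count(ASSUMED_RANK)
print(f"rank {ASSUMED_RANK}: {trainable:,} trainable parameters")
```

Under that assumed rank the adapter holds about 4.2 million weights, a small fraction of the roughly 7 billion in the base model. Doubling the rank doubles the count.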
## Fine-tuning Details

- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **LoRA Config:**
  - Target Modules: `["q_proj", "v_proj"]`
  - Precision: 8-bit
- **Dataset:** `alpaca_translate_GPT_35_10_20k.json` (Vietnamese translation of the Alpaca dataset)

## Training Procedure

The model was fine-tuned using the following command:

```bash
python finetune/lora.py \
    --base_model meta-llama/Llama-2-7b-chat-hf \
    --model_type llama \
    --data_dir data/general/alpaca_translate_GPT_35_10_20k.json \
    --output_dir finetuned/meta-llama/Llama-2-7b-chat-hf \
    --lora_target_modules '["q_proj", "v_proj"]' \
    --micro_batch_size 1
```

For multi-GPU training, a distributed training approach was used.

## Evaluation Results

[Include any evaluation results, perplexity scores, or benchmark performances here]

## Acknowledgements

- This project is part of the TF07 Course offered by ProtonX.
- We thank the creators of the original Llama-2-7b-chat-hf model and the Hugging Face team for their tools and resources.
- Appreciation to [VietnamAIHub/Vietnamese_LLMs](https://github.com/VietnamAIHub/Vietnamese_LLMs) for the translated dataset.
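
For readers unfamiliar with the dataset format: the original Alpaca dataset stores each example as a JSON object with `instruction`, `input`, and `output` fields. The sketch below assumes the translated file keeps that schema, which is not confirmed in this card. The prompt layout in `build_prompt` is a hypothetical one modeled on the Alpaca style; the template actually used by `finetune/lora.py` is not documented here, and the sample record is made up for illustration.

```python
import json


def build_prompt(record):
    """Turn one Alpaca-style record into a (prompt, response) pair.

    Hypothetical layout for illustration only; not necessarily the
    template used to train this model.
    """
    instruction = record["instruction"].strip()
    # The input field is optional and is often empty or missing.
    context = (record.get("input") or "").strip()

    prompt = f"### Instruction:\n{instruction}\n\n"
    if context:
        prompt += f"### Input:\n{context}\n\n"
    prompt += "### Response:\n"
    return prompt, record["output"].strip()


# Illustrative record, not taken from the dataset.
# The instruction means "Translate the following sentence into English."
sample = json.loads(
    '{"instruction": "Dịch câu sau sang tiếng Anh.",'
    ' "input": "Xin chào", "output": "Hello"}'
)

prompt, response = build_prompt(sample)
print(prompt + response)
```

Records without an `input` value simply omit the `### Input:` block, so the same function covers both kinds of example.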