atsuki-yamaguchi committed on
Commit
af31c21
1 Parent(s): 3dab4d0

Upload README.md with huggingface_hub

Files changed (1):
  README.md +2 -2
README.md CHANGED
@@ -6,7 +6,7 @@ language:
 base_model: meta-llama/Llama-2-7b-hf
 library_name: transformers
 ---
-# Llama2 7B for Sinhala: 1000 target vocabulary size + Align target vocabulary initialization + T&B2LS/MTP/512 training
+# Llama2 7B for Sinhala: 1000 target vocabulary size + Align target vocabulary initialization + 2x2LS/MTP/512 training
 
 This model is built on top of Llama2 7B adapted for Sinhala using 30K target language sentences sampled from CC-100.
 
@@ -14,7 +14,7 @@ This model is built on top of Llama2 7B adapted for Sinhala using 30K target lan
 
 * **Vocabulary**: This model has an additional 1000 target vocabulary.
 * **Target vocabulary initialization**: The target weights of the embedding and LM head were initialized using Align initialization.
-* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the T&B2LS/MTP/512 strategies introduced in the paper.
+* **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the 2x2LS/MTP/512 strategies introduced in the paper.
 
 ## Model Description