---
license: apache-2.0
datasets:
- LooksJuicy/ruozhiba
metrics:
- accuracy
library_name: adapter-transformers
---

**Model Architecture** Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|
| Llama 3 | A new mix of publicly available online data. | 8B | 8k | Yes | 15T+ | March, 2023 |
| | | 70B | 8k | Yes | | December, 2023 |

Llama 3 family of models. Token counts refer to pretraining data only. Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability.
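
To make the inference-scalability point concrete, here is a rough sketch of how GQA shrinks the key/value cache: several query heads share one key/value head, so fewer key/value tensors need to be stored per token. The layer count, head counts, head dimension, and 16-bit cache precision below are illustrative assumptions that this card does not state, and "8k" is taken to mean 8192 tokens. The helper names `kv_cache_bytes` and `kv_head_for_query_head` are hypothetical.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


def kv_head_for_query_head(query_head, n_query_heads, n_kv_heads):
    """Index of the key/value head shared by the given query head."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


# Assumed, illustrative configuration (not stated in this card).
N_LAYERS, N_QUERY_HEADS, N_KV_HEADS, HEAD_DIM = 32, 32, 8, 128
SEQ_LEN = 8192  # assumption: "8k" context length read as 8192 tokens

# Multi-head attention keeps one key/value head per query head.
mha = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped-query attention keeps only N_KV_HEADS key/value heads.
gqa = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha / 2**30:.1f} GiB, GQA cache: {gqa / 2**30:.1f} GiB")
```

Under these assumptions the cache for one full-length sequence is four times smaller with GQA, because four query heads share each key/value head.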

---

sft 1700 llama3 test, 25 epochs