bashFish committed on
Commit a167acd
1 Parent(s): 4140f1b
adding checkpoint
- 00000370000_instruct_000000010000/tokenizer.pt +3 -0
- 00000370000_instruct_000000010000/tokenizer/tokenizer.pt +3 -0
- 00000370000_instruct_000000010000/tokenizer/tokenizer_config.yaml +3 -0
- 00000370000_instruct_000000010000/tokenizer_config.yaml +3 -0
- 00000370000_instruct_000000010000/transformer/transformer.pt +3 -0
- 00000370000_instruct_000000010000/transformer/transformer_config.yaml +3 -0
- README.md +7 -0
00000370000_instruct_000000010000/tokenizer.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fad7016d158905c099e607b52c707b4f019fed531d95c6235a201758f6e4ba05
+size 3718892766
00000370000_instruct_000000010000/tokenizer/tokenizer.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c7ae5ef18db0fc7b1168b6d95bbbe7d8b57a0dd0dd80cb772d8015a197673f4
+size 2607973210
00000370000_instruct_000000010000/tokenizer/tokenizer_config.yaml
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ee361247e7b839df69b83b6f42c4971f869304f8f877c991c270e1567abbd7c6
+size 384
00000370000_instruct_000000010000/tokenizer_config.yaml
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ee361247e7b839df69b83b6f42c4971f869304f8f877c991c270e1567abbd7c6
+size 384
00000370000_instruct_000000010000/transformer/transformer.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c378f7056e0b061ed4571fde57ef7524d138c7849ef0a185875e4fa899d6f69
+size 13413983204
00000370000_instruct_000000010000/transformer/transformer_config.yaml
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e8484ba14f7c37a7dc3ace6396e2bd824bd06e8e6d5ef6d732b83b1ef28e103
+size 483
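Each file added above is a Git LFS pointer (three `key value` lines: `version`, `oid`, `size`) rather than the binary itself. As a minimal sketch of how such a pointer can be read, the following parses the `transformer.pt` pointer from this commit. The helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer content copied from the transformer.pt diff in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4c378f7056e0b061ed4571fde57ef7524d138c7849ef0a185875e4fa899d6f69
size 13413983204
"""

info = parse_lfs_pointer(pointer)
print(f"{info['size'] / 1e9:.2f} GB")  # prints "13.41 GB"
```

The `size` field gives the byte count of the actual checkpoint, so the pointer diff is enough to see that the transformer weights are roughly 13.4 GB, while the two `tokenizer_config.yaml` pointers share the same `oid` and therefore refer to identical content.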
README.md
CHANGED
@@ -3,3 +3,10 @@ license: other
 license_name: open-aleph-license
 license_link: https://github.com/Aleph-Alpha/.github/blob/main/oal.pdf
 ---
+
+This model is to support our ongoing research [T-Free](https://github.com/Aleph-Alpha/trigrams).
+It is publicly available under the Open Aleph License, a license explicitly allowing for non-commercial research and educational use.
+
+The model was trained on 1 epoch of the [fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) dataset with a sequence length of 4k and 1k batch-size. It was further continued with a llama-style instruction finetuning.
+
+It has an embedding layer of dimension 32k, vocab population of 10 (activations per trigram) and partial lower case overlap of 2 (i.e. 2 of the 10 activations overlap with the trigrams lowercase counterpart). This model aggregates all activations by taking the sum only.
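The README's last paragraph describes a trigram-based embedding: 10 activations per trigram, 2 of them shared with the lowercase form, all aggregated by a sum. The sketch below illustrates that idea only and is not the T-Free implementation. The hash function (SHA-256 with a salt), the padding of words with spaces, the choice of which 2 activations are shared, and reading "32k" as 32,000 are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

VOCAB_SIZE = 32_000      # assumption: "32k" read as 32,000 embedding rows
ACTIVATIONS = 10         # activations per trigram (from the README)
LOWERCASE_OVERLAP = 2    # activations shared with the lowercase trigram


def _hash_indices(trigram: str, salts: range) -> list[int]:
    """Map a trigram to one row index per salt (hypothetical hashing scheme)."""
    indices = []
    for salt in salts:
        digest = hashlib.sha256(f"{salt}:{trigram}".encode("utf-8")).digest()
        indices.append(int.from_bytes(digest[:8], "big") % VOCAB_SIZE)
    return indices


def trigram_activations(trigram: str) -> list[int]:
    """Return 10 row indices; the first 2 depend only on the lowercase form."""
    shared = _hash_indices(trigram.lower(), range(LOWERCASE_OVERLAP))
    own = _hash_indices(trigram, range(LOWERCASE_OVERLAP, ACTIVATIONS))
    return shared + own


def word_trigrams(word: str) -> list[str]:
    """Split a word, padded with one space on each side, into trigrams."""
    padded = f" {word} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def word_activation_pattern(word: str) -> Counter:
    """Sum the activations of all trigrams of a word into index counts."""
    pattern = Counter()
    for trigram in word_trigrams(word):
        pattern.update(trigram_activations(trigram))
    return pattern


pattern = word_activation_pattern("Hello")
```

Under these assumptions, `"Hello"` and `"hello"` share the lowercase-derived indices of every trigram, so their patterns overlap partially. A model built this way would sum the embedding rows selected by the pattern to obtain the word's input vector.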