bashFish committed
Commit a167acd
Parent: 4140f1b

adding checkpoint

00000370000_instruct_000000010000/tokenizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fad7016d158905c099e607b52c707b4f019fed531d95c6235a201758f6e4ba05
+ size 3718892766
00000370000_instruct_000000010000/tokenizer/tokenizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9c7ae5ef18db0fc7b1168b6d95bbbe7d8b57a0dd0dd80cb772d8015a197673f4
+ size 2607973210
00000370000_instruct_000000010000/tokenizer/tokenizer_config.yaml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee361247e7b839df69b83b6f42c4971f869304f8f877c991c270e1567abbd7c6
+ size 384
00000370000_instruct_000000010000/tokenizer_config.yaml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee361247e7b839df69b83b6f42c4971f869304f8f877c991c270e1567abbd7c6
+ size 384
00000370000_instruct_000000010000/transformer/transformer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4c378f7056e0b061ed4571fde57ef7524d138c7849ef0a185875e4fa899d6f69
+ size 13413983204
00000370000_instruct_000000010000/transformer/transformer_config.yaml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e8484ba14f7c37a7dc3ace6396e2bd824bd06e8e6d5ef6d732b83b1ef28e103
+ size 483
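
The files above are Git LFS pointers; the `oid` and `size` fields identify the actual blobs, which are fetched on checkout. A minimal sketch of pulling and loading these checkpoints with `huggingface_hub` and PyTorch (the `repo_id` below is a placeholder assumption, not something this commit specifies):

```python
# Hypothetical sketch: resolve the LFS-tracked files and load them with PyTorch.
import torch
from huggingface_hub import hf_hub_download

ckpt_dir = "00000370000_instruct_000000010000"
repo_id = "Aleph-Alpha/tfree-research-model"  # assumption: substitute the real repo id

# hf_hub_download resolves the LFS pointer to the real blob and caches it locally.
transformer_path = hf_hub_download(
    repo_id=repo_id,
    filename=f"{ckpt_dir}/transformer/transformer.pt",
)
tokenizer_path = hf_hub_download(
    repo_id=repo_id,
    filename=f"{ckpt_dir}/tokenizer/tokenizer.pt",
)

# Load on CPU to avoid requiring a GPU; the transformer blob is ~13 GB,
# so expect correspondingly high RAM use.
transformer_state = torch.load(transformer_path, map_location="cpu")
tokenizer_state = torch.load(tokenizer_path, map_location="cpu")
print(type(transformer_state), type(tokenizer_state))
```
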
README.md CHANGED
@@ -3,3 +3,10 @@ license: other
  license_name: open-aleph-license
  license_link: https://github.com/Aleph-Alpha/.github/blob/main/oal.pdf
  ---
+
+ This model supports our ongoing research on [T-Free](https://github.com/Aleph-Alpha/trigrams).
+ It is publicly available under the Open Aleph License, which explicitly allows non-commercial research and educational use.
+
+ The model was trained for 1 epoch on the [fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) dataset with a sequence length of 4k and a batch size of 1k. It was then further trained with llama-style instruction finetuning.
+
+ It has an embedding layer of dimension 32k, a vocab population of 10 (activations per trigram), and a partial lowercase overlap of 2 (i.e., 2 of the 10 activations overlap with the trigram's lowercase counterpart). The model aggregates all activations by taking the sum only.
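
As a rough, non-authoritative illustration of the activation scheme described above (the hashing, seeding, trigram extraction, and hidden size here are assumptions for illustration; only the 32k table size, the population of 10, the lowercase overlap of 2, and the sum aggregation come from the description), a T-Free-style trigram embedding might look like:

```python
# Illustrative sketch of a T-Free-style trigram embedding with sum aggregation.
# Hash functions, seeds, and trigram extraction are assumptions for illustration.
import hashlib
import torch

VOCAB_SIZE = 32_768        # embedding table entries ("dimension 32k")
POPULATION = 10            # activations per trigram
LOWERCASE_OVERLAP = 2      # of the 10, computed on the lowercased trigram

embedding = torch.nn.Embedding(VOCAB_SIZE, 512)  # hidden size 512 is an assumption

def trigram_ids(word: str) -> list[int]:
    ids = []
    padded = f" {word} "
    for i in range(len(padded) - 2):
        tri = padded[i : i + 3]
        # First 8 hashes use the raw trigram, the last 2 its lowercase
        # counterpart, so cased and uncased forms share 2 of 10 activations.
        for seed in range(POPULATION):
            key = tri.lower() if seed >= POPULATION - LOWERCASE_OVERLAP else tri
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            ids.append(int.from_bytes(digest[:4], "little") % VOCAB_SIZE)
    return ids

def embed_word(word: str) -> torch.Tensor:
    ids = torch.tensor(trigram_ids(word))
    # Aggregate all activations by taking the sum only, as described above.
    return embedding(ids).sum(dim=0)

vec = embed_word("Hello")
print(vec.shape)  # torch.Size([512])
```
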