mnaylor committed
Commit 3a4ca8d
1 Parent(s): 8a61279

Update README.md

Files changed (1)
  1. README.md +13 -0
README.md CHANGED
@@ -1,3 +1,16 @@
 ---
 license: apache-2.0
+language:
+- en
+library_name: transformers
 ---
+
+# Mega Masked LM on wikitext-103
+
+This is the location on the Hugging Face hub for the Mega MLM checkpoint. I trained this model on the `wikitext-103` dataset with standard
+BERT-style masked LM pretraining in the [original Mega repository](https://github.com/facebookresearch/mega), and initially uploaded the
+weights to hf.co/mnaylor/mega-wikitext-103. Once the implementation of Mega in Hugging Face's `transformers` is finished, the weights here
+are designed to be used with `MegaForMaskedLM`, and they are compatible with the other (encoder-based) `MegaFor*` model classes.
+
+This model uses the RoBERTa base tokenizer, since the Mega paper does not implement a specific tokenizer aside from the character-level
+tokenizer used to illustrate long-sequence performance.
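
For reference, here is a minimal sketch of how the checkpoint is meant to be loaded once the `transformers` port is available. It assumes the hub repo id matches the path named above (`mnaylor/mega-wikitext-103`) and a `transformers` release that ships `MegaForMaskedLM`; the fill-mask prompt is illustrative only.

```python
# Minimal sketch, not verified usage from the commit itself: load the Mega MLM
# checkpoint together with the RoBERTa base tokenizer the README says it uses.
import torch
from transformers import AutoTokenizer, MegaForMaskedLM  # requires a release with the Mega port

# The README states the model reuses the RoBERTa base tokenizer.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
# Assumed repo id, taken from the hub path mentioned above.
model = MegaForMaskedLM.from_pretrained("mnaylor/mega-wikitext-103")

inputs = tokenizer("The capital of France is <mask>.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the highest-scoring token at the masked position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```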