elmadany committed on
Commit
e2c20ba
1 Parent(s): ad72751

Update README.md

Files changed (1)
  1. README.md +12 -0
README.md CHANGED
@@ -6,11 +6,23 @@
 
  In addition, we provide the three models in two architectures, small and base. For all models, we use a learning rate of 0.01, a batch size of 128 sequences, and a maximum sequence length of 512, except for AraT5-tweet, where the maximum sequence length is 128. We train the models for 1M steps using the original implementation of T5 in the TensorFlow framework. Training took ∼80 days on one Google Cloud TPU with 8 cores (v3-8) from the TensorFlow Research Cloud (TFRC).
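
For readers who want the stated pre-training settings in one place, here is a minimal sketch that records them as a small Python dataclass. The class and field names are hypothetical and chosen only for illustration; they are not the parameter names used by the T5 codebase, and only the numeric values come from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainingConfig:
    """Pre-training settings as stated in the README (field names are illustrative)."""

    learning_rate: float = 0.01
    batch_size: int = 128        # sequences per training step
    max_seq_length: int = 512    # maximum tokens per sequence
    train_steps: int = 1_000_000

    def max_tokens_per_step(self) -> int:
        # Upper bound only: assumes every sequence is filled to the maximum length.
        return self.batch_size * self.max_seq_length


# Settings shared by the models, and the AraT5-tweet variant with shorter sequences.
DEFAULT = PretrainingConfig()
TWEET = PretrainingConfig(max_seq_length=128)
```

Under these values, a full batch holds at most 128 × 512 = 65,536 token positions, or 128 × 128 = 16,384 for AraT5-tweet.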
 
+
  # How to use AraT5 models
  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1GFOGolWPIfDvYdSNdGFrOXwu3Gu28k2b?usp=sharing) This is an example of fine-tuning **AraT5-base** for News Title Generation on the Aranews dataset.
 
  For more details, please visit our [GitHub](https://github.com/UBC-NLP/araT5) repository.
 
+ # AraT5 Model Checkpoints
+
+ AraT5 PyTorch and TensorFlow checkpoints are available on Hugging Face for direct download and use **exclusively for research**. For commercial use, please contact the authors by email (muhammad.mageed[at]ubc[dot]ca).
+
+ | **Model** | **Link** |
+ |---------|:------------------:|
+ | **AraT5-base** | [https://huggingface.co/UBC-NLP/AraT5-base](https://huggingface.co/UBC-NLP/AraT5-base) |
+ | **AraT5-msa-base** | [https://huggingface.co/UBC-NLP/AraT5-msa-base](https://huggingface.co/UBC-NLP/AraT5-msa-base) |
+ | **AraT5-tweet-base** | [https://huggingface.co/UBC-NLP/AraT5-tweet-base](https://huggingface.co/UBC-NLP/AraT5-tweet-base) |
+ | **AraT5-msa-small** | [https://huggingface.co/UBC-NLP/AraT5-msa-small](https://huggingface.co/UBC-NLP/AraT5-msa-small) |
+ | **AraT5-tweet-small** | [https://huggingface.co/UBC-NLP/AraT5-tweet-small](https://huggingface.co/UBC-NLP/AraT5-tweet-small) |
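
As a rough illustration of how the checkpoints in the table might be loaded, the sketch below maps each model name to its Hub repository ID and loads it with the `transformers` Auto classes. The helper names `checkpoint_id` and `load_checkpoint` are hypothetical. It is an assumption that `AutoTokenizer` and `AutoModelForSeq2SeqLM` resolve these repositories, and the T5 tokenizer typically also needs the `sentencepiece` package installed.

```python
HUB_ORG = "UBC-NLP"

# Model names exactly as listed in the table above.
CHECKPOINTS = (
    "AraT5-base",
    "AraT5-msa-base",
    "AraT5-tweet-base",
    "AraT5-msa-small",
    "AraT5-tweet-small",
)


def checkpoint_id(name: str) -> str:
    """Return the Hub repository ID for a listed checkpoint name."""
    if name not in CHECKPOINTS:
        raise ValueError(f"Unknown checkpoint {name!r}; expected one of {CHECKPOINTS}")
    return f"{HUB_ORG}/{name}"


def load_checkpoint(name: str):
    """Download and return (tokenizer, model) for a listed checkpoint.

    Requires network access and the `transformers` package.
    """
    # Imported lazily so checkpoint_id() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    repo = checkpoint_id(name)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForSeq2SeqLM.from_pretrained(repo)
    return tokenizer, model
```

For example, `tokenizer, model = load_checkpoint("AraT5-base")` would fetch the base model. The checkpoints are pre-trained only, so they would normally be fine-tuned on a downstream task, as in the Colab example, before the generated output is useful.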
 
  # BibTex