---
language: en
license: apache-2.0
datasets:
- bookcorpus
- wikipedia
tags:
- fill-mask
library_name: transformers
---

This model was derived from the `bert-base-uncased` checkpoint by replacing the GELU activation function with ReLU and then continuing pre-training so that the weights adapt to the new activation.
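
A minimal usage sketch with the `fill-mask` pipeline; the model identifier below is a placeholder for this repository's actual path:

```python
from transformers import pipeline

# Placeholder identifier: substitute this repository's model path.
unmasker = pipeline("fill-mask", model="<this-repo>/bert-base-uncased-relu")
print(unmasker("Paris is the [MASK] of France."))
```

The activation swap itself can be sketched through the `hidden_act` field of the model config, assuming the standard `transformers` API (an illustration, not the exact training code used here):

```python
from transformers import BertConfig, BertForMaskedLM

# Override the feed-forward activation from GELU to ReLU before
# continuing masked-language-model pre-training.
config = BertConfig.from_pretrained("bert-base-uncased", hidden_act="relu")
model = BertForMaskedLM.from_pretrained("bert-base-uncased", config=config)
```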