smajumdar94 committed on
Commit 13f33b0
1 Parent(s): e502137

Update README.md

Files changed (1)
  1. README.md +9 -7
README.md CHANGED
@@ -1,7 +1,4 @@
 ---
-
-
-
 language:
 - be
 library_name: nemo
@@ -78,8 +75,9 @@ asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("nvidia/stt_be_con
 ```
 
 ### Transcribing using Python
-```
+
 Simply do:
+
 ```
 asr_model.transcribe(['sample.wav'])
 ```
@@ -88,7 +86,7 @@ asr_model.transcribe(['sample.wav'])
 
 ```shell
 python [NEMO_GIT_FOLDER]/examples/asr/transcribe_speech.py
-pretrained_name="nvidia/stt_en_conformer_ctc_large"
+pretrained_name="nvidia/stt_be_conformer_ctc_large"
 audio_dir="<DIRECTORY CONTAINING AUDIO FILES>"
 ```
 
@@ -120,11 +118,15 @@ All the models in this collection are trained on a composite dataset (NeMo ASRSE
 
 ## Performance
 
-Performances of the ASR models are reported in terms of Word Error Rate (WER%) with greedy decoding. WER on dev is 4.8%
+Performances of the ASR models are reported in terms of Word Error Rate (WER%) with greedy decoding.
+
+| Version | Tokenizer | Vocabulary Size | MCV 10 Test | Train Dataset |
+|---------|----------------------|-----------------|-------------|---------------|
+| 1.12.0 | Google Sentencepiece | 1024 | 4.8 | MCV 10 |
 
 ## Limitations
 
-Since all models are trained on just MCV-10 dataset, the performance of this model might degrade for speech which includes technical terms, or vernacular that the model has not been trained on. The model might also perform worse for accented speech.
+Since all models are trained on just academic datasets, the performance of this model might degrade for speech which includes technical terms, or vernacular that the model has not been trained on. The model might also perform worse for accented speech.
 
 ## Deployment with NVIDIA Riva
 
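Note on the Performance change: the README reports Word Error Rate (WER%) with greedy decoding. For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The snippet below is a minimal sketch of that definition for a single utterance. The function name `word_error_rate_percent` is hypothetical, whitespace tokenization is an assumption, and this is not NeMo's own metric code. Benchmark figures such as the one in the table are normally aggregated over a whole test set rather than computed per utterance.

```python
def word_error_rate_percent(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one reference/hypothesis pair.

    Hypothetical helper for illustration only; assumes words are
    separated by whitespace and compares them case-sensitively.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words (Levenshtein, row by row).
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref_words)
```

For example, comparing the reference `"a b c d"` with the hypothesis `"a x c"` involves one substitution and one deletion over four reference words, so the function returns 50.0.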