pankajmathur committed on
Commit e987f7e
1 Parent(s): 0ad4395

Update README.md

Files changed (1)
  1. README.md +11 -8
README.md CHANGED
@@ -22,14 +22,17 @@ We evaluated model_009 on a wide range of tasks using [Language Model Evaluation
 
  Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
- |||||
- |:------:|:--------:|:-------:|:--------:|
- |**Task**|**Metric**|**Value**|**Stderr**|
- |*arc_challenge*|acc_norm|0.6843|0.0141|
- |*hellaswag*|acc_norm|0.8671|0.0038|
- |*mmlu*|acc_norm|0.6931|0.0351|
- |*truthfulqa_mc*|mc2|0.5718|0.0157|
- |**Total Average**|-|**0.7041**||
+ |||
+ |:------:|:-------:|
+ |**Task**|**Value**|
+ |*ARC*|0.7159|
+ |*HellaSwag*|0.8771|
+ |*MMLU*|0.6943|
+ |*TruthfulQA*|0.6072|
+ |*Winogrande*|0.8232|
+ |*GSM8k*|0.3942|
+ |*DROP*|0.4401|
+ |**Total Average**|**0.6503**|
 
 
  ## Example Usage
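
The **Total Average** rows on both sides of this diff are consistent with an unweighted mean of the listed task scores. The README does not say how the average was computed, so the sketch below assumes a plain arithmetic mean rounded to four decimals. The dictionary names and the `total_average` helper are hypothetical and exist only for this check.

```python
from statistics import mean

# Scores copied from the rows removed by this commit.
old_scores = {
    "arc_challenge": 0.6843,
    "hellaswag": 0.8671,
    "mmlu": 0.6931,
    "truthfulqa_mc": 0.5718,
}

# Scores copied from the rows added by this commit.
new_scores = {
    "ARC": 0.7159,
    "HellaSwag": 0.8771,
    "MMLU": 0.6943,
    "TruthfulQA": 0.6072,
    "Winogrande": 0.8232,
    "GSM8k": 0.3942,
    "DROP": 0.4401,
}


def total_average(scores):
    """Unweighted mean of the task scores, rounded to 4 decimals (assumed method)."""
    return round(mean(scores.values()), 4)


print("old:", total_average(old_scores))
print("new:", total_average(new_scores))
```

Under that assumption both computed values agree with the totals in the tables (0.7041 before, 0.6503 after). The drop in the average comes from adding three more benchmarks, not from lower scores on the original four.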