YC-Chen committed
Commit 79e6b14
1 Parent(s): 8971fda

Update app.py

Files changed (1)
  1. app.py +25 -0
app.py CHANGED
@@ -7,6 +7,31 @@ from transformers import AutoTokenizer
 
 
 DESCRIPTION = """
+ # Demo: Breeze-7B-Instruct-v0.1
+
+ Breeze-7B is a language model family that builds on top of [Mistral-7B](https://huggingface.co/mistralai/Mistral-7B-v0.1), specifically intended for Traditional Chinese use.
+
+ [Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v0.1) is the base model for the Breeze-7B series.
+ It is suitable for use if you have substantial fine-tuning data to tune it for your specific use case.
+
+ [Breeze-7B-Instruct](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v0.1) derives from the base model Breeze-7B-Base, making the resulting model suitable for use as-is on commonly seen tasks.
+
+ [Breeze-7B-Instruct-64k](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-64k-v0.1) is a slightly modified version of
+ Breeze-7B-Instruct that enables a 64k-token context length. Roughly speaking, that is equivalent to 88k Traditional Chinese characters.
+
+ The current release version of Breeze-7B is v0.1.
+
+ Practicality-wise:
+ - Breeze-7B-Base expands the original vocabulary with an additional 30,000 Traditional Chinese tokens. With the expanded vocabulary, everything else being equal, Breeze-7B operates at twice the inference speed for Traditional Chinese compared to Mistral-7B and Llama 7B. [See [Inference Performance](#inference-performance).]
+ - Breeze-7B-Instruct can be used as-is for common tasks such as Q&A, RAG, multi-round chat, and summarization.
+ - In particular, Breeze-7B-Instruct-64k can perform tasks at the document level rather than the chapter level.
+
+
+ Performance-wise:
+ - Breeze-7B-Instruct demonstrates impressive performance in benchmarks for Traditional Chinese and English when compared to similarly sized open-source contemporaries such as Taiwan-LLM-7B/13B-chat, QWen-7B-Chat, and Yi-6B-Chat. [See [Chat Model Performance](#chat-model-performance).]
+
33
+
34
+ *A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*
35
 
36
  """
37
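
The first practicality bullet attributes the claimed 2x Traditional Chinese inference speed to the expanded vocabulary: the same text tokenizes into roughly half as many tokens, so generation needs roughly half as many decoding steps. A minimal sketch of how that could be checked with the `transformers` tokenizers; the sample sentence is an arbitrary example, and the exact ratio will vary with the input text:

```python
from transformers import AutoTokenizer

# Tokenizers for Breeze-7B-Instruct and the original Mistral-7B vocabulary.
breeze_tok = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
mistral_tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Any Traditional Chinese sample works; this sentence is only an illustration.
text = "人工智慧正在改變我們的生活與工作方式。"

breeze_len = len(breeze_tok.tokenize(text))
mistral_len = len(mistral_tok.tokenize(text))

# Fewer tokens for the same text means fewer decoding steps at generation time,
# which is where the claimed ~2x speed-up for Traditional Chinese comes from.
print(f"Breeze tokens: {breeze_len}, Mistral tokens: {mistral_len}")
```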
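The description also says Breeze-7B-Instruct is usable as-is for Q&A and multi-round chat, and the hunk context shows `app.py` already importing `AutoTokenizer` from `transformers`. A minimal standalone sketch of one chat turn with standard `transformers` generation, assuming the checkpoint ships a chat template usable via `apply_chat_template`; the prompt, dtype, and generation settings are illustrative assumptions, not taken from the demo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MediaTek-Research/Breeze-7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: a GPU with bf16 support is available
    device_map="auto",           # requires the `accelerate` package
)

# One chat turn; the question ("Please briefly introduce Taiwan's night market
# culture.") is only an example.
messages = [{"role": "user", "content": "請簡單介紹台灣的夜市文化。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```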