gurgutan committed
Commit fceeee4
1 Parent(s): 979bb8b

Update README.md

Files changed (1)
  1. README.md +1 -19
README.md CHANGED
@@ -1,53 +1,35 @@
 ---
 license: mit
 language:
-- ru
 - en
+- ru
-library_name: transformers
----
-
 tags:
 - gpt3
 - transformers
 ---
 
---
license: mit
language:
- en
- ru
tags:
- gpt3
- transformers
---
# ruGPT-13B-4bit
 
These are GPTQ model files for Sberbank's [ruGPT-3.5-13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B) model.
 
## Technical details
 
The model was quantized to 4-bit with GPTQ.
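
This card does not document the exact quantization settings. As a rough sketch, a 4-bit AutoGPTQ setup generally looks like the code below; only `bits=4` is confirmed by this card, while `group_size` and `desc_act` are common defaults assumed here, not the values used for this repo:

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

# Hypothetical 4-bit GPTQ configuration: bits=4 matches this repo,
# group_size and desc_act are assumptions (library defaults)
quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False,
)

# Quantizing from the full-precision model would then look roughly like:
# model = AutoGPTQForCausalLM.from_pretrained("ai-forever/ruGPT-3.5-13B", quantize_config)
# model.quantize(calibration_examples)  # a small set of tokenized texts
# model.save_quantized("ruGPT-13B-4bit", use_safetensors=True)
```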
 
## Examples of usage
 
First make sure you have AutoGPTQ installed:

GITHUB_ACTIONS=true pip install auto-gptq
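
A quick import check (not from the original card) confirms the install succeeded:

```python
# Smoke test: the import only succeeds if auto-gptq installed correctly
import auto_gptq
from importlib.metadata import version
print(version("auto-gptq"))
```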
 
Then try the following example code:
 
```python
from transformers import AutoTokenizer, TextGenerationPipeline
from auto_gptq import AutoGPTQForCausalLM

repo_name = "gurgutan/ruGPT-13B-4bit"

# Load the tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(repo_name, use_fast=True)

# Download the quantized model from the Hugging Face Hub and load it on the first GPU
model = AutoGPTQForCausalLM.from_quantized(repo_name, device="cuda:0", use_safetensors=True, use_triton=False)

# Inference with model.generate
request = "Буря мглою небо кроет"  # opening line of Pushkin's "Winter Evening"
print(tokenizer.decode(model.generate(**tokenizer(request, return_tensors="pt").to(model.device))[0]))

# Or use the text-generation pipeline
pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
print(pipeline(request)[0]["generated_text"])
```
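
The `generate` call above runs with the library defaults, which return only a short continuation. Standard `transformers` generation arguments can be passed through; the snippet below continues the example above, and the parameter values are illustrative, not recommendations from the model author:

```python
# Sampled generation with an explicit length budget (illustrative values)
inputs = tokenizer(request, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,  # the default budget is much shorter
    do_sample=True,     # sample instead of greedy decoding
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```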
 
# Original model: [ruGPT-3.5 13B](https://huggingface.co/ai-forever/ruGPT-3.5-13B)
 
A language model for Russian. The model has 13B parameters, as you can guess from its name. This is our biggest model so far, and it was used to train GigaChat (read more about it in the [article](https://habr.com/ru/companies/sberbank/articles/730108/)).