alokabhishek committed
Commit 079c4af
1 Parent(s): c92c340

Updated Readme

Files changed (1):
  1. README.md +92 -13
README.md CHANGED
@@ -1,21 +1,100 @@
  ---
  library_name: transformers
- widget:
- - messages:
-   - role: user
-     content: How does the brain work?
- inference:
-   parameters:
-     max_new_tokens: 200
- extra_gated_heading: Access Gemma on Hugging Face
- extra_gated_prompt: >-
-   To access Gemma on Hugging Face, you’re required to review and agree to
-   Google’s usage license. To do this, please ensure you’re logged-in to Hugging
-   Face and click below. Requests are processed immediately.
- extra_gated_button_content: Acknowledge license
  license: gemma
+ pipeline_tag: text-generation
+ tags:
+ - GGUF
+ - quantized
+ - Q4_K_M
+ - Q5_K_M
+ - 4bit
+ - 5bit
+ - Gemma
+ - Gemma-7B
+ - Gemma-1.1
+ - Gemma-1.1-7b
+ - Google
  ---
 
+ 
+ # Model Card for alokabhishek/gemma-1.1-7b-it-GGUF
+ 
+ <!-- Provide a quick summary of what the model is/does. -->
+ This repo contains GGUF quantized versions of Google's gemma-1.1-7b-it model, produced with llama.cpp.
+ 
+ 
+ ## Model Details
+ 
+ - Model creator: [Google](https://huggingface.co/google)
+ - Original model: [gemma-1.1-7b-it](https://huggingface.co/google/gemma-1.1-7b-it)
+ 
+ 
+ ### About GGUF quantization using llama.cpp
+ 
+ - llama.cpp GitHub repo: [ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp)
+ - llama-cpp-python GitHub repo: [abetlen/llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
+ 
+ 
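+ Before picking a quantization level, you can check which GGUF files this repo actually ships. This is a minimal sketch using huggingface_hub's `list_repo_files`; the file names in the comment are assumptions, so rely on the printed output.
+ 
+ ```python
+ from huggingface_hub import list_repo_files
+ 
+ # List every file in the repo and keep only the GGUF quantizations.
+ files = list_repo_files("alokabhishek/gemma-1.1-7b-it-GGUF")
+ gguf_files = [f for f in files if f.endswith(".gguf")]
+ print(gguf_files)  # expected to include Q4_K_M and Q5_K_M variants
+ ```
+ 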
+ # How to Get Started with the Model
+ 
+ Use the code below to get started with the model.
+ 
+ 
+ ## How to run from Python code
+ 
+ #### First install the packages
+ ```shell
+ # llama-cpp-python provides the llama_cpp module imported below (CPU build):
+ pip install llama-cpp-python
+ # For a CUDA-accelerated build, install with the CUDA backend enabled
+ # (check the llama-cpp-python README for the flags matching your version):
+ # CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
+ pip install -U sentence-transformers
+ pip install transformers huggingface_hub torch
+ ```
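+ 
+ To verify that the bindings installed correctly, a trivial check is to import the module and print its version:
+ 
+ ```python
+ import llama_cpp
+ 
+ # If this import succeeds, the llama-cpp-python build is usable.
+ print(llama_cpp.__version__)
+ ```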
+ 
+ # Import
+ 
+ ```python
+ # Only llama_cpp is required for the example below; the remaining imports
+ # are optional extras from the install step.
+ from llama_cpp import Llama
+ from transformers import pipeline, AutoModel, AutoTokenizer
+ from sentence_transformers import SentenceTransformer
+ import os
+ ```
+ 
+ # Using llama_cpp as a high-level helper
+ 
+ ```python
+ repo_id = "alokabhishek/gemma-1.1-7b-it-GGUF"
+ filename = "Q4_K_M.gguf"
+ 
+ # Download the GGUF file from the Hub (cached after the first call) and load it.
+ llm = Llama.from_pretrained(
+     repo_id=repo_id,
+     filename=filename,
+     verbose=False,
+ )
+ 
+ prompt = "Tell me a funny joke about Large Language Models meeting a Blackhole in an intergalactic Bar."
+ 
+ # create_chat_completion follows the OpenAI-style chat schema.
+ llm_response = llm.create_chat_completion(
+     messages=[{"role": "user", "content": prompt}],
+     temperature=1.5,
+     top_p=0.8,
+     top_k=50,
+     repeat_penalty=1.01,
+ )
+ 
+ llm_response_formatted = llm_response["choices"][0]["message"]["content"]
+ print(llm_response_formatted)
+ ```
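+ 
+ If you prefer to manage the download yourself, you can fetch the file with `hf_hub_download` and pass the local path to the `Llama` constructor. This is a minimal sketch; the `n_ctx` and `n_gpu_layers` values are illustrative assumptions, not values from this repo.
+ 
+ ```python
+ from huggingface_hub import hf_hub_download
+ from llama_cpp import Llama
+ 
+ # Download the quantized weights into the local Hugging Face cache.
+ model_path = hf_hub_download(
+     repo_id="alokabhishek/gemma-1.1-7b-it-GGUF",
+     filename="Q4_K_M.gguf",
+ )
+ 
+ # n_gpu_layers=-1 offloads all layers to the GPU if the build supports it;
+ # set n_gpu_layers=0 for a CPU-only run.
+ llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1, verbose=False)
+ 
+ response = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Explain GGUF in one sentence."}]
+ )
+ print(response["choices"][0]["message"]["content"])
+ ```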
+ 
+ 
+ # Original Gemma Model Card
+ 
  # Gemma Model Card

  **Model Page**: [Gemma](https://ai.google.dev/gemma/docs)