RichardErkhov committed
Commit fd88a11
1 Parent(s): 674479c

uploaded readme

README.md ADDED
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


tamillama_tiny_30m - bnb 8bits
- Model creator: https://huggingface.co/RajuKandasamy/
- Original model: https://huggingface.co/RajuKandasamy/tamillama_tiny_30m/
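Since this repository contains bitsandbytes 8-bit weights, one way to load them with `transformers` is sketched below. The repo id is a placeholder for this quantized repository, and the `bitsandbytes` and `accelerate` packages plus a CUDA-capable GPU are assumed.

```python
# Minimal sketch for loading an 8-bit bitsandbytes checkpoint with transformers.
# NOTE: the repo id below is a placeholder -- replace it with the actual id of this repository.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "RichardErkhov/tamillama_tiny_30m-8bits"  # hypothetical id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The 8-bit quantization config is stored with the checkpoint, so only device placement
# needs to be specified; requires bitsandbytes, accelerate and a CUDA GPU.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
```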


Original model description:
---
license: gpl
datasets:
- roneneldan/TinyStoriesInstruct
language:
- ta
- en
library_name: transformers
inference:
  parameters:
    max_new_tokens: 120
    repetition_penalty: 1.4
    temperature: 0.01
widget:
- text: |
    சொற்கள்:
    வீழ்ச்சி, சீட்டு, பிடிவாதம்
    சுருக்கம்:
  example_title: Tamil Story with words 1
- text: |
    சொற்கள்:
    ஓட்டம், பயணம், குழப்பம்
    சுருக்கம்:
  example_title: Tamil Story with words 2
- text: |
    சொற்கள்:
    உதவி, பதிவு, சங்கடம்
    சுருக்கம்:
  example_title: Tamil Story with words 3
- text: |
    சொற்கள்:
    வாக்குறுதி, எலி, பெரியது
    சுருக்கம்:
  example_title: Tamil Story with words 4
- text: |
    Words: prevent, car, broken
    Features: Dialogue, Twist
  example_title: Story in English
- text: |
    சொற்கள்:
    திரும்பு, வாசனை திரவியம், துணிச்சல்
    சுருக்கம்:
  example_title: Tamil Story with words 5
---

## Tamillama_Tiny: A 30M tiny llama model trained to tell stories in Tamil
### TL;DR:
This is an experimental model inspired by the paper https://arxiv.org/abs/2305.07759 (TinyStories: How Small Can Language Models Be and Still Speak Coherent English?).

It extends the same concept to Tamil: a 30M-parameter LLaMA-architecture model that outputs coherent Tamil is presented here.

Additional experimentation included in the model:
1. This is a multilanguage model, as it can output both English and Tamil stories.
2. The model also translates stories from English to Tamil and vice versa. To see the translation feature, set max_new_tokens > 512 (see the sketch after this list).
3. Translation of the original stories from the TinyStories dataset was done using [IndicTrans](https://ai4bharat.iitm.ac.in/indic-trans).
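As a rough illustration of point 2 above, the sketch below simply allows more than 512 new tokens so the model has room to follow the generated story with a translation. It uses the original (non-quantized) checkpoint and the prompt format from the widget examples; the exact output of this toy model is not guaranteed.

```python
from transformers import pipeline

# Text-generation pipeline over the original checkpoint (not the quantized repo).
generator = pipeline("text-generation", model="RajuKandasamy/tamillama_tiny_30m")

# Tamil prompt meaning roughly: "Words: promise, mouse, big / Summary:"
prompt = "சொற்கள்:\nவாக்குறுதி, எலி, பெரியது\nசுருக்கம்:"

# max_new_tokens > 512 leaves room for the story plus its translation.
result = generator(prompt, max_new_tokens=768, repetition_penalty=1.4, do_sample=False)
print(result[0]["generated_text"])
```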

For now, this is a toy model for researchers, students and LLM enthusiasts to play with the linguistic capability of the model.

## Weights Release, License and Usage
We release the weights in two formats: Hugging Face transformers format and GGML format for use with CTransformers or llama.cpp (see the CTransformers sketch below).
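For the GGML weights, a minimal CTransformers sketch might look like the following. The file name below is a placeholder, since this card does not say under which name or repository the GGML file is published.

```python
from ctransformers import AutoModelForCausalLM

# Load a locally downloaded GGML file; the file name is a placeholder.
llm = AutoModelForCausalLM.from_pretrained(
    "./tamillama_tiny_30m.ggmlv3.bin",
    model_type="llama",  # LLaMA architecture
)

# Generate a short Tamil story from the "Words: ... / Summary:" prompt format.
print(llm("சொற்கள்:\nவாக்குறுதி, எலி, பெரியது\nசுருக்கம்:", max_new_tokens=256))
```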

This model is not fit for any practical purpose other than research and experimentation.

Usage:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RajuKandasamy/tamillama_tiny_30m")
model = AutoModelForCausalLM.from_pretrained("RajuKandasamy/tamillama_tiny_30m")

# Prompt format from training: "Words:" followed by seed words, then "Summary:" (in Tamil);
# the model continues with a short story.
prompt = """சொற்கள்:
வாக்குறுதி, எலி, பெரியது
சுருக்கம்:"""
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=256
)
print(tokenizer.decode(generation_output[0]))
```