RichardErkhov committed
Commit fd88a11
1 Parent(s): 674479c

uploaded readme

README.md ADDED
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


tamillama_tiny_30m - bnb 8bits
- Model creator: https://huggingface.co/RajuKandasamy/
- Original model: https://huggingface.co/RajuKandasamy/tamillama_tiny_30m/
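Since this repository contains bitsandbytes 8-bit weights, one way to load them with `transformers` is sketched below. The repo id is a placeholder for this quantized repository, and the `bitsandbytes` and `accelerate` packages plus a CUDA-capable GPU are assumed.

```python
# Minimal sketch for loading an 8-bit bitsandbytes checkpoint with transformers.
# NOTE: the repo id below is a placeholder -- replace it with the actual id of this repository.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "RichardErkhov/tamillama_tiny_30m-8bits"  # hypothetical id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The 8-bit quantization config is stored with the checkpoint, so only device placement
# needs to be specified; requires bitsandbytes, accelerate and a CUDA GPU.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
```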


Original model description:
---
license: gpl
datasets:
- roneneldan/TinyStoriesInstruct
language:
- ta
- en
library_name: transformers
inference:
  parameters:
    max_new_tokens: 120
    repetition_penalty: 1.4
    temperature: 0.01
widget:
- text: |
    சொற்கள்:
    வீழ்ச்சி, சீட்டு, பிடிவாதம்
    சுருக்கம்:
  example_title: Tamil Story with words 1
- text: |
    சொற்கள்:
    ஓட்டம், பயணம், குழப்பம்
    சுருக்கம்:
  example_title: Tamil Story with words 2
- text: |
    சொற்கள்:
    உதவி, பதிவு, சங்கடம்
    சுருக்கம்:
  example_title: Tamil Story with words 3
- text: |
    சொற்கள்:
    வாக்குறுதி, எலி, பெரியது
    சுருக்கம்:
  example_title: Tamil Story with words 4
- text: |
    Words: prevent, car, broken
    Features: Dialogue, Twist
  example_title: Story in English
- text: |
    சொற்கள்:
    திரும்பு, வாசனை திரவியம், துணிச்சல்
    சுருக்கம்:
  example_title: Tamil Story with words 5
---

## Tamillama_Tiny: A 30M tiny llama model trained to tell stories in Tamil
### TL;DR:
This is an experimental model inspired by the paper https://arxiv.org/abs/2305.07759 (TinyStories: How Small Can Language Models Be and Still Speak Coherent English?).

It extends the same concept to Tamil: a 30M-parameter LLaMA-architecture model that outputs coherent Tamil is presented here.

Additional experimentation included in the model:
1. This is a multilanguage model, as it can output both English and Tamil stories.
2. The model also translates stories from English to Tamil and vice versa. To see the translation feature, set max_new_tokens > 512 (see the sketch after this list).
3. Translation of the original stories from the TinyStories dataset was done using [IndicTrans](https://ai4bharat.iitm.ac.in/indic-trans).
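As a rough illustration of point 2 above, the sketch below simply allows more than 512 new tokens so the model has room to follow the generated story with a translation. It uses the original (non-quantized) checkpoint and the prompt format from the widget examples; the exact output of this toy model is not guaranteed.

```python
from transformers import pipeline

# Text-generation pipeline over the original checkpoint (not the quantized repo).
generator = pipeline("text-generation", model="RajuKandasamy/tamillama_tiny_30m")

# Tamil prompt meaning roughly: "Words: promise, mouse, big / Summary:"
prompt = "சொற்கள்:\nவாக்குறுதி, எலி, பெரியது\nசுருக்கம்:"

# max_new_tokens > 512 leaves room for the story plus its translation.
result = generator(prompt, max_new_tokens=768, repetition_penalty=1.4, do_sample=False)
print(result[0]["generated_text"])
```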

For now, this is a toy model for researchers, students and LLM enthusiasts to play with the linguistic capability of the model.

## Weights Release, License and Usage
We release the weights in two formats: Hugging Face transformers format and GGML format for use with CTransformers or llama.cpp (see the CTransformers sketch below).
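For the GGML weights, a minimal CTransformers sketch might look like the following. The file name below is a placeholder, since this card does not say under which name or repository the GGML file is published.

```python
from ctransformers import AutoModelForCausalLM

# Load a locally downloaded GGML file; the file name is a placeholder.
llm = AutoModelForCausalLM.from_pretrained(
    "./tamillama_tiny_30m.ggmlv3.bin",
    model_type="llama",  # LLaMA architecture
)

# Generate a short Tamil story from the "Words: ... / Summary:" prompt format.
print(llm("சொற்கள்:\nவாக்குறுதி, எலி, பெரியது\nசுருக்கம்:", max_new_tokens=256))
```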

This model is not fit for any practical purpose other than research and experimentation.

Usage:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RajuKandasamy/tamillama_tiny_30m")
model = AutoModelForCausalLM.from_pretrained("RajuKandasamy/tamillama_tiny_30m")

# Prompt format from training: "Words:" followed by seed words, then "Summary:" (in Tamil);
# the model continues with a short story.
prompt = """சொற்கள்:
வாக்குறுதி, எலி, பெரியது
சுருக்கம்:"""
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=256
)
print(tokenizer.decode(generation_output[0]))
```