---
library_name: transformers
license: apache-2.0
datasets:
- Gustavosta/Stable-Diffusion-Prompts
language:
- en
tags:
- completion
---
# MagicPrompt TinyStories-33M (Merged)
## Info
Magic prompt completion model trained on a dataset of 70k Stable Diffusion prompts. Base model: TinyStories-33M. Inspired by [MagicPrompt-Stable-Diffusion](https://huggingface.co/Gustavosta/MagicPrompt-Stable-Diffusion).
For 33M params, I think the model is pretty decent, but it clearly understands very little of what it writes. Whether you would use this over a small GPT-2 based model is up to you.
## Examples
Generation settings: `max_new_tokens=40, do_sample=True, temperature=2.0, num_beams=10, repetition_penalty=1.2, top_k=40, top_p=0.75, eos_token_id=tokenizer.eos_token_id` (there may be better settings).
(Bold text is generated by the model)
"A close shot of a bird in a jungle, **with two legs, with long hair on a tall, long brown body, long white skin, sharp teeth, high bones, digital painting, artstation, concept art, illustration by wlop,**"
"Camera shot of **a strange young girl wearing a cloak, wearing a mask in clothes, with long curly hair, long hair, black eyes, dark skin, white teeth, long brown eyes eyes, big eyes, sharp**"
"An illustration of a house, stormy weather, **sun, moonlight, night, concept art, 4 k, wlop, by wlop, by jose stanley, ilya kuvshinov, sprig**"
"A field of flowers, camera shot, 70mm lens, **fantasy, intricate, highly detailed, artstation, concept art, sharp focus, illustration, illustration, artgerm jake daggaws, artgerm and jaggodieie brad**"
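The generation settings listed above could be applied roughly as follows. This is a minimal sketch, not the exact script used for the examples: `MODEL_ID` is a hypothetical placeholder (substitute this repository's actual id), and the helper name `complete_prompt` is made up for illustration. Only the values in `GENERATION_KWARGS` come from this card.

```python
# Sampling settings copied from the "Generation settings" line of this card.
GENERATION_KWARGS = dict(
    max_new_tokens=40,
    do_sample=True,
    temperature=2.0,
    num_beams=10,
    repetition_penalty=1.2,
    top_k=40,
    top_p=0.75,
)

# Hypothetical placeholder: replace with the real repository id of this model.
MODEL_ID = "your-namespace/your-merged-model"


def complete_prompt(prompt, model_id=MODEL_ID):
    """Extend a Stable Diffusion prompt using the settings above."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        eos_token_id=tokenizer.eos_token_id,
        **GENERATION_KWARGS,
    )
    # The decoded text contains the original prompt followed by the completion.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `complete_prompt("A field of flowers, camera shot, 70mm lens,")` should return the prompt with a sampled continuation appended. Because sampling is enabled, the output will differ between runs.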
## Training config
- Rank 16 LoRA
- Trained on Gustavosta/Stable-Diffusion-Prompts for 10 epochs
- Batch size of 64
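As a rough illustration of the config above, the sketch below estimates the number of optimizer steps and shows how a rank 16 LoRA might be declared with PEFT. Assumptions: the dataset size is taken as the approximate 70k from the Info section, no gradient accumulation is used, and everything in `LoraConfig` other than `r=16` (alpha, dropout, target modules) is left at PEFT's defaults because this card does not state those values. The function names are made up for illustration.

```python
import math

# Values stated in this card.
LORA_RANK = 16
NUM_EPOCHS = 10
BATCH_SIZE = 64
# Approximate size ("70k prompts"); the exact row count may differ.
APPROX_NUM_PROMPTS = 70_000


def estimate_optimizer_steps(num_examples, batch_size, epochs):
    """Rough step count, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


def build_lora_config():
    """Declare a rank 16 LoRA; only the rank is taken from this card."""
    # Imported lazily so the arithmetic above runs without PEFT installed.
    from peft import LoraConfig

    return LoraConfig(r=LORA_RANK, task_type="CAUSAL_LM")


if __name__ == "__main__":
    print(estimate_optimizer_steps(APPROX_NUM_PROMPTS, BATCH_SIZE, NUM_EPOCHS))
```

Under these assumptions the run works out to roughly 1,094 steps per epoch, or about 10,940 steps over 10 epochs.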
## Training procedure
The following `bitsandbytes` quantization config was used during training:
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
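The settings listed above map onto the arguments of `BitsAndBytesConfig` in `transformers`. The following is a sketch of how an equivalent config might be rebuilt, not the original training script; the names `BNB_KWARGS` and `build_bnb_config` are made up for illustration.

```python
# Quantization settings copied from the list above.
BNB_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
}


def build_bnb_config():
    """Rebuild a 4-bit (fp4) quantization config with float32 compute."""
    # Imported lazily so the settings can be inspected without torch installed.
    import torch
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.float32,
        **BNB_KWARGS,
    )
```

The resulting object would typically be passed as `quantization_config=` to `AutoModelForCausalLM.from_pretrained` when loading the base model for training. Actually loading a model this way requires the `bitsandbytes` package and a supported GPU.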
### Framework versions
- PEFT 0.5.0.dev0