
# OpenELM 3B Instruct GGUFs

After a long wait, llama.cpp (build b3324) finally supports OpenELM, and that means OpenELM GGUFs! Importance-matrix (iMatrix) quants are here!

Download the GGUFs below!

| Filename | Quant type | File size | Description |
| -------- | ---------- | --------- | ----------- |
| OpenELM-3B-Instruct-F32.gguf | F32 | 11.31 GB | Full precision, 32-bit floating point. Largest file size, baseline quality. |
| OpenELM-3B-Instruct-F16.gguf | F16 | 5.66 GB | Half precision, 16-bit floating point. Very large file size, virtually no quality loss versus F32. |
| OpenELM-3B-Instruct-Q8_0.gguf | Q8_0 | 3.01 GB | Extremely high quality, generally unneeded but max available quant. |
| OpenELM-3B-Instruct-Q6_K.gguf | Q6_K | 2.32 GB | Very high quality, near perfect, recommended. |
| OpenELM-3B-Instruct-Q5_K_M.gguf | Q5_K_M | 2.06 GB | High quality, recommended. |
| OpenELM-3B-Instruct-Q5_K_S.gguf | Q5_K_S | 1.96 GB | High quality, recommended. |
| OpenELM-3B-Instruct-Q5_1.gguf | Q5_1 | 2.13 GB | High quality, improved 5-bit quantization. |
| OpenELM-3B-Instruct-Q5_0.gguf | Q5_0 | 1.96 GB | High quality, alternative 5-bit quantization. |
| OpenELM-3B-Instruct-Q4_K_M.gguf | Q4_K_M | 1.76 GB | Good quality, recommended. |
| OpenELM-3B-Instruct-Q4_K_S.gguf | Q4_K_S | 1.62 GB | Slightly lower quality with more space savings, recommended. |
| OpenELM-3B-Instruct-Q4_1.gguf | Q4_1 | 1.79 GB | Good quality, improved 4-bit quantization. |
| OpenELM-3B-Instruct-Q4_0.gguf | Q4_0 | 1.62 GB | Good quality, older 4-bit quantization. |
| OpenELM-3B-Instruct-IQ4_XS.gguf | IQ4_XS | 1.54 GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| OpenELM-3B-Instruct-Q3_K_L.gguf | Q3_K_L | 1.55 GB | Lower quality but usable, good for low RAM availability. |
| OpenELM-3B-Instruct-Q3_K_M.gguf | Q3_K_M | 1.43 GB | Even lower quality. |
| OpenELM-3B-Instruct-IQ3_M.gguf | IQ3_M | 1.34 GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| OpenELM-3B-Instruct-IQ3_XS.gguf | IQ3_XS | 1.20 GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| OpenELM-3B-Instruct-Q3_K_S.gguf | Q3_K_S | 1.25 GB | Low quality, not recommended. |
| OpenELM-3B-Instruct-IQ3_XXS.gguf | IQ3_XXS | 1.16 GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| OpenELM-3B-Instruct-Q2_K.gguf | Q2_K | 1.07 GB | Very low quality but surprisingly usable. |
| OpenELM-3B-Instruct-IQ2_M.gguf | IQ2_M | 0.97 GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
| OpenELM-3B-Instruct-IQ2_S.gguf | IQ2_S | 0.89 GB | Very low quality, uses SOTA techniques to be usable. |
| OpenELM-3B-Instruct-IQ2_XS.gguf | IQ2_XS | 0.86 GB | Very low quality, uses SOTA techniques to be usable. |
| OpenELM-3B-Instruct-IQ2_XXS.gguf | IQ2_XXS | 0.77 GB | Extremely low quality, smallest 2-bit quantization. |
| OpenELM-3B-Instruct-IQ1_M.gguf | IQ1_M | 0.68 GB | Extremely low quality, 1-bit quantization. |
| OpenELM-3B-Instruct-IQ1_S.gguf | IQ1_S | 0.62 GB | Extremely low quality, smallest possible 1-bit quantization. |
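
If you prefer to script the download and run a quick smoke test, here is a minimal sketch using `huggingface_hub` and `llama-cpp-python`. The repo ID, quant choice, and prompt are placeholders (not taken from this card), and it assumes a `llama-cpp-python` build that bundles llama.cpp b3324 or newer, since earlier builds lack OpenELM support.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch one quant from the Hub. The repo_id is a placeholder;
# substitute the repository that actually hosts these GGUFs.
gguf_path = hf_hub_download(
    repo_id="your-namespace/OpenELM-3B-Instruct-GGUF",
    filename="OpenELM-3B-Instruct-Q4_K_M.gguf",
)

# Load the GGUF and run a short completion as a sanity check.
# Requires a llama-cpp-python build based on llama.cpp b3324+.
llm = Llama(model_path=gguf_path, n_ctx=2048)
out = llm("Briefly explain what OpenELM is.", max_tokens=64)
print(out["choices"][0]["text"])
```

Swap the filename for a larger or smaller quant from the table above depending on how much RAM you can spare.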
