ISTA-DASLab/Phi-3-mini-128k-instruct-AQLM-1x16

Official AQLM quantization of microsoft/Phi-3-mini-128k-instruct.

For this quantization, we used the 1x16 scheme: 1 codebook with 16-bit codes.
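
As a rough illustration of what "1 codebook of 16 bits" implies, the sketch below computes the codebook size and the approximate number of code bits stored per weight. The group size of 8 is an assumption (this card does not state it), and the helper `aqlm_bits_per_weight` is a hypothetical name used only for this example. It ignores the overhead of storing the codebooks and scales themselves.

```python
def aqlm_bits_per_weight(num_codebooks: int, bits_per_codebook: int, group_size: int) -> float:
    """Code bits stored per weight, ignoring codebook and scale overhead.

    Each group of `group_size` weights is encoded by one index per codebook,
    and each index takes `bits_per_codebook` bits.
    """
    return num_codebooks * bits_per_codebook / group_size


# "1x16" from this card: 1 codebook, 16-bit codes.
num_codebooks = 1
bits_per_codebook = 16

# ASSUMPTION: group size is not given in this card; 8 is used for illustration.
group_size = 8

codebook_entries = 2 ** bits_per_codebook
bits = aqlm_bits_per_weight(num_codebooks, bits_per_codebook, group_size)

print(f"entries per codebook: {codebook_entries}")  # 65536
print(f"code bits per weight: {bits}")              # 2.0
```

Under that assumed group size, the quantized linear layers cost about 2 bits per weight before overhead. A different group size would change this figure proportionally.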

Results:

Model	Quantization	MMLU (5-shot)	ArcC	ArcE	Hellaswag	Winogrande	PiQA	Model size, GB
microsoft/Phi-3-mini-128k-instruct	None	0.6881	0.5418	0.8127	0.5980	0.7873	0.7340	7.6
microsoft/Phi-3-mini-128k-instruct	1x16	0.5815	0.4599	0.7845	0.5235	0.7666	0.6930	1.4
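
To make the trade-off in the table easier to read, the following sketch computes the size reduction and the per-benchmark score change. The numbers are copied from the table above. The variable names are illustrative only, and this is not an evaluation script.

```python
# Scores copied from the results table (unquantized vs. 1x16).
baseline = {
    "MMLU (5-shot)": 0.6881,
    "ArcC": 0.5418,
    "ArcE": 0.8127,
    "Hellaswag": 0.5980,
    "Winogrande": 0.7873,
    "PiQA": 0.7340,
}
quantized = {
    "MMLU (5-shot)": 0.5815,
    "ArcC": 0.4599,
    "ArcE": 0.7845,
    "Hellaswag": 0.5235,
    "Winogrande": 0.7666,
    "PiQA": 0.6930,
}
size_gb_baseline = 7.6
size_gb_quantized = 1.4

# How many times smaller the quantized checkpoint is.
compression_ratio = size_gb_baseline / size_gb_quantized

# Absolute score drop and fraction of the baseline score retained, per benchmark.
drops = {name: baseline[name] - quantized[name] for name in baseline}
retained = {name: quantized[name] / baseline[name] for name in baseline}

print(f"size reduction: {compression_ratio:.2f}x")
for name in baseline:
    print(f"{name:14s} drop={drops[name]:.4f} retained={retained[name]:.1%}")

largest_drop = max(drops, key=drops.get)
smallest_drop = min(drops, key=drops.get)
```

On these figures the checkpoint shrinks by roughly 5.4x. MMLU shows the largest absolute drop and Winogrande the smallest.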

Safetensors · Model size: 718M params · Tensor type: FP16, I16
