Commit History

Fix: Higher vram usage for mistral and sample_packing (#691)
669f1d0
unverified

Nanobit committed on

flash_attention + sample packing for stablelm 3b (#671)
2d60ba3
unverified

winglian committed on

Fix: ValueError when FA + Mistral when padding_side=right (#681)
eb480df
unverified

Nanobit committed on

Fix(tokenizer): Set rstrip,lstrip,norm to False (#678)
e0b7eea
unverified

Nanobit committed on

chore: Clean up repetitive model kwargs (#670)
e62d590
unverified

Nanobit committed on

Feat: Allow usage of native Mistral FA when no sample_packing (#669)
697c50d
unverified

Nanobit committed on

remove patch fix for phi (#664)
f34648c
unverified

winglian committed on

Mistral flash attn packing (#646)
b6ab8aa
unverified

winglian committed on

skip some flash attn patches unless explicitly enabled (#643)
895f0a0
unverified

winglian committed on

Feat: Add support for upstream FA2 (#626)
19a600a
unverified

Nanobit committed on

misc fixes to add gptq tests (#621)
03e5907
unverified

winglian committed on

support to disable exllama for gptq (#604)
faecff9
unverified

winglian committed on

Delete duplicate lines (#606)
aa656e0
unverified

bofenghuang committed on

btlm and falcon monkey patches for flash attn (#566)
6b9b229
unverified

winglian committed on

make phi training work with Loras (#588)
62eaee7
unverified

winglian committed on

don't resize embeddings if it's already large enough (#577)
3607882
unverified

winglian committed on

Support Sample packing for phi arch (#586)
12a2dbb
unverified

winglian committed on

Add training callback to send predictions to WandB table (#521)
5b67ea9
unverified

Glavin001 committed on

fix for quant config from model (#540)
a94f9cb
unverified

winglian committed on

Add support for GPTQ using native transformers/peft (#468)
3355706
unverified

winglian committed on

fix: bad dtype for full finetune (#504)
1991946
unverified

Maxime and winglian committed on

Refactor train cfg cli (#499)
125cccb
unverified

winglian committed on

simplify linear layer locator
267b7b2

tmm1 committed on

fsdp requires params be the same type too (#493)
98bf76e
unverified

winglian committed on

Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer (#489)
4c37bd0
unverified

Nanobit committed on

fix condition and add logging
3a011ea

tmm1 committed on

rename var and reformat
f319b0b

tmm1 committed on

Update src/axolotl/utils/models.py
7fd662d
unverified

Maxime and tmm1 committed on

Update src/axolotl/utils/models.py
9e69968
unverified

Maxime and tmm1 committed on

ignore: address pr review
d03887f
unverified

Maxime committed on

ignore: linter
a184549
unverified

Maxime committed on

fix: finetune model inference needs the dtype fix to work with flash-attn
f311df9
unverified

Maxime committed on

fix types w lora (#478)
0b7ba57
unverified

winglian committed on

Fix(tokenizer): Fix condition to add pad token (#477)
71bd062
unverified

Nanobit committed on

improve llama pad token handling (#475)
cb9797e
unverified

winglian committed on

recast loralayer, norm, lmhead + embed token weights per original qlora (#393)
96deb6b
unverified

winglian committed on

fix evals (#447)
ee26281
unverified

winglian committed on

standardize attn hijack patches (#381)
06edf17
unverified

tmm1 and winglian committed on

don't use mask expansion for inference (#392)
1687be6
unverified

winglian committed on

don't pass rope_scaling kwarg if it's None (#383)
919246f
unverified

winglian committed on

try to detect accelerate and only use device_map=None in that case (#373)
094fc2c
unverified

tmm1 committed on

remove unnecessary local variable
0c96727

tmm1 committed on

simplify `load_tokenizer`
efb3b2c

tmm1 committed on

improve GPU logging to break out pytorch cache and system mem
7b55fe6

tmm1 committed on

quiet noise from llama tokenizer by setting pad token earlier
e029ab3

tmm1 committed on

Attention mask and position id fixes for packing (#285)
2bb0b78
unverified

winglian committed on

Feat: Add rope scaling (#343)
b521206
unverified

Nanobit committed on

Merge pull request #356 from tmm1/load_model-args
11ddccb
unverified

tmm1 committed on

simplify load_model signature
7181022

tmm1 committed on

log GPU memory usage
e303d64

tmm1 committed on