Commit History

1ffa386  Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens (#787), by Nanobit
5ea3aa3  Fix Deepspeed loading (#950), by winglian
f1f60cb  Flash attn hotfix (#951), by winglian
7fabc4d  Mixtral official (#942), by winglian
68b227a  Mixtral multipack (#928), by winglian
40a6362  support for mamba (#915), by winglian
fde091c  fix(tokenizer): handle fast tokenizer properly for bos/eos (#914), by Nanobit
a581e9f  feat: add check for quantized model (#913), by Nanobit and winglian
992e742  Support device_map=sequential & max_memory config parameters (#903), by Bryan Thornbury and winglian
3e3229e  fix for qwen w lora (#906), by winglian
1115c50  Feat: Add Qwen (#894), by Nanobit
9bf854e  Phi update 202311 (#876), by winglian
1bc1186  allow overriding of model_config parameters from the YML (#853), by winglian
964d858  fix model parallel (#816), by winglian
10388a8  fix(tokenizer): update log order after update (#806), by Nanobit
637ed09  fix(config): Set eos/bos to tokenizer if different (#801), by Nanobit
827ec3d  refactor neft patch to be more re-usable similar to trl's impl (#796), by winglian
11d1d60  chore: refactor truthy check and fix mypy (#780), by Nanobit
440c3ab  Fix(model): Linear detected and added to target module with rope linear (#738), by Nanobit
3bd9528  add noisy embedding (#721), by Maxime
669f1d0  Fix: Higher vram usage for mistral and sample_packing (#691), by Nanobit
2d60ba3  flash_attention + sample packing for stablelm 3b (#671), by winglian
eb480df  Fix: ValueError when FA + Mistral when padding_side=right (#681), by Nanobit
e0b7eea  Fix(tokenizer): Set rstrip,lstrip,norm to False (#678), by Nanobit
e62d590  chore: Clean up repetitive model kwargs (#670), by Nanobit
697c50d  Feat: Allow usage of native Mistral FA when no sample_packing (#669), by Nanobit
f34648c  remove patch fix for phi (#664), by winglian
b6ab8aa  Mistral flash attn packing (#646), by winglian
895f0a0  skip some flash attn patches unless explicitly enabled (#643), by winglian
19a600a  Feat: Add support for upstream FA2 (#626), by Nanobit
03e5907  misc fixes to add gptq tests (#621), by winglian
faecff9  support to disable exllama for gptq (#604), by winglian
aa656e0  Delete duplicate lines (#606), by bofenghuang
6b9b229  btlm and falcon monkey patches for flash attn (#566), by winglian
62eaee7  make phi training work with Loras (#588), by winglian
3607882  don't resize embeddings if it's already large enough (#577), by winglian
12a2dbb  Support Sample packing for phi arch (#586), by winglian
5b67ea9  Add training callback to send predictions to WandB table (#521), by Glavin001
a94f9cb  fix for quant config from model (#540), by winglian
3355706  Add support for GPTQ using native transformers/peft (#468), by winglian
1991946  fix: bad dtype for full finetune (#504), by Maxime and winglian
125cccb  Refactor train cfg cli (#499), by winglian
267b7b2  simplify linear layer locator, by tmm1
98bf76e  fsdp requires params be the same type too (#493), by winglian
4c37bd0  Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer (#489), by Nanobit
3a011ea  fix condition and add logging, by tmm1
f319b0b  rename var and reformat, by tmm1
7fd662d  Update src/axolotl/utils/models.py, by Maxime and tmm1
9e69968  Update src/axolotl/utils/models.py, by Maxime and tmm1