Commits · Dovakiins/qwerrwe

support llama-adapter zero init attention

2255bb7

winglian committed on May 1, 2023

fdsp config dict fix, todo list, add torchdistx support

ad2b48c

winglian committed on Apr 30, 2023

8bit and deepspeed changes

9190ada

winglian committed on Apr 30, 2023

don't load models in 8bit unless they are using an adapter, also fix tokenizer load in exceptional case

6dfdd2d

winglian committed on Apr 30, 2023

fix fsdp training args

29936bb

winglian committed on Apr 30, 2023

fix for zero value warmup steps

7882181

winglian committed on Apr 30, 2023

fix sharegpt tokenization, refactor tokenization debugging

5159d00

winglian committed on Apr 30, 2023

wire up gradient checkpointing for 4bit

c0f50d9

winglian committed on Apr 29, 2023

fix dataset handling, support galactica

4a17a4c

winglian committed on Apr 24, 2023

tweaks to data loading, 8 bit adam, accelerate and deepspeed

097d367

winglian committed on Apr 22, 2023

shuffle and split dataset after save/load

4f2584f

winglian committed on Apr 20, 2023

fix sharegpt handling from hf, don't worry about loading llama if using earlier transformers release

8d43785

winglian committed on Apr 20, 2023

various bugfixes

94f5e41

winglian committed on Apr 19, 2023

fix bug when model_type not explicitly passed

bb991fd

winglian committed on Apr 19, 2023

improve inference

d653859

winglian committed on Apr 19, 2023

quickstart instructions for starting from runpod (#5)

0a472e1
unverified

winglian committed on Apr 18, 2023

attempt xformers hijack attention

8746b70

winglian committed on Apr 18, 2023

WIP large refactor to make finetune script a little more manageable (#3)

6045345
unverified

winglian committed on Apr 18, 2023
