Commit History

deepspeed doesn't work with flash-attn, and the GPU savings with flash-attn are better than the deepspeed headaches
d1aed4c · winglian

fix logging
a459383 · winglian

prepare datasets only flag
2393801 · winglian

add llama 7b config and fix lora_fan_in_fan_out for llama (copy-paste bug)
d060c80 · winglian

configure log level, add llama 7b config
d33a975 · winglian

more logging, wandb fixes
05fffb5 · winglian

refactor trainer setup to account for deepspeed integration
2df63ef · winglian

improve prepared dataset loading, fix inference
b164725 · winglian

helpful info output
937f44f · winglian

fix issue with completed model being empty
902dd0a · winglian

various bugfixes
80b2ed2 · winglian

better handling of llama model import
45f77dd · winglian

more fixes and prep for llama training
949a27b · winglian

config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes
f2a2029 · winglian

black formatting
a6028d3 · winglian

make it work with pythia in the cloud
8d959a7 · winglian

WIP for axolotl trainer
ce24f5e · winglian

initial commit of README
e9da4b9 · winglian