qwerrwe / ds_config.json

Commit History

shuffle and split dataset after save/load
4f2584f

winglian committed on

deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches
d1aed4c

winglian committed on

more logging, wandb fixes
05fffb5

winglian committed on