qwerrwe / src/axolotl/utils/trainer.py

Commit History

push intermediate model checkpoints to hub
612aabd

winglian committed on

support adamw and grad norm hyperparams
6d0ee4b

winglian committed on

Merge branch 'main' into flash-optimum
fd2c981
unverified

winglian committed on

Fix set mem_id for inference and refactor
974dc00

Nanobit committed on

fix formatting
958da70

winglian committed on

address PR feedback
0c6f928

winglian committed on

fix bettertransformers save, force it to skip after saving correctly in callback
1a82082

winglian committed on

more tweaks to do pre-training with bettertransformers
1210dc8

winglian committed on

Feat: Add landmark attention
55b8542

Nanobit committed on

Refactor out unmodified save_steps and eval_steps
2ef4634

Nanobit committed on

Set to use cfg.seed or 42 for backward compat
2cfe9e9

Nanobit committed on

fix relative path for fixtures
cfcc549

winglian committed on

Apply isort then black
37293dc

Nanobit committed on

Fix mypy typing
e9650d3

Nanobit committed on

Lint trainer.py
ddb86ea

Nanobit committed on

fix relative path for fixtures
e65aeed

winglian committed on

refactor(param): rename load_4bit config param to gptq
dd00657

Thytu committed on

fixes to make qlora actually work
34c99f9

winglian committed on

apply black formatting
ce34d64

winglian committed on

fix missing fp16 kwarg
2ae936f

winglian committed on

Add qa style data for alpaca instructions, fix one_cycle scheduler
3a50377

winglian committed on

don't need to set here
de6da13

winglian committed on

be able to use adam bnb 8bit and one cycle scheduler with fsdp
9493b1b

winglian committed on

make one cycle lr div factor configurable
99383f1

winglian committed on

Merge branch 'main' into patch-2
89b7f26
unverified

Nanobit committed on

black formatting
2bc1a5b

winglian committed on

various fixes
7a490a4

winglian committed on

Fix Trainer() got multiple values for keyword argument 'callbacks'
813aab3
unverified

Nanobit committed on

Merge pull request #21 from NanoCode012/patch-1
bd3c5a5
unverified

winglian committed on

Update trainer.py
36aaea0
unverified

Nanobit committed on

Fix condition scheduler
5b6690a
unverified

Nanobit committed on

Add callbacks to Trainer
cc77bab

Nanobit committed on

Add callback save peft_model on_save
0d6708b

Nanobit committed on

fix #16 load best model setting when using 8bit
a4329b1

winglian committed on

use micro batch size for eval size if not specified
550502b

winglian committed on

refactor inference, warn if model is frozen
247825b

winglian committed on

Merge pull request #13 from winglian/dev
cb9a887
unverified

winglian committed on

Add eval_batch_size for evaluation
0e74b64

Nanobit committed on

fix log sweep lr
a10a826

winglian committed on

support for multi line inference input, log sweep over learning rates
9105935

winglian committed on

fix adam bnb optimizer grouped parameters, fix peft model 8bit conversion logic, black formatting
7748f3d

winglian committed on

fsdp config dict fix, todo list, add torchdistx support
ad2b48c

winglian committed on

fix fsdp training args
29936bb

winglian committed on

fix for zero value warmup steps
7882181

winglian committed on

fix sharegpt tokenization, refactor tokenization debugging
5159d00

winglian committed on

wire up gradient checkpointing for 4bit
c0f50d9

winglian committed on

tweaks to data loading, 8 bit adam, accelerate and deepspeed
097d367

winglian committed on

various bugfixes
94f5e41

winglian committed on

quickstart instructions for starting from runpod (#5)
0a472e1
unverified

winglian committed on

WIP large refactor to make finetune script a little more manageable (#3)
6045345
unverified

winglian committed on