Commit History

06edf17  standardize attn hijack patches (#381)  committed by tmm1, winglian  [unverified]
82e111a  remove extra accelearate in requirements (#430)  committed by winglian  [unverified]
8cace80  fix fixture for new tokenizer handling in transformers (#428)  committed by winglian  [unverified]
1b7e860  fix orca prompts (#422)  committed by winglian  [unverified]
3d1f203  Fix(docs): Remove gptq+lora and fix xformer compat list (#423)  committed by Nanobit  [unverified]
d3d6fd6  just resort to tags ans use main-latest (#424)  committed by winglian  [unverified]
b7449a9  Fix(template): Inform to place stack trace to Issue (#417)  committed by Nanobit, winglian  [unverified]
5f80b35  use inputs for image rather than outputs for docker metadata (#420)  committed by winglian  [unverified]
2495909  hopefully improve the README (#419)  committed by winglian  [unverified]
7af8166  tag with latest as well for axolotl-runpod (#418)  committed by winglian  [unverified]
f806e86  Merge pull request #413 from mhenrichsen/chore/update-deepseed-config  committed by mhenrichsen  [unverified]
2b990eb  Feat(doc): Add lr_quadratic_warmup to readme (#412)  committed by Nanobit  [unverified]
bd8cab4  update path to align with fsdp example  committed by mhenrichsen
c01015f  Fix(config): Update handling of deepspeed config (#404)  committed by Nanobit  [unverified]
72fe3f8  Fix(docs): Update flash attn requirements (#409)  committed by Nanobit  [unverified]
47961fd  update docs for tokenizer_legacy (#401)  committed by winglian  [unverified]
7ad37cb  Fix(template): Remove iPhone/android from Issue template (#407)  committed by Nanobit  [unverified]
29241cf  Ax art (#405)  committed by winglian  [unverified]
31db0ec  add templates, CoC and contributing guide (#126)  committed by lightningRalf, winglian, Nanobit  [unverified]
da10af0  fix eval steps and strategy (#403)  committed by winglian  [unverified]
85cf4f8  better handling of empty input ids when tokenizing (#395)  committed by winglian  [unverified]
2e22404  add utils.data.prepare_dataset  committed by tmm1
be294fd  Feat(doc): Add how to save by epochs (#396)  committed by Nanobit  [unverified]
fc2d6be  use context manager to run things on rank0 before others (#397)  committed by winglian  [unverified]
1687be6  don't use mask expansion for inference (#392)  committed by winglian  [unverified]
41ecb45  Feat(doc): Add max_steps to readme (#389)  committed by Nanobit  [unverified]
3c2ad00  Feat(config): add max steps (#387)  committed by ittailup  [unverified]
5d48a10  Added "epoch" evaluation_strategy (#388)  committed by flotos  [unverified]
73a0b6e  Feat(config): Add hub_strategy (#386)  committed by Nanobit  [unverified]
63fdb5a  Error msg for sharegpt if conv has less than 2 msg (#379)  committed by flotos  [unverified]
fdffef5  new llama-2 default settings (#370)  committed by mhenrichsen (Mads Henrichsen)  [unverified]
919246f  don't pass rope_scaling kwarg if it's None (#383)  committed by winglian  [unverified]
ffac902  bump flash-attn to 2.0.4 for the base docker image (#382)  committed by winglian  [unverified]
15f6e57  Fix crash when running without CUDA  committed by chargoddard
729c299  Feat(doc): Improve sharegpt doc (#378)  committed by Nanobit  [unverified]
86a91e2  save tokenizer before training starts (#380)  committed by winglian  [unverified]
094fc2c  try to detect accelerate and only use device_map=None in that case (#373)  committed by tmm1  [unverified]
2dafa73  Create FUNDING.yml  committed by winglian  [unverified]
343ac84  fix check for flash attn branching (#377)  committed by winglian  [unverified]
0c96727  remove unnecessary local variable  committed by tmm1
efb3b2c  simplify `load_tokenizer`  committed by tmm1
7b55fe6  improve GPU logging to break out pytorch cache and system mem  committed by tmm1
e029ab3  quiet noise from llama tokenizer by setting pad token earlier  committed by tmm1
8cec513  extract module for working with cfg  committed by tmm1
a13e45d  fix DefaultDict.__or__  committed by tmm1
918f1b0  revert previous change and build ax images w docker on gpu (#371)  committed by winglian  [unverified]
c3fde36  attempt to run non-base docker builds on regular cpu hosts (#369)  committed by winglian  [unverified]
2bb0b78  Attention mask and position id fixes for packing (#285)  committed by winglian  [unverified]
a276c9c  Fix(save): Save as safetensors (#363)  committed by Nanobit  [unverified]