qwerrwe / src

Commit History

let transformers handle adamw_bnb_8bit
868530c

tmm1 committed on

ignore: address pr review
d03887f
unverified

Maxime committed on

ignore: linter
a184549
unverified

Maxime committed on

fix: finetune model inference needs the dtype fix to work with flash-attn
f311df9
unverified

Maxime committed on

fix checkpoints on multigpu (#481)
31f3e71
unverified

winglian committed on

fix types w lora (#478)
0b7ba57
unverified

winglian committed on

Fix(tokenizer): Fix condition to add pad token (#477)
71bd062
unverified

Nanobit committed on

improve llama pad token handling (#475)
cb9797e
unverified

winglian committed on

ReLoRA implementation (with quantization) (#322)
bde3c5a
unverified

chargoddard winglian committed on

workaround so training doesn't hang when packed dataloader batches aren't even (#461)
c69faee
unverified

winglian committed on

feat: add Metharme prompt strategy (#446)
f474650
unverified

TearGosling Nanobit committed on

recast loralayer, norm, lmhead + embed token weights per original qlora (#393)
96deb6b
unverified

winglian committed on

always drop samples that are too long (#452)
50682a3
unverified

winglian committed on

set env var for FSDP layer to wrap (#453)
5a1985b
unverified

winglian committed on

is_causal fix for evals?
fbf49a4

winglian committed on

add missing positional arg (#450)
58cf7e7
unverified

winglian committed on

fix evals (#447)
ee26281
unverified

winglian committed on

gracefully handle empty input (#442)
9d629d8
unverified

winglian committed on

support user defined prompters, pretokenized datasets in config, local parquet, local arrow files (#348)
d2e7f27
unverified

winglian committed on

disable eval using multipack for now (#437)
f733d0f
unverified

winglian committed on

fix comma, not a tuple (#436)
008505c
unverified

winglian committed on

use save_strategy from config if available (#434)
b3f5e00
unverified

winglian committed on

set env for FSDP offload params (#433)
5247c50
unverified

winglian committed on

standardize attn hijack patches (#381)
06edf17
unverified

tmm1 winglian committed on

fix orca prompts (#422)
1b7e860
unverified

winglian committed on

Fix(config): Update handling of deepspeed config (#404)
c01015f
unverified

Nanobit committed on

fix eval steps and strategy (#403)
da10af0
unverified

winglian committed on

better handling of empty input ids when tokenizing (#395)
85cf4f8
unverified

winglian committed on

add utils.data.prepare_dataset
2e22404

tmm1 committed on

use context manager to run things on rank0 before others (#397)
fc2d6be
unverified

winglian committed on

don't use mask expansion for inference (#392)
1687be6
unverified

winglian committed on

Feat(config): add max steps (#387)
3c2ad00
unverified

ittailup committed on

Added "epoch" evaluation_strategy (#388)
5d48a10
unverified

flotos committed on

Feat(config): Add hub_strategy (#386)
73a0b6e
unverified

Nanobit committed on

Error msg for sharegpt if conv has less than 2 msg (#379)
63fdb5a
unverified

flotos committed on

don't pass rope_scaling kwarg if it's None (#383)
919246f
unverified

winglian committed on

Fix crash when running without CUDA
15f6e57

chargoddard committed on

try to detect accelerate and only use device_map=None in that case (#373)
094fc2c
unverified

tmm1 committed on

fix check for flash attn branching (#377)
343ac84
unverified

winglian committed on

remove unnecessary local variable
0c96727

tmm1 committed on

simplify `load_tokenizer`
efb3b2c

tmm1 committed on

improve GPU logging to break out pytorch cache and system mem
7b55fe6

tmm1 committed on

quiet noise from llama tokenizer by setting pad token earlier
e029ab3

tmm1 committed on

extract module for working with cfg
8cec513

tmm1 committed on

fix DefaultDict.__or__
a13e45d

tmm1 committed on

Attention mask and position id fixes for packing (#285)
2bb0b78
unverified

winglian committed on

Add wandb_entity to wandb options, update example configs, update README (#361)
7019509
unverified

Morgan McGuire winglian committed on

Fix(model loading): Warn when model revision is passed to gptq (#364)
96bd6ae
unverified

Nanobit committed on