zen-E committed
Commit
d04c599
1 Parent(s): 1562d3a

Update README.md

Files changed (1):
  1. README.md +3 -3
README.md CHANGED
@@ -6,18 +6,18 @@ language:
 library_name: transformers
 ---

-# OPT-1.3b finetuned by DeepSpeed-Chat
+# OPT-1.3b RLHFed by DeepSpeed-Chat



 # Model Description

 <!-- Provide a longer summary of what this model is. -->
-zen-E/deepspeed-chat-step1-model-opt1.3b is an OPT-1.3b model SFTed by DeepSpeedExamples/applications/DeepSpeed-Chat.
+zen-E/deepspeed-chat-step3-rlhf-actor-model-opt1.3b is an OPT-1.3b model RLHFed by DeepSpeedExamples/applications/DeepSpeed-Chat.

 The model is finetuned on 4 datasets with a split of 2, 4, 4 for steps of SFT, reward modeling, and RLHF.

-The training log is attached. 2 A100-40GB is used to finetune the model, gradient_accumulation_steps are tuned to be 4.
+The training log is attached. 2 A100-40GB is used to finetune the model, gradient_accumulation_steps are tuned to be 8, the batch size is 4.

 ### Model Sources

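
For readers landing on this commit, here is a minimal usage sketch of the model named in the updated README, assuming zen-E/deepspeed-chat-step3-rlhf-actor-model-opt1.3b is a standard OPT causal-LM checkpoint loadable through the Hugging Face transformers API (as suggested by `library_name: transformers`); the prompt format shown is an illustrative assumption, not something stated in this commit.

```python
# Minimal sketch: load the RLHF-tuned OPT-1.3b actor as a causal LM.
# Assumes the checkpoint follows the standard transformers/OPT layout
# declared by `library_name: transformers` in the README metadata.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zen-E/deepspeed-chat-step3-rlhf-actor-model-opt1.3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt in a "Human/Assistant" dialogue style; the exact
# prompt template is an assumption, not taken from this commit.
prompt = "Human: What does DeepSpeed-Chat do?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```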