zen-E committed
Commit
d04c599
1 Parent(s): 1562d3a

Update README.md

Files changed (1):
  1. README.md +3 -3
README.md CHANGED
@@ -6,18 +6,18 @@ language:
 library_name: transformers
 ---

-# OPT-1.3b finetuned by DeepSpeed-Chat
+# OPT-1.3b RLHFed by DeepSpeed-Chat



 # Model Description

 <!-- Provide a longer summary of what this model is. -->
-zen-E/deepspeed-chat-step1-model-opt1.3b is an OPT-1.3b model SFTed by DeepSpeedExamples/applications/DeepSpeed-Chat.
+zen-E/deepspeed-chat-step3-rlhf-actor-model-opt1.3b is an OPT-1.3b model RLHFed by DeepSpeedExamples/applications/DeepSpeed-Chat.

 The model is finetuned on 4 datasets with a split of 2, 4, 4 for steps of SFT, reward modeling, and RLHF.

-The training log is attached. 2 A100-40GB is used to finetune the model, gradient_accumulation_steps are tuned to be 4.
+The training log is attached. 2 A100-40GB is used to finetune the model, gradient_accumulation_steps are tuned to be 8, the batch size is 4.

 ### Model Sources

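
For readers landing on this commit, here is a minimal usage sketch of the model named in the updated README, assuming zen-E/deepspeed-chat-step3-rlhf-actor-model-opt1.3b is a standard OPT causal-LM checkpoint loadable through the Hugging Face transformers API (as suggested by `library_name: transformers`); the prompt format shown is an illustrative assumption, not something stated in this commit.

```python
# Minimal sketch: load the RLHF-tuned OPT-1.3b actor as a causal LM.
# Assumes the checkpoint follows the standard transformers/OPT layout
# declared by `library_name: transformers` in the README metadata.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zen-E/deepspeed-chat-step3-rlhf-actor-model-opt1.3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt in a "Human/Assistant" dialogue style; the exact
# prompt template is an assumption, not taken from this commit.
prompt = "Human: What does DeepSpeed-Chat do?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```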