Kwaai/GPT2_NonToxic

pt-sk committed on Jul 20

Commit fab6277 • 1 Parent(s): 7254bfb

Update README.md

Files changed (1)

README.md +1 -1

@@ -7,7 +7,7 @@ tags:
 pipeline_tag: text-generation
 ---
 Aligning the model using Proximal Policy Optimization (PPO). The goal is to train the model to generate non-toxic reviews. The training process utilizes the `trl` library for reinforcement learning, the `transformers` library for model handling, and `datasets` for dataset management.
-Implementation code is available here: [GitHub](https://github.com/sathishkumar67/GPT-2-Non-Toxic-RLHF)
+Implementation code is available here: [GitHub](https://github.com/Kwaai-AI-Lab/kwaai-alignment/tree/main/Implementations/GPT2_NonToxic)
 ```python
 # Load model and tokenizer directly
 from transformers import AutoTokenizer, AutoModelForCausalLM