pt-sk committed on
Commit fab6277
1 parent: 7254bfb

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -7,7 +7,7 @@ tags:
  pipeline_tag: text-generation
  ---
  Aligning the model using Proximal Policy Optimization (PPO). The goal is to train the model to generate non-toxic reviews. The training process utilizes the `trl` library for reinforcement learning, the `transformers` library for model handling, and `datasets` for dataset management.
- Implementation code is available here: [GitHub](https://github.com/sathishkumar67/GPT-2-Non-Toxic-RLHF)
+ Implementation code is available here: [GitHub](https://github.com/Kwaai-AI-Lab/kwaai-alignment/tree/main/Implementations/GPT2_NonToxic)
  ```python
  # Load model and tokenizer directly
  from transformers import AutoTokenizer, AutoModelForCausalLM