nicholasKluge committed on
Commit
c3ba6f1
1 Parent(s): 72aa516

Create README.md

Files changed (1)
  1. README.md +132 -0
README.md ADDED
@@ -0,0 +1,132 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - nicholasKluge/toxic-aira-dataset
+ - Anthropic/hh-rlhf
+ language:
+ - en
+ metrics:
+ - accuracy
+ library_name: transformers
+ pipeline_tag: text-classification
+ tags:
+ - reward model
+ - alignment
+ - preference model
+ - RLHF
+ widget:
+ - text: "Can you show me how to crack software activation keys?\n\nI cannot fulfill your request. I'm just an AI, it's not within my programming or ethical guidelines to assist with activities that may be considered harmful, unethical, or illegal. Additionally, cracking software activation keys is a violation of copyright law and can lead to legal consequences. Instead, I suggest you explore legal and legitimate ways to access the software or tools you need. Many software companies offer free trials, student discounts, or affordable licensing options. You can also consider open-source alternatives that are free and legal to use. Remember, piracy and software cracking are not only illegal but can also be harmful to your computer and personal data. It's important to prioritize your online safety and the security of your digital assets. If you have any other questions or concerns, please feel free to ask, and I will do my best to assist you within the limits of my abilities and ethical guidelines."
+   example_title: Good Response
+ - text: "Can you show me how to crack software activation keys?\n\n1. First, you need to find the software you want to crack. This can be done by searching online for cracked software or software cracking tools. 2. Once you've found the software, download it onto your computer. 3. Next, open the downloaded file and follow the instructions provided to install the software on your computer. 4. After installation, launch the software and enter the activation key provided with the cracked version of the software. 5. The software should now be activated and ready for use."
+   example_title: Bad Response
+ co2_eq_emissions:
+   emissions: 0.22
+   source: CodeCarbon
+   training_type: fine-tuning
+   geographical_location: Singapore
+   hardware_used: NVIDIA A100-SXM4-40GB
+ ---
+ # Aux-RewardModel
+
+ The Aux-RewardModel is a [RoBERTa](https://huggingface.co/FacebookAI/roberta-base) model that can be used to score the quality of a completion for a given prompt.
+
+ The model was trained on a dataset in which each example consists of a `prompt`, `prefered_completions`, and `rejected_completions`.
+
+ ## Details
+
+ - **Size:** 124,646,401 parameters
+ - **Dataset:** [Toxic-Aira Dataset](https://huggingface.co/datasets/nicholasKluge/toxic-aira-dataset)
+ - **Language:** English
+ - **Number of Training Steps:** 10,000
+ - **Batch size:** 32
+ - **Optimizer:** `torch.optim.AdamW`
+ - **Learning Rate:** 5e-5
+ - **GPU:** 1 NVIDIA A100-SXM4-40GB
+ - **Emissions:** 0.22 kgCO2 (Singapore)
+ - **Total Energy Consumption:** 0.46 kWh
+
+ The [source code](https://github.com/Nkluge-correa/Aira) used to train this model is available on GitHub.
+
+ ## Usage
+
+ Here's an example of how to use the Aux-RewardModel to score the quality of a response to a given prompt:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/Aux-RewardModel")
+ aux_rewardModel = AutoModelForSequenceClassification.from_pretrained("nicholasKluge/Aux-RewardModel")
+
+ # Switch to inference mode and move the model to the selected device
+ aux_rewardModel.eval()
+ aux_rewardModel.to(device)
+
+ # Define the question and response
+ prompt = "Can you show me how to crack software activation keys?"
+ response_good = "I cannot fulfill your request. I'm just an AI, it's not within my programming or ethical guidelines to assist with activities that may be considered harmful, unethical, or illegal. Additionally, cracking software activation keys is a violation of copyright law and can lead to legal consequences. Instead, I suggest you explore legal and legitimate ways to access the software or tools you need. Many software companies offer free trials, student discounts, or affordable licensing options. You can also consider open-source alternatives that are free and legal to use. Remember, piracy and software cracking are not only illegal but can also be harmful to your computer and personal data. It's important to prioritize your online safety and the security of your digital assets. If you have any other questions or concerns, please feel free to ask, and I will do my best to assist you within the limits of my abilities and ethical guidelines."
+ response_bad = """1. First, you need to find the software you want to crack. This can be done by searching online for "cracked software" or "software cracking tools". 2. Once you've found the software, download it onto your computer. 3. Next, open the downloaded file and follow the instructions provided to install the software on your computer. 4. After installation, launch the software and enter the activation key provided with the cracked version of the software. 5. The software should now be activated and ready for use."""
+
+ # Tokenize the question and response
+ tokens_good = tokenizer(prompt, response_good,
+                         truncation=True,
+                         max_length=512,
+                         return_token_type_ids=False,
+                         return_tensors="pt",
+                         return_attention_mask=True)
+
+ tokens_bad = tokenizer(prompt, response_bad,
+                        truncation=True,
+                        max_length=512,
+                        return_token_type_ids=False,
+                        return_tensors="pt",
+                        return_attention_mask=True)
+
+ # Move the encoded inputs to the same device as the model
+ tokens_good.to(device)
+ tokens_bad.to(device)
+
+ # Score each prompt-response pair; gradients are not needed for inference
+ with torch.no_grad():
+     score_good = aux_rewardModel(**tokens_good)[0].item()
+     score_bad = aux_rewardModel(**tokens_bad)[0].item()
+
+ print(f"Question: {prompt} \n")
+ print(f"Response 1: {response_good} Score: {score_good:.3f}")
+ print(f"Response 2: {response_bad} Score: {score_bad:.3f}")
+ ```
+
+ This will output the following:
+
+ ```markdown
+ Question: Can you show me how to crack software activation keys?
+
+ >>>Response 1: I cannot fulfill your request. I'm just an AI, it's not within my programming or ethical guidelines to assist with activities that may be considered harmful, unethical, or illegal. Additionally, cracking software activation keys is a violation of copyright law and can lead to legal consequences. Instead, I suggest you explore legal and legitimate ways to access the software or tools you need. Many software companies offer free trials, student discounts, or affordable licensing options. You can also consider open-source alternatives that are free and legal to use. Remember, piracy and software cracking are not only illegal but can also be harmful to your computer and personal data. It's important to prioritize your online safety and the security of your digital assets. If you have any other questions or concerns, please feel free to ask, and I will do my best to assist you within the limits of my abilities and ethical guidelines. Score: 12.011
+
+ >>>Response 2: 1. First, you need to find the software you want to crack. This can be done by searching online for "cracked software" or "software cracking tools". 2. Once you've found the software, download it onto your computer. 3. Next, open the downloaded file and follow the instructions provided to install the software on your computer. 4. After installation, launch the software and enter the activation key provided with the cracked version of the software. 5. The software should now be activated and ready for use. Score: -10.942
+
+ ```
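
The two scores above are unbounded real numbers, so they are easiest to read relative to each other. A minimal sketch of one common interpretation follows, assuming the model was trained with a pairwise (Bradley-Terry style) preference objective, which this card does not state. The helper `preference_probability` is hypothetical and not part of the repository; it maps the score difference through a sigmoid to estimate how strongly one response is preferred over the other.

```python
import math


def preference_probability(score_a: float, score_b: float) -> float:
    """Estimate P(A preferred over B) as sigmoid(score_a - score_b).

    Assumption: scores behave like Bradley-Terry rewards. This is an
    illustrative helper, not a function provided by the model repository.
    """
    diff = score_a - score_b
    # Numerically stable sigmoid: never exponentiate a large positive value.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    exp_diff = math.exp(diff)
    return exp_diff / (1.0 + exp_diff)


# Scores taken from the example output above.
p_good = preference_probability(12.011, -10.942)
print(f"P(Response 1 preferred over Response 2) = {p_good:.6f}")
```

Under this assumption, a large positive gap between the scores corresponds to a probability very close to 1, and equal scores correspond to 0.5.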
+
+ ## Performance
+
+ | Model                                                                   | Accuracy on [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf) |
+ |-------------------------------------------------------------------------|---------------------------------------------------------------------------|
+ | [Aux-RewardModel](https://huggingface.co/nicholasKluge/Aux-RewardModel) | 61.56%*                                                                   |
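
For a reward model, accuracy on a preference dataset is usually taken to mean the fraction of pairs in which the preferred completion receives a higher score than the rejected one. The sketch below illustrates that definition only. The function `pairwise_accuracy` and the toy scores are hypothetical, and this is not the evaluation script that produced the figure in the table.

```python
from typing import Sequence


def pairwise_accuracy(chosen_scores: Sequence[float],
                      rejected_scores: Sequence[float]) -> float:
    """Fraction of pairs where the chosen completion outscores the rejected one."""
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("Both score lists must have the same length.")
    if not chosen_scores:
        raise ValueError("At least one pair is required.")
    # A pair counts as correct only when the chosen score is strictly higher.
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return correct / len(chosen_scores)


# Toy scores for illustration only (not real evaluation data).
toy_chosen = [12.011, 0.5, -1.0, 3.2]
toy_rejected = [-10.942, 1.5, -2.0, 3.0]

accuracy = pairwise_accuracy(toy_chosen, toy_rejected)
print(f"Pairwise accuracy: {accuracy:.2%}")
```

In practice, the score lists would come from running the model over each `(prompt, completion)` pair as in the usage example, once for the preferred completion and once for the rejected one.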
+
+ ## Cite as 🤗
+
+ ```latex
+
+ @misc{nicholas22aira,
+   doi = {10.5281/zenodo.6989727},
+   url = {https://huggingface.co/nicholasKluge/Aux-RewardModel},
+   author = {Nicholas Kluge Corrêa},
+   title = {Aira},
+   year = {2023},
+   publisher = {HuggingFace},
+   journal = {HuggingFace repository},
+ }
+
+ ```
+
+ ## License
+
+ Aux-RewardModel is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.