privetin committed
Commit a5d5962
1 Parent(s): 241f5e3

Update README.md

Files changed (1):
1. README.md (+69, -1)
README.md CHANGED
base_model:
- google-t5/t5-small
pipeline_tag: summarization
library_name: transformers
---
# Model Card for t5-small Summarization Model

## Model Details

- Model Architecture: T5 (Text-to-Text Transfer Transformer)
- Variant: t5-small
- Task: Text Summarization
- Framework: Hugging Face Transformers

## Training Data

- Dataset: CNN/DailyMail
- Content: News articles and their summaries
- Size: Approximately 300,000 article-summary pairs (see the loading sketch below)

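The card does not show how the data was loaded; the following is a minimal sketch using the Hugging Face `datasets` library. The `3.0.0` configuration and the `article`/`highlights` field names are standard for this dataset but are assumptions here, since the card does not pin a version.

```python
from datasets import load_dataset

# Assumption: the standard CNN/DailyMail 3.0.0 configuration; the card does
# not state which version was used for fine-tuning.
dataset = load_dataset("cnn_dailymail", "3.0.0")

# Each record pairs a news article with its human-written summary.
example = dataset["train"][0]
print(example["article"][:200])  # source article (truncated for display)
print(example["highlights"])     # reference summary
```

The train split alone contains roughly 287,000 pairs, in line with the card's figure of approximately 300,000.
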
## Training Procedure

- Fine-tuning method: fine-tuned with the Hugging Face Transformers library (a training sketch follows this list)
- Hyperparameters:
  - Learning rate: 5e-5
  - Batch size: 8
  - Number of epochs: 3
  - Optimizer: AdamW

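The card does not include the actual training script. Below is a minimal sketch, assuming `Seq2SeqTrainer` was used, that plugs in the hyperparameters listed above; AdamW is the Trainer's default optimizer, so it needs no explicit configuration. The output directory name, the 150-token label limit, and the preprocessing details are assumptions for illustration.

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
dataset = load_dataset("cnn_dailymail", "3.0.0")

def preprocess(batch):
    # T5 expects a task prefix; 512/150 token limits mirror the usage example.
    inputs = tokenizer(["summarize: " + a for a in batch["article"]],
                       max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["highlights"],
                       max_length=150, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-cnn-dailymail",  # hypothetical output path
    learning_rate=5e-5,                   # from the card
    per_device_train_batch_size=8,        # from the card
    num_train_epochs=3,                   # from the card
)                                         # AdamW is the Trainer default

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```
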
## How to Use

1. Install the Hugging Face Transformers library:
```bash
pip install transformers
```

2. Load the model:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Note: "t5-small" loads the base checkpoint from the Hub; substitute this
# repository's model id here to load the fine-tuned weights this card describes.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```

3. Generate a summary:
```python
input_text = "Your input text here"

# T5 expects a task prefix; inputs are truncated to the 512-token limit.
inputs = tokenizer("summarize: " + input_text, return_tensors="pt", max_length=512, truncation=True)

# Beam search with a length penalty favors complete summaries of 40-150 tokens.
summary_ids = model.generate(inputs["input_ids"], max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```

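Steps 2 and 3 can also be collapsed into the high-level `pipeline` API; this is a common shorthand rather than something the card prescribes, and as above, "t5-small" should be replaced with this repository's model id to use the fine-tuned weights:

```python
from transformers import pipeline

# Wraps tokenization, generation, and decoding in one call.
summarizer = pipeline("summarization", model="t5-small")
result = summarizer("Your input text here", max_length=150, min_length=40)
print(result[0]["summary_text"])
```
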
## Evaluation

- Metric: ROUGE (Recall-Oriented Understudy for Gisting Evaluation) scores (a scoring sketch follows this list)
- Exact scores are not available; summarization models of this kind are typically evaluated on:
  - ROUGE-1 (unigram overlap)
  - ROUGE-2 (bigram overlap)
  - ROUGE-L (longest common subsequence)

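Since the card reports no numbers, the sketch below is purely illustrative of how ROUGE can be computed with the Hugging Face `evaluate` package; the placeholder strings and the extra `rouge_score` dependency are assumptions, not part of the card.

```python
import evaluate

rouge = evaluate.load("rouge")  # requires: pip install evaluate rouge_score

predictions = ["the cat sat on the mat"]       # model summaries (placeholder)
references = ["a cat was sitting on the mat"]  # gold summaries (placeholder)

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum scores
```
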
## Limitations

- Performance may be lower than that of larger T5 variants
- Optimized for news-article summarization; it may not perform as well on other text types
- Limited to input sequences of 512 tokens
- Generated summaries may sometimes contain factual inaccuracies

## Ethical Considerations

- May inherit biases present in the CNN/DailyMail dataset
- Not suitable for summarizing sensitive or critical information without human review
- Users should be aware of potential biases and inaccuracies in generated summaries
- Should not be used as the sole source of information in decision-making processes