
Model Card for t5-small Summarization Model

Model Details

  • Model Architecture: T5 (Text-to-Text Transfer Transformer)
  • Variant: t5-small
  • Task: Text Summarization
  • Framework: Hugging Face Transformers

Training Data

  • Dataset: CNN/DailyMail
  • Content: News articles and their summaries
  • Size: Approximately 300,000 article-summary pairs

Training Procedure

  • Fine-tuning method: Fine-tuned with the Hugging Face Transformers library
  • Hyperparameters:
    • Learning rate: 5e-5
    • Batch size: 8
    • Number of epochs: 3
  • Optimizer: AdamW
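
For a rough sense of scale, the hyperparameters above imply the number of optimizer steps computed below. This is a back-of-the-envelope sketch. It assumes that all of the roughly 300,000 pairs were used for training, on a single device, with no gradient accumulation; none of these details are stated in this card.

```python
import math

# Values taken from this card.
num_examples = 300_000  # approximate dataset size
batch_size = 8
num_epochs = 3

# Assumptions (not stated in the card): one device, no gradient
# accumulation, and the last partial batch of each epoch is kept.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
```

With a different split size, device count, or accumulation setting, the step count changes proportionally.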

How to Use

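  1. Install the Hugging Face Transformers library (PyTorch is also needed because the example returns PyTorch tensors):
pip install transformers torch
  2. Load the model:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Note: "t5-small" is the base checkpoint; to load the fine-tuned weights,
# pass this model's repository ID instead.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
  3. Generate a summary:
input_text = "Your input text here"
# T5 expects a task prefix; inputs longer than 512 tokens are truncated.
inputs = tokenizer("summarize: " + input_text, return_tensors="pt", max_length=512, truncation=True)
# Pass the attention mask along with the input IDs; decode with 4-beam search.
summary_ids = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"], max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)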
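
The `length_penalty=2.0` setting in the generation call can be unintuitive. In Transformers' beam search it is applied as an exponent to the sequence length, which then divides the sequence's log-likelihood. Because log-likelihoods are negative, values above 0 favor longer outputs. The sketch below illustrates that rule with made-up numbers. The function name `beam_score` and the two example beams are hypothetical, and this is an illustration rather than the library's own code.

```python
def beam_score(sum_logprobs: float, length: int, length_penalty: float) -> float:
    """Length-normalized beam score: the summed token log-probabilities
    divided by length ** length_penalty."""
    return sum_logprobs / (length ** length_penalty)


# Two hypothetical finished beams (values invented for illustration).
short_beam = {"sum_logprobs": -6.0, "length": 10}
long_beam = {"sum_logprobs": -9.0, "length": 20}

for penalty in (0.0, 1.0, 2.0):
    short_score = beam_score(short_beam["sum_logprobs"], short_beam["length"], penalty)
    long_score = beam_score(long_beam["sum_logprobs"], long_beam["length"], penalty)
    winner = "short" if short_score > long_score else "long"
    print(f"length_penalty={penalty}: short={short_score:.4f}, long={long_score:.4f} -> {winner}")
```

With no penalty the shorter beam wins on raw log-likelihood. With a penalty of 1.0 or 2.0 the longer beam scores higher, which is the behavior the usage example relies on to avoid very short summaries.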

Evaluation

  • Metric: ROUGE scores (Recall-Oriented Understudy for Gisting Evaluation)
  • Exact scores are not available; summarization models of this kind are typically evaluated on:
    • ROUGE-1 (unigram overlap)
    • ROUGE-2 (bigram overlap)
    • ROUGE-L (longest common subsequence)
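
To make the three metrics concrete, here is a minimal pure-Python sketch of their F-measures. It is a simplification: it lowercases and splits on whitespace, with no stemming and none of the tokenization rules used by standard ROUGE packages, so its numbers will not match official implementations exactly. All function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, pred_total, ref_total):
    """Harmonic mean of precision (overlap/pred) and recall (overlap/ref)."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n):
    """ROUGE-N F-measure: clipped n-gram overlap between the two texts."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(pred.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """ROUGE-L F-measure based on the longest common subsequence."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    return f1(lcs_length(pred, ref), len(pred), len(ref))


prediction = "the cat sat on the mat"
reference = "the cat lay on the mat"
print("ROUGE-1:", round(rouge_n(prediction, reference, 1), 4))
print("ROUGE-2:", round(rouge_n(prediction, reference, 2), 4))
print("ROUGE-L:", round(rouge_l(prediction, reference), 4))
```

For reportable scores, an established ROUGE implementation is the safer choice; this sketch only shows what each metric counts.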

Limitations

  • Performance may be lower than that of larger T5 variants
  • Optimized for news article summarization; may not perform as well on other text types
  • Limited to input sequences of 512 tokens; longer inputs are truncated
  • Generated summaries may sometimes contain factual inaccuracies
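
One common workaround for the 512-token input limit is to split a long document into windows, summarize each window, and combine the results. The sketch below shows only the splitting step, on a plain list of token IDs. The function `chunk_token_ids` and its `overlap` and `reserved` parameters are hypothetical; in practice the IDs would come from the tokenizer, and `reserved` would leave room for the "summarize: " prefix and the end-of-sequence token. This approach is not something the card's author describes using.

```python
def chunk_token_ids(token_ids, max_tokens=512, overlap=0, reserved=0):
    """Split token IDs into windows of at most (max_tokens - reserved) IDs.

    Consecutive windows share `overlap` IDs so that sentences cut at a
    boundary appear in full in at least one window.
    """
    window = max_tokens - reserved
    if window <= 0:
        raise ValueError("reserved must be smaller than max_tokens")
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than the window")

    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the current window already reaches the end of the input
    return chunks


# Stand-in for real token IDs (a 1,200-token document).
ids = list(range(1200))
chunks = chunk_token_ids(ids, max_tokens=512, overlap=32, reserved=12)
print([len(chunk) for chunk in chunks])
```

Summaries produced from separate windows can repeat or contradict each other, so the combined output still needs review.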

Ethical Considerations

  • May inherit biases present in the CNN/DailyMail dataset
  • Not suitable for summarizing sensitive or critical information without human review
  • Users should be aware of potential biases and inaccuracies in generated summaries
  • Should not be used as a sole source of information for decision-making processes

Model tree for privetin/model-1

  • Base model: google-t5/t5-small
  • This model: fine-tuned from the base model