---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- hendrycks/competition_math
widget:
- text: Find the number of positive divisors of 9!.
  example_title: Number theory
- text: Quadrilateral $ABCD$ is a parallelogram. If the measure of angle $A$ is 62
    degrees and the measure of angle $ADB$ is 75 degrees, what is the measure of angle
    $ADC$, in degrees?
  example_title: Prealgebra
- text: Suppose $x \in [-5,-3]$ and $y \in [2,4]$. What is the largest possible value
    of $\frac{x+y}{x-y}$?
  example_title: Intermediate algebra
base_model: bert-base-uncased
model-index:
- name: bert-finetuned-math-prob-classification
  results: []
---

# bert-finetuned-math-prob-classification

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) trained on part of the [competition_math dataset](https://huggingface.co/datasets/competition_math). Specifically, it was trained as a multi-class classifier on the problem text, predicting one of seven problem types (labels): "Counting & Probability", "Prealgebra", "Algebra", "Number Theory", "Geometry", "Intermediate Algebra", and "Precalculus".

## Model description

See the [bert-base-uncased](https://huggingface.co/bert-base-uncased) model card for architectural details. The only modification made here was to the classification head, which was sized for 7 classes.
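
As a sketch of that head configuration (the label ordering below is an assumption for illustration; the authoritative `id2label` mapping lives in the fine-tuned model's `config.json`):

```python
# Hypothetical label mapping for the 7-class head; the actual id order
# is recorded in the fine-tuned model's config.json.
LABELS = [
    "Algebra",
    "Counting & Probability",
    "Geometry",
    "Intermediate Algebra",
    "Number Theory",
    "Prealgebra",
    "Precalculus",
]
id2label = {i: name for i, name in enumerate(LABELS)}
label2id = {name: i for i, name in id2label.items()}

# The resized head would be attached when loading the base model, e.g.:
# from transformers import AutoModelForSequenceClassification
# model = AutoModelForSequenceClassification.from_pretrained(
#     "bert-base-uncased",
#     num_labels=len(LABELS),
#     id2label=id2label,
#     label2id=label2id,
# )
```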

## Intended uses & limitations

This model is intended for demonstration purposes only. The problem text was in English and contains many LaTeX tokens, so inputs in other languages or without LaTeX markup may degrade performance.

## Training and evaluation data

The `problem` field of the [competition_math dataset](https://huggingface.co/datasets/competition_math) was used as the input for training and evaluation; the target labels were taken from the `type` field.
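
A minimal sketch of turning one record into a `(text, label)` pair. The record below is a made-up stand-in using the dataset's real field names (`problem`, `type`, `level`, `solution`); loading the actual data would use `datasets.load_dataset("competition_math")`, and the label order here is an assumption:

```python
# Assumed label order for illustration; see the model's config.json.
LABELS = [
    "Algebra", "Counting & Probability", "Geometry",
    "Intermediate Algebra", "Number Theory", "Prealgebra", "Precalculus",
]
label2id = {name: i for i, name in enumerate(LABELS)}

def to_example(record):
    """Map a raw dataset record to the model's (input text, integer label) pair."""
    return record["problem"], label2id[record["type"]]

# Stand-in record mimicking the competition_math schema.
record = {
    "problem": "Find the number of positive divisors of 9!.",
    "type": "Number Theory",
    "level": "Level 3",  # present in the dataset but unused for this task
    "solution": "...",   # likewise unused for classification
}
text, label = to_example(record)
```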

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
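
The list above corresponds roughly to the following `Trainer` configuration (a sketch, assuming the standard `transformers` `Trainer` was used; `output_dir` is a placeholder, and the Adam betas/epsilon shown above are the library defaults, so they need no override):

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
args = TrainingArguments(
    output_dir="bert-finetuned-math-prob-classification",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
)
```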

### Training results

This fine-tuned model achieves the following results on problem-type classification over the competition_math test set:
```
                        precision    recall  f1-score   support

               Algebra       0.78      0.79      0.79      1187
Counting & Probability       0.75      0.81      0.78       474
              Geometry       0.76      0.83      0.79       479
  Intermediate Algebra       0.86      0.84      0.85       903
         Number Theory       0.79      0.82      0.80       540
            Prealgebra       0.66      0.61      0.63       871
           Precalculus       0.95      0.89      0.92       546

              accuracy                           0.79      5000
             macro avg       0.79      0.80      0.79      5000
          weighted avg       0.79      0.79      0.79      5000
```
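
The aggregate rows follow from the per-class rows; as a quick pure-Python sanity check (values copied from the table above):

```python
# Per-class (precision, recall, support) copied from the report above.
rows = {
    "Algebra":                (0.78, 0.79, 1187),
    "Counting & Probability": (0.75, 0.81, 474),
    "Geometry":               (0.76, 0.83, 479),
    "Intermediate Algebra":   (0.86, 0.84, 903),
    "Number Theory":          (0.79, 0.82, 540),
    "Prealgebra":             (0.66, 0.61, 871),
    "Precalculus":            (0.95, 0.89, 546),
}

total = sum(s for _, _, s in rows.values())  # test-set size
macro_precision = sum(p for p, _, _ in rows.values()) / len(rows)
weighted_precision = sum(p * s for p, _, s in rows.values()) / total
```

Both averages round to 0.79, matching the `macro avg` and `weighted avg` rows, and the supports sum to 5000.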

### Framework versions

- Transformers 4.22.2
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1