calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6463
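If this loss is a token-level cross-entropy in nats (the usual objective for Transformer language models, though the card does not say), it corresponds to a perplexity of roughly exp(0.6463) ≈ 1.91:

```python
import math

eval_loss = 0.6463  # final evaluation loss reported above

# Perplexity is exp(loss) when the loss is a mean cross-entropy in nats.
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # ≈ 1.91
```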

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
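These settings map onto a `transformers` `TrainingArguments` configuration along these lines. This is a hedged sketch, not the author's actual script: the argument names follow the standard `Trainer` API, `output_dir` is a placeholder, and the beta/epsilon values are the fused-AdamW defaults rather than anything the card states explicitly.

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the hyperparameters above.
# "output_dir" is a placeholder; only the listed values come from the card.
args = TrainingArguments(
    output_dir="calculator_model_test",  # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",  # fused AdamW; betas=(0.9, 0.999), eps=1e-8 are defaults
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```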

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4635        | 1.0   | 5    | 2.8650          |
| 2.5314        | 2.0   | 10   | 2.1158          |
| 1.9819        | 3.0   | 15   | 1.8063          |
| 1.6986        | 4.0   | 20   | 1.6031          |
| 1.5730        | 5.0   | 25   | 1.5538          |
| 1.5102        | 6.0   | 30   | 1.5036          |
| 1.4481        | 7.0   | 35   | 1.4591          |
| 1.3779        | 8.0   | 40   | 1.3530          |
| 1.3051        | 9.0   | 45   | 1.2608          |
| 1.2610        | 10.0  | 50   | 1.1978          |
| 1.2074        | 11.0  | 55   | 1.2343          |
| 1.1567        | 12.0  | 60   | 1.1180          |
| 1.1375        | 13.0  | 65   | 1.1520          |
| 1.1428        | 14.0  | 70   | 1.1030          |
| 1.0972        | 15.0  | 75   | 1.0581          |
| 1.0503        | 16.0  | 80   | 0.9979          |
| 0.9758        | 17.0  | 85   | 0.9513          |
| 0.9473        | 18.0  | 90   | 0.9317          |
| 0.9206        | 19.0  | 95   | 0.9380          |
| 0.9384        | 20.0  | 100  | 0.8643          |
| 0.8769        | 21.0  | 105  | 0.9630          |
| 0.9673        | 22.0  | 110  | 0.9533          |
| 0.9098        | 23.0  | 115  | 0.8435          |
| 0.8675        | 24.0  | 120  | 0.8262          |
| 0.8382        | 25.0  | 125  | 0.8295          |
| 0.8148        | 26.0  | 130  | 0.7936          |
| 0.8002        | 27.0  | 135  | 0.7727          |
| 0.7794        | 28.0  | 140  | 0.7617          |
| 0.7631        | 29.0  | 145  | 0.7373          |
| 0.7419        | 30.0  | 150  | 0.7182          |
| 0.7297        | 31.0  | 155  | 0.7168          |
| 0.7208        | 32.0  | 160  | 0.6962          |
| 0.7054        | 33.0  | 165  | 0.6853          |
| 0.6964        | 34.0  | 170  | 0.6826          |
| 0.6895        | 35.0  | 175  | 0.6700          |
| 0.6787        | 36.0  | 180  | 0.6599          |
| 0.6689        | 37.0  | 185  | 0.6539          |
| 0.6651        | 38.0  | 190  | 0.6495          |
| 0.6646        | 39.0  | 195  | 0.6490          |
| 0.6592        | 40.0  | 200  | 0.6463          |
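The table also lets us estimate the training-set size: 200 optimizer steps over 40 epochs is 5 steps per epoch, which at a train batch size of 512 implies roughly 2,560 training examples (an upper bound, since the last batch of an epoch may be partial):

```python
total_steps = 200       # final step count from the table
num_epochs = 40         # from the hyperparameters
train_batch_size = 512  # from the hyperparameters

steps_per_epoch = total_steps // num_epochs               # 5
approx_train_examples = steps_per_epoch * train_batch_size
print(steps_per_epoch, approx_train_examples)             # 5 2560
```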

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Safetensors

  • Model size: 7.8M params
  • Tensor type: F32