calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6463
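If this loss is a token-level cross-entropy in nats (the usual objective for Transformer language models, though the card does not say), it corresponds to a perplexity of roughly exp(0.6463) ≈ 1.91:

```python
import math

eval_loss = 0.6463  # final evaluation loss reported above

# Perplexity is exp(loss) when the loss is a mean cross-entropy in nats.
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # ≈ 1.91
```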

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
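These settings map onto a `transformers` `TrainingArguments` configuration along these lines. This is a hedged sketch, not the author's actual script: the argument names follow the standard `Trainer` API, `output_dir` is a placeholder, and the beta/epsilon values are the fused-AdamW defaults rather than anything the card states explicitly.

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the hyperparameters above.
# "output_dir" is a placeholder; only the listed values come from the card.
args = TrainingArguments(
    output_dir="calculator_model_test",  # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",  # fused AdamW; betas=(0.9, 0.999), eps=1e-8 are defaults
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```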

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4635        | 1.0   | 5    | 2.8650          |
| 2.5314        | 2.0   | 10   | 2.1158          |
| 1.9819        | 3.0   | 15   | 1.8063          |
| 1.6986        | 4.0   | 20   | 1.6031          |
| 1.5730        | 5.0   | 25   | 1.5538          |
| 1.5102        | 6.0   | 30   | 1.5036          |
| 1.4481        | 7.0   | 35   | 1.4591          |
| 1.3779        | 8.0   | 40   | 1.3530          |
| 1.3051        | 9.0   | 45   | 1.2608          |
| 1.2610        | 10.0  | 50   | 1.1978          |
| 1.2074        | 11.0  | 55   | 1.2343          |
| 1.1567        | 12.0  | 60   | 1.1180          |
| 1.1375        | 13.0  | 65   | 1.1520          |
| 1.1428        | 14.0  | 70   | 1.1030          |
| 1.0972        | 15.0  | 75   | 1.0581          |
| 1.0503        | 16.0  | 80   | 0.9979          |
| 0.9758        | 17.0  | 85   | 0.9513          |
| 0.9473        | 18.0  | 90   | 0.9317          |
| 0.9206        | 19.0  | 95   | 0.9380          |
| 0.9384        | 20.0  | 100  | 0.8643          |
| 0.8769        | 21.0  | 105  | 0.9630          |
| 0.9673        | 22.0  | 110  | 0.9533          |
| 0.9098        | 23.0  | 115  | 0.8435          |
| 0.8675        | 24.0  | 120  | 0.8262          |
| 0.8382        | 25.0  | 125  | 0.8295          |
| 0.8148        | 26.0  | 130  | 0.7936          |
| 0.8002        | 27.0  | 135  | 0.7727          |
| 0.7794        | 28.0  | 140  | 0.7617          |
| 0.7631        | 29.0  | 145  | 0.7373          |
| 0.7419        | 30.0  | 150  | 0.7182          |
| 0.7297        | 31.0  | 155  | 0.7168          |
| 0.7208        | 32.0  | 160  | 0.6962          |
| 0.7054        | 33.0  | 165  | 0.6853          |
| 0.6964        | 34.0  | 170  | 0.6826          |
| 0.6895        | 35.0  | 175  | 0.6700          |
| 0.6787        | 36.0  | 180  | 0.6599          |
| 0.6689        | 37.0  | 185  | 0.6539          |
| 0.6651        | 38.0  | 190  | 0.6495          |
| 0.6646        | 39.0  | 195  | 0.6490          |
| 0.6592        | 40.0  | 200  | 0.6463          |
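The table also lets us estimate the training-set size: 200 optimizer steps over 40 epochs is 5 steps per epoch, which at a train batch size of 512 implies roughly 2,560 training examples (an upper bound, since the last batch of an epoch may be partial):

```python
total_steps = 200       # final step count from the table
num_epochs = 40         # from the hyperparameters
train_batch_size = 512  # from the hyperparameters

steps_per_epoch = total_steps // num_epochs               # 5
approx_train_examples = steps_per_epoch * train_batch_size
print(steps_per_epoch, approx_train_examples)             # 5 2560
```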

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Safetensors

  • Model size: 7.8M params
  • Tensor type: F32