gemma-2-2b_hs2_iter1_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation (step 115), it achieves the following results on the evaluation set:

Loss: 1.0900
Num Input Tokens Seen: 6573592

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3956	0
1.4494	0.0427	5	1.2764	283664
1.2474	0.0853	10	1.1812	576880
1.2268	0.128	15	1.1447	858360
1.0463	0.1707	20	1.1207	1150400
1.091	0.2133	25	1.1145	1435208
1.0393	0.256	30	1.1119	1716288
0.996	0.2987	35	1.1091	2002968
0.9853	0.3413	40	1.1115	2284960
0.8797	0.384	45	1.1093	2574696
1.0232	0.4267	50	1.1144	2861120
0.9278	0.4693	55	1.1065	3142784
0.8712	0.512	60	1.1112	3431816
0.8836	0.5547	65	1.1035	3720448
0.9139	0.5973	70	1.1034	4007136
0.8125	0.64	75	1.1018	4294416
0.8507	0.6827	80	1.1010	4576968
0.8093	0.7253	85	1.0978	4861272
0.8551	0.768	90	1.0976	5150768
0.7879	0.8107	95	1.0955	5441720
0.844	0.8533	100	1.0929	5720656
0.7869	0.896	105	1.0932	6007136
0.8237	0.9387	110	1.0916	6286960
0.768	0.9813	115	1.0900	6573592
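
The table can be summarized with a few lines of arithmetic. The sketch below is illustrative and not part of the original training code: it copies a subset of rows from the table, picks the row with the lowest validation loss, and converts validation loss to perplexity. The perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state. The names ROWS, perplexity, best_checkpoint, and summarize are hypothetical helpers introduced here.

```python
import math

# Columns: (training_loss, epoch, step, validation_loss, input_tokens_seen).
# A subset of rows copied from the table above; None stands for "No log".
ROWS = [
    (None,   0.0,    0,   1.3956, 0),
    (1.4494, 0.0427, 5,   1.2764, 283664),
    (1.0463, 0.1707, 20,  1.1207, 1150400),
    (0.9278, 0.4693, 55,  1.1065, 3142784),
    (0.844,  0.8533, 100, 1.0929, 5720656),
    (0.768,  0.9813, 115, 1.0900, 6573592),
]


def perplexity(loss):
    """Convert a loss to perplexity (assumes mean per-token loss in nats)."""
    return math.exp(loss)


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def summarize(rows):
    """Collect a few headline numbers from the logged rows."""
    first, last = rows[0], rows[-1]
    best = best_checkpoint(rows)
    return {
        "best_step": best[2],
        "best_val_loss": best[3],
        "val_ppl_start": perplexity(first[3]),
        "val_ppl_end": perplexity(last[3]),
        # Average input tokens consumed per optimizer step over the run.
        "tokens_per_step": last[4] / last[2],
    }


if __name__ == "__main__":
    for name, value in summarize(ROWS).items():
        print(f"{name}: {value}")
```

On these rows the lowest validation loss is the final one (step 115), so under the stated assumption the validation perplexity falls from roughly 4.04 at step 0 to roughly 2.97 at the end of the run.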