(Banner image: a wizard fox proofreading in a small library.)

Properly is the proofreader that doesn't steal your voice. Built with LoRA on Gemma 3 1B-IT.


Created with Kitsune: Forge.
Trained locally on an RTX 5060 Ti 16GB.

This model ran through four training epochs (E1–E4) on curated dataset mixtures from Hugging Face, as noted in the dataset information.

Properly v1.01 — E1: During smoke testing, the adapted model performed simple edits, missed a few things, and added witty banter after each edit. And emojis.

Properly v1.02 — E2: Dropped emoji data and other social media datasets. The model improved but still loved emojis and missed a few spelling mistakes. It also treated most inputs like LinkedIn posts — complete with hashtags, occasional duplicates, and a fondness for the word "theorectical." Painful and philosophical in equal measure, that misspelling exposed the gaps in the dataset.

Properly v1.03 — E3: Added spelling data to the mix. Catches the majority of errors. No banter. The occasional rogue 🚀. Pretty solid across tested turns, and "theorectical" finally became "theoretical."

Properly v1.04 — E4: Increased the spelling and edit percentage, removed everything else, and lowered the step count. Adjusted the learning rate from 1e-4 to 5e-5 and gradient accumulation from 8 to 16. Determined that temperature 0.5 with top_p 0.9, paired with a system prompt, is ideal: it eradicates most undesired behavior while preserving the author's voice, and spelling correction improved drastically.

The model does still struggle with informal conversational input — a prompt like "OMG i loved that song im listening to" can produce a full conversation rather than a correction. This behavior has not appeared in typical email or post editing tests. Another example (shown below): "is works" > "works", which is still incorrect — but I also stated it poorly, and the model did not change that kind of typo. -- imProperly? ;) A future training run should revise the dataset mix accordingly; the model will still generate errors at this stage.

Also found a bug in the dataviewer that caused the zigzagging in the curve. The issue originated in E1. Finally identified and corrected that bug, and added better health checks and viewing options to Forge.
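The E4 decoding settings can be written down as a plain set of generation kwargs. This is a minimal sketch — the `max_new_tokens` cap and the commented usage with Hugging Face `transformers` are illustrative assumptions, not the exact Forge pipeline:

```python
# Decoding settings determined during E4 testing:
# temperature 0.5, top_p 0.9, sampling enabled.
SAMPLING = {
    "do_sample": True,
    "temperature": 0.5,
    "top_p": 0.9,
    "max_new_tokens": 256,  # illustrative cap; not from the original notes
}

# Hypothetical usage with Hugging Face transformers (not run here):
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("deltakitsune/properly")
#   model = AutoModelForCausalLM.from_pretrained("deltakitsune/properly")
#   inputs = tok(text, return_tensors="pt")
#   out = model.generate(**inputs, **SAMPLING)
```

Pair these settings with the system prompt below; without it, the informal-input failure mode shows up more often.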


(Screenshot: example corrections.)


This experimentation is meant for learning — and hopefully to provide useful tools, or a reference for others to learn and experiment with. The model is functional but likely not ready for unsupervised use at this point. (Though imperfect spelling and grammar have their fans in certain circles...)


Training Data

(Training loss curve.)

The training run finished short of the target loss for a final product, but the model performs quite well for a 1B model given the limited training and testing. The purpose was to understand model size, capability, dataset mixtures, and temperature behavior within the pipeline.

System Prompt: You are Properly a helpful assistant. You fix grammar, spelling, and clarity. Preserve the author's voice. Return only the corrected text. No explanations. No commentary. No emojis or hashtags.

Baked in for Ollama. HuggingFace users will want to add this prompt manually.
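For Hugging Face users, prepending the prompt manually can be sketched like this. The `build_messages` helper is a hypothetical name; the commented `apply_chat_template` call is the standard `transformers` pattern, not a Forge-specific API:

```python
# System prompt exactly as baked into the Ollama build.
SYSTEM_PROMPT = (
    "You are Properly a helpful assistant. You fix grammar, spelling, and "
    "clarity. Preserve the author's voice. Return only the corrected text. "
    "No explanations. No commentary. No emojis or hashtags."
)

def build_messages(user_text: str) -> list:
    """Wrap raw input in the chat-message format used by chat templates."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]

# Hypothetical usage (not run here). Note that Gemma-family chat templates
# handle the system turn themselves, so let the tokenizer do the formatting:
#   tok = AutoTokenizer.from_pretrained("deltakitsune/properly")
#   prompt = tok.apply_chat_template(build_messages("teh quick brown fox"),
#                                    tokenize=False, add_generation_prompt=True)
```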

Technical Stuff.

Properly-E4-91E3-2

Summary

  • Training run: #95
  • Base model: google/gemma-3-1b-it
  • Artifact: Local artifact (path omitted)
  • Status: completed
  • Started: 2026-04-30T04:00:32.996087+00:00
  • Finished: 2026-04-30T09:22:44.290944+00:00
  • Final loss: 0.8404612632730544
  • Final accuracy: N/A - token accuracy not logged for this run type

Training Configuration

  • attn_implementation: eager
  • batch_size: 1
  • epochs: 1
  • grad_accum: 16
  • learning_rate: 5e-5
  • lora_alpha: 32
  • lora_dropout: 0.05
  • lora_rank: 16
  • max_grad_norm: 1
  • max_seq: 512
  • system_prompt_override: You are Properly a helpful assistant. You fix grammar, spelling, and clarity. Preserve the author's voice. Return only the corrected text. No explanations. No commentary. No emojis or hashtags.
  • target_examples: 50000

Dataset Configuration

  • Dataset Mix | 3 sources | seed 42 (mixture; 35,000 rows)

Recent Training Metrics

Accuracy is marked N/A because this run type logs causal language-model loss, not a publishable evaluation accuracy.

| Step | Loss | Accuracy | LR | Epoch | Timestamp |
|---|---|---|---|---|---|
| 2130 / 2188 | 0.8876 | - | - | 0.97 | 2026-04-30T09:14:13.929849+00:00 |
| 2140 / 2188 | 0.8566 | - | - | 0.98 | 2026-04-30T09:15:41.870227+00:00 |
| 2150 / 2188 | 0.8243 | - | - | 0.98 | 2026-04-30T09:17:06.787695+00:00 |
| 2160 / 2188 | 0.8341 | - | - | 0.99 | 2026-04-30T09:18:34.679929+00:00 |
| 2170 / 2188 | 0.8131 | - | - | 0.99 | 2026-04-30T09:20:01.712890+00:00 |
| 2180 / 2188 | 0.8141 | - | - | 1.0 | 2026-04-30T09:21:24.802808+00:00 |
| 2188 / 2188 | 0.8404612632730544 | - | - | - | 2026-04-30T09:22:42.250939+00:00 |
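The 2188-step total is consistent with the run configuration, assuming steps are counted per optimizer update with ceiling rounding over 35,000 rows:

```python
import math

rows, epochs, batch_size, grad_accum = 35_000, 1, 1, 16

# Optimizer steps = ceil(total examples / effective batch size).
steps = math.ceil(rows * epochs / (batch_size * grad_accum))
print(steps)  # 2188, matching the "2188 / 2188" in the metrics log
```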

Deployment History

| Target | Reference | Status | Created |
|---|---|---|---|
| huggingface | deltakitsune/properly | completed | 2026-05-01T01:13:15.059197+00:00 |

Notes

Generated by Kitsune Training Suite. Review limitations, intended use, safety notes, and licensing before publishing.
