Gemma 4 26B-A4B SecureCode

Security-specialized code generation model fine-tuned on the SecureCode and SecureCode Web datasets.

Part of the SecureCode model collection by perfecXion.ai.

Model Details

| Property | Value |
|----------|-------|
| Base Model | google/gemma-4-26b-a4b-it |
| Architecture | Gemma 4 Mixture-of-Experts (26B total, 4B active per token) |
| Method | QLoRA (4-bit NormalFloat quantization) |
| Parameters Trained | ~1-2% via LoRA adapters |
| Tier | Tier 3: Large Security Specialist |

Training Configuration

QLoRA Settings

| Parameter | Value |
|-----------|-------|
| Quantization | 4-bit NormalFloat (NF4) |
| Compute Dtype | bfloat16 |
| Double Quantization | Enabled |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
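The adapter settings above can be expressed as a `peft` LoraConfig. This is a minimal sketch reconstructed from the table, not the released training script:

```python
from peft import LoraConfig

# Sketch of the adapter configuration from the table above.
lora_config = LoraConfig(
    r=16,                  # LoRA rank
    lora_alpha=32,         # scaling factor (alpha / rank = 2.0)
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP / expert projections
    ],
    task_type="CAUSAL_LM",
)
```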

Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Learning Rate | 2e-4 |
| LR Scheduler | Cosine with 100-step warmup |
| Epochs | 3 |
| Per-device Batch Size | 2 |
| Gradient Accumulation | 8 steps |
| Effective Batch Size | 16 |
| Max Sequence Length | 4,096 tokens |
| Optimizer | paged_adamw_8bit |
| Precision | bf16 |
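Translated into `transformers` TrainingArguments, the table above would look roughly as follows. This is a sketch under the stated values; `output_dir` is a placeholder and the actual training script is not published:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters table; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="gemma4-26b-securecode-adapters",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch size: 2 x 8 = 16
    optim="paged_adamw_8bit",
    bf16=True,
)
```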

Hardware

| Component | Specification |
|-----------|---------------|
| System | NVIDIA DGX Spark |
| GPU | NVIDIA GB10 |
| Memory | 128 GB unified (shared CPU/GPU) |

Training Data

Combined and deduplicated from two datasets:

| Dataset | Examples | Focus |
|---------|----------|-------|
| scthornton/securecode | 2,185 | Web + AI/ML security (OWASP Top 10 2021 + LLM Top 10 2025) |
| scthornton/securecode-web | 1,378 | Web security with framework-specific patterns |

Coverage

Vulnerability Standards:

  • OWASP Top 10 2021 (Web/Application Security)
  • OWASP LLM Top 10 2025 (AI/ML Security)
  • 92+ CWEs mapped

Programming Languages: Python, JavaScript, Java, Go, PHP, TypeScript, C#, Ruby, Rust, Kotlin, YAML, HCL

Frameworks: 49+ including LangChain, OpenAI, Anthropic, HuggingFace, Django, Express.js, Spring Boot, FastAPI, and more

Training Format: 4-turn conversational examples:

  1. Developer asks about implementing a feature
  2. Assistant provides vulnerable + secure implementations with attack demonstrations
  3. Developer asks about testing and edge cases
  4. Assistant delivers defense-in-depth operational guidance

Every example is grounded in real CVEs and published security incidents.
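A training example in this format might look like the following. This is an illustrative, hand-written sketch of the 4-turn structure, not an actual row from the dataset:

```python
# Illustrative 4-turn example in the dataset's conversational format.
# The content here is invented for illustration; real examples are
# grounded in specific CVEs and incidents.
example = {
    "messages": [
        {"role": "user",
         "content": "How do I build a login endpoint that checks a "
                    "username and password?"},
        {"role": "assistant",
         "content": "Vulnerable version: string-concatenated SQL (SQL "
                    "injection). Secure version: parameterized query "
                    "plus salted password hashing..."},
        {"role": "user",
         "content": "How should I test this, and what edge cases matter?"},
        {"role": "assistant",
         "content": "Test with injection payloads, enforce lockout on "
                    "repeated failures, and ship auth events to your "
                    "SIEM for monitoring..."},
    ]
}

roles = [m["role"] for m in example["messages"]]
print(roles)  # ['user', 'assistant', 'user', 'assistant']
```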

Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Load with 4-bit quantization (matches training)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-26b-a4b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("scthornton/gemma4-26b-securecode")

# Attach the SecureCode LoRA adapters to the quantized base model
model = PeftModel.from_pretrained(base_model, "scthornton/gemma4-26b-securecode")

messages = [
    {"role": "user", "content": "How do I implement JWT authentication with refresh tokens in Python?"}
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

What Makes This Different

Standard code models generate functional but often insecure code. SecureCode-trained models:

  • Generate secure implementations by default with proper input validation, parameterized queries, and cryptographic best practices
  • Provide vulnerable AND secure code side-by-side so developers understand the risk
  • Include defense-in-depth guidance: logging, monitoring, SIEM integration, and infrastructure hardening
  • Cover AI/ML-specific vulnerabilities: prompt injection defenses, RAG security, model supply chain protection
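As a concrete illustration of the side-by-side pattern the model is trained to produce, compare a string-built SQL query with its parameterized fix. This is a minimal `sqlite3` sketch written for this card, not model output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

user_input = "' OR '1'='1"  # classic injection payload

# VULNERABLE: attacker-controlled input is spliced into the SQL string,
# so the OR '1'='1' clause matches every row in the table.
vulnerable = conn.execute(
    f"SELECT name FROM users WHERE name = '{user_input}'"
).fetchall()

# SECURE: a parameterized query treats the payload as literal data,
# so no row matches.
secure = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(vulnerable)  # [('alice',)] -- injection succeeded
print(secure)      # []           -- payload treated as data
```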

SecureCode Model Collection

| Model | Parameters | Base |
|-------|------------|------|
| llama-3.2-3b-securecode | 3B | Llama 3.2 3B |
| codegemma-7b-securecode | 7B | CodeGemma 7B IT |
| deepseek-coder-6.7b-securecode | 6.7B | DeepSeek Coder |
| qwen-coder-7b-securecode | 7B | Qwen Coder 7B |
| codellama-13b-securecode | 13B | Code Llama 13B |
| qwen2.5-coder-14b-securecode | 14B | Qwen 2.5 Coder 14B |
| starcoder2-15b-securecode | 15B | StarCoder2 15B |
| granite-20b-code-securecode | 20B | Granite 20B Code |
| gemma4-26b-securecode | 26B (4B active) | Gemma 4 26B-A4B IT |

Limitations

  • Training data focuses on defensive security patterns; not designed for offensive security tooling
  • 4-turn conversation format may not generalize to all coding interaction patterns
  • MoE routing activates only ~4B of the 26B parameters per token, so effective per-token capacity is closer to a small dense model than the total parameter count suggests
  • Security guidance reflects best practices as of early 2026; new vulnerabilities may not be covered

License

  • Model: Gemma license (inherited from base model)
  • Dataset: CC BY-NC-SA 4.0
  • Adapters: CC BY-NC-SA 4.0

Citation

```bibtex
@misc{thornton2026securecode,
  title={SecureCode: A Production-Grade Multi-Turn Dataset for Training Security-Aware Code Generation Models},
  author={Thornton, Scott},
  year={2026},
  publisher={perfecXion.ai},
  url={https://huggingface.co/datasets/scthornton/securecode},
  note={arXiv:2512.18542}
}
```