# Llama-3.2-3B · CodeAlpaca LoRA Adapter
A LoRA adapter fine-tuned on CodeAlpaca-20k
for instruction-following code generation tasks. Built on top of
meta-llama/Llama-3.2-3B with
4-bit NF4 quantization via bitsandbytes. Only ~1% of parameters are
trainable; the rest of the base model stays frozen.
## Model Details
| Field | Value |
|---|---|
| Base Model | meta-llama/Llama-3.2-3B |
| Adapter Type | LoRA (via PEFT) |
| Task | Instruction-following code generation |
| Language | English |
| License | MIT |
| Author | Parth Deshmukh |
| Date | April 2026 |
## Training Configuration
| Config | Value |
|---|---|
| LoRA Rank (r) | 8 |
| LoRA Alpha | 16 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, v_proj |
| Quantization | 4-bit NF4 (bitsandbytes BitsAndBytesConfig) |
| Compute dtype | float16 |
| Batch size | 2 per device (gradient accumulation steps = 4, effective batch size 8) |
| Mixed Precision | fp16 |
| Hardware | Google Colab T4 GPU (16GB VRAM) |
| Experiment Tracking | MLflow + Weights & Biases |
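The table above can be expressed in code roughly as follows. This is an illustrative sketch assuming `transformers`, `peft`, and `bitsandbytes` are installed; the actual training script is not included in this card.

```python
# Quantization + LoRA configuration matching the values in the table above.
# Illustrative only -- not the card author's exact training script.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype from the table
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```

With this setup, only the low-rank adapter matrices injected into `q_proj` and `v_proj` receive gradients; the quantized base weights are never updated.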
## Dataset
- Name: CodeAlpaca-20k
- Size: ~20,000 code instruction samples
- Split: 90/10 train/test (~18,000 train, ~2,000 test)
- Columns: `instruction`, `input`, `output`
- Prompt format:

```text
Instruction: {instruction}
Input: {input}
Response: {output}
```
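The template above can be rendered with a small helper. This is a sketch based on the dataset columns; the exact formatting function used during training is not published in this card.

```python
def build_prompt(instruction: str, input_text: str = "", output: str = "") -> str:
    """Render a CodeAlpaca example into the training prompt format.

    At inference time, leave `output` empty so the model completes the
    Response section itself.
    """
    return (
        f"Instruction: {instruction}\n"
        f"Input: {input_text}\n"
        f"Response: {output}"
    )

# Inference-time prompt: no output, model fills in the response.
prompt = build_prompt("Write a Python function that reverses a string.")
```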
## Evaluation Results
Evaluated on 200 held-out test samples from CodeAlpaca-20k using 4-bit
quantized inference. Metrics computed with the `evaluate` library (ROUGE-L) and
`bert_score` (BERTScore-F1).
| Model | ROUGE-L | BERTScore-F1 |
|---|---|---|
| Base (Llama-3.2-3B, no adapter) | 0.3303 | 0.7835 |
| Fine-tuned (this adapter) | 0.5458 | 0.8856 |
| Delta | +0.2155 (+65.2%) | +0.1021 (+13.0%) |
A ROUGE-L of 0.5458 sits at the top of the competitive range reported for fine-tuned code generation models (0.43–0.55), indicating that LoRA fine-tuning taught the model consistent instruction-following and code formatting behavior.
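A minimal sketch of how these two metrics can be computed with the `evaluate` library (illustrative; the card's exact evaluation script is not included, and loading the metrics fetches their definitions and, for BERTScore, a scoring model):

```python
import evaluate

# Load the two metrics used in the table above.
rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

# Toy prediction/reference pair -- in practice these are the model's
# generated responses vs. the dataset's reference outputs.
predictions = ["def reverse(s):\n    return s[::-1]"]
references = ["def reverse_string(s):\n    return s[::-1]"]

rouge_l = rouge.compute(predictions=predictions, references=references)["rougeL"]
bert_f1 = bertscore.compute(predictions=predictions, references=references,
                            lang="en")["f1"]
```

Aggregating `rouge_l` and the mean of `bert_f1` over all 200 test samples yields the scores reported in the table.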
## How to Use
Load the base model with 4-bit quantization, then apply this adapter using
PEFT's PeftModel.from_pretrained().
Prompt format:

```text
Instruction: Write a Python function that reverses a string.
Input:
Response:
```
Inference parameters used during evaluation:
- `max_new_tokens`: 200
- `do_sample`: False
- `repetition_penalty`: 1.1
- `pad_token_id`: `tokenizer.eos_token_id`
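Putting the loading steps and inference parameters together, a minimal end-to-end sketch follows. It assumes a CUDA GPU, access to the gated base model, and that the adapter repo id is `parthtamu/QLoRA-Finetuning`; treat the ids as placeholders.

```python
# Illustrative inference sketch -- requires GPU, bitsandbytes, and model access.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B"
adapter_id = "parthtamu/QLoRA-Finetuning"  # assumed adapter repo id

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

prompt = (
    "Instruction: Write a Python function that reverses a string.\n"
    "Input: \n"
    "Response: "
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```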
## Limitations
- Trained for only 1–3 epochs on 18k samples — may struggle with highly complex or multi-file code tasks.
- Optimized for single-instruction, single-response code generation; not designed for multi-turn conversation.
- Performance is measured on CodeAlpaca-style prompts; may degrade on very different prompt formats.
- Base model is 3B parameters — larger models (7B+) would likely achieve higher absolute scores.
## Project
This adapter was built as part of a 7-day end-to-end LLM fine-tuning project covering LoRA/QLoRA concepts, dataset preparation, training, evaluation, deployment, and CI/CD. Full project repository: github.com/your-username/llm-lora-finetuning