Llama 3.2 3B — E-commerce Distributed SQL

Fine-tuned version of Llama 3.2 3B that converts natural language questions into SQL queries for distributed e-commerce databases.

Example

Input:

### Instruction:
Convert to distributed SQL

### Input:
Find all customers who spent more than 1000 euros in Germany

### Response:

Output:

SELECT * FROM customers 
WHERE country = 'Germany' AND amount > 1000;

Model Details

Property	Value
Base model	Llama 3.2 3B
Fine-tuning method	QLoRA (4-bit quantization + LoRA)
LoRA rank	16
Trainable parameters	0.14%
Training GPU	Google Colab T4 (free tier)
Training time	~20 minutes
Dataset size	25 examples
Training epochs	3

Training Details

Fine-tuned using QLoRA: the base model is loaded with 4-bit NF4 quantization, and rank-16 LoRA adapters are trained on the attention projections q_proj and v_proj. This reduced memory requirements enough to train on a free Colab T4 GPU (about 15 GB of usable VRAM) in roughly 20 minutes, while updating only 0.14% of the parameters.
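
The 0.14% figure can be sanity-checked with a little arithmetic: a rank-r LoRA adapter on a linear layer with d_in inputs and d_out outputs adds r * (d_in + d_out) weights. The sketch below assumes the published Llama 3.2 3B shapes (28 layers, hidden size 3072, 24 query heads and 8 key/value heads of dimension 128, roughly 3.21 billion parameters in total). None of these numbers are stated in this card, so treat them as assumptions.

```python
# Rough check of the trainable-parameter share for rank-16 LoRA on
# q_proj and v_proj. The model shapes below are assumptions taken from
# the published Llama 3.2 3B configuration, not from this model card.

RANK = 16
NUM_LAYERS = 28
HIDDEN = 3072                 # input width of q_proj and v_proj
Q_OUT = 24 * 128              # 24 query heads x head dimension 128
V_OUT = 8 * 128               # 8 key/value heads x head dimension 128
BASE_PARAMS = 3_212_749_824   # assumed size of the base model


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


per_layer = lora_params(HIDDEN, Q_OUT, RANK) + lora_params(HIDDEN, V_OUT, RANK)
trainable = per_layer * NUM_LAYERS
percent = 100 * trainable / BASE_PARAMS

print(f"trainable LoRA weights: {trainable:,}")
print(f"share of base model:    {percent:.2f}%")
```

Under these assumptions the adapters hold about 4.6 million weights, which is consistent with the 0.14% reported in the table above.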

Libraries used: HuggingFace Transformers, PEFT, TRL (SFTTrainer), bitsandbytes, datasets

Dataset

25 natural language → SQL pairs covering distributed e-commerce scenarios:

Orders across regions and shards
Inventory across warehouses
Customer analytics and segmentation
Revenue aggregations
JOIN queries across fragmented tables

Prompt format used during training:

### Instruction:
Convert to distributed SQL

### Input:
{natural language question}

### Response:
{SQL query}
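
Because the model only saw this one template, prompts at inference time should reproduce it closely. A minimal sketch of a helper that builds both training samples and inference prompts follows. The function name `build_prompt` is hypothetical, and the exact blank-line spacing is assumed from the layout shown above.

```python
from typing import Optional

# Instruction text copied from the prompt format above
INSTRUCTION = "Convert to distributed SQL"


def build_prompt(question: str, sql: Optional[str] = None) -> str:
    """Format a question in the training template.

    With sql=None the prompt stops at the response marker (inference).
    With a query supplied, the query is appended (training sample).
    """
    prompt = (
        "### Instruction:\n"
        f"{INSTRUCTION}\n\n"
        "### Input:\n"
        f"{question.strip()}\n\n"
        "### Response:"
    )
    if sql is not None:
        prompt += "\n" + sql.strip()
    return prompt


# Inference prompt: ends right after "### Response:"
print(build_prompt("Find top 5 customers by total order value"))
```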

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

# Load the base model, then attach the LoRA adapter on top of it
base = AutoModelForCausalLM.from_pretrained("unsloth/Llama-3.2-3B")
model = PeftModel.from_pretrained(base, "haricharanhl22/ecommerce-distributed-sql")
tokenizer = AutoTokenizer.from_pretrained("haricharanhl22/ecommerce-distributed-sql")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# The prompt must follow the training format and end at "### Response:"
query = """### Instruction:
Convert to distributed SQL

### Input:
Find top 5 customers by total order value

### Response:"""

# Greedy decoding; return_full_text=False drops the prompt from the output
result = pipe(query, max_new_tokens=100, do_sample=False, return_full_text=False)
print(result[0]["generated_text"].strip())
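
The generated text may include the prompt itself, or run on past the first query until `max_new_tokens` is reached. A small post-processing step, sketched here with a hypothetical helper `extract_sql`, keeps only the first statement after the response marker. It assumes the query contains no semicolon inside a string literal.

```python
def extract_sql(generated_text: str) -> str:
    """Return the first SQL statement found after the response marker."""
    # Keep what follows the last response marker (the whole text if absent)
    completion = generated_text.split("### Response:")[-1]
    # Drop anything the model appended as a new "### ..." section
    completion = completion.split("###")[0]
    # Cut at the first semicolon, keeping it if one was found
    statement, semicolon, _ = completion.partition(";")
    return statement.strip() + semicolon


raw = "### Response:\nSELECT 1;\n### Instruction:\nsomething else"
print(extract_sql(raw))
```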

Limitations

Trained on a small dataset (25 examples), so it works best for common query patterns
Optimized for e-commerce schemas (orders, customers, products, inventory)
May not generalize well to complex queries with several levels of nested subqueries
Generated SQL is closest to standard SQL / SQLite syntax
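
Given these limitations, it can help to check generated queries before running them. Since the output is closest to SQLite syntax, one option is to compile each query against an empty in-memory SQLite database. The sketch below uses only the standard library. The schema is a hypothetical stand-in for illustration and should be replaced with the real table definitions.

```python
import sqlite3

# Hypothetical schema, for illustration only
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT, amount REAL);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, region TEXT, total REAL);
"""


def is_valid_sql(sql: str, schema: str = SCHEMA) -> bool:
    """Return True if SQLite can compile the statement against the schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement without executing it
        conn.execute("EXPLAIN " + sql)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # Syntax errors, unknown tables or columns, multiple statements
        return False
    finally:
        conn.close()


print(is_valid_sql("SELECT * FROM customers WHERE country = 'Germany' AND amount > 1000;"))
print(is_valid_sql("SELECT * FROM missing_table;"))
```

This only catches syntax errors and references to unknown tables or columns. It does not confirm that a query answers the question that was asked.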

Author

Hari Charan Hosakote Lokesh
M.Sc. Digital Engineering — Otto-von-Guericke-Universität Magdeburg

GitHub: haricharanhl22
LinkedIn: haricharanhl22
Live project: ai-bewerbung-assistant.vercel.app
