Luke Alonso PRO

lukealonso

AI & ML interests

None yet

Recent Activity

updated a model about 22 hours ago

lukealonso/GLM-5.1-NVFP4

new activity about 22 hours ago

lukealonso/GLM-5.1-NVFP4:RuntimeError: The size of tensor a (3072) must match the size of tensor b (6144) at non-singleton dimension 1

updated a model about 22 hours ago

lukealonso/MiniMax-M2.7-NVFP4

Organizations

None yet

New activity in lukealonso/GLM-5.1-NVFP4 about 22 hours ago

RuntimeError: The size of tensor a (3072) must match the size of tensor b (6144) at non-singleton dimension 1

#5 opened about 22 hours ago by

lianyouzao

New activity in lukealonso/MiniMax-M2.7-NVFP4 1 day ago

w1 not matching w3 weight scales

#1 opened 1 day ago by

dareposte

New activity in lukealonso/GLM-5.1-NVFP4 1 day ago

From "Doesn't Work" to 641 tok/s: GLM-5.1 NVFP4 on 6× RTX PRO 6000 Blackwell

🔥 1

#4 opened 2 days ago by

sakamakismile

New activity in lukealonso/GLM-5.1-NVFP4 3 days ago

Hopper GPU?

#2 opened 3 days ago by

AndrewMatienko

New activity in lukealonso/MiniMax-M2.5-NVFP4 about 2 months ago

Request: NVFP4 version of MiniMax-M2.5-REAP-139B (to fit on a single RTX 6000 Pro)

#7 opened about 2 months ago by

mondovero

New activity in lukealonso/GLM-5-NVFP4 about 2 months ago

Crash on first request on RTX Pro 6000 x8

👍 1

#3 opened about 2 months ago by

koushd

New activity in cerebras/MiniMax-M2.5-REAP-139B-A10B about 2 months ago

nvfp4

➕👍 2

#1 opened about 2 months ago by

ktsaou

New activity in lukealonso/MiniMax-M2.5-NVFP4 about 2 months ago

VLLM error for kv weight scaling - workaround

#6 opened about 2 months ago by

ShaunEvansMD

fp8 kv cache

#4 opened about 2 months ago by

festr2

Thanks for your effort

#5 opened about 2 months ago by

darkstar3537

KeyError: '110.w1.input_scale' with TRT

#3 opened about 2 months ago by

guanwenyu1995

"w1_weight_scale_2 must match w3_weight_scale_2. Accuracy may be affected."

👍 1

#2 opened about 2 months ago by

zenmagnets

Here's the vLLM recipe I'm using with 2x RTX Pro 6000

👍 3

#1 opened about 2 months ago by

zenmagnets

New activity in lukealonso/MiniMax-M2.1-NVFP4 about 2 months ago

one more time..? Minimax-M2.5 🔥

#4 opened about 2 months ago by

reneho

New activity in mratsim/MiniMax-M2.1-FP8-INT4-AWQ about 2 months ago

nvfp4

#9 opened 2 months ago by

festr2

New activity in lukealonso/MiniMax-M2.1-NVFP4 3 months ago

This is perfect! Thank you!

🔥 1

#1 opened 3 months ago by

ktsaou

New activity in lukealonso/MiniMax-M2-NVFP4 4 months ago

MinimaxM2.1

#5 opened 4 months ago by

reneho

New activity in MiniMaxAI/MiniMax-M2.1 4 months ago

NVFP4?

#2 opened 4 months ago by

ktsaou

New activity in lukealonso/MiniMax-M2-NVFP4 4 months ago

Devstral-2 NVFP4?

#3 opened 4 months ago by

reneho

New activity in lukealonso/MiniMax-M2-NVFP4 5 months ago

you know which nightly it worked with? because it does not with current one

#1 opened 5 months ago by

willfalco
