#quantization

2 posts tagged #quantization.

Jun 8, 2026 · 6 min read
How to Pick a GGUF Quant Level for Your VRAM Budget
Given your GPU, which GGUF quant do you actually pick? The VRAM math, a card-by-card table, and the quality tradeoff in plain terms.
#local-llm #gguf #quantization #gpu
May 13, 2026 · 8 min read
GGUF Quantization: Q4_K_M vs Q5_K_M vs Q8 — Which to Pick (2026)
Q4_K_M cuts model size by 75% with barely any quality loss, but Q5, Q6, and Q8 each win in specific cases. We benchmarked every quant level on real hardware. Here's which to pick.
#llama.cpp #GGUF #Quantization #Local AI
