#local llm

4 posts tagged #local llm.

Jul 2, 2026 · 5 min read
How I Make Local Model Runs Fail Safely On A 5090
A local model run should prove its safety path before it proves a score. Here is the small guardrail loop I use on my RTX 5090 for QLoRA starter work.
#local ai · #rtx 5090 · #qlora · #local llm
Jun 30, 2026 · 5 min read
How to Run Local LLM Verifier Loops on Owned Hardware
A local LLM workflow needs more than a model prompt. It needs a verifier loop that proves the file, command, URL, or report changed before the agent claims done.
#local LLM · #AI agents · #verifier loops · #llama.cpp
Apr 2, 2026 · 6 min read
Raspberry Pi 5 Offline Voice Assistant: Sub-2s, No Cloud (2026)
Want a private voice assistant with zero cloud and no subscription? A Raspberry Pi 5 runs it offline at sub-2s latency. We tested 6 local models on real hardware — here's the winner. (2026)
#Raspberry Pi · #Edge AI · #Voice Assistant · #Local LLM
Mar 28, 2026 · 7 min read
Local LLM on Consumer GPUs: 50 req/s, $0/Call [Benchmarks 2026]
Cloud LLM bills hit $2K/month fast. An RTX 5070 Ti serves Llama 3.1 at 50 req/s for $0 per call — we benchmarked 4 consumer GPUs and built the exact production setup.
#Local LLM · #Consumer GPU · #AI Agents · #llama.cpp
