1 post tagged #multi-gpu.
Split a 70B model across multiple GPUs with llama.cpp. How --tensor-split, --main-gpu, and --split-mode work on a real consumer rig.