Writing · Tag
1 post tagged #agent-cost-control.
Tom Tunguz called it localmaxxing. I run a 3070 + 5070 Ti + 5090 in one box and serve Llama 3.1 8B locally every day. Here are the real tokens-per-second, the real watts, and the real cost per million tokens.
Real costs, real tools, no fluff. One email per week with what I'm building, what's working, and what's not.