Ask HN: What does your local LLM setup look like?

PJHkorea · 2026-06-15T12:47:46 1781527666

I run Gemma 4 on a budget PC to experiment with an AI that thinks in a new way. Since I am testing how far it can be controlled and developed through simple conversation rather than heavy-duty tasks, I dedicated all my low-end PCs to Gemma. I am satisfied with the test results.

Duaard · 2026-06-15T11:43:32 1781523812

RTX 5090, just because it dual boots as a gaming rig. I have a llama server usually with `gemma4:31b` or `qwen3.6:27b` that powers my own automation / orchestration harness.

If I could go back in time, I would probably buy a more AI-dedicated machine, but I also don't regret finally being able to play Cyberpunk in 4K with great FPS and overkill mods.

Peterz_shu · 2026-06-15T11:19:06 1781522346

I have a Mac mini (the square one, idk the model) with an M3 chip.

Runs pretty well with Ollama on the Qwen models. It seems like Qwen has done a great job with speed.

abahgat · 2026-06-15T06:26:59 1781504819

I bought a pretty powerful desktop computer for gaming in late 2025. It came with an RTX 5080, which I started using to run local LLMs and try some experiments (most recently, I was trying to make agents better at playing Zork I).

I've mostly enjoyed having WSL to leverage Linux dev tools, but it still seems to add overhead that keeps me from taking full advantage of the GPU, so I'll likely get another drive and install Linux.

I tried Qwen, Llama, Mistral and Gemma. Gemma 4 was pretty impressive.

xiaoyu2006 · 2026-06-15T06:57:10 1781506630

V100 32G SXM2 adapted to PCIe. Running llama.cpp with Q4_K_M Qwen 3.6 27B or Gemma 4 31B. I use them when I feel privacy is important or I just want to mess around.

aniokono · 2026-06-15T06:02:06 1781503326

Claude.MD and other MD context files with guardrails and enforcement?

Are you offering something in this space?