3 min read · by Yanko Aleksandrov

Local AI on a Budget: What 8GB of Jetson Actually Runs in 2026

A practical look at which local models and automations actually run well on an 8GB Jetson Orin Nano — and where the honest limits are.


"Will 8GB even be enough?" is the most common question we get about running AI locally on small hardware. It deserves an honest answer, not a sales pitch — so here is exactly what an 8GB Jetson Orin Nano runs well, what it runs adequately, and where it genuinely hits a wall.

What 8GB actually means on a Jetson

Unlike a desktop PC, the Jetson's 8GB is unified memory — the CPU and the 1024-core GPU share the same pool. That cuts both ways: there's no separate "VRAM limit" to fight, but the OS, your services and the model all live in the same 8GB. In practice, after the system takes its share, you have roughly 5–6GB to spend on models.
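To see what your own box has left rather than trusting the rough 5–6GB figure, you can read the kernel's estimate from `/proc/meminfo`. The sketch below is a minimal parser for that file's standard `Name: value kB` layout. The helper names are ours, and it assumes that, because memory is unified, `MemAvailable` is a reasonable first approximation of what a model can claim.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts and are stored as-is.
    """
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_gib(text: str) -> float:
    """Return the kernel's MemAvailable estimate, converted from kB to GiB."""
    return parse_meminfo(text)["MemAvailable"] / (1024 ** 2)
```

On the device, `available_gib(open("/proc/meminfo").read())` gives the current figure. Run it with your usual services already started, since they share the same pool.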

What runs well

  • Quantized 4–8B language models. Llama 3.1 8B, Qwen 2.5 7B, or Phi-class models at Q4 quantization fit comfortably and typically generate at 10–20 tokens/second with GPU acceleration — fast enough for real conversational use. This is the sweet spot.
  • Speech-to-text. Whisper-class models run hardware-accelerated and handle voice notes and transcription without breaking a sweat.
  • An always-on assistant. This is the workload the box was made for: OpenClaw running 24/7 — inbox triage, browser automation, scheduled tasks, messaging on Telegram/WhatsApp/Discord — sips a few GB and leaves room for a local model alongside.
  • Hybrid setups. The pragmatic pattern most people land on: local model for routine/private work, plus your own OpenAI or Anthropic API key when a task needs a frontier-class brain. The box orchestrates both.
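The hybrid pattern boils down to a routing decision per task. Here is a minimal sketch of that decision. It is not OpenClaw's actual logic: `Task`, `choose_backend` and the character threshold are hypothetical names and values picked for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    prompt: str
    private: bool = False         # data that must never leave the box
    needs_frontier: bool = False  # caller judges it too hard for a 7-8B model


# Hypothetical cut-off: beyond this prompt size we assume the local model
# becomes too slow to be pleasant. Tune it for your own setup.
LOCAL_PROMPT_CHAR_LIMIT = 16_000


def choose_backend(task: Task, cloud_key_configured: bool) -> str:
    """Return "local" or "cloud" for a single task."""
    # Privacy wins over everything; no cloud key also forces local.
    if task.private or not cloud_key_configured:
        return "local"
    # Hard or very long tasks go to the cloud model.
    if task.needs_frontier or len(task.prompt) > LOCAL_PROMPT_CHAR_LIMIT:
        return "cloud"
    # Routine work stays on the box.
    return "local"
```

The ordering is the design choice that matters: the privacy check comes first, so a task marked private stays local even when it is also flagged as hard.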

What runs with compromises

  • Longer contexts. Stuffing tens of thousands of tokens into a small model slows prompt processing noticeably. For "chat with my whole PDF library" workloads, expect to wait or use the hybrid route.
  • 13B-class models. Aggressively quantized ones can technically load, but speed and quality trade-offs make 7–8B models the better daily drivers on 8GB.
  • Vision + language together. Doable with small models, but you're budgeting memory carefully at that point.
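Long contexts cost memory as well as time, because the model keeps a key/value cache entry for every token in the window. The sketch below estimates that cache size. The defaults are the published Llama 3.1 8B shape (32 layers, 8 KV heads, head dimension 128) with a 16-bit cache, which is an assumption about your runtime: some runtimes can quantize the cache and shrink these numbers.

```python
def kv_cache_gib(
    context_tokens: int,
    n_layers: int = 32,       # Llama 3.1 8B
    n_kv_heads: int = 8,      # grouped-query attention
    head_dim: int = 128,
    bytes_per_value: int = 2, # 16-bit cache, an assumption
) -> float:
    """Estimate KV cache size in GiB for a given context length."""
    # Factor 2: one key vector and one value vector per layer and KV head.
    per_token_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return context_tokens * per_token_bytes / 1024 ** 3


if __name__ == "__main__":
    for ctx in (4096, 8192, 32768):
        print(f"{ctx:>6} tokens: {kv_cache_gib(ctx):.2f} GiB")
```

Under these assumptions, an 8K window adds about 1 GiB on top of the weights, and a 32K window adds about 4 GiB, which is most of what the box has to spare.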

What honestly doesn't fit

  • Training or fine-tuning large models. Not the machine for it. Rent cloud GPUs for that.
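  • 70B-class local models. No. Even at 4-bit quantization the weights alone need roughly 35–40GB, several times the box's entire memory, and anyone telling you otherwise is selling something.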
  • "Run everything at once." Like any small server, it rewards picking a focused set of services.
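The three lists above follow from simple arithmetic: weight memory is roughly parameter count times bits per weight. The sketch below makes that explicit. The 4.5 bits per weight default is our approximation for Q4-style quantization (real files vary by format), and the 5.5GB budget is the midpoint of the 5–6GB figure from earlier. It ignores the context cache and runtime overhead, so treat a narrow "fits" as a warning rather than a promise.

```python
def weights_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * bits_per_weight / 8


def fits(
    params_billion: float,
    budget_gb: float = 5.5,
    bits_per_weight: float = 4.5,
) -> bool:
    """True if the weights alone fit inside the memory budget."""
    return weights_gb(params_billion, bits_per_weight) <= budget_gb


if __name__ == "__main__":
    for name, params, bits in [
        ("8B at ~Q4", 8, 4.5),
        ("13B at ~Q4", 13, 4.5),
        ("13B at ~3-bit", 13, 3.0),
        ("70B at ~2-bit", 70, 2.0),
    ]:
        size = weights_gb(params, bits)
        print(f"{name:<14} {size:5.1f} GB  fits: {fits(params, bits_per_weight=bits)}")
```

With these assumptions an 8B model lands at about 4.5GB, a 13B model only squeezes in at around 3 bits per weight, and a 70B model needs 17.5GB even at 2 bits.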

The bigger picture

The question behind the question is usually: can a small, low-power box be my actual daily AI assistant? And there the answer is clearly yes — because an assistant's usefulness comes from being always on and connected to your stuff (email, browser, messengers, schedules), not from raw model size. A 7B model that's awake at 3 AM doing your inbox beats a 70B model you have to boot up.

That's the design behind ClawBox: a Jetson Orin Nano Super (8GB, 67 TOPS) with 512GB NVMe storage and OpenClaw pre-installed, priced at €549 and drawing about as much power as a lightbulb (the Jetson module is rated for up to 25 W). Run local models, bring your own cloud key, or mix both. You own the hardware and you decide where your data goes, and now you also know exactly where the limits are.
