How to Deploy NVIDIA NemoClaw on JarvisLabs

NVIDIA NemoClaw is an open-source security stack that sandboxes autonomous AI agents at the kernel level. It wraps OpenClaw (an always-on AI assistant) inside the NVIDIA OpenShell runtime, enforcing filesystem isolation, network policies, and process restrictions so your agent can't escape its sandbox.
This guide shows the setup we used to run NemoClaw with local Nemotron 3 Nano 30B inference on a JarvisLabs A100 VM. No cloud API keys needed. Total setup time was under 15 minutes, inference runs at ~1 second per query, and the whole thing cost $1.13.
NemoClaw is in early preview (released March 16, 2026). APIs and runtime behavior may have breaking changes. Do not use this in production.
Tested Environment
| Setting | Value |
|---|---|
| GPU | A100 40GB VRAM, 110GB system RAM |
| Instance type | VM (full disk persistence) |
| Model | Nemotron 3 Nano 30B via Ollama |
| OS | Ubuntu 22.04, CUDA 12.8, Docker 29.2 |
| NemoClaw | v0.1.0 (early preview) |
| Cold start to first query | ~12 minutes (includes model download) |
| Inference time | ~1.1 seconds (warm) |
| GPU memory usage | 25GB / 40GB |
| Total cost | $1.13 (52 minutes of A100 time) |
Run your ML workloads on Jarvislabs
A100s, H100s, and H200s with per-minute billing. Pre-configured environments, 90-second startup, and no long-term commitments.
Get Started

Before You Start
- JarvisLabs account + jl CLI installed (installation guide)
- The Nemotron 3 Nano 30B model download is ~24GB
- NemoClaw pulls ~2.4GB of container images
- Total disk usage: ~35GB (well within the 100GB VM disk)
To install the CLI (requires uv):
uv tool install jarvislabs
jl setup # interactive — enter your API token from jarvislabs.ai/settings
Why a VM Instead of a Container
NemoClaw needs Docker (it runs OpenShell inside Docker, which manages a K3s cluster internally). It also requires Node.js, the openshell CLI, and several system-level packages. On a JarvisLabs container, anything installed via apt is lost when you pause and resume. A VM gives you full disk persistence, so you set up once and everything survives pause/resume.
The tradeoff: VMs have a public IP, which means you need to set up a firewall. We cover that below.
NemoClaw Installation
Step 1: Create a VM
jl create --gpu A100 --region IN2 --vm --name nemoclaw-agent
Note the instance ID and SSH command from the output. SSH into the VM:
ssh -o StrictHostKeyChecking=no ubuntu@<instance-ip>
All remaining commands run on the VM unless stated otherwise.
Step 2: Lock Down the Firewall
VMs have a public IP. Before installing anything, set up UFW to block all incoming traffic except SSH:
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw --force enable
Then allow Docker's internal networks to reach Ollama (this is critical and easy to miss):
sudo ufw allow from 172.17.0.0/16 to any port 11434 proto tcp
sudo ufw allow from 172.18.0.0/16 to any port 11434 proto tcp
sudo ufw allow from 10.42.0.0/16 to any port 11434 proto tcp
Without these rules, NemoClaw's sandboxed containers won't be able to reach the Ollama inference server running on the host. The 172.17.0.0/16 rule covers the default Docker bridge network, 172.18.0.0/16 covers the Docker network that the OpenShell gateway container uses, and 10.42.0.0/16 covers the K3s pod CIDR (K3s default). These ranges may differ on your system — check docker network ls and docker inspect if the defaults don't work.
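If you're unsure whether a given container can get through, a quick sanity check is to test its IP against the ranges you allowed. A minimal sketch (the CIDRs below are the defaults from the rules above; substitute whatever `docker network ls` shows on your system):

```python
import ipaddress

# Ranges we opened in UFW for port 11434 (defaults; yours may differ)
ALLOWED = [ipaddress.ip_network(c) for c in
           ("172.17.0.0/16", "172.18.0.0/16", "10.42.0.0/16")]

def can_reach_ollama(container_ip: str) -> bool:
    """True if the container IP is covered by one of the allowed UFW rules."""
    ip = ipaddress.ip_address(container_ip)
    return any(ip in net for net in ALLOWED)

print(can_reach_ollama("172.17.0.5"))    # default Docker bridge -> True
print(can_reach_ollama("192.168.1.10"))  # not covered -> False
```

Find a container's actual IP with `docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container>`.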
Step 3: Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
Expected output: >>> NVIDIA GPU installed.
Step 4: Configure Ollama to Listen on All Interfaces
By default, Ollama only listens on 127.0.0.1. NemoClaw's sandbox runs inside a container, so it needs Ollama listening on 0.0.0.0:
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf > /dev/null <<EOF
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama
Verify it's listening correctly:
curl -s http://localhost:11434/api/tags
Expected output: {"models":[]} (empty because we haven't pulled a model yet).
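If you're scripting the setup, you can parse that response to decide whether the model still needs pulling. A small sketch, assuming the `models[].name` field of Ollama's `/api/tags` response:

```python
import json

def has_model(tags_json: str, name: str) -> bool:
    """Check whether a model whose name starts with `name` is already pulled."""
    models = json.loads(tags_json).get("models", [])
    return any(m.get("name", "").startswith(name) for m in models)

# Right after install the list is empty, so this is False:
print(has_model('{"models":[]}', "nemotron-3-nano"))
```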
Step 5: Pull the Nemotron Model
ollama pull nemotron-3-nano:30b
This downloads ~24GB. On the A100 instance it took about 1 minute 40 seconds.
Expected output: success after the download completes.
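As a back-of-envelope check on that timing, ~24GB in ~100 seconds works out to roughly 246 MB/s of sustained download throughput on the JarvisLabs network:

```python
# Rough throughput implied by the numbers above (approximate figures)
size_gb = 24
seconds = 100  # ~1 minute 40 seconds
mb_per_s = size_gb * 1024 / seconds
print(f"~{mb_per_s:.0f} MB/s")
```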
Step 6: Start Docker
The JarvisLabs VM has Docker pre-installed but it may not be running:
sudo systemctl start docker
sudo systemctl enable docker
sudo usermod -aG docker $USER
newgrp docker
newgrp docker applies the new group membership to your current shell without a re-login. Without it, the NemoClaw installer will fail with Docker permission errors.
Step 7: Install NemoClaw
curl -fsSL https://www.nvidia.com/nemoclaw.sh | \
NEMOCLAW_NON_INTERACTIVE=1 \
NEMOCLAW_PROVIDER=ollama \
NEMOCLAW_MODEL=nemotron-3-nano:30b \
bash
This does three things:
- Installs Node.js via nvm (if not present)
- Installs the NemoClaw CLI
- Runs the onboarding wizard in non-interactive mode
Expected output: The installer will progress through 7 steps:
[1/7] Preflight checks ✓
[2/7] Starting OpenShell gateway ✓
[3/7] Configuring inference ✓
[4/7] Setting up inference ✓
[5/7] Creating sandbox ✓ (takes a few minutes on first run)
[6/7] Setting up OpenClaw ✓
[7/7] Policy presets ✓
At the end you'll see:
Sandbox my-assistant (Landlock + seccomp + netns)
Model nemotron-3-nano:30b (Local Ollama)
If step 4 fails with "containers cannot reach host.openshell.internal:11434", the firewall rules from Step 2 are missing. Go back and add them, then rerun the installer.
Step 8: Verify Inference
Test that Ollama is serving the model correctly on the host:
curl -sf http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"nemotron-3-nano:30b","messages":[{"role":"user","content":"What is the capital of France? Reply in one sentence."}]}'
Expected output: A JSON response with "choices" containing the model's reply. The first query takes a few seconds as the model loads into GPU memory; subsequent queries are faster (~1 second).
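Since the endpoint is OpenAI-compatible, the reply text sits at `choices[0].message.content`. A minimal extraction sketch, run here against a hand-written sample response (the sample text is made up, not captured output):

```python
import json

# Hand-written sample in the OpenAI-compatible response shape
sample = json.dumps({
    "choices": [
        {"message": {"role": "assistant",
                     "content": "The capital of France is Paris."}}
    ]
})

def extract_reply(body: str) -> str:
    """Pull the assistant's text out of an OpenAI-style chat completion."""
    return json.loads(body)["choices"][0]["message"]["content"]

print(extract_reply(sample))
```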
This verifies Ollama on the host. The NemoClaw sandbox routes inference through inference.local internally — the onboarding wizard already validated that path in Step 7.
Step 9: Connect to the Sandbox
nemoclaw my-assistant connect
This drops you into a sandboxed shell. From here, launch the OpenClaw chat interface:
openclaw tui
You're now chatting with Nemotron 3 Nano 30B through a kernel-level sandboxed agent. Every network request, filesystem access, and process spawn is governed by NemoClaw's security policies.
Step 10: Clean Up
When you're done, exit the SSH session and pause from your local machine to stop billing. You can find your instance ID with jl list:
jl list
jl pause <instance-id>
Since this is a VM, everything persists. When you resume, Ollama and Docker restart automatically. Reconnect with:
ssh -o StrictHostKeyChecking=no ubuntu@<instance-ip>
nemoclaw my-assistant connect
If nemoclaw my-assistant connect fails after resume, restart the services first:
sudo systemctl start docker ollama
# Wait ~30 seconds for the OpenShell gateway to come back
nemoclaw my-assistant connect
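Instead of a fixed 30-second wait, you can poll until the port actually answers. A small sketch (the host and port here are the Ollama defaults from this guide; the same approach works for the gateway port):

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Return True once a TCP connect to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(1)  # not up yet; retry
    return False

# wait_for_port("localhost", 11434)  # True once Ollama is back up
```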
How NemoClaw Security Works
NemoClaw creates a multi-layered sandbox using Linux kernel security features:
Landlock restricts filesystem access. The agent can only read/write to /sandbox and /tmp. System directories like /usr and /etc are read-only. The agent cannot access your home directory, SSH keys, or any files outside its sandbox.
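As an illustration only (the real enforcement happens in the kernel via Landlock, not in userspace), the write policy described above amounts to a prefix check on the path:

```python
from pathlib import PurePosixPath

# The writable roots stated in the policy above
WRITABLE = (PurePosixPath("/sandbox"), PurePosixPath("/tmp"))

def write_allowed(path: str) -> bool:
    """Would the stated policy permit a write to this path? (Illustrative.)"""
    p = PurePosixPath(path)
    return any(p == root or root in p.parents for root in WRITABLE)

print(write_allowed("/sandbox/work/out.txt"))    # True
print(write_allowed("/home/ubuntu/.ssh/id_rsa")) # False
```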
seccomp filters system calls. The agent can't call ptrace, mount, or other privileged syscalls that could let it escape the sandbox or escalate privileges.
Network namespaces isolate network access. All outbound traffic is blocked by default. The agent can only reach endpoints explicitly allowed in the policy (like inference.local for the LLM, github.com for code access, etc.). Every connection goes through the OpenShell gateway which enforces TLS termination and path-based rules.
Inference routing keeps API keys off the sandbox. When the agent calls the LLM, the request goes to inference.local (a virtual endpoint inside the sandbox). The OpenShell gateway intercepts it and routes it to the actual provider (Ollama in our case). Your API keys never enter the sandbox.
The default policy includes network rules for GitHub, npm registry, NVIDIA endpoints, Telegram, Discord, and the inference endpoint. Additional presets like PyPI and npm can be added during onboarding or later. Run nemoclaw my-assistant policy-list to see what's active, and nemoclaw my-assistant policy-add to add more.
NemoClaw Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| "containers cannot reach host.openshell.internal:11434" | UFW blocking Docker-to-host traffic | Add UFW rules for 172.17.0.0/16, 172.18.0.0/16, and 10.42.0.0/16 on port 11434 |
| "Docker is not running" | Docker service not started | sudo systemctl start docker |
| "Ollama listens on 127.0.0.1" | Default Ollama config | Create systemd override with OLLAMA_HOST=0.0.0.0:11434 |
| "nemoclaw: command not found" | nvm PATH not loaded | source ~/.bashrc or restart shell |
| "port 8080 in use" | Previous gateway still running | Reuse it (NemoClaw detects this automatically) |
| Sandbox image pull timeout | Slow network or large image | Retry; the openclaw image is ~2.2GB compressed |
Cost and GPU Requirements
| GPU | VRAM | RAM | $/hr | Result |
|---|---|---|---|---|
| A100 | 40GB | 110GB | $1.29 | Works. 25GB VRAM used, 1.1s inference |
Nemotron 3 Nano 30B needs ~25GB VRAM during inference on our test (short prompts, single turn). The A100's 40GB gives comfortable headroom. An L4 (24GB VRAM) fits the model at short context lengths according to Ollama's docs, but we didn't test it — longer conversations or larger contexts may push beyond 24GB.
Total compute cost for this tutorial: $1.13 (52 minutes including setup, model download, testing, and idle time between steps).
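The "under $0.001 per query" claim in the summary below follows directly from these numbers: at $1.29/hr with ~1.1 seconds per warm query,

```python
# Per-query cost at the observed rates (approximate)
hourly_usd = 1.29
query_seconds = 1.1
cost_per_query = hourly_usd / 3600 * query_seconds
print(f"${cost_per_query:.5f} per query")  # well under $0.001
```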
What is NemoClaw?
NemoClaw was announced at GTC on March 16, 2026. It's NVIDIA's answer to a real problem: as AI agents get more autonomous (browsing the web, writing code, managing files), how do you stop them from doing damage?
The stack has four layers:
- NemoClaw CLI — TypeScript tool that orchestrates everything
- NemoClaw Blueprint — Python orchestration for sandbox and policy management
- OpenShell — The runtime that creates and manages sandboxed containers with kernel-level isolation
- OpenClaw — The AI assistant framework that runs inside the sandbox
NemoClaw is Apache 2.0 licensed and currently in early preview. It supports NVIDIA Endpoints, OpenAI, Anthropic, Google Gemini, and local inference via Ollama.
- GitHub: NVIDIA/NemoClaw
- Docs: docs.nvidia.com/nemoclaw
- Ollama integration: docs.ollama.com/integrations/nemoclaw
What We Learned
- The firewall gotcha is the #1 blocker. NemoClaw runs containers inside Docker inside K3s. These containers need to reach Ollama on the host, but UFW blocks this by default. You need explicit rules for three CIDR ranges (172.17.0.0/16, 172.18.0.0/16, 10.42.0.0/16).
- Ollama's default bind address breaks containerized setups. Every guide that runs Ollama alongside Docker containers hits this. The systemd override to set OLLAMA_HOST=0.0.0.0 should be your first step after installing Ollama.
- VMs are the right call for NemoClaw. The stack has deep system dependencies (Docker, K3s, Node.js, multiple container images). On a JarvisLabs container, you'd need to reinstall most of this after every pause/resume. A VM persists everything.
- Local inference on A100 is fast and cheap. Nemotron 3 Nano 30B runs at ~1 second per query with 25GB VRAM usage. At $1.29/hr, that's under $0.001 per query with no API rate limits or token costs.
Try it on JarvisLabs. Get started at jarvislabs.ai.