JarvisLabs CLI

New CLI — Part of the jarvislabs package

The jl CLI is part of the new jarvislabs package, replacing the deprecated jlclient. If you're still using jlclient, see the migration note.

The jl command-line tool lets you manage GPU instances, run training scripts, transfer files, and monitor experiments on JarvisLabs.ai — all from your terminal. It's built to work seamlessly with AI coding agents like Claude Code, Codex, Cursor, and OpenCode, so your agent can spin up GPUs, run experiments, and monitor results autonomously.

Package: jarvislabs | CLI command: jl | Version: 0.2.x (beta)

Beta Software

The jl CLI is in beta. Commands and options may change between releases. Pin your version in CI/automation scripts and check the changelog when upgrading.

Platform Support

Linux and macOS are fully supported. Windows is experimental and not fully tested — if you run into issues, please report them.

Want to see what the CLI can do?

Jump to the Examples section for end-to-end workflows covering training runs, agent automation, filesystem management, and more.

Installation

The package is currently in beta. Install with the --pre flag to get the latest prerelease.

uv tool install --pre jarvislabs

To upgrade:

uv tool upgrade --pre jarvislabs

With pip

pip install --pre jarvislabs

After installation, the jl command is available in your terminal.

What does jl setup do?

Run jl setup from your terminal once after installing. It walks you through:

  1. Authentication — prompts for your API token (get one from jarvislabs.ai/settings/api-keys) and saves it locally
  2. Account status — shows your current balance and active instances
  3. Agent skill installation — asks which AI coding agents you use (Claude Code, Codex, Cursor, OpenCode) and installs skill files for them with your approval, so your agent knows how to use jl out of the box

Exploring the CLI with --help

Every command and subcommand supports --help. It's the quickest way to see what's available, what flags a command takes, and what they do. You can pretty much learn the entire CLI from help alone.

jl --help                   # top-level commands
jl instance --help          # all instance subcommands
jl run --help               # run options, targets, lifecycle flags
jl instance create --help   # every flag for creating an instance

Quick Start

There are two main ways to use the CLI, depending on how much control you need.

Path 1: Run a script directly on a fresh GPU

The fastest way to get started. This creates a GPU instance, uploads your code, installs dependencies, runs the script, and pauses the instance when done — all in one command.

# One-time setup
jl setup

# Check your balance and make sure you're good to go
jl status

# See which GPUs are currently available and their pricing
jl gpus

# Run a single training script on a fresh RTX5000
# Creates instance, uploads train.py, installs requirements, runs it, pauses when done
jl run train.py --gpu RTX5000 --requirements requirements.txt -- --epochs 50

# Or if you have a project directory, sync the whole thing
# This uploads your directory, creates a venv, installs deps, and runs the entrypoint
jl run . --script train.py --gpu A100 --requirements requirements.txt

# You can also run setup commands or a setup script before your main command
jl run . --script train.py --gpu A100 \
  --requirements requirements.txt \
  --setup "pip install flash-attn" \
  --setup-file setup.sh

# The CLI streams logs by default. Once the run finishes, the instance is auto-paused.
# If you detached (Ctrl+C) or used --no-follow, you can check logs anytime:
jl run logs <run_id> --tail 50

# Check the final status of your run
jl run status <run_id>

Path 2: Manage instances yourself

If you want more control — SSH access, reusing machines across runs, attaching filesystems, or interactive debugging — create and manage instances directly.

# One-time setup
jl setup

# See available GPUs and pricing
jl gpus

# Create a GPU instance with 100 GB storage
jl instance create --gpu A100 --storage 100 --name "my-experiment"

# List your instances to get the machine ID
jl instance list

# SSH into your instance for interactive work
jl instance ssh <machine_id>

# Or upload and run a script on it
jl run train.py --on <machine_id>

# Check logs while the run is going
jl run logs <run_id> --tail 50

# Upload additional files to the instance
jl instance upload <machine_id> ./data /home/data

# Download results when you're done
jl instance download <machine_id> /home/results ./results -r

# Pause when you're done - stops compute billing, keeps your data
jl instance pause <machine_id>

# Later, resume with the same or a different GPU
jl instance resume <machine_id> --gpu RTX5000

# When you're completely done, destroy to stop all billing (including storage)
jl instance destroy <machine_id>

Authentication

Get your API token from jarvislabs.ai/settings/api-keys.

Interactive setup

jl setup

This authenticates, optionally installs agent skills, shows your account status, and displays a getting-started guide.

Non-interactive setup

jl setup --token YOUR_TOKEN --yes

tip

Without --yes, jl setup will still prompt for agent-skill installation even when --token is provided. Use --agents all or --yes to make setup fully non-interactive.

Environment variable

export JL_API_KEY="YOUR_TOKEN"

Token precedence

Both the CLI and SDK use the same resolution chain:

| Priority | Method | Used by |
|---|---|---|
| 1 | Client(api_key="...") argument | SDK only |
| 2 | JL_API_KEY environment variable | CLI + SDK |
| 3 | Config file (saved by jl setup) | CLI + SDK |

See Config file location below for config paths. See the SDK Authentication docs for more details.
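The env-over-config part of this chain can be sketched in plain shell — the snippet below is an illustration of the resolution order, not the CLI's actual implementation, and config_token stands in for the value jl setup saved:

```shell
# JL_API_KEY, when set, wins over the saved config-file value.
unset JL_API_KEY
config_token="tok-from-config"            # stand-in for the config-file value
resolved="${JL_API_KEY:-$config_token}"
first="$resolved"                          # no env var yet: config wins

JL_API_KEY="tok-from-env"
resolved="${JL_API_KEY:-$config_token}"    # env var set: it takes priority
echo "$first -> $resolved"
```

This is also why a one-off `JL_API_KEY=... jl status` invocation can override your saved login for a single command.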

Config file location

The config file is stored via platformdirs:

  • Linux: ~/.config/jl/config.toml
  • macOS: ~/Library/Application Support/jl/config.toml

Removing saved credentials

jl logout

Global Flags

These flags are available on most commands (exceptions noted below):

| Flag | Description |
|---|---|
| --json | Output as machine-readable JSON (to stdout). Human-readable output goes to stderr. |
| --yes / -y | Skip all confirmation prompts. |
| --version | Print version and exit (root-level: jl --version). |

info

--json and --yes are command-level options, not root-level — so jl instance list --json works correctly. Most commands support --json. --yes is only available on commands that have confirmation prompts (create, pause, resume, destroy, rename, run start, etc.). jl setup supports --yes but not --json. Read-only commands like jl gpus and jl run logs do not accept --yes.


Account Commands

jl setup

Set up the JarvisLabs CLI: authenticate and install agent skills.

| Option | Short | Description |
|---|---|---|
| --token | -t | API token (skips interactive prompt) |
| --agents | | Comma-separated agent list: claude-code, codex, cursor, opencode, or all |
| --yes | -y | Skip confirmation prompts; auto-selects all agents |

# Interactive setup
jl setup

# Non-interactive with token and all agent skills
jl setup --token YOUR_TOKEN --agents all --yes

# Install skills for specific agents only
jl setup --agents claude-code,cursor

If already authenticated, jl setup will show your current login and ask to re-authenticate. The --agents flag controls which coding agent skill files are installed:

| Agent | Skill file path |
|---|---|
| claude-code | ~/.claude/skills/jarvislabs/SKILL.md |
| codex | ~/.agents/skills/jarvislabs/SKILL.md |
| cursor | ~/.cursor/skills/jarvislabs/SKILL.md |
| opencode | ~/.config/opencode/skills/jarvislabs/SKILL.md |

jl logout

Remove the saved API token from the config file. Supports --json for scripted usage.

jl logout

jl status

Show account info: name, user ID, balance, grants, and running/paused instance counts.

jl status
jl status --json

info

JSON output includes additional fields not shown in the human-readable table: running VMs, paused VMs, active deployments, filesystems, and billing currency.

jl gpus

Show GPU types with availability, region, VRAM, RAM, CPUs, and hourly pricing. Available GPUs are marked with a green dot, unavailable with a dim circle.

jl gpus
jl gpus --json

jl templates

List available framework templates that can be used with --template when creating instances (e.g. pytorch, tensorflow, jax, vm).

jl templates
jl templates --json

Regions & GPUs

JarvisLabs has three regions, each with different GPU types available. When creating an instance, the CLI auto-selects the best region based on your chosen GPU — or you can pin a specific region with --region.

| Region | Available GPUs |
|---|---|
| IN1 | RTX5000, A5000Pro, A6000, RTX6000Ada, A100 |
| IN2 | L4, A100, A100-80GB |
| EU1 | H100, H200 |

Run jl gpus to see real-time availability and pricing for each GPU type.

Storage & Template Constraints

  • EU1 region: supports 1 or 8 GPUs per instance only, 100 GB minimum storage (auto-bumped if you specify less)
  • VM template: 100 GB minimum storage (auto-bumped if you specify less)
  • VM template is only available in IN2 and EU1 regions, and requires at least one SSH key registered

Instance Commands

All instance commands live under jl instance. Here's how you can manage the full lifecycle of GPU instances — from creation to teardown.

jl instance list

List all your instances with their ID, name, status, GPU type, GPU count, storage, region, cost, and template.

jl instance list
jl instance list --json

jl instance get <machine_id>

Show full details of a specific instance including SSH command, notebook URL, HTTP ports, and endpoint URLs.

jl instance get 12345
jl instance get 12345 --json

jl instance create

Create a new GPU instance. The command blocks until the instance reaches Running status, so when it returns, your instance is ready to use.

| Option | Short | Default | Description |
|---|---|---|---|
| --gpu | -g | (required) | GPU type (run jl gpus to see options) |
| --template | -t | pytorch | Framework template (run jl templates to see options) |
| --storage | -s | 40 | Storage in GB |
| --name | -n | "Name me" | Instance name (max 40 characters) |
| --num-gpus | | 1 | Number of GPUs |
| --region | | | Region pin (e.g. IN1, IN2, EU1) |
| --http-ports | | | Comma-separated HTTP ports to expose (e.g. 7860,8080) |
| --script-id | | | Startup script ID to run on launch |
| --script-args | | | Arguments passed to the startup script |
| --fs-id | | | Filesystem ID to attach |
| --yes | -y | | Skip confirmation |
| --json | | | Output as JSON |

# Basic instance
jl instance create --gpu RTX5000

# H100 with more storage and a name
jl instance create --gpu H100 --storage 200 --name "training-box"

# With a startup script and filesystem
jl instance create --gpu A100 --script-id 42 --fs-id 10

# Pin to a region
jl instance create --gpu A100 --region EU1

# Expose HTTP ports
jl instance create --gpu RTX5000 --http-ports "7860,8080"

# VM instance (requires SSH key - add one first with jl ssh-key add)
jl instance create --gpu H100 --template vm --name "my-vm"

# Non-interactive
jl instance create --gpu RTX5000 --yes --json

Prompts for confirmation unless --yes is passed. See Regions & GPUs for which GPUs are available in each region and storage constraints.

jl instance pause <machine_id>

Pause a running instance. Compute billing stops; a small storage cost continues.

| Option | Short | Description |
|---|---|---|
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl instance pause 12345
jl instance pause 12345 --yes --json

jl instance resume <machine_id>

Resume a paused instance. You can also use this opportunity to change the GPU type, expand storage, rename the instance, or attach a different startup script or filesystem. The command blocks until the instance is running again.

| Option | Short | Description |
|---|---|---|
| --gpu | -g | Resume with a different GPU type |
| --num-gpus | | Change number of GPUs |
| --storage | -s | Expand storage in GB (can only increase, never shrink) |
| --name | -n | Rename instance on resume |
| --script-id | | Startup script ID to run on resume |
| --script-args | | Arguments for the startup script |
| --fs-id | | Filesystem ID to attach |
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

# Resume with defaults
jl instance resume 12345

# Resume with a bigger GPU
jl instance resume 12345 --gpu H100

# Resume with more storage and a new name
jl instance resume 12345 --storage 200 --name "upgraded"

Region Lock & ID Changes

Resume is region-locked — an instance always resumes in its original region. If you request a GPU type not available in that region, the API returns an error.

Resume may also assign a new machine ID. The CLI warns you when this happens. Always use the returned ID for subsequent operations.
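Because the ID can change, scripts should always re-capture it from the resume output rather than reusing the old one. A minimal sketch — the sample payload stands in for real `jl instance resume <id> --yes --json` output, and the machine_id field name is assumed to mirror the create output parsed elsewhere on this page:

```shell
# Re-capture the machine ID after resume, since resume may assign a new one.
resume_out='{"machine_id": 67890}'   # stand-in for: jl instance resume 12345 --yes --json
MACHINE_ID=$(printf '%s' "$resume_out" | python3 -c 'import json,sys; print(json.load(sys.stdin)["machine_id"])')
echo "resumed as machine $MACHINE_ID"
```

Use $MACHINE_ID (not the pre-resume ID) for every subsequent jl instance command.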

jl instance destroy <machine_id>

Permanently delete an instance and all its data.

warning

This action is irreversible. All data on the instance is lost. If you need to keep data across instances, use a filesystem.

| Option | Short | Description |
|---|---|---|
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl instance destroy 12345
jl instance destroy 12345 --yes --json

jl instance rename <machine_id>

Rename an instance.

| Option | Short | Description |
|---|---|---|
| --name | -n | New instance name (required, max 40 characters) |
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl instance rename 12345 --name "experiment-v2"

SSH, Exec & File Transfer

These commands let you interact directly with running instances — open a shell, run commands remotely, or transfer files back and forth.

jl instance ssh <machine_id>

SSH into a running instance. This opens an interactive shell session.

| Option | Short | Description |
|---|---|---|
| --print-command | -p | Print the raw SSH command to stdout instead of connecting |
| --json | | Output the SSH command as JSON |

# Interactive session
jl instance ssh 12345

# Get the SSH command for use in scripts
jl instance ssh 12345 --print-command

The instance must be in Running status. If paused, you'll be told to resume it first.

tip

--print-command and --json output the stored SSH command regardless of instance status — useful for scripting and automation.

jl instance exec <machine_id> -- <command>

Run a command on a running instance and stream the output back to your terminal. The -- separator is required so jl can distinguish your remote command from its own flags.

| Option | Short | Description |
|---|---|---|
| --json | | Capture output as JSON with stdout, stderr, and exit_code fields |

# Check GPU
jl instance exec 12345 -- nvidia-smi

# Run Python
jl instance exec 12345 -- python -c "import torch; print(torch.cuda.device_count())"

# List files
jl instance exec 12345 -- ls -la /home

# Use shell features (pipes, redirection) - wrap in sh -lc
jl instance exec 12345 -- sh -lc 'grep "loss" /home/output.log | tail -5'

# Structured output for scripting
jl instance exec 12345 --json -- nvidia-smi

The exit code of the remote command is propagated as the exit code of jl instance exec.

tip

If your remote command uses pipes, redirection, or other shell features, wrap it in sh -lc '...' as shown above. Without the wrapper, each argument is treated as a separate command argument rather than shell syntax.
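Since the remote exit code propagates, you can branch on `jl instance exec` directly in shell. A sketch of the pattern — here remote_check() is a stand-in simulating a call like `jl instance exec <machine_id> -- test -f /home/results/model.pt`, so the snippet is self-contained:

```shell
# Branch on a remote command's exit status. Thanks to exit-code propagation,
# a real `jl instance exec` call behaves exactly like this stand-in.
remote_check() {
  return 1   # simulate: the remote file is not there yet
}

if remote_check; then
  result="results ready"
else
  result="results missing"
fi
echo "$result"
```

In a real workflow you would follow the "results ready" branch with `jl instance download`.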

jl instance upload <machine_id> <source> [dest]

Upload a local file or directory to a running instance. If no remote destination is given, it uploads to the instance's home directory (/home/ for containers, /home/<user>/ for VMs).

Directories are uploaded recursively automatically.

| Option | Short | Description |
|---|---|---|
| --json | | Output upload result as JSON |

# Upload a file (lands at /home/data.csv)
jl instance upload 12345 ./data.csv

# Upload a directory (lands at /home/my-project/)
jl instance upload 12345 ./my-project

# Upload to a specific remote path
jl instance upload 12345 ./config.yaml /home/config.yaml

jl instance download <machine_id> <source> [dest] [-r]

Download a file or directory from a running instance. If no local destination is given, it saves to ./<filename> in the current directory.

| Option | Short | Description |
|---|---|---|
| --recursive | -r | Download directories recursively |
| --json | | Output download result as JSON |

# Download a file (saves to ./results.csv)
jl instance download 12345 /home/results.csv

# Download to a specific local path
jl instance download 12345 /home/results.csv ./my-results.csv

# Download a directory
jl instance download 12345 /home/outputs ./local-outputs -r

Managed Runs

Managed runs are the fastest way to run scripts on GPU instances. A single jl run command handles uploading your code, setting up a Python virtual environment (via uv), installing requirements (when specified with --requirements), running your command in the background, and tracking logs.

Runs persist in the background even if you disconnect or close your terminal. Logs, status, and lifecycle are tracked locally in ~/.jl/runs/.

Run Targets

| Target | What happens |
|---|---|
| train.py | Uploads the single file, runs python3 train.py |
| run.sh | Uploads the single file, runs bash run.sh |
| . or ./my-project | Syncs the directory via rsync, runs --script inside it (requires rsync installed locally) |
| (no target) | Runs the command given after -- directly on the instance |

Only .py and .sh file targets are supported directly. For other file types, use a directory target or jl instance upload + jl instance exec.

Starting a Run on an Existing Instance

jl run <target> --on <machine_id> [options] [-- extra args]
# Run a Python file
jl run train.py --on 12345

# Upload a directory and run a script inside it
jl run . --script train.py --on 12345

# Pass arguments to your script
jl run train.py --on 12345 -- --epochs 50 --lr 0.001

# Run an arbitrary remote command (no upload)
jl run --on 12345 -- python -c "print('hello from GPU')"

Starting a Run on a Fresh Instance

jl run <target> --gpu <gpu_type> [options] [-- extra args]

This creates a new instance, uploads your code, runs the command, and handles instance lifecycle when done.

# Run on a fresh RTX5000
jl run train.py --gpu RTX5000

# With requirements
jl run . --script train.py --gpu A100 --requirements requirements.txt

# Destroy instance after run (no leftover costs)
jl run train.py --gpu RTX5000 --destroy

# Keep instance running after run (for debugging)
jl run train.py --gpu RTX5000 --keep

You must use either --on or --gpu, not both.

All Start Options

| Option | Short | Default | Description |
|---|---|---|---|
| --on | | | Run on an existing instance (machine ID) |
| --gpu | -g | | Create a fresh instance with this GPU type |
| --script | | | Entrypoint script path inside a directory target |
| --template | -t | pytorch | Framework template (fresh instances only) |
| --storage | -s | 40 | Storage in GB (fresh instances only) |
| --name | -n | jl-run | Instance name (fresh instances only) |
| --num-gpus | | 1 | Number of GPUs (fresh instances only) |
| --region | | | Region pin, e.g. IN1, EU1 (fresh instances only) |
| --http-ports | | | Comma-separated HTTP ports to expose (fresh instances only) |
| --requirements | | | Local requirements file to upload and install |
| --setup | | | Shell command to run before the main command |
| --setup-file | | | Local bash file to upload and run before the main command |
| --follow / --no-follow | | --follow | Stream logs after starting the run |
| --pause | | | Pause fresh instance after the run (default for fresh) |
| --destroy | | | Destroy fresh instance after the run |
| --keep | | | Leave fresh instance running after the run |
| --yes | -y | | Skip confirmation prompts |
| --json | | | Output as JSON |

Setup Chain

For file and directory targets, the following setup steps run before your main command (chained with &&):

  1. uv installed if missing (via curl)
  2. .venv created if missing (via uv venv)
  3. .venv activated (via . .venv/bin/activate)
  4. Requirements installed if --requirements is provided (via uv pip install -r <file>)
  5. Setup file run if --setup-file is provided (via bash <file>)
  6. Setup command run if --setup is provided (the raw shell command)
  7. Main script runs

All steps are chained with &&, so if any step fails, subsequent steps (including your main command) will not run.

For command-mode runs (no target), only the --setup command is prepended — --requirements and --setup-file are not available.

# Full setup chain example
jl run . --script train.py --on 12345 \
  --requirements requirements.txt \
  --setup-file setup.sh \
  --setup "pip install flash-attn"
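The && chaining means one failed step stops everything after it. A plain-shell illustration of that short-circuit behaviour, with stand-in functions in place of the real setup steps:

```shell
# Each function stands in for one setup step; && chaining means a non-zero
# exit prevents every later step (including the main script) from running.
create_venv()  { echo "venv created"; }
install_reqs() { echo "install failed"; return 1; }
run_script()   { echo "training started"; }

create_venv && install_reqs && run_script
chain_status=$?
echo "chain exited with $chain_status"
```

Here "training started" is never printed: the failing install short-circuits the chain, just as a failed `uv pip install` would prevent your script from launching.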

Lifecycle Flags (Fresh Instances Only)

When creating a fresh instance with --gpu, these flags control what happens after the run completes:

| Flag | Behavior |
|---|---|
| --pause | Pause the instance after the run (default for fresh instances) |
| --destroy | Destroy the instance — no leftover costs |
| --keep | Leave the instance running (for debugging or follow-up work) |

Only one lifecycle flag can be used at a time. These flags cannot be used with --on (existing instances are not touched after the run).

Detaching from fresh instances

--no-follow for fresh instances requires --keep. Since --pause and --destroy need the CLI to be connected when the run ends to perform the lifecycle action, they are incompatible with --no-follow. If you detach (Ctrl+C or --no-follow), the automatic lifecycle action will not happen — the instance stays running and billing continues. Manage it manually with jl instance pause or jl instance destroy.

Follow vs No-Follow

By default, jl run streams logs after starting (--follow). Press Ctrl+C to detach — the run keeps going in the background. Without --tail, --follow initially shows the last 20 lines before streaming new output.

# Default: stream logs, auto-pause when done
jl run train.py --gpu RTX5000

# Detached: start and return immediately (requires --keep for fresh instances)
jl run train.py --gpu RTX5000 --keep --no-follow

# Detached on existing instance (no lifecycle flag needed)
jl run train.py --on 12345 --no-follow

jl run logs <run_id>

View logs from a managed run.

| Option | Short | Description |
|---|---|---|
| --follow | -f | Stream logs in real time (press Ctrl+C to stop) |
| --tail | -n | Show only the last N lines (minimum: 1) |
| --json | | Output as JSON with content and run_exit_code fields |

# Full log output
jl run logs r_abc123

# Last 50 lines
jl run logs r_abc123 --tail 50

# Stream logs live
jl run logs r_abc123 --follow

# Stream with initial context
jl run logs r_abc123 --follow --tail 100

# JSON output with exit code (for scripting/agents)
jl run logs r_abc123 --tail 50 --json

JSON output fields:

| Field | Description |
|---|---|
| run_id | The run identifier |
| machine_id | Instance the run is on |
| remote_log | Path to the log file on the remote instance |
| content | The log text (last N lines if --tail used, full log otherwise) |
| run_exit_code | null = still running, 0 = succeeded, non-zero = failed |

info

--json is not supported with --follow. Without --tail, the entire log file is returned — this can be very large for long-running jobs.
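For scripting, the run_exit_code field maps cleanly onto a done/not-done check. A sketch of the pattern — the sample payload stands in for real `jl run logs <run_id> --tail 1 --json` output:

```shell
# null run_exit_code means the run is still going; anything else is final.
payload='{"run_id": "r_abc123", "run_exit_code": null}'
state=$(printf '%s' "$payload" | python3 -c '
import json, sys
code = json.load(sys.stdin)["run_exit_code"]
print("running" if code is None else "done:%d" % code)')
echo "run state: $state"
```

A polling loop would repeat this check at an interval and exit once the state is no longer "running".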

Non-JSON output shows raw logs with a header and footer indicating run state:

--- run r_abc123 | machine 12345 | running ---

step=100 loss=2.31
step=200 loss=2.11

--- still running | log: /home/jl-runs/r_abc123/output.log ---

jl run status <run_id>

Show the current state of a run.

| Option | Short | Description |
|---|---|---|
| --json | | Output as JSON |

Possible states: running, succeeded, failed, instance-paused, instance-pausing, instance-missing, instance-creating, instance-resuming, instance-destroying, instance-failed, unknown.

jl run status r_abc123
jl run status r_abc123 --json

jl run stop <run_id>

Stop a managed run by sending SIGTERM to its process group. The instance itself is not affected.

| Option | Short | Description |
|---|---|---|
| --json | | Output as JSON |

jl run stop r_abc123
jl run stop r_abc123 --json

If the process doesn't exit after SIGTERM, it escalates to SIGKILL. If the run has already finished, it reports the final state without error.

jl run list

List all locally tracked managed runs (most recent first).

| Option | Short | Description |
|---|---|---|
| --refresh | | Check live status for each run by querying the instance (slower) |
| --machine | -m | Filter by instance ID |
| --limit | -l | Show only the N most recent runs |
| --status | -s | Filter by state (e.g. running, succeeded, failed). Implies --refresh. |
| --json | | Output as JSON |

# All runs (shows "saved" state without live check)
jl run list

# With live status refresh
jl run list --refresh

# Filter by instance
jl run list --machine 12345

# Most recent 5 runs
jl run list --limit 5

# Only running jobs
jl run list --status running

# For scripting
jl run list --refresh --json

Without --refresh, the state column shows saved (from the local record). Use --refresh or --status to query each instance for live state. Using --status automatically implies --refresh.

Implicit start Subcommand

jl run <target> is shorthand for jl run start <target>. The start subcommand is implied when the first argument isn't a known subcommand (list, status, logs, stop).

# These are equivalent:
jl run train.py --gpu RTX5000
jl run start train.py --gpu RTX5000

Run Tracking is Local

info

All run management commands (jl run logs, jl run status, jl run stop, jl run list) depend on local records stored under ~/.jl/runs/. You need to start and monitor runs from the same machine. If the local record is missing, the run_id alone is not enough to interact with the run.

Each run record is a JSON file at ~/.jl/runs/<run_id>.json containing the machine ID, remote log path, PID file path, exit code path, and launch command.
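If you script against these records, parse them rather than regex the raw text. A sketch — the JSON here is a stand-in with illustrative field names (the source only lists what the record contains, not its exact keys), so inspect an actual ~/.jl/runs/<run_id>.json before relying on them:

```shell
# Read a local run record (hypothetical field names for illustration).
record='{"machine_id": 12345, "remote_log": "/home/jl-runs/r_abc123/output.log"}'
machine=$(printf '%s' "$record" | python3 -c 'import json,sys; print(json.load(sys.stdin)["machine_id"])')
echo "run lives on machine $machine"
```

In practice the stand-in string would be replaced with `cat ~/.jl/runs/<run_id>.json`.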


SSH Key Commands

SSH keys are required if you want to use the VM template (bare-metal SSH access without a pre-configured container). You can manage your keys with jl ssh-key.

jl ssh-key list

List all SSH keys (ID, name, and truncated key).

jl ssh-key list
jl ssh-key list --json

jl ssh-key add <pubkey_file>

Add an SSH public key.

| Option | Short | Description |
|---|---|---|
| --name | -n | Name for this key (required) |
| --json | | Output as JSON |

jl ssh-key add ~/.ssh/id_ed25519.pub --name "my-laptop"

jl ssh-key remove <key_id>

Remove an SSH key.

| Option | Short | Description |
|---|---|---|
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl ssh-key remove abc123

Startup Script Commands

Startup scripts are shell scripts that run automatically whenever an instance launches or resumes — useful for installing dependencies, pulling data, or setting up your environment. You can manage them with jl scripts.

jl scripts list

List startup scripts (ID and name).

jl scripts list
jl scripts list --json

jl scripts add <script_file>

Add a startup script.

| Option | Short | Description |
|---|---|---|
| --name | -n | Script name (defaults to filename without extension) |
| --json | | Output as JSON |

jl scripts add ./setup.sh --name "install-deps"

jl scripts update <script_id> <script_file>

Replace the contents of an existing startup script.

| Option | Short | Description |
|---|---|---|
| --json | | Output as JSON |

jl scripts update 42 ./setup-v2.sh

jl scripts remove <script_id>

Remove a startup script.

| Option | Short | Description |
|---|---|---|
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl scripts remove 42

Filesystem Commands

Filesystems are persistent storage volumes that survive instance pause, resume, and even destroy cycles. They're ideal for datasets, model checkpoints, or any data you want to reuse across multiple instances. You can manage them with jl filesystem.

jl filesystem list

List filesystems (ID, name, storage).

jl filesystem list
jl filesystem list --json

jl filesystem create

Create a new filesystem.

| Option | Short | Description |
|---|---|---|
| --name | -n | Filesystem name (required, max 30 characters) |
| --storage | -s | Storage in GB (required, 50–2048) |
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl filesystem create --name "datasets" --storage 200

jl filesystem edit <fs_id>

Expand filesystem storage. Can only increase, never shrink.

| Option | Short | Description |
|---|---|---|
| --storage | -s | New storage size in GB (required) |
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl filesystem edit 10 --storage 500

info

edit may return a new filesystem ID. Always use the returned value for subsequent operations.

jl filesystem remove <fs_id>

Delete a filesystem.

| Option | Short | Description |
|---|---|---|
| --yes | -y | Skip confirmation |
| --json | | Output as JSON |

jl filesystem remove 10

JSON Mode for Scripting

Most commands support --json for machine-readable output. JSON goes to stdout; human-readable status messages go to stderr.

# Instance list as JSON
jl instance list --json

# Create and capture the machine ID
RESULT=$(jl instance create --gpu RTX5000 --yes --json)
MACHINE_ID=$(echo "$RESULT" | jq .machine_id)

# GPU availability pipeline
jl gpus --json | jq '.[] | select(.num_free_devices > 0) | .gpu_type'

# Run status in scripts
jl run status r_abc123 --json | jq .state

# Check if a run is still going
EXIT_CODE=$(jl run logs r_abc123 --tail 1 --json | jq .run_exit_code)

When --json is active:

  • Spinners and progress indicators are suppressed
  • Errors from jl itself (bad arguments, auth failures, etc.) are emitted as {"error": "..."} to stdout. Commands like jl instance exec --json return their own structured payload (with exit_code, stdout, stderr) even on non-zero exit
  • Exit codes are still set appropriately
  • For jl run start, --json returns immediately after the run is started (before log streaming), so lifecycle flags (--pause, --destroy) will not execute — use --keep when combining --json with fresh instances

tip

--json does not suppress confirmation prompts. Always use --yes alongside --json in scripts and agent workflows.
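A robust script should also check for the {"error": ...} payload before parsing a normal result. A sketch of that check — the sample string stands in for captured `jl ... --json` output:

```shell
# Distinguish an {"error": ...} payload from a normal result before parsing
# further fields out of it.
out='{"error": "authentication failed"}'   # stand-in for: $(jl instance create ... --yes --json)
err=$(printf '%s' "$out" | python3 -c 'import json,sys; print(json.load(sys.stdin).get("error", ""))')
if [ -n "$err" ]; then
  status="failed: $err"
else
  status="ok"
fi
echo "$status"
```

Combined with the exit code, this lets a script fail fast with a useful message instead of choking on a missing field.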


Shell Completion

Enable tab completion for your shell:

jl --install-completion

Supports bash, zsh, and fish.


Using with AI Coding Agents

One of the primary use cases for the jl CLI is letting AI coding agents manage GPU infrastructure on your behalf. Instead of manually creating instances, uploading code, and monitoring runs, you can let your agent handle the entire workflow — from provisioning a GPU to downloading results.

The CLI supports four major coding agents: Claude Code, Codex, Cursor, and OpenCode. During jl setup, you'll be asked which agents you use, and skill files are installed automatically to teach your agent how to use jl effectively.

Agent Setup

# Interactive: authenticates and asks which agents to install skills for
jl setup

# Non-interactive: installs skills for all supported agents
jl setup --token YOUR_TOKEN --agents all --yes

# Install skills for specific agents only
jl setup --agents claude-code,cursor
tip

Once skills are installed, your coding agent already knows how to use jl. Try asking it: "Spin up an A100, run my training script, and download the results when it's done."

Mental Model

| Concept | CLI | Purpose |
|---|---|---|
| Instance | jl instance | A machine — create, pause, resume, destroy, SSH into |
| Run | jl run | A managed job with log file + PID tracking |
| Exec | jl instance exec | Quick one-off commands for system checks and debugging |

Core Rules for Agent Workflows

  1. Always use --yes on commands with confirmation prompts (create, pause, resume, destroy, run start) — agents can't answer interactive prompts
  2. Use --json on commands where the agent needs to parse output (create, gpus, run start, instance list). For jl run logs, the default output is designed for agents — the header/footer shows run ID, machine ID, and state in a readable format
  3. Always use --no-follow when starting runs — --follow blocks the agent indefinitely
  4. Always use --tail N when reading logs — full logs can be enormous
  5. Do an early failure check — wait 15s after starting a run and check logs once. This catches fast failures (import errors, missing files, pip issues) before committing to a long polling loop
  6. Then poll at steady intervals — 60-120s for short jobs, 180-600s for long training runs

The Agent Monitoring Loop

This is the primary pattern for running and monitoring GPU jobs:

# 1. Start a detached run
jl run train.py --on <machine_id> --no-follow --yes --json
# returns {"run_id": "r_abc123", ...}

# 2. Early failure check - catches import errors, bad paths, pip failures fast
sleep 15 && jl run logs r_abc123 --tail 30

# 3. If still running, poll at steady intervals
sleep 120 && jl run logs r_abc123 --tail 50

# The log output shows a header and footer with run state:
# --- run r_abc123 | machine 12345 | running ---
# <log output>
# --- still running | log: /home/jl-runs/r_abc123/output.log ---
#
# When done:
# --- run r_abc123 | machine 12345 | succeeded (exit 0) ---
# <log output>
# --- succeeded | exit code: 0 | log: /home/jl-runs/r_abc123/output.log ---
#
# On failure:
# --- run r_abc123 | machine 12345 | failed (exit 1) ---
# <log output>
# --- failed | exit code: 1 | log: /home/jl-runs/r_abc123/output.log ---

The log output is the primary monitoring primitive — the header gives you the run ID and machine ID, and the footer tells you whether the run is still going or finished (with exit code).
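An agent (or script) can extract the state mechanically from that footer. A sketch — the multi-line string stands in for real `jl run logs <run_id> --tail 50` output:

```shell
# Extract run state from the log footer line.
logs='--- run r_abc123 | machine 12345 | running ---
step=100 loss=2.31
--- still running | log: /home/jl-runs/r_abc123/output.log ---'

footer=$(printf '%s\n' "$logs" | tail -n 1)
case "$footer" in
  *"still running"*) state="running" ;;
  *succeeded*)       state="succeeded" ;;
  *failed*)          state="failed" ;;
  *)                 state="unknown" ;;
esac
echo "state: $state"
```

For structured workflows, `jl run logs --json` (with its run_exit_code field) is the less fragile choice; footer parsing is mainly useful when the human-readable stream is all you have.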

Agent Workflow Example (End-to-End)

# 1. Check GPU availability
jl gpus --json

# 2. Create an instance
jl instance create --gpu RTX5000 --storage 50 --yes --json
# returns {"machine_id": 12345, ...}

# 3. Start a detached run
jl run . --script train.py --on 12345 --requirements requirements.txt --no-follow --yes --json
# returns {"run_id": "r_abc123", ...}

# 4. Early failure check - catches crashes fast
sleep 15 && jl run logs r_abc123 --tail 30

# 5. If still running, poll at steady intervals (repeat until footer shows exit code)
sleep 120 && jl run logs r_abc123 --tail 50

# 6. Download results
jl instance download 12345 /home/results ./results -r

# 7. Clean up
jl instance pause 12345 --yes --json

Starting Runs on Fresh Instances (Agent Mode)

When the agent needs to create a fresh instance inline:

jl run . --script train.py --gpu RTX5000 --no-follow --keep --json --yes

Key points:

  • --keep is required with --no-follow for fresh instances (the CLI will error without it)
  • The agent must manually pause or destroy the instance after the run completes
  • Additional fresh-instance flags: --template, --storage, --num-gpus, --region, --http-ports

Use separate jl instance create when you need to inspect GPU availability first, reuse machines across runs, or attach filesystems/scripts beforehand.

Quick System Checks with Exec

jl instance exec <id> --json -- nvidia-smi
jl instance exec <id> --json -- ps -ef
jl instance exec <id> --json -- df -h

For pipes or shell syntax, wrap in sh -lc:

jl instance exec <id> --json -- sh -lc 'grep "loss" /path/to/log | tail -5'
Skill files handle this for you

All of the patterns above — the monitoring loop, early failure checks, polling intervals, --no-follow, --tail, and more — are included in the skill files that jl setup installs for your agent. Once skills are installed, your agent already knows how to use jl correctly. You don't need to teach it these patterns yourself.

File Persistence Rules

The remote home directory (typically /home/ on containers, /home/<user>/ on VMs) persists across pause/resume cycles. Everything else is ephemeral.

Persists across pause/resume:

  • Files in the home directory (/home/ or /home/<user>/)
  • Uploaded directories: <home>/<directory_name>/
  • Uploaded files (via jl instance upload): <home>/<filename>
  • Run metadata: <home>/jl-runs/<run_id>/
  • .venv created inside the project directory
  • Attached filesystems

Lost on pause:

  • System-level installs (apt-get, global pip packages)
  • Files outside the home directory (/tmp, /root, etc.)

Use --setup, --requirements, or --setup-file to reinstall dependencies on each run.

Anti-Patterns

| Don't | Why |
| --- | --- |
| Use --follow when starting runs | Blocks the agent indefinitely; will time out |
| Omit --no-follow when starting runs | Default is --follow, which blocks |
| Use jl run logs --follow | Blocks forever; --json is also incompatible with --follow |
| Read full logs (omit --tail N) | Can return megabytes of output, overwhelming context |
| Poll every few seconds | Wasteful and noisy; use 60–600s intervals |
| Use lifecycle flags with --on | --keep, --pause, --destroy only apply to fresh instances |
| Forget to pause/destroy instances | They cost money while running |

Examples

Train on a fresh GPU, auto-pause when done

The simplest workflow — run a training script on a fresh GPU with dependencies. The instance is automatically paused when the script finishes, so you only pay for compute time.

jl run train.py --gpu RTX5000 --requirements requirements.txt -- --epochs 100
# Instance created > code uploaded > deps installed > training runs > instance paused

Run a project directory with setup

When your project has multiple files, sync the entire directory and specify the entrypoint with --script. The CLI uses rsync under the hood, so only changed files are transferred on subsequent runs — making re-runs on the same instance fast even with large projects. You can also run custom setup commands before training starts.

jl run . --script train.py --gpu A100 \
--requirements requirements.txt \
--setup "pip install flash-attn" \
-- --batch-size 32 --lr 1e-4

Multi-GPU training

For large-scale training, you can request multiple GPUs on a single instance. Check Regions & GPUs for available GPU counts per region.

# 8x H100 in EU1 for distributed training
jl instance create --gpu H100 --num-gpus 8 --region EU1 --storage 500 --name "distributed-training"

# Upload your project and run with torchrun for multi-GPU
jl run . --script train.py --on <machine_id> \
--requirements requirements.txt \
--setup "pip install flash-attn" \
-- --num_gpus 8

Long-running job with manual control

For jobs where you want full control — create an instance, start a detached run, monitor at your own pace, and clean up when done.

# Create an instance
jl instance create --gpu A100 --storage 200 --name "research"

# Sync project and start a background run (--no-follow detaches from logs)
jl run ./my-project --script train.py --on <machine_id> --no-follow

# Monitor later
jl run status <run_id>
jl run logs <run_id> --tail 100
jl run logs <run_id> --follow

# Pause when done
jl instance pause <machine_id>

Detached run on existing instance

Start a run and come back to check on it later — the run continues in the background even if you close your terminal.

# Start without following
jl run train.py --on <machine_id> --no-follow

# Check on it later
jl run logs <run_id> --tail 50

# Stop it if needed
jl run stop <run_id>

Persistent data with filesystems

Filesystems let you keep datasets and model checkpoints across instances. Create a filesystem once, attach it to any instance, and your data is always available — even after destroying the instance.

# Create a filesystem for datasets
jl filesystem create --name "datasets" --storage 500

# Create an instance with the filesystem attached
jl instance create --gpu A100 --fs-id <fs_id> --name "training"

# Run your training - the filesystem is attached and accessible on the instance
jl run train.py --on <machine_id>

# Done with training? Destroy the instance - data is safe in the filesystem
jl instance destroy <machine_id>

# Spin up a cheaper GPU for inference, same data
jl instance create --gpu RTX5000 --fs-id <fs_id> --name "inference"

VM workflow (bare metal SSH access)

VM instances give you a clean Linux machine with SSH access instead of a pre-configured container. You'll need to register an SSH key first.

# Add your SSH key
jl ssh-key add ~/.ssh/id_ed25519.pub --name "my-key"

# Create a VM instance (available in IN2 and EU1 only)
jl instance create --gpu H100 --template vm --name "my-vm"

# SSH in
jl instance ssh <machine_id>

Scripting with JSON and jq

Most commands support --json output (except jl setup), making it easy to build automation pipelines with jq.

# Get IDs of all running instances
jl instance list --json | jq '[.[] | select(.status == "Running") | .machine_id]'

# Find cheapest available GPU
jl gpus --json | jq '[.[] | select(.num_free_devices > 0)] | sort_by(.price_per_hour) | .[0].gpu_type'

# Pause all running instances
for id in $(jl instance list --json | jq -r '.[] | select(.status == "Running") | .machine_id'); do
jl instance pause "$id" --yes --json
done

# Check if a run is still going
jl run logs <run_id> --tail 1 --json | jq .run_exit_code
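The gpus filter above can be tried against a hand-written sample before wiring it into automation. The field names match the filter; the payload itself (GPU types, counts, prices) is illustrative, not real API output:

```shell
# Illustrative sample of `jl gpus --json` output (not real prices or availability)
sample='[
  {"gpu_type": "A100",    "num_free_devices": 0, "price_per_hour": 1.29},
  {"gpu_type": "RTX5000", "num_free_devices": 4, "price_per_hour": 0.39},
  {"gpu_type": "H100",    "num_free_devices": 2, "price_per_hour": 2.49}
]'

# Same filter as above: cheapest GPU type that still has free devices
echo "$sample" | jq -r '[.[] | select(.num_free_devices > 0)] | sort_by(.price_per_hour) | .[0].gpu_type'
# prints RTX5000 — A100 has no free devices, and RTX5000 is cheaper than H100
```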

Autonomous research with coding agents

One of the most powerful patterns is letting a coding agent drive the entire research loop autonomously. Andrej Karpathy's autoresearch is a great example of this — an AI agent autonomously edits training code, runs experiments, checks metrics, and iterates, accumulating only improvements. In Karpathy's own run, the agent evaluated ~700 experimental changes over 2 days, found ~20 additive improvements, and achieved an 11% reduction in Time-to-GPT-2.

The core loop works like this:

  1. Agent modifies train.py with an experimental idea and commits the change
  2. Agent runs the experiment on a GPU (via jl run)
  3. Agent reads the results from logs (via jl run logs) and extracts the target metric
  4. Agent logs the result — appends the commit hash, metric value, and a description to a results.tsv file so every experiment (successes and failures) is tracked
  5. If metrics improved — keep the commit, the branch advances
  6. If metrics got worse or the run crashed — git reset to revert, then try a different idea

The key insight is that the git branch only contains improvements (each commit is guaranteed better than the last), while results.tsv records the full history of all experiments including dead ends. This gives you a clean chain of improvements you can review, plus a complete log for analysis.

This pattern works for any ML problem — not just GPT training. You can apply it to hyperparameter sweeps, architecture search, data augmentation experiments, or any iterative research workflow.

Here's how to replicate this with jl:

# 1. Create a dedicated instance for experiments
jl instance create --gpu A100 --storage 200 --name "auto-research" --yes

# 2. Create a branch for this research session
git checkout -b autoresearch/session-1

# 3. Run baseline to establish initial metric
jl run . --script train.py --on <machine_id> \
--requirements requirements.txt --no-follow --yes

# 4. Wait for it, then check results
sleep 15 && jl run logs <run_id> --tail 50

# The agent then loops autonomously:

# 5. Edit train.py with an idea, commit, and run
jl run . --script train.py --on <machine_id> \
--requirements requirements.txt --no-follow --yes

# 6. Check results
sleep 15 && jl run logs <run_id> --tail 30
# ... then steady polling
sleep 120 && jl run logs <run_id> --tail 50

# 7. Extract metric from logs and append to results.tsv
# Format: commit | val_metric | memory_gb | status | description
# e.g.: a1b2c3d | 1.432 | 12.5 | keep | increased hidden dim to 512

# 8. If improved: keep the commit, loop back to step 5
# If worse: git reset to revert, loop back to step 5
# If crashed: log as crash, fix or try something else

# 9. When done, pause the instance
jl instance pause <machine_id>

With 5-minute experiments, the agent can run ~12 experiments per hour — roughly 100 experiments in an overnight session. Check results.tsv and git log the next morning to see what your agent discovered.
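Step 7 of the loop (extracting a metric from the logs and appending a results.tsv row) can be sketched in shell. The log line format, metric name, and memory value here are assumptions for illustration; in a real loop the commit hash would come from git rev-parse --short HEAD and the log from /home/jl-runs/<run_id>/output.log:

```shell
# Fake training log standing in for the run's output.log
printf 'epoch 1 val_loss 2.104\nepoch 2 val_loss 1.432\n' > /tmp/output.log

# Pull the last reported metric (assumes "val_loss <value>" lines in the log)
metric=$(grep -o 'val_loss [0-9.]*' /tmp/output.log | tail -1 | awk '{print $2}')

commit=a1b2c3d   # placeholder; real loop: commit=$(git rev-parse --short HEAD)
mem=12.5         # placeholder; could be parsed from nvidia-smi via jl instance exec

# Append one row: commit | val_metric | memory_gb | status | description
printf '%s\t%s\t%s\tkeep\t%s\n' "$commit" "$metric" "$mem" \
  "increased hidden dim to 512" >> /tmp/results.tsv
cat /tmp/results.tsv
```

Because every experiment appends a row regardless of outcome, results.tsv ends up as the complete record while the git branch keeps only the improvements.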

tip

To get started, install agent skills with jl setup --agents all, then ask your agent something like: "Run a hyperparameter sweep comparing learning rates 1e-3, 1e-4, and 1e-5 on an A100 using my training script." The agent will handle the rest.