NVIDIA A100 GPU Price in 2026: Cost Per Hour, Cloud Pricing & Specs
The NVIDIA A100 GPU price has dropped significantly as everyone chases H100s and H200s — and that's great news if you want an A100. The GPU that powered the first wave of open-source LLMs is now available at $1.49/hr — and for most workloads, it's more than enough.
Here's the short answer if you're in a hurry...
The NVIDIA A100 80GB GPU costs $8,000-$15,000 to buy (new) or $4,000-$9,000 used. Cloud rental ranges from $1.49 to $3.43 per GPU hour (March 2026). Jarvislabs offers on-demand A100 80GB access at $1.49/hr with per-minute billing — no commitments, no minimum rental period.
At $1.49/hr, the A100 is roughly half the cost of an H100 ($2.99/hr) — and for inference, fine-tuning, and training medium-sized models, the performance difference rarely justifies the price gap.
How Much Does an NVIDIA A100 GPU Cost?
The NVIDIA A100 80GB GPU costs between $8,000 and $15,000 to buy new, or $4,000–$9,000 on the used market. For cloud GPU rental, A100 pricing ranges from $1.49 to $3.43 per GPU hour depending on provider and configuration. Jarvislabs offers the A100 80GB at $1.49/hr with per-minute billing and no commitments — making it the cheapest way to access A100 compute on demand.
NVIDIA A100 Price Snapshot (March 2026)
Cloud GPU Pricing Table
| Provider | GPU Config | On-Demand Price | Billing | Notes |
|---|---|---|---|---|
| Jarvislabs | A100 80GB SXM | $1.49/hr | Per-minute | Single GPU available. No commitments. |
| Lambda Labs | A100 80GB SXM | $2.06/hr | Per-hour | Academic-friendly. Limited availability. |
| RunPod | A100 80GB SXM | $1.49/hr | Per-second | Secure Cloud. Community Cloud from $1.39/hr. |
| AWS | p4de.24xlarge (8×A100 80GB) | $27.45/hr ($3.43/GPU) | Per-second | 8-GPU minimum. On-demand. |
| Azure | Standard_ND96asr_v4 | $27.20/hr ($3.40/GPU) | Per-hour | 8×A100 40GB. ND96amsr for 80GB. |
Note: Hyperscalers (AWS, Azure) typically require multi-GPU instances, which means you're paying for 8 GPUs even if you only need one. Jarvislabs, Lambda, and RunPod let you rent individual A100 GPUs — a massive cost advantage if you don't need 8 GPUs.
Key Takeaways
- A100 is now 40-60% cheaper than H100 across all providers, making it the clear value leader.
- RunPod matches at $1.49/hr (Secure Cloud), with Community Cloud from $1.39/hr.
- Lambda Labs offers A100 80GB at $2.06/hr with academic-friendly terms.
- Hyperscalers remain 2-3x more expensive per GPU due to bundled instance pricing.
- Jarvislabs offers per-minute billing so you never pay for idle time — a big deal when you're experimenting.
Hardware Purchase Pricing
| Configuration | New Price | Used / Refurbished | Notes |
|---|---|---|---|
| A100 40GB PCIe | $5,000-$8,000 | $2,000-$4,000 | Older config, good for inference |
| A100 80GB PCIe | $8,000-$12,000 | $4,000-$7,000 | Most common on secondary market |
| A100 80GB SXM | $10,000-$15,000 | $5,000-$9,000 | Best performance, requires SXM baseboard |
| DGX A100 (8×GPU) | $150,000-$200,000 | $80,000-$120,000 | Full system with NVLink and networking |
A100 purchase prices have dropped dramatically as enterprises upgrade to H100 and H200 systems. The secondary market is flooded with used units — this is genuinely the best time to buy if you're building on-premises infrastructure and want a reliable, proven GPU.
NVIDIA A100 Specs: Full GPU Specifications & VRAM Details
The A100 was NVIDIA's flagship data center GPU from the Ampere generation. Here are the full specs:
| Specification | A100 80GB SXM | A100 80GB PCIe | A100 40GB |
|---|---|---|---|
| Architecture | Ampere | Ampere | Ampere |
| GPU Memory | 80GB HBM2e | 80GB HBM2e | 40GB HBM2e |
| Memory Bandwidth | 2.0 TB/s | 2.0 TB/s | 1.6 TB/s |
| FP32 Performance | 19.5 TFLOPS | 19.5 TFLOPS | 19.5 TFLOPS |
| FP16 Tensor | 312 TFLOPS | 312 TFLOPS | 312 TFLOPS |
| TF32 Tensor | 156 TFLOPS | 156 TFLOPS | 156 TFLOPS |
| INT8 Tensor | 624 TOPS | 624 TOPS | 624 TOPS |
| TDP | 400W | 300W | 250W (PCIe) / 400W (SXM) |
| NVLink | 3rd Gen (600 GB/s) | Via NVLink Bridge (2-GPU, 600 GB/s) | 3rd Gen (600 GB/s) |
| PCIe | Gen4 ×16 | Gen4 ×16 | Gen4 ×16 |
| MIG Support | Up to 7 instances | Up to 7 instances | Up to 7 instances |
Why 80GB Matters
The jump from 40GB to 80GB isn't just "more VRAM." It fundamentally changes what you can do with a single GPU:
- LLaMA 3 8B fits comfortably in FP16 on a 40GB A100, but LLaMA 3 70B needs 80GB (quantized) or multiple GPUs
- Fine-tuning: 80GB lets you fine-tune 70B models on a single GPU with QLoRA — on 40GB, standard FP16 LoRA tops out around 13B models
- vLLM inference: More VRAM means larger KV-cache, which directly translates to higher throughput and longer context windows. Check our vLLM optimization guide for practical tips.
- Batch processing: Larger batches = better GPU utilization = lower cost per token
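As a rule of thumb, the weights alone set the floor: parameter count times bytes per parameter, with 1B parameters costing roughly 1GB per byte of precision. A minimal sketch (KV-cache, activations, and CUDA context come on top, so treat these as lower bounds):

```python
def weights_gb(params_billion, bytes_per_param):
    """Memory needed for model weights alone, in GB.

    1B params ~= 1 GB per byte of precision (FP16 = 2 bytes,
    INT8 = 1, INT4 = 0.5). KV-cache and activations add more
    on top, so this is a floor, not the full footprint.
    """
    return params_billion * bytes_per_param

print(weights_gb(8, 2))    # 16  -> LLaMA 3 8B in FP16: easy on 40GB
print(weights_gb(70, 2))   # 140 -> 70B in FP16: needs 2x 80GB GPUs
print(weights_gb(70, 1))   # 70  -> 70B in INT8: fits one 80GB, tight
```

This is why the 40GB/80GB split matters so much in practice: the 80GB card is the smallest single GPU where a quantized 70B model fits at all.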
NVIDIA A100 Performance Benchmarks (2026)
Here's what we see in practice:
LLM Inference Performance
| Model | A100 80GB (tokens/sec) | H100 80GB (tokens/sec) | A100 Cost/1M tokens |
|---|---|---|---|
| LLaMA 3 8B (FP16) | ~4,200 | ~8,500 | ~$0.10 |
| LLaMA 3 70B (INT8) | ~850 | ~2,100 | ~$0.52 |
| Mixtral 8x7B | ~1,800 | ~4,200 | ~$0.24 |
| Qwen 2.5 32B (FP16) | ~1,200 | ~3,000 | ~$0.37 |
These benchmarks use vLLM with default settings. Throughput can improve 20-40% with prefix caching and FP8 KV-cache optimization.
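The cost column above is just arithmetic on the hourly rate and sustained throughput. A quick sketch using the Jarvislabs $1.49/hr rate and the table's LLaMA 3 8B figure (your own numbers will vary with batch size and context length):

```python
def cost_per_million_tokens(hourly_rate_usd, tokens_per_sec):
    """Dollars per 1M generated tokens at a given sustained throughput."""
    tokens_per_hour = tokens_per_sec * 3600
    return hourly_rate_usd / tokens_per_hour * 1_000_000

# LLaMA 3 8B at ~4,200 tok/s on a $1.49/hr A100
print(round(cost_per_million_tokens(1.49, 4200), 2))  # ~0.10
```

The other rows in the table follow from the same formula with their respective throughputs.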
Training & Fine-Tuning Performance
| Task | A100 80GB | H100 80GB | A100 Savings |
|---|---|---|---|
| LoRA fine-tune LLaMA 3 8B | ~2 hours | ~0.8 hours | ~50% cheaper despite 2.5x slower |
| Full fine-tune LLaMA 3 8B (4×GPU) | ~18 hours | ~7 hours | ~35% cheaper |
| Train small model from scratch (1B params) | ~48 hours | ~20 hours | ~40% cheaper |
Bottom line: The A100 is slower than the H100, but it's roughly half the hourly price. Whether it's also cheaper per job depends on the H100 rate you compare against: the savings above assume typical market H100 rates, which often sit well above $2.99/hr. For workloads where time-to-completion isn't critical — fine-tuning overnight, running batch inference, experimenting with model architectures — the A100 keeps hourly spend low and delivers strong value per dollar.
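The slower-but-cheaper tradeoff reduces to a single comparison: since per-job cost is hours × rate, the faster GPU is cheaper per job exactly when its real-world speedup exceeds the price ratio. A sketch using the $1.49 and $2.99 rates quoted in this article (the speedup values are illustrative):

```python
def faster_gpu_wins(cheap_rate, fast_rate, speedup):
    """Per-job cost is hours * rate, so the faster GPU is cheaper
    per job exactly when its speedup exceeds the price ratio."""
    return speedup > fast_rate / cheap_rate

# H100 at $2.99/hr vs A100 at $1.49/hr: price ratio is ~2.0x
print(faster_gpu_wins(1.49, 2.99, 2.5))  # True  -> 2.5x speedup: H100 cheaper per job
print(faster_gpu_wins(1.49, 2.99, 1.8))  # False -> 1.8x speedup: A100 cheaper per job
```

When the speedup and price ratio are close, the A100's lower absolute hourly cost and flexibility usually tip the balance.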
A100 vs H100 Price and Performance Comparison
We have a detailed H100 vs A100 comparison, but here's the quick decision framework:
| Choose A100 if... | Choose H100 if... |
|---|---|
| Budget is the primary constraint | Training speed is critical |
| Running inference at moderate scale | Serving high-throughput production inference |
| Fine-tuning with LoRA/QLoRA | Training large models from scratch |
| Experimenting or prototyping | Need FP8 native support |
| Model fits in 80GB VRAM | Need the latest architecture optimizations |
| Cost per token matters more than latency | Time-to-first-token is your bottleneck |
For a deeper dive into the architectural differences — FP8, Transformer Engine, memory bandwidth — see our full H100 vs A100 guide.
A100 vs L4: Price and Performance Comparison
If you're choosing between an A100 and an L4, you're comparing a heavyweight against a lightweight. Both are excellent GPUs — for very different reasons:
| Spec | A100 80GB | L4 24GB |
|---|---|---|
| VRAM | 80GB HBM2e | 24GB GDDR6 |
| Memory Bandwidth | 2.0 TB/s | 300 GB/s |
| Architecture | Ampere | Ada Lovelace |
| FP16 Tensor | 312 TFLOPS | 121 TFLOPS |
| INT8 Tensor | 624 TOPS | 242 TOPS |
| TDP | 400W | 72W |
| Cloud Cost | ~$1.49/hr | ~$0.44/hr |
| Best For | Training + large inference | Cost-efficient small inference |
| Choose A100 if... | Choose L4 if... |
|---|---|
| Running models over 24GB VRAM | Running smaller models (under 24GB) |
| Need high memory bandwidth | Power efficiency is critical |
| Training or fine-tuning | Inference-only workloads |
| Serving 70B+ models | Serving 7B-13B models |
| Need maximum throughput | Need the lowest cost per token for small models |
The L4 is a power efficiency champion — 72W TDP means dramatically lower operating costs. For inference workloads that fit in 24GB (LLaMA 3 8B, Mistral 7B, Qwen 2.5 14B quantized), the L4 is hard to beat on cost. But once you need more than 24GB of VRAM or higher memory bandwidth, the A100 is really the next step up. For a detailed comparison with benchmarks, see our L4 vs A100 guide.
A100 vs RTX 4090: Data Center vs Consumer GPU Comparison
The RTX 4090 is NVIDIA's most powerful consumer GPU and a popular choice for local AI workloads. Here's how it stacks up against the A100:
| Spec | A100 80GB | RTX 4090 24GB |
|---|---|---|
| VRAM | 80GB HBM2e | 24GB GDDR6X |
| Memory Bandwidth | 2.0 TB/s | 1.0 TB/s |
| Architecture | Ampere | Ada Lovelace |
| FP16 Tensor | 312 TFLOPS | 165 TFLOPS |
| INT8 Tensor | 624 TOPS | 330 TOPS |
| TDP | 400W | 450W |
| Multi-GPU | NVLink (600 GB/s) | No NVLink |
| MIG Support | Yes (up to 7 instances) | No |
| Cloud Cost | ~$1.49/hr | Not widely available |
| Choose A100 if... | Choose RTX 4090 if... |
|---|---|
| Running models that need >24GB VRAM | Running models that fit in 24GB |
| Need multi-GPU scaling with NVLink | Working locally on a single GPU |
| Running production inference serving | Prototyping and personal projects |
| Need MIG for multi-tenant workloads | Cost-sensitive and own the hardware |
| Cloud-based workflow | Prefer local development |
The RTX 4090 offers strong single-GPU performance at a lower purchase price (~$1,600-$2,000), but it lacks the A100's VRAM capacity, multi-GPU interconnect, and data center features. For serious LLM work — especially models over 13B parameters — the A100's 80GB VRAM is the deciding factor.
Buy vs Rent A100 GPU: Cost Analysis
Monthly Cost Comparison
Scenario: 1×A100 80GB SXM
| Usage Pattern | Cloud Cost (Jarvislabs) | Ownership Cost | Winner |
|---|---|---|---|
| 24/7 (720 hrs/mo) | $1.49 × 720 = ~$1,073/mo | ~$500/mo (depreciation + power + cooling) | Ownership (after ~12 months) |
| 8 hrs/day (240 hrs/mo) | $1.49 × 240 = ~$358/mo | Same fixed: ~$500/mo | Cloud |
| Variable / on-demand | Pay only when used | Same fixed: ~$500/mo | Cloud (always) |
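The table's break-even logic is easy to reproduce. Taking the ~$500/mo ownership estimate above as given, cloud wins whenever your monthly hours fall below the break-even utilization:

```python
HOURLY_RATE = 1.49       # Jarvislabs A100 80GB on-demand
OWNERSHIP_MONTHLY = 500  # depreciation + power + cooling (estimate from the table)

def monthly_cloud_cost(hours_per_month):
    return HOURLY_RATE * hours_per_month

print(round(monthly_cloud_cost(720)))  # 1073 -> 24/7: ownership cheaper
print(round(monthly_cloud_cost(240)))  # 358  -> 8 hrs/day: cloud cheaper

# Break-even utilization: ~336 hrs/month, i.e. roughly 11 hours/day
break_even_hours = OWNERSHIP_MONTHLY / HOURLY_RATE
print(round(break_even_hours))  # 336
```

In other words: below roughly 11 GPU-hours per day of sustained use, renting is the cheaper option even before counting the hidden costs below.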
The Hidden Costs of Buying
Beyond the purchase price, ownership adds:
- Power: 400W × $0.12/kWh × 24hrs × 30days = ~$35/month per GPU
- Cooling: Industrial cooling for 400W adds $15-30/month per GPU
- Network: InfiniBand for multi-node clusters costs $2,000-5,000 upfront
- Depreciation: A100s are losing ~20-30% of value per year as newer GPUs ship
- Risk: Hardware failure means downtime and replacement costs — no SLA to fall back on
Our recommendation: Rent unless you have a guaranteed 24/7 workload with 18+ months of runway and in-house infrastructure expertise. The A100 is losing value fast, so buying only makes sense if you're sure you'll use it long enough to break even.
Frequently Asked Questions About NVIDIA A100 Price
How much does an NVIDIA A100 GPU cost?
The NVIDIA A100 80GB GPU costs $8,000-$15,000 new or $4,000-$9,000 used (March 2026). For cloud rentals, A100 prices range from $1.49 to $3.43 per GPU hour depending on the provider, with Jarvislabs offering competitive on-demand rates at $1.49/hour with per-minute billing.
Is the A100 still worth it in 2026?
For cloud rental — absolutely. The A100 offers the best price-to-performance ratio for inference, fine-tuning, and medium-scale training. For purchasing hardware, it depends: the secondary market offers great deals, but depreciation is accelerating as H100 and H200 become the new standard.
A100 40GB vs 80GB — which should I rent?
Always go for the 80GB if available. The price difference is minimal ($0.20-0.50/hr more), but the 80GB version lets you run models and batch sizes that simply won't fit in 40GB. The 40GB variant is increasingly hard to find on cloud providers anyway.
How does A100 pricing compare to H100?
The A100 is typically 40-60% cheaper than the H100 per hour. On Jarvislabs, A100 80GB costs $1.49/hr vs $2.99/hr for H100. The H100 is 2-3x faster for most workloads, so the cost per unit of work is often similar — but the A100 wins when you don't need maximum speed and want to keep hourly costs low.
Can I run LLaMA 3 70B on an A100?
Yes. LLaMA 3 70B fits on a single A100 80GB when quantized to INT8 or INT4 (using vLLM with AWQ or GPTQ quantization). For full FP16 precision, you'll need 2 A100 80GB GPUs with tensor parallelism. See our vLLM quantization guide for detailed benchmarks.
Is the A100 good for vLLM inference?
Excellent choice. The A100's 80GB VRAM and 2.0 TB/s memory bandwidth make it a strong platform for vLLM inference. We've published extensive benchmarks in our vLLM optimization guide and vLLM quantization guide — all tested on A100 and H200 GPUs.
How much does it cost to fine-tune a model on A100?
Fine-tuning costs on A100 80GB (estimated at $1.49/hr):
- 7B model (LoRA): ~2 hours = ~$3
- 70B model (QLoRA): ~8 hours = ~$12
- 8B model (Full fine-tune, 4×A100): ~18 hours × 4 GPUs = ~$107
These are significantly cheaper than equivalent H100 runs. See our LLM fine-tuning tutorial for step-by-step instructions.
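All of these estimates are hours × GPUs × hourly rate, so plugging in your own job lengths is straightforward:

```python
def finetune_cost_usd(hours, gpus=1, rate=1.49):
    """Total job cost at Jarvislabs' $1.49/hr per-A100 rate."""
    return hours * gpus * rate

print(round(finetune_cost_usd(2)))           # 3   -> 7B LoRA
print(round(finetune_cost_usd(8)))           # 12  -> 70B QLoRA
print(round(finetune_cost_usd(18, gpus=4)))  # 107 -> full fine-tune, 4xA100
```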
Will A100 prices continue to drop?
Cloud prices have largely stabilized. We don't expect significant further drops since providers have already adjusted to the H100/H200 market reality. Purchase prices for used hardware may decline another 10-15% through 2026 as more enterprises upgrade to Blackwell GPUs.
How much VRAM does the A100 have?
The NVIDIA A100 comes in two VRAM configurations: 40GB HBM2e (older, less common) and 80GB HBM2e (standard). The 80GB variant offers 2.0 TB/s memory bandwidth on the SXM version, making it well-suited for large language model inference and training.
Is it cheaper to rent or buy an A100 for 24/7 use?
At current cloud rates, running a single A100 80GB on Jarvislabs 24/7 for a year costs approximately $1.49 × 24 × 365 = ~$13,052. A used A100 80GB SXM costs $5,000-$9,000, plus ~$500-600/year in power and cooling. The break-even point is roughly 6-9 months — but factor in depreciation risk, maintenance, and the flexibility of cloud before committing to a purchase.
How much does a DGX A100 cost?
The NVIDIA DGX A100 — a complete 8×A100 server with NVLink, networking, and NVMe storage — costs $150,000-$200,000 new or $80,000-$120,000 used. For most teams, renting 8 individual A100 GPUs in the cloud is far more practical: 8 × $1.49/hr = $11.92/hr on Jarvislabs, with no upfront capital.
What is the A100 memory bandwidth?
The A100 80GB SXM delivers 2.0 TB/s memory bandwidth via HBM2e — roughly 6.7x the L4's 300 GB/s and critical for memory-bandwidth-bound workloads like LLM token generation. The PCIe version also delivers 2.0 TB/s. The older 40GB variant has 1.6 TB/s.
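Why bandwidth matters so much: during token generation, each new token streams the full set of weights from memory, so a rough single-sequence ceiling is bandwidth divided by model size. This is a roofline sketch that ignores KV-cache traffic and kernel overhead; batching raises aggregate throughput far beyond it:

```python
def decode_ceiling_tok_per_s(bandwidth_gb_s, model_size_gb):
    """Upper bound on single-sequence decode speed: each generated
    token reads all model weights from HBM once (roofline estimate)."""
    return bandwidth_gb_s / model_size_gb

# 70B model in INT8 (~70 GB of weights)
print(round(decode_ceiling_tok_per_s(2000, 70)))  # 29 -> A100 at 2.0 TB/s
print(round(decode_ceiling_tok_per_s(300, 70)))   # 4  -> L4 at 300 GB/s
```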
A100 SXM vs PCIe — what's the difference?
The SXM version supports NVLink (600 GB/s GPU-to-GPU bandwidth) and has a higher TDP (400W vs 300W for PCIe). In practice, the SXM version is faster for multi-GPU training due to NVLink, while the PCIe version fits in standard server chassis. Both have the same VRAM (80GB) and memory bandwidth (2.0 TB/s). Cloud providers mostly offer SXM.
Can the A100 run Stable Diffusion?
Yes, comfortably. The A100's 80GB VRAM can load SDXL, ControlNet, and an upscaler simultaneously — something smaller GPUs struggle with. It generates SDXL images at ~12-15 images/min at 1024×1024. For pure image generation on a budget, the L4 is cheaper per image, but the A100 excels when you need multiple models loaded at once.
What models fit on an A100 80GB?
In FP16: models up to ~40B parameters (LLaMA 3 8B, Mistral 7B, Qwen 2.5 32B). In INT8: models up to ~70B parameters (LLaMA 3 70B with vLLM). In INT4/GPTQ: models up to ~130B parameters (LLaMA 3.1 70B with extra KV-cache room). The 80GB VRAM is what makes the A100 versatile — most production LLMs fit on a single card.
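Those ceilings follow from dividing usable VRAM by bytes per parameter. The headroom factor below is an assumption, reserving some VRAM for KV-cache and activations; it lands in the same ballpark as the figures above:

```python
def max_params_billion(vram_gb, bytes_per_param, headroom=0.85):
    """Largest weight count that fits, reserving ~15% of VRAM
    (assumed) for KV-cache and activations."""
    return vram_gb * headroom / bytes_per_param

print(round(max_params_billion(80, 2)))    # 34  -> FP16
print(round(max_params_billion(80, 1)))    # 68  -> INT8
print(round(max_params_billion(80, 0.5)))  # 136 -> INT4
```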
Where can I rent an A100 GPU?
You can rent individual A100 80GB GPUs from Jarvislabs ($1.49/hr, per-minute billing), RunPod ($1.49/hr), and Lambda Labs ($2.06/hr). AWS and Azure offer A100s only in 8-GPU instances ($3.40-$3.43/GPU/hr). For single-GPU workloads, Jarvislabs and RunPod offer the best value.
NVIDIA A100 GPU: Is It Worth It in 2026?
In 2026, the A100 80GB is the best price-to-performance GPU for teams that don't need the latest Hopper or Blackwell silicon. Whether you're fine-tuning LLMs, running inference with vLLM, or training medium-sized models, the A100 gets the job done at 40-60% less per hour than the H100.
- For inference: Start with an A100 80GB. If throughput isn't enough, scale up to H100.
- For fine-tuning: A100 80GB handles LoRA/QLoRA on models up to 70B parameters.
- For large model serving: Use 2×A100 80GB with tensor parallelism for 70B+ models in full precision.
- For training from scratch: Consider H100 or H200 — the speed difference justifies the cost for long training runs.
Ready to try it? Launch an A100 on Jarvislabs — it takes 90 seconds, bills per minute, and requires no commitment. Or reach out at support@jarvislabs.ai for custom quotes on multi-GPU setups.
Once you're set up, check out our guides on optimizing inference with vLLM or running quantized models to get the most out of your A100.
Last updated: March 2026. Prices verified against provider websites.
Related Guides:
- NVIDIA L4 GPU: Price & Specs Guide — the budget alternative for inference under 24GB
- NVIDIA L4 vs A100 Comparison — detailed specs, benchmarks, and when to choose each
- NVIDIA H100 vs A100 Comparison — for when you're choosing between A100 and H100
- NVIDIA H200 Price Guide — the next-gen alternative with 141GB HBM3e
- vLLM Optimization Techniques — get 20-40% more throughput from your A100
- vLLM Quantization Guide — run larger models on A100 with quantization
Need custom A100 pricing? For multiple GPUs or monthly commitments, we offer volume discounts that can reduce your hourly rate significantly. Contact us at support@jarvislabs.ai for a custom quote.