DataCrunch
$0.17/GPU-hr
Verify pricing on the provider's official site before checkout.
Select workload
RTX 4090 and L4 often win on cost for smaller open-weight models.
Serving open-weight LLMs, chat apps, and agent backends, where latency and VRAM headroom matter most.
Cost per result
Hourly rates are converted to cost per 1M tokens using conservative vLLM throughput estimates, so different GPUs can be compared directly.
GPU · Provider · Throughput · Model (precision) · Cost
RTX 4090 · iwinv · 1,050 tok/s · Llama 3.1 8B (FP16) · $0.13/1M tokens
RTX A6000 · DataCrunch · 920 tok/s · Llama 3.1 8B (FP16) · $0.18/1M tokens
RTX A6000 · iwinv · 920 tok/s · Llama 3.1 8B (FP16) · $0.28/1M tokens
L4 · AWS · 620 tok/s · Llama 3.1 8B (INT8) · $0.36/1M tokens
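The conversion behind a cost-per-result figure can be sketched as follows. This is a minimal illustration, assuming the figure is simply the hourly rate divided by the tokens generated in one hour at a sustained throughput; the page does not state its exact formula, and real costs also depend on utilization, batching, and idle time. The function names and the $1.00/hr, 1,000 tok/s inputs are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def cost_per_million_tokens(hourly_usd: float, tokens_per_second: float) -> float:
    """Convert a GPU hourly rate into USD per 1M generated tokens.

    Assumes the GPU sustains `tokens_per_second` for the whole hour
    (an assumption; the source does not spell out its method).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_usd / tokens_per_hour * 1_000_000


def implied_hourly_rate(usd_per_million: float, tokens_per_second: float) -> float:
    """Inverse conversion: the hourly rate implied by a cost-per-1M-tokens figure."""
    return usd_per_million * tokens_per_second * 3600 / 1_000_000


# Illustrative inputs only (not taken from any provider listing).
example = cost_per_million_tokens(hourly_usd=1.00, tokens_per_second=1000)
print(f"${example:.2f}/1M tokens")
```

Because the cost scales inversely with throughput, a cheaper card with lower throughput can still lose to a pricier, faster one, which is why the comparison is made per result rather than per hour.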
Best fit
Ranked by price and freshness from tracked provider data.
$0.17/GPU-hr
$0.38/GPU-hr
$0.51/GPU-hr
$0.61/GPU-hr
$0.64/GPU-hr
$0.70/GPU-hr
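One possible reading of "ranked by price and freshness" is sketched below: drop quotes older than a cutoff, then sort by price, breaking ties in favor of the most recently observed quote. The page does not describe its actual ranking rule, so the `Quote` type, the `rank_quotes` function, the 7-day cutoff, and all provider labels, prices, and timestamps here are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Quote:
    provider: str             # hypothetical provider label
    usd_per_gpu_hour: float   # tracked hourly price
    observed_at: datetime     # when this price was last seen


def rank_quotes(quotes, now, max_age=timedelta(days=7)):
    """Keep quotes no older than max_age, then sort cheapest first.

    Ties on price go to the more recently observed quote.
    """
    fresh = [q for q in quotes if now - q.observed_at <= max_age]
    return sorted(
        fresh,
        key=lambda q: (q.usd_per_gpu_hour, -q.observed_at.timestamp()),
    )


# Made-up data for illustration only.
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
quotes = [
    Quote("provider-a", 0.50, datetime(2025, 1, 9, tzinfo=timezone.utc)),
    Quote("provider-b", 0.40, datetime(2024, 12, 1, tzinfo=timezone.utc)),  # stale
    Quote("provider-c", 0.50, datetime(2025, 1, 10, tzinfo=timezone.utc)),
    Quote("provider-d", 0.45, datetime(2025, 1, 8, tzinfo=timezone.utc)),
]
ranked = rank_quotes(quotes, now)
for q in ranked:
    print(f"{q.provider}: ${q.usd_per_gpu_hour:.2f}/GPU-hr")
```

Filtering before sorting keeps an outdated low price from topping the list, which is one reason a tracker might weigh freshness alongside price.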