Google Cloud Run Cost Calculator

Estimate your monthly Cloud Run spend based on requests, execution time, CPU, memory, and outbound network traffic.

Monthly requests

Average request duration (ms)

vCPU allocated per instance

Memory allocated per instance (GiB)

Average concurrency per instance

Minimum instances kept warm

Outbound network egress (GB/month)

Region pricing tier

Billing model assumption

Currency display

Apply free tier (2M requests, 180k vCPU-sec, 360k GiB-sec)

How this Cloud Run calculator works

This calculator estimates your monthly Google Cloud Run bill using a practical model: requests generate compute time, compute time converts to vCPU-seconds and GiB-seconds, then pricing rates are applied. Finally, request charges and network egress charges are added.

Cloud Run pricing depends on your workload shape. A short bursty API and a long-running streaming endpoint may process the same number of requests but cost very different amounts. That is why this page asks for request duration, concurrency, CPU, memory, and minimum instances.

Inputs explained

1) Monthly requests

Total HTTP requests handled in a month. This affects both request-based charges and total compute demand. If your traffic is highly seasonal, use an average month and also test your peak month.

2) Average request duration

The mean processing time per request in milliseconds. Longer processing times directly increase billable compute seconds. If you have a mixed workload, use a weighted average or run several scenarios.
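For a mixed workload, the weighted average can be worked out from per-endpoint figures. The sketch below assumes you have monthly request counts and mean durations for each endpoint; the function name and the sample numbers are hypothetical.

```python
def weighted_average_duration_ms(endpoints):
    """endpoints: list of (monthly_requests, avg_duration_ms) pairs."""
    total_requests = sum(requests for requests, _ in endpoints)
    if total_requests == 0:
        raise ValueError("at least one endpoint must have requests")
    # Weight each endpoint's duration by its share of total traffic.
    weighted_sum = sum(requests * duration for requests, duration in endpoints)
    return weighted_sum / total_requests


# Hypothetical mix: many fast reads plus a smaller number of slow report requests.
mix = [(900_000, 80), (100_000, 1_200)]
print(weighted_average_duration_ms(mix))  # 192.0
```

A simple unweighted mean of 80 ms and 1,200 ms would give 640 ms and overstate compute time by more than three times for this mix.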

3) vCPU and memory per instance

Cloud Run allocates resources at the container instance level. Higher allocations improve performance and reduce tail latency for some apps, but they also raise cost. Tune these values based on real profiling.

4) Concurrency

Concurrency is the number of simultaneous requests one instance can process. Higher concurrency usually lowers instance-seconds and total cost, but setting it too high can hurt response times if your app is CPU-heavy.
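To see how strongly concurrency drives instance-seconds, the following sketch applies the active instance-seconds formula from the formula summary at a few concurrency levels. The traffic figures are hypothetical, and the model assumes instances stay fully packed at the stated concurrency, which real traffic rarely achieves.

```python
def active_instance_seconds(monthly_requests, avg_duration_ms, concurrency):
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    # Total request-seconds, spread across requests served in parallel.
    return monthly_requests * (avg_duration_ms / 1000) / concurrency


# Hypothetical workload: 5M requests per month at 200 ms each.
for concurrency in (1, 10, 80):
    seconds = active_instance_seconds(5_000_000, 200, concurrency)
    print(f"concurrency={concurrency:>2}: {seconds:,.0f} instance-seconds")
```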

5) Minimum instances

Minimum instances keep containers warm for latency-sensitive workloads. This reduces cold-start latency, but it introduces a baseline spend even when request volume is low.
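The baseline can be put in concrete terms. The sketch below assumes a 30-day month and uses the max() model from the formula summary to find the request volume at which active usage overtakes the warm-instance baseline; the function names and inputs are illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month (2,592,000 s)


def baseline_instance_seconds(min_instances):
    return min_instances * SECONDS_PER_MONTH


def break_even_requests(min_instances, avg_duration_ms, concurrency):
    """Monthly requests at which active instance-seconds equal the baseline."""
    baseline = baseline_instance_seconds(min_instances)
    # Rearranged from: requests * (duration_ms / 1000) / concurrency = baseline
    return baseline * concurrency * 1000 / avg_duration_ms


print(baseline_instance_seconds(1))               # 2592000
print(f"{break_even_requests(1, 200, 10):,.0f}")  # 129,600,000
```

Below the break-even volume, the estimate in this model is set entirely by the minimum instances rather than by traffic.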

6) Egress traffic

Outbound data transfer is billed separately from compute and requests. Media APIs, downloads, and large JSON responses can make egress a meaningful part of your Cloud Run bill.
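If you do not know your monthly egress, a rough figure can be derived from request volume and average response size. This sketch assumes decimal gigabytes (10^9 bytes) and ignores protocol overhead; the function name and sample numbers are hypothetical.

```python
BYTES_PER_GB = 1_000_000_000  # assumption: decimal GB, not GiB


def monthly_egress_gb(monthly_requests, avg_response_bytes):
    # Total bytes sent out per month, converted to gigabytes.
    return monthly_requests * avg_response_bytes / BYTES_PER_GB


# Hypothetical API: 5M requests per month with 40 KB JSON responses.
print(monthly_egress_gb(5_000_000, 40_000))  # 200.0
```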

Formula summary used by the calculator

Active instance-seconds = (monthly requests × average duration in seconds) ÷ concurrency
Baseline instance-seconds = min instances × seconds per month
Billable instance-seconds = max(active instance-seconds, baseline instance-seconds)
vCPU-seconds = billable instance-seconds × vCPU
GiB-seconds = billable instance-seconds × memory (GiB)

The final monthly estimate is: compute cost + request cost + egress cost, with optional free-tier deductions.
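The formulas above can be sketched end to end as follows. This is a minimal illustration, not the calculator's actual source: the unit prices in Rates are placeholder assumptions that should be replaced with current rates for your region, the month is assumed to be 30 days, and the free tier is assumed to be deducted from usage before pricing.

```python
from dataclasses import dataclass

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month

# Free-tier allowances as listed in the calculator's free-tier option.
FREE_REQUESTS = 2_000_000
FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000


@dataclass(frozen=True)
class Rates:
    """Placeholder unit prices in USD; assumptions, not official rates."""
    per_vcpu_second: float = 0.000024
    per_gib_second: float = 0.0000025
    per_million_requests: float = 0.40
    per_gb_egress: float = 0.12


def estimate_monthly_cost(requests, duration_ms, vcpu, memory_gib,
                          concurrency, min_instances, egress_gb,
                          rates=Rates(), apply_free_tier=True):
    # Instance-seconds, following the formula summary.
    active = requests * (duration_ms / 1000) / concurrency
    baseline = min_instances * SECONDS_PER_MONTH
    billable = max(active, baseline)

    vcpu_seconds = billable * vcpu
    gib_seconds = billable * memory_gib
    billable_requests = requests

    if apply_free_tier:
        # Deduct allowances from usage, never going below zero.
        vcpu_seconds = max(0.0, vcpu_seconds - FREE_VCPU_SECONDS)
        gib_seconds = max(0.0, gib_seconds - FREE_GIB_SECONDS)
        billable_requests = max(0, requests - FREE_REQUESTS)

    compute = (vcpu_seconds * rates.per_vcpu_second
               + gib_seconds * rates.per_gib_second)
    request_cost = billable_requests / 1_000_000 * rates.per_million_requests
    egress = egress_gb * rates.per_gb_egress
    return {"compute": compute, "requests": request_cost,
            "egress": egress, "total": compute + request_cost + egress}


# Hypothetical workload: 10M requests, 200 ms each, 1 vCPU, 0.5 GiB.
estimate = estimate_monthly_cost(
    requests=10_000_000, duration_ms=200, vcpu=1, memory_gib=0.5,
    concurrency=10, min_instances=0, egress_gb=50)
for item, cost in estimate.items():
    print(f"{item:>8}: ${cost:,.2f}")
```

With the placeholder rates, this example workload comes to about $9.68 per month with the free tier applied and about $15.05 without it.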

Why estimates can differ from your invoice

Real cloud invoices include details this simplified model does not always capture perfectly: region-specific rate updates, internal traffic behavior, revisions, startup patterns, or other products connected to your service. Use this as a planning calculator, then validate with billing export data.

Real concurrency varies with traffic patterns, so the average you enter may not match what instances actually sustain.
Different regions can have slightly different network costs.
Background jobs and Cloud Run Jobs may have different usage profiles than web requests.
Observability, logging, and storage products may add separate line items.

Practical ways to reduce Cloud Run costs

Optimize request duration first

Faster handlers reduce compute seconds immediately. Profile database calls, reduce payload sizes, add caching, and avoid unnecessary network hops.

Set realistic CPU and memory limits

Over-provisioned containers quietly burn budget. Start from measured peak needs, not guesswork. Move up only when metrics prove it.

Tune concurrency carefully

If your app is I/O bound, raising concurrency can produce large savings. If it is CPU bound, too much concurrency may cause latency spikes and retries, increasing costs elsewhere.

Control egress-heavy endpoints

Compress responses, use CDN caching for public content, and avoid repeatedly transferring large assets. For APIs, support sparse field selection and set sensible pagination defaults.

Example use cases

Internal API: low request count, high memory workload, small egress.
Public web backend: high request count, moderate duration, moderate concurrency.
Webhook processor: bursty traffic with short execution and low memory footprint.
AI inference endpoint: lower concurrency, higher CPU/memory, potentially high egress.

Final takeaway

A Cloud Run pricing estimate becomes truly useful when you test several scenarios: current traffic, expected growth, and worst-case monthly spikes. Use this calculator to create those scenarios quickly, then pair it with production metrics to guide scaling and budgeting decisions.