
Cloud Run Monthly Cost Estimator

Enter your expected usage and rates to estimate monthly spend for Cloud Run services.

Note: This is an estimate. Real billing may vary by region, CPU allocation mode, idle pricing behavior, and discounts.

How this Cloud Run pricing calculator works

This calculator gives you a practical monthly estimate for a Cloud Run service by combining four major cost components: compute (vCPU), memory, request volume, and network egress. It is designed for fast scenario planning so you can compare architecture choices before deployment.

The model is intentionally transparent. You can override every default pricing value and even adjust free tier assumptions. That means you can adapt the calculator to different regions, enterprise discounts, or future pricing updates without touching code.

Core estimation formula

busy_instance_seconds = (monthly_requests × avg_request_duration_seconds) / concurrency
baseline_instance_seconds = min_instances × 730 × 3600
effective_instance_seconds = max(busy_instance_seconds, baseline_instance_seconds)
total_vcpu_seconds = effective_instance_seconds × vcpu_per_instance
total_gib_seconds = effective_instance_seconds × memory_gib_per_instance

After that, the calculator subtracts free tier quantities (if enabled), multiplies by your selected rates, and adds request + egress charges.
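The full pipeline above can be sketched in a few lines of Python. The rates and free-tier quantities below are illustrative placeholders, not official pricing; override them with your region's current published rates, just as the calculator lets you do:

```python
def estimate_monthly_cost(
    monthly_requests: float,
    avg_request_duration_s: float,
    concurrency: int,
    vcpu_per_instance: float,
    memory_gib_per_instance: float,
    min_instances: int = 0,
    egress_gib: float = 0.0,
    # Illustrative USD rates -- replace with current published pricing.
    vcpu_rate: float = 0.000024,      # per vCPU-second
    mem_rate: float = 0.0000025,      # per GiB-second
    request_rate: float = 0.40e-6,    # per request
    egress_rate: float = 0.12,        # per GiB
    # Illustrative monthly free-tier quantities; set to 0 to disable.
    free_vcpu_s: float = 180_000,
    free_gib_s: float = 360_000,
    free_requests: float = 2_000_000,
) -> float:
    # Traffic-driven instance time, reduced by per-instance concurrency.
    busy_s = (monthly_requests * avg_request_duration_s) / concurrency
    # Always-on floor from min instances (730 hours per month).
    baseline_s = min_instances * 730 * 3600
    effective_s = max(busy_s, baseline_s)

    vcpu_s = effective_s * vcpu_per_instance
    gib_s = effective_s * memory_gib_per_instance

    # Subtract free tier, then apply rates; request + egress charges last.
    cost = max(vcpu_s - free_vcpu_s, 0.0) * vcpu_rate
    cost += max(gib_s - free_gib_s, 0.0) * mem_rate
    cost += max(monthly_requests - free_requests, 0.0) * request_rate
    cost += egress_gib * egress_rate
    return cost
```

For example, 10 million requests at 200 ms with concurrency 10, 1 vCPU, and 0.5 GiB yields 200,000 effective instance-seconds, most of which the illustrative free tier absorbs, leaving requests as the dominant charge.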

Understanding Cloud Run cost drivers

1) Request volume

More requests usually mean more compute seconds consumed. But request count by itself is not enough—duration matters just as much. A service handling 1 million very short requests may cost less than one handling 200,000 heavy requests.

2) Average request duration

Duration directly scales your compute and memory time. If you cut average latency from 400 ms to 200 ms, you can nearly halve usage-based compute cost, all else equal.

3) Concurrency

Concurrency determines how many requests each instance handles at once. Higher concurrency tends to reduce required instance-seconds, which can lower total cost. However, concurrency should be tuned carefully because very high values can increase p95/p99 latency for CPU- or memory-bound workloads.
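Because busy instance-seconds divide by concurrency, the effect is easy to check directly. A quick sketch of the busy-seconds term alone, using hypothetical traffic numbers:

```python
# Busy instance-seconds scale inversely with concurrency
# (hypothetical workload: 5M requests/month at 300 ms each).
monthly_requests = 5_000_000
avg_duration_s = 0.3

busy_seconds = {
    c: monthly_requests * avg_duration_s / c
    for c in (1, 10, 80)
}
# concurrency 1 -> 1,500,000 s; 10 -> 150,000 s; 80 -> 18,750 s
```

The jump from 1 to 10 saves far more absolute instance time than the jump from 10 to 80, which is why pushing concurrency to the maximum often buys little while risking latency.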

4) CPU and memory allocation

Your selected instance size multiplies every second you run. Over-provisioning (for example, 2 vCPU when 0.5 is enough) can significantly increase monthly spend. Right-size instance resources based on profiling, not guesswork.

5) Min instances and baseline spend

Min instances are useful for reducing cold starts and smoothing latency, but they create a baseline cost floor. If traffic is bursty or low overnight, this baseline can dominate monthly spend. The calculator models this as a minimum level of instance-seconds.
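This floor is the max() in the core formula. A minimal sketch with hypothetical numbers shows how the baseline can dominate when traffic is light:

```python
# Baseline cost floor from min instances (hypothetical numbers).
min_instances = 2
baseline_s = min_instances * 730 * 3600   # 5,256,000 always-on seconds/month

busy_s = 400_000                          # hypothetical traffic-driven seconds
effective_s = max(busy_s, baseline_s)     # the baseline dominates here
```

Here traffic would justify only ~400k instance-seconds, but two always-on instances lock in more than 5.2 million, so billing is driven almost entirely by the floor, not by requests.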

Step-by-step: using the calculator effectively

  • Start with realistic monthly request volume from logs or analytics.
  • Use production median duration (or slightly conservative numbers) rather than local test timings.
  • Set concurrency to your tested value, not the platform maximum by default.
  • Confirm vCPU and memory from your current Cloud Run revision settings.
  • Add expected egress for APIs, assets, or downstream responses.
  • Turn free tier on/off to compare startup stage vs. growth stage economics.

Example scenarios

Low-traffic internal API

A small internal service with low request volume and short execution can often remain mostly inside the free tier, especially with zero min instances. In this case, egress may become the only meaningful charge.
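A quick back-of-envelope check of this scenario, using hypothetical traffic numbers and illustrative free-tier quantities:

```python
# Hypothetical low-traffic internal API: 300k requests/month, 150 ms average
# duration, concurrency 10, 1 vCPU, 0.5 GiB, zero min instances.
busy_s = 300_000 * 0.150 / 10   # 4,500 instance-seconds
vcpu_s = busy_s * 1.0           # 4,500 vCPU-seconds
gib_s = busy_s * 0.5            # 2,250 GiB-seconds
# All three usage figures sit far below typical monthly free-tier quantities
# (illustrative: 180k vCPU-s, 360k GiB-s, 2M requests), so compute and
# request charges net to zero and egress is the only remaining line item.
```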

High-traffic public endpoint

For high-volume APIs, the largest savings usually come from latency optimization and smarter concurrency tuning. Even small improvements in request duration translate into large absolute savings at scale, because every shaved millisecond is multiplied across millions of requests.

Tips to reduce Cloud Run costs

  • Optimize response time: cache aggressively, reduce cold path overhead, and trim dependency startup work.
  • Tune concurrency with load tests: find the best price/performance point instead of using defaults.
  • Right-size memory: keep enough headroom, but avoid oversized allocations.
  • Evaluate min instances periodically: keep them only where latency SLOs require it.
  • Compress payloads and limit egress: outbound bandwidth can become a hidden cost center.
  • Separate workloads: move background jobs to dedicated services sized for batch behavior.

Important caveats

This page is a planning tool, not an official billing statement. Final charges can vary by region, committed use discounts, free tier policy changes, networking path, and product updates. Always validate assumptions against your provider's current pricing documentation.
