What this Ceph storage calculator estimates
This calculator gives you a fast, practical estimate of usable Ceph capacity from raw disk space. It accounts for the two biggest capacity reducers in almost every cluster: your data protection policy (replication or erasure coding) and your operational reserve.
It is designed for early planning, budget discussions, and architecture comparisons. For production rollouts, you should always validate with real CRUSH rules, failure domains, object sizes, and workload patterns.
How Ceph capacity works in plain language
1) Start with raw capacity
Raw capacity is simply: number of OSDs × size per OSD. If you have 24 OSDs at 12 TB each, that is 288 TB raw.
2) Subtract operational overhead
Most teams reserve space for healthy operations and recovery behavior. This includes temporary rebalance pressure, metadata overhead, and a "do not run hot" margin. A common planning reserve is 10% to 20%, depending on risk tolerance.
3) Apply protection efficiency
- Replication: Usable is roughly effective raw (after reserve) / replication size.
- Erasure coding: Usable is roughly effective raw (after reserve) × k/(k+m).
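The three steps above can be sketched as a small Python function. The function and parameter names are illustrative, not part of any Ceph tooling:

```python
def usable_capacity_tb(osd_count, osd_size_tb, reserve=0.10,
                       replication_size=None, ec_k=None, ec_m=None):
    """Estimate usable Ceph capacity in TB from raw disk space.

    Pass either replication_size (e.g. 3) or ec_k/ec_m (e.g. 4 and 2).
    """
    raw = osd_count * osd_size_tb        # step 1: raw capacity
    effective = raw * (1 - reserve)      # step 2: subtract operational reserve
    if replication_size is not None:     # step 3: apply protection efficiency
        return effective / replication_size
    return effective * ec_k / (ec_k + ec_m)
```

For example, `usable_capacity_tb(24, 12, replication_size=3)` takes the 288 TB raw cluster from above to roughly 86.4 TB usable with a 10% reserve and 3x replication.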
Replication vs erasure coding
Replication (size=3 is common)
Replication is operationally simple and often offers strong read behavior. With size=3, each object is stored three times, so efficiency is about 33.3%. You trade capacity efficiency for straightforward durability and recovery.
Erasure coding (example 4+2, 8+3, etc.)
Erasure coding improves capacity efficiency by splitting data into chunks plus parity. For 4+2, efficiency is 4/6 = 66.7%, often doubling usable capacity versus 3x replication. The tradeoff is additional CPU and network overhead, plus design complexity depending on workload and pool profile.
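The efficiency gap between common policies is easy to tabulate. A quick Python comparison (profile names here are just labels for illustration):

```python
# Storage efficiency per protection policy: the fraction of raw capacity
# that holds user data, before any operational reserve is applied.
profiles = {
    "replica size=2": 1 / 2,
    "replica size=3": 1 / 3,
    "EC 4+2": 4 / (4 + 2),
    "EC 8+3": 8 / (8 + 3),
}
for name, efficiency in profiles.items():
    print(f"{name}: {efficiency:.1%}")
```

This prints 50.0% for size=2, 33.3% for size=3, 66.7% for EC 4+2, and 72.7% for EC 8+3, which is why wider EC profiles are attractive for capacity-driven designs despite the added complexity.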
Example planning scenarios
Scenario A: 36 drives, 18 TB, replication size=3
- Raw: 648 TB
- Reserve: 10% → 583.2 TB effective raw
- Usable: 583.2 / 3 = 194.4 TB
Scenario B: same hardware, EC 4+2
- Raw: 648 TB
- Reserve: 10% → 583.2 TB effective raw
- Usable: 583.2 × (4/6) = 388.8 TB
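Both scenarios can be checked with a few lines of plain arithmetic, no Ceph-specific libraries needed:

```python
raw = 36 * 18                        # 648 TB raw (36 drives at 18 TB)
effective = raw * 0.90               # 583.2 TB after a 10% reserve
usable_replica = effective / 3       # Scenario A: replication size=3
usable_ec = effective * 4 / (4 + 2)  # Scenario B: EC 4+2
print(f"Scenario A: {usable_replica:.1f} TB")  # 194.4 TB
print(f"Scenario B: {usable_ec:.1f} TB")       # 388.8 TB
```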
Same hardware, same reserve, but twice the usable capacity. This is why policy choice has a massive budget impact.
Important assumptions and limits
- This is a capacity calculator, not a performance model.
- It assumes all OSDs are equal size and equally weighted.
- It does not model small-object amplification, compression, or dedup behavior.
- It does not account for class-based tiers (NVMe/HDD) with separate pool strategies.
- Real failure-domain constraints can reduce practical maximums in edge designs.
Ceph sizing tips before buying hardware
- Keep enough free capacity for rebuilds after expected failure events.
- Validate host count and failure domains for your chosen replication or EC profile.
- Match pool policy to workload: hot VM images may differ from cold object archives.
- Model growth for 12 to 24 months, not just day-one ingestion.
- Test recovery time objectives under realistic failure simulation.
Quick FAQ
Is 3x replication always best?
Not always. It is common and robust, but EC may dramatically improve efficiency for suitable workloads.
Should I set target utilization to 100%?
Usually no. Most production operators keep meaningful headroom to protect performance and recovery behavior.
Can this replace a full Ceph architecture review?
No. Use it for fast planning, then validate with detailed CRUSH, pool, and operational constraints.