What this Ceph storage calculator estimates
This calculator gives you a fast, practical estimate of usable Ceph capacity from raw disk space. It accounts for the two biggest capacity reducers in almost every cluster: your data protection policy (replication or erasure coding) and your operational reserve.
It is designed for early planning, budget discussions, and architecture comparisons. For production rollouts, you should always validate with real CRUSH rules, failure domains, object sizes, and workload patterns.
How Ceph capacity works in plain language
1) Start with raw capacity
Raw capacity is simply: number of OSDs × size per OSD. If you have 24 OSDs at 12 TB each, that is 288 TB raw.
2) Subtract operational overhead
Most teams reserve space for healthy operations and recovery behavior. This includes temporary rebalance pressure, metadata overhead, and a "do not run hot" margin. A common planning reserve is 10% to 20%, depending on risk tolerance.
3) Apply protection efficiency
- Replication: Usable is roughly effective raw (after reserve) / replication size.
- Erasure coding: Usable is roughly effective raw (after reserve) × k/(k+m).
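The three steps above can be sketched as a small Python function. The function and parameter names are illustrative, not part of any Ceph tooling:

```python
def usable_capacity_tb(osd_count, osd_size_tb, reserve=0.10,
                       replication_size=None, ec_k=None, ec_m=None):
    """Estimate usable Ceph capacity in TB from raw disk space.

    Pass either replication_size (e.g. 3) or ec_k/ec_m (e.g. 4 and 2).
    """
    raw = osd_count * osd_size_tb        # step 1: raw capacity
    effective = raw * (1 - reserve)      # step 2: subtract operational reserve
    if replication_size is not None:     # step 3: apply protection efficiency
        return effective / replication_size
    return effective * ec_k / (ec_k + ec_m)
```

For example, `usable_capacity_tb(24, 12, replication_size=3)` takes the 288 TB raw cluster from above to roughly 86.4 TB usable with a 10% reserve and 3x replication.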
Replication vs erasure coding
Replication (size=3 is common)
Replication is operationally simple and often offers strong read behavior. With size=3, each object is stored three times, so efficiency is about 33.3%. You trade capacity efficiency for straightforward durability and recovery.
Erasure coding (example 4+2, 8+3, etc.)
Erasure coding improves capacity efficiency by splitting data into chunks plus parity. For 4+2, efficiency is 4/6 = 66.7%, often doubling usable capacity versus 3x replication. The tradeoff is additional CPU and network overhead, plus design complexity depending on workload and pool profile.
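The efficiency gap between common policies is easy to tabulate. A quick Python comparison (profile names here are just labels for illustration):

```python
# Storage efficiency per protection policy: the fraction of raw capacity
# that holds user data, before any operational reserve is applied.
profiles = {
    "replica size=2": 1 / 2,
    "replica size=3": 1 / 3,
    "EC 4+2": 4 / (4 + 2),
    "EC 8+3": 8 / (8 + 3),
}
for name, efficiency in profiles.items():
    print(f"{name}: {efficiency:.1%}")
```

This prints 50.0% for size=2, 33.3% for size=3, 66.7% for EC 4+2, and 72.7% for EC 8+3, which is why wider EC profiles are attractive for capacity-driven designs despite the added complexity.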
Example planning scenarios
Scenario A: 36 drives, 18 TB, replication size=3
- Raw: 648 TB
- Reserve: 10% → 583.2 TB effective raw
- Usable: 583.2 / 3 = 194.4 TB
Scenario B: same hardware, EC 4+2
- Raw: 648 TB
- Reserve: 10% → 583.2 TB effective raw
- Usable: 583.2 × (4/6) = 388.8 TB
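Both scenarios can be checked with a few lines of plain arithmetic, no Ceph-specific libraries needed:

```python
raw = 36 * 18                        # 648 TB raw (36 drives at 18 TB)
effective = raw * 0.90               # 583.2 TB after a 10% reserve
usable_replica = effective / 3       # Scenario A: replication size=3
usable_ec = effective * 4 / (4 + 2)  # Scenario B: EC 4+2
print(f"Scenario A: {usable_replica:.1f} TB")  # 194.4 TB
print(f"Scenario B: {usable_ec:.1f} TB")       # 388.8 TB
```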
Same hardware, same reserve, but twice the usable capacity. This is why policy choice has a massive budget impact.
Important assumptions and limits
- This is a capacity calculator, not a performance model.
- It assumes all OSDs are equal size and equally weighted.
- It does not model small-object amplification, compression, or dedup behavior.
- It does not account for class-based tiers (NVMe/HDD) with separate pool strategies.
- Real failure-domain constraints can reduce practical maximums in edge designs.
Ceph sizing tips before buying hardware
- Keep enough free capacity for rebuilds after expected failure events.
- Validate host count and failure domains for your chosen replication or EC profile.
- Match pool policy to workload: hot VM images may differ from cold object archives.
- Model growth for 12 to 24 months, not just day-one ingestion.
- Test recovery time objectives under realistic failure simulation.
Quick FAQ
Is 3x replication always best?
Not always. It is common and robust, but EC may dramatically improve efficiency for suitable workloads.
Should I set target utilization to 100%?
Usually no. Most production operators keep meaningful headroom to protect performance and recovery behavior.
Can this replace a full Ceph architecture review?
No. Use it for fast planning, then validate with detailed CRUSH, pool, and operational constraints.