Ceph Placement Group (PG) Calculator
Estimate a reasonable starting pg_num and pgp_num for a pool. This is great for planning and sanity checks, especially before creating new pools in production.
What this ceph pg calculator helps you do
In Ceph, placement groups (PGs) are how objects are mapped to OSDs. If your PG count is too low, data placement can be uneven and recovery performance can suffer. If PG count is too high, cluster memory and peering overhead increase. This calculator gives you a practical planning estimate for a pool so you can start with sensible numbers.
Formula used by this calculator
This page uses a common sizing heuristic:

Total PGs = (OSD count × target PGs per OSD × pool data %) / pool size

where pool size is the replica count for a replicated pool, or k + m for an erasure-coded pool.
After calculating the raw value, we present power-of-two recommendations (nearest and round-up). Historically, Ceph guidance has favored power-of-two PG counts for simpler scaling and better distribution behavior.
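The heuristic and the power-of-two rounding can be sketched in a few lines of Python. This is a minimal illustration of the calculator's logic, not Ceph code; the function name `pg_estimate` is ours, and it assumes the raw value is greater than zero:

```python
import math

def pg_estimate(osds, target_pgs_per_osd, data_fraction, pool_size):
    """Return (raw value, nearest power of two, round-up power of two).

    pool_size is the replica count for a replicated pool, or k + m for an
    erasure-coded pool; data_fraction is the pool's share of data (0.0-1.0).
    """
    raw = (osds * target_pgs_per_osd * data_fraction) / pool_size
    round_up = 1 << max(0, math.ceil(math.log2(raw)))  # next power of two >= raw
    lower = round_up // 2
    nearest = lower if (raw - lower) < (round_up - raw) else round_up
    return raw, nearest, round_up

# 36 OSDs, size-3 replicated pool, target 100 PG/OSD, 40% of the data
raw, nearest, up = pg_estimate(36, 100, 0.40, 3)  # 480.0 -> nearest 512, round-up 512
```

Note that "nearest" and "round-up" can differ: a raw value of 333 rounds up to 512 but is nearest to 256, which is why the calculator shows both.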
How to use the calculator correctly
1) Enter realistic OSD count
Use the number of in-cluster OSDs that will actively serve the pool. If you are about to expand the cluster, you may want to calculate both current and post-expansion values.
2) Choose an appropriate target PGs per OSD
- Lower values reduce overhead.
- Higher values improve granularity and balancing.
- A starting range of 50–150 PGs per OSD is common for many environments.
3) Adjust pool data percentage
If the pool will hold only part of your data (for example 30%), set this value accordingly. This keeps PG distribution in proportion when multiple pools share the same OSDs.
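When several pools share the same OSDs, the same heuristic can be applied per pool with each pool's data share. A short sketch, with illustrative pool names and a hypothetical helper `split_pg_budget`:

```python
import math

def split_pg_budget(osds, target_pgs_per_osd, pools):
    """Divide the cluster-wide PG budget across pools by data share.

    pools maps pool name -> (data_fraction, pool_size); the fractions should
    sum to roughly 1.0 so the per-OSD target is respected.
    """
    plan = {}
    for name, (fraction, pool_size) in pools.items():
        raw = (osds * target_pgs_per_osd * fraction) / pool_size
        plan[name] = 1 << max(0, math.ceil(math.log2(raw)))  # round up to power of two
    return plan

# Two size-3 pools sharing 36 OSDs at a target of 100 PG/OSD
plan = split_pg_budget(36, 100, {"rbd": (0.60, 3), "cephfs_data": (0.40, 3)})
```

Because every pool is rounded up independently, the sum can overshoot the per-OSD target (here 1024 + 512 PGs at size 3 over 36 OSDs works out to 128 PG/OSD against a target of 100), so re-check the total after rounding.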
4) Pick replicated vs erasure-coded
Replicated pools use size as the acting set width. Erasure-coded pools use k + m. The calculator switches logic automatically.
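The switch between the two pool types amounts to choosing the divisor in the formula. A minimal sketch (the function name is illustrative):

```python
def acting_set_width(pool_type, size=None, k=None, m=None):
    """Divisor for the PG formula: replica count, or k + m for EC pools."""
    if pool_type == "replicated":
        return size
    if pool_type == "erasure":
        return k + m
    raise ValueError(f"unknown pool type: {pool_type!r}")

acting_set_width("replicated", size=3)  # size-3 replicated pool -> 3
acting_set_width("erasure", k=4, m=2)   # EC 4+2 profile -> 6
```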
Example scenario
Suppose you have 36 OSDs, a replicated pool with size 3, a target of 100 PGs per OSD, and the pool is expected to hold 40% of total data:

(36 × 100 × 0.40) / 3 = 480, which rounds up to the next power of two, 512.

That gives a strong starting point: pg_num=512 and usually pgp_num=512 initially.
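Once you settle on a value, it can be applied with the standard Ceph CLI; mypool below is a placeholder pool name:

```shell
# Create a new replicated pool with the computed pg_num and pgp_num
ceph osd pool create mypool 512 512 replicated

# Or adjust an existing pool; pgp_num should follow pg_num
ceph osd pool set mypool pg_num 512
ceph osd pool set mypool pgp_num 512
```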
Best practices and operational notes
- Use this as a planning estimate, not an absolute rule.
- Monitor cluster health after changes: peering behavior, recovery times, OSD memory, and client latency.
- Avoid frequent manual churn in PG counts during peak load windows.
- Remember that modern Ceph releases ship with a PG autoscaler (pg_autoscale_mode), which may adjust pg_num on its own. Manual calculations are still useful for validation and troubleshooting.
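Where the autoscaler is active, you can compare your manual estimate against its view before overriding anything; mypool is again a placeholder:

```shell
# Show the autoscaler's per-pool view, including its suggested pg_num
ceph osd pool autoscale-status

# If you intend to manage pg_num by hand, disable autoscaling for that pool
ceph osd pool set mypool pg_autoscale_mode off
```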
Replicated vs erasure-coded quick comparison
Replicated pools
Easier operationally, often simpler for small clusters and metadata-heavy workloads. Overhead grows with replica count.
Erasure-coded pools
Better usable capacity efficiency, but tuning and recovery behavior can be more complex. For PG math, acting set is based on k + m.
Common mistakes to avoid
- Using 100% data share for every pool in a multi-pool environment.
- Ignoring autoscaler recommendations without reason.
- Setting very high PG counts on small clusters.
- Changing too many pool parameters at once.
Final takeaway
This ceph pg calculator gives you a fast, transparent way to estimate pg_num and pgp_num. Use it to establish a strong starting configuration, then validate with real cluster telemetry and Ceph health signals.