Ceph Placement Group (PG) Calculator
Estimate a reasonable starting pg_num and pgp_num for a pool. This is great for planning and sanity checks, especially before creating new pools in production.
What this ceph pg calculator helps you do
In Ceph, placement groups (PGs) are how objects are mapped to OSDs. If your PG count is too low, data placement can be uneven and recovery performance can suffer. If PG count is too high, cluster memory and peering overhead increase. This calculator gives you a practical planning estimate for a pool so you can start with sensible numbers.
Formula used by this calculator
This page uses a common sizing heuristic:

Total PGs = (OSD count × target PGs per OSD × pool data %) / pool size

where pool size is the replica count for a replicated pool, or k + m for an erasure-coded pool.
After calculating the raw value, we present power-of-two recommendations (nearest and round-up). Historically, Ceph guidance has favored power-of-two PG counts for simpler scaling and better distribution behavior.
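The heuristic and the power-of-two rounding can be sketched in a few lines of Python. This is a minimal illustration of the calculator's logic, not Ceph code; the function name `pg_estimate` is ours, and it assumes the raw value is greater than zero:

```python
import math

def pg_estimate(osds, target_pgs_per_osd, data_fraction, pool_size):
    """Return (raw value, nearest power of two, round-up power of two).

    pool_size is the replica count for a replicated pool, or k + m for an
    erasure-coded pool; data_fraction is the pool's share of data (0.0-1.0).
    """
    raw = (osds * target_pgs_per_osd * data_fraction) / pool_size
    round_up = 1 << max(0, math.ceil(math.log2(raw)))  # next power of two >= raw
    lower = round_up // 2
    nearest = lower if (raw - lower) < (round_up - raw) else round_up
    return raw, nearest, round_up

# 36 OSDs, size-3 replicated pool, target 100 PG/OSD, 40% of the data
raw, nearest, up = pg_estimate(36, 100, 0.40, 3)  # 480.0 -> nearest 512, round-up 512
```

Note that "nearest" and "round-up" can differ: a raw value of 333 rounds up to 512 but is nearest to 256, which is why the calculator shows both.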
How to use the calculator correctly
1) Enter realistic OSD count
Use the number of in-cluster OSDs that will actively serve the pool. If you are about to expand the cluster, you may want to calculate both current and post-expansion values.
2) Choose an appropriate target PGs per OSD
- Lower values reduce overhead.
- Higher values improve granularity and balancing.
- A starting range of 50–150 PGs per OSD is common for many environments.
3) Adjust pool data percentage
If the pool will hold only part of your data (for example 30%), set this value accordingly. This keeps PG distribution in proportion when multiple pools share the same OSDs.
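When several pools share the same OSDs, the same heuristic can be applied per pool with each pool's data share. A short sketch, with illustrative pool names and a hypothetical helper `split_pg_budget`:

```python
import math

def split_pg_budget(osds, target_pgs_per_osd, pools):
    """Divide the cluster-wide PG budget across pools by data share.

    pools maps pool name -> (data_fraction, pool_size); the fractions should
    sum to roughly 1.0 so the per-OSD target is respected.
    """
    plan = {}
    for name, (fraction, pool_size) in pools.items():
        raw = (osds * target_pgs_per_osd * fraction) / pool_size
        plan[name] = 1 << max(0, math.ceil(math.log2(raw)))  # round up to power of two
    return plan

# Two size-3 pools sharing 36 OSDs at a target of 100 PG/OSD
plan = split_pg_budget(36, 100, {"rbd": (0.60, 3), "cephfs_data": (0.40, 3)})
```

Because every pool is rounded up independently, the sum can overshoot the per-OSD target (here 1024 + 512 PGs at size 3 over 36 OSDs works out to 128 PG/OSD against a target of 100), so re-check the total after rounding.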
4) Pick replicated vs erasure-coded
Replicated pools use size as the acting set width. Erasure-coded pools use k + m. The calculator switches logic automatically.
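The switch between the two pool types amounts to choosing the divisor in the formula. A minimal sketch (the function name is illustrative):

```python
def acting_set_width(pool_type, size=None, k=None, m=None):
    """Divisor for the PG formula: replica count, or k + m for EC pools."""
    if pool_type == "replicated":
        return size
    if pool_type == "erasure":
        return k + m
    raise ValueError(f"unknown pool type: {pool_type!r}")

acting_set_width("replicated", size=3)  # size-3 replicated pool -> 3
acting_set_width("erasure", k=4, m=2)   # EC 4+2 profile -> 6
```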
Example scenario
Suppose you have 36 OSDs, a replicated pool with size 3, a target of 100 PGs per OSD, and the pool is expected to hold 40% of total data:

(36 × 100 × 0.40) / 3 = 480, which rounds up to the next power of two, 512.

That gives a strong starting point: pg_num=512 and usually pgp_num=512 initially.
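Once you settle on a value, it can be applied with the standard Ceph CLI; mypool below is a placeholder pool name:

```shell
# Create a new replicated pool with the computed pg_num and pgp_num
ceph osd pool create mypool 512 512 replicated

# Or adjust an existing pool; pgp_num should follow pg_num
ceph osd pool set mypool pg_num 512
ceph osd pool set mypool pgp_num 512
```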
Best practices and operational notes
- Use this as a planning estimate, not an absolute rule.
- Monitor cluster health after changes: peering behavior, recovery times, OSD memory, and client latency.
- Avoid frequent manual churn in PG counts during peak load windows.
- Remember that modern Ceph releases ship with a PG autoscaler (pg_autoscale_mode), which may adjust pg_num on its own. Manual calculations are still useful for validation and troubleshooting.
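Where the autoscaler is active, you can compare your manual estimate against its view before overriding anything; mypool is again a placeholder:

```shell
# Show the autoscaler's per-pool view, including its suggested pg_num
ceph osd pool autoscale-status

# If you intend to manage pg_num by hand, disable autoscaling for that pool
ceph osd pool set mypool pg_autoscale_mode off
```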
Replicated vs erasure-coded quick comparison
Replicated pools
Easier operationally, often simpler for small clusters and metadata-heavy workloads. Overhead grows with replica count.
Erasure-coded pools
Better usable capacity efficiency, but tuning and recovery behavior can be more complex. For PG math, acting set is based on k + m.
Common mistakes to avoid
- Using 100% data share for every pool in a multi-pool environment.
- Ignoring autoscaler recommendations without reason.
- Setting very high PG counts on small clusters.
- Changing too many pool parameters at once.
Final takeaway
This ceph pg calculator gives you a fast, transparent way to estimate pg_num and pgp_num. Use it to establish a strong starting configuration, then validate with real cluster telemetry and Ceph health signals.