Minimum Detectable Effect (MDE) Calculator

Estimate the smallest lift your A/B test can reliably detect, based on your baseline conversion rate, sample size, confidence level, and statistical power.

What is MDE?

MDE stands for Minimum Detectable Effect. In experimentation, it is the smallest true change between control and variant that your test is likely to detect as statistically significant. If your test has an MDE of 1.2 percentage points, then effects smaller than that are unlikely to be detected with high confidence in the current setup.

Put simply, MDE answers a practical question before you launch a test: “How big does the improvement need to be for this test to catch it?”

Why an MDE calculator matters

Many A/B tests fail not because the idea is bad, but because the test is underpowered. Teams run experiments for a week, see “no significance,” and conclude there is no effect. Often the real issue is that the sample size was too small to detect the improvement they cared about.

  • Planning: decide if your traffic can support the sensitivity you need.
  • Expectation setting: communicate realistic outcomes to stakeholders.
  • Resource allocation: prioritize tests where meaningful effects are detectable.
  • Decision quality: reduce false “no impact” conclusions caused by tiny sample sizes.

How this calculator works

Inputs

  • Baseline conversion rate: your current conversion rate in percent.
  • Sample size per variant: the expected number of users in each of control and treatment (assumes an equal split).
  • Confidence level: usually 90%, 95%, or 99% (two-sided test).
  • Power: usually 80% or 90%, representing your chance to detect a true effect.

Core idea

For two conversion rates, the required sample size depends on the variance of the underlying Bernoulli outcomes and on the size of the effect. This tool holds your sample size fixed and numerically solves for the effect size that sample size can just detect at your chosen confidence and power. That solved value is your estimated MDE.

The result is shown as:

  • Absolute lift: increase in percentage points (e.g., +0.90 pp).
  • Relative lift: increase relative to baseline (e.g., +11.25%).
  • Target variant rate: baseline + absolute lift.
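The solve described above can be sketched in a few lines of Python. This is a minimal illustration using the standard closed-form two-proportion sample-size approximation and bisection; the calculator's internals may differ:

```python
from statistics import NormalDist

def required_n(p1, p2, alpha=0.05, power=0.80):
    """Per-variant sample size for a two-sided two-proportion z-test
    (standard closed-form approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return num / (p2 - p1) ** 2

def mde(p1, n, alpha=0.05, power=0.80):
    """Bisect on the variant rate p2 until required_n(p1, p2) matches n."""
    lo, hi = p1 + 1e-9, min(1.0 - 1e-9, p1 + 0.5)
    for _ in range(100):
        mid = (lo + hi) / 2
        if required_n(p1, mid, alpha, power) > n:
            lo = mid   # effect too small to detect with n users; grow it
        else:
            hi = mid
    return hi - p1     # absolute lift, in proportion units

lift = mde(0.08, 20_000)   # 8% baseline, 20k users per variant, 95%/80%
print(f"absolute: +{lift * 100:.2f} pp, relative: +{lift / 0.08 * 100:.1f}%")
```

With these inputs the solved MDE comes out to roughly +0.8 percentage points (about +10% relative), and it shrinks as the sample size grows.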

How to interpret the output

Suppose your baseline is 8% and the calculator returns an MDE of 1.0 percentage point. That means the test can reliably detect a lift from 8.0% to 9.0% or larger. If the true lift is only 0.2 percentage points, this test will most likely miss it.

A large MDE usually means one of the following:

  • Not enough sample size
  • Confidence or power set very high
  • A baseline rate near 50%, where the Bernoulli variance p(1 − p) is largest

Practical playbook for teams

1) Start from business relevance

Decide what minimum lift is worth shipping. If 0.5 percentage points is meaningful to your business, plan your experiment so MDE is around that number or lower.
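A quick back-of-envelope check for this planning step is the well-known “rule of 16”: at 95% confidence and 80% power, the per-variant sample size is approximately 16 times the variance divided by the squared absolute lift. The numbers below (8% baseline, 0.5 pp target) are illustrative, not from the calculator:

```python
# Rule of thumb: n per variant ≈ 16 * p * (1 - p) / delta^2
# (two-sided test at 95% confidence, 80% power; p = baseline rate,
#  delta = minimum worthwhile absolute lift, both as proportions)
p, delta = 0.08, 0.005          # 8% baseline, 0.5 pp minimum worthwhile lift
n = 16 * p * (1 - p) / delta ** 2
print(round(n))                 # ≈ 47,104 users per variant
```

The exact two-proportion formula gives a figure within about 1% of this estimate, so the rule of thumb is usually good enough to decide whether a test is feasible at all.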

2) Use realistic traffic assumptions

Overestimating traffic leads to optimistic plans and underpowered tests. Use recent, stable traffic by segment (desktop/mobile, country, paid/organic) rather than blended totals if your experiment is segmented.

3) Keep test design simple

Uneven splits, multiple variants, and many simultaneous metric cuts increase complexity and often require more samples. For early-stage validation, a clean two-variant setup is usually best.

4) Avoid peeking and stopping early

Repeatedly checking significance and stopping when p < 0.05 inflates false positives. Define a minimum runtime or sample requirement before launching.
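The inflation from peeking is easy to demonstrate with a small Monte Carlo sketch (not part of the calculator): simulate A/A tests, where there is no true effect, and stop the first time an interim z-test at α = 0.05 looks “significant”. All parameters below are illustrative:

```python
import math
import random

def peeking_false_positive_rate(users_per_check=200, checks=10,
                                p=0.10, sims=1000, seed=7):
    """Simulate A/A tests (no true effect) and report how often a
    two-sided z-test at alpha = 0.05 fires at any interim check."""
    rng = random.Random(seed)
    z_crit = 1.96
    hits = 0
    for _ in range(sims):
        conv_a = conv_b = n = 0
        for _ in range(checks):
            for _ in range(users_per_check):
                conv_a += rng.random() < p
                conv_b += rng.random() < p
            n += users_per_check
            pooled = (conv_a + conv_b) / (2 * n)
            var = pooled * (1 - pooled) * (2 / n)
            if var > 0 and abs(conv_a - conv_b) / n / math.sqrt(var) > z_crit:
                hits += 1
                break          # stop at the first "significant" peek
    return hits / sims

print(peeking_false_positive_rate(checks=10))  # well above the nominal 0.05
print(peeking_false_positive_rate(checks=1))   # close to 0.05
```

With ten interim looks, the realized false-positive rate lands several times higher than the nominal 5%, which is exactly why a pre-committed runtime or sample size matters.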

Common mistakes when using MDE

  • Confusing MDE with expected lift: MDE is detectability, not forecasted performance.
  • Ignoring seasonality: weekday/weekend behavior can bias short tests.
  • Overfitting significance: a tiny significant lift may still be operationally irrelevant.
  • No guardrails: always monitor secondary metrics like bounce, refunds, or latency.

FAQ

What confidence and power should I choose?

A common default is 95% confidence and 80% power. Regulated or high-risk decisions may require stricter settings.

Why does MDE shrink with more sample size?

More data reduces variance in estimated conversion rates, so smaller true differences become detectable.
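Concretely, the standard error of an estimated conversion rate is √(p(1 − p)/n), so quadrupling the sample size halves the noise, and the MDE shrinks roughly in proportion to 1/√n. A quick illustration (the 8% baseline is an example value):

```python
import math

def se(p, n):
    """Standard error of a conversion-rate estimate: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

for n in (5_000, 20_000, 80_000):
    print(n, round(se(0.08, n), 5))   # each 4x in n halves the standard error
```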

Can this be used for revenue or continuous metrics?

This specific calculator is for conversion-rate style proportion tests. Continuous outcomes need a different variance model.

Bottom line

Great experimentation starts with planning, not post-hoc interpretation. Use MDE to design tests that can answer the question you care about. If your current traffic cannot detect a meaningful effect, adjust timeline, scope, or metric before launch rather than gambling on noisy results.