A/B Test Statistical Significance Calculator
Compare two conversion rates using a standard two-tailed z-test for proportions.
Assumes independent samples and a fixed-horizon test (not continuous peeking).
What this calculator tells you
An A/B test can show a difference in raw conversion rates, but that difference might be random noise. This calculator helps answer the practical question: is Variant B truly different from Variant A, or could this happen by chance?
- Conversion rate for each variant
- Absolute uplift (percentage-point difference)
- Relative lift (%)
- Z-score and p-value
- Whether the result is statistically significant at your chosen confidence level
- Confidence interval for the conversion-rate difference
How the significance test works
1) Estimate conversion rates
For each group, the conversion rate is the number of conversions divided by the number of visitors: p = conversions / visitors.
2) Build the null hypothesis
The null hypothesis says both variants convert at the same true underlying rate. Under that assumption, we pool the data to estimate a common conversion rate: p_pool = (conversions_A + conversions_B) / (visitors_A + visitors_B).
3) Compute z-score
The z-score measures how far apart the observed rates are, relative to the random variation expected under the null hypothesis: z = (p_B − p_A) / sqrt(p_pool × (1 − p_pool) × (1/n_A + 1/n_B)).
4) Convert z-score to p-value
The p-value is the probability of seeing a difference this extreme (or more extreme) if the null hypothesis were true. A low p-value means the observed gap is unlikely to be random.
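The four steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library (the normal CDF is built from `math.erf`); the function name and argument order are my own, not part of the calculator.

```python
from math import sqrt, erf

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-tailed z-test for two proportions, using the pooled standard error."""
    p_a = conv_a / n_a                        # step 1: per-variant rates
    p_b = conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # step 2: pooled rate under the null
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se                      # step 3: z-score
    # step 4: two-tailed p-value from the standard normal CDF
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))   # P(Z <= |z|)
    p_value = 2 * (1 - phi)
    return z, p_value
```

For example, 100 conversions out of 1,000 visitors versus 130 out of 1,000 gives a z-score of about 2.1 and a p-value of about 0.036 — significant at the 95% level.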
How to interpret the output correctly
- Significant + positive uplift: B is likely better than A.
- Significant + negative uplift: B is likely worse than A.
- Not significant: you do not have strong enough evidence yet; this does not prove A and B are equal.
Also check the confidence interval. If it includes zero, the real difference could still be zero (or opposite direction), even if point estimates look promising.
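One common way to compute that interval is a Wald confidence interval for the difference in rates, which uses the unpooled standard error (unlike the test itself). A sketch, with hypothetical function and argument names:

```python
from math import sqrt

def diff_confidence_interval(conv_a, n_a, conv_b, n_b, z_crit=1.96):
    """Wald confidence interval for (p_b - p_a); z_crit=1.96 gives ~95% confidence."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # The interval uses each group's own variance, not the pooled estimate.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se
```

If the returned interval straddles zero, treat the result as inconclusive regardless of how the point estimates look.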
Common mistakes this page helps you avoid
Stopping too early
Early results are noisy. If you stop when numbers “look good,” false positives increase dramatically.
Calling every uplift a win
Even a 10% relative lift can be non-significant with small sample sizes. Statistical significance and practical significance are different checks—you need both.
Ignoring data quality
Tracking bugs, bot traffic, mismatched audiences, and uneven assignment can invalidate test results before statistics even begin.
Practical A/B testing checklist
- Define one primary metric before launching.
- Estimate the required sample size before the test starts.
- Run long enough to include normal traffic cycles (weekday/weekend patterns).
- Keep assignment random and balanced.
- Avoid changing experiment rules mid-test.
- After significance, verify business impact (revenue, retention, not just clicks).
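For the sample-size step in the checklist, a common approximation for comparing two proportions uses the normal quantiles for the desired confidence and power. The defaults below (1.96 and 0.84) correspond to 95% confidence and 80% power; the function name is illustrative, not from the calculator:

```python
from math import ceil

def sample_size_per_group(p_base, mde_abs, z_alpha=1.96, z_beta=0.84):
    """Approximate visitors needed per variant.

    p_base: baseline conversion rate (e.g. 0.10)
    mde_abs: minimum detectable absolute lift (e.g. 0.02 for +2 points)
    """
    p_alt = p_base + mde_abs
    variance = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)
```

Detecting a lift from 10% to 12% at these settings requires roughly 3,800 visitors per variant — a useful sanity check before launching a test on low-traffic pages.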
FAQ
Is 95% confidence always required?
95% is a common default, but not universal. In low-risk UI tests you may use 90%; for high-stakes decisions you might use 99%.
Can I use this for click-through rate, signup rate, and purchase rate?
Yes—any binary conversion metric (converted / not converted) is a good fit.
Does significance mean high business value?
No. A tiny improvement can be statistically significant with large samples but still not worth implementation cost. Always evaluate effect size and expected ROI.
Bottom line
This A/B test statistical significance calculator gives you a fast, transparent way to evaluate experiment outcomes. Use it to support stronger product decisions, but pair it with good experimental design, clean data, and a realistic view of business impact.