A/B Test Significance Calculator

Enter visitors and conversions for each variant. This calculator runs a two-tailed two-proportion z-test and reports statistical significance, p-value, and confidence interval for the conversion rate difference.

Variant A (Control)

Variant B (Treatment)

Tip: Use complete test data only. Peeking early can inflate false positives.

What this A/B test significance calculator tells you

When you run an A/B test, it is not enough to compare raw conversion rates and pick the bigger number. Random variation can make one variant look better even when there is no real effect. This calculator helps you answer a better question: is the observed lift likely real, or just noise?

By entering your sample sizes and conversions, you get:

  • Conversion rate for each variant
  • Absolute and relative lift
  • Z-score and p-value from a two-proportion z-test
  • A confidence interval for the conversion rate difference
  • A clear “significant / not significant” interpretation at your chosen confidence level

How the significance test works

1) Conversion rates

The calculator computes each variant’s conversion rate:

CRA = conversionsA / visitorsA
CRB = conversionsB / visitorsB
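As a quick sanity check, here are the two formulas with made-up numbers (10,000 visitors per variant and 200 vs. 240 conversions are illustrative only):

```python
# Illustrative numbers only, not real experiment data
visitors_a, conversions_a = 10_000, 200
visitors_b, conversions_b = 10_000, 240

cr_a = conversions_a / visitors_a  # 0.020, i.e. 2.0%
cr_b = conversions_b / visitors_b  # 0.024, i.e. 2.4%
```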

2) Test statistic (z-score)

For hypothesis testing, we use a pooled estimate of the conversion probability under the null hypothesis that both variants convert equally, then compute the standard error and z-score for the observed difference:

p_pool = (conversionsA + conversionsB) / (visitorsA + visitorsB)
SE = sqrt(p_pool × (1 − p_pool) × (1/visitorsA + 1/visitorsB))
z = (CRB − CRA) / SE

A larger absolute z-score means stronger evidence against the null hypothesis.
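The pooled z-statistic described above can be sketched in Python. The function name and the example counts are illustrative, not this calculator's actual implementation:

```python
import math

def z_score(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-statistic using a pooled rate under the null."""
    cr_a = conversions_a / visitors_a
    cr_b = conversions_b / visitors_b
    # Pooled conversion probability, assuming both variants convert equally
    p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    # Standard error of the rate difference under the null hypothesis
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    return (cr_b - cr_a) / se
```

With 200/10,000 vs. 240/10,000 conversions, this yields a z-score of about 1.93, just below the conventional 1.96 cutoff for 95% confidence.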

3) P-value

The p-value measures how surprising your observed difference would be if there were truly no difference between variants. A small p-value means your result is unlikely under the null, which supports a real effect.

At the 95% confidence level, the conventional significance threshold is α = 0.05: if p < 0.05, we call the result statistically significant.
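With the z-score in hand, the two-tailed p-value comes from the standard normal distribution. In Python, `math.erfc` gives this without any third-party dependency (a sketch, not necessarily how the calculator computes it internally):

```python
import math

def two_tailed_p(z):
    # P(|Z| >= |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))
```

For example, two_tailed_p(1.96) returns roughly 0.05, which is exactly why 1.96 is the familiar cutoff at 95% confidence.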

4) Confidence interval

The confidence interval gives a plausible range for the true lift in conversion rate. This helps you assess not only whether an effect exists, but also whether the effect is large enough to matter for business outcomes.
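A common construction (the Wald interval) uses the unpooled standard error for the interval, unlike the pooled version used for the test statistic. This sketch assumes a 95% interval with a critical value of about 1.96:

```python
import math

def diff_ci(conversions_a, visitors_a, conversions_b, visitors_b, z_crit=1.96):
    """Approximate Wald confidence interval for the difference CRB - CRA."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Unpooled standard error: each variant contributes its own variance
    se = math.sqrt(p_a * (1 - p_a) / visitors_a + p_b * (1 - p_b) / visitors_b)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se
```

If the interval contains zero, the difference is not significant at that confidence level; for the 200 vs. 240 example used earlier, the interval just barely straddles zero, matching the z-score of about 1.93.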

How to interpret your output

  • Significant + positive lift: Variant B is likely better than A.
  • Significant + negative lift: Variant B is likely worse than A.
  • Not significant: You do not have enough evidence yet. Keep running the test or reevaluate sample size assumptions.

Always pair significance with effect size. A tiny lift can be statistically significant at high traffic but still not worth implementing.

Common mistakes in A/B test analysis

  • Stopping too early: Early fluctuations are volatile and can mislead decisions.
  • Multiple comparisons without correction: Testing many variants raises false-positive risk.
  • Ignoring sample ratio mismatch: Uneven traffic splits can indicate instrumentation issues.
  • Looking only at conversion rate: Consider downstream metrics like revenue, retention, and refund rate.
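For the sample ratio mismatch point above, a quick chi-square check of observed traffic against the intended split catches many instrumentation bugs. This is a sketch assuming a 50/50 allocation; adjust the expected counts for other splits:

```python
import math

def srm_p_value(visitors_a, visitors_b):
    """Chi-square test (1 degree of freedom) against an intended 50/50 split."""
    expected = (visitors_a + visitors_b) / 2
    chi2 = ((visitors_a - expected) ** 2 / expected
            + (visitors_b - expected) ** 2 / expected)
    # Survival function of a chi-square with 1 df, via the normal relationship
    return math.erfc(math.sqrt(chi2 / 2))
```

A very small p-value here (for example, 10,000 vs. 10,500 visitors under a 50/50 split) suggests the assignment or logging is broken, and the conversion results should not be trusted until that is resolved.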

Practical guidance for better experiments

Estimate sample size before launch

Define baseline conversion rate, minimum detectable effect, desired power, and significance level. This prevents underpowered tests that end in ambiguity.
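The inputs listed above plug into a standard approximation for per-variant sample size. The defaults below assume a two-sided α of 0.05 (z ≈ 1.96) and 80% power (z ≈ 0.84); the function name and example values are illustrative:

```python
import math

def sample_size_per_variant(p_base, mde_abs, z_alpha=1.96, z_beta=0.84):
    """Approximate visitors needed per variant for a two-proportion test.

    p_base: baseline conversion rate; mde_abs: minimum detectable
    effect as an absolute difference in conversion rate.
    """
    p_alt = p_base + mde_abs
    # Sum of the two binomial variances at baseline and alternative rates
    variance = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)
```

For a 2% baseline and a 0.4-percentage-point minimum detectable effect, this comes out to roughly 21,000 visitors per variant, which is why small expected lifts demand large samples.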

Use a clean success metric

Pick one primary KPI for decision-making. Secondary metrics are useful but should not override a failed primary metric without a clear rationale.

Validate tracking and assignment

Before trusting any significance output, verify event definitions, experiment bucketing, and data completeness.

FAQ

Does this calculator use a one-tailed or two-tailed test?

It uses a two-tailed test. That is the safer default unless you had a pre-registered directional hypothesis.

Can I use this for click-through rate or signup rate?

Yes. Any binary conversion event (clicked/not clicked, purchased/not purchased, subscribed/not subscribed) works with this method.

Is statistical significance enough to ship?

No. Combine statistical evidence with business impact, implementation cost, risk, and long-term user experience.

Final note

This A/B test significance calculator is ideal for quick directional analysis. For high-stakes decisions, pair it with deeper experimentation practices: power analysis, sequential testing safeguards, segmentation checks, and guardrail metrics.
