Bayesian A/B Test Calculator - Aaron Graves, PhDude Replica

Variant A (Control)

Visitors / Exposures

Conversions

Variant B (Challenger)

Visitors / Exposures

Conversions

Prior Settings

Beta Prior α (alpha)

Beta Prior β (beta)

Simulation Settings

Posterior Samples (Monte Carlo)

Practical Lift Threshold (%)

Use 1 for “B must beat A by at least 1% relative lift”.

Enter your data and click Calculate Bayesian Result to see posterior probabilities, credible intervals, and risk metrics.

What this Bayesian A/B test calculator does

This calculator compares two conversion rates using a Bayesian model. Instead of relying on a single p-value, it estimates full posterior distributions for each variant and answers practical questions such as: “What is the probability that B is better than A?” and “How much downside risk do I take if I ship B?”

Bayesian testing is decision-friendly. You get probabilities and uncertainty ranges that are directly useful for product, growth, and marketing decisions.

Inputs explained

Visitors and conversions

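For each variant, enter total exposures (visitors, sessions, emails delivered, etc.) and total conversions (signups, purchases, clicks, renewals, etc.). Conversions cannot exceed exposures. Each variant's conversion count is modeled as a binomial outcome: every exposure converts independently with the same unknown rate.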

Prior alpha and beta

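The calculator uses a Beta prior for each conversion rate: rate ~ Beta(alpha, beta). With alpha = 1 and beta = 1, you get a uniform prior (a neutral default when you have little prior information). If historical data exists, you can encode it through a stronger prior: larger alpha and beta values whose ratio alpha / (alpha + beta) matches the historical rate. The sum alpha + beta acts like a count of prior observations, so larger sums pull the result harder toward the historical rate.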
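One possible way to turn a historical rate into prior parameters is sketched below. The helper name beta_prior_from_history and the numbers (a 5% historical rate weighted like 200 visitors) are assumptions chosen for illustration; the calculator itself only asks for alpha and beta.

```python
def beta_prior_from_history(historical_rate, pseudo_visitors):
    """Return (alpha, beta) for a Beta prior with the given mean and weight.

    historical_rate: assumed long-run conversion rate, strictly between 0 and 1.
    pseudo_visitors: how many imaginary prior visitors the history is worth.
    """
    if not 0.0 < historical_rate < 1.0:
        raise ValueError("historical_rate must be strictly between 0 and 1")
    if pseudo_visitors <= 0:
        raise ValueError("pseudo_visitors must be positive")
    alpha = historical_rate * pseudo_visitors
    beta = (1.0 - historical_rate) * pseudo_visitors
    return alpha, beta


# Uniform default: mean 0.5, worth only 2 pseudo-visitors.
uniform_prior = (1.0, 1.0)

# Hypothetical informed prior: 5% rate, worth about 200 visitors.
informed_prior = beta_prior_from_history(0.05, 200)
prior_mean = informed_prior[0] / (informed_prior[0] + informed_prior[1])
```

A small pseudo_visitors value keeps the prior weak, so real test data quickly dominates it.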

Monte Carlo samples

Posterior quantities are estimated by simulation. More samples reduce simulation noise (the error shrinks roughly with the square root of the sample count) but take slightly longer. For most cases, 20,000 to 60,000 samples is a good range.
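To get a feel for how sample count affects stability, the sketch below computes the standard error of a simulated probability, treating each draw as an independent yes/no outcome. The function name and the example probability of 0.95 are assumptions for illustration.

```python
import math


def monte_carlo_std_error(probability, n_samples):
    """Standard error of a probability estimated from n_samples independent draws."""
    return math.sqrt(probability * (1.0 - probability) / n_samples)


# Simulation noise around an estimated "probability B beats A" of 0.95.
for n_samples in (1_000, 20_000, 60_000):
    error = monte_carlo_std_error(0.95, n_samples)
    print(f"{n_samples:>6} samples: +/- {error:.4f}")
```

At 20,000 samples the noise is roughly 0.15 percentage points, which is small compared with the uncertainty coming from the test data itself.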

How the math works (briefly)

With a Beta prior and Binomial likelihood, the posterior remains Beta (conjugacy):

Posterior alpha = prior alpha + conversions
Posterior beta = prior beta + failures (visitors - conversions)
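As a rough illustration of the update above, the following Python sketch applies it to made-up numbers (1,000 visitors, 50 conversions, uniform prior). The function name beta_posterior is an assumption for this example, not part of the calculator.

```python
def beta_posterior(prior_alpha, prior_beta, visitors, conversions):
    """Apply the Beta-Binomial conjugate update and return (alpha, beta)."""
    if visitors < 0 or not 0 <= conversions <= visitors:
        raise ValueError("need 0 <= conversions <= visitors")
    failures = visitors - conversions
    return prior_alpha + conversions, prior_beta + failures


# Hypothetical data: uniform prior, 1,000 visitors, 50 conversions.
post_alpha, post_beta = beta_posterior(1, 1, visitors=1000, conversions=50)

# Posterior mean of the conversion rate: 51 / 1002.
posterior_mean = post_alpha / (post_alpha + post_beta)
```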

So each variant gets its own posterior distribution. The calculator then draws many random samples from both posteriors, computes pairwise differences, and summarizes:

Probability B beats A
95% credible intervals for both rates and uplift
Expected loss for choosing each variant
Probability B exceeds your practical lift threshold
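The sampling step can be sketched with Python's standard library. This is a minimal illustration, not the calculator's actual code: the function name, the seed, and the example posteriors (Beta(51, 951) for A and Beta(66, 936) for B) are assumptions, and the lift threshold is read as relative lift in percent, as in the input hint.

```python
import random


def win_probabilities(post_a, post_b, lift_threshold_pct=1.0,
                      n_samples=20_000, seed=0):
    """Estimate P(B > A) and P(B beats A by more than the relative lift threshold).

    post_a, post_b: (alpha, beta) tuples of the two Beta posteriors.
    """
    rng = random.Random(seed)
    required_ratio = 1.0 + lift_threshold_pct / 100.0
    wins = 0
    wins_over_threshold = 0
    for _ in range(n_samples):
        # One paired draw from each posterior.
        rate_a = rng.betavariate(*post_a)
        rate_b = rng.betavariate(*post_b)
        if rate_b > rate_a:
            wins += 1
        if rate_b > rate_a * required_ratio:
            wins_over_threshold += 1
    return wins / n_samples, wins_over_threshold / n_samples


# Hypothetical posteriors: 50/1000 conversions for A, 65/1000 for B, uniform prior.
prob_b_beats_a, prob_b_over_threshold = win_probabilities((51, 951), (66, 936))
```

Counting inside a single loop keeps the two probabilities based on the same paired draws, so the threshold probability can never exceed the plain win probability.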

How to interpret results

Probability B > A

If this is very high (for example, 95% or more), the evidence favors B. If it is very low (for example, 5% or less), the evidence favors A. If it is near 50%, the data are inconclusive.

Credible intervals

A 95% credible interval is a range that contains the true conversion rate with 95% posterior probability, given the model and the prior. Wide intervals signal that more traffic is needed; narrow intervals signal a precise estimate.
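One common way to get such an interval from posterior samples is to take percentiles. The sketch below assumes an equal-tailed interval (2.5% cut from each side); the calculator may use a different construction, and the helper name and example posteriors are hypothetical.

```python
import random


def credible_interval(samples, mass=0.95):
    """Equal-tailed interval: cut (1 - mass) / 2 of the draws from each tail."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper


rng = random.Random(0)
n_samples = 20_000

# Hypothetical posteriors: A = Beta(51, 951), B = Beta(66, 936).
rates_a = [rng.betavariate(51, 951) for _ in range(n_samples)]
rates_b = [rng.betavariate(66, 936) for _ in range(n_samples)]

# Relative uplift of B over A, in percent, for each paired draw.
uplift_pct = [(b - a) / a * 100.0 for a, b in zip(rates_a, rates_b)]

interval_a = credible_interval(rates_a)
interval_uplift = credible_interval(uplift_pct)
```

If the uplift interval still includes zero, the data have not ruled out that B is no better than A.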

Expected loss

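Expected loss quantifies regret: the conversion rate you can expect to give up, averaged over the posterior, if the variant you pick turns out to be the worse one. Unlike a win probability, it accounts for how much worse the choice could be, which makes it well suited to risk-aware decisions, especially when rollout costs are meaningful.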
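A widely used definition, assumed here because the page does not spell out the formula, is that the expected loss of choosing a variant is the posterior average of max(other rate - chosen rate, 0). The sketch below estimates both losses by simulation; the function name and example posteriors are hypothetical.

```python
import random


def expected_losses(post_a, post_b, n_samples=20_000, seed=0):
    """Return (loss if A is chosen, loss if B is chosen) in conversion-rate units."""
    rng = random.Random(seed)
    loss_if_choose_a = 0.0
    loss_if_choose_b = 0.0
    for _ in range(n_samples):
        rate_a = rng.betavariate(*post_a)
        rate_b = rng.betavariate(*post_b)
        # A loss only accrues in draws where the other variant is better.
        loss_if_choose_a += max(rate_b - rate_a, 0.0)
        loss_if_choose_b += max(rate_a - rate_b, 0.0)
    return loss_if_choose_a / n_samples, loss_if_choose_b / n_samples


# Hypothetical posteriors: A = Beta(51, 951), B = Beta(66, 936).
loss_a, loss_b = expected_losses((51, 951), (66, 936))
```

With these example numbers, choosing B carries a much smaller expected loss than choosing A, because B is ahead in most draws and trails by little in the rest.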

Recommended decision framework

Ship B when the probability that B beats A is high and the expected loss for choosing B is acceptably low.
Keep A when the evidence strongly favors control.
Continue the test when the probability is near 50% or uncertainty remains large.

Teams often combine probability thresholds with business constraints (revenue sensitivity, engineering effort, user risk, and timeline pressure).
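The framework above can be written down as a small rule. The thresholds in this sketch (95% probability, 0.001 maximum expected loss) and the function name recommend are placeholders; each team would substitute values that reflect its own business constraints.

```python
def recommend(prob_b_beats_a, loss_if_choose_a, loss_if_choose_b,
              prob_threshold=0.95, max_loss=0.001):
    """Map posterior summaries to one of three actions (illustrative thresholds)."""
    if prob_b_beats_a >= prob_threshold and loss_if_choose_b <= max_loss:
        return "ship B"
    if prob_b_beats_a <= 1.0 - prob_threshold and loss_if_choose_a <= max_loss:
        return "keep A"
    return "continue test"


# Hypothetical summaries from three different experiments.
decision_strong_b = recommend(0.97, loss_if_choose_a=0.0200, loss_if_choose_b=0.0002)
decision_strong_a = recommend(0.02, loss_if_choose_a=0.0001, loss_if_choose_b=0.0180)
decision_unclear = recommend(0.60, loss_if_choose_a=0.0040, loss_if_choose_b=0.0025)
```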

Practical tips for better experiments

Track one primary metric clearly tied to business value.
Ensure random assignment and avoid sample-ratio mismatch.
Run long enough to include normal weekday/weekend behavior.
Segment cautiously: segment analysis is useful, but each segment holds only a fraction of the traffic, so uncertainty grows fast.
Use historical priors only when quality and context are strong.
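For the sample-ratio mismatch tip, one simple check is a chi-square goodness-of-fit test on the visitor counts. The sketch below is an assumption about how such a check might look, not a feature of the calculator; it assumes two variants with an intended 50/50 split and uses the fact that a chi-square statistic with one degree of freedom has tail probability erfc(sqrt(x / 2)).

```python
import math


def srm_p_value(visitors_a, visitors_b, expected_share_a=0.5):
    """P-value of a chi-square test that traffic matches the intended split."""
    total = visitors_a + visitors_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi_square = ((visitors_a - expected_a) ** 2 / expected_a
                  + (visitors_b - expected_b) ** 2 / expected_b)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi_square / 2.0))


# Hypothetical traffic counts.
p_balanced = srm_p_value(5000, 5000)   # perfectly even split
p_skewed = srm_p_value(5000, 5300)     # B received noticeably more traffic
```

A very small p-value suggests the assignment mechanism is not delivering the intended split, in which case the conversion comparison should not be trusted until the cause is found.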

Bottom line

A Bayesian A/B test calculator gives a more intuitive view of uncertainty than classical significance testing alone. Use it to move from “Is this significant?” to “What is the probability this is better, and what is the risk if I act now?”