X² Test Calculator

Chi-Square (X²) Goodness-of-Fit Calculator

Enter observed and expected frequencies as comma-separated values. Example: 12, 18, 25, 15

Observed frequencies

Use non-negative numbers. At least two categories are required.

Expected frequencies

Expected values must be positive. If the expected total differs from the observed total, the expected values are scaled proportionally to match the observed total.

Significance level (α)

What is an X² (Chi-Square) test?

The X² test, usually written as χ² (chi-square), is a statistical test that compares what you observed in your data to what you would expect under a specific hypothesis. It helps answer questions like: “Are the differences between my observed and expected counts larger than chance alone would plausibly produce?”

This page focuses on a goodness-of-fit version of the chi-square test, where you compare one set of observed category counts against expected category counts.

How this calculator works

The calculator uses the standard formula:

X² = Σ ((Observed - Expected)² / Expected)

It then computes:

Degrees of freedom (df) = number of categories - 1
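p-value: the upper-tail probability of the chi-square distribution with df degrees of freedom, evaluated at X²
Decision: reject or do not reject the null hypothesis at your chosen significance level α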
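The steps above can be sketched in Python using only the standard library. This is an illustrative sketch, not the calculator's actual code: the function names chi_square_sf and goodness_of_fit are hypothetical, and the rule "reject when p < α" is an assumption about how the decision is made. The p-value uses closed forms of the chi-square upper tail for integer df.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed-form starting points of the regularized upper incomplete gamma Q(a, h):
    # Q(1/2, h) = erfc(sqrt(h)) for odd df, Q(1, h) = exp(-h) for even df.
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)

def goodness_of_fit(observed, expected, alpha=0.05):
    """Return (statistic, df, p_value, reject) for a goodness-of-fit test."""
    if len(observed) != len(expected) or len(observed) < 2:
        raise ValueError("need two or more categories of equal length")
    if any(o < 0 for o in observed) or any(e <= 0 for e in expected):
        raise ValueError("observed must be >= 0 and expected must be > 0")
    # X² = Σ (Observed - Expected)² / Expected
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    p_value = chi_square_sf(statistic, df)
    # Assumption: reject the null hypothesis when p is strictly below alpha.
    return statistic, df, p_value, p_value < alpha
```

Calling goodness_of_fit([58, 42], [50, 50]) gives a statistic of 2.56 with df = 1 and a p-value of roughly 0.11, so the null hypothesis is not rejected at α = 0.05.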

When to use this test

Use it when:

Your data are counts/frequencies (not means or percentages by themselves).
Each observation belongs to exactly one category.
You want to test whether the observed category distribution matches an expected one.

Do not use it when:

Your data are continuous measurements (e.g., height, weight, time).
Expected counts are too small in many categories (a common rule of thumb is that each expected count should be at least 5).
Observations are not independent.
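The expected-count rule of thumb is easy to check before running the test. The helper below is a minimal sketch; the name small_expected_categories and the default threshold of 5 are illustrative assumptions, not part of this calculator.

```python
def small_expected_categories(expected, threshold=5.0):
    """Return the indices of categories whose expected count is below threshold."""
    return [i for i, e in enumerate(expected) if e < threshold]

# Categories 2 and 3 fall below the threshold of 5.
flagged = small_expected_categories([40, 30, 4, 2.5])
print(flagged)  # [2, 3]
```

An empty result suggests the rule of thumb is satisfied. It says nothing about whether the observations are independent.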

Step-by-step example

Suppose you flip a coin 100 times and observe 58 heads and 42 tails. Under a fair-coin hypothesis, expected counts are 50 and 50.

Observed: 58, 42
Expected: 50, 50
X² = (58-50)²/50 + (42-50)²/50 = 2.56
df = 2 - 1 = 1
p-value is about 0.11

Since p > 0.05, you would fail to reject the fair-coin hypothesis at α = 0.05.
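The arithmetic in this example can be reproduced with a few lines of Python. The sketch relies on a shortcut that holds only for df = 1, where the chi-square upper tail equals erfc(√(X²/2)). Other degrees of freedom need a general chi-square tail function.

```python
import math

observed = [58, 42]
expected = [50, 50]

# X² = Σ (Observed - Expected)² / Expected
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Valid for df = 1 only: P(chi-square >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(f"X² = {statistic:.2f}, df = {df}, p ≈ {p_value:.2f}")
```

The printed values should agree with the hand calculation: X² of 2.56, df of 1, and a p-value of about 0.11.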

Interpreting your result

A small p-value (below your chosen α) means that counts deviating this far or further from the expected pattern would be unlikely if the null hypothesis were true. In that case, you reject the null hypothesis. A larger p-value means your data are reasonably consistent with the expected distribution.

Remember: “not significant” does not prove the null hypothesis true. It only means the data did not provide strong enough evidence against it.

Practical tips for better analysis

Combine sparse categories if expected counts are very low.
State a clear hypothesis before running the test.
Report X², df, p-value, and sample size together.
Context matters: statistical significance is not the same as practical importance.
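As one possible way to apply the first tip, the sketch below pools every category whose expected count falls below a threshold into a single combined category. The function name pool_sparse_categories, the threshold of 5, and the "pool everything sparse into one bucket" strategy are all assumptions made for illustration. In practice, which categories to merge should also make sense for your subject matter.

```python
def pool_sparse_categories(observed, expected, threshold=5.0):
    """Merge all categories with expected < threshold into one trailing category."""
    kept_obs, kept_exp = [], []
    pooled_obs = pooled_exp = 0.0
    for o, e in zip(observed, expected):
        if e < threshold:
            # Sparse category: accumulate into the pooled bucket.
            pooled_obs += o
            pooled_exp += e
        else:
            kept_obs.append(o)
            kept_exp.append(e)
    if pooled_exp > 0:
        kept_obs.append(pooled_obs)
        kept_exp.append(pooled_exp)
    return kept_obs, kept_exp

obs, exp = pool_sparse_categories([30, 25, 3, 2], [28, 24, 4, 4])
print(obs, exp)  # [30, 25, 5.0] [28, 24, 8.0]
```

Pooling reduces the number of categories, so df drops accordingly (here from 3 to 2). The pooled bucket itself can still fall below the threshold if very little was merged into it.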

Frequently asked questions

Why does the calculator rescale expected counts?

For a goodness-of-fit test, expected and observed totals should match. If they do not, this calculator rescales expected values proportionally to the observed total so the test remains valid.
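A minimal sketch of proportional rescaling, assuming it works as described here (the function name rescale_expected is hypothetical and this is not the calculator's own code):

```python
def rescale_expected(observed, expected):
    """Scale expected values so that their total equals the observed total."""
    observed_total = sum(observed)
    expected_total = sum(expected)
    if expected_total <= 0:
        raise ValueError("expected values must sum to a positive number")
    factor = observed_total / expected_total
    return [e * factor for e in expected]

# Equal weights of 1 and 1 become 50 and 50 for 100 observations.
print(rescale_expected([58, 42], [1, 1]))  # [50.0, 50.0]
```

Rescaling keeps the expected proportions unchanged and adjusts only their total.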

What if I need a chi-square test of independence?

That test uses a contingency table (rows × columns) and computes expected values from row/column totals. This page is specifically for one-sample goodness-of-fit.
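For contrast, the expected counts in a test of independence come from the table's margins: each cell's expected value is (row total × column total) / grand total. The sketch below illustrates only that step. The function name is hypothetical and this calculator does not perform the test.

```python
def expected_for_independence(table):
    """Expected cell counts under independence for a rows × columns table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # expected[i][j] = row_total[i] * col_total[j] / grand_total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

print(expected_for_independence([[10, 20], [30, 40]]))
# [[12.0, 18.0], [28.0, 42.0]]
```

The degrees of freedom for that test are (rows - 1) × (columns - 1), not the number of categories minus 1.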

Can I use percentages?

Yes, but convert percentages into expected counts using your total sample size, or enter a proportional set that can be scaled to the observed total.
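Converting percentages to expected counts might look like the sketch below. The function name counts_from_percentages and the tolerance used to check that the percentages sum to 100 are illustrative assumptions.

```python
def counts_from_percentages(percentages, sample_size, tolerance=0.01):
    """Convert hypothesized percentages into expected counts for sample_size."""
    if abs(sum(percentages) - 100.0) > tolerance:
        raise ValueError("percentages should sum to 100")
    return [sample_size * p / 100 for p in percentages]

# A 50% / 30% / 20% hypothesis with 200 observations.
print(counts_from_percentages([50, 30, 20], 200))  # [100.0, 60.0, 40.0]
```

The sample size here should be the total of your observed counts, so that the expected and observed totals match.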