Cohen’s Kappa Calculator
Enter counts from a 2×2 agreement table for two raters. This tool calculates observed agreement, expected agreement, and Cohen’s kappa.
What this online kappa calculator does
This online kappa calculator estimates Cohen’s kappa, a common statistic for measuring agreement between two raters after accounting for chance agreement. If two people are labeling records (for example: positive/negative, pass/fail, present/absent), kappa gives a more realistic reliability score than plain percent agreement.
How Cohen’s kappa is calculated
For a 2×2 table with values a, b, c, d, let N = a + b + c + d.
- Observed agreement (Po): (a + d) / N
- Expected agreement (Pe): agreement expected by chance from the marginal proportions, ((a + b)(a + c) + (c + d)(b + d)) / N²
- Kappa (κ): (Po - Pe) / (1 - Pe)
A kappa of 1.0 means perfect agreement; 0.0 means agreement no better than chance; values below 0 can indicate systematic disagreement.
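The three formulas above can be sketched as a small Python function (an illustration of the calculation, not the calculator's actual source code):

```python
def cohens_kappa(a: float, b: float, c: float, d: float) -> float:
    """Cohen's kappa for a 2x2 agreement table.

    a = both raters positive, b = A positive / B negative,
    c = A negative / B positive, d = both raters negative.
    """
    n = a + b + c + d
    po = (a + d) / n                                      # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)

print(round(cohens_kappa(35, 8, 6, 51), 3))  # the example table further down
```

Note that when both raters agree on every item (b = c = 0 with both classes present), Po = 1 and kappa comes out as exactly 1.0.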
Interpretation guide used by this tool
| Kappa range | Interpretation |
|---|---|
| < 0.00 | Poor agreement (worse than chance) |
| 0.00 – 0.20 | Slight agreement |
| 0.21 – 0.40 | Fair agreement |
| 0.41 – 0.60 | Moderate agreement |
| 0.61 – 0.80 | Substantial agreement |
| 0.81 – 0.99 | Almost perfect agreement |
| 1.00 | Perfect agreement |
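The interpretation table above can be expressed as a small lookup function (a sketch mirroring this tool's labels; how boundary values are assigned between bands is an assumption):

```python
def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to this tool's interpretation labels."""
    if kappa < 0.0:
        return "Poor agreement (worse than chance)"
    bands = [
        (0.20, "Slight agreement"),
        (0.40, "Fair agreement"),
        (0.60, "Moderate agreement"),
        (0.80, "Substantial agreement"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "Perfect agreement" if kappa == 1.0 else "Almost perfect agreement"

print(interpret_kappa(0.713))  # → Substantial agreement
```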
How to use the calculator
Step 1: Build your 2×2 agreement counts
Count how many items both raters marked positive (a), how many rater A marked positive and rater B marked negative (b), how many rater A marked negative and rater B marked positive (c), and how many both raters marked negative (d).
Step 2: Enter values and calculate
Fill all four boxes and click Calculate Kappa. The output includes total sample size, observed agreement, expected agreement, and final kappa with interpretation.
Step 3: Report responsibly
In reports, include the raw table, sample size, and kappa value. If possible, also include confidence intervals and context about prevalence and class imbalance.
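If you want to report a confidence interval alongside kappa, one option is the simple large-sample standard error sqrt(Po(1 - Po) / (N (1 - Pe)²)). The sketch below assumes that approximation; exact variance formulas are more involved, and a statistics library is preferable for publication-grade intervals:

```python
import math

def kappa_with_ci(a, b, c, d, z=1.96):
    """Kappa plus an approximate 95% CI.

    Uses the simple large-sample standard error
    sqrt(Po(1 - Po) / (N (1 - Pe)^2)) -- an assumption in this sketch.
    """
    n = a + b + c + d
    po = (a + d) / n
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (po - pe) / (1 - pe)
    se = math.sqrt(po * (1 - po) / (n * (1 - pe) ** 2))
    return kappa, kappa - z * se, kappa + z * se

k, lo, hi = kappa_with_ci(35, 8, 6, 51)
print(f"kappa = {k:.3f}, approx. 95% CI [{lo:.3f}, {hi:.3f}]")
```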
When kappa is useful (and when it can mislead)
- Useful for binary ratings by two independent raters.
- Useful in medical screening, annotation tasks, audits, and quality control.
- Can appear low when one class is very rare, even with high percent agreement.
- For ordinal categories, weighted kappa is often better than simple kappa.
- For more than two raters, consider Fleiss’ kappa or Krippendorff’s alpha.
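The rare-class caveat above can be demonstrated numerically. Both hypothetical tables below have 90% percent agreement, yet their kappa values differ sharply because of class prevalence:

```python
def cohens_kappa(a, b, c, d):
    n = a + b + c + d
    po = (a + d) / n
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (po - pe) / (1 - pe)

# Balanced classes: 90% agreement yields a strong kappa (about 0.80).
print(round(cohens_kappa(45, 5, 5, 45), 3))
# Rare negative class: same 90% agreement, but kappa drops below zero.
print(round(cohens_kappa(90, 5, 5, 0), 3))
```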
Quick example
Suppose a=35, b=8, c=6, d=51, so N=100. Percent agreement is high (86/100 = 86%), but kappa adjusts for the chance agreement implied by the marginal totals. Use the Load Example button above to see this in action.
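Working this example through the formulas step by step (a sketch of the arithmetic the calculator performs):

```python
a, b, c, d = 35, 8, 6, 51
n = a + b + c + d                                    # 100
po = (a + d) / n                                     # 0.86 observed agreement
pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2  # (43*41 + 57*59) / 10000
kappa = (po - pe) / (1 - pe)
print(round(po, 2), round(pe, 4), round(kappa, 3))
```

So despite 86% raw agreement, chance alone would produce about 51% agreement here, and the chance-corrected kappa lands around 0.71.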
FAQ
Is this calculator for Cohen’s kappa or Fleiss’ kappa?
This page calculates Cohen’s kappa for exactly two raters and two categories.
Can I use decimals?
Counts should usually be whole numbers, but decimal input is accepted for flexibility in aggregated workflows.
Why is my kappa undefined?
If expected agreement is 1.0, the denominator of the formula becomes zero and kappa is mathematically undefined. This happens when each rater uses only one category, for example when both raters mark every item positive.
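A minimal guard for this edge case might look like the following (a hypothetical sketch; returning None when the denominator vanishes):

```python
def safe_kappa(a, b, c, d):
    """Return Cohen's kappa, or None when Pe == 1 (kappa undefined)."""
    n = a + b + c + d
    po = (a + d) / n
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    if pe == 1.0:
        return None  # e.g. both raters marked every item positive
    return (po - pe) / (1 - pe)

print(safe_kappa(10, 0, 0, 0))  # None: Pe = (10 * 10) / 100 = 1.0
```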