coefficient of determination calculation - Aaron Graves, PhDude Replica

R² Calculator (Coefficient of Determination)

Enter your observed values and model-predicted values to calculate R² instantly.

Observed (Actual) Values, Y

Use commas, spaces, semicolons, or new lines between numbers.

Predicted Values, Ŷ

Must contain the same number of points as the observed values.

Optional: Number of Predictors (k) for Adjusted R²

What is the coefficient of determination?

The coefficient of determination, written as R², is the proportion of the variation in a dependent variable that your model explains. In plain English: it measures how closely your predicted values track the observed values.

If R² is 0.80, then your model explains about 80% of the variance in the outcome and about 20% remains unexplained (noise, missing variables, random error, or non-linear patterns not captured by the model).

Formula for the coefficient of determination

Primary formula

R² is usually computed from sums of squares:

R² = 1 − (SSE / SST)

SSE (Sum of Squared Errors) = Σ(y_i − ŷ_i)²
SST (Total Sum of Squares) = Σ(y_i − ȳ)²
y_i = observed value, ŷ_i = predicted value, ȳ = mean of observed values

A lower SSE means your predictions are closer to the actual observations. As SSE shrinks relative to SST, R² approaches 1.
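As a minimal sketch of the formula above in Python: the function name `r_squared` is an illustrative choice, not something this page defines. The zero-SST check is an added assumption. When every observed value is identical, SST is 0 and the ratio SSE/SST is undefined.

```python
def r_squared(observed, predicted):
    """Return R² = 1 - SSE/SST for two equal-length sequences of numbers."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one data point is required")

    mean_y = sum(observed) / len(observed)

    # SSE: squared gaps between each observation and its prediction.
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # SST: squared gaps between each observation and the observed mean.
    sst = sum((y - mean_y) ** 2 for y in observed)

    if sst == 0:
        # Assumption: treat constant observed data as an error, not as R² = 0.
        raise ValueError("SST is zero, so R² is undefined")
    return 1 - sse / sst


# Hypothetical data: SST = 10 and SSE is about 0.10, so R² is about 0.99.
print(round(r_squared([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0]), 4))
```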

How to calculate R² step by step

1) Compute the mean of observed values

Add all observed values and divide by the number of observations to get ȳ.

2) Compute SST

For each observed point, subtract the mean and square the result. Add all squared values together.

3) Compute SSE

For each data point, subtract predicted from observed, square the difference, and sum.

4) Apply the formula

Insert SSE and SST into R² = 1 − SSE/SST. That final number is your coefficient of determination.
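The four steps can be traced on a small example. The data below is hypothetical, chosen only so that the arithmetic is easy to check by hand.

```python
# Hypothetical sample data (not taken from this page).
observed = [2, 4, 6, 8]
predicted = [3, 3, 7, 7]

# Step 1: mean of the observed values -> (2 + 4 + 6 + 8) / 4 = 5.0
mean_y = sum(observed) / len(observed)

# Step 2: SST -> (-3)² + (-1)² + 1² + 3² = 20.0
sst = sum((y - mean_y) ** 2 for y in observed)

# Step 3: SSE -> (-1)² + 1² + (-1)² + 1² = 4
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))

# Step 4: R² = 1 - SSE/SST = 1 - 4/20
r2 = 1 - sse / sst

print(f"mean = {mean_y}, SST = {sst}, SSE = {sse}, R² = {r2:.2f}")
```

Here the model accounts for about 80% of the variation in the observed values.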

Interpreting your R² result

R² = 1.00: perfect fit on the provided data.
R² between 0 and 1: partial explanatory power; higher is generally better.
R² near 0: the model explains little of the variation.
R² below 0: predictions are worse than using the mean as a baseline.

Interpretation depends on context. In some domains (finance, social science), modest R² values can still be useful. In tightly controlled engineering settings, you may expect much higher values.
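The boundary cases listed above can be reproduced with a short sketch. The helper `r_squared` and the data are illustrative assumptions. Predicting the observed mean for every point gives exactly 0, and predictions that are worse than the mean push R² below 0.

```python
def r_squared(observed, predicted):
    """R² = 1 - SSE/SST; assumes equal lengths and non-constant observed data."""
    mean_y = sum(observed) / len(observed)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    sst = sum((y - mean_y) ** 2 for y in observed)
    return 1 - sse / sst


observed = [1, 2, 3, 4, 5]  # hypothetical data, mean = 3, SST = 10

perfect = r_squared(observed, observed)          # SSE = 0
baseline = r_squared(observed, [3, 3, 3, 3, 3])  # always predict the mean, SSE = 10
reversed_fit = r_squared(observed, [5, 4, 3, 2, 1])  # SSE = 40, worse than the mean

print(perfect, baseline, reversed_fit)  # 1.0 0.0 -3.0
```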

Adjusted R² (when you have multiple predictors)

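Adding predictors to a least-squares regression never lowers R² on the data used to fit it, even when the new predictors are irrelevant. Adjusted R² corrects for this by penalizing model size relative to sample size: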

Adjusted R² = 1 − (1 − R²) × (n − 1)/(n − k − 1)

n = number of observations (must be greater than k + 1)
k = number of predictors

Use adjusted R² when comparing models with different numbers of features.
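A minimal sketch of the adjustment, assuming you already have a plain R² value. The function name `adjusted_r_squared` and the sample numbers are hypothetical. The guard reflects the fact that the denominator n − k − 1 must be positive.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("adjusted R² requires n > k + 1")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Same plain R² and sample size, different numbers of predictors.
plain = 0.80
few = adjusted_r_squared(plain, n=50, k=2)
many = adjusted_r_squared(plain, n=50, k=10)

print(f"k=2: {few:.3f}, k=10: {many:.3f}")  # k=2: 0.791, k=10: 0.749
```

With the same plain R², the model that uses more predictors receives the lower adjusted value.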

Common mistakes to avoid

Using arrays with different lengths for observed and predicted values.
Assuming a high R² always means a good model (it does not establish causality).
Ignoring residual diagnostics and overfitting risk.
Comparing R² across datasets with different outcome variance without context.
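The first mistake is easy to catch before any arithmetic happens. The sketch below is one possible way to parse text input that uses commas, spaces, semicolons, or new lines as separators and then validate it. The helper names `parse_values` and `check_inputs` are hypothetical, and this is not a description of how the calculator on this page is implemented.

```python
import math
import re


def parse_values(text):
    """Split on commas, semicolons, or any whitespace and convert to floats."""
    tokens = [t for t in re.split(r"[,;\s]+", text) if t]
    return [float(t) for t in tokens]  # float() raises ValueError on bad tokens


def check_inputs(observed, predicted):
    """Raise ValueError if the two lists cannot be used to compute R²."""
    if len(observed) != len(predicted):
        raise ValueError(
            f"length mismatch: {len(observed)} observed vs {len(predicted)} predicted"
        )
    if len(observed) < 2:
        raise ValueError("need at least two data points")
    if not all(math.isfinite(v) for v in observed + predicted):
        raise ValueError("inputs must not contain NaN or infinite values")


observed = parse_values("1, 2;3\n4  5")
predicted = parse_values("1.1 1.9 3.2 3.8 5.0")
check_inputs(observed, predicted)  # passes silently for valid input
```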

Quick practical workflow

Use this page to compute R², then validate with residual plots, cross-validation, and domain logic. R² is a useful summary metric, but model quality cannot be reduced to a single number.
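Before plotting, a quick look at the raw residuals can already flag problems such as a consistent bias or a single dominant error. This is a minimal sketch with hypothetical data and an illustrative helper name, not a substitute for proper residual diagnostics.

```python
def residuals(observed, predicted):
    """Return observed minus predicted for each data point."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical data (not taken from this page).
res = residuals([2, 4, 6, 8], [3, 3, 7, 7])

# A mean residual far from 0 suggests systematic over- or under-prediction.
mean_res = sum(res) / len(res)
# The largest absolute residual points at the worst-fit observation.
worst = max(abs(r) for r in res)

print(res, mean_res, worst)
```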