Linear Regression Calculator
Enter your data as two equal-length lists of numbers. This tool computes the best-fit line y = mx + b using ordinary least squares.
Tip: You need at least 2 paired points, and the x values cannot all be identical.
What this calculator does
This calculator estimates a straight-line relationship between one independent variable (x) and one dependent variable (y). In statistics, this is called simple linear regression. The output gives you the slope, intercept, correlation, and goodness-of-fit metrics so you can quickly evaluate how well a line explains your data.
How to use it
- Enter all x values in the first field.
- Enter all y values in the second field in the same order.
- Optionally enter an x value for prediction.
- Click Calculate Regression.
Example: If x is advertising spend and y is sales, the regression line helps estimate expected sales at future spending levels.
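The input rules above (equal-length lists, at least 2 points, x values not all identical) can be sketched as a small validation step. This is only an illustration, not the calculator's actual code: the helper name `parse_pairs` and the comma-or-whitespace separator are assumptions.

```python
from __future__ import annotations

import re


def parse_pairs(x_text: str, y_text: str) -> tuple[list[float], list[float]]:
    """Parse two text fields into equal-length lists of floats.

    Hypothetical helper: the separator rule (commas and/or whitespace)
    is an assumption, not documented behavior of the calculator.
    """

    def to_floats(text: str) -> list[float]:
        tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
        return [float(t) for t in tokens]  # raises ValueError on non-numeric input

    xs, ys = to_floats(x_text), to_floats(y_text)
    if len(xs) != len(ys):
        raise ValueError("x and y must contain the same number of values")
    if len(xs) < 2:
        raise ValueError("at least 2 paired points are required")
    if len(set(xs)) == 1:
        raise ValueError("x values must not all be identical")
    return xs, ys
```

Rejecting identical x values up front matters because the slope formula divides by the spread of x, which would be zero in that case.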
Understanding the output
1) Regression equation: y = mx + b
The calculator returns the best-fit line. The slope (m) is the predicted change in y for each 1-unit increase in x. The intercept (b) is the predicted value of y when x = 0.
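To make the roles of m and b concrete, here is a small sketch. The coefficient values are made up for illustration, and `predict` is a hypothetical helper name.

```python
def predict(m: float, b: float, x: float) -> float:
    """Return the fitted value y = m*x + b."""
    return m * x + b


# Illustrative coefficients only (not taken from real data).
m, b = 2.5, 10.0

at_zero = predict(m, b, 0)                   # the prediction at x = 0 is the intercept b
step = predict(m, b, 5) - predict(m, b, 4)   # a 1-unit increase in x changes y by the slope m

print(at_zero, step)  # 10.0 2.5
```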
2) Correlation (r)
Correlation ranges from -1 to +1 and measures the direction and strength of linear association. Values near +1 indicate a strong positive linear relationship, values near -1 indicate a strong negative linear relationship, and values near 0 indicate little or no linear association (a non-linear relationship may still exist).
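As a rough sketch of how r behaves at its extremes, the function below computes the Pearson correlation from the sums of squares. The function name `pearson_r` and the sample data are illustrative assumptions.

```python
from __future__ import annotations

import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))  # points on a rising line: 1.0
print(pearson_r(xs, [8, 6, 4, 2]))  # points on a falling line: -1.0
```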
3) Coefficient of determination (R²)
R² indicates the fraction of variability in y explained by x through a linear model. For example, R² = 0.84 means 84% of variation in y is explained by the fitted line.
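The "fraction explained" reading of R² can be sketched directly from observed and fitted values. The function name `r_squared` and the data are illustrative assumptions; the fitted values shown are those of the least-squares line for x = 1, 2, 3, 4 with these y values.

```python
def r_squared(ys, y_hats):
    """R² = 1 - SSE/SST for observed values ys and fitted values y_hats."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))  # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)                      # total variation in y
    if sst == 0:
        raise ValueError("R² is undefined when all y values are identical")
    return 1 - sse / sst


ys = [1, 3, 2, 5]               # made-up observations
y_hats = [1.1, 2.2, 3.3, 4.4]   # fitted values from the line y = 1.1x
print(round(r_squared(ys, y_hats), 2))
```

A perfect fit has SSE = 0 and therefore R² = 1; a line that does no better than always predicting the mean of y has R² = 0.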
4) Residual sum of squares (SSE) and standard error
SSE is the sum of squared differences between the observed and predicted y values. For a given dataset, a lower SSE means a tighter fit. The standard error estimates the typical size of a prediction error, in the same units as y.
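A minimal sketch of both quantities follows. It assumes the common convention s = sqrt(SSE / (n - 2)) for the standard error of a simple linear regression; the text does not state which convention the calculator uses, and the function name and data are illustrative.

```python
import math


def sse_and_standard_error(xs, ys, m, b):
    """Return (SSE, s) for the line y = m*x + b.

    Assumption: s = sqrt(SSE / (n - 2)), the usual residual standard error
    for simple linear regression. The calculator's exact convention is not
    stated in the text.
    """
    residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    n = len(xs)
    if n <= 2:
        return sse, None  # no degrees of freedom left to estimate s
    return sse, math.sqrt(sse / (n - 2))


# Made-up data; m = 1.1 and b = 0.0 are the least-squares coefficients for it.
sse, s = sse_and_standard_error([1, 2, 3, 4], [1, 3, 2, 5], 1.1, 0.0)
print(round(sse, 3), round(s, 3))
```

With only 2 points the line passes through both exactly, so under this convention there is nothing left from which to estimate a typical error, and the sketch returns `None`.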
The core formulas
The calculator uses standard ordinary least squares formulas:
- m = Sxy / Sxx
- b = ȳ - m x̄
- r = Sxy / √(Sxx · Syy)
- R² = 1 - SSE/SST (when SST > 0)
where x̄ and ȳ are the means of x and y, Sxx = Σ(x - x̄)², Syy = Σ(y - ȳ)², Sxy = Σ(x - x̄)(y - ȳ), SSE = Σ(y - ŷ)² with ŷ = mx + b, and SST = Syy.
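Put together, the formulas above translate into a short routine. This is a sketch under stated assumptions, not the calculator's source: the names `Fit` and `fit_line` are hypothetical, and returning `None` for r and R² when all y values are identical is one possible way to handle the SST = 0 case.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fit:
    m: float             # slope
    b: float             # intercept
    r: Optional[float]   # correlation, None if y has no variation
    r2: Optional[float]  # coefficient of determination, None if SST == 0


def fit_line(xs, ys) -> Fit:
    """Ordinary least squares fit of y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")

    m = sxy / sxx
    b = y_bar - m * x_bar
    sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

    # SST equals Syy; both r and R² are undefined when it is zero.
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else None
    r2 = 1 - sse / syy if syy > 0 else None
    return Fit(m, b, r, r2)


print(fit_line([1, 2, 3, 4], [2, 4, 6, 8]))
```

For these made-up points, which lie exactly on y = 2x, the fit gives m = 2, b = 0, and r = R² = 1.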
Best practices and assumptions
- Use this for roughly linear relationships.
- Watch out for outliers; one extreme point can strongly affect slope.
- Avoid extrapolating far beyond your observed x range.
- Remember: correlation does not imply causation.
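The warning about outliers can be illustrated with a small experiment: fit the same data twice, once with a single point replaced by an extreme value. The data and the helper name `slope` are made up for illustration.

```python
def slope(xs, ys):
    """Least-squares slope m = Sxy / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


xs = [1, 2, 3, 4, 5]
ys_clean = [2, 4, 6, 8, 10]     # made-up data on the line y = 2x
ys_outlier = [2, 4, 6, 8, 30]   # same data, last point replaced by an extreme value

print(slope(xs, ys_clean))      # 2.0
print(slope(xs, ys_outlier))    # 6.0
```

Changing one of five points triples the slope here, which is why it is worth plotting the data or inspecting residuals before trusting the fitted line.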
When this tool is useful
- Trend estimation for business or operations data
- Academic homework checks for statistics and data science courses
- Quick forecasting from small datasets
- Sanity checks before building more advanced predictive models
If your data has multiple predictors, non-linear patterns, or strong seasonality, consider moving to multiple regression, polynomial regression, or time-series models.