Line of Best Fit Calculator

Linear Regression (Least Squares)

Enter your x and y data as comma-separated values. The calculator returns the best-fit line, correlation, and optional prediction.

What is a line of best fit?

A line of best fit is a straight line that summarizes the relationship between two variables in a scatter plot. If your data points are noisy, the line gives you a clean trend. In statistics, this is commonly found using simple linear regression.

The model is written as y = mx + b, where:

  • m is the slope (how much y changes when x increases by 1)
  • b is the intercept (the value of y when x = 0)

How this calculator works

This page uses the least-squares method. It chooses the line that minimizes the sum of the squared vertical distances between your observed points and the corresponding points on the line.
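As a rough illustration of the least-squares idea, the sketch below computes the slope and intercept from the standard closed-form formulas. The function name fit_line and the sample data are assumptions made for this example; this is not necessarily how the calculator itself is implemented.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    if min(xs) == max(xs):
        raise ValueError("all x values are identical; the slope is undefined")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    m = sxy / sxx            # slope
    b = mean_y - m * mean_x  # intercept: the line passes through the mean point
    return m, b


# Sample data that lies exactly on y = 2x + 1.
m, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {m:.2f}x + {b:.2f}")
```

Because the sample points lie exactly on a line, the fitted slope and intercept reproduce that line; with noisy data the same formulas give the line with the smallest total squared vertical error.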

Inputs

  • Two equal-length lists of numbers: x values and y values
  • At least two points
  • Optional x value for prediction
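The input rules above could be checked along these lines. This is only a sketch: the names parse_values and parse_points are hypothetical, and skipping empty entries (such as a trailing comma) is an assumption, not documented behavior of this calculator.

```python
def parse_values(text):
    """Parse a comma-separated string such as '1, 2.5, -3' into floats.

    Assumption: blank entries (for example after a trailing comma) are skipped.
    """
    return [float(part) for part in text.split(",") if part.strip()]


def parse_points(x_text, y_text):
    """Parse both lists and enforce the two input rules listed above."""
    xs = parse_values(x_text)
    ys = parse_values(y_text)
    if len(xs) != len(ys):
        raise ValueError(f"x has {len(xs)} values but y has {len(ys)}")
    if len(xs) < 2:
        raise ValueError("at least two points are required")
    return xs, ys


xs, ys = parse_points("1, 2, 3", "2.5, -4, 6")
print(xs, ys)
```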

Outputs

  • Best-fit equation in the form y = mx + b
  • Slope and intercept
  • Pearson correlation coefficient (r)
  • Coefficient of determination (R²)
  • Predicted y value for a chosen x (if provided)

Why the slope and R² matter

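The slope tells you the direction and rate of change. A positive slope means y tends to increase as x increases; a negative slope means y tends to decrease. The slope does not measure how closely the points follow the line; that is what r and R² are for.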

R² shows how much of y’s variation is explained by x using this linear model. For example, an R² of 0.81 means roughly 81% of the variation in y is captured by the line.
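To make the relationship between r and R² concrete, here is a minimal sketch that computes both. The function name correlation and the sample data are assumptions for illustration. It relies on the fact that, for a straight-line least-squares fit with an intercept, R² equals the square of Pearson's r.

```python
import math


def correlation(xs, ys):
    """Return (r, r_squared) for paired data. Hypothetical helper."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    if min(xs) == max(xs) or min(ys) == max(ys):
        raise ValueError("r is undefined when x or y has no variation")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    r = sxy / math.sqrt(sxx * syy)
    return r, r * r


r, r_squared = correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```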

Step-by-step interpretation guide

1) Check data quality first

Make sure both lists are aligned by position. The first x should pair with the first y, the second x with the second y, and so on.

2) Read the equation

If the calculator returns y = 1.8x + 2.1, each 1-unit increase in x is associated with an average 1.8-unit increase in y.

3) Evaluate fit

Use r and R² to evaluate linear strength. Values near 1 or -1 for r, and values near 1 for R², indicate a stronger linear relationship.

4) Use predictions carefully

Predictions are most reliable inside the observed x range. Extrapolating far beyond your data is risky because nothing in the data confirms that the linear trend continues there.
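One way to act on this advice is to flag any prediction whose x falls outside the observed range. The sketch below assumes a hypothetical predict helper and reuses the example equation y = 1.8x + 2.1 from step 2; the observed x values are made up for illustration.

```python
def predict(m, b, x, observed_xs):
    """Return (predicted_y, is_extrapolation) for the line y = m*x + b.

    Hypothetical helper: is_extrapolation is True when x lies outside
    the range of x values used to fit the line.
    """
    y = m * x + b
    is_extrapolation = x < min(observed_xs) or x > max(observed_xs)
    return y, is_extrapolation


observed_xs = [1, 2, 4, 5]  # made-up x values for illustration

for x in (3, 20):
    y, outside = predict(1.8, 2.1, x, observed_xs)
    note = " (outside observed range, treat with caution)" if outside else ""
    print(f"x = {x}: predicted y = {y:.2f}{note}")
```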

Real-world use cases

  • Estimating sales growth over time
  • Modeling study hours vs test scores
  • Tracking ad spend vs conversions
  • Analyzing temperature vs energy usage
  • Understanding training volume vs performance

Common mistakes to avoid

  • Using mismatched list lengths
  • Forgetting that correlation does not prove causation
  • Assuming a linear model is always the best model
  • Ignoring outliers that can heavily affect the slope
  • Over-trusting predictions outside known data ranges
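The effect of a single outlier on the slope can be seen in a small experiment. The sketch below uses a hypothetical slope helper and made-up data: the first data set lies exactly on y = 2x, and the second changes only the last y value.

```python
def slope(xs, ys):
    """Least-squares slope for paired data. Hypothetical helper."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


xs = [1, 2, 3, 4, 5]
clean_ys = [2, 4, 6, 8, 10]       # exactly y = 2x
outlier_ys = [2, 4, 6, 8, 30]     # same data, last point replaced by an outlier

clean_slope = slope(xs, clean_ys)
outlier_slope = slope(xs, outlier_ys)
print(f"slope without outlier: {clean_slope:.2f}")
print(f"slope with outlier:    {outlier_slope:.2f}")
```

In this made-up case one changed point triples the slope, which is why it is worth plotting the data and inspecting unusual points before trusting the fitted line.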

FAQ

Does this support nonlinear regression?

No. This tool is for straight-line (linear) best fit only.

Can I use decimals and negative numbers?

Yes. Decimals and negatives are supported in both x and y inputs.

What if all x values are the same?

A unique line cannot be computed in that case because the denominator in the slope formula becomes zero. The calculator will show an error message.
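The zero denominator can be shown directly. In the usual slope formula the denominator is the sum of squared deviations of x from its mean, which is zero when every x is the same. The helper names below are hypothetical, and comparing min and max is just one simple way to detect the case before dividing.

```python
def slope_denominator(xs):
    """Sum of squared deviations of x from its mean (the slope's denominator)."""
    mean_x = sum(xs) / len(xs)
    return sum((x - mean_x) ** 2 for x in xs)


def has_unique_line(xs):
    """True when the x values are not all identical. Hypothetical check."""
    return min(xs) != max(xs)


same_xs = [3, 3, 3]
print(slope_denominator(same_xs))  # zero, so the slope cannot be computed
print(has_unique_line(same_xs))
print(has_unique_line([1, 2, 3]))
```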
