Bias-Variance Tradeoff in Machine Learning

Himanshu Gaur
4 min read · Nov 28, 2022


[Illustration: the fitted line after training]

In Machine Learning, the goal is never to fit the training data exactly; our objective is to recover the unknown underlying pattern.

Just have a look at the illustration below:

[Illustration: very few observations lie exactly on the curve, yet it captures the correct pattern]

In Supervised Machine Learning, there is always a tradeoff between approximation and generalization, known as the bias-variance tradeoff.

What is Bias?

Whenever we create a model using supervised machine learning, we have training data that contains an underlying pattern, which we want to encapsulate in our model and then use to predict future (unseen) data points. Now, if we choose a simple model to predict a complex pattern, our model's predictions will be far from the actual values; this is known as underfitting, or the high-bias problem.

A lot of parametric algorithms, like linear regression and logistic regression, suffer from the high-bias problem. In other words, high bias occurs when our algorithm makes simplifying assumptions about the underlying unknown pattern. We want our models to have low bias.

  • Bias is the error between the average model prediction and the ground truth.
  • The bias of the estimated function tells us how well the underlying model class can capture the true values.
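
To make the high-bias case concrete, here is a minimal sketch (synthetic quadratic data and scikit-learn; all choices in it are illustrative, not from this article's figures): a straight line fit to curved data leaves a large error even on the training set.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.5, size=200)  # quadratic pattern plus noise

line = LinearRegression().fit(X, y)  # deliberately too simple for the pattern
print("train MSE:", mean_squared_error(y, line.predict(X)))
# The line cannot bend to follow the curvature, so even the training
# error stays high; that is the signature of underfitting / high bias.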

What is Variance?

Generally, we divide our dataset into three parts (a splitting sketch follows the list):

  1. Training Set [This is used for training]
  2. Testing Set [This is used for testing the model after training]
  3. Validation Set [The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters]
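
As promised above, a minimal splitting sketch using scikit-learn's train_test_split applied twice (the 60/20/20 ratio and the placeholder arrays are just example choices):

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)  # placeholder features
y = np.arange(100)                 # placeholder targets

# First carve out the training set, then split the remainder in half.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)
print(len(X_train), len(X_val), len(X_test))  # 60 20 20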

The high-variance problem occurs when our model predicts well on the training dataset but fails on the test set. In such a case, we say our model is not able to generalize well and has overfitted to the training data.

  • Variance is the average variability of the model's predictions across different training datasets.
  • The variance of the estimated function tells us how much the function can adjust to changes in the dataset.
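
And a matching sketch of the high-variance case (again synthetic data; the degree-15 polynomial is an illustrative choice): a very flexible model driven to near-zero training error that fails on fresh points.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X_train = rng.uniform(-3, 3, size=(20, 1))
y_train = np.sin(X_train[:, 0]) + rng.normal(0, 0.3, size=20)
X_test = rng.uniform(-3, 3, size=(200, 1))
y_test = np.sin(X_test[:, 0]) + rng.normal(0, 0.3, size=200)

wiggly = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
wiggly.fit(X_train, y_train)
print("train MSE:", mean_squared_error(y_train, wiggly.predict(X_train)))  # tiny
print("test MSE:", mean_squared_error(y_test, wiggly.predict(X_test)))     # typically much larger
# With only 20 points, the degree-15 curve chases the noise, so it
# memorizes the training set instead of generalizing.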

High Bias

  • Overly-simplified model.
  • Leads to under-fitting.
  • High error on both test and train data.

High Variance

  • Overly-complex model.
  • Leads to over-fitting.
  • Low error on training data and high error on test data.
  • Starts modelling the noise in the input.
[Illustration: bias-variance possibilities]

Bias-Variance Tradeoff

Hence, if we choose a more complicated algorithm, we run the risk of a high-variance problem, while if we use a simple one, we will face a high-bias problem. It's a double-edged sword. The total error in any supervised machine learning prediction is the sum of the squared bias of your model, the variance, and the irreducible error. The irreducible error is also known as the Bayes error or optimum error, as it's mostly noise which can't be reduced by better algorithms, only by better data cleaning.

Total Error = Bias² + Variance + Irreducible Error
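
To see where these terms come from, here is a minimal simulation sketch (all numbers illustrative): train the same too-simple model on many freshly drawn training sets, then measure, at one fixed test point, the squared gap between the average prediction and the truth (bias²) and the spread of the predictions (variance).

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

def true_f(x):
    return x ** 2

x0, noise_sd = 1.5, 0.5  # fixed test point and noise level
preds = []
for _ in range(500):     # 500 independent training sets
    X = rng.uniform(-3, 3, size=(30, 1))
    y = true_f(X[:, 0]) + rng.normal(0, noise_sd, size=30)
    model = LinearRegression().fit(X, y)  # deliberately too simple
    preds.append(model.predict(np.array([[x0]]))[0])
preds = np.array(preds)

print("bias^2     :", (preds.mean() - true_f(x0)) ** 2)
print("variance   :", preds.var())
print("irreducible:", noise_sd ** 2)  # the noise floor no model can remove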

If we plot these values against model complexity, we shall see that at a certain optimum model complexity, we will have the minimum total error.

[Illustration: the point of minimum error is the point of optimum model complexity]
  • Increasing bias (though not always) reduces variance, and vice versa.
  • The best model is the one that minimizes the total error.
  • It is a compromise between bias and variance.
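
A minimal sketch of that curve (synthetic data; the degrees swept are arbitrary example choices): as the polynomial degree grows, training error keeps falling while validation error falls and then rises again.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=200)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 3, 5, 9, 15):
    m = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    tr = mean_squared_error(y_tr, m.predict(X_tr))     # keeps shrinking
    val = mean_squared_error(y_val, m.predict(X_val))  # shrinks, then grows
    print(f"degree {degree:2d}: train {tr:.3f}, val {val:.3f}")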

Bonus Section

How to identify and avoid overfitting/underfitting?

First, we should always monitor validation accuracy along with training accuracy. If training accuracy is high (i.e. training loss is low) but validation accuracy is low (i.e. validation loss is high), it indicates overfitting. In this case, we should:

  • Increase regularization (see the sketch after this list)
  • Get more training data
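
For the first bullet, one concrete way to increase regularization (a sketch, not the only option) is to swap plain linear regression for Ridge and tune its alpha penalty; a larger alpha shrinks the coefficients, lowering variance at the cost of a little bias.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(20, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=20)

# Same overfitting-prone degree-15 model as earlier, now with an L2 penalty.
model = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1.0)).fit(X, y)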

If training accuracy itself is low (or training loss is high), then we need to think about:

  • Have we done sufficient training?
  • Should we train a more powerful network? (If we are using Neural Networks in our model)
  • Decrease regularization?

Thank you for reading!

Please leave a comment if you have any suggestions, would like to add a point, or noticed any mistakes/typos!

P.S. If you found this article helpful, clap! 👏👏👏 [It feels rewarding and gives me the motivation to continue writing.]
