mean squared error

Also known as: MSE; mean squared deviation (MSD)
Related Topics: prediction

mean squared error (MSE), the average squared difference between the values observed in a statistical study and the values predicted from a model. When comparing observations with predicted values, it is necessary to square the differences because some data values will be greater than the prediction (so their differences are positive) and others will be less (so their differences are negative). Since observations are as likely to be greater than the predicted values as they are to be less, the raw differences tend to cancel and add to zero. Squaring the differences eliminates this cancellation.
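The cancellation can be seen with a small set of illustrative errors (the values here are invented for demonstration):

```python
# Raw errors: positive and negative values cancel when summed,
# which is why the differences are squared before averaging.
errors = [3, -3, 2, -2]

print(sum(errors))                  # cancellation hides the error entirely
print(sum(e ** 2 for e in errors))  # squaring keeps every term positive
```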

The formula for the mean squared error is MSE = Σ(yi − pi)²/n, where yi is the ith observed value, pi is the corresponding predicted value for yi, and n is the number of observations. The Σ indicates that a summation is performed over all values of i.
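The formula can be sketched as a short function (a minimal illustration; the function and argument names here are ours, not part of any standard library):

```python
def mean_squared_error(observed, predicted):
    """Return the average squared difference between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    # Sum of (yi - pi)^2 over all i, divided by n
    return sum((y - p) ** 2 for y, p in zip(observed, predicted)) / len(observed)
```

If the prediction matches every observation exactly, the function returns zero.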

If the prediction passes through all data points, the mean squared error is zero. As the distance between the data points and the associated values from the model increases, the mean squared error increases. Thus, a model with a lower mean squared error more accurately predicts dependent values for given independent variable values.

For example, when temperature data is studied, forecast temperatures often differ from the actual temperatures, and the mean squared error can be calculated to measure the error in the forecasts. Here it is not necessarily the case that the differences will add to zero, because predicted temperatures come from weather models that change over time, so the differences are measured against a moving prediction model. The table below shows the actual monthly temperature in Fahrenheit, the predicted temperature, the error, and the square of the error.

Month Actual Predicted Error Squared Error
January 42 46 −4 16
February 51 48 3 9
March 53 55 −2 4
April 68 73 −5 25
May 74 77 −3 9
June 81 83 −2 4
July 88 87 1 1
August 85 85 0 0
September 79 75 4 16
October 67 70 −3 9
November 58 55 3 9
December 43 41 2 4

The squared errors are now added to generate the value of the summation in the numerator of the mean squared error formula: Σ(yi − pi)² = 16 + 9 + 4 + 25 + 9 + 4 + 1 + 0 + 16 + 9 + 9 + 4 = 106. Applying the mean squared error formula: MSE = Σ(yi − pi)²/n = 106/12 ≈ 8.83.
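As a check on the arithmetic, the same calculation can be reproduced with the values from the table:

```python
# Monthly temperatures (degrees Fahrenheit) from the table above
actual    = [42, 51, 53, 68, 74, 81, 88, 85, 79, 67, 58, 43]
predicted = [46, 48, 55, 73, 77, 83, 87, 85, 75, 70, 55, 41]

squared_errors = [(y - p) ** 2 for y, p in zip(actual, predicted)]
total = sum(squared_errors)   # numerator of the MSE formula: 106
mse = total / len(actual)     # 106 / 12, approximately 8.83

print(total, round(mse, 2))
```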

After calculating the mean squared error, one must interpret it. How can a value of 8.83 for the MSE in the above example be interpreted? Is 8.83 close enough to zero to represent a “good” value? Such questions sometimes do not have a simple answer.

However, what can be done in this particular example is to compare the predictions across years. If one year had an MSE of 8.83 and the next year the MSE for the same type of data was 5.23, this would show that the prediction methods used in the later year were better than those used the year before. While an MSE of zero would be ideal, in practice this is almost never achievable. Even so, the results can be used to evaluate how temperature predictions should be adjusted.

Ken Stewart