A product developer’s guide to machine learning (ML) regression model metrics

First published on August 26, 2021

Last updated on April 22, 2022

 

8 minute read

Nathaniel Tjandra

TLDR

The 2 most commonly used metrics in regression are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Mean Absolute Error is best for simple, orderly datasets, while Root Mean Squared Error is best for complex, chaotic datasets.

Outline

  • Introduction

  • Regression defined

  • Bob’s Boba

  • Absolute measures

  • Mean Absolute Error (MAE)

  • Root Mean Squared Error (RMSE)

  • Understanding error values

  • Conclusion

Introduction

Most machine learning (ML) problems fit into 2 groups: classification and regression. The main metrics used to assess the performance of regression models are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).

Regression defined

Regression is about finding a pattern in measurable values from the past to determine how those values will change in the future. To demonstrate each of these metrics, we’ll look at a scenario and create a model for it. Then we’ll apply regression evaluation metrics to measure how far off each prediction was from what actually happened.

Regression… to the moon?
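To make that concrete, here’s a minimal sketch of fitting a regression model to past values and predicting a future one. The daily visit counts below are made up purely for illustration, and scikit-learn’s LinearRegression is just one of many possible model choices.

from sklearn.linear_model import LinearRegression

# Made-up historical data: day number vs. daily customer visits
days = [[1], [2], [3], [4], [5]]
visits = [3, 4, 4, 5, 7]

# Fit a simple linear regression model on the past values
model = LinearRegression()
model.fit(days, visits)

# Predict the number of visits on the next day
print("Predicted visits on day 6:", model.predict([[6]])[0])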

Bob’s Boba

In the last post, we heard the tale of 4 heroes using accuracy and precision to slay monsters. Now, let’s take a look at the tale from the perspective of the boba shop in the village. The boba shop is run by a shopkeeper named Bob and has 2 loyal customers named Dan and Janet.

On the 1st day, the shopkeeper had planned to have his regulars come by. Instead, the party of heroes appeared, so Bob asked them to search for Dan and Janet.

On the 2nd day, Bob thought he would see the heroes come back with Dan and Janet. Instead, Dan and Janet came back together and told the shopkeeper that the heroes were off to continue fighting and that they planned to leave to support them.

On the 3rd day, the shopkeeper was glad his friends were joining the heroes’ quest, and didn’t expect any customers. To Bob’s surprise, the friends later came back to let him know that everyone loved Bob’s boba and wanted to take some on the heroes’ journey.

On the 4th day, Bob was getting ready to close down his shop since his only customers had left. As it approached noon, he heard the sound of footsteps. A flood of 8 customers from another village came to visit after hearing the news of delicious boba. Bob happily served them, and the tale ends there.

Absolute measures

Metrics tell us how well a model fits the data. Regression metrics fall under 2 categories: relative measures and absolute measures. The most commonly used absolute measures are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).

Mean Absolute Error (MAE)

Mean Absolute Error represents how far off each prediction is from its actual value, on average. It’s calculated by taking the absolute difference between each prediction and the actual value, summing those differences, and dividing by the number of predictions.

MAE: 4

2 and 4 are incorrect by 2

6 and 2 are incorrect by 4

0 and 2 are incorrect by 2

0 and 8 are incorrect by 8

Mean Absolute Error = (2 + 4 + 2 + 8) / 4 = 16 / 4 = 4

from sklearn.metrics import mean_absolute_error
# Example Dataset
y_predicted = [2,6,0,0]
y_actual = [4,2,2,8]
# Calculate Mean Absolute Error
mae = mean_absolute_error(y_actual, y_predicted)
print("Mean Absolute Error:", mae)


The MAE is saying that the model is off by 4 visits per day, on average. MAE is great since it’s easy to calculate and interpret, relying only on simple operations.

Root Mean Squared Error (RMSE)

Root Mean Squared Error is similar to MAE, except each difference between the predicted and actual value is squared, the squared differences are averaged, and then the square root of that average is taken. Squaring heavily penalizes large errors: doubling an error quadruples its contribution to the score. This makes RMSE useful for volatile datasets, with many variables changing constantly during each trial. Be warned: by penalizing errors more strictly, the model may end up fitting one specific scenario rather than the general case, also known as overfitting.

RMSE: 4.69

2 and 4 are under by 2: (-2)² = 4

6 and 2 are over by 4: 4² = 16

0 and 2 are under by 2: (-2)² = 4

0 and 8 are under by 8: (-8)² = 64

Root Mean Squared Error = √((4 + 16 + 4 + 64) / 4) = √22 ≈ 4.69

from sklearn.metrics import mean_squared_error
# Example Dataset
y_predicted = [2,6,0,0]
y_actual = [4,2,2,8]
# Calculate Root Mean Squared Error by taking the square root of the Mean Squared Error
rmse = mean_squared_error(y_actual, y_predicted) ** 0.5
print("Root Mean Squared Error:", rmse)

The RMSE is saying that the overall model has an error of about 4.69 visits. By squaring the errors before averaging and then taking the square root, RMSE keeps the result in the same units as the data while giving extra weight to the largest misses.

Understanding error values

Reading error values is not as simple as low is good and high is bad. Similar to classification, choosing which evaluation metric to use depends on the data.

MAE is simpler to understand and shows the typical size of an error. RMSE suits complex datasets with multiple variables and penalizes larger errors more heavily than MAE. Penalizing means the model is held more accountable for big misses, and the gap between RMSE and MAE hints at how the errors are distributed, from many small mistakes to a few critical failures.
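To see that penalty in action, here’s a small sketch using two made-up error patterns: one model that is off by 2 visits every day, and one that is perfect except for a single miss of 8 visits. Both end up with the same MAE, but RMSE flags the second one as worse:

from sklearn.metrics import mean_absolute_error, mean_squared_error

y_actual = [4, 4, 4, 4]
even_misses = [2, 6, 2, 6]     # off by 2 every day
one_big_miss = [4, 4, 4, 12]   # perfect except for one miss of 8

for name, y_pred in [("even misses", even_misses), ("one big miss", one_big_miss)]:
    mae = mean_absolute_error(y_actual, y_pred)
    rmse = mean_squared_error(y_actual, y_pred) ** 0.5
    print(name, "-> MAE:", mae, "RMSE:", round(rmse, 2))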

In Bob’s Boba story, before the heroes showed up, he was living an everyday life filled with consistent events: each day, only his regulars would visit. This made the dataset simple. Suppose that during that calm stretch the model’s daily errors were just 0, 1, 1, and 0 visits.

MAE calculation

(0 + 1 + 1 + 0) / 4 = 2 / 4 = 0.5

RMSE calculation

√((0² + 1² + 1² + 0²) / 4) = √0.5 ≈ 0.71
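As a quick check, here’s a sketch using hypothetical visit counts that produce those errors of 0, 1, 1, and 0:

from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical calm-week data with errors of 0, 1, 1, and 0 visits
y_predicted = [2, 2, 2, 2]
y_actual = [2, 3, 1, 2]

print("MAE:", mean_absolute_error(y_actual, y_predicted))                    # 0.5
print("RMSE:", round(mean_squared_error(y_actual, y_predicted) ** 0.5, 2))   # 0.71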

But as we know from Bob’s Boba story, plenty of unforeseen circumstances happened. His customers were taken captive, and the presence of heroes drastically increased his store’s popularity, leading to more regular customers in the future. All of this makes the dataset chaotic, highly volatile, and complex. Therefore, choosing RMSE is helpful to account for all the extra publicity and future growth.

Conclusion

RMSE and MAE are the go-to metrics for evaluating regression models. There are other regression metrics as well, but the one you choose depends on the complexity of your data. Use MAE for a dataset that is small and stable with few changing variables. Use RMSE for datasets that are constantly changing and contain widely varying values.