
Measuring accuracy with score functions
Now that we have checked our model's assumptions, we turn toward measuring its predictive power. To measure our predictive accuracy, we will use two methods: one for numerical data (Proportion) and one for categorical data (Vote). We know that the Vote variable is a transformation of the Proportion variable, meaning that we are measuring the same information in two different ways. However, both numerical and categorical data are frequently encountered in data analysis, so we want to show both approaches here. Both functions, score_proportions() (numerical) and score_votes() (categorical), receive the data we use for testing and the predictions for each observation in the testing data, which come from the model we built in previous sections.
In the numerical case, score_proportions() computes a score using the following expression:

Score = (1/n) * Σ_{i=1}^{n} ((Y_i - Y'_i)^2 / SE_i^2)
Here, Y_i is the real response variable value for the ith observation in the testing data, Y'_i is our prediction for that same observation, SE_i is the standard error of that prediction, and n is the number of observations in the testing data. This equation establishes that the score, which we want to minimize, is the average of the squared studentized residuals. Studentized residuals, as you may know, are residuals divided by a measure of their standard errors. This formula gives us an average measure of how close we are to predicting an observation's value correctly, relative to the variance observed for that data range. If we have a high degree of variance (resulting in high standard errors), we don't want to be too strict with the prediction, but if we are in a low-variance area, we want to make sure that our predictions are very accurate:
score_proportions <- function(data_test, predictions) {
    # se := standard errors
    se <- predictions$se.fit
    real <- data_test$Proportion
    predicted <- predictions$fit
    return(sum((real - predicted)^2 / se^2) / nrow(data_test))
}
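To get an intuition for this variance weighting, here is a minimal sketch using made-up numbers; the toy_test data frame and the toy_predictions list are hypothetical, with toy_predictions simply mimicking the fit and se.fit components that predict() returns when called with se.fit = TRUE:

# Hypothetical toy data: both observations have the same absolute error (0.1),
# but the second one lies in a high-variance region (larger standard error)
toy_test <- data.frame(Proportion = c(0.6, 0.6))
toy_predictions <- list(
    fit = c(0.5, 0.5),     # predicted proportions
    se.fit = c(0.05, 0.2)  # low versus high standard error
)

# The first observation contributes (0.1 / 0.05)^2 = 4 to the sum, while the
# second contributes only (0.1 / 0.2)^2 = 0.25: identical errors are penalized
# less where the model is less certain
score_proportions(toy_test, toy_predictions)
#> [1] 2.125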
In the categorical case, score_votes() computes a score by simply counting the number of times our predictions pointed toward the correct category, which we want to maximize. We do that by first applying the same classification mechanism (if the predicted Proportion is larger than 0.5, we classify it as a "Leave" vote, and otherwise as "Remain"), and then comparing the categorical values. We know that the sum of a Boolean vector is equal to the number of TRUE values it contains, and that's what we're using in the sum(real == predicted) expression:
score_votes <- function(data_test, predictions) {
    real <- data_test$Vote
    predicted <- ifelse(predictions$fit > 0.5, "Leave", "Remain")
    return(sum(real == predicted))
}
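As a quick sanity check of the counting logic, consider the following hypothetical example (toy objects again, not the actual referendum data):

# Three wards, of which two are predicted correctly
toy_test <- data.frame(Vote = c("Leave", "Remain", "Leave"))
toy_predictions <- list(fit = c(0.7, 0.3, 0.4))

# ifelse() maps the proportions to c("Leave", "Remain", "Remain"); comparing
# against the real votes yields c(TRUE, TRUE, FALSE), and summing that Boolean
# vector counts the TRUE values
score_votes(toy_test, toy_predictions)
#> [1] 2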
To test our model's scores, we do the following:
predictions <- predict(fit, data_test, se.fit = TRUE)

score_proportions(data_test, predictions)
#> [1] 10.66

score_votes(data_test, predictions)
#> [1] 216

nrow(data_test)
#> [1] 241
In the case of the score_votes() function, the measure by itself tells us how well we are doing with our predictions, since we can take the number of correct predictions (the output of the function call, which is 216) and divide it by the number of observations (rows) in the data_test object (which is 241). This gives us an accuracy of about 90% (216 / 241 ≈ 0.896). This means that if we are given the data from the regressors but we don't know how a ward actually voted, about 90% of the time we would provide a correct prediction for whether it wanted to leave or remain in the EU. This is pretty good if you ask me.
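Using the objects from above, that computation is a one-liner (the printed value is just 216 / 241):

# Accuracy: correct predictions over the total number of test observations
score_votes(data_test, predictions) / nrow(data_test)
#> [1] 0.8962656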
In the case of the score_proportions() function, since we're using a more abstract measure of how well we're doing, we would like to compare it against other models' scores to get a relative sense of this model's predictive power, and that's exactly what we'll do in the following sections.