Methods for comparing data mining models
How do we compare the relative performance of several data mining models? Previously, we discussed some basic model evaluation methods and metrics. Now we delve into several more: ROC curves, the Kappa statistic, mean squared error, mean absolute error, relative squared error, and correlation, each discussed below.
ROC curves
Receiver Operating Characteristic (ROC) curves plot the true positive rate against the false positive rate as the decision threshold varies, thereby characterising the trade-off between hits and false alarms.
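As an illustration, here is a minimal sketch of how the points on a ROC curve can be computed by sweeping the decision threshold across a classifier's scores. This is plain Python; the function name and toy data are hypothetical, not from any particular library.

```python
def roc_points(y_true, scores):
    """Compute (false positive rate, true positive rate) pairs by
    sweeping the decision threshold over every distinct score."""
    pos = sum(y_true)            # total actual positives
    neg = len(y_true) - pos      # total actual negatives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # Predict positive whenever the score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points

# Hypothetical labels and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_points(labels, scores))
```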
Kappa statistic
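The Kappa statistic measures the agreement between a model's predicted classes and the actual classes, corrected for the agreement that would be expected by chance alone: kappa = (observed agreement − chance agreement) / (1 − chance agreement). A value of 1 indicates perfect agreement, while 0 indicates agreement no better than chance. Below is a minimal sketch computing it from a confusion matrix; the function name and toy matrix are hypothetical.

```python
def kappa(confusion):
    """Kappa statistic from a square confusion matrix
    (rows = actual classes, columns = predicted classes)."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Chance agreement: sum over classes of (row marginal * column marginal).
    expected = sum(
        (sum(confusion[i]) / total) * (sum(row[i] for row in confusion) / total)
        for i in range(len(confusion))
    )
    return (observed - expected) / (1 - expected)

# Hypothetical two-class confusion matrix: prints 0.7.
print(kappa([[40, 10],
             [ 5, 45]]))
```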
Mean squared error
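The mean squared error is the average of the squared differences between predicted and actual values; because the errors are squared, large deviations are penalised heavily. A minimal sketch, with hypothetical toy data:

```python
def mean_squared_error(actual, predicted):
    """Average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical numeric predictions: prints 0.8333...
print(mean_squared_error([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))
```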
Mean absolute error
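The mean absolute error averages the magnitudes of the errors without squaring them, so it is less influenced by outliers than the mean squared error. A minimal sketch, using the same hypothetical data:

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Prints 0.6666...
print(mean_absolute_error([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))
```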
Relative squared error
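The relative squared error normalises the model's total squared error by the squared error of a trivial predictor that always predicts the mean of the actual values, so a value below 1 means the model beats that baseline. A minimal sketch, assuming the common definition without the square root:

```python
def relative_squared_error(actual, predicted):
    """Total squared error relative to always predicting the mean."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    baseline = sum((a - mean) ** 2 for a in actual)
    return sse / baseline

# Hypothetical data: prints roughly 0.714.
print(relative_squared_error([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))
```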
Correlation
To use correlation in evaluating a model, one computes the correlation coefficient between the model's predictions and the actual values in the test data set. The model whose correlation coefficient is closest to +1 is deemed the best predictor.
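A minimal sketch of Pearson's correlation coefficient between the predictions and the actual values; the function name and toy data are hypothetical:

```python
def correlation(actual, predicted):
    """Pearson correlation coefficient between two numeric series."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Covariance of the two series (unnormalised).
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    # Standard deviations (unnormalised, so the factors of n cancel).
    std_a = sum((a - mean_a) ** 2 for a in actual) ** 0.5
    std_p = sum((p - mean_p) ** 2 for p in predicted) ** 0.5
    return cov / (std_a * std_p)

print(correlation([3.0, 5.0, 2.5], [2.5, 5.0, 4.0]))
```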
Deciding which performance metric to use
The performance metric is decided on a case-by-case basis, according to the needs of the problem domain. One should consider the costs of each type of error and therefore which errors one is trying hardest to minimise. In addition, some metrics are applicable only to numeric prediction problems, and others only to nominal (classification) problems.
When mining data, it is important to use several learning algorithms, produce several models, and then evaluate each of them. This lends itself to an iterative process, where after evaluation one may:
- Select a different algorithm
- Use different parameters for the algorithm
- Alter the pre-processing used
- Collect new data or different data
- Redefine the problem entirely