# BITC: Ecological Niche Model Evaluation, part 2


Hello, this is Town Peterson again, and this is the second part of the module on model evaluation.
In the first part, we dealt with introductory concepts. In this part, we're going to get into some practicalities
and some performance measures.
Essentially, what we want to do is to start thinking about what a spatial prediction looks like,
and what the sorts of data that we might overlay on that prediction would be like.
So, I've drawn a very simple map. You can imagine this as being latitude and longitude.
And this black line outlines what would be the true distribution of the species if we were to know it.
Now, we do our modeling exercise, and we produce a prediction, which I have shown as this blue line.
Now, we overlay our independent evaluation data set, and that's this set of X's.
Now, there might be some X's here and here, but you can see that those are going to cause us some confusion.
So, that's literally what we call this ... a confusion matrix, which I've summarized here.
Within the truth, those are actual presences, so that's this column.
And outside of that true distribution are actual absences, and that's this column.
And then [for] our prediction, the same thing ... inside this is predicted present, and outside the blue polygon is predicted absent.
And so that essentially leaves us with four regions: this region here, a, is predicted as part of the species' distribution, and in truth it is;
and so a is essentially correct prediction of presence.
This area out here is predicted as not being suitable for the species and not being part of its distribution,
and, in truth, it is not.
So, d also is essentially correct prediction, but in this case of absence.
Then, we get into the two components of error ... we have areas that are predicted to be part of the species' distribution,
but are not, so that is here, b ...
and we have areas that are in truth part of the species' distribution, but are not predicted as such,
so that is c. So, essentially, a and d are correct predictions, and b and c are incorrect predictions,
but in different parts of this map.
OK, so that's the confusion matrix, and we need to pay a lot of attention to the confusion matrix because it
ends up being the basis for a lot of the methods that get used, and I will show you some of this.
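Since the confusion matrix underlies so many of these methods, here is a minimal sketch in Python of how the four cells might be tallied from paired truth/prediction values. The function name `confusion_counts` and the example site data are hypothetical; the lecture itself presents no code.

```python
def confusion_counts(truth, prediction):
    """Tally the four confusion-matrix cells from paired truth/prediction
    values (True = present). Returns (a, b, c, d)."""
    a = sum(t and p for t, p in zip(truth, prediction))          # presence correctly predicted
    b = sum(p and not t for t, p in zip(truth, prediction))      # absence predicted present (commission)
    c = sum(t and not p for t, p in zip(truth, prediction))      # presence predicted absent (omission)
    d = sum(not t and not p for t, p in zip(truth, prediction))  # absence correctly predicted
    return a, b, c, d

# Five hypothetical evaluation sites:
truth      = [True, True, True, False, False]
prediction = [True, False, True, True, False]
print(confusion_counts(truth, prediction))  # (2, 1, 1, 1)
```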
Let's start out ... remember the difference between performance measures and significance measures?
Performance is asking, essentially, how well is this prediction anticipating the truth,
and significance is asking questions about whether you are doing better than random expectations.
Let's start with some of the common performance measures ...
We can measure omission error ... as c over (a + c).
And this is very simple: it's the proportion of actual presences that are not predicted correctly.
And so right away we can see the complement of this, which is the commission error,
and (you can guess) it's going to be b divided by (b + d).
So, essentially, that is ... of all of these absences ... and remember all of the caveats about absences ...
of all of these absences (places where the species is not), how many of them are mispredicted?
Now, we can look at these things together, and we can ask essentially how many of our test points
are correctly predicted versus not, so we can have a correct classification rate (CCR) equal to
(a + d) divided by (a + b + c + d),
and that would give us an overall correct classification rate.
You will see in the literature many other measures that are based on the confusion matrix,
and the most common one that you will see is kappa. Kappa is essentially
just a measure of correct classification rate above and beyond random expectations.
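As a sketch of that idea, Cohen's kappa can be computed from the same four counts by comparing observed agreement against the agreement expected by chance from the marginal totals. The counts used here are hypothetical.

```python
def kappa(a, b, c, d):
    """Cohen's kappa: correct classification above and beyond random
    expectations, computed from confusion-matrix counts."""
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement, from the row and column marginal totals:
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)

print(kappa(40, 10, 5, 45))  # approximately 0.7 for these hypothetical counts
```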
So those are a bunch of performance measures. They are based on the confusion matrix, but I want you to notice one thing.
We can use the correct classification rate as the illustration.
Notice that correct prediction or incorrect prediction of presences and of absences has the same weight.
So, it's just as much of an error to mispredict an absence as a presence as it is to mispredict a presence as an absence.
This has pretty serious implications, because (if you were paying attention in the first module), you know that this kind of error
is much less of a concern than that kind of error.
Which is to say, omission error is much more serious than commission error.
So, that is a very strong reason why we should avoid performance measures that weight b and c equally.
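To illustrate the problem, consider two hypothetical models (the counts are invented for this sketch): one makes only commission errors, the other only omission errors, yet the correct classification rate cannot tell them apart.

```python
def ccr(a, b, c, d):
    # Correct classification rate: (a + d) / (a + b + c + d)
    return (a + d) / (a + b + c + d)

# Hypothetical counts for two models evaluated on 100 sites:
model_x = (40, 10, 0, 50)   # a, b, c, d: 10 commission errors, no omission
model_y = (30, 0, 10, 60)   # no commission, 10 omission errors

print(ccr(*model_x))  # 0.9
print(ccr(*model_y))  # 0.9 -- identical CCR, yet model Y misses 10 real presences
```

Because b and c carry equal weight in the denominator, CCR cannot distinguish a model that overpredicts from one that omits true presences, which is the more serious failure here.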
OK, that's essentially a use of a statistic that doesn't fit well with the conceptual basis of what it is that we are trying to test.
So, that gives you some of the practicalities regarding testing species distribution models and ecological niche models,
and in a moment we will be back with a third piece of this module to talk about some significance tests.