You have built a binary classification model to predict if a tree in a forest is sick or healthy. A prediction with an output value of 1.0 means that the model is 100% confident that a tree is healthy.
You drive around the forest and notice only a few healthy trees: more than 99% of the forest is sick. You decide to use the accuracy metric to evaluate your tree prediction model.
Is this the correct choice?
Choose the correct answer from the options below.
Explanations for each answer:
Yes is incorrect. Accuracy is only a reliable metric when the dataset is balanced, with roughly equal numbers of positive and negative cases.
No is correct. With more than 99% of the trees sick, a model that labels every tree as sick scores over 99% accuracy without ever identifying a healthy tree.
It's impossible to say is incorrect. There are clear guidelines for when accuracy is a valid choice.
Bias in datasets, such as the class imbalance in this forest, is a recurring problem in machine learning, but several binary classification metrics are more or less immune to it. They include the F1 score, the ROC curve, and the AUC. Pick one of these if you are unsure about the level of bias in your data, and be very careful when using accuracy, precision, or recall individually.
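To see why accuracy misleads here, consider a minimal sketch in plain Python. The numbers are made up for illustration: a hypothetical forest of 1,000 trees with 5 healthy ones (label 1) and 995 sick ones (label 0), and a trivial "model" that calls every tree sick. The helper names `confusion_counts`, `accuracy`, and `f1_score` are assumptions of this example, not part of any library.

```python
# Hypothetical forest: 5 healthy trees (label 1) and 995 sick trees (label 0).
y_true = [1] * 5 + [0] * 995

# A trivial "model" that predicts "sick" for every tree.
y_pred = [0] * len(y_true)


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (healthy) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"accuracy = {acc:.3f}")  # accuracy = 0.995
print(f"F1 score = {f1:.3f}")   # F1 score = 0.000
```

The model never finds a healthy tree, yet its accuracy is 99.5%. The F1 score of 0 exposes the failure because it depends on true positives rather than on the dominant class.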