Statistics Terminology That You Must Be Very Clear About

Stan He
3 min read · Aug 29, 2020

#1: Type I error: False Positive

Often the more common error we get from our tests: the test shows you may have developed a certain ailment, but in fact you don't; or we predict that you will display a certain behavior, but in fact you didn't.

#2: Type II error: False Negative

A less common error in our predictions/tests: the result shows that you don't have a certain ailment, but in fact you do; or we predict that something lacks a certain feature, but in fact it has it.
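To make both error types concrete, here is a minimal sketch in Python (the labels below are purely hypothetical) that counts the four confusion-matrix cells and marks which ones are Type I and Type II errors:

```python
# Hypothetical ground truth and predictions: 1 = "has the ailment", 0 = "does not".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # the facts
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]  # what our test/model said

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives (Type I errors)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives (Type II errors)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives

print(f"TP={tp}, FP={fp} (Type I), FN={fn} (Type II), TN={tn}")
# -> TP=3, FP=1 (Type I), FN=1 (Type II), TN=3
```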

#3. Recall/Sensitivity (To Capture the FACT & Not Miss Out)

How good is your model at capturing the fact?

Suppose you are pregnant. Our test may report either that you are pregnant or that you are not. How good is our test/prediction at capturing the fact that you are pregnant?

Our test result/prediction may show that you are not pregnant when in fact you are (a false negative, Type II error), meaning there will be occasions where we miss your actual pregnancy. The test is not good in that it fails to capture the factual situation on those occasions.

For example, if recall/sensitivity = 0.99, then out of 100 times we test someone who is in fact pregnant, 99 times we will capture the pregnancy. But 1 time out of 100 we will miss it and predict that she is not pregnant. (That 1% is the miss rate, i.e. the false negative rate or Type II error rate.)

NOTE: For a serious disease like a malignant tumor, we do not want to miss the disease. Therefore, we do everything we can to lower the miss rate (false negative rate) and increase recall, and a test/model with high recall/sensitivity is more desirable than one with low recall.
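As a quick sketch of the arithmetic, with hypothetical counts chosen to match the 0.99 example above, recall is simply TP / (TP + FN):

```python
# Hypothetical counts for 100 people who are in fact pregnant.
tp = 99   # actually pregnant, test said pregnant
fn = 1    # actually pregnant, test said not pregnant (the miss)

recall = tp / (tp + fn)        # fraction of actual positives we captured
miss_rate = fn / (tp + fn)     # false negative rate = 1 - recall

print(recall, miss_rate)       # -> 0.99 0.01
```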

#4. Precision (To make correct PREDICTIONS)

How good is your model at making accurate/correct positive predictions?

If the test result is positive, how certain are we that this positive result is in fact correct? How confident are we in our predicted/tested positive result?

Our test/prediction may show that you are pregnant when in fact you are not (a false positive, Type I error), meaning there will be occasions where we incorrectly reach a positive prediction. The result is not good in that it makes an incorrect positive prediction (a false positive).

For example, if our result shows that you are pregnant, how sure are we about your pregnancy? If precision = 0.95, then out of 100 positive predictions we make, 95 are correct and about 5 are incorrect, meaning our test says you are pregnant but in fact you are not. The residual 0.05 consists of false positives: cases that were actually negative but were falsely labelled/tested/predicted as positive.
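Again as a quick sketch, with hypothetical counts matching the 0.95 example, precision is TP / (TP + FP):

```python
# Hypothetical counts for 100 positive predictions.
tp = 95   # test said pregnant, and the person is pregnant
fp = 5    # test said pregnant, but the person is not (Type I errors)

precision = tp / (tp + fp)               # fraction of positive predictions that are correct
false_discovery_rate = fp / (tp + fp)    # the residual 0.05 in the example above

print(precision, false_discovery_rate)   # -> 0.95 0.05
```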
