Classification accuracy - a random model

Emil Bashkansky, Industrial Engineering, BRAUDE College of Engineering, Haifa, Israel (ebashkan@braude.ac.il)
Tamar Gadrich, Industrial Engineering, Braude College of Engineering, Haifa, Israel
Yariv Marmor, Industrial Engineering, Braude College of Engineering, Haifa, Israel


Classifying a property of the objects under study into one of K mutually exclusive categories, which together form an exhaustive spectrum (scale) of that property, can be regarded as a categorical measurement. The results of such classification are categorical data. The classifier can be a laboratory, a human, a device, an algorithm, or a combination of these. Classification accuracy is particularly crucial where the cost of misclassification is high.



Classification precision has been discussed by various authors using methods analogous to ANOVA but adapted to categorical data: CATANOVA (for nominal scales) and ORDANOVA (for ordinal scales).
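As a rough illustration of the ANOVA-like idea for nominal data, the sketch below splits a Gini-type total variation of the pooled classifications into a within-classifier and a between-classifier part, in the spirit of CATANOVA. The function names and the counts are hypothetical, and the exact statistics used in the cited work may differ.

```python
def gini_variation(counts):
    """Gini-type variation of one set of category counts: n/2 - sum(n_k^2) / (2n)."""
    n = sum(counts)
    return n / 2 - sum(c * c for c in counts) / (2 * n)


def catanova(table):
    """Split total variation into within- and between-classifier parts.

    table[j][k] = number of items that classifier j assigned to category k.
    Returns (total, within, between).
    """
    n_categories = len(table[0])
    # Pool all classifiers' counts, category by category.
    pooled = [sum(row[k] for row in table) for k in range(n_categories)]
    total = gini_variation(pooled)
    within = sum(gini_variation(row) for row in table)
    between = total - within
    return total, within, between


# Hypothetical data: 3 classifiers, 10 items each, K = 3 categories.
example_table = [[8, 2, 0], [5, 4, 1], [2, 3, 5]]
total, within, between = catanova(example_table)
print(total, within, between)
```

A between-classifier part close to zero would suggest that the classifiers distribute items over the categories in nearly the same way; a large share would point to classifier-to-classifier variability.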



We present a model in which classifiers are randomly selected from a larger population and the goal is to make inferences about that entire population (random-effects ANOVA), e.g., when laboratories are selected at random from a group of institutions and the purpose of the study is to assess diagnostic variability. In such cases the classifier is a laboratory, and we are interested in assessing the distribution of the property under study across the whole population. Building a sound statistical model for this problem requires three things:




  1. A definition of a classifier’s classification ability


  2. An assumption about how these abilities are distributed among the classifiers in the entire population


  3. A method for estimating the parameters of this distribution (consensus) from the classifications made by the sampled classifiers.
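The three ingredients above can be sketched in a small simulation. Everything specific here is an assumption made for illustration and is not taken from the presented model: a classifier's ability is represented as a probability vector over the K categories for items of one fixed true category, abilities are assumed to follow a Dirichlet distribution across the population, and the consensus is estimated by pooled relative frequencies. All names and parameter values are hypothetical.

```python
import random


def draw_ability(alpha, rng):
    """Ingredient 2 (assumed): draw one classifier's ability from Dirichlet(alpha)."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def classify(ability, n_items, rng):
    """Ingredient 1 (assumed): the classifier assigns each item to category k
    with probability ability[k]; returns the resulting category counts."""
    counts = [0] * len(ability)
    for k in rng.choices(range(len(ability)), weights=ability, k=n_items):
        counts[k] += 1
    return counts


def estimate_consensus(tables):
    """Ingredient 3 (assumed): pooled relative frequencies as a simple
    estimate of the population-mean ability."""
    n_categories = len(tables[0])
    grand_total = sum(sum(t) for t in tables)
    return [sum(t[k] for t in tables) / grand_total for k in range(n_categories)]


rng = random.Random(0)
alpha = [8.0, 1.5, 0.5]  # hypothetical; population mean is [0.80, 0.15, 0.05]
tables = [classify(draw_ability(alpha, rng), 50, rng) for _ in range(200)]
consensus = estimate_consensus(tables)
print(consensus)
```

With 200 sampled classifiers the pooled estimate should land near the population mean implied by alpha; with only a handful of classifiers it would be much noisier, which is the practical reason for modelling the classifier as a random factor.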



For the metrological concept of trueness, two steps are required:




  1. to determine what the “true”/“ideal” classification is


  2. to determine the distance measure for deviation from that “true”/“ideal” classification.
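One possible reading of these two steps, given only as a sketch: take the ideal classification to be the one that always assigns an item to its true category, and measure the deviation of a classifier's ability vector from it with a standard distance between distributions. The two distances below (total variation for a nominal scale, sum of absolute differences of cumulative probabilities for an ordinal scale) are illustrative choices and are not necessarily those used in the presented model; the function names and numbers are hypothetical.

```python
def ideal(true_category, n_categories):
    """Ideal classification: all probability mass on the true category."""
    return [1.0 if k == true_category else 0.0 for k in range(n_categories)]


def nominal_distance(ability, true_category):
    """Total variation distance from the ideal; ignores category order."""
    target = ideal(true_category, len(ability))
    return 0.5 * sum(abs(p - q) for p, q in zip(ability, target))


def ordinal_distance(ability, true_category):
    """Sum of absolute differences of cumulative probabilities; errors that
    land further from the true category on the scale are penalised more."""
    target = ideal(true_category, len(ability))
    distance, cum_p, cum_q = 0.0, 0.0, 0.0
    for p, q in zip(ability, target):
        cum_p += p
        cum_q += q
        distance += abs(cum_p - cum_q)
    return distance


# Two hypothetical classifiers, both wrong 30% of the time (true category 0).
near_miss = [0.7, 0.3, 0.0]  # errors fall in the adjacent category
far_miss = [0.7, 0.0, 0.3]   # errors fall two categories away
print(nominal_distance(near_miss, 0), nominal_distance(far_miss, 0))
print(ordinal_distance(near_miss, 0), ordinal_distance(far_miss, 0))
```

In this example the nominal distance treats both classifiers alike, while the ordinal distance is larger for the classifier whose errors fall further along the scale.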



The aim of this presentation is to propose a reasonable statistical model for the analysis of classification accuracy, extending the previously developed fixed-factor model to the case of random factors.






 




Short Biography of Presenting Author


Emil Bashkansky, D.Sc., is professor emeritus of the Industrial Engineering & Management Department at the BRAUDE College of Engineering. His current teaching and research focus on quality engineering and metrology. He is the author of almost a hundred peer-reviewed papers in physics, statistics and engineering, published in scientific journals and conference proceedings. He is a member of several international projects and conference program committees, and the founder and permanent chair of the annual Galilee quality conference. He was honored with a "Certificate of Appreciation" (2014) and a "Lifetime Achievement Award" (2020) from the Israeli Society for Quality for his contribution to the advancement of quality in Israel.



 


Organized & Produced by:

www.isranalytica.org.il

POB 4043, Ness Ziona 70400, Israel
Tel.: +972-8-9313070, Fax: +972-8-9313071
Site: www.bioforum.org.il,
E-mail: hagit@bioforum.co.il