Publications:Categorizing Normal and Pathological Voices: Automated and Perceptual Categorization

From ISLAB/CAISR

Title Categorizing Normal and Pathological Voices: Automated and Perceptual Categorization
Author Virgilijus Uloza and Antanas Verikas and Marija Bacauskiene and Adas Gelzinis and Ruta Pribuisiene and Marius Kaseta and Viktoras Saferis
Year 2011
PublicationType Journal Paper
Journal Journal of Voice
HostPublication
Conference
DOI http://dx.doi.org/10.1016/j.jvoice.2010.04.009
Diva url http://hh.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:352178
Abstract Objectives: The aims of the present study were to evaluate the accuracy of an automated voice categorization system that classifies voice signal samples into healthy and pathological classes, and to compare it with the classification accuracy attained by human experts. Material and Methods: We investigated the effectiveness of 10 different feature sets in classifying voice recordings of the sustained phonation of the vowel sound /a/ into one healthy and two pathological voice classes, and proposed a new approach to building a sequential committee of support vector machines (SVMs) for the classification. By applying “genetic search” (a search technique used to find solutions to optimization problems), we determined the optimal values of the hyper-parameters of the committee and the feature sets that provided the best performance. Four experienced clinical voice specialists, who evaluated the same voice recordings, served as experts. The “gold standard” for classification was a clinically and histologically proven diagnosis. Results: The committee yielded a considerable improvement in classification accuracy over classifiers based on a single feature type. In experimental investigations using 444 voice recordings from 148 subjects (three recordings per subject), we obtained a correct classification rate (CCR) of over 92% when classifying into the healthy and pathological voice classes, and over 90% when classifying into three classes (healthy voice, nodular lesion voice, and diffuse lesion voice). The CCR obtained from the human experts was about 74% and 60%, respectively. Conclusion: When operating under the same experimental conditions, the automated voice discrimination technique based on a sequential committee of SVMs was considerably more effective than the human experts.
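
The correct classification rate (CCR) reported in the abstract is the share of recordings assigned to their reference class. The following is a minimal Python sketch of that arithmetic only, not the authors' implementation. The function name `correct_classification_rate`, the helper `to_binary`, and the toy labels are assumptions made for illustration and are not data from the study.

```python
def correct_classification_rate(true_labels, predicted_labels):
    """Return the fraction of recordings whose predicted class matches the reference class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one recording is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def to_binary(label):
    """Collapse the two pathological classes into one (hypothetical helper)."""
    return "healthy" if label == "healthy" else "pathological"


# Data set size stated in the abstract: 148 subjects, three recordings each.
n_recordings = 148 * 3

# Hypothetical toy labels, NOT results from the study.
reference = ["healthy", "nodular", "diffuse", "healthy", "nodular"]
predicted = ["healthy", "nodular", "healthy", "healthy", "diffuse"]

# Three-class CCR: healthy vs. nodular vs. diffuse.
ccr_three_class = correct_classification_rate(reference, predicted)

# Two-class CCR: healthy vs. pathological, using the same predictions.
ccr_two_class = correct_classification_rate(
    [to_binary(label) for label in reference],
    [to_binary(label) for label in predicted],
)
```

In this toy case the two-class CCR is higher than the three-class CCR, because confusing one pathological class with the other still counts as correct once both are merged into a single class.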