Comparative analysis of instance selection algorithms for instance-based classifiers in the context of medical decision support

Maciej A. Mazurowski, Jordan M. Malof, Georgia D. Tourassi

Research output: Contribution to journal › Article › peer-review

Abstract

When constructing a pattern classifier, it is important to make the best use of the instances (a.k.a. cases, examples, patterns or prototypes) available for its development. In this paper we present an extensive comparative analysis of algorithms that, given a pool of previously acquired instances, attempt to select those that will be the most effective to construct an instance-based classifier in terms of classification performance, time efficiency and storage requirements. We evaluate seven previously proposed instance selection algorithms and compare their performance to simple random selection of instances. We perform the evaluation using a k-nearest neighbor classifier and three classification problems: one with simulated Gaussian data and two based on clinical databases for breast cancer detection and diagnosis, respectively. Finally, we evaluate the impact of the number of instances available for selection on the performance of the selection algorithms and conduct an initial analysis of the selected instances. The experiments show that for all investigated classification problems, it was possible to reduce the size of the original development dataset to less than 3% of its initial size while maintaining or improving the classification performance. Random mutation hill climbing emerges as the superior selection algorithm. Furthermore, we show that some previously proposed algorithms perform worse than random selection. Regarding the impact of the number of instances available for the classifier development on the performance of the selection algorithms, we confirm that the selection algorithms are generally more effective as the pool of available instances increases. In conclusion, instance selection is generally beneficial for instance-based classifiers as it can improve their performance,
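To make the idea of random mutation hill climbing for instance selection concrete, the following is a minimal sketch, not the paper's implementation. It assumes a fixed subset size, a single-swap mutation, acceptance of any mutation that does not lower accuracy, and accuracy measured on the full pool; the function names (`knn_predict`, `accuracy`, `rmhc_select`), the parameters, and the toy Gaussian data are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def knn_predict(prototypes, labels, x, k=1):
    """Majority vote among the k prototypes closest to x (squared Euclidean distance)."""
    order = sorted(
        range(len(prototypes)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(prototypes[i], x)),
    )
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(selected, X, y, k=1):
    """Fraction of the pool classified correctly using only the selected instances."""
    protos = [X[i] for i in selected]
    proto_labels = [y[i] for i in selected]
    hits = sum(knn_predict(protos, proto_labels, x, k) == t for x, t in zip(X, y))
    return hits / len(X)


def rmhc_select(X, y, n_select, n_iter=200, k=1, seed=0):
    """Hill climbing over index subsets of fixed size n_select (illustrative assumptions)."""
    if not 0 < n_select < len(X):
        raise ValueError("n_select must be between 1 and len(X) - 1")
    rng = random.Random(seed)
    selected = rng.sample(range(len(X)), n_select)
    best = accuracy(selected, X, y, k)
    for _ in range(n_iter):
        # Mutation: swap one selected instance for one that is currently unselected.
        candidate = selected[:]
        unused = [i for i in range(len(X)) if i not in candidate]
        candidate[rng.randrange(n_select)] = rng.choice(unused)
        score = accuracy(candidate, X, y, k)
        # Keep the mutation only if accuracy does not decrease.
        if score >= best:
            selected, best = candidate, score
    return selected, best


# Toy two-class Gaussian data (hypothetical, for demonstration only).
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(40)]
y = [0] * 40 + [1] * 40

subset, acc = rmhc_select(X, y, n_select=4, n_iter=100)
```

In this sketch the selected instances are also part of the evaluation pool, which flatters the score; a held-out validation set would be a more careful choice.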

Original language: English
Pages (from-to): 473-489
Number of pages: 17
Journal: Physics in Medicine and Biology
Volume: 56
Issue number: 2
State: Published - Jan 21 2011
