Performance evaluation of features for gene essentiality prediction
Abstract
Essential genes are the subset of genes that an organism requires for growth and survival, and
they are responsible for phenotypic changes when their activity is altered. They have been used as
drug targets and disease control agents, among other applications. Essential genes have been widely identified,
especially in microorganisms, owing to extensive experimental studies of organisms such as Escherichia coli and
Saccharomyces cerevisiae. The experimental approach is a reliable way to identify essential genes.
However, it is complex, costly, and labour- and time-intensive. Computational approaches have therefore been
developed to complement the experimental approach and reduce the resources required for essentiality
identification experiments. Machine learning approaches have been widely used to predict essential genes in
model organisms using different categories of features, with varying degrees of accuracy and performance.
However, previous studies have not established which categories of features provide the most
distinguishing power in machine learning essentiality prediction. This study therefore evaluates the
discriminating strength of the major categories of features used in the essential gene prediction task, as well as
the factors responsible for effective computational prediction. Four categories of features were considered, and
the k-fold cross-validation technique was used to build the classification model. Our results show
that ontology features, with an AUROC score of 0.936, have the most discriminating power for classifying essential
and non-essential genes. This study concludes that adding more ontology-related features will further improve the
performance of the machine learning approach, and that sensitivity, precision and AUPRC are realistic measures
of performance in essentiality prediction.
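
The performance measures named above (sensitivity, precision, AUROC and AUPRC) can be illustrated with a minimal, standard-library Python sketch. It is not the study's implementation: the function names, the 0.5 decision threshold, the step-wise average-precision estimate of AUPRC, and the toy labels and scores are all assumptions made for illustration, and the classifier and k-fold splitting are left out.

```python
# Illustrative sketch only: helper names, threshold and toy data are
# hypothetical and are not taken from the study.

def confusion_counts(labels, scores, threshold=0.5):
    """Return (tp, fp, fn, tn) when scores >= threshold are called essential."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Fraction of truly essential genes that were predicted essential."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Fraction of predicted-essential genes that are truly essential."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of genes, which is
    acceptable for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise AUPRC estimate: mean precision at the rank of each positive.

    Assumes no tied scores; with ties the result depends on sort order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical toy data: 2 essential genes (label 1) among 10.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.35, 0.2, 0.1, 0.15, 0.05]

tp, fp, fn, tn = confusion_counts(toy_labels, toy_scores)
print("sensitivity:", sensitivity(tp, fn))
print("precision:  ", precision(tp, fp))
print("AUROC:      ", auroc(toy_labels, toy_scores))
print("AUPRC (AP): ", average_precision(toy_labels, toy_scores))
```

In a k-fold setting, these functions would typically be applied to the held-out predictions of each fold and the results averaged across folds.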
Keywords
Q Science (General), QA75 Electronic computers. Computer science