Essential gene prediction in Drosophila melanogaster using machine learning approaches based on sequence and functional features

dc.contributor.authorAromolaran, Olufemi
dc.contributor.authorBeder, Thomas
dc.contributor.authorOswald, Marcus
dc.date.accessioned2023-08-31T13:41:21Z
dc.date.available2023-08-31T13:41:21Z
dc.date.issued2020-03-10
dc.description.abstractGenes are termed essential if their loss of function compromises viability or results in profound loss of fitness. On the genome scale, these genes can be determined experimentally by employing RNAi or knockout screens, but this is very resource intensive. Computational methods for essential gene prediction can overcome this drawback, particularly when intrinsic features (e.g. from the protein sequence) as well as extrinsic features (e.g. from transcription profiles) are considered. In this work, we employed machine learning to predict essential genes in Drosophila melanogaster. A total of 27,340 features were generated based on a large variety of different aspects comprising nucleotide and protein sequences, gene networks, protein-protein interactions, evolutionary conservation and functional annotations. Employing cross-validation, we obtained an excellent prediction performance. The best model for D. melanogaster achieved a ROC-AUC of 0.90, a PR-AUC of 0.30 and an F1 score of 0.34. Our approach considerably outperformed a benchmark method in which only features derived from the protein sequences were used (P < 0.001). Investigating which features contributed to this success, we found that all categories of features contributed, most prominently network topological, functional and sequence-based features. To evaluate our approach, we performed the same workflow for essential gene prediction in human and achieved a ROC-AUC of 0.97, a PR-AUC of 0.73 and an F1 score of 0.64.en_US
dc.description.sponsorshipACE: Applied Informatics and Communicationen_US
dc.identifier.issn2001-0370
dc.identifier.urihttps://datad.aau.org/handle/123456789/2105
dc.language.isoenen_US
dc.publisherComputational and Structural Biotechnology Journalen_US
dc.relation.ispartofseriesComputational and Structural Biotechnology Journal;18 (2020)
dc.subjectMachine-learningen_US
dc.subjectEssential genesen_US
dc.subjectLethalen_US
dc.subjectDigital Developmenten_US
dc.subjectCovenant Universityen_US
dc.subjectACE: Applied Informatics and Communicationen_US
dc.subjectNigeriaen_US
dc.subjectEssentiality predictionen_US
dc.subjectHomo sapiensen_US
dc.titleEssential gene prediction in Drosophila melanogaster using machine learning approaches based on sequence and functional featuresen_US
dc.typeArticleen_US

Files

Original bundle

Name:
Essential.pdf
Size:
2.17 MB
Format:
Adobe Portable Document Format
Description:
Main article

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission

Collections