Predicting host dependency factors of pathogens in Drosophila melanogaster using machine learning

dc.contributor.author: Aromolaran, Olufemi
dc.contributor.author: Beder, Thomas
dc.contributor.author: Adedeji, Eunice
dc.date.accessioned: 2023-08-30T13:23:19Z
dc.date.available: 2023-08-30T13:23:19Z
dc.date.issued: 2021-08-09
dc.description.abstract: Pathogens causing infections, and particularly when invading the host cells, require the host cell machinery for efficient regeneration and proliferation during infection. For their life cycle, host proteins are needed and these Host Dependency Factors (HDF) may serve as therapeutic targets. Several attempts have approached screening for HDF producing large lists of potential HDF with, however, only marginal overlap. To get consistency into the data of these experimental studies, we developed a machine learning pipeline. As a case study, we used publicly available lists of experimentally derived HDF from twelve different screening studies based on gene perturbation in Drosophila melanogaster cells or in vivo upon bacterial or protozoan infection. A total of 50,334 gene features were generated from diverse categories including their functional annotations, topology attributes in protein interaction networks, nucleotide and protein sequence features, homology properties and subcellular localization. Cross-validation revealed an excellent prediction performance. All feature categories contributed to the model. Predicted and experimentally derived HDF showed a good consistency when investigating their common cellular processes and function. Cellular processes and molecular function of these genes were highly enriched in membrane trafficking, particularly in the trans-Golgi network, cell cycle and the Rab GTPase binding family. Using our machine learning approach, we show that HDF in organisms can be predicted with high accuracy evidencing their common investigated characteristics. We elucidated cellular processes which are utilized by invading pathogens during infection. Finally, we provide a list of 208 novel HDF proposed for future experimental studies. (en_US)
dc.description.sponsorship: ACE: Applied Informatics and Communication (en_US)
dc.identifier.issn: 2001-0370
dc.identifier.uri: https://datad.aau.org/handle/123456789/2093
dc.language.iso: en (en_US)
dc.publisher: Computational and Structural Biotechnology Journal (en_US)
dc.relation.ispartofseries: Computational and Structural Biotechnology Journal;19 (2021)
dc.subject: Host factors (en_US)
dc.subject: Bacteria (en_US)
dc.subject: Infection (en_US)
dc.subject: Knockout screen (en_US)
dc.subject: Machine learning (en_US)
dc.subject: Drosophila (en_US)
dc.subject: Yvonne Ajamma (en_US)
dc.subject: Jelili Oyelade (en_US)
dc.subject: Nigeria (en_US)
dc.subject: Digital Development (en_US)
dc.subject: Covenant University (en_US)
dc.subject: ACE: Applied Informatics and Communication (en_US)
dc.title: Predicting host dependency factors of pathogens in Drosophila melanogaster using machine learning (en_US)
dc.type: Article (en_US)

Files

Original bundle

Name: dependency.pdf
Size: 2.21 MB
Format: Adobe Portable Document Format
Description: Main article

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections