Deep Contextual Representation Learning for Identifying Essential Proteins via Integrating Multisource Protein Features
Abstract
Essential proteins carry out biological functions that are necessary for the survival of organisms. Computational methods for recognizing essential proteins can reduce experimental workload and provide biologists with candidate proteins. However, existing methods do not identify essential proteins efficiently, and they generally do not make full use of amino acid sequence information to improve recognition performance. In this work, we propose DeepIEP, an end-to-end deep contextual representation learning framework that automatically learns biologically discriminative features from heterogeneous protein network information without prior knowledge. Specifically, the model attaches amino acid sequences as attributes of each protein node in the protein interaction network and then automatically learns topological features from the network using graph embedding algorithms. Next, multi-scale convolutions and gated recurrent unit networks are used to extract contextual features from gene expression profiles. Extensive experiments confirm that DeepIEP is an effective and efficient feature learning framework for identifying essential proteins, and that contextual features of protein sequences can improve recognition performance.
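
As an illustration of the contextual feature extractor described above, the following minimal sketch passes a gene expression profile through convolutions at several kernel widths and then through a gated recurrent unit. It is not the DeepIEP implementation: the function names, the fixed averaging kernels of widths 1, 3, and 5 (standing in for learned filters), the hidden size, the random untrained weights, and the omission of bias terms are all assumptions made for illustration.

```python
import math
import random


def conv1d_same(signal, kernel):
    """Slide an odd-width kernel over the signal with zero padding,
    so the output has the same length as the input."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(padded[t + j] * kernel[j] for j in range(k))
            for t in range(len(signal))]


def multi_scale_features(signal, kernels):
    """Apply every kernel and stack the results: one feature vector per time step."""
    maps = [conv1d_same(signal, kernel) for kernel in kernels]
    return [[m[t] for m in maps] for t in range(len(signal))]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gru_step(x, h, p):
    """One GRU update (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], reset_h))]
    # Blend the previous state and the candidate state with the update gate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_profile(profile, hidden_size=4, seed=0):
    """Encode an expression profile into a fixed-size contextual vector.
    Weights are random and untrained (assumption, for illustration only)."""
    rng = random.Random(seed)
    # Hypothetical fixed averaging kernels; a trained model would learn these.
    kernels = [[1.0 / k] * k for k in (1, 3, 5)]
    steps = multi_scale_features(profile, kernels)

    def init(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {name: init(hidden_size, len(kernels)) for name in ("Wz", "Wr", "Wh")}
    params.update({name: init(hidden_size, hidden_size) for name in ("Uz", "Ur", "Uh")})

    h = [0.0] * hidden_size
    for x in steps:
        h = gru_step(x, h, params)
    return h


if __name__ == "__main__":
    # Hypothetical expression values over eight time points.
    profile = [0.2, 0.5, 0.9, 0.7, 0.3, 0.1, 0.4, 0.8]
    print(encode_profile(profile))
```

The final hidden state summarizes the whole profile in a vector of fixed length, which could then be combined with topological features from a graph embedding before classification. How the features are fused in DeepIEP is not specified in this abstract.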