[Journal Article]


HiFun: homology independent protein function prediction by a novel protein-language self-attention model

Authors:
Jun Wu; Haipeng Qing; Jian Ouyang; Jiajia Zhou; Zihao Gao; Christopher E Mason; Zhichao Liu; Tieliu Shi

Publication year: N/A

Pages: N/A
Publisher: Oxford University Press (OUP)


Abstract:

Predicting protein function from amino acid sequence alone is an extremely challenging but important task, especially in the fields of metagenomics and metatranscriptomics, where the number of novel proteins uncovered from new microorganisms has grown exponentially. Many of these proteins have extremely low homology to known proteins and cannot be annotated with homology-based or information-integrative methods. To overcome this problem, we proposed a Homology Independent protein Function annotation method (HiFun) based on a unified deep-learning model, by reassembling the sequence as a protein language. The robustness of HiFun was evaluated using the benchmark datasets and metrics of the CAFA3 challenge. To demonstrate the utility of HiFun, we annotated 2 212 663 unknown proteins and discovered novel motifs in the UHGP-50 catalog. We showed that HiFun can extract latent function-related structural features, which enables it to annotate the functions of non-homologous proteins. HiFun can substantially improve the annotation of newly discovered proteins and expand our understanding of microorganisms' adaptation to various ecological niches. Moreover, we provide a free and accessible web service at http://www.unimd.org/HiFun that requires only protein sequences as input, offering researchers an efficient and practical platform for predicting protein functions.



Keywords:

N/A


Journal
Briefings in Bioinformatics
ISSN: 1467-5463
Source: Oxford University Press (OUP)