09/12/2021
We are excited to share our latest article: "Protein function prediction for newly sequenced organisms", published today in Nature Machine Intelligence: https://rdcu.be/cCVsT
In this paper we introduce S2F (Sequence to Function), a machine learning method that predicts protein function for organisms for which only sequence information is available.
The twist? We go beyond homology to do it!
By transferring protein-protein relationships from well-studied organisms, we put together a functionally coherent network of proteins.
We use InterPro and HMMER predictions as a starting point, and these predictions are then propagated and amplified across the network.
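For readers curious what "propagating predictions on a network" can look like, here is a minimal sketch of a generic label-propagation scheme. It is not the S2F implementation: the function name `diffuse`, the parameters `alpha` and `iterations`, the symmetric normalisation, and the toy four-protein network are all assumptions made for illustration.

```python
import numpy as np

def diffuse(adjacency, seed_scores, alpha=0.8, iterations=50):
    """Spread seed scores over a network (generic label propagation).

    Hypothetical helper for illustration only, not the S2F code.
    """
    A = np.asarray(adjacency, dtype=float)
    degree = A.sum(axis=1)
    degree[degree == 0] = 1.0  # avoid division by zero for isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalisation: D^(-1/2) A D^(-1/2)
    W = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    Y = np.asarray(seed_scores, dtype=float)
    F = Y.copy()
    for _ in range(iterations):
        # Mix scores from neighbours with the original seed scores
        F = alpha * (W @ F) + (1 - alpha) * Y
    return F

# Toy network: four proteins in a chain, 0 - 1 - 2 - 3
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Only protein 0 has a seed prediction for some function
seed = [1.0, 0.0, 0.0, 0.0]
scores = diffuse(adjacency, seed)
print(scores)
```

In this toy example, proteins with no seed prediction still receive a score, and the score decreases with distance from the seeded protein.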
In development for many years, the project is open source: https://github.com/paccanarolab/s2f
(PRs are welcome!)
An interactive explorer of our extensive testing is available on the project website:
https://paccanarolab.org/s2f/
Predicting the function of proteins in newly sequenced organisms is a challenging problem. Mateo Torres et al. present here a method to transfer the functional relations from known organisms and improve the prediction using network diffusion.