Fast protein structure searching using structure graph embeddings

2 Jan 2025

Comparing and searching protein structures independently of primary sequence has proved useful for remote homology detection, function annotation and protein classification. Fast and accurate methods to search with structures will be essential to make use of the vast structure databases that have recently become available, in the same way that fast protein sequence searching underpins much of bioinformatics. We train a simple graph neural network using supervised contrastive learning to learn a low-dimensional embedding of protein structure. The method, called Progres, is available as software and as a web server. It has accuracy comparable to the best current methods and can search the AlphaFold database TED domains in a tenth of a second per query on CPU. DOI: 10.1101/2022.11.28.518224
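To illustrate why searching in an embedding space can be fast, here is a minimal sketch of nearest-neighbour search over fixed-length vectors. It is not the Progres implementation or API: the function names, the toy 4-dimensional vectors, the domain identifiers and the use of cosine similarity as the score are all assumptions made for illustration. In practice the embeddings would come from the trained graph neural network.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def search(query, database, top_k=3):
    """Return the top_k (name, score) pairs ranked by cosine similarity.

    `database` maps a hypothetical domain identifier to its embedding.
    """
    q = normalize(query)
    scored = [
        (name, sum(a * b for a, b in zip(q, normalize(emb))))
        for name, emb in database.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumed values, not real model output).
toy_db = {
    "domain_a": [1.0, 0.0, 0.0, 0.0],
    "domain_b": [0.9, 0.1, 0.0, 0.0],
    "domain_c": [0.0, 0.0, 1.0, 0.0],
}
hits = search([1.0, 0.05, 0.0, 0.0], toy_db, top_k=2)
```

Because each structure is reduced to one short vector, a query costs one dot product per database entry. If the database embeddings are normalized once and stored, the per-query work drops further, which is the kind of design that makes sub-second searches over large databases plausible.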
