Date |
Aug 4, 2016 |
Speaker |
Florian Krull, Gerrit Korff, Nadia Elghobashi-Meinhardt, and Ernst-Walter Knapp, Free University of Berlin
|
Title |
ProPairs: A Data Set for Protein–Protein Docking
|
Abstract |
ProPairs is a data set of crystal structures of protein complexes
defined as biological assemblies in the Protein Data Bank (PDB), which
are classified as legitimate protein-protein docking complexes by also
identifying the corresponding unbound protein structures in the
PDB. The underlying program selecting suitable protein complexes, also
called ProPairs, is an automated method to extract structures of
legitimate protein docking complexes and their unbound partner
proteins from the PDB that fulfil specific criteria. In this way, a
total of 5,642 protein complexes have been identified, with 11,600
different decompositions into unbound protein pairs yielding legitimate
protein docking partners. After removing sequence redundancy
(requiring a sequence identity of the residues in the interface of
less than 40%), 2,070 different legitimate protein docking complexes
remain. For 810 of these protein docking complexes, both docking
partners possess corresponding unbound structures in the PDB. Of the
2,070 non-redundant protein docking complexes, 417
possess a cofactor at the interface. Of the 176 protein docking
complexes of the Protein–Protein Docking Benchmark 4.0 (DB4.0) data
set, 13 differ from the ProPairs data set. Twelve of them differ with
respect to the composition of the unbound structures but are contained
in the large redundant ProPairs data set. One protein docking complex
of the DB4.0 data set is not contained in ProPairs since the
biological assembly specified in the PDB is wrong (PDB id 1d6r). For
one protein complex (PDB id 1bgx) the DB4.0 data set uses a fabricated
unbound structure. For public use, interactive online access is
provided to the ProPairs data set of non-redundant protein docking
complexes along with the source code of the underlying method
[http://propairs.github.io].
|
|