Selection protocol for soluble proteins



We compiled the 3D structures of soluble proteins from PDB_SELECT entries by the following procedure:

  1. Start from the recent PDB_SELECT entries based on a 25% sequence identity cutoff. (Number of chains: 901)

  2. Exclude the membrane proteins from all entries. (Number of chains: 901 -> 889)

  3. Exclude the NMR structures from the 889 entries. (Number of chains: 889 -> 742)

  4. Check the 742 chains for unsuitable entries that lack atom coordinates or have non-sequential residues, and exclude them. (Number of chains: 742 -> 663)

  5. Assign secondary structure regions according to the PDB description.

  6. Compile the proteins that have helices longer than 19 amino acid residues or potential long helices (bending type). (Total number of chains: 663 -> 397)
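
The filtering steps above can be sketched in Python as a chain of predicates that reports the chain count after each step. This is only an illustrative sketch, not the code used to build the data set. The Chain class, its field names, and both helper functions are hypothetical. It also assumes that the "PDB description" of step 5 means the HELIX records of a PDB-format file, which the protocol does not state. The "potential long helices (bending type)" criterion of step 6 is not specified in enough detail to implement and is left out.

```python
from dataclasses import dataclass, field

# "Longer than 19 residues" from step 6.
MIN_HELIX_LENGTH = 20


@dataclass
class Chain:
    # Hypothetical record; none of these field names come from the protocol.
    chain_id: str
    is_membrane: bool
    method: str                 # e.g. "X-RAY DIFFRACTION" or "SOLUTION NMR"
    residue_numbers: list       # residue sequence numbers in file order
    has_missing_atoms: bool
    helix_lengths: list = field(default_factory=list)


def is_sequential(residue_numbers):
    """True if each residue number is exactly one more than the previous."""
    return all(b - a == 1 for a, b in zip(residue_numbers, residue_numbers[1:]))


def helix_lengths_from_pdb(pdb_text, chain_id):
    """Helix lengths for one chain, read from PDB-format HELIX records.

    Assumption: step 5 uses HELIX records. Fixed columns (1-based):
    20 = initial chain ID, 22-25 = initial residue number,
    34-37 = final residue number. Insertion codes are ignored.
    """
    lengths = []
    for line in pdb_text.splitlines():
        if line.startswith("HELIX ") and len(line) >= 37 and line[19] == chain_id:
            start = int(line[21:25])
            end = int(line[33:37])
            lengths.append(end - start + 1)
    return lengths


def select_chains(chains):
    """Apply steps 2, 3, 4 and 6 in order; return kept chains and counts."""
    steps = [
        ("not membrane", lambda c: not c.is_membrane),
        ("not NMR", lambda c: "NMR" not in c.method.upper()),
        ("complete and sequential",
         lambda c: not c.has_missing_atoms and is_sequential(c.residue_numbers)),
        ("has long helix",
         lambda c: any(n >= MIN_HELIX_LENGTH for n in c.helix_lengths)),
    ]
    counts = [("start", len(chains))]
    for name, keep in steps:
        chains = [c for c in chains if keep(c)]
        counts.append((name, len(chains)))
    return chains, counts
```

Calling select_chains on a list of Chain records returns the surviving chains together with a list of (step name, count) pairs, which corresponds to the "Number of chains" annotations in the procedure.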