Intermolecular Contacts¶
This examples shows how to identify intermolecular contacts, e.g. protein
atoms interacting with a bound inhibitor. A structure of a protein-ligand
complex in PDB format will be used. Output will be Selection
instances that points to atoms matching the contact criteria given by the user.
Selection
instances can be used as input to other
functions for further analysis.
Simple contact selections¶
We start by importing everything from the ProDy package:
In [1]: from prody import *
In [2]: from pylab import *
In [3]: ion()
ProDy selection engine has a powerful feature that enables identifying intermolecular contacts very easily. We will see this by identifying protein atoms interacting with an inhibitor.
We start with parsing a PDB file that contains a protein and a bound ligand.
In [4]: pdb = parsePDB('1zz2')
1zz2 contains an inhibitor bound p38 MAP kinase structure. Residue name of inhibitor is B11. Protein atoms interacting with the inhibitor can simply be identified as follows:
In [5]: contacts = pdb.select('protein and within 4 of resname B11')
In [6]: repr(contacts)
Out[6]: "<Selection: 'protein and wit... of resname B11' from 1zz2 (50 atoms)>"
'protein and within 4 of resname B11'
is interpreted as select protein
atoms that are within 4 A of residue whose name is B11. This selects
protein atoms that within 4 A of the inhibitor.
Contacts between different atom groups¶
In some cases, the protein and the ligand may be in separate files. We will imitate this case by making copies of protein and ligand.
In [7]: inhibitor = pdb.select('resname B11').copy()
In [8]: repr(inhibitor)
Out[8]: "<AtomGroup: 1zz2 Selection 'resname B11' (33 atoms)>"
In [9]: protein = pdb.select('protein').copy()
In [10]: repr(protein)
Out[10]: "<AtomGroup: 1zz2 Selection 'protein' (2716 atoms)>"
We see that inhibitor molecule contains 33 atoms.
Now we have two different atom groups, and we want protein atoms that are within 4 Å of the inhibitor.
In [11]: contacts = protein.select('within 4 of inhibitor', inhibitor=inhibitor)
In [12]: repr(contacts)
Out[12]: "<Selection: 'index 227 230 2... 1354 1356 1358' from 1zz2 Selection 'protein' (50 atoms)>"
We found that 50 protein atoms are contacting with the inhibitor. In this case, we passed the atom group inhibitor as a keyword argument to the selection function. Note that the keyword must match that is used in the selection string.
Composite contact selections¶
Now, let’s try something more sophisticated. We select Cα atoms of residues that have at least one atom interacting with the inhibitor:
In [13]: contacts_ca = protein.select(
....: 'calpha and (same residue as within 4 of inhibitor)',
....: inhibitor=inhibitor)
....:
In [14]: repr(contacts_ca)
Out[14]: "<Selection: 'index 225 232 2... 1328 1351 1359' from 1zz2 Selection 'protein' (20 atoms)>"
In this case, 'calpha and (same residue as within 4 of inhibitor)'
is
interpreted as select Cα atoms of residues that have at least
one atom within 4 A of any inhibitor atom.
This shows that, 20 residues have atoms interacting with the inhibitor.
Spherical atom selections¶
Similarly, one can give arbitrary coordinate arrays as keyword arguments to identify atoms in a spherical region. Let’s find backbone atoms within 5 Å of point (25, 73, 13):
In [15]: sel = protein.select('backbone and within 5 of somepoint',
....: somepoint=np.array((25, 73, 13)))
....:
Fast contact selections¶
For repeated and faster contact identification Contacts
class is
recommended.
We pass the protein as argument:
In [16]: protein_contacts = Contacts(protein)
The following corresponds to "within 4 of inhibitor"
:
In [17]: contants = protein_contacts.select(4, inhibitor)
In [18]: repr(contacts)
Out[18]: "<Selection: 'index 227 230 2... 1354 1356 1358' from 1zz2 Selection 'protein' (50 atoms)>"
This method is 20 times faster than the one in the previous part, but it is
limited to selecting only contacting atoms (other selection arguments cannot
be passed). Again, it should be noted that Contacts
does not update
the KDTree that it uses, so it should be used if protein coordinates does not
change between selections.