zzlib.sch.structure module¶
zzlib.sch.structure
¶
A module for handling Schrodinger structure objects.
This module provides enhanced functionality for working with Schrodinger structures, including:
- Structure reading/writing/manipulation.
- Atom selection and filtering.
- Structure preparation, analysis and property calculation.
- Ligand handling.
- Structure visualization.
The module provides the following main classes:
- Structure: Enhanced Schrodinger Structure class with additional functionality for structure manipulation and analysis.
- Atoms: A list-like container for managing collections of atoms with selection and filtering capabilities.
- ChainAtoms: Specialized Atoms class for working with atoms belonging to a specific protein chain.
- LigandAtoms: Class for handling and analyzing ligand molecules within a structure.
- Pocket: Class for analyzing and manipulating protein binding pocket.
- Conformers: Container class for storing multiple conformers of a molecule while sharing topology information.
- StructureConf: Class representing a structure conformation.
- StructureConfIter: Iterator class for efficiently processing multiple structure conformations while sharing topology information.
The module also provides direct imports for common Schrodinger classes:
- Molecule (schrodinger.structure._Molecule)
- Residue (schrodinger.structure._Residue)
- Chain (schrodinger.structure._Chain)
- Atom (schrodinger.structure._StructureAtom)
zzlib.sch.structure.Atoms
¶
Bases: list
A class used to represent a list of atoms in a structure.
This class can be thought of as a list of atomic indices. If you need _StructureAtom instances, use
Atoms.atoms().
About reality
The storage type is determined by the object's reality flag.
When real=False, only atom indices
are stored - these indices become invalid if atoms are deleted since indices will shift.
When real=True, actual atom instances are stored - these remain valid even if atoms are deleted,
but cannot be transferred between structures (like conformers) that share the same atom indexing.
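The difference between the two storage modes can be illustrated without Schrodinger at all. The sketch below is only an analogy: `ToyAtom`, `structure` and `atom_at` are hypothetical names, not part of zzlib. It shows why a stored index goes stale after a deletion while a stored instance does not.

```python
from dataclasses import dataclass


@dataclass
class ToyAtom:
    """Hypothetical stand-in for an atom; not a zzlib or Schrodinger class."""
    element: str


# A toy "structure": atoms are addressed by 1-based index.
structure = [ToyAtom("N"), ToyAtom("C"), ToyAtom("O"), ToyAtom("S")]


def atom_at(index):
    """Look up an atom by its 1-based index."""
    return structure[index - 1]


# Analogue of real=False: remember only the index of the sulfur atom.
by_index = 4
# Analogue of real=True: remember the atom object itself.
by_instance = atom_at(4)

# Deleting an earlier atom shifts every later index down by one.
del structure[1]

# The stored index now points past the end of the structure.
index_still_valid = by_index <= len(structure) and atom_at(by_index).element == "S"
# The stored instance is unaffected by the renumbering.
instance_still_valid = by_instance.element == "S"
```

After the deletion the sulfur atom sits at index 3, so the remembered index 4 no longer refers to it, while the remembered object still does.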
Examples:
Create an Atoms object from atom indices:
Create an Atoms object from ASL (Atom Specification Language):
Combine atom selections using operators:
ligand = st.eval_asl('ligand')
pocket = ligand + ligand.shell_atoms(4.0) # Atoms within 4A of ligand
not_pocket = ~pocket # All atoms not in pocket
common = ligand & pocket # Intersection of two selections
Get atom properties:
Extract atoms to new structure:
Set visualization properties:
atoms.set_color('red') # Set atoms to red
atoms.set_style('ball_stick') # Set to ball and stick style
atoms.hide() # Hide atoms
atoms.show() # Show atoms
Expand selection:
expanded = atoms.expand_bond(2) # Expand by 2 bonds
with_shell = atoms.expand_atoms(4.0) # Add atoms within 4A
full_res = atoms.expand_to_residue() # Expand to full residues
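The selection operators shown above (`+`, `&`, `~`) behave like set operations on atom indices. The following is a minimal sketch of those semantics using a hypothetical `ToySelection` class. It is not the zzlib implementation and assumes the complement is taken against all atoms of the parent structure.

```python
class ToySelection:
    """Hypothetical index-set selection that mimics the operator semantics only."""

    def __init__(self, indices, universe):
        # `universe` plays the role of all atom indices in the parent structure.
        self.universe = frozenset(universe)
        self.indices = frozenset(indices) & self.universe

    def __add__(self, other):
        # Union of two selections.
        return ToySelection(self.indices | other.indices, self.universe)

    def __and__(self, other):
        # Intersection of two selections.
        return ToySelection(self.indices & other.indices, self.universe)

    def __invert__(self):
        # Every atom of the structure that is not in this selection.
        return ToySelection(self.universe - self.indices, self.universe)

    def as_list(self):
        return sorted(self.indices)


all_atoms = range(1, 9)
ligand = ToySelection([1, 2, 3], all_atoms)
shell = ToySelection([4, 5], all_atoms)

pocket = ligand + shell
not_pocket = ~pocket
common = ligand & pocket
```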
__init__(atoms, st=None, real=None)
¶
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| atoms | Union[Iterable[int], Iterable[StructureAtom]] | Atom indices, or _StructureAtom instances. | required |
| st | Structure | Parent structure. If the first item in | None |
| real | bool | Whether atoms are stored by instance rather than index, and can stand any atom deletion. | None |
add_h()
¶
Add hydrogens to the atoms.
angles: Dict[Tuple[int, int, int, int], float]
property
¶
asl: str
property
¶
Get the simplest Atom Specification Language representation of the atoms.
atoms()
¶
Iterate the atoms as instances (_StructureAtom) rather than atom indices.
bonded_atoms()
¶
Iterate over atoms bonded to any atoms in this list, excluding hydrogens.
bonded_hydrogens()
¶
Iterate over hydrogen atoms bonded to the atoms.
chains()
¶
Iterate over the chains related to these atoms, sorted by atom count.
chains_names()
¶
Iterate over the names of the chains related to these atoms, sorted alphabetically (A-Z).
contact_chains(within=4, sort=False, exclude=None)
¶
Return the chains within certain distances.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| within | float | Distance cutoff in angstroms to find nearby chains. | 4 |
| sort | | When True, results will be ordered by number of nearby atoms. | False |
| exclude | Atoms | Optional Atoms object specifying atoms to exclude from the search. | None |
convert(st=None, real=None)
¶
Convert the reality of the object.
copy(**kw)
¶
create_surface()
¶
Create a new surface.
expand_atoms(dist=4.0)
¶
Returns the atoms within a specified distance from the current atoms, including self atoms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dist | float | The distance threshold for selecting atoms. | 4.0 |
expand_bond(nbonds=1)
¶
Get the atoms that are within n bonds of these atoms.
expand_to_residue()
¶
Expands to full residues, given an atom, an atom index, or an Atoms instance.
expandable_atoms(nbonds=1)
¶
Iterate over atoms in self that can be expanded by at least N bonds to nearby atoms.
extract(copy_props=True, renumber_map=False)
¶
Extract atoms into a new structure. After extraction, indices are renumbered from 1 to len(atoms). Pre-existing references will not be correct.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| copy_props | bool | Whether to copy the properties of the original parent structure to the newly created structure. | True |
| renumber_map | bool | Whether to return the atom index renumber map. | False |
Returns:
| Type | Description |
|---|---|
| Union[Structure, Tuple[Structure, Dict[int, int]]] | If renumber_map is False, returns the extracted Structure. If renumber_map is True, returns a tuple of (extracted Structure, renumber map dict). |
Note
The renumber map maps original atom indices to new indices after extraction.
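To make the renumber map concrete, here is a small self-contained sketch. `toy_extract` is a hypothetical helper that works on a plain list of element symbols rather than a real Structure, and it assumes extracted atoms keep their original relative order.

```python
def toy_extract(elements, indices):
    """Extract 1-based `indices` from `elements` and build an old->new index map."""
    kept = sorted(set(indices))
    # New indices run from 1 to len(kept), in the original atom order (assumption).
    renumber_map = {old: new for new, old in enumerate(kept, start=1)}
    extracted = [elements[old - 1] for old in kept]
    return extracted, renumber_map


elements = ["N", "C", "C", "O", "S"]  # atoms 1..5 of the toy parent structure
extracted, renumber_map = toy_extract(elements, [5, 2, 4])
```

Here atom 5 of the parent becomes atom 3 of the extracted list, so any index saved before extraction has to be translated through the map.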
filter_hydrogens(polar=True, nonpolar=False)
¶
Analyze the hydrogens in the current Atoms and keep only those that meet the selected polarity conditions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| polar | | If True, keep polar hydrogens. | True |
| nonpolar | | If True, keep nonpolar hydrogens. | False |
from_atom(atom, **kw)
classmethod
¶
Return an Atoms instance of a single atom. Structure is set automatically. Reality is set based on structure settings.
from_chains(chains, **kw)
classmethod
¶
Return an Atoms instance of a list of Chain. Structure is set automatically. Reality is False.
hide()
¶
Hide atoms.
hide_nonpolar_hydrogens()
¶
Hide all non-polar hydrogens.
invert()
¶
Return the inverse of this atom selection in the structure.
is_real()
¶
Return the reality flag. True means the atoms are stored by instance (_StructureAtom) rather than by index, and therefore remain valid after any atom deletion.
nxgraph: Graph
cached
property
¶
Get the NetworkX graph representation of the atoms.
residues()
¶
Iterate over the residues related to these atoms.
residues_specs()
¶
Returns a list of unique residue specifications in the structure.
Each residue specification is a tuple containing the chain ID, PDB residue name, and residue number.
Returns:
| Type | Description |
|---|---|
| None | A list of unique residue specifications. |
set_color(color)
¶
Set the atom color, where the color can be either:
- an integer (colormap index).
- a string (color name, such as 'white').
- a string (hex "RRGGBB" value).
- a tuple/list of 3 ints, values 0-255 (an RGB value).
set_color_ramp(colors, property)
¶
Set the atom color, according to property.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| colors | Union[List[_ColorSpec], Dict[_ColorSpec, float]] | Either a list of colors to evenly distribute each color in the list; or a dict of | required |
| property | Union[str, Dict[int, float]] | Either a string to read from atom property; or a dict of | required |
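As an illustration of what evenly distributing a list of colors over a property range can mean, the sketch below linearly interpolates RGB colors. The function `ramp_color` and the sample B-factor values are hypothetical, and the interpolation scheme is an assumption, not necessarily what set_color_ramp() does internally.

```python
def ramp_color(colors, value, vmin, vmax):
    """Interpolate an RGB color for `value` on a ramp of evenly spaced colors."""
    if vmax == vmin or len(colors) == 1:
        return colors[0]
    # Position of the value on the ramp, clamped to [0, 1].
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    pos = t * (len(colors) - 1)
    # Index of the color segment containing `pos`.
    lo = min(int(pos), len(colors) - 2)
    frac = pos - lo
    return tuple(
        round(a + (b - a) * frac) for a, b in zip(colors[lo], colors[lo + 1])
    )


blue, white, red = (0, 0, 255), (255, 255, 255), (255, 0, 0)

# Hypothetical per-atom property values keyed by atom index.
bfactors = {1: 10.0, 2: 30.0, 3: 50.0}
vmin, vmax = min(bfactors.values()), max(bfactors.values())

atom_colors = {
    index: ramp_color([blue, white, red], value, vmin, vmax)
    for index, value in bfactors.items()
}
```

With three colors, the lowest value maps to the first color, the midpoint to the second, and the highest value to the last.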
set_color_scheme(scheme)
¶
Set the atom color, according to a scheme.
Commonly used schemes
bfactor, acharge, chain, entry, pdbconversion,
element_customc_green (Element + carbons to green),
element_customc_white (Element + carbons to white),
...
See: schrodinger.structutils.color.available_color_schemes()
You can also use a tuple of (R, G, B), to set 'element' scheme
with custom C atom color as the input color.
set_label(label)
¶
Set label to atoms.
Label variable format:
set_ribbon(style='cartoon')
¶
set_ribbon_color(color='position')
¶
Set the ribbon color, where the color can be either:
- a string (color style scheme, such as 'position')
- an integer (colormap index)
- a string (color name, such as 'white')
- a string (hex "RRGGBB" value)
- a tuple/list of 3 ints, values 0-255 (an RGB value)
set_style(style='ball_stick', with_bonds=True)
¶
set_visibility(visible=True)
¶
Set visibility for atoms in this list.
shell_atoms(dist=4.0)
¶
Returns the atoms within a specified distance from the current atoms, excluding self atoms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dist | float | The distance threshold for selecting atoms. | 4.0 |
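The relationship between expand_atoms() (self atoms included) and shell_atoms() (self atoms excluded) can be sketched with plain coordinates. The helper `within` and the toy coordinates below are hypothetical, and a brute-force distance check is used purely for illustration.

```python
import math


def within(coords, selected, dist):
    """Indices of atoms within `dist` of any selected atom, selected atoms included."""
    hits = set(selected)
    for index, point in coords.items():
        if any(math.dist(point, coords[s]) <= dist for s in selected):
            hits.add(index)
    return hits


# Toy coordinates in angstroms, keyed by atom index.
coords = {
    1: (0.0, 0.0, 0.0),
    2: (1.5, 0.0, 0.0),
    3: (3.0, 0.0, 0.0),
    4: (9.0, 0.0, 0.0),
}
selected = {1, 2}

expanded = within(coords, selected, 4.0)  # analogue of expand_atoms(4.0)
shell = expanded - selected               # analogue of shell_atoms(4.0)
```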
show()
¶
Show atoms.
smarts: str
property
¶
Get the SMARTS representation of the atoms.
smarts_identify(indices)
¶
Generate a special SMARTS for identifying a list of atoms.
The element symbol of the selected atoms will be adjusted:
C -> Ce, O -> Os, N -> Ne, S -> Se, Other -> Lp
This function is generally used for debug purposes.
smarts_pattern: str
property
¶
Get the SMARTS representation of the atoms.
split_by_connectivity()
¶
Group the atoms by connectivity, largest to smallest.
Unlike split_by_molecules(), this function can separate atoms that lie in different, unconnected parts of the same molecule, which happens when the Atoms cover several subparts of one molecule.
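A minimal sketch of the idea, assuming connectivity is evaluated only through bonds whose two ends are both in the selection. `toy_split_by_connectivity` and the bond list are hypothetical and operate on plain indices.

```python
def toy_split_by_connectivity(selected, bonds):
    """Group selected atom indices into connected components, largest first."""
    selected = set(selected)
    adjacency = {index: set() for index in selected}
    for a, b in bonds:
        # Only bonds with both ends inside the selection connect atoms.
        if a in selected and b in selected:
            adjacency[a].add(b)
            adjacency[b].add(a)

    groups, seen = [], set()
    for start in sorted(selected):
        if start in seen:
            continue
        # Depth-first traversal of one component.
        stack, group = [start], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(adjacency[current] - group)
        seen |= group
        groups.append(sorted(group))
    return sorted(groups, key=len, reverse=True)


# One linear molecule 1-2-3-4-5-6; the selection leaves out atom 4.
bonds = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
groups = toy_split_by_connectivity([1, 2, 3, 5, 6], bonds)
```

All six atoms belong to one molecule, yet the selection falls into two groups because atom 4, the only link between them, is not selected.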
split_by_molecules()
¶
st: Structure
property
¶
Get the parent structure.
torsions: Dict[Tuple[int, int, int, int], float]
property
¶
write(f_path, **kw)
¶
Write atoms to file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| f_path | str | The path to write the file to. | required |
| **kw | | Additional keyword arguments passed to | {} |
zzlib.sch.structure.ChainAtoms
¶
Bases: Atoms
A class used to represent a list of atoms of a chain in a structure.
__init__(*args, chain=None, **kw)
¶
Please use the ChainAtoms.from_chain() function to create instances.
from_chain(chain, **kw)
classmethod
¶
Return an Atoms instance of a Chain. Structure is set automatically. Reality is False.
name
property
writable
¶
Get the name of the chain (usually a single letter).
zzlib.sch.structure.Conformers
¶
A class for managing molecular conformers.
This class provides functionality to load and manage multiple conformers of a molecule, either from files or Structure objects. The conformers are stored as XYZ coordinates in a numpy array to save memory.
from_files(files, ref=None)
classmethod
¶
Load conformers from a list of files into a numpy array.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| files | Union[Union[Path, str], List[Union[Path, str]]] | Path or list of paths to conformer files. | required |
| ref | Structure | Optional reference Structure object. If not provided, first file will be used. | None |
from_sts(sts)
classmethod
¶
Load conformers from a list of Structure objects into a numpy array.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sts | List[Structure] | List of Structure objects containing conformers. | required |
Returns:
| Type | Description |
|---|---|
| | A new Conformers object containing the conformers from the input structures. |
| | The first structure is used as the reference structure. |
sts(copy=False)
¶
Iterate over each conformation structure.
This method iterates through all conformations in the Conformers object, yielding Structure objects with the coordinates of each conformation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| copy | bool | Controls how structures are yielded. If False (default), modifies and returns the same reference structure with updated coordinates for each conformation. If True, creates and returns a new copy of the reference structure for each conformation. | False |
Yields:
| Name | Type | Description |
|---|---|---|
| Structure | | A Structure object containing the coordinates for each conformation. If copy=False, yields the same structure object with updated coordinates. If copy=True, yields a new structure object for each conformation. |
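The practical consequence of copy=False is that collecting the yielded objects into a list gives the same object repeated. The sketch below reproduces that behavior with a hypothetical `ToyStructure` and `iter_conformers` generator; it is an analogy, not the zzlib code.

```python
import copy as copy_module


class ToyStructure:
    """Hypothetical stand-in holding shared topology and one coordinate set."""

    def __init__(self, elements, xyz):
        self.elements = elements
        self.xyz = xyz


def iter_conformers(ref, coord_sets, copy=False):
    """Yield `ref` with each coordinate set applied, optionally as fresh copies."""
    for xyz in coord_sets:
        st = copy_module.deepcopy(ref) if copy else ref
        st.xyz = [tuple(point) for point in xyz]
        yield st


ref = ToyStructure(["C", "O"], [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)])
coord_sets = [
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.2, 0.0)],
]

# copy=False: every item is the same object, holding the last coordinates.
shared = list(iter_conformers(ref, coord_sets, copy=False))
# copy=True: every item is independent and keeps its own coordinates.
copies = list(iter_conformers(ref, coord_sets, copy=True))
```

With copy=False, each structure should therefore be processed inside the loop rather than stored for later.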
zzlib.sch.structure.LigandAtoms
¶
Bases: Atoms
A class used to represent a list of atoms of a ligand in a structure.
__init__(*args, mol_num=None, asl=None, **kw)
¶
Please use LigandAtoms.build_smiles(), LigandAtoms.from_ligand(),
LigandAtoms.read(), or LigandAtoms.read_first() to create instances.
asl
property
¶
Get the Atom Specification Language representation of this ligand.
asl_res
property
¶
Get the Atom Specification Language representation of the atoms using chains and residues.
build_smiles(smiles, three_d=False)
classmethod
¶
chain: str
cached
property
¶
Get the most common chain ID of atoms in this ligand.
confgen(max=1, optimize=False, lite=False, max_rot=9, use_coord_stereo=False, host=None, out_dir=None)
¶
Run confgen for this ligand.
confsearch(method='MCMM', max_steps=1000, unique_cutoff=0.5, minimize_steps=0, limit_steps_each_freedom=None, stop_at_n_minimum=None, limit_structures=None, energy_window=21.0, host=None, out_dir=None)
¶
Run conformational search for this ligand.
eval_smarts(smarts)
¶
Query a substructure in this ligand by SMARTS.
extract_ligand(force=True)
¶
Extract the ligand into a new structure, then return the ligand in that structure as a new LigandAtoms instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| force | | If False, then if there's only this ligand in the structure, no new structure will be generated. | True |
flexible_align(ref, host=None, out_dir=None)
¶
Run flexible ligand alignment for this ligand.
from_ligand(ligand, st, **kw)
classmethod
¶
Return a LigandAtoms instance of a Ligand. Structure should be provided explicitly. Reality is set based on structure settings.
ligprep(max_atoms=200, ionization='epik', metal_binding=False, epik_ph_range=(7.4, 2.0), desault=True, tautomer=True, stereos_max=32, stereo_from='property', host=None, out_dir=None)
¶
Run ligand preparation for this ligand.
map_atoms(atoms)
¶
Returns a map by maximum common substructure
{atom index in the input atoms: atom index in this ligand}
mol_num
property
¶
Get the molecular number of this ligand in the parent Structure.
Raises:
| Type | Description |
|---|---|
| ValueError | If the ligand contains no atoms or atoms from multiple molecules. |
pocket: Pocket
cached
property
¶
Get the binding pocket around the largest ligand.
Returns:
| Type | Description |
|---|---|
| Pocket | A Pocket object representing the binding site around the ligand. |
read(inp, **kw)
classmethod
¶
Read ligands from a file or a structure as an iterator of LigandAtoms.
read_first(inp, **kw)
classmethod
¶
Read the first ligand from a file or a structure. If the input is a Structure, the LigandAtoms of the largest detected ligand will be returned.
resname: str
cached
property
¶
Get the most common residue name of atoms in this ligand.
st_lig
cached
property
¶
Returns a structure containing only this ligand.
to_2d()
¶
Returns a new 2D structure of the ligand.
zzlib.sch.structure.Pocket
¶
atoms_in_interaction: Tuple[Atoms, Atoms]
cached
property
¶
chain: Optional[ChainAtoms]
cached
property
¶
Get the chain that has the most contacts with the ligand of this pocket.
Returns:
| Name | Type | Description |
|---|---|---|
| Chain | Optional[ChainAtoms] | The chain object with the most contacts to this ligand, or None if no chains are in contact. |
entities_atoms: Atoms
cached
property
¶
Nearby entities (including protein residues, waters and ions) within 5 Å of the ligand.
get_binding_energy(ffld_version=None)
¶
Get the binding energy of the ligand to its nearby residues.
get_closest_residue(atom)
¶
Get the closest residue for an atom in the ligand.
get_docking_grid_spec(local=False, extend=10.0, maximum=22.0)
¶
Obtain the dimensions of the grid used for docking, using this ligand as a reference for the pocket position.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| local | bool | If False, the grid will be a cube with the side length being the length of the ligand. If True, the grid will be a rectangular parallelepiped, with the side lengths being the lengths of the ligands in each direction. | False |
| extend | float | The additional length extended beyond the ligand length. | 10.0 |
| maximum | float | Maximum grid side length. | 22.0 |
Returns:
| Type | Description |
|---|---|
| Tuple[Tuple[float, float, float], Tuple[float, float, float]] | |
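The parameter descriptions above suggest a bounding-box calculation. The sketch below is one plausible reading, under several assumptions that the source does not confirm: the returned pair is (center, side lengths), the "length of the ligand" is its axis-aligned bounding-box extent, and the cube uses the longest extent. `toy_grid_spec` is a hypothetical name.

```python
def toy_grid_spec(coords, local=False, extend=10.0, maximum=22.0):
    """Return (center, size) of a docking box around ligand coordinates."""
    xs, ys, zs = zip(*coords)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    extents = tuple(hi - lo for lo, hi in zip(mins, maxs))
    if not local:
        # Cube: use the longest ligand extent for all three sides (assumption).
        extents = (max(extents),) * 3
    # Pad each side by `extend`, then cap it at `maximum`.
    size = tuple(min(extent + extend, maximum) for extent in extents)
    return center, size


ligand_coords = [(0.0, 0.0, 0.0), (8.0, 2.0, 0.0), (4.0, 4.0, 2.0)]

cube_center, cube_size = toy_grid_spec(ligand_coords)
box_center, box_size = toy_grid_spec(ligand_coords, local=True)
```

Under these assumptions a long ligand simply hits the `maximum` cap instead of producing an oversized grid.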
get_docking_score()
¶
Run an in-place docking and get the docking score. This function may take a long time.
get_nearby_entities(extra_asl='', dist=5.0, fillres=True)
¶
Get atoms of nearby entities within specified distance of the ligand.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| extra_asl | str | Additional ASL expression to filter entities. | '' |
| dist | float | Distance cutoff in angstroms. | 5.0 |
| fillres | bool | Whether to include entire residues. | True |
Returns:
| Name | Type | Description |
|---|---|---|
| list | Atoms | List of atoms matching the ASL criteria. |
get_waters_bridges()
cached
¶
ions_atoms: Atoms
cached
property
¶
Nearby ion atoms within 8 Å of the ligand.
most_buried_residue: Atoms
cached
property
¶
The most ligand-buried residue.
residues_atoms: Atoms
cached
property
¶
Nearby residue atoms within 5 Å of the ligand.
residues_in_interaction: Atoms
cached
property
¶
Get all nearby residues that have at least one direct interaction with the ligand.
See: atoms_in_interaction.
residues_in_water_bridges
cached
property
¶
Get all atoms in nearby residues that participate in a water bridge.
See: get_waters_bridges().
run_embrace_minimize(**kw)
¶
Run an embrace minimize job of the pocket ligand. This function may take a long time.
run_redocking(**kw)
¶
Run a flexible docking of the pocket ligand to the pocket. This function may take a long time.
run_refine_complex(**kw)
¶
Run a protein-ligand complex refinement job. This function may take a long time.
set_style(style='pocket')
¶
Set predefined styles according to the pocket.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| style | Literal['pocket', 'interaction_only'] | The style to apply. Can be either: | 'pocket' |
st: Structure
property
¶
Get the structure object that contains this ligand pocket.
to_pymol(interactions=True, turn=True, zoom=10.0)
¶
waters_atoms: Atoms
cached
property
¶
Nearby water atoms within 3 Å of the ligand.
waters_in_water_bridges
cached
property
¶
Get all waters that participate in a water bridge.
See: get_waters_bridges().
zzlib.sch.structure.Structure
¶
This class provides some new methods for the Structure class provided by Schrodinger.
You can upgrade any schrodinger.structure.Structure object to this enhanced version with Structure(st).
Example
Upgrade to enhanced version:
from schrodinger import structure
from zzlib.sch import Structure
st = structure.StructureReader.read('protein.mae')
st = Structure(st)
Read a structure file:
Get ligands:
Select atoms:
Extract ligand:
add_h(readd=False)
¶
Add hydrogens to the structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| readd | bool | If True, remove existing hydrogens before adding new ones. | False |
as_mae(delete=True)
¶
Write the structure to a temporary MAE file.
Note
This is a context manager that will delete the temporary file when exiting
the context if delete=True.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| delete | | Whether to delete the temporary file after the context exits. Defaults to True. | True |
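The temporary-file context manager pattern described in the note can be sketched with the standard library alone. `as_temp_file` below is a hypothetical helper that writes plain text. It only mirrors the `delete` behavior described above and does not write real MAE content.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def as_temp_file(text, suffix=".mae", delete=True):
    """Write `text` to a temporary file and yield its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        yield path
    finally:
        # Clean up on exit unless the caller asked to keep the file.
        if delete and os.path.exists(path):
            os.remove(path)


with as_temp_file("placeholder contents") as path:
    existed_inside = os.path.exists(path)
exists_after = os.path.exists(path)
```

The file path is only guaranteed to be usable inside the `with` block when delete=True.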
as_pdb(delete=True)
¶
Write the structure to a temporary PDB file.
Note
This is a context manager that will delete the temporary file when exiting
the context if delete=True.
assign_bond_orders()
¶
Perform the bond order assignment on the structure.
ca_atoms: Atoms
property
¶
Get CA atoms in residues as an Atoms object.
chain_atoms(c, **kw)
¶
Returns a ChainAtoms instance of the selected chain name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| c | str | Chain name to select. | required |
| **kw | | Additional keyword arguments passed to ChainAtoms(). | {} |
chain_names()
¶
Returns a list of chain names in the structure.
chains_atoms(l=None, cls=ChainAtoms, **kw)
¶
Returns an Atoms instance of the selected chains.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| l | List[str] | A list of chain names. Default to all chains in the structure. | None |
| cls | Type[T] | Output chain atoms class type. Must be a subclass of ChainAtoms. | ChainAtoms |
| **kw | | Additional keyword arguments passed to ChainAtoms(). | {} |
chains_atoms_iter(l=None, cls=ChainAtoms, **kw)
¶
Returns a list of ChainAtoms instances of the selected chains.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| l | List[str] | A list of chain names. Default to all chains in the structure. | None |
| cls | Type[T] | Output chain atoms class type. Must be a subclass of ChainAtoms. | ChainAtoms |
convert_nonstandard_residues()
¶
Convert nonstandard residues to standard residues.
copy(editable=None)
¶
count_clashes(al1=None, al2=None, cutoff=0.75, except_hbond=True, except_salt_bridge=True)
¶
Count number of atom clashes in the structure.
This method counts the number of steric clashes between atoms in the structure, optionally excluding hydrogen bonds and salt bridges. A clash is defined as when two atoms are closer than expected based on their van der Waals radii.
Hydrogen atoms are ignored when counting clashes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| al1 | List[int] | Optional list of atom indices to check for clashes from. If None, uses all atoms. | None |
| al2 | List[int] | Optional list of atom indices to check for clashes against. If None, uses all atoms. | None |
| cutoff | float | Strictness of determination of clash (1.3: good, 0.89: bad, 0.75: ugly). Lower values indicate stricter clash detection. | 0.75 |
| except_hbond | bool | If True, do not count clashes where hydrogen bonds are present. | True |
| except_salt_bridge | bool | If True, do not count clashes where salt bridges are present. | True |
Returns:
| Type | Description |
|---|---|
| int | Number of atom clashes found in the structure. |
Cutoff values
The cutoff value determines how much overlap between van der Waals radii is allowed:
- 1.3: Lenient, only counts severe clashes.
- 0.89: Moderate strictness, ignores acceptable clashes.
- 0.75: Counts all clashes.
delete_waters(renumber_map=False)
¶
Delete all water atoms from the structure.
display()
¶
Display the structure via nglview when working with the structure in the notebook.
NOTE
nglview is not a dependency installed by default, please install it via:
schrun -C "pip install nglview"
eval_asl(asl)
¶
Get Atoms specified by an ASL string.
See: Atoms.asl.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| asl | str | An Atom Specification Language string. | required |
Returns:
| Type | Description |
|---|---|
| Atoms | The atoms matching the ASL specification. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the ASL string is not valid. |
eval_residue_spec(spec)
¶
Get Residues according to a residue specification.
See: Atoms.residues_specs().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| spec | Tuple[str, str, int] | A tuple containing the chain name, residue name, and residue number. | required |
eval_smarts(smarts, mols_only=None, **kw)
¶
Get Atoms specified by a SMARTS string.
See: Atoms.smarts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mols_only | Iterable[int] | Query only in certain molecules, by molecule index. | None |
| reorder | | Whether to reorder results according to atom index, rather than SMARTS. | required |
expand_bond(atoms, n_bonds=1, only_shell=False)
¶
Get atoms that are within n bonds of the input atoms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| atoms | Atoms | The starting atoms to expand from. | required |
| n_bonds | int | Number of bonds to expand. | 1 |
| only_shell | bool | If True, only return the outer shell atoms. If False, include the input atoms. | False |
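Bond-based expansion is essentially a depth-limited breadth-first search over the bond graph. The sketch below uses a hypothetical `toy_expand_bond` on a plain adjacency dict. It assumes that only_shell=True returns every newly reached atom (everything except the input atoms), which is one possible reading of "outer shell".

```python
def toy_expand_bond(adjacency, atoms, n_bonds=1, only_shell=False):
    """Return atom indices reachable within `n_bonds` bonds of `atoms`."""
    start = set(atoms)
    seen = set(start)
    frontier = set(start)
    for _ in range(n_bonds):
        # Neighbors of the current frontier that have not been visited yet.
        frontier = {nb for a in frontier for nb in adjacency.get(a, ())} - seen
        if not frontier:
            break
        seen |= frontier
    return sorted(seen - start) if only_shell else sorted(seen)


# Linear chain 1-2-3-4-5 as an adjacency mapping.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

one_bond = toy_expand_bond(adjacency, [3], n_bonds=1)
one_bond_shell = toy_expand_bond(adjacency, [3], n_bonds=1, only_shell=True)
two_bond_shell = toy_expand_bond(adjacency, [1], n_bonds=2, only_shell=True)
```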
expand_shells(shells)
¶
Expand atoms in shells by one bond at a time, excluding previously seen atoms.
This method takes a list of shells (Atoms objects) and expands outward by one bond
at a time, yielding new shells of atoms. Each new shell contains atoms that are one bond away from
the previous shell (shells[-1]), excluding any atoms that appeared in earlier shells. Hydrogen atoms are ignored
during expansion.
For example, given shells [a(1), a(2), a(3)], it will:
1. Start from atoms in a(3) (the last shell).
2. Find all atoms one bond away from a(3).
3. Exclude any atoms that appeared in a(1), a(2), or a(3).
4. Yield the new shell a(4).
5. Repeat the process starting from a(4) until no new atoms are found.
This is useful for systematically exploring molecular structure outward from a starting set of atoms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| shells | List[Atoms] | A list of shells defining the current expand base. | required |
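The shell-by-shell procedure described above maps naturally onto a generator. The following sketch uses a hypothetical `toy_expand_shells` over a plain adjacency dict of heavy atoms. Hydrogen handling is left out, so the adjacency is assumed to contain no hydrogens.

```python
def toy_expand_shells(adjacency, shells):
    """Yield successive one-bond shells grown from the last shell in `shells`."""
    if not shells:
        return
    # Atoms from every earlier shell are excluded from new shells.
    seen = set().union(*shells)
    current = set(shells[-1])
    while True:
        new_shell = {nb for a in current for nb in adjacency.get(a, ())} - seen
        if not new_shell:
            return
        seen |= new_shell
        yield sorted(new_shell)
        current = new_shell


# Branched heavy-atom graph: 1-2, 2-3, 2-4, 3-5.
adjacency = {1: [2], 2: [1, 3, 4], 3: [2, 5], 4: [2], 5: [3]}

from_one = list(toy_expand_shells(adjacency, [[1]]))
from_two = list(toy_expand_shells(adjacency, [[1], [2]]))
```

Because atoms seen in earlier shells are excluded, passing `[[1], [2]]` continues the walk outward from atom 2 without stepping back onto atom 1.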
expand_to_molecular(anchors, exclusion=None)
¶
Search all atoms within the same molecule from anchor atoms.
During the search process, anchors and atoms in exclusion will be treated as barriers that stop the expansion.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| anchors | Atoms | The starting atoms to expand from. | required |
| exclusion | Atoms | Atoms to exclude and stop the expansion. | None |
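The barrier behavior can be pictured as a breadth-first search that refuses to step onto excluded atoms. `toy_expand_to_molecular` below is a hypothetical sketch on a plain adjacency dict. It assumes the result includes the anchors themselves and never includes excluded atoms.

```python
from collections import deque


def toy_expand_to_molecular(adjacency, anchors, exclusion=()):
    """Collect atoms reachable from `anchors` without crossing `exclusion`."""
    blocked = set(exclusion)
    reached = set(anchors)
    queue = deque(anchors)
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            # Already-reached atoms (including anchors) and excluded atoms
            # both act as barriers for the search.
            if neighbor in reached or neighbor in blocked:
                continue
            reached.add(neighbor)
            queue.append(neighbor)
    return sorted(reached)


# Linear chain 1-2-3-4-5-6.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}

one_side = toy_expand_to_molecular(adjacency, [3], exclusion=[2])
whole = toy_expand_to_molecular(adjacency, [3])
```

Excluding atom 2 confines the search to the side of the molecule beyond atom 3, which is how a substituent can be picked out from its attachment point.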
extend(st, copy_props=False, renumber_map=False)
¶
Append the atoms of another structure to the current structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| st | Structure | The structure to be extended to the current structure. | required |
| copy_props | bool | Whether to copy the properties of the extended structure to the current structure. | False |
| renumber_map | bool | Whether to return the atom index renumber map. | False |
Returns:
| Type | Description |
|---|---|
| | If not renumber_map, returns None. |
| | If renumber_map, returns: |
Note
The renumber map will be:
{<atom indices in the extended structure before operation>: <atom indices after operation>}
Note
This operation will not modify the atom indices of the original atoms.
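Both notes follow from appending the new atoms after the existing ones. A minimal sketch with plain lists, where `toy_extend` is a hypothetical helper and appending at the end is the assumption being illustrated:

```python
def toy_extend(current, other):
    """Append `other` to `current` in place and return the renumber map."""
    offset = len(current)
    # Atom i of `other` lands at index offset + i (1-based) in `current`.
    renumber_map = {i: offset + i for i in range(1, len(other) + 1)}
    current.extend(other)
    return renumber_map


protein = ["N", "C", "O"]  # atoms 1..3 keep their indices
ions = ["Zn", "Cl"]        # atoms 1..2 of the structure being added

renumber_map = toy_extend(protein, ions)
```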
extract(indices, copy_props=True, renumber_map=False)
¶
Extract atoms into a new structure.
After extraction, indices are renumbered from 1 to len(atoms).
Pre-existing indices will no longer be correct; use the renumber_map to convert them.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| indices | | A list of atom indices to extract. | required |
| copy_props | bool | Whether to copy the properties of the original parent structure to the newly created structure. | True |
| renumber_map | bool | Whether to return the atom index renumber map. | False |
Returns:
| Type | Description |
|---|---|
| | If not renumber_map, returns the extracted structure (Structure). |
| | If renumber_map, returns |
Note
The renumber map will be:
{<atom indices before extraction>: <atom indices after extraction>}
from_file(f_path)
classmethod
¶
Read the first structure in a file.
get_atoms(l=None, cls=Atoms, **kw)
¶
get_forcefield_energy(use_opls_2005=False)
¶
Calculate the forcefield energy of the structure.
Suitable for all types of structures. This function may be time-consuming.
get_macromodel_energy(solvent='vacuum')
¶
Calculate the Macromodel energy of the structure.
Very suitable for small molecules. This function may be time-consuming.
get_prime_energy()
¶
Calculate the Prime energy of the structure.
Suitable for ligand-protein complexes. This function may be time-consuming. For small molecules, use get_macromodel_energy() instead.
ion_atoms: Atoms
property
¶
Get all ion atoms as an Atoms object.
is_editable()
¶
Returns whether the structure is editable. An editable structure will generate real Atoms objects by default.
keep_ligands(n, renumber_map=False)
¶
Keep only the N largest ligands and delete all other ligands.
largest_ligand(required=True, cls=LigandAtoms, **kw)
¶
Find the largest ligand in the structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| required | | Whether to raise a StructureError if no ligand was found. | True |
ligand_atoms: Atoms
property
¶
Get all ligand atoms as an Atoms object.
ligands(max_atoms=130, peptide=True, cls=LigandAtoms, excluded_res={}, include_res={}, covalent=False, **kw)
¶
Find the ligands in the structure. Results are sorted largest to smallest.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_atoms | | Molecules with atom counts above that number will not be recognized as ligands. | 130 |
| peptide | | Allows peptides to be recognized as ligands. | True |
| cls | Type[T] | Output ligand atoms class type. Must be a subclass of LigandAtoms. | LigandAtoms |
| excluded_res | Set[str] | A set containing residue names such that these residues are not recognized as ligands. | {} |
| include_res | Set[str] | A set containing residue names such that these residues are forced to be recognized as ligands. | {} |
| covalent | | When True, filter out covalent ligands. For non-covalent ligands, this can exclude the influence of some non-standard residues and covalent crystallization cofactors. | False |
Returns:
| Type | Description |
|---|---|
| List[Union[LigandAtoms, T]] | List of LigandAtoms, each of which contains all bonded hydrogen atoms. |
merge(other_structure, copy_props=False)
¶
Return a new structure object which contains the atoms of the current structure and the atoms of other_structure.
Atoms of other_structure are appended after the atoms of self, so their indices continue from the last atom index in self.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other_structure | | The structure to be merged with the current structure. | required |
| copy_props | bool | If True, properties from the current structure and other_structure will be added to the new structure. | False |
molecule_atoms(mn, **kw)
¶
Returns an Atoms instance of the selected molecule number.
molecules_atoms(l=None, **kw)
¶
pocket: Pocket
cached
property
¶
prepare(use_pdb_ph=False, preprocess=True, align=None, addh=True, re_addh=True, metalbonds=True, disulfides=True, glycosylation=False, palmitoylation=False, sel_to_met=False, fill_loops=False, fill_loops_fasta=None, fill_sidechains=True, add_terminal_oxygens=False, cap_termini=False, epik=True, epik_ph_range=(7.4, 2.0), epik_max_states=1, hbond_assign=True, sample_water=True, include_epik_states=True, use_crystal=False, propka_ph=7.4, minimize_hs=False, restrained_minimization=False, h_only=False, force_field=None, max_rmsd=0.3, remove_wat_from_lig=5.0, remove_wat_min_hbond=None, out_dir=None, log=None)
¶
Run protein preparation for this structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| use_pdb_ph | bool | Whether to use pH information from PDB file. | False |
| preprocess | bool | Whether to run preprocessing. | True |
| align | Union[str, Path] | Path to reference structure for alignment. | None |
| addh | bool | Whether to add hydrogens. | True |
| re_addh | bool | Whether to ignore original hydrogens. | True |
| metalbonds | bool | Whether to treat metal bonds. | True |
| disulfides | bool | Whether to treat disulfide bonds. | True |
| glycosylation | bool | Whether to treat glycosylation. | False |
| palmitoylation | bool | Whether to treat palmitoylation. | False |
| sel_to_met | bool | Whether to convert selenomethionines to methionines. | False |
| fill_loops | bool | Whether to fill missing loops. | False |
| fill_loops_fasta | Union[str, Path] | Path to FASTA file for loop filling. | None |
| fill_sidechains | bool | Whether to fill missing sidechains. | True |
| add_terminal_oxygens | bool | Whether to add terminal oxygens. | False |
| cap_termini | bool | Whether to cap termini. | False |
| epik | bool | Whether to run Epik. | True |
| epik_ph_range | Tuple[float, float] | pH range for Epik in form of | (7.4, 2.0) |
| epik_max_states | int | Maximum number of Epik states to keep. | 1 |
| hbond_assign | bool | Whether to assign hydrogens. | True |
| sample_water | bool | Whether to sample water orientations. | True |
| include_epik_states | bool | Whether to include Epik states. | True |
| use_crystal | bool | Whether to use CCD information for ligands. | False |
| propka_ph | float | pH for PROPKA calculations. | 7.4 |
| minimize_hs | bool | Whether to minimize hydrogens. | False |
| restrained_minimization | bool | Whether to run restrained minimization. | False |
| h_only | bool | Whether to only minimize hydrogens in restrained minimization. | False |
| force_field | Optional[str] | Force field to use. Defaults to S-OPLS if available, else OPLS2005. | None |
| max_rmsd | float | Maximum RMSD for restrained minimization. | 0.3 |
| remove_wat_from_lig | Optional[float] | Remove waters beyond this distance from ligand. | 5.0 |
| remove_wat_min_hbond | Optional[int] | Remove waters with fewer H-bonds than this. | None |
| out_dir | Union[str, Path] | Output directory. | None |
| log | Union[Path, str] | Log file path. | None |
Returns:
| Type | Description |
|---|---|
| Structure | The prepared structure. |
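A typical call might look like the sketch below. It assumes zzlib and a Schrodinger installation are available at call time; the helper name `prepare_file` and the chosen option values are illustrative and not part of the library. Only parameters listed in the table above are used, and the import is deferred so the option dictionary can be inspected without the library installed.

```python
# Illustrative option set built from the documented prepare() parameters.
PREPARE_OPTIONS = {
    "fill_loops": False,
    "fill_sidechains": True,
    "epik": True,
    "propka_ph": 7.4,
    "restrained_minimization": True,
    "h_only": True,   # only relax hydrogens during the restrained minimization
    "max_rmsd": 0.3,
}


def prepare_file(path, log=None, **overrides):
    """Read the first structure in `path` and run prepare() on it.

    Hypothetical helper: requires zzlib and a Schrodinger installation,
    which are imported lazily so this module loads without them.
    """
    from zzlib.sch.structure import Structure

    options = {**PREPARE_OPTIONS, **overrides}
    st = Structure.read(path)
    return st.prepare(log=log, **options)
```

Keeping the options in a dictionary makes it easy to reuse one configuration across many input files and to override single values per call.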
preprocess(fill_sidechains=False, epik=False, log=None)
¶
Run a quick protein preprocess with predefined options.
See: Structure.prepare().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fill_sidechains | bool | Whether to fill in missing sidechains. | False |
| epik | bool | Whether to run Epik for protonation state generation. | False |
| log | Union[Path, str] | Path to the log file. If None, output is suppressed. | None |
Returns:
| Type | Description |
|---|---|
| Structure | The prepared structure. |
protein_atoms: Atoms
property
¶
Get all protein atoms as an Atoms object.
quick_fix(readd_hs=False, reassign_bonds=False)
¶
Performs quick structural fix including bond order assignment, hydrogen atom addition, and zero-level metal bond setting. The fix will be applied directly to the current structure.
See: Structure.preprocess() and Structure.prepare() for a more advanced preparation workflow.
Note
For some scenarios, it may be more efficient to use add_h() or assign_bond_orders().
read(f_path)
classmethod
¶
Read the first structure in a file.
remove_alt_pos()
¶
Remove alternate positions from the structure.
rename_chain(c, name)
¶
Rename a chain to a new name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| c | Union[str, ChainAtoms] | Chain to rename, can be either a chain name string or a ChainAtoms instance. | required |
| name | str | New name for the chain. | required |
reorder_residues(sort_by_residue=False)
¶
Reorder atom indices in the structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sort_by_residue | bool | Whether to sort residues by | False |
rotate(x_angle, y_angle, z_angle, atoms=None)
¶
Rotate the structure around x, y, z axes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x_angle | float | Rotation angle around x-axis in degrees. | required |
| y_angle | float | Rotation angle around y-axis in degrees. | required |
| z_angle | float | Rotation angle around z-axis in degrees. | required |
| atoms | Atoms | Optional Atoms instance specifying which atoms to rotate. If None, rotates the whole structure. | None |
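The sketch below illustrates what rotating a coordinate by three axis angles given in degrees means, using only the standard library. It is not zzlib's implementation. In particular, the order in which the axis rotations are combined (x, then y, then z here) and the rotation about the origin are assumptions, and `rotate_point` is a hypothetical name.

```python
import math


def rotate_point(point, x_angle, y_angle, z_angle):
    """Rotate an (x, y, z) point about the x, then y, then z axis (degrees)."""
    x, y, z = point
    ax, ay, az = (math.radians(a) for a in (x_angle, y_angle, z_angle))

    # Rotation about the x-axis leaves x unchanged.
    y, z = (y * math.cos(ax) - z * math.sin(ax),
            y * math.sin(ax) + z * math.cos(ax))
    # Rotation about the y-axis leaves y unchanged.
    x, z = (x * math.cos(ay) + z * math.sin(ay),
            -x * math.sin(ay) + z * math.cos(ay))
    # Rotation about the z-axis leaves z unchanged.
    x, y = (x * math.cos(az) - y * math.sin(az),
            x * math.sin(az) + y * math.cos(az))
    return (x, y, z)


# A 90 degree turn about z maps the x unit vector onto the y unit vector.
rotated = rotate_point((1.0, 0.0, 0.0), 0.0, 0.0, 90.0)
```

Because rotations about different axes do not commute, the same three angles applied in a different order generally give a different result.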
solvent_atoms: Atoms
property
¶
Get all solvent atoms as an Atoms object.
sync_rdkit_mol(mol, imap, cid=None)
¶
Synchronize coordinates from an RDKit molecule to this structure.
This method updates the 3D coordinates of atoms in the current structure based on the positions in an RDKit molecule. The mapping between RDKit and Schrodinger atom indices must be provided.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mol | Mol | The RDKit molecule to get coordinates from | required |
| imap | Dict[int, int] | A dictionary mapping Schrodinger atom indices to RDKit atom indices | required |
| cid | int | Optional conformer ID to use from the RDKit molecule. If None, uses the first conformer. | None |
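The role of `imap` can be illustrated with plain Python containers. This is a sketch of the index-mapping idea only, not the library's code: `sync_coordinates` is a hypothetical function, the structure is modelled as a dictionary keyed by 1-based Schrodinger atom indices, and the RDKit side is modelled as a list indexed from 0.

```python
def sync_coordinates(target_xyz, source_xyz, imap):
    """Copy coordinates from source to target following an index map.

    target_xyz: dict {schrodinger_index: (x, y, z)} with 1-based keys.
    source_xyz: list of (x, y, z), indexed like RDKit atoms (0-based).
    imap:       dict {schrodinger_index: rdkit_index}.
    """
    for sch_idx, rd_idx in imap.items():
        target_xyz[sch_idx] = tuple(source_xyz[rd_idx])
    return target_xyz


structure_xyz = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (9.0, 9.0, 9.0)}
rdkit_xyz = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]
imap = {1: 0, 2: 1}  # atom 3 is unmapped and keeps its coordinates

sync_coordinates(structure_xyz, rdkit_xyz, imap)
```

Atoms that do not appear in the map are left untouched in this sketch; whether the real method behaves the same way for unmapped atoms is not stated in the documentation.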
to_rdkit(inplicit_hydrogen=True, stereo=True)
¶
Convert the structure to an RDKit molecule.
This method converts the Schrodinger structure to an RDKit molecule object, with options to control hydrogen representation and stereochemistry.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| inplicit_hydrogen | bool | If True, converts hydrogens to implicit representation. If False, retains all original explicit hydrogens. | True |
| stereo | bool | Whether to preserve stereochemistry information during conversion. | True |
Returns:
| Type | Description |
|---|---|
| Tuple[Mol, Dict[int, int]] | A tuple containing: |
transform(mat, atoms=None)
¶
translate(x, y, z, atoms=None)
¶
Translate the structure along x, y, z axes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | float | Translation distance along x-axis in Ångström. | required |
| y | float | Translation distance along y-axis in Ångström. | required |
| z | float | Translation distance along z-axis in Ångström. | required |
| atoms | Atoms | Optional Atoms instance specifying which atoms to translate. If None, translates the whole structure. | None |
ungroup()
¶
Ungroup the structure from a subgroup.
water_atoms: Atoms
property
¶
Get all water atoms as an Atoms object.
zzlib.sch.structure.StructureConf
¶
A class representing a structure conformation.
This class stores a reference structure and coordinates for a single conformation, allowing memory-efficient storage of multiple conformers by only keeping the coordinate differences.
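The idea of sharing one reference while storing coordinates per conformation can be sketched with dataclasses. The class names `Topology` and `Conf` are hypothetical stand-ins and do not reflect the actual attributes of StructureConf; the sketch stores full per-conformer coordinates next to a shared reference object.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float, float]


@dataclass(frozen=True)
class Topology:
    """Hypothetical stand-in for the data shared by all conformers."""
    elements: Tuple[str, ...]
    bonds: Tuple[Tuple[int, int], ...]


@dataclass
class Conf:
    """One conformation: a reference to shared topology plus its own coordinates."""
    topology: Topology
    coords: List[Coord]


water = Topology(elements=("O", "H", "H"), bonds=((0, 1), (0, 2)))
confs = [
    Conf(water, [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]),
    Conf(water, [(0.0, 0.0, 0.0), (0.97, 0.0, 0.0), (-0.25, 0.94, 0.0)]),
]
```

Both conformers point at the same `Topology` object, so element and bond data exist once in memory no matter how many conformations are stored.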
zzlib.sch.structure.StructureConfIter
¶
Bases: StructureIter
Iterator for structure conformations.
This class provides iteration over structure conformations, yielding StructureConf objects that efficiently store conformational data.
confs()
¶
Iterate over conformations.
Yields:
| Type | Description |
|---|---|
| StructureConf | StructureConf objects containing the coordinates for each conformation, using the first structure as the reference. |
zzlib.sch.structure.StructureError
¶
Bases: Exception
Base exception class for structure-related errors.
zzlib.sch.structure.StructureIter
¶
Bases: Generic[T]
A class that provides memory-efficient iteration over multiple Structure objects.
This class is designed to avoid loading all structures into memory at once. Instead, it manages structures dynamically by loading them only when needed during iteration. This is especially useful when dealing with large structure files or many structures.
The structures can come from:
- A single structure file
- A list of structure files
- A list of Structure objects
- An iterator yielding structures
Key features:
- Lazy loading: Structures are read only when accessed.
- Memory efficient: Only keeps required structures in memory.
- Flexible input: Accepts various sources of structures.
- Iterator support: Can be used with iterator sources for streaming processing.
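Lazy loading as described above can be illustrated with a plain generator. This is a sketch of the general technique, not of StructureIter's internals; `lazy_items` and `fake_reader` are hypothetical names, and the reader stands in for an expensive structure read.

```python
from typing import Callable, Iterable, Iterator, List


def lazy_items(paths: Iterable[str], reader: Callable[[str], object]) -> Iterator[object]:
    """Yield one loaded item per path, reading each only when requested."""
    for path in paths:
        yield reader(path)


loaded: List[str] = []


def fake_reader(path: str) -> str:
    # Stand-in for an expensive structure read; records what was loaded.
    loaded.append(path)
    return f"structure from {path}"


items = lazy_items(["a.mae", "b.mae", "c.mae"], fake_reader)
first = next(items)  # only "a.mae" has been read at this point
```

Nothing is read when the generator is created; each file is touched only when the consumer asks for the next item, which keeps memory use independent of the number of inputs.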
__init__(sts=None, length=None, tee_only=False, item_cls=None)
¶
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sts | Union[Path, str, Iterable[Union[Structure, Path, str]]] | The Structure source: a file path, a Structure object, or an iterable of either. | None |
| length | int | Required length if using an iterator source. | None |
| tee_only | bool | If True, this iterator can only be copied using the tee() method. | False |
| item_cls | Type[Structure] | Optional custom Structure class to use for the yielded items. | None |
append(value)
¶
Append a value to the structure list.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | | The structure to append. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the source is not a list (i.e. is an iterator or file). |
extend(value)
¶
Extend the structure list with values from an iterable.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| value | | An iterable of structures to append. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the source is not a list (i.e. is an iterator or file). |
first: T
property
¶
Get the first structure.
multiply(n)
¶
Set this Structure iterator to tee-only mode and tee it into n iterators.
See: set_tee_only(), tee().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n | int | Number of iterators to create. | required |
Returns:
| Type | Description |
|---|---|
| | List of |
Note
When using an iterator as the Structure source, this method must be called before the original iterator is consumed.
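The standard library's `itertools.tee` has a comparable constraint, which makes it a convenient analogy for why the split must happen before the source is consumed. The sketch below uses only `itertools`; it does not show zzlib's own caching, which may differ, and `numbered_source` is a hypothetical stand-in for a structure iterator.

```python
import itertools


def numbered_source():
    """Stand-in for an iterator that yields structures one at a time."""
    for i in range(5):
        yield f"st{i}"


source = numbered_source()
# Split before anything is consumed from `source`.
iter1, iter2 = itertools.tee(source, 2)

head = list(itertools.islice(iter1, 2))  # iter1 reads only two items
everything = list(iter2)                 # iter2 still sees all five
```

Had an item been pulled from `source` before the split, neither copy would ever see it, because an iterator cannot be rewound.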
set_tee_only()
¶
Mark this Structure iterator as a master (tee-only) iterator.
After calling this method, this instance can only be used to create new iterators via tee() or multiply(), but cannot be iterated directly.
See: tee(), multiply().
Note
When using an iterator as the Structure source, this method must be called before the original iterator is consumed.
sts
property
¶
Get the structure source itself.
tee(maxidx=None)
¶
Create a new iterator that shares the same source structures with intelligent caching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| maxidx | | Maximum index to cache for the iterator. Important to set if you don't plan to iterate through all structures, otherwise other iterators will keep waiting for this iterator to consume the cache. | None |
Returns:
| Type | Description |
|---|---|
| StructureIter[T] | A new StructureIter if using an iterator source, otherwise returns self. |
Example
from zzlib.sch.structure import StructureIter
structures = StructureIter("structures.mae")
structures.set_tee_only()
# Only process first 100 structures
iter1 = structures.tee(maxidx=100)
# Process all structures
iter2 = structures.tee()
# Cache for first 100 structures will be cleared after both iter1 and iter2 finish
# Cache for structures after 100 will be cleared immediately after iter2 processes them
Note
When using an iterator as the Structure source, this method must be called before the original iterator is consumed.
using_iterator()
¶
Check whether an iterator is being used as the Structure source.
zzlib.sch.structure.SubstructureNotFoundError
¶
Bases: StructureError
Exception raised when a requested substructure is not found in the parent structure.