Docking
Steps performing some form of docking, starting from an Isomer instance.
- class maize.steps.mai.docking.Glide(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Calls Schrodinger’s GLIDE to dock small molecules.
Notes
Due to Schrodinger’s licensing system, each call to a tool requires going through Schrodinger’s job server. This is run separately for each job to avoid conflicts with a potentially running main server.
See also
Vina
A popular open-source docking program
AutoDockGPU
Another popular open-source docking tool with GPU support
- required_callables: ClassVar[list[str]] = ['glide']
List of external commandline programs that are required for running the component.
- inp: Input[list[IsomerCollection]]
Molecules to dock
- out: Output[list[IsomerCollection]]
Docked molecules with poses and energies included
- host: Parameter[str]
Host to use for job submission (default = localhost)
- n_jobs: Parameter[int]
Number of jobs to spawn (default = 1)
- prepare() -> None
Prepares the execution environment for run.
Performs the following:
Changing the python environment, if required
Setting of environment variables
Setting of parameters from the config
Loading LMOD modules
Importing python packages listed in required_packages
Checking if software in required_callables is available
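The last check in this list can be approximated outside of a workflow. The following is a minimal sketch, assuming that a tool counts as available when it can be resolved on the PATH; the helper name `missing_callables` is hypothetical and this is not the implementation used by `prepare()`.

```python
import shutil


def missing_callables(required: list[str]) -> list[str]:
    """Return the entries of `required` that cannot be found on the PATH."""
    return [name for name in required if shutil.which(name) is None]


# Hypothetical usage: report what is missing before attempting a Glide run
missing = missing_callables(["glide"])
if missing:
    print(f"Not found on PATH: {', '.join(missing)}")
```

Running such a check up front can surface a missing installation before any molecules are sent through the graph.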
- run_command(command: str | list[str], validators: Sequence[Validator] | None = None, verbose: bool = False, raise_on_failure: bool = True, command_input: str | None = None, pre_execution: str | list[str] | None = None, batch_options: JobResourceConfig | None = None, timeout: float | None = None) -> CompletedProcess[bytes]
Runs an external command.
- Parameters:
command – Command to run as a single string, or a list of strings
validators – One or more Validator instances that will be called on the result of the command.
verbose – If True, will also log any STDOUT or STDERR output
raise_on_failure – Whether to raise an exception when encountering a failure
command_input – Text string used as input for command
pre_execution – Command to run directly before the main one
batch_options – Job options for the batch system; if given, will attempt to run on the batch system
timeout – Maximum runtime for the command in seconds, or unlimited if None
- Returns:
Result of the execution, including STDOUT and STDERR
- Return type:
CompletedProcess[bytes]
- Raises:
ProcessError – If any of the validators failed or the returncode was not zero
Examples
To run a single command:
>>> self.run_command("echo foo", validators=[SuccessValidator("foo")])
To run on a batch system, if configured:
>>> self.run_command("echo foo", batch_options=JobResourceConfig(nodes=1))
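The behaviour described above (run a command, check the return code, then apply validators to the result) can be illustrated with the standard library alone. This is a sketch under stated assumptions: `run_checked` and `stdout_contains` are hypothetical names, plain callables stand in for `Validator` instances, and `RuntimeError` stands in for `ProcessError`. It does not cover batch submission or `pre_execution`.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence

# A check receives the finished process and returns True on success
Check = Callable[[subprocess.CompletedProcess], bool]


def run_checked(
    command: list[str],
    checks: Sequence[Check] = (),
    timeout: Optional[float] = None,
) -> subprocess.CompletedProcess:
    """Run `command`, then fail on a non-zero return code or a failed check."""
    result = subprocess.run(command, capture_output=True, timeout=timeout, check=False)
    if result.returncode != 0:
        raise RuntimeError(f"Command failed with return code {result.returncode}")
    for check in checks:
        if not check(result):
            raise RuntimeError("Output check failed")
    return result


def stdout_contains(text: str) -> Check:
    """Build a check that passes if `text` appears in STDOUT."""
    return lambda result: text.encode() in result.stdout


# Use the current Python interpreter as a portable stand-in for an external tool
result = run_checked(
    [sys.executable, "-c", "print('foo')"],
    checks=[stdout_contains("foo")],
)
```

Checking the output in addition to the return code is useful for tools that exit with status zero even when the calculation itself failed.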
- class maize.steps.mai.docking.Vina(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Runs Vina [1] on a molecule input.
The step expects to either find a vina executable in the PATH, an appropriate module defined in config.toml, or a module specified using the modules attribute.
References
- inp: Input[list[IsomerCollection]]
List of molecules to dock
- n_jobs: Parameter[int]
Number of docking runs to perform in parallel (default = 2)
- n_poses: Parameter[int]
Number of poses to generate (default = 1)
- out: Output[list[IsomerCollection]]
Docked molecules with conformations and scores attached
- prepare() -> None
Prepares the execution environment for run.
Performs the following:
Changing the python environment, if required
Setting of environment variables
Setting of parameters from the config
Loading LMOD modules
Importing python packages listed in required_packages
Checking if software in required_callables is available
- receptor: FileParameter[Annotated[Path, Suffix('pdbqt')]]
Path to the receptor structure
- search_range: Parameter[tuple[float, float, float]]
Range of the search space for docking (default = (15.0, 15.0, 15.0))
- seed: Parameter[int]
The default seed (default = 42)
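For orientation, the parameters above correspond roughly to options of the AutoDock Vina command line. The sketch below assembles such a command as a list of strings. The helper name `build_vina_command`, the `center` argument, the file names, and the mapping from this step's parameters to Vina flags (including treating `search_range` as the full box edge lengths) are assumptions made for illustration, not a description of how the step invokes Vina internally.

```python
from pathlib import Path


def build_vina_command(
    receptor: Path,
    ligand: Path,
    output: Path,
    center: tuple[float, float, float],
    search_range: tuple[float, float, float] = (15.0, 15.0, 15.0),
    n_poses: int = 1,
    seed: int = 42,
) -> list[str]:
    """Assemble a Vina call for a single ligand (illustrative mapping only)."""
    cmd = ["vina", "--receptor", str(receptor), "--ligand", str(ligand), "--out", str(output)]
    # One center and one box size option per Cartesian axis
    for axis, coord, size in zip("xyz", center, search_range):
        cmd += [f"--center_{axis}", str(coord), f"--size_{axis}", str(size)]
    cmd += ["--num_modes", str(n_poses), "--seed", str(seed)]
    return cmd


command = build_vina_command(
    Path("receptor.pdbqt"),
    Path("ligand.pdbqt"),
    Path("docked.pdbqt"),
    center=(1.0, 2.0, 3.0),
)
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.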
- class maize.steps.mai.docking.VinaGPU(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Runs Vina-GPU [3] on a molecule input.
The step expects to either find a vina executable in the PATH, an appropriate module defined in config.toml, or a module specified using the modules attribute.
Notes
The interface is mostly the same as Vina’s, but requires some additional handling of the custom compiled kernels, a small change in the commandline parameters, and allows for docking a directory of ligands at once. The source can be found here. Installation requires both the boost sources and installed headers, and -DOPENCL_3_0 should not be specified (contrary to the official installation instructions).
References
[3] Ding, J. et al. Vina-GPU 2.0: Further Accelerating AutoDock Vina and Its Derivatives with Graphics Processing Units. J. Chem. Inf. Model. (2023) doi:10.1021/acs.jcim.2c01504.
[4] Trott, O. & Olson, A. J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry 31, 455-461 (2010).
- inp: Input[list[IsomerCollection]]
List of molecules to dock
- n_jobs: Parameter[int]
Number of docking runs to perform in parallel (default = 2)
- n_poses: Parameter[int]
Number of poses to generate (default = 1)
- out: Output[list[IsomerCollection]]
Docked molecules with conformations and scores attached
- prepare() -> None
Prepares the execution environment for run.
Performs the following:
Changing the python environment, if required
Setting of environment variables
Setting of parameters from the config
Loading LMOD modules
Importing python packages listed in required_packages
Checking if software in required_callables is available
- receptor: FileParameter[Annotated[Path, Suffix('pdbqt')]]
Path to the receptor structure
- search_range: Parameter[tuple[float, float, float]]
Range of the search space for docking (default = (15.0, 15.0, 15.0))
- seed: Parameter[int]
The default seed (default = 42)
- class maize.steps.mai.docking.QuickVinaGPU(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Runs QuickVina2 or QuickVina-W for GPUs [3] on a molecule input. For an overview, see this.
The step expects to either find a quickvina executable in the PATH, an appropriate module defined in config.toml, or a module specified using the modules attribute.
Notes
The interface is mostly the same as Vina’s, but requires some additional handling of the custom compiled kernels, a small change in the commandline parameters, and allows for docking a directory of ligands at once. The source can be found here. Installation requires both the boost sources and installed headers, and -DOPENCL_3_0 should not be specified (contrary to the official installation instructions).
References
[5] Hassan, N. M., Alhossary, A. A., Mu, Y. & Kwoh, C.-K. Protein-Ligand Blind Docking Using QuickVina-W With Inter-Process Spatio-Temporal Integration. Sci Rep 7, 15451 (2017).
[6] Alhossary, A., Handoko, S. D., Mu, Y. & Kwoh, C.-K. Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics 31, 2214-2216 (2015).
- inp: Input[list[IsomerCollection]]
List of molecules to dock
- n_jobs: Parameter[int]
Number of docking runs to perform in parallel (default = 2)
- n_poses: Parameter[int]
Number of poses to generate (default = 1)
- out: Output[list[IsomerCollection]]
Docked molecules with conformations and scores attached
- prepare() -> None
Prepares the execution environment for run.
Performs the following:
Changing the python environment, if required
Setting of environment variables
Setting of parameters from the config
Loading LMOD modules
Importing python packages listed in required_packages
Checking if software in required_callables is available
- receptor: FileParameter[Annotated[Path, Suffix('pdbqt')]]
Path to the receptor structure
- search_range: Parameter[tuple[float, float, float]]
Range of the search space for docking (default = (15.0, 15.0, 15.0))
- seed: Parameter[int]
The default seed (default = 42)
- class maize.steps.mai.docking.AutoDockGPU(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Runs AutoDock on the GPU [7].
Notes
Clone the repo from here, load modules for the compiler and CUDA, set GPU_INCLUDE_PATH and GPU_LIBRARY_PATH, and run make DEVICE=CUDA. This also requires meeko to convert to and from pdbqt files; specify mk_prepare and mk_export.
If you get very high docking scores, this often means that the ligand is outside of the grid. This can be due to a map that is too small (increase search_range) or a misplaced box that is hard to access (modify search_center).
References
[7] Santos-Martins, D. et al. Accelerating AutoDock4 with GPUs and Gradient-Based Local Search. J. Chem. Theory Comput. 17, 1060-1073 (2021).
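When diagnosing the very high scores mentioned in the notes, it can help to test whether the ligand coordinates actually fall inside the search box. The sketch below assumes an axis-aligned box described by a center and full edge lengths, which is one plausible reading of `search_center` and `search_range`; the function name `atoms_outside_box` and the example coordinates are hypothetical.

```python
Point = tuple[float, float, float]


def atoms_outside_box(coords: list[Point], center: Point, search_range: Point) -> list[int]:
    """Return indices of atoms lying outside an axis-aligned box.

    Assumes `search_range` holds the full edge lengths of the box, so an
    atom is inside if it is within half an edge of the center on every axis.
    """
    outside = []
    for index, xyz in enumerate(coords):
        if any(abs(x - c) > edge / 2 for x, c, edge in zip(xyz, center, search_range)):
            outside.append(index)
    return outside


# Hypothetical ligand with one atom 10 units from the center along x
ligand = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
outside = atoms_outside_box(ligand, center=(0.0, 0.0, 0.0), search_range=(15.0, 15.0, 15.0))
```

A non-empty result suggests enlarging the box or moving its center before rerunning the docking.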
- required_packages: ClassVar[list[str]] = ['meeko']
Requires a custom environment with meeko==0.4 installed
- inp: Input[list[IsomerCollection]]
List of molecules to dock. Each molecule can have multiple isomers, which will be docked separately.
- out: Output[list[IsomerCollection]]
Docked molecules with conformations and scores attached. Also includes per-conformer clustering information computed by AutoDock; use the keys ‘rmsd’, ‘cluster_rmsd’, and ‘cluster’ to access it.
- out_scores: Output[ndarray[Any, dtype[float32]]]
Docking scores, the best for each docked IsomerCollection
- grid_file: FileParameter[Path]
The protein grid file; all internally referenced files must be available
- derivtypes: Parameter[dict[str, str]]
Atomtype mappings to add to derivtype, e.g. NA->N (default = {})
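The out_scores output above carries one value per docked IsomerCollection. As a rough sketch of that reduction, the snippet below takes the lowest score per molecule from nested lists. It assumes that lower (more negative) scores are better and uses NaN for molecules without any score; both choices, and the function name `best_scores`, are illustrative assumptions rather than documented behaviour.

```python
import math


def best_scores(scores_per_molecule: list[list[float]]) -> list[float]:
    """Return the lowest score for each molecule, or NaN if it has no scores."""
    return [min(scores) if scores else math.nan for scores in scores_per_molecule]


# Three hypothetical molecules: two isomers, one isomer, and a failed docking
best = best_scores([[-7.1, -8.3], [-5.0], []])
```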
- class maize.steps.mai.docking.VinaScore(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Runs Vina scoring [1] on a molecule input.
The step expects to either find a vina executable in the PATH, an appropriate module defined in config.toml, or a module specified using the modules attribute.
- required_packages: ClassVar[list[str]] = ['meeko']
Requires a custom environment with meeko==0.4 installed
- inp: Input[list[IsomerCollection]]
List of molecules to dock
- out: Output[list[IsomerCollection]]
Molecules with scores attached.
- out_scores: Output[ndarray[Any, dtype[float32]]]
Docking scores, the best for each docked IsomerCollection
- receptor: FileParameter[Path]
Path to the receptor structure
- class maize.steps.mai.docking.PrepareGrid(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Prepares a receptor for docking with AutoDock4.
- required_callables: ClassVar[list[str]] = ['prepare_receptor', 'write_gpf', 'autogrid']
Requires various scripts and tools:
- write_gpf
Script to create GPF output with all possible atomtypes, from here.
- prepare_receptor
Included in AutoDockTools.
- autogrid
Included in the normal CPU-only version of AutoDock
- required_packages: ClassVar[list[str]] = ['meeko']
Requires a custom environment with meeko installed
- inp_ligand: Input[Isomer]
Reference ligand structure; if not provided, search_center must be set
- class maize.steps.mai.docking.PreparePDBQT(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Prepares a receptor for docking with Vina.
- required_callables: ClassVar[list[str]] = ['prepare_receptor']
Requires various scripts and tools:
- prepare_receptor
Included in AutoDockTools.
- repairs: Parameter[Literal['bonds_hydrogens', 'bonds', 'hydrogens', 'checkhydrogens', 'None']]
Types of repairs to be done to the PDB file (default = None)
- preserve_charges: Parameter[bool]
Whether to preserve existing charges instead of adding Gasteiger charges (default = False)
- class maize.steps.mai.docking.ROCS(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Performs ROCS shape-match scoring [8].
Notes
Requires a maize environment with openeye-toolkit installed. OpenEye in turn requires the OE_LICENSE environment variable to be set to a valid license file.
References
[8] Grant, J. A., Gallardo, M. A. & Pickup, B. T. A fast method of molecular shape comparison: A simple application of a Gaussian description of molecular shape. Journal of Computational Chemistry 17, 1653-1666 (1996).
See also the full list of related publications.
- inp: Input[list[IsomerCollection]]
List of molecules to be scored
- out: Output[list[IsomerCollection]]
List of molecules with conformers best matching the query
- query: FileParameter[Path]
Reference query molecule
- max_stereo: Parameter[int]
Maximum number of stereocenters to be enumerated in molecule (default = 10)
- similarity_measure: Parameter[Literal['Tanimoto', 'RefTversky', 'FitTversky']]
Similarity between reference and molecule (default = Tanimoto)
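The three options of similarity_measure differ in how the overlap between the two molecules is normalised. The sketch below uses the general Tanimoto and Tversky forms applied to overlap volumes. The weight `alpha = 0.95`, the convention that `a` is the reference and `b` the fitted molecule, and the example overlap values are assumptions for illustration and may not match the settings used by the step or by OpenEye.

```python
def tanimoto(o_ab: float, o_aa: float, o_bb: float) -> float:
    """Overlap divided by the union of both self-overlaps."""
    return o_ab / (o_aa + o_bb - o_ab)


def tversky(o_ab: float, o_aa: float, o_bb: float, alpha: float) -> float:
    """Asymmetric variant: `alpha` weights molecule a, `1 - alpha` molecule b."""
    return o_ab / (alpha * o_aa + (1.0 - alpha) * o_bb)


# Hypothetical self-overlaps of reference (a) and fit (b), and their mutual overlap
o_aa, o_bb, o_ab = 100.0, 200.0, 80.0
scores = {
    "Tanimoto": tanimoto(o_ab, o_aa, o_bb),
    # Weight dominated by the reference molecule
    "RefTversky": tversky(o_ab, o_aa, o_bb, alpha=0.95),
    # Weight dominated by the fitted molecule
    "FitTversky": tversky(o_ab, o_aa, o_bb, alpha=0.05),
}
```

In this example the reference-weighted score is the highest of the three, because most of the smaller reference is covered by the larger fitted molecule.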
- class maize.steps.mai.docking.RMSDFilter(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)[source]
Charge filtering for isomers and RMSD filtering for conformers.
Only isomers with the target charge pass the filter. For each isomer, only the conformers that minimize the RMSD to a given reference ligand are considered. If several isomers with the target charge remain after charge filtering, the isomer with either the smallest RMSD or the lowest docking score passes through the filter. In the end, at most one isomer with one conformer per SMILES passes the filter.
- inp: Input[list[IsomerCollection]]
List of molecules with isomers and conformations (from single SMILES) to filter
- out: Output[list[IsomerCollection]]
List of molecules with single isomer and conformer after filtering
- ref_lig: FileParameter[Path]
Path to the reference ligand
- reference_charge_type: Parameter[Literal['ref', 'target', 'no']]
If ‘ref’ is given, the charge of the reference ligand is the target charge. If ‘target’ is given, the charge specified under target_charge is used. If ‘no’ is given, every isomer charge is accepted. (default = target)
- strict_target_charge: Parameter[bool]
If true and no isomer with the target charge is found, an empty isomer list passes the filter. This is useful for RBFE calculations where FEP edges with changes in charge are unsuitable. If false and no isomer with the target charge is found, any other isomer charge is accepted. This is useful for a standard REINVENT run, where a conformation should pass the filter for each SMILES. (default = True)
- isomer_filter: Parameter[Literal['dock', 'rmsd', 'combo']]
If more than one isomer remains after filtering out isomers with the wrong charge, pass the isomer with the lowest docking score when set to ‘dock’, the isomer with the lowest RMSD when set to ‘rmsd’, or the isomer with the lowest combined score when set to ‘combo’. (default = dock)
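The selection logic described for this step can be summarised in a few lines. The following is a simplified sketch, not the step's implementation: `Candidate` and `select_isomer` are hypothetical names, each candidate is assumed to already carry the RMSD of its best conformer, lower docking scores are assumed to be better, and the ‘combo’ option is left out because the way the two values are combined is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical stand-in for one isomer after conformer selection."""

    name: str
    charge: int
    rmsd: float  # RMSD of its best conformer to the reference ligand
    score: float  # docking score, assumed lower is better


def select_isomer(
    isomers: list[Candidate],
    target_charge: Optional[int],
    strict: bool = True,
    isomer_filter: str = "dock",
) -> Optional[Candidate]:
    """Pick at most one isomer; `target_charge=None` accepts every charge."""
    if target_charge is None:
        pool = list(isomers)
    else:
        pool = [iso for iso in isomers if iso.charge == target_charge]
        if not pool and not strict:
            pool = list(isomers)  # non-strict mode falls back to any charge
    if not pool:
        return None
    if isomer_filter == "dock":
        return min(pool, key=lambda iso: iso.score)
    if isomer_filter == "rmsd":
        return min(pool, key=lambda iso: iso.rmsd)
    raise ValueError(f"Unsupported isomer_filter in this sketch: {isomer_filter}")


isomers = [
    Candidate("a", charge=0, rmsd=1.2, score=-7.0),
    Candidate("b", charge=0, rmsd=0.8, score=-6.5),
    Candidate("c", charge=1, rmsd=0.3, score=-9.0),
]
by_score = select_isomer(isomers, target_charge=0, isomer_filter="dock")
by_rmsd = select_isomer(isomers, target_charge=0, isomer_filter="rmsd")
```

With a target charge of 0, isomer "c" is excluded despite having the best score and RMSD, which mirrors the charge filter taking precedence over the pose-based criteria.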