Molecule
Molecule utilities
Various molecule and isomer handling steps, including isomer generation and embedding.
- class maize.steps.mai.molecule.Smiles2Molecules(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Converts SMILES codes into a set of molecules with distinct isomers and conformers using the RDKit embedding functionality.
See also
Gypsum
A more advanced procedure for producing different isomers and high-energy conformers.
- out: Output[list[IsomerCollection]]
Molecule output
- class maize.steps.mai.molecule.Gypsum(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Converts SMILES codes into a set of 3D molecules using Gypsum-DL [1].
If the Gypsum 3D embedding fails, the step falls back on RDKit.
Notes
The implementation in this node does not use the MPI capabilities of Gypsum, and simply installing MPI4PY can cause problems executing this step on some HPC systems, so it may be better not to install it for this use case.
See also
Smiles2Molecules
A simple, fast, and less accurate alternative to Gypsum, using RDKit embedding functionality.
- required_callables: ClassVar[list[str]] = ['gypsum']
List of external commandline programs that are required for running the component.
- out: Output[list[IsomerCollection]]
Molecule output
- thoroughness: Parameter[int]
Multiplier for the number of conformers sampled for energy evaluation. Higher values increase the computational cost because more UFF energy evaluations are performed. (default = 3)
- ph_range: Parameter[tuple[float, float]]
The pH range in which to generate variants, as (min, max) (default = (6.4, 8.4))
- class maize.steps.mai.molecule.SaveMolecule(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Save a molecule to an SDF file.
- inp: Input[IsomerCollection]
Molecule input
- path: FileParameter[Path]
SDF output destination
- class maize.steps.mai.molecule.SaveScores(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Save Vina scores to a JSON file.
- path: FileParameter[Path]
JSON output destination
- class maize.steps.mai.molecule.LoadSmiles(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Load SMILES codes from a .smi file.
- path: FileParameter[Path]
SMILES file input
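As an illustration of the kind of file this step reads, the following is a minimal sketch that parses a .smi file with the standard library only. It assumes the common layout of one SMILES code per line, optionally followed by whitespace and a name. The helper read_smiles is hypothetical and not part of maize, and the parsing done by LoadSmiles itself may differ.

```python
import tempfile
from pathlib import Path


def read_smiles(path: Path) -> list[str]:
    """Return the SMILES code from each non-empty line of a .smi file."""
    smiles = []
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Assumption: the SMILES is the first whitespace-separated token,
        # anything after it (typically a molecule name) is ignored
        smiles.append(line.split()[0])
    return smiles


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        smi_file = Path(tmp) / "library.smi"
        smi_file.write_text("CCO ethanol\nc1ccccc1 benzene\n")
        print(read_smiles(smi_file))
```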
- class maize.steps.mai.molecule.SaveLibrary(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Save a list of molecules to multiple SDF files.
- inp: Input[list[IsomerCollection]]
Molecule library input
- base_path: FileParameter[Path]
Base output file path without a suffix, e.g. /path/to/output
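To make the role of a suffix-less base path concrete, here is a small sketch of how one SDF path per molecule could be derived from it. The naming scheme (base name, then index, then .sdf) and the helper library_paths are assumptions made for illustration only; the file names SaveLibrary actually writes are not specified here.

```python
from pathlib import Path


def library_paths(base_path: Path, n_molecules: int) -> list[Path]:
    """Build one SDF path per molecule from a base path without a suffix."""
    # Hypothetical naming scheme: '<base><index>.sdf'
    return [
        base_path.with_name(f"{base_path.name}{index}.sdf")
        for index in range(n_molecules)
    ]


if __name__ == "__main__":
    for path in library_paths(Path("/path/to/output"), 3):
        print(path)
```

The sketch uses with_name rather than with_suffix so that a base name that already contains a dot (for example run.v2) is kept intact.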
- class maize.steps.mai.molecule.LoadLibrary(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Load a small molecule library from an SDF file.
- path: FileParameter[Path]
Input SDF file
- out: Output[list[IsomerCollection]]
Molecule output, each entry in the SDF is parsed as a separate molecule
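The notion of "each entry in the SDF" can be sketched as follows. In the SDF format, records are terminated by a line containing only $$$$, so splitting on that delimiter yields one text block per molecule. The helper split_sdf is hypothetical and only illustrates the record structure; LoadLibrary presumably relies on a chemistry toolkit to parse the records rather than on plain string handling.

```python
def split_sdf(text: str) -> list[str]:
    """Split the contents of a multi-record SDF file into single records."""
    records: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            # The delimiter closes the current record
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a trailing record that lacks the final delimiter
    if any(line.strip() for line in current):
        records.append("\n".join(current))
    return records


if __name__ == "__main__":
    example = "mol_a\n\n\nM  END\n$$$$\nmol_b\n\n\nM  END\n$$$$\n"
    for record in split_sdf(example):
        # The first line of an SDF record is the molecule title
        print(record.splitlines()[0])
```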
- class maize.steps.mai.molecule.LoadMolecule(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Load a molecule from an SDF file.
- class maize.steps.mai.molecule.Ligprep(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Calls Schrödinger’s Ligprep tool to embed small molecules and create isomers.
Notes
Due to Schrödinger’s licensing system, each call to a tool requires going through Schrödinger’s job server. A separate job server is run for each job to avoid conflicts with a potentially running main server.
See also
Smiles2Molecules
A simple, fast, and less accurate alternative to Gypsum and Ligprep, using RDKit embedding functionality.
Gypsum
A more advanced procedure for producing different isomers and high-energy conformers, and an open-source alternative to Ligprep.
- required_callables: ClassVar[list[str]] = ['ligprep']
List of external commandline programs that are required for running the component.
- out: Output[list[IsomerCollection]]
Embedded isomer collection output
- ionization: Parameter[Literal[0, 1, 2]]
Ionization treatment type: 0 - do not ionize / neutralize, 1 - only neutralize, 2 - both (default = 1)
- host: Parameter[str]
Host to use for job submission (default = localhost)
- n_jobs: Parameter[int]
Number of jobs to spawn (default = 1)
- prepare() → None
Prepares the execution environment for run.
Performs the following:
Changing the Python environment, if required
Setting of environment variables
Setting of parameters from the config
Loading LMOD modules
Importing Python packages listed in required_packages
Checking if software in required_callables is available
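The last item in the list above, checking that the programs named in required_callables are available, can be sketched with the standard library as a lookup on PATH. This is only an illustration under that assumption: missing_callables is a hypothetical helper, and the actual check in maize may also take configured commands, modules, or scripts into account.

```python
import shutil


def missing_callables(required: list[str]) -> list[str]:
    """Return the names in `required` that cannot be found on PATH."""
    return [name for name in required if shutil.which(name) is None]


if __name__ == "__main__":
    # 'ligprep' is the callable listed in required_callables for Ligprep
    missing = missing_callables(["ligprep"])
    if missing:
        print(f"Missing required programs: {missing}")
    else:
        print("All required programs are available")
```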
- run_command(command: str | list[str], validators: Sequence[Validator] | None = None, verbose: bool = False, raise_on_failure: bool = True, command_input: str | None = None, pre_execution: str | list[str] | None = None, batch_options: JobResourceConfig | None = None, timeout: float | None = None) → CompletedProcess[bytes]
Runs an external command.
- Parameters:
command – Command to run as a single string, or a list of strings
validators – One or more Validator instances that will be called on the result of the command.
verbose – If True, will also log any STDOUT or STDERR output
raise_on_failure – Whether to raise an exception when encountering a failure
command_input – Text string used as input for command
pre_execution – Command to run directly before the main one
batch_options – Job options for the batch system; if given, will attempt to run on the batch system
timeout – Maximum runtime for the command in seconds, or unlimited if None
- Returns:
Result of the execution, including STDOUT and STDERR
- Return type:
CompletedProcess[bytes]
- Raises:
ProcessError – If any of the validators failed or the returncode was not zero
Examples
To run a single command:
>>> self.run_command("echo foo", validators=[SuccessValidator("foo")])
To run on a batch system, if configured:
>>> self.run_command("echo foo", batch_options=JobResourceConfig(nodes=1))
- class maize.steps.mai.molecule.ToSmiles(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)
Transform an Isomer or IsomerCollection to SMILES.
- inp: Input[Isomer | IsomerCollection]
Molecule input to convert to SMILES