Molecule

Molecule utilities

Various molecule and isomer handling steps, including isomer generation and embedding.

class maize.steps.mai.molecule.Smiles2Molecules(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Converts SMILES codes into a set of molecules with distinct isomers and conformers using the RDKit embedding functionality.

See also

Gypsum

A more advanced procedure for producing different isomers and high-energy conformers.

inp: Input[list[str]]

SMILES input

out: Output[list[IsomerCollection]]

Molecule output

n_conformers: Parameter[int]

Number of conformers to generate (default = 1)

n_variants: Parameter[int]

Maximum number of stereoisomers to generate (default = 1)

embed: Parameter[bool]

Whether to create embeddings for the molecule. May not be required if passing it on to another embedding system. (default = True)

class maize.steps.mai.molecule.Gypsum(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Converts SMILES codes into a set of 3D molecules using Gypsum-DL [1].

3D embedding can fail, and in those cases it falls back on RDKit.

Notes

This node does not use the MPI capabilities of Gypsum. Merely having MPI4PY installed can cause problems when executing this step on some HPC systems, so it may be best not to install it for this use case.

References

See also

Smiles2Molecules

A simple, fast, and less accurate alternative to Gypsum, using RDKit embedding functionality.

required_callables: ClassVar[list[str]] = ['gypsum']

List of external commandline programs that are required for running the component.

inp: Input[list[str]]

SMILES input

out: Output[list[IsomerCollection]]

Molecule output

n_variants: Parameter[int]

Maximum number of variants to generate (default = 1)

thoroughness: Parameter[int]

Multiplier for the number of sampled conformers whose energies are evaluated. Higher numbers will increase the computational cost by performing more UFF energy evaluations. (default = 3)

ph_range: Parameter[tuple[float, float]]

The pH range in which to generate variants (min, max) (default = (6.4, 8.4))

use_filters: Parameter[bool]

Whether to use additional substructure filters from the Durrant lab (default = True)

n_jobs: Parameter[int]

Number of parallel processes to use (default = 2)

timeout: Parameter[int]

Timeout per SMILES in seconds; once it is exceeded, an RDKit embedding is attempted instead (default = 5)

class maize.steps.mai.molecule.SaveMolecule(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Save a molecule to an SDF file.

inp: Input[IsomerCollection]

Molecule input

path: FileParameter[Path]

SDF output destination

class maize.steps.mai.molecule.SaveScores(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Save VINA Scores to a JSON file.

inp: Input[ndarray[Any, dtype[float32]]]

Scores input

path: FileParameter[Path]

JSON output destination
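
As a rough illustration of what writing scores to JSON involves (a sketch, not this node's actual implementation): the standard `json` module cannot serialize NumPy arrays or `float32` values directly, so an array would typically be converted with `ndarray.tolist()` first. The helper name `save_scores` and the `{"scores": [...]}` file layout are assumptions made for this example, and a plain list stands in for the array so the snippet runs without NumPy.

```python
import json
import tempfile
from pathlib import Path


def save_scores(scores: list, path: Path) -> None:
    """Write a flat list of scores to a JSON file (hypothetical layout)."""
    with path.open("w") as handle:
        # Convert every entry to a built-in float so that json can encode it
        json.dump({"scores": [float(s) for s in scores]}, handle)


# Stand-in for a float32 array; with NumPy one would pass `array.tolist()`
scores = [-7.5, -6.25, -8.0]
out_path = Path(tempfile.mkdtemp()) / "scores.json"
save_scores(scores, out_path)

# Read the file back to confirm the round trip
loaded = json.loads(out_path.read_text())["scores"]
```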

class maize.steps.mai.molecule.LoadSmiles(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Load SMILES codes from a .smi file.

path: FileParameter[Path]

SMILES file input

out: Output[list[str]]

SMILES output

sample: Parameter[int]

Take a sample of SMILES
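
The following standard-library sketch shows one way a .smi file could be read and subsampled. It is not the node's implementation: the function `load_smiles`, its `seed` argument, and the choice of random sampling without replacement are assumptions made for illustration. It relies only on the common .smi convention of one SMILES per line, optionally followed by whitespace and a name.

```python
import random
import tempfile
from pathlib import Path
from typing import Optional


def load_smiles(path: Path, sample: Optional[int] = None, seed: int = 0) -> list:
    """Read one SMILES per line, optionally returning a random subset."""
    smiles = []
    for line in path.read_text().splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            smiles.append(fields[0])  # first column is the SMILES code
    if sample is not None and sample < len(smiles):
        # Assumption: the sample is random and drawn without replacement
        smiles = random.Random(seed).sample(smiles, sample)
    return smiles


# Write a small example file, then load all of it and a sample of two entries
smi_file = Path(tempfile.mkdtemp()) / "library.smi"
smi_file.write_text("CCO ethanol\nc1ccccc1 benzene\n\nCC(=O)O acetic_acid\n")
all_smiles = load_smiles(smi_file)
subset = load_smiles(smi_file, sample=2)
```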

class maize.steps.mai.molecule.SaveLibrary(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Save a list of molecules to multiple SDF files.

inp: Input[list[IsomerCollection]]

Molecule library input

base_path: FileParameter[Path]

Base output file path name without a suffix, e.g. /path/to/output

class maize.steps.mai.molecule.LoadLibrary(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Load a small molecule library from an SDF file.

path: FileParameter[Path]

Input SDF file

out: Output[list[IsomerCollection]]

Molecule output, each entry in the SDF is parsed as a separate molecule
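
To make "each entry in the SDF" concrete: records in an SDF file are terminated by a line containing `$$$$`. The sketch below splits SDF text on that delimiter using only the standard library. It is not how this node parses molecules (a cheminformatics toolkit would normally do that); the function `split_sdf_records` and the simplified single-atom records are assumptions for illustration.

```python
def split_sdf_records(text: str) -> list:
    """Split SDF text into one string per record, using the '$$$$' delimiter."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            # End of a record: store the collected lines and start a new one
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


# Two simplified single-atom records, for illustration only
SDF_TEXT = """\
methane
  sketch

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
M  END
$$$$
water
  sketch

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
M  END
$$$$
"""

records = split_sdf_records(SDF_TEXT)
# The first line of each record is its title
titles = [record.splitlines()[0] for record in records]
```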

class maize.steps.mai.molecule.LoadMolecule(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Load a molecule from an SDF file.

out: Output[Isomer]

Isomer output

path: Input[Path]

Path to the SDF file

class maize.steps.mai.molecule.Ligprep(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Calls Schrodinger’s Ligprep tool to embed small molecules and create isomers.

Notes

Due to Schrodinger’s licensing system, each call to a tool requires going through Schrodinger’s job server. This is run separately for each job to avoid conflicts with a potentially running main server.

See also

Smiles2Molecules

A simple, fast, and less accurate alternative to Gypsum and Ligprep, using RDKit embedding functionality.

Gypsum

A more advanced procedure for producing different isomers and high-energy conformers, and an open-source alternative to Ligprep.

required_callables: ClassVar[list[str]] = ['ligprep']

List of external commandline programs that are required for running the component.

inp: Input[list[str]]

SMILES input

out: Output[list[IsomerCollection]]

Embedded isomer collection output

epik: Parameter[bool]

Whether to use Epik for ionization and tautomerization (default = True)

ionization: Parameter[Literal[0, 1, 2]]

Ionization treatment: 0 - do not ionize / neutralize, 1 - only neutralize, 2 - both (default = 1)

ph: Parameter[float]

Target pH

host: Parameter[str]

Host to use for job submission (default = localhost)

n_jobs: Parameter[int]

Number of jobs to spawn (default = 1)

prepare() → None

Prepares the execution environment for run.

Performs the following:

  • Changing the python environment, if required

  • Setting of environment variables

  • Setting of parameters from the config

  • Loading LMOD modules

  • Importing python packages listed in required_packages

  • Checking if software in required_callables is available

run_command(command: str | list[str], validators: Sequence[Validator] | None = None, verbose: bool = False, raise_on_failure: bool = True, command_input: str | None = None, pre_execution: str | list[str] | None = None, batch_options: JobResourceConfig | None = None, timeout: float | None = None) → CompletedProcess[bytes]

Runs an external command.

Parameters:
  • command – Command to run as a single string, or a list of strings

  • validators – One or more Validator instances that will be called on the result of the command.

  • verbose – If True, also logs any STDOUT or STDERR output

  • raise_on_failure – Whether to raise an exception when encountering a failure

  • command_input – Text string used as input for command

  • pre_execution – Command to run directly before the main one

  • batch_options – Job options for the batch system; if given, will attempt to run on the batch system

  • timeout – Maximum runtime for the command in seconds, or unlimited if None

Returns:

Result of the execution, including STDOUT and STDERR

Return type:

subprocess.CompletedProcess[bytes]

Raises:

ProcessError – If any of the validators failed or the returncode was not zero

Examples

To run a single command:

>>> self.run_command("echo foo", validators=[SuccessValidator("foo")])

To run on a batch system, if configured:

>>> self.run_command("echo foo", batch_options=JobResourceConfig(nodes=1))
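
For readers unfamiliar with the pattern, the behaviour described above (run a command, enforce a timeout, fail on a non-zero return code or an unexpected output) can be approximated with the standard library alone. This is a sketch and not maize's implementation: `run_checked` is a hypothetical helper, a plain substring check stands in for a Validator, and `RuntimeError` stands in for ProcessError.

```python
import subprocess
import sys


def run_checked(command, expected=None, timeout=None):
    """Run a command, raising if it fails or lacks the expected output."""
    # subprocess.run raises subprocess.TimeoutExpired if `timeout` is exceeded
    result = subprocess.run(command, capture_output=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(f"Command failed with code {result.returncode}")
    if expected is not None and expected not in result.stdout.decode():
        raise RuntimeError(f"Expected {expected!r} in STDOUT")
    return result


# Portable equivalent of `echo foo`, using the current Python interpreter
result = run_checked(
    [sys.executable, "-c", "print('foo')"], expected="foo", timeout=30
)
output = result.stdout.decode().strip()
```

As with run_command, the returned object is a subprocess.CompletedProcess[bytes], so STDOUT and STDERR need decoding before text comparisons.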

ph_tolerance: Parameter[float]

pH tolerance

max_stereo: Parameter[int]

Maximum number of stereoisomers to generate (default = 32)

class maize.steps.mai.molecule.ToSmiles(parent: Graph | None = None, name: str | None = None, description: str | None = None, fail_ok: bool = False, n_attempts: int = 1, level: int | str | None = None, cleanup_temp: bool = True, resume: bool = False, logfile: Path | None = None, max_cpus: int | None = None, max_gpus: int | None = None, loop: bool | None = None, max_loops: int = -1, initial_status: Status = Status.NOT_READY)

Transforms an Isomer or IsomerCollection to SMILES.

inp: Input[Isomer | IsomerCollection]

Molecule input

out: Output[List[str]]

SMILES output