rxnutils.chem.features package

Submodules

rxnutils.chem.features.reaction_centre_fp module

Calculates reaction centre RDKit fingerprints

class rxnutils.chem.features.reaction_centre_fp.ReactionCentreFingerprint(numbits=1024, max_centers=None)

Bases: object

Reaction featurizer based on native RDKit functions for reaction centre atoms.

If max_centers is not set, the output of the featurization is a list with one fingerprint per reaction centre. If max_centers is given, the output is a flattened, concatenated list of the fingerprints of all centres up to max_centers. This fingerprint is padded with zeros, and an initial bit indicates the number of fingerprints that have been concatenated.

Parameters:
  • numbits (int) – length of the fingerprint

  • max_centers (int | None) – if given, the individual fingerprints are concatenated, up to this maximum number of centres

numbits: int = 1024
max_centers: int | None = None
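
The concatenated layout described above can be illustrated with a small standalone sketch. It is not the library's implementation: the helper name concatenate_centre_fps is hypothetical, and it assumes the count is stored as a single leading value and that the padded output has length 1 + max_centers * numbits.

```python
from typing import List, Sequence


def concatenate_centre_fps(
    centre_fps: Sequence[Sequence[float]], numbits: int, max_centers: int
) -> List[float]:
    """Flatten up to max_centers fingerprints into one zero-padded vector.

    Hypothetical helper; the output layout is an assumption, not rxnutils code.
    """
    # Keep at most max_centers fingerprints, dropping any extra centres
    kept = [list(fp) for fp in centre_fps[:max_centers]]
    if any(len(fp) != numbits for fp in kept):
        raise ValueError("every centre fingerprint must have numbits entries")
    # Leading element: how many fingerprints were concatenated
    flat = [float(len(kept))]
    for fp in kept:
        flat.extend(float(x) for x in fp)
    # Zero-pad so the length is always 1 + max_centers * numbits
    flat.extend([0.0] * (1 + max_centers * numbits - len(flat)))
    return flat


two_centres = [[1, 0, 1, 0], [0, 1, 1, 0]]
print(concatenate_centre_fps(two_centres, numbits=4, max_centers=3))
```

A fixed-length vector of this kind can be fed to models that expect a constant input size, whereas the per-centre list (max_centers unset) keeps each centre separate for pairwise comparison.
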
rxnutils.chem.features.reaction_centre_fp.reaction_center_similarity(fingerprints1, fingerprints2)

Calculate the maximum similarity between two sets of reaction center fingerprints

Parameters:
  • fingerprints1 (List[List[float]]) – the first set of center fingerprints

  • fingerprints2 (List[List[float]]) – the second set of center fingerprints

Returns:

the maximum Jaccard similarity

Return type:

float
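
As a rough illustration of comparing two sets of centre fingerprints, the sketch below computes the largest pairwise Jaccard similarity in plain Python. It is an assumption about the general technique, not the library's code: the function names are hypothetical, non-zero entries are treated as set bits, and two all-zero vectors are treated as identical.

```python
from typing import Sequence


def jaccard_similarity(fp1: Sequence[float], fp2: Sequence[float]) -> float:
    """Jaccard similarity of two equal-length vectors (non-zero means bit set)."""
    if len(fp1) != len(fp2):
        raise ValueError("fingerprints must have the same length")
    # Size of the intersection and union of the set bits
    both = sum(1 for a, b in zip(fp1, fp2) if a and b)
    either = sum(1 for a, b in zip(fp1, fp2) if a or b)
    # Assumption: two all-zero vectors count as identical
    return both / either if either else 1.0


def max_centre_similarity(
    fingerprints1: Sequence[Sequence[float]],
    fingerprints2: Sequence[Sequence[float]],
) -> float:
    """Largest Jaccard similarity over all pairs drawn from the two sets."""
    return max(
        jaccard_similarity(a, b) for a in fingerprints1 for b in fingerprints2
    )


set1 = [[1, 1, 0, 0], [0, 0, 1, 1]]
set2 = [[1, 1, 1, 0]]
print(max_centre_similarity(set1, set2))
```

Taking the maximum means two reactions are considered similar as soon as any one centre of the first closely matches any one centre of the second.
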

rxnutils.chem.features.rxnfp_runner module

Module containing a script to calculate RXNFP for a set of reactions

rxnutils.chem.features.rxnfp_runner.main(input_args=None)

Function for command-line tool

Parameters:

input_args (Sequence[str] | None)

Return type:

None

rxnutils.chem.features.sc_score module

Module containing the implementation of the SC-score model for synthetic complexity scoring.

class rxnutils.chem.features.sc_score.SCScore(model_path, fingerprint_length=1024, fingerprint_radius=2)

Bases: object

Encapsulation of the SCScore model

Re-write of the SCScorer from the scscorer package using ONNX

The score is predicted from a sanitized instance of an RDKit molecule:

from rdkit import Chem
from rxnutils.chem.features.sc_score import SCScore

mol = Chem.MolFromSmiles("CCC")
scscorer = SCScore("path_to_model")
score = scscorer(mol)

Parameters:
  • model_path (str) – the filename of the model weights and biases

  • fingerprint_length (int) – the number of bits in the fingerprint

  • fingerprint_radius (int) – the radius of the fingerprint

rxnutils.chem.features.simple_rdkit module

Calculates RDKit fingerprints for reactions

class rxnutils.chem.features.simple_rdkit.SimpleRdkitFingerprint(featurizer, numbits=2048, product_bits=0)

Bases: object

Reaction featurizer based on native RDKit functions.

The featurizer is used by calling it with a reaction SMILES; it returns a list of floats constituting the fingerprint.

Parameters:
  • featurizer (str) – the type of featurizer

  • numbits (int) – length of the fingerprint

  • product_bits (int) – the number of bits for the product

featurizer: str
numbits: int = 2048
product_bits: int = 0
rxnutils.chem.features.simple_rdkit.fingerprint_mixed(mol, numbits)

Calculates a mixed fingerprint, with the first half being an ECFP3 and the second half being an RDKit fingerprint

Parameters:
  • mol (Mol) – the molecule

  • numbits (int) – the length of the final fingerprint

Return type:

ndarray

rxnutils.chem.features.simple_rdkit.fingerprint_ecfp(mol, numbits)

Calculates an ECFP6

Parameters:
  • mol (Mol) – the molecule

  • numbits (int) – the length of the final fingerprint

Return type:

ndarray

Module contents