aizynthfinder.analysis package

Submodules

aizynthfinder.analysis.routes module

Module containing classes to store and manipulate collections of synthetic routes.

class aizynthfinder.analysis.routes.RouteCollection(reaction_trees, **kwargs)

Bases: object

Holds a collection of reaction routes.

It can be the top scored nodes, their scores and the reaction trees created from them. It can also be a cluster of such routes.

The class has the functionality to compute collective results for the different routes such as images.

Properties of an individual route can be obtained with simple indexing.

route0 = collection[0]

Variables:
  • all_scores – all the computed scores for the routes

  • nodes – the top-ranked MCTS-like nodes

  • scores – initial scores of top-ranked nodes or routes

  • reaction_trees – the reaction trees created from the top-ranked nodes

  • clusters – the created clusters from the collection

  • route_metadata – the metadata of the reaction trees

Parameters:

reaction_trees (Sequence[ReactionTree]) – the trees to base the collection on

classmethod from_analysis(analysis, selection=None)

Create a collection from a tree analysis.

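Parameters:
  • analysis (TreeAnalysis) – the tree analysis to use

  • selection (Optional[RouteSelectionArguments]) – selection criteria for the routes

Returns: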

the created collection

Return type:

RouteCollection

property dicts: Sequence[StrDict]

Returns a list of dictionary representations of the routes

property images: Sequence[PilImage | None]

Returns a list of pictorial representations of the routes

property jsons: Sequence[str]

Returns a list of JSON string representations of the routes

cluster(n_clusters, max_clusters=5, distances_model='ted', **kwargs)

Cluster the route collection into a number of clusters.

Additional arguments to the distance or clustering algorithm can be passed in as key-word arguments.

When distances_model is “lstm”, the key-word argument model_path must be given. When distances_model is “ted”, two optional key-word arguments, timeout and content, can be given.

If the number of reaction trees is less than 3, no clustering will be performed.

Parameters:
  • n_clusters (int) – the desired number of clusters; a value less than 2 triggers optimization of the number of clusters

  • max_clusters (int) – the maximum number of clusters to consider

  • distances_model (str) – can be ted or lstm and determines how the route distances are computed

  • kwargs (Any)

Returns:

the cluster labels

Return type:

np.ndarray
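The rules stated above can be sketched as a small decision helper. This is only an illustration of the documented behaviour: plan_clustering is a hypothetical name that does not exist in aizynthfinder, and the order in which the checks are made is an assumption.

```python
from typing import Any


def plan_clustering(
    n_trees: int,
    n_clusters: int,
    max_clusters: int = 5,
    distances_model: str = "ted",
    **kwargs: Any,
) -> str:
    """Hypothetical helper: describe what a cluster() call would do."""
    if distances_model not in ("ted", "lstm"):
        raise ValueError("distances_model must be 'ted' or 'lstm'")
    # The documentation states that "lstm" needs a model_path argument.
    if distances_model == "lstm" and "model_path" not in kwargs:
        raise ValueError("model_path is required when distances_model is 'lstm'")
    # With fewer than 3 reaction trees no clustering is performed.
    if n_trees < 3:
        return "skip"
    # A value below 2 triggers optimization, bounded by max_clusters.
    if n_clusters < 2:
        return f"optimize up to {max_clusters} clusters"
    return f"use {n_clusters} clusters"
```

For instance, plan_clustering(10, 1) describes the optimization case, whereas plan_clustering(2, 3) reports that clustering is skipped.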

combined_reaction_trees(recreate=False)

Return an object that combines all the reaction trees into a single reaction tree graph

Parameters:

recreate (bool) – if False will return a cached object if available, defaults to False

Returns:

the combined trees

Return type:

CombinedReactionTrees

compute_scores(*scorers)

Compute new scores for all routes in this collection. They can then be accessed with the all_scores attribute.

Parameters:

scorers (Scorer)

Return type:

None

dict_with_extra(include_scores=False, include_metadata=False)

Return the routes as dictionaries with optionally all scores and all metadata added to the root (target) node.

Returns:

the routes as dictionaries

Return type:

Sequence[StrDict]
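As a rough illustration of what adding scores and metadata to the root node could look like, the sketch below works on plain dictionaries. The function name add_extra and the keys "scores" and "metadata" are assumptions made for this example, not taken from the library.

```python
from typing import Any, Dict, Optional


def add_extra(
    route: Dict[str, Any],
    scores: Optional[Dict[str, float]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    include_scores: bool = False,
    include_metadata: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: attach optional extras to the root (target) node."""
    result = dict(route)  # shallow copy, the input dictionary is left untouched
    if include_scores and scores is not None:
        result["scores"] = dict(scores)  # assumed key name
    if include_metadata and metadata is not None:
        result["metadata"] = dict(metadata)  # assumed key name
    return result
```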

dict_with_scores()

Return the routes as dictionaries with all scores added to the root (target) node.

Returns:

the routes as dictionaries

Return type:

Sequence[StrDict]

distance_matrix(recreate=False, model='ted', **kwargs)

Compute the distance matrix between each pair of reaction trees

All key-word arguments are passed along to the route_distance_calculator function from the route_distances package.

When model is “lstm”, the key-word argument model_path must be given. When model is “ted”, two optional key-word arguments, timeout and content, can be given.

Parameters:
  • recreate (bool) – if False, use a cached one if available

  • model (str) – the type of model to use “ted” or “lstm”

  • kwargs (Any)

Returns:

the square distance matrix

Return type:

np.ndarray
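The effect of the recreate flag can be illustrated with a minimal caching sketch. DistanceCache is a hypothetical stand-in, the pairwise distance function is supplied by the caller, and nested lists are used instead of np.ndarray to keep the example free of dependencies.

```python
from typing import Callable, List, Optional, Sequence


class DistanceCache:
    """Hypothetical stand-in showing a cached square distance matrix."""

    def __init__(self, items: Sequence[str], distance: Callable[[str, str], float]):
        self._items = list(items)
        self._distance = distance
        self._matrix: Optional[List[List[float]]] = None
        self.n_computations = 0  # counts how often the matrix was rebuilt

    def distance_matrix(self, recreate: bool = False) -> List[List[float]]:
        # Recompute only when nothing is cached or when explicitly requested.
        if self._matrix is None or recreate:
            self.n_computations += 1
            self._matrix = [
                [self._distance(a, b) for b in self._items] for a in self._items
            ]
        return self._matrix
```

Calling distance_matrix() twice computes the matrix once, while distance_matrix(recreate=True) forces a new computation.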

make_dicts()

Convert all reaction trees to dictionaries

Return type:

Sequence[StrDict]

make_images()

Convert all reaction trees to images

Return type:

Sequence[Optional[PilImage]]

make_jsons()

Convert all reaction trees to JSON strings

Return type:

Sequence[str]

rescore(scorer)

Rescore the routes in the collection, and thereby re-order them.

This will replace the scores attribute, and update the all_scores attribute with another entry.

Parameters:

scorer (Scorer) – the scorer to use

Return type:

None
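A minimal sketch of the rescoring idea, using plain lists instead of the collection class. The name rescore_routes is hypothetical, and sorting from highest to lowest score is an assumption; the library may order routes differently depending on the scorer.

```python
from typing import Callable, Dict, List, Tuple


def rescore_routes(
    routes: List[str],
    all_scores: List[Dict[str, float]],
    scorer: Callable[[str], float],
    scorer_name: str,
) -> Tuple[List[str], List[float], List[Dict[str, float]]]:
    """Hypothetical helper: rescore routes and re-order them by the new score."""
    new_scores = [scorer(route) for route in routes]
    # Keep the earlier scores and add one more entry per route.
    updated = [
        dict(old, **{scorer_name: score}) for old, score in zip(all_scores, new_scores)
    ]
    # Assumption: a higher score is better, so sort in descending order.
    order = sorted(range(len(routes)), key=lambda idx: new_scores[idx], reverse=True)
    return (
        [routes[idx] for idx in order],
        [new_scores[idx] for idx in order],
        [updated[idx] for idx in order],
    )
```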

aizynthfinder.analysis.tree_analysis module

Module containing classes to perform analysis of the tree search results.

class aizynthfinder.analysis.tree_analysis.TreeAnalysis(search_tree, scorer=None)

Bases: object

Class that encapsulates various analyses that can be performed on a search tree.

Variables:
  • scorers – the objects used to score the nodes

  • search_tree – the search tree

Parameters:
  • search_tree – the search tree to analyse

  • scorer – the object used to score the nodes, optional

best()

Returns the route or MCTS-like node with the highest score. If several routes have the same score, it returns the first one.

Returns:

the top scoring node or route

Raises:

ValueError – if this is a multi-objective analysis, a single best solution cannot be returned

Return type:

_Solution

pareto_front()

Returns the routes or MCTS-like nodes that form the Pareto front in a multi-objective search.

Returns:

the pareto front solutions.

Raises:

ValueError – if this is a single-objective analysis, there is no Pareto-front

Return type:

Tuple[_Solution, …]
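To illustrate what a Pareto front is, the sketch below selects the non-dominated entries from a list of score dictionaries. It assumes that every objective is to be maximized and that all dictionaries share the same keys; the function names are hypothetical and this is not the library's implementation.

```python
from typing import Dict, List, Sequence


def dominates(first: Dict[str, float], second: Dict[str, float]) -> bool:
    """True if first is at least as good everywhere and strictly better somewhere."""
    return all(first[key] >= second[key] for key in first) and any(
        first[key] > second[key] for key in first
    )


def pareto_front_indices(scores: Sequence[Dict[str, float]]) -> List[int]:
    """Return the indices of the solutions that no other solution dominates."""
    return [
        idx
        for idx, candidate in enumerate(scores)
        if not any(
            dominates(other, candidate)
            for other_idx, other in enumerate(scores)
            if other_idx != idx
        )
    ]
```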

sort(selection=None)

Sort and select the nodes or routes in the search tree.

The score for each solution is a dictionary of scores, one for each of the objectives.

Parameters:

selection (Optional[RouteSelectionArguments]) – selection criteria for the routes

Returns:

the sorted and selected items, and the scores of the sorted items

Return type:

Tuple[_AnyListOfSolutions, Sequence[Dict[str, float]]]

tree_statistics()

Returns statistics of the tree

Currently it returns the number of nodes, the maximum number of transforms, the maximum number of children, the top score, whether the top-scored route is solved, the number of molecules in the top-scored node, and information on precursors.

Returns:

the statistics

Return type:

StrDict

aizynthfinder.analysis.utils module

Helper routines and classes for the aizynthfinder.analysis package. To avoid clutter in that package, larger utility algorithms are placed herein.

class aizynthfinder.analysis.utils.RouteSelectionArguments(nmin=5, nmax=25, return_all=False)

Bases: object

Selection arguments for the tree analysis class

If return_all is False, at least nmin routes are returned; routes that have the same score are returned as well, up to nmax routes.

If return_all is True, all solved routes are returned if at least one route is solved; otherwise nmin and nmax are used.
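The selection rules can be sketched as follows. This is one possible reading of the description, not the library's code: SelectionArgs and select_indices are hypothetical names, the scores are assumed to be sorted best first, and ties are detected by exact equality.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SelectionArgs:
    """Hypothetical mirror of the documented selection arguments."""

    nmin: int = 5
    nmax: int = 25
    return_all: bool = False


def select_indices(
    scores: Sequence[float], solved: Sequence[bool], args: SelectionArgs
) -> List[int]:
    """Pick route indices from scores that are already sorted best first."""
    if args.return_all and any(solved):
        # Return every solved route, regardless of nmin and nmax.
        return [idx for idx, is_solved in enumerate(solved) if is_solved]
    n_select = min(args.nmin, len(scores))
    limit = min(args.nmax, len(scores))
    # Extend the selection while the next route ties with the last selected one.
    while 0 < n_select < limit and scores[n_select] == scores[n_select - 1]:
        n_select += 1
    return list(range(n_select))
```

With scores [0.9, 0.8, 0.8, 0.8, 0.5], nmin=2 and nmax=10, the two tied routes after the second one are included, giving four routes in total.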

Parameters:
  • nmin (int)

  • nmax (int)

  • return_all (bool)

nmin: int = 5

nmax: int = 25

return_all: bool = False

class aizynthfinder.analysis.utils.CombinedReactionTrees(reaction_trees)

Bases: object

Encapsulation of an algorithm that combines several reaction trees into a larger bipartite graph with all reactions and molecules.

The reactions at a specific level of the reaction trees are grouped based on the reaction smiles.
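The grouping idea can be illustrated with a toy data structure. In this sketch each tree is assumed to be a list of levels, and each level a list of reaction SMILES strings; that layout and the name group_reactions are assumptions made for the example and do not reflect how ReactionTree objects are stored.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def group_reactions(
    trees: Sequence[Sequence[Sequence[str]]],
) -> Dict[Tuple[int, str], List[int]]:
    """Map (level, reaction SMILES) to the indices of the trees containing it."""
    groups: Dict[Tuple[int, str], List[int]] = defaultdict(list)
    for tree_index, levels in enumerate(trees):
        for level, reactions in enumerate(levels):
            for smiles in reactions:
                # Reactions are merged only if both level and SMILES agree.
                groups[(level, smiles)].append(tree_index)
    return dict(groups)
```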

Parameters:

reaction_trees (Sequence[ReactionTree]) – the list of reaction trees to combine

to_dict()

Returns the graph as a dictionary in a pre-defined format.

Returns:

the combined reaction trees

Return type:

StrDict

to_visjs_page(filename, in_stock_colors=None)

Create a visualization of the combined reaction tree using the vis.js network library.

The HTML page and all the images will be put into a tar-ball.

Parameters:
  • filename (str) – the name of the tarball

  • in_stock_colors (Optional[FrameColors]) – the colors around molecules, defaults to {True: “green”, False: “orange”}

Return type:

None
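Packing an HTML page and its images into a tarball can be done with the standard library, as in the sketch below. The function write_page_tarball, the file name route.html and the use of gzip compression are assumptions for illustration; the layout of the archive produced by the library may differ.

```python
import os
import tarfile
import tempfile
from typing import Dict


def write_page_tarball(filename: str, html: str, images: Dict[str, bytes]) -> None:
    """Hypothetical helper: store one HTML page and its images in a tarball."""
    with tempfile.TemporaryDirectory() as tmpdir:
        page_path = os.path.join(tmpdir, "route.html")
        with open(page_path, "w", encoding="utf-8") as fileobj:
            fileobj.write(html)
        with tarfile.open(filename, "w:gz") as tarobj:
            tarobj.add(page_path, arcname="route.html")
            for name, data in images.items():
                image_path = os.path.join(tmpdir, name)
                with open(image_path, "wb") as fileobj:
                    fileobj.write(data)
                # arcname keeps the temporary directory out of the archive paths
                tarobj.add(image_path, arcname=name)
```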

Module contents

Sub-package containing analysis routines