Command-line interface

This tool runs a tree search on a batch of molecules.

In its simplest form, you type:

aizynthcli --config config_local.yml --smiles smiles.txt

where config_local.yml contains configuration such as paths to policy models and stocks (see the configuration file documentation) and smiles.txt is a plain text file with one SMILES per line.

To find out what other arguments are available, use the -h flag:
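As a minimal sketch of preparing the input, the snippet below writes a list of SMILES to a text file, one per line. The helper name `write_smiles_file` and the example molecules are illustrative assumptions and are not part of aizynthfinder.

```python
from pathlib import Path


def write_smiles_file(smiles_list, path="smiles.txt"):
    """Write one SMILES per line, skipping blank entries (hypothetical helper)."""
    cleaned = [smi.strip() for smi in smiles_list if smi.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)
```

Calling `write_smiles_file(["CCO", "c1ccccc1C(=O)O"])` would produce a two-line smiles.txt that can be passed to the --smiles argument.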

aizynthcli -h

That gives something like this:

usage: aizynthcli [-h] --smiles SMILES --config CONFIG
                  [--policy POLICY [POLICY ...]]
                  [--filter FILTER [FILTER ...]]
                  [--stocks STOCKS [STOCKS ...]] [--output OUTPUT]
                  [--log_to_file] [--nproc NPROC] [--cluster]
                  [--post_processing POST_PROCESSING [POST_PROCESSING ...]]
                  [--pre_processing PRE_PROCESSING] [--checkpoint CHECKPOINT]

options:
  -h, --help            show this help message and exit
  --smiles SMILES       the target molecule smiles or the path of a file
                        containing the smiles
  --config CONFIG       the filename of a configuration file
  --policy POLICY [POLICY ...]
                        the name of the expansion policy to use
  --filter FILTER [FILTER ...]
                        the name of the filter to use
  --stocks STOCKS [STOCKS ...]
                        the name of the stocks to use
  --output OUTPUT       the name of the output file (JSON or HDF5 file)
  --log_to_file         if provided, detailed logging to file is enabled
  --nproc NPROC         if given, the input is split over a number of
                        processes
  --cluster             if provided, perform automatic clustering
  --post_processing POST_PROCESSING [POST_PROCESSING ...]
                        a number of modules that performs post-processing
                        tasks
  --pre_processing PRE_PROCESSING
                        a module that perform pre-processing tasks
  --checkpoint CHECKPOINT
                        the path to the checkpoint file

By default:

All stocks are selected if no stock is specified.

The first expansion policy is selected if no expansion policy is specified.

All filter policies are selected if none is specified on the command line.
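The defaults above can be overridden with the --policy, --filter and --stocks flags from the help text. The following sketch assembles such a command line from Python. The function `build_aizynthcli_command` is a hypothetical helper, and the policy and stock names "uspto" and "zinc" are placeholders for whatever keys your configuration file defines.

```python
import shlex


def build_aizynthcli_command(config, smiles, policy=None, filters=None,
                             stocks=None, output=None, nproc=None):
    """Assemble an aizynthcli argument list using flags shown in the help text."""
    cmd = ["aizynthcli", "--config", config, "--smiles", smiles]
    if policy:
        cmd += ["--policy", *policy]
    if filters:
        cmd += ["--filter", *filters]
    if stocks:
        cmd += ["--stocks", *stocks]
    if output:
        cmd += ["--output", output]
    if nproc:
        cmd += ["--nproc", str(nproc)]
    return cmd


# "uspto" and "zinc" are assumed names; replace them with keys from your config file
cmd = build_aizynthcli_command(
    "config_local.yml", "smiles.txt", policy=["uspto"], stocks=["zinc"], nproc=4
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to launch the search. Omitting an optional argument leaves the corresponding default in effect.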

Analysing output

The result from the aizynthcli tool when supplying multiple SMILES is a JSON or HDF5 file that can be read as a pandas dataframe. It is called output.json.gz by default.

import pandas as pd
data = pd.read_json("output.json.gz", orient="table")

The dataframe contains statistics about the tree search and the top-ranked routes (as JSONs) for each target compound; see the specification of output below.

A checkpoint.json.gz file is also generated if a checkpoint file path is provided when calling the aizynthcli tool. Each line of the checkpoint file contains a processed SMILES with its corresponding results.
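The checkpoint file described above stores one processed target per line. A minimal sketch for reading it follows, assuming the file is gzip-compressed and each line is a standalone JSON document. That layout is inferred from the description and the .json.gz name rather than confirmed, and `read_checkpoint` is a hypothetical helper.

```python
import gzip
import json


def read_checkpoint(path="checkpoint.json.gz"):
    """Return one parsed record per non-empty line (assumes gzip-compressed JSON lines)."""
    records = []
    with gzip.open(path, "rt", encoding="utf-8") as fileobj:
        for line in fileobj:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

For instance, `len(read_checkpoint())` would give the number of targets processed so far, which can be useful when checking on a long-running batch.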

When a single SMILES is provided to the tool, the statistics will be written to the terminal, and the top-ranked routes to a JSON file (trees.json by default).
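The route file from a single-SMILES run can be inspected without any aizynthfinder imports. The sketch below assumes that trees.json holds a JSON list of nested route dictionaries, where each node has a "type" key ("mol" or "reaction") and an optional "children" list. This layout is an assumption and should be checked against your own output. Both function names are hypothetical.

```python
import json


def load_routes(path="trees.json"):
    """Load the routes, assumed to be stored as a JSON list of dictionaries."""
    with open(path, encoding="utf-8") as fileobj:
        return json.load(fileobj)


def count_reactions(node):
    """Count nodes of type "reaction" in a nested route dictionary (assumed layout)."""
    own = 1 if node.get("type") == "reaction" else 0
    return own + sum(count_reactions(child) for child in node.get("children", []))
```

With these helpers, `[count_reactions(route) for route in load_routes()]` would list the number of reaction steps in each extracted route.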

This is an example of how to create images of the top-ranked routes for the first target compound:

import pandas as pd
from aizynthfinder.reactiontree import ReactionTree

data = pd.read_json("output.json.gz", orient="table")
all_trees = data.trees.values  # This contains a list of all the trees for all the compounds
trees_for_first_target = all_trees[0]

for itree, tree in enumerate(trees_for_first_target):
    imagefile = f"route{itree:03d}.png"
    ReactionTree.from_dict(tree).to_image().save(imagefile)

The images will be called route000.png, route001.png etc.

Specification of output

The JSON or HDF5 file created when running the tool with a list of SMILES has the following columns:

Column	Description
target	The target SMILES
search_time	The total search time in seconds
first_solution_time	The time elapsed until the first solution was found
first_solution_iteration	The number of iterations completed until the first solution was found
number_of_nodes	The number of nodes in the search tree
max_transforms	The maximum number of transformations for all routes in the search tree
max_children	The maximum number of children for a search node
number_of_routes	The number of routes in the search tree
number_of_solved_routes	The number of solved routes in the search tree
top_score	The score of the top-scored route (defaults to MCTS reward)
is_solved	If the top-scored route is solved
number_of_steps	The number of reactions in the top-scored route
number_of_precursors	The number of starting materials
number_of_precursors_in_stock	The number of starting materials in stock
precursors_in_stock	Comma-separated list of SMILES of starting materials in stock
precursors_not_in_stock	Comma-separated list of SMILES of starting materials not in stock
precursors_availability	Semicolon-separated list of stock availability of the starting materials
policy_used_counts	Dictionary of the total number of times each expansion policy has been used
profiling	Profiling information from the search tree, including expansion model calls and reactant generation
stock_info	Dictionary of the stock availability for each starting material in all extracted routes
top_scores	Comma-separated list of the scores of the extracted routes (defaults to MCTS reward)
trees	A list of the extracted routes as dictionaries

If you run the tool with a single SMILES, all of this data is printed to the screen, except stock_info and trees.
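Given a batch output with the columns listed above, a short summary can be computed with pandas. The sketch below uses only the documented columns target, is_solved, search_time and precursors_not_in_stock. The helper names are hypothetical, and the handling of empty or missing precursor entries is an assumption.

```python
from collections import Counter

import pandas as pd


def summarise_batch(data):
    """Summarise a batch result table using columns from the output specification."""
    solved = data["is_solved"].astype(bool)
    return {
        "n_targets": len(data),
        "n_solved": int(solved.sum()),
        "fraction_solved": float(solved.mean()) if len(data) else 0.0,
        "mean_search_time": float(data["search_time"].mean()) if len(data) else 0.0,
        "unsolved_targets": data.loc[~solved, "target"].tolist(),
    }


def missing_precursor_counts(data):
    """Count how often each SMILES appears in precursors_not_in_stock."""
    counts = Counter()
    for entry in data["precursors_not_in_stock"]:
        # Non-string entries (e.g. missing values) are skipped; this is an assumption
        if isinstance(entry, str):
            counts.update(smi.strip() for smi in entry.split(",") if smi.strip())
    return counts
```

A typical call would be `summarise_batch(pd.read_json("output.json.gz", orient="table"))`. The precursor counts can point to starting materials whose absence from the stock blocks many targets.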
