rxnutils.routes.retro_bleu package

Submodules

rxnutils.routes.retro_bleu.ngram_collection module

Contains routines for creating, reading, and writing n-gram collections

Can be run as a module to create a collection from a set of routes:

python -m rxnutils.routes.retro_bleu.ngram_collection --filename routes.json --output ngrams.json --nitems 2 --metadata template_hash

class rxnutils.routes.retro_bleu.ngram_collection.NgramCollection(nitems, metadata_key, ngrams)

Bases: object

Class to create, read, and write a collection of n-grams

Parameters:
  • nitems (int) – the length of each n-gram

  • metadata_key (str) – the key used to extract the n-grams from the reactions

  • ngrams (Set[Tuple[str, ...]]) – the extracted n-grams

nitems: int
metadata_key: str
ngrams: Set[Tuple[str, ...]]
classmethod from_file(filename)

Read an n-gram collection from a JSON file

Parameters:

filename (str) – the path to the file

Returns:

the n-gram collection

Return type:

NgramCollection

classmethod from_tree_collection(filename, nitems, metadata_key)

Make an n-gram collection by extracting n-grams from a collection of synthesis routes.

Parameters:
  • filename (str) – the path to a file with a list of synthesis routes

  • nitems (int) – the length of each n-gram

  • metadata_key (str) – the metadata to extract

Returns:

the n-gram collection

Return type:

NgramCollection
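
As a rough illustration of what extracting n-grams of length nitems means, the sketch below slides a window over a linear sequence of metadata values such as template hashes. It is a simplification under stated assumptions: real synthesis routes are trees, the hash values are made up, and the helper name extract_ngrams is hypothetical and not part of rxnutils.

```python
from typing import List, Set, Tuple


def extract_ngrams(values: List[str], nitems: int) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous n-grams of length `nitems` in `values`.

    Hypothetical helper for illustration only; it treats a route as a
    linear sequence of metadata values, which ignores branching.
    """
    if nitems < 1:
        raise ValueError("nitems must be a positive integer")
    # A sequence shorter than nitems yields an empty range, hence an empty set
    return {
        tuple(values[i : i + nitems])
        for i in range(len(values) - nitems + 1)
    }


# Made-up template hashes for a three-step linear route
hashes = ["h1", "h2", "h3"]
bigrams = extract_ngrams(hashes, nitems=2)
print(sorted(bigrams))
```

The result is a set of tuples of strings, which matches the documented type of the ngrams attribute, Set[Tuple[str, ...]].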

save_to_file(filename)

Save an n-gram collection to a JSON file

Parameters:

filename (str) – the path to the file

Return type:

None
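
JSON has no set or tuple type, so a collection typed as Set[Tuple[str, ...]] has to be converted when it is written and restored when it is read. The sketch below shows one way to do that round trip with the standard library. The file layout and key names are assumptions chosen for illustration and may differ from what save_to_file actually writes.

```python
import json
import os
import tempfile
from typing import Set, Tuple

# Example data; the key names used in the payload are assumptions
nitems = 2
metadata_key = "template_hash"
ngrams: Set[Tuple[str, ...]] = {("h1", "h2"), ("h2", "h3")}

# Store the n-grams as a sorted list of lists so the file is deterministic
payload = {
    "nitems": nitems,
    "metadata_key": metadata_key,
    "ngrams": sorted(list(ngram) for ngram in ngrams),
}

path = os.path.join(tempfile.mkdtemp(), "ngrams.json")
with open(path, "w") as fileobj:
    json.dump(payload, fileobj)

# Reading back: convert the inner lists to tuples and rebuild the set
with open(path, "r") as fileobj:
    loaded = json.load(fileobj)
restored = {tuple(ngram) for ngram in loaded["ngrams"]}

print(restored == ngrams)
```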

rxnutils.routes.retro_bleu.scoring module

Contains routines to score routes according to the Retro-BLEU paper

rxnutils.routes.retro_bleu.scoring.ngram_overlap_score(route, ref)

Calculate the fractional overlap between the n-grams in the given route and the reference n-gram collection

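Parameters:
  • route (SynthesisRoute) – the route to score

  • ref (NgramCollection) – the reference n-gram collection

Returns: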

the calculated score

Return type:

float
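
A minimal sketch of a fractional n-gram overlap, assuming the score is the share of a route's distinct n-grams that also occur in the reference set. The helper name fractional_overlap, the treatment of duplicates, and the NaN returned for a route without n-grams are all assumptions for illustration; the library function may handle these cases differently.

```python
from typing import List, Set, Tuple

Ngram = Tuple[str, ...]


def fractional_overlap(route_ngrams: List[Ngram], ref_ngrams: Set[Ngram]) -> float:
    """Fraction of the route's distinct n-grams that occur in the reference.

    Hypothetical helper, not the rxnutils implementation.
    """
    unique = set(route_ngrams)
    if not unique:
        # Assumption: a route with no n-grams has an undefined score
        return float("nan")
    return len(unique & ref_ngrams) / len(unique)


# Made-up n-grams: one of the two route bigrams is in the reference
reference = {("h1", "h2"), ("h2", "h3")}
route = [("h1", "h2"), ("h2", "h9")]
score = fractional_overlap(route, reference)
print(score)
```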

rxnutils.routes.retro_bleu.scoring.retro_bleu_score(route, ref, ideal_steps=3)

Calculate the Retro-BLEU score according to the paper:

Li, Junren, Lei Fang, and Jian-Guang Lou. "Retro-BLEU: quantifying chemical plausibility of retrosynthesis routes through reaction template sequence analysis." Digital Discovery 3, no. 3 (2024): 482–90. https://doi.org/10.1039/D3DD00219E.

Parameters:
  • route (SynthesisRoute) – the route to score

  • ref (NgramCollection) – the reference n-gram collection

  • ideal_steps (int) – a length-penalization hyperparameter (see Eq. 2 in the paper)

Returns:

the calculated score

Return type:

float

Module contents