Stocks
======

The stock files specified in the configuration file are loaded, and a set of InChI keys
is stored in memory for lookup. However, the tool supports other stock queries as well as a way
to fully customize the lookup.

Mongo database stock
--------------------

First, InChI keys can be looked up in a Mongo database. The Mongo client should
have a database and a collection containing documents with at least two fields: `inchi_key` and `source`.
The `inchi_key` field will be used for lookup and `source` specifies the source database of the compound.

By adding these lines to the configuration file, the Mongo database will be used:

.. code-block:: yaml

    stock:
        type: mongodb
        host: user@myurl.com
        database: database_name
        collection: compounds


If the host, database and collection options are omitted, they default to `localhost`,
`stock_db`, and `molecules`, respectively.
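
As a rough illustration of the document shape described above, the sketch below mimics the lookup
with plain Python dictionaries instead of a live Mongo collection. The helper ``find_sources`` and
the source names are hypothetical and not part of the tool; only the ``inchi_key`` and ``source``
field names come from the description above.

.. code-block:: python

    # Hypothetical documents, shaped like those expected in the collection
    documents = [
        {"inchi_key": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "source": "vendor_a"},  # ethanol
        {"inchi_key": "UHOVQNZJYSORNB-UHFFFAOYSA-N", "source": "vendor_b"},  # benzene
    ]


    def find_sources(inchi_key, docs):
        """Return the sources of all documents that match the given InChI key."""
        return [doc["source"] for doc in docs if doc["inchi_key"] == inchi_key]


    # A compound is considered to be in stock if at least one document matches
    ethanol_sources = find_sources("LFQSCWFLJHTTHZ-UHFFFAOYSA-N", documents)
    water_sources = find_sources("XLYOFNOQVPJJNP-UHFFFAOYSA-N", documents)
    print(bool(ethanol_sources), ethanol_sources)  # True ['vendor_a']
    print(bool(water_sources), water_sources)      # False []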

Stop criteria
-------------

The stock can be used to stop the tree search based on three criteria: a) minimum price, b) maximum amount and c) count of different elements in the molecule.
Note that the stock query class needs to support querying for price and amount for the stop criteria to work properly.

The stop criteria can be specified in the configuration file:

.. code-block:: yaml

    stock:
        stop_criteria:
            price: 10
            counts:
                C: 10


In the Jupyter GUI you can set the limit on the element occurrences, but currently not the price and amount limits.
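
To make the element count criterion more concrete, here is a minimal sketch in plain Python. It
assumes that each limit is an inclusive upper bound on the number of atoms of that element, and it
takes the atoms as a simple list of element symbols. The function ``within_count_limits`` is
hypothetical and is not the tool's own implementation.

.. code-block:: python

    from collections import Counter


    def within_count_limits(atom_symbols, limits):
        """Return True if no element occurs more often than its limit."""
        counts = Counter(atom_symbols)
        # Elements without a limit are never checked
        return all(counts[element] <= limit for element, limit in limits.items())


    limits = {"C": 10}  # mirrors the ``counts`` entry in the configuration above
    ethanol = ["C", "C", "O"]   # heavy atoms of ethanol
    undecane = ["C"] * 11       # heavy atoms of undecane
    print(within_count_limits(ethanol, limits))   # True
    print(within_count_limits(undecane, limits))  # False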

Custom stock
------------

Support for any type of lookup is provided. You just need to write a Python class that implements the ``__contains__`` method
and subclasses ``aizynthfinder.context.stock.queries.StockQueryMixin``. The ``__contains__`` method is used for lookup and should take a ``Molecule`` object as its only argument.
The ``StockQueryMixin`` mixin class provides a default interface for some methods that may not be possible to implement in all query classes.

This is an example:

.. code-block:: python

  from rdkit.Chem import Lipinski
  from aizynthfinder.context.stock.queries import StockQueryMixin


  class CriteriaStock(StockQueryMixin):
      # A molecule counts as being in stock if it has fewer than 10 heavy atoms
      def __contains__(self, mol):
          return Lipinski.HeavyAtomCount(mol.rd_mol) < 10


To use this stock with the ``aizynthcli`` tool, save it in a ``custom_stock.py`` module that is located in a directory known to
the Python interpreter, and add this line to the module:

.. code-block::

  stock = CriteriaStock()


The stock will then be used automatically in the tree search.

Alternatively, the custom query class can be used with the ``aizynthapp`` tool:


.. code-block:: python

  from aizynthfinder import AiZynthApp

  configfile = "config_local.yml"
  app = AiZynthApp(configfile, setup=False)
  # Load the custom stock class (CriteriaStock, defined above) under the key "criteria"
  app.finder.stock.load(CriteriaStock(), "criteria")
  app.setup()


Lastly, it is possible to specify a custom stock class in the configuration file if it is located in a module that
is known to the Python interpreter.

.. code-block:: yaml

    stock:
        type: aizynthfinder.contrib.stocks.CriteriaStock


This configuration can be used if `aizynthfinder.contrib.stocks` is an existing sub-package and module.
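
To show how the ``__contains__`` method drives the lookup, the following self-contained sketch
uses Python's ``in`` operator on a toy query class. Both ``FakeMolecule`` and ``InMemoryStock`` are
hypothetical stand-ins written for this illustration. A real query class would take the tool's
``Molecule`` objects and would also subclass ``StockQueryMixin``, as described above.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass
    class FakeMolecule:
        """Hypothetical stand-in for the tool's ``Molecule`` class."""

        inchi_key: str


    class InMemoryStock:
        """Toy query class that keeps a set of InChI keys in memory."""

        def __init__(self, inchi_keys):
            self._inchi_keys = set(inchi_keys)

        def __contains__(self, mol):
            # Called by Python whenever ``mol in stock`` is evaluated
            return mol.inchi_key in self._inchi_keys


    toy_stock = InMemoryStock(["LFQSCWFLJHTTHZ-UHFFFAOYSA-N"])  # ethanol
    ethanol = FakeMolecule("LFQSCWFLJHTTHZ-UHFFFAOYSA-N")
    benzene = FakeMolecule("UHOVQNZJYSORNB-UHFFFAOYSA-N")
    print(ethanol in toy_stock)  # True
    print(benzene in toy_stock)  # False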


Making stocks
-------------

We provide a tool to create InChI key-based stocks from SMILES strings. This makes it possible to
create a stock based on, for instance, a subset of the ZINC database.

The tool supports either creating a stock in HDF5 format or adding the compounds to an existing Mongo database.

The tool is easiest to use if one has a number of plain text files, in which each row has one SMILES.

Then one can use one of these two commands:


.. code-block:: bash

    smiles2stock --files file1.smi file2.smi --output stock.hdf5
    smiles2stock --files file1.smi file2.smi --output my_db --target mongo


to create either an HDF5 stock or a Mongo database stock, respectively. The ``file1.smi`` and ``file2.smi``
are simple text files and ``my_db`` is the source tag for the Mongo database.


If one has SMILES in any other format, one has to provide a custom module that extracts the SMILES from
the input files. This is an example of such a module that can be used with downloads from the ZINC database,
where the first row contains headers and the SMILES is the first element on each line.


.. code-block:: python

    def extract_smiles(filename):
        with open(filename, "r") as fileobj:
            for i, line in enumerate(fileobj):
                # Skip the header on the first row
                if i == 0:
                    continue
                # The SMILES is the first space-separated element
                yield line.strip().split(" ")[0]


If this is saved as ``load_zinc.py`` in a path that is known to the Python interpreter, it can be
used like this:

.. code-block:: bash

    export PYTHONPATH=`pwd`
    smiles2stock --files load_zinc file1.smi file2.smi --source module --output stock.hdf5


where the first line adds the current directory to the Python path (if you are using a Bash shell).
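
Before building a stock from a large download, it can be worth checking the extraction function on
a small file. The sketch below repeats ``extract_smiles`` so that it runs on its own and applies it
to a temporary file. The file content, including the header names and the ``ID_0001`` style
identifiers, is made up for illustration and only mimics the layout described above.

.. code-block:: python

    import os
    import tempfile


    def extract_smiles(filename):
        with open(filename, "r") as fileobj:
            for i, line in enumerate(fileobj):
                if i == 0:  # skip the header row
                    continue
                yield line.strip().split(" ")[0]


    # Made-up content: a header row, then one "SMILES identifier" pair per row
    content = "smiles zinc_id\nCCO ID_0001\nc1ccccc1 ID_0002\n"
    with tempfile.NamedTemporaryFile("w", suffix=".smi", delete=False) as tmp:
        tmp.write(content)

    smiles = list(extract_smiles(tmp.name))
    os.remove(tmp.name)  # clean up the temporary file
    print(smiles)  # ['CCO', 'c1ccccc1']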