
KNIME

Vernalis uses a range of commercial and open-source software.  We also contribute open-source software to the wider community.  In addition to our KNIME community contribution, our developers have contributed to a number of open-source projects, including the RDKit, DataWarrior, FXMolViewer, OpenBabel and BioJava projects.

We have been KNIME Trusted Community contributors since June 2013.  The ‘trusted’ status provides guarantees about testing, security, maintenance and backwards compatibility to ensure our nodes are suitable for use in a production environment.  A more detailed description of our nodes can be found at the Vernalis Community Contribution Homepage, on the KNIME Hub, and on the NodePit website.

We have recently published a historical overview of the contribution, covering its development up to December 2017 (Roughley, S.D., Curr. Med. Chem. 2020).

Broadly speaking, the nodes fall into three categories:

General utility nodes – nodes of use to the wider KNIME community, beyond the cheminformatics arena. The nodes include many flow control nodes (switches, loop start / end nodes), fingerprint manipulation nodes, collection modifying nodes, database nodes, and a variety of testing and benchmarking nodes.

Data access nodes – our original node was the PDB Connector, developed with Enspiral Discovery, which allows advanced querying of the RCSB PDB. Following changes to the RCSB RESTful webservices in 2020, we completely re-wrote this family of nodes and have added a variety of other nodes to query or download data from public sources.  We also provide a variety of nodes for loading text-based files into KNIME, in plain-text, XML and chemical formats.

Cheminformatics and Structural Informatics nodes – nodes which allow extracting or processing data from PDB files, sequence data, and SMILES data (via the ‘Speedy SMILES’ nodes, which are designed for fast pre-processing of large SMILES-based datasets without internal or explicit conversion to a chemical toolkit format). We also provide nodes to perform Matched Molecular Pair Analysis (MMPA), and to calculate a range of molecular properties. Many of our nodes use the RDKit toolkit. More recently, we have also added a small number of bioinformatics nodes.
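To illustrate what processing SMILES data without conversion to a chemical toolkit format can look like, here is a minimal Python sketch that treats SMILES purely as text. It is not the implementation used by the ‘Speedy SMILES’ nodes; the function names and the regular expression are assumptions made for this example, and the atom count is a rough approximation covering only bracket atoms and the organic subset.

```python
import re

# Hypothetical, string-level atom pattern (an assumption for this sketch):
# bracket atoms first, then two-letter halogens, then single-letter
# organic-subset atoms (upper case aliphatic, lower case aromatic).
_ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")

# Bracket atoms that are explicit hydrogens, e.g. [H], [2H], [H+]
_EXPLICIT_H = re.compile(r"\[\d*H[+-]?\d*\]")


def split_components(smiles: str) -> list[str]:
    """Split a SMILES string on '.' into its dot-separated parts.

    Caveat: ring-closure digits shared across a dot (e.g. 'C1.C1') bond the
    parts together, so this plain split is only an approximation.
    """
    return [part for part in smiles.split(".") if part]


def count_heavy_atoms(smiles: str) -> int:
    """Approximate the heavy atom count by scanning the text for atom tokens."""
    count = 0
    for token in _ATOM.findall(smiles):
        if _EXPLICIT_H.fullmatch(token):
            continue  # explicit hydrogens are not heavy atoms
        count += 1
    return count


def largest_component(smiles: str) -> str:
    """Return the dot-separated part with the most heavy atoms ('' if empty)."""
    parts = split_components(smiles)
    return max(parts, key=count_heavy_atoms) if parts else ""
```

For a salt such as `CC(=O)[O-].[Na+]`, `largest_component` keeps the acetate part and discards the counter-ion without building a molecule object. The trade-off is that nothing is validated chemically, so malformed SMILES pass through unnoticed.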

Some of our nodes have subsequently been donated to KNIME and incorporated into the KNIME core product.