The Chemistry Development Kit (CDK) is a collection of modular Java libraries for processing chemical information (cheminformatics). The modules are free and open-source and are easy to integrate with other open-source or in-house projects.

21+ years of development



> 115 Contributors

Functionality is provided for many areas in cheminformatics including:
  • Molecule and reaction valence bond representation.
  • Read and write file formats: SMILES, SDF, InChI, Mol2, CML, and others.
  • Efficient molecule processing algorithms: Ring Finding, Kekulisation, Aromaticity.
  • Coordinate generation and rendering.
  • Canonical identifiers for fast exact searching.
  • Substructure and SMARTS pattern searching.
  • ECFP, Daylight, MACCS, and other fingerprint methods for similarity searching.
  • QSAR descriptor calculations, and much more.
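
As an illustration of the substructure and SMARTS searching listed above, here is a minimal sketch. It assumes a recent CDK 2.x release (such as the 2.9 bundle) on the classpath, where SmartsPattern lives in the org.openscience.cdk.smarts package. The class name SubstructureSketch and the helper containsPattern are hypothetical names chosen for this example, not part of the CDK.

```java
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IChemObjectBuilder;
import org.openscience.cdk.silent.SilentChemObjectBuilder;
import org.openscience.cdk.smarts.SmartsPattern;
import org.openscience.cdk.smiles.SmilesParser;

// Hypothetical example class; assumes CDK 2.x (e.g. cdk-bundle 2.9) is available.
public class SubstructureSketch {

    /** Returns true if the molecule given as SMILES contains the SMARTS pattern. */
    static boolean containsPattern(String smiles, String smarts) throws Exception {
        IChemObjectBuilder builder = SilentChemObjectBuilder.getInstance();
        // Parse the SMILES string into an in-memory molecule (hydrogens stay implicit).
        IAtomContainer molecule = new SmilesParser(builder).parseSmiles(smiles);
        // Compile the SMARTS query and test whether it occurs anywhere in the molecule.
        return SmartsPattern.create(smarts, builder).matches(molecule);
    }

    public static void main(String[] args) throws Exception {
        String hydroxyl = "[OX2H]"; // oxygen with two connections, one of them a hydrogen
        System.out.println("ethanol has hydroxyl: " + containsPattern("CCO", hydroxyl));
        System.out.println("dimethyl ether has hydroxyl: " + containsPattern("COC", hydroxyl));
    }
}
```

When the same query is run against many molecules, it is probably worth creating the SmartsPattern once and reusing it rather than recompiling it per molecule as this sketch does.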

Get started with CDK

Code snippets and other resources to get you started

The following resources can be used to learn the CDK API. The API documentation describes the classes and methods in detail, while the other resources show code snippets. The mailing list is also a good source of answers.


 
 
 

  Mailing list archives - cdk-user, cdk-devel

Download


You can download the latest release JAR with all dependencies included from GitHub.

The easiest way to integrate the CDK with a project and keep up to date with the latest features is by using the Maven build system. The CDK modules and their dependencies are automatically fetched from the central repository during compilation of your code.


              
<dependency>
  <groupId>org.openscience.cdk</groupId>
  <artifactId>cdk-bundle</artifactId>
  <version>2.9</version>
</dependency>
            

To include all library modules in your project, add cdk-bundle to your pom.xml. Once you are familiar with the library, it's good practice to include only the modules your project needs (e.g. cdk-smiles).
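
Once the dependency resolves, a small program can confirm that the library is usable. The following is a sketch rather than official starter code: it assumes cdk-bundle 2.9 (or at least the SMILES-related modules) is on the classpath, and the class name CdkQuickStart and its helper methods are hypothetical names introduced for this example.

```java
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.silent.SilentChemObjectBuilder;
import org.openscience.cdk.smiles.SmiFlavor;
import org.openscience.cdk.smiles.SmilesGenerator;
import org.openscience.cdk.smiles.SmilesParser;

// Hypothetical example class; assumes the CDK is on the classpath via Maven.
public class CdkQuickStart {

    /** Parses a SMILES string into a CDK molecule. */
    static IAtomContainer parse(String smiles) throws Exception {
        SmilesParser parser = new SmilesParser(SilentChemObjectBuilder.getInstance());
        return parser.parseSmiles(smiles);
    }

    /** Writes the molecule back out as a canonical SMILES string. */
    static String canonicalSmiles(String smiles) throws Exception {
        SmilesGenerator generator = new SmilesGenerator(SmiFlavor.Canonical);
        return generator.create(parse(smiles));
    }

    public static void main(String[] args) throws Exception {
        // Ethanol written in two different atom orders.
        String a = canonicalSmiles("OCC");
        String b = canonicalSmiles("CCO");
        System.out.println("explicit atoms in CCO: " + parse("CCO").getAtomCount());
        System.out.println("canonical forms equal: " + a.equals(b));
    }
}
```

Because both inputs describe the same molecule, the canonical strings should be identical, which is the property that makes canonical identifiers usable as keys for exact searching.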

Get involved

We welcome contributions and feedback however big or small.


 Keep updated

There are plenty of ways to stay updated with the CDK project. You can follow us on GitHub or Twitter.

Beyond social media, the best place to ask for help is the cdk-user mailing list. You can often find updates on new project functionality and usage tips in the developers' blog entries.

 

 Issues

If you find an issue when using the CDK or would like to request a new feature, please report it via GitHub.

 

 Build

CDK is built with Maven, so you'll need to download and install the Maven build tool, mvn. Once Maven is installed, the whole project can be compiled, tested, and installed with the command mvn install.

$> mvn install

If you just want to use the very latest version, pre-release builds are available from the OSSRH snapshot repository.

 

 Patch

To submit patches please create a pull request via GitHub.

To keep things organised, please use a separate topic branch for each pull request. This keeps unrelated changes out of the patch under review.

Publications


  • Willighagen et al. The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J. Cheminform. 2017; 9(3), doi:10.1186/s13321-017-0220-4
  • May and Steinbeck. Efficient ring perception for the Chemistry Development Kit. J. Cheminform. 2014, doi:10.1186/1758-2946-6-3
  • Steinbeck et al. Recent Developments of the Chemistry Development Kit (CDK) - An Open-Source Java Library for Chemo- and Bioinformatics. Curr. Pharm. Des. 2006; 12(17):2111-2120, doi:10.2174/138161206777585274 (free green Open Access version)
  • Steinbeck et al. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics. J. Chem. Inf. Comput. Sci. 2003 Mar-Apr; 43(2):493-500, doi:10.1021/ci025584y

If you wish to cite the use of CDK, please reference the above journal articles. Use of a specific version (since 1.5.7) can also be cited using the DOI provided by Zenodo.

Other projects

The CDK functionality underpins many exciting open and commercial projects.
Here are some of our favourites:


Applications
Easy to use programs that solve a particular task

  • PaDEL - UI for QSAR descriptor calculations
  • ChemViz2 - Cytoscape plugin, network visualisation
  • Scaffold Hunter - Visual analysis of data sets
  • LICSS (Excel CDK) - Integration in MS Excel for Windows
  • JChemPaint - Swing-based chemical diagram editor (not maintained)
  • MAYGEN - A chemical structure generator for constitutional isomers based on the orderly generation principle (not maintained)
 

Toolboxes
Extend or enhance the CDK functionality

 License

GNU Lesser General Public License, version 2.1 (or later).

The LGPL is compatible with other major open-source licenses. Since Java libraries are dynamically linked, there is no restriction on using CDK in proprietary software (see the FSF's LGPL and Java). Keep in mind that libraries that are part of the CDK are subject to their own licenses.