Class SmilesParser

  • public final class SmilesParser
    extends Object
    Read molecules and reactions from a SMILES [SMILES Tutorial] string. Example usage:
     try {
         SmilesParser   sp  = new SmilesParser(SilentChemObjectBuilder.getInstance());
         IAtomContainer m   = sp.parseSmiles("c1ccccc1");
     } catch (InvalidSmilesException e) {
         // the input was not a valid SMILES string
         System.err.println(e.getMessage());
     }
    Reading Aromatic SMILES

    Aromatic SMILES are automatically kekulised, producing a structure with assigned bond orders. The aromatic specification on the atoms is maintained from the SMILES even if the structures are not considered aromatic. For example, 'c1ccc1' will correctly have two pi bonds assigned, but the atoms and bonds will still be flagged as aromatic. Recomputing or clearing the aromaticity will remove these erroneous flags. If a Kekulé structure cannot be assigned, this is considered an error. The most common cause is the omission of hydrogens on aromatic nitrogens (aromatic pyrrole is specified as '[nH]1cccc1', not 'n1cccc1'). These structures cannot be corrected without modifying their formula. If there are multiple locations where a hydrogen could be placed, the returned structure would differ depending on the atom input order. If you wish to skip the kekulisation (not recommended), it can be disabled with kekulise(false). SMILES can be checked for validity with the DEPICT service.
    Unsupported Features

    The following features are not supported by this parser.

    • variable order of bracket atom attributes: '[C-H]' and '[CH@]' are considered invalid. The order required by this parser follows the OpenSMILES specification: 'isotope', 'symbol', 'chiral', 'hydrogens', 'charge', 'atom class'
    • atom class indication - this information is loaded but not annotated on the structure
    • extended tetrahedral stereochemistry (cumulated double bonds)
    • trigonal bipyramidal stereochemistry
    • octahedral stereochemistry
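The fixed attribute order for bracket atoms can be illustrated with a regular expression. The following is a minimal sketch, not the parser's actual implementation: the class name BracketAtomOrder and the method hasExpectedOrder are hypothetical, the element symbol is not checked against the periodic table, and only the '@' and '@@' chiral marks are recognised.

```java
import java.util.regex.Pattern;

/** Illustrative check of the OpenSMILES bracket atom attribute order (not CDK code). */
public class BracketAtomOrder {

    // Attributes must appear in this order, each one optional except the symbol:
    // isotope, symbol, chiral, hydrogens, charge, atom class.
    private static final Pattern BRACKET_ATOM = Pattern.compile(
            "\\["
            + "(\\d+)?"                           // isotope, e.g. 13
            + "(se|as|[bcnops]|[A-Z][a-z]?|\\*)"  // symbol (aromatic, aliphatic or wildcard)
            + "(@@?)?"                            // chiral mark (simplified to @ and @@)
            + "(H\\d?)?"                          // hydrogen count, e.g. H or H4
            + "(\\+\\+|--|[+-]\\d*)?"             // charge, e.g. -, ++, +2
            + "(:\\d+)?"                          // atom class, e.g. :5
            + "\\]");

    /** Returns true if the bracket atom lists its attributes in the expected order. */
    public static boolean hasExpectedOrder(String bracketAtom) {
        return BRACKET_ATOM.matcher(bracketAtom).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasExpectedOrder("[C@H]"));  // true
        System.out.println(hasExpectedOrder("[cH:5]")); // true
        System.out.println(hasExpectedOrder("[C-H]"));  // false: charge before hydrogens
        System.out.println(hasExpectedOrder("[CH@]"));  // false: hydrogens before chiral
    }
}
```

A pre-check of this kind could be used to report reordered attributes before handing a string to the parser, although the parser itself remains the authority on what it accepts.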
    Atom Class

    The atom class is stored as the CDKConstants.ATOM_ATOM_MAPPING property.

     SmilesParser   sp  = new SmilesParser(SilentChemObjectBuilder.getInstance());
     IAtomContainer m   = sp.parseSmiles("c1[cH:5]cccc1");
     Integer        c1  = m.getAtom(1)
                           .getProperty(CDKConstants.ATOM_ATOM_MAPPING); // 5
     Integer        c2  = m.getAtom(2)
                           .getProperty(CDKConstants.ATOM_ATOM_MAPPING); // null
    Author: Christoph Steinbeck, Egon Willighagen, John May
    Keywords: SMILES, parser
    • Constructor Detail

      • SmilesParser

        public SmilesParser​(IChemObjectBuilder builder)
        Create a new SMILES parser which will create IAtomContainers with the specified builder.
        builder - used to create the CDK domain objects
    • Method Detail

      • setStrict

        public void setStrict​(boolean strict)
        Sets whether the parser is in strict mode. In non-strict mode (default) recoverable issues with SMILES are reported as warnings.
        strict - strict mode true/false.
      • parseReactionSetSmiles

        public IReactionSet parseReactionSetSmiles​(String smiles)
                                            throws InvalidSmilesException
        Parse a SMILES that describes a set of reactions representing multiple synthesis steps or a metabolic pathway. This is a logical extension to the SMILES reaction syntax: the product(s) of each step become the reactants of the next step. For example, an input of the form
         {reactant}>{agent_1}>{product_1}>{agent_2}>{product_2}
        results in a reaction set with two reactions:
         {reactant}>{agent_1}>{product_1} step 1
         {product_1}>{agent_2}>{product_2} step 2
        smiles - the SMILES input string
        Returns: the reaction set
        Throws: InvalidSmilesException - the input was invalid (with reason)
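The way a multi-step input decomposes into individual reactions can be sketched with plain string handling. This is an illustration only, under the assumption that the steps are chained as reactant>agent>product>agent>product; the class ReactionStepsSketch and its splitSteps method are hypothetical and do not show how the parser works internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative decomposition of a chained reaction SMILES into single steps (not CDK code). */
public class ReactionStepsSketch {

    /**
     * Splits a chained reaction SMILES so that the product of each step
     * is reused as the reactant of the following step.
     */
    public static List<String> splitSteps(String smiles) {
        // limit -1 keeps empty fields, so an empty agent part ('>>') is preserved
        String[] parts = smiles.split(">", -1);
        if (parts.length < 3 || parts.length % 2 == 0) {
            throw new IllegalArgumentException("Expected an odd number (>= 3) of '>' separated parts");
        }
        List<String> steps = new ArrayList<>();
        for (int i = 0; i + 2 < parts.length; i += 2) {
            steps.add(parts[i] + ">" + parts[i + 1] + ">" + parts[i + 2]);
        }
        return steps;
    }

    public static void main(String[] args) {
        // ethanol -> acetaldehyde -> acetic acid, with no agents
        System.out.println(splitSteps("CCO>>CC=O>>CC(=O)O"));
        // [CCO>>CC=O, CC=O>>CC(=O)O]
    }
}
```

Each returned string is an ordinary single-step reaction SMILES, which mirrors the "step 1" and "step 2" reactions described above.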
      • setPreservingAromaticity

        public void setPreservingAromaticity​(boolean preservingAromaticity)
        Makes the SMILES parser set aromaticity as provided in the SMILES itself, without detecting it. Default false. Atoms will not be typed when set to true.
        preservingAromaticity - boolean to indicate if aromaticity is to be preserved.
      • isPreservingAromaticity

        public boolean isPreservingAromaticity()
        Gets the (default false) setting to preserve aromaticity as provided in the SMILES itself.
        Returns: true if aromaticity is preserved, false otherwise.
      • kekulise

        public void kekulise​(boolean kekulise)
        Indicates whether structures should be automatically kekulised if they are provided as aromatic. Kekulisation is on by default but can be turned off if it is believed the structures can be handled without assigned bond orders (not recommended).
        kekulise - should structures be kekulised
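Why some aromatic SMILES cannot be kekulised can be seen by treating kekulisation as a matching problem: every aromatic atom that needs a double bond must be paired with exactly one neighbour that also needs one. The sketch below is a toy backtracking search written for illustration and is not the algorithm used by this parser. The class KekuleSketch, its methods, and the hand-supplied needsPi flags are all hypothetical.

```java
import java.util.Arrays;

/** Toy illustration of kekulisation as a perfect-matching problem (not CDK code). */
public class KekuleSketch {

    /**
     * @param adjacency adjacency[i] lists the neighbours of atom i
     * @param needsPi   needsPi[i] is true if atom i must receive exactly one double bond
     * @return true if every flagged atom can be paired with a flagged neighbour
     */
    public static boolean canKekulise(int[][] adjacency, boolean[] needsPi) {
        return match(adjacency, needsPi, new boolean[needsPi.length]);
    }

    private static boolean match(int[][] adjacency, boolean[] needsPi, boolean[] matched) {
        // find the first atom that still needs a double bond
        int atom = -1;
        for (int i = 0; i < needsPi.length; i++) {
            if (needsPi[i] && !matched[i]) {
                atom = i;
                break;
            }
        }
        if (atom < 0) return true; // every flagged atom has a partner

        matched[atom] = true;
        for (int nbr : adjacency[atom]) {
            if (needsPi[nbr] && !matched[nbr]) {
                matched[nbr] = true; // tentatively place a double bond atom=nbr
                if (match(adjacency, needsPi, matched)) return true;
                matched[nbr] = false; // undo and try the next neighbour
            }
        }
        matched[atom] = false;
        return false;
    }

    /** Builds the adjacency lists of a simple ring with the given number of atoms (3 or more). */
    public static int[][] ring(int size) {
        int[][] adjacency = new int[size][];
        for (int i = 0; i < size; i++) {
            adjacency[i] = new int[] {(i + size - 1) % size, (i + 1) % size};
        }
        return adjacency;
    }

    public static void main(String[] args) {
        boolean[] benzene = new boolean[6];
        Arrays.fill(benzene, true);
        System.out.println(canKekulise(ring(6), benzene)); // true, as for 'c1ccccc1'

        boolean[] bareN = new boolean[5];
        Arrays.fill(bareN, true);
        System.out.println(canKekulise(ring(5), bareN)); // false, as for 'n1cccc1'

        boolean[] pyrrole = {false, true, true, true, true}; // atom 0 is [nH]
        System.out.println(canKekulise(ring(5), pyrrole)); // true, as for '[nH]1cccc1'
    }
}
```

In the five-membered ring with no hydrogen on the nitrogen, an odd number of atoms all compete for double bonds, so one is always left unpaired. Marking the nitrogen as '[nH]' removes it from the pairing and the remaining four carbons can be matched.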