Class SmilesGenerator

java.lang.Object
org.openscience.cdk.smiles.SmilesGenerator

public final class SmilesGenerator extends Object
SMILES [Weininger, David. Journal of Chemical Information and Computer Sciences. 1988. 28; Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29] provides a compact representation of chemical structures and reactions.
Different flavours of SMILES can be generated and are fully configurable. The standard flavours of SMILES defined by Daylight are:
  • Generic - non-canonical SMILES string, different atom ordering produces different SMILES. No isotope or stereochemistry encoded.
  • Unique - canonical SMILES string, different atom ordering produces the same* SMILES. No isotope or stereochemistry encoded.
  • Isomeric - non-canonical SMILES string, different atom ordering produces different SMILES. Isotope and stereochemistry are encoded.
  • Absolute - canonical SMILES string, different atom ordering produces the same SMILES. Isotope and stereochemistry are encoded.
To output a given flavour the flags in SmiFlavor are used:
 SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Isomeric);
 
SmiFlavor provides more fine-grained control; for example, the following is equivalent to SmiFlavor.Isomeric:
 SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Stereo |
                                              SmiFlavor.AtomicMass);
 
Bitwise logic can be used to remove options: SmiFlavor.Isomeric ^ SmiFlavor.AtomicMass will generate isomeric SMILES without atomic mass. A generator instance is created using the constructor or one of the static methods; the SMILES are then created by invoking create(IAtomContainer).
 IAtomContainer  ethanol = ...;
 SmilesGenerator sg      = new SmilesGenerator(SmiFlavor.Generic);
 String          smi     = sg.create(ethanol); // CCO, C(C)O, C(O)C, or OCC

 // reuse the variables declared above with a different flavour
 sg  = new SmilesGenerator(SmiFlavor.Unique);
 smi = sg.create(ethanol); // only CCO
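
The flag arithmetic described above can be illustrated without the toolkit on the classpath. The following is a minimal sketch only: the class FlavourFlags and its constant values are hypothetical stand-ins and are not the values defined in SmiFlavor. It shows that XOR removes an option only when that option is already set, whereas masking with AND NOT clears it in either case.

```java
/**
 * Illustrative bit-flag arithmetic. The class name and the constant values
 * are HYPOTHETICAL stand-ins, not the real values defined in SmiFlavor.
 */
final class FlavourFlags {
    public static final int STEREO      = 0x1;                  // hypothetical bit
    public static final int ATOMIC_MASS = 0x2;                  // hypothetical bit
    public static final int ISOMERIC    = STEREO | ATOMIC_MASS; // aggregate of both

    /** Clears the option whether or not it was previously set. */
    public static int without(int flavour, int option) {
        return flavour & ~option;
    }

    /** Tests whether every bit of the option is present in the flavour. */
    public static boolean has(int flavour, int option) {
        return (flavour & option) == option;
    }

    public static void main(String[] args) {
        // XOR removes the option here because ISOMERIC already contains it
        int viaXor    = ISOMERIC ^ ATOMIC_MASS;
        int viaAndNot = without(ISOMERIC, ATOMIC_MASS);
        System.out.println(viaXor == viaAndNot); // true

        // XOR on a flavour that lacks the option switches it on instead
        System.out.println(has(STEREO ^ ATOMIC_MASS, ATOMIC_MASS)); // true
    }
}
```

When it is not known whether an option is present in a flavour, the AND NOT form is the safer way to switch it off.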
 
The isomeric and absolute generators encode tetrahedral and double bond stereochemistry using IStereoElements provided on the IAtomContainer. If stereochemistry is not being written, it may need to be determined from 2D/3D coordinates using StereoElementFactory.
By default the generator will not write aromatic SMILES. Kekulé SMILES are generally preferred for compatibility: aromaticity can easily be re-perceived by most toolkits, whilst kekulisation may fail. If you really want aromatic SMILES, the following code demonstrates how to generate them:
 IAtomContainer  benzene = ...;

 // 'benzene' molecule has no arom flags, we always get Kekulé output
 SmilesGenerator sg      = new SmilesGenerator(SmiFlavor.Generic);
 String          smi     = sg.create(benzene); // C1=CC=CC=C1

 // requesting aromatic symbols has no effect while the flags are unset
 sg  = new SmilesGenerator(SmiFlavor.Generic |
                           SmiFlavor.UseAromaticSymbols);
 smi = sg.create(benzene); // C1=CC=CC=C1 flags not set!

 // Note, in practice we'd use an aromaticity algorithm
 for (IAtom a : benzene.atoms())
     a.setIsAromatic(true);
 for (IBond b : benzene.bonds())
     b.setIsAromatic(true);

 // 'benzene' molecule now has arom flags, we get aromatic SMILES only if we request it
 sg  = new SmilesGenerator(SmiFlavor.Generic);
 smi = sg.create(benzene); // C1=CC=CC=C1

 sg  = new SmilesGenerator(SmiFlavor.Generic |
                           SmiFlavor.UseAromaticSymbols);
 smi = sg.create(benzene); // c1ccccc1
 
It can be useful to know the output order of SMILES. On input the order of the atoms reflects the atom index. If we know this order we can refer to atoms by index and associate data with the SMILES string. The output order is obtained by passing in an auxiliary array during creation. The following snippet demonstrates how we can write coordinates in order.

 IAtomContainer  mol = ...;
 SmilesGenerator sg  = new SmilesGenerator(SmiFlavor.Generic);

 int   n     = mol.getAtomCount();
 int[] order = new int[n];

 // the order array is filled up as the SMILES is generated
 String smi = sg.create(mol, order);

 // load the coordinates array such that they are in the order the atoms
 // are read when parsing the SMILES
 Point2d[] coords = new Point2d[mol.getAtomCount()];
 for (int i = 0; i < coords.length; i++)
     coords[order[i]] = mol.getAtom(i).getPoint2d();

 // SMILES string suffixed by the coordinates
 String smi2d = smi + " " + Arrays.toString(coords);
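
The rearrangement in the loop above can be traced with concrete values. The following is a minimal sketch only: the class OutputOrderSketch, the atom labels and the contents of the order array are hypothetical, chosen to represent a three-atom molecule stored as O, C, C and written out as CCO. It assumes, as in the snippet above, that order[i] holds the output position of atom i.

```java
import java.util.Arrays;

/** Hypothetical helper showing how an output-order array rearranges per-atom data. */
final class OutputOrderSketch {

    /**
     * Places the data of atom i at position order[i], so the result is
     * arranged in the order the atoms appear in the generated string.
     */
    public static String[] arrange(String[] perAtom, int[] order) {
        String[] arranged = new String[perAtom.length];
        for (int i = 0; i < perAtom.length; i++)
            arranged[order[i]] = perAtom[i];
        return arranged;
    }

    public static void main(String[] args) {
        // hypothetical labels: element symbol followed by the atom index
        String[] labels = {"O0", "C1", "C2"};
        // hypothetical order: atom 0 written last, atom 2 written first
        int[] order = {2, 1, 0};
        System.out.println(Arrays.toString(arrange(labels, order))); // [C2, C1, O0]
    }
}
```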

 
Using the output order of SMILES forms the basis of ChemAxon Extended SMILES (CXSMILES), which can also be generated. Extended SMILES allows additional structure data to be serialized, including atom labels/values, fragment grouping (for salts in reactions), polymer repeats, multi-center bonds, and coordinates. The CXSMILES layer is appended after the SMILES so that parsers which don't interpret it can ignore it. The two aggregate flavours are SmiFlavor.CxSmiles and SmiFlavor.CxSmilesWithCoords. As with other flavours, fine-grained control is possible using the flags in SmiFlavor.
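
How a reader that does not interpret the extension layer might skip it can be sketched as follows. This is an illustration only: the class CxLayerSketch is hypothetical, the layer content in the example is a placeholder rather than valid CXSMILES, and the sketch assumes the reader treats the first whitespace character as the end of the SMILES.

```java
/** Hypothetical helper that keeps only the SMILES part of an input line. */
final class CxLayerSketch {

    /**
     * Returns the text before the first whitespace character, or the whole
     * line when it contains no whitespace. Anything after that point
     * (an extension layer, a title) is dropped.
     */
    public static String coreSmiles(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (Character.isWhitespace(line.charAt(i)))
                return line.substring(0, i);
        }
        return line;
    }

    public static void main(String[] args) {
        // the text between the bars is a placeholder, not real layer content
        System.out.println(coreSmiles("CCO |placeholder-layer|")); // CCO
        System.out.println(coreSmiles("CCO"));                     // CCO
    }
}
```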
* The unique SMILES generation uses a fast equitable labelling procedure and as such there are some structures for which the output may not be unique. The number of such structures is generally minimal.
Author:
Oliver Horlacher, Stefan Kuhn (chiral smiles), John May
Source code:
main
Belongs to CDK module:
smiles
Keywords:
SMILES, generator
  • Constructor Details

    • SmilesGenerator

      @Deprecated public SmilesGenerator()
      Deprecated.
      use SmilesGenerator(int) configuring with SmiFlavor.
      Create the SMILES generator. The default output is described by SmiFlavor.Default, but it is best to choose/set this flavor explicitly.
    • SmilesGenerator

      public SmilesGenerator(int flavour)
      Create a SMILES generator with the specified SmiFlavor.
       SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Stereo |
                                                    SmiFlavor.Canonical);
       
      Parameters:
      flavour - SMILES flavour flags SmiFlavor
  • Method Details

    • aromatic

      public SmilesGenerator aromatic()
      Deprecated.
      configure with SmiFlavor
      Derive a new generator that writes aromatic atoms in lower case. The preferred way of doing this is now to use the SmilesGenerator(int) constructor:
       SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.UseAromaticSymbols);
       
      Returns:
      a generator for aromatic SMILES
    • withAtomClasses

      @Deprecated public SmilesGenerator withAtomClasses()
      Deprecated.
      configure with SmiFlavor
      Specifies that the generator should write atom classes in SMILES. Atom classes are provided by the CDKConstants.ATOM_ATOM_MAPPING property. This method returns a new SmilesGenerator to use.
       IAtomContainer  container = ...;
       SmilesGenerator smilesGen = SmilesGenerator.unique()
                                                  .withAtomClasses();
       smilesGen.createSMILES(container); // C[CH2:4]O second atom has class = 4
       
      Returns:
      a generator for SMILES with atom classes
    • generic

      public static SmilesGenerator generic()
      Create a generator for generic SMILES. Generic SMILES are non-canonical and useful for storing information when it is not used as an index (i.e. unique keys). The generated SMILES is dependent on the input order of the atoms.
      Returns:
      a new arbitrary SMILES generator
    • isomeric

      public static SmilesGenerator isomeric()
      Convenience method for creating an isomeric generator. Isomeric SMILES are non-unique but contain isotope numbers (e.g. [13C]) and stereo-chemistry.
      Returns:
      a new isomeric SMILES generator
    • unique

      public static SmilesGenerator unique()
      Create a unique SMILES generator. Unique SMILES use a fast canonisation algorithm but do not encode isotope or stereo-chemistry.
      Returns:
      a new unique SMILES generator
    • absolute

      public static SmilesGenerator absolute()
      Create an absolute SMILES generator. Absolute SMILES use the InChI to canonise SMILES and encode isotope and stereo-chemistry. The InChI module is not a dependency of the SMILES module but should be present on the classpath when generating absolute SMILES.
      Returns:
      a new absolute SMILES generator
    • createSMILES

      @Deprecated public String createSMILES(IAtomContainer molecule)
      Deprecated.
      use create(IAtomContainer)
      Create a SMILES string for the provided molecule.
      Parameters:
      molecule - the molecule to create the SMILES of
      Returns:
      a SMILES string
    • createSMILES

      @Deprecated public String createSMILES(IReaction reaction)
      Deprecated.
      use create(IReaction)
      Create a SMILES string for the provided reaction.
      Parameters:
      reaction - the reaction to create the SMILES of
      Returns:
      a reaction SMILES string
    • create

      public String create(IAtomContainer molecule) throws CDKException
      Generate SMILES for the provided molecule.
      Parameters:
      molecule - The molecule to evaluate
      Returns:
      the SMILES string
      Throws:
      CDKException - SMILES could not be created
    • create

      public String create(IAtomContainer molecule, int[] order) throws CDKException
      Creates a SMILES string of the flavour specified in the constructor and writes the output order to the provided array.
      The output order allows one to arrange auxiliary atom data in the order that a SMILES string will be read. A simple example is seen below where 2D coordinates are stored with a SMILES string. This method forms the basis of CXSMILES.
      
       IAtomContainer  mol = ...;
       SmilesGenerator sg  = new SmilesGenerator(SmiFlavor.Generic);
      
       int   n     = mol.getAtomCount();
       int[] order = new int[n];
      
       // the order array is filled up as the SMILES is generated
       String smi = sg.create(mol, order);
      
       // load the coordinates array such that they are in the order the atoms
       // are read when parsing the SMILES
       Point2d[] coords = new Point2d[mol.getAtomCount()];
       for (int i = 0; i < coords.length; i++)
           coords[order[i]] = mol.getAtom(i).getPoint2d();
      
       // SMILES string suffixed by the coordinates
       String smi2d = smi + " " + Arrays.toString(coords);
      
       
      Parameters:
      molecule - the molecule to write
      order - array to store the output order of atoms
      Returns:
      the SMILES string
      Throws:
      CDKException - SMILES could not be created
    • create

      public static String create(IAtomContainer molecule, int flavour, int[] order) throws CDKException
      Creates a SMILES string of the flavour specified as a parameter and writes the output order to the provided array.
      The output order allows one to arrange auxiliary atom data in the order that a SMILES string will be read. A simple example is seen below where 2D coordinates are stored with a SMILES string. This method forms the basis of CXSMILES.
      
       IAtomContainer mol = ...;
      
       int   n     = mol.getAtomCount();
       int[] order = new int[n];
      
       // the order array is filled up as the SMILES is generated;
       // this overload is static, so the flavour is passed as an argument
       String smi = SmilesGenerator.create(mol, SmiFlavor.Generic, order);
      
       // load the coordinates array such that they are in the order the atoms
       // are read when parsing the SMILES
       Point2d[] coords = new Point2d[mol.getAtomCount()];
       for (int i = 0; i < coords.length; i++)
           coords[order[i]] = mol.getAtom(i).getPoint2d();
      
       // SMILES string suffixed by the coordinates
       String smi2d = smi + " " + Arrays.toString(coords);
      
       
      Parameters:
      molecule - the molecule to write
      flavour - SMILES flavour flags SmiFlavor
      order - array to store the output order of atoms
      Returns:
      the SMILES string
      Throws:
      CDKException - a valid SMILES could not be created
    • createReactionSMILES

      @Deprecated public String createReactionSMILES(IReaction reaction) throws CDKException
      Deprecated.
      Create a SMILES for a reaction.
      Parameters:
      reaction - CDK reaction instance
      Returns:
      reaction SMILES
      Throws:
      CDKException - a valid SMILES could not be created
    • create

      public String create(IReaction reaction) throws CDKException
      Create a SMILES for a reaction of the flavour specified in the constructor.
      Parameters:
      reaction - CDK reaction instance
      Returns:
      reaction SMILES
      Throws:
      CDKException
    • create

      public String create(IReaction reaction, int[] ordering) throws CDKException
      Create a SMILES for a reaction of the flavour specified in the constructor and write the output order to the provided array.
      Parameters:
      reaction - CDK reaction instance
      ordering - array to store the output order of atoms
      Returns:
      reaction SMILES
      Throws:
      CDKException
    • setUseAromaticityFlag

      @Deprecated public void setUseAromaticityFlag(boolean useAromaticityFlag)
      Deprecated.
      since 1.5.6, use aromatic() - invoking this method does nothing
      Indicates whether output should be an aromatic SMILES.
      Parameters:
      useAromaticityFlag - if false only SP2-hybridized atoms will be lower case (default), true=SP2 or aromaticity trigger lower case
    • createComparator

      public static Comparator<IAtom> createComparator(IAtomContainer mol, int flavor)