Package org.openscience.cdk.smiles
Class SmilesGenerator
java.lang.Object
org.openscience.cdk.smiles.SmilesGenerator
SMILES [Weininger, David. Journal of Chemical Information and Computer Sciences. 1988. 28; Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29] provides a compact representation of chemical structures and reactions.
Different flavours of SMILES can be generated and are fully configurable. The standard flavours of SMILES defined by Daylight are:
- Generic - non-canonical SMILES string; different atom orderings produce different SMILES. No isotope or stereochemistry is encoded.
- Unique - canonical SMILES string; different atom orderings produce the same* SMILES. No isotope or stereochemistry is encoded.
- Isomeric - non-canonical SMILES string; different atom orderings produce different SMILES. Isotope and stereochemistry are encoded.
- Absolute - canonical SMILES string; different atom orderings produce the same SMILES. Isotope and stereochemistry are encoded.
The flags in SmiFlavor are used to select a flavour:
    SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Isomeric);
SmiFlavor provides more fine-grained control; for example, the following is equivalent to SmiFlavor.Isomeric:
    SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Stereo | SmiFlavor.AtomicMass);
Bitwise logic can also be used to remove options: SmiFlavor.Isomeric ^ SmiFlavor.AtomicMass will generate isomeric SMILES without atomic mass.
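The flag arithmetic described above can be sketched without CDK on the classpath. The constants below are hypothetical stand-ins for SmiFlavor.Stereo, SmiFlavor.AtomicMass and SmiFlavor.Isomeric; their numeric values are assumptions for illustration, not the library's. The sketch shows that XOR removes an option only when that option is already set, whereas AND-NOT clears it in either case.

```java
/** Sketch of flavour flag composition; the constant values are illustrative assumptions, not CDK's. */
final class FlavourFlagSketch {
    // Hypothetical bit values standing in for SmiFlavor.Stereo and SmiFlavor.AtomicMass.
    static final int STEREO      = 0x1;
    static final int ATOMIC_MASS = 0x2;
    // An aggregate flavour is the bitwise OR of its parts.
    static final int ISOMERIC    = STEREO | ATOMIC_MASS;

    /** Clears a flag whether or not it was set. */
    static int without(int flavour, int flag) {
        return flavour & ~flag;
    }

    /** True if every bit of the flag is present in the flavour. */
    static boolean has(int flavour, int flag) {
        return (flavour & flag) == flag;
    }

    public static void main(String[] args) {
        // XOR toggles the bit; because ISOMERIC contains ATOMIC_MASS, it is removed here.
        int viaXor = ISOMERIC ^ ATOMIC_MASS;
        // AND-NOT clears the bit regardless of its previous state.
        int viaAndNot = without(ISOMERIC, ATOMIC_MASS);
        System.out.println(viaXor == viaAndNot);
        // XOR applied to a flavour that lacks the flag switches the flag on instead.
        System.out.println(has(STEREO ^ ATOMIC_MASS, ATOMIC_MASS));
        System.out.println(has(without(STEREO, ATOMIC_MASS), ATOMIC_MASS));
    }
}
```

When the starting flavour is built dynamically and may or may not contain an option, the AND-NOT form is the safer way to switch that option off.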
A generator instance is created using the constructor or one of the static methods; the SMILES are then created by invoking create(IAtomContainer).
    IAtomContainer ethanol = ...;
    SmilesGenerator sg = new SmilesGenerator(SmiFlavor.Generic);
    String smi = sg.create(ethanol); // CCO, C(C)O, C(O)C, or OCC
    sg = new SmilesGenerator(SmiFlavor.Unique);
    smi = sg.create(ethanol); // only CCO
The isomeric and absolute generators encode tetrahedral and double bond stereochemistry using the IStereoElements provided on the IAtomContainer. If stereochemistry is not being written, it may need to be determined from 2D/3D coordinates using StereoElementFactory.
By default the generator will not write aromatic SMILES. Kekulé SMILES are generally preferred for compatibility: aromaticity can easily be re-perceived by most toolkits, whilst kekulisation may fail. If you really want aromatic SMILES, the following code demonstrates how to request them:
    IAtomContainer benzene = ...;
    // 'benzene' molecule has no arom flags, we always get Kekulé output
    SmilesGenerator sg = new SmilesGenerator(SmiFlavor.Generic);
    String smi = sg.create(benzene); // C1=CC=CC=C1
    sg = new SmilesGenerator(SmiFlavor.Generic | SmiFlavor.UseAromaticSymbols);
    smi = sg.create(benzene); // C1=CC=CC=C1 flags not set!
    // Note, in practice we'd use an aromaticity algorithm
    for (IAtom a : benzene.atoms())
        a.setIsAromatic(true);
    for (IBond b : benzene.bonds())
        b.setIsAromatic(true);
    // 'benzene' molecule now has arom flags, we get aromatic SMILES if we request it
    sg = new SmilesGenerator(SmiFlavor.Generic);
    smi = sg.create(benzene); // C1=CC=CC=C1
    sg = new SmilesGenerator(SmiFlavor.Generic | SmiFlavor.UseAromaticSymbols);
    smi = sg.create(benzene); // c1ccccc1
It can be useful to know the output order of SMILES. On input the order of the atoms reflects the atom index. If we know this order we can refer to atoms by index and associate data with the SMILES string. The output order is obtained by passing in an auxiliary array during creation. The following snippet demonstrates how we can write coordinates in order.
    IAtomContainer mol = ...;
    SmilesGenerator sg = new SmilesGenerator(SmiFlavor.Generic);
    int n = mol.getAtomCount();
    int[] order = new int[n];
    // the order array is filled up as the SMILES is generated
    String smi = sg.create(mol, order);
    // load the coordinates array such that they are in the order the atoms
    // are read when parsing the SMILES
    Point2d[] coords = new Point2d[mol.getAtomCount()];
    for (int i = 0; i < coords.length; i++)
        coords[order[i]] = mol.getAtom(i).getPoint2d();
    // SMILES string suffixed by the coordinates
    String smi2d = smi + " " + Arrays.toString(coords);
Using the output order of SMILES forms the basis of ChemAxon Extended SMILES (CXSMILES), which can also be generated. Extended SMILES allows additional structure data to be serialized, including atom labels/values, fragment grouping (for salts in reactions), polymer repeats, multi-center bonds, and coordinates. The CXSMILES layer is appended after the SMILES so that parsers which don't interpret it can ignore it. The two aggregate flavours are SmiFlavor.CxSmiles and SmiFlavor.CxSmilesWithCoords. As with other flavours, fine-grained control is possible; see SmiFlavor.
* the unique SMILES generation uses a fast equitable labelling procedure and as such there are some structures which may not be unique. The number of such structures is generally minimal.
- Author:
- Oliver Horlacher, Stefan Kuhn (chiral smiles), John May
- See Also:
- Source code:
- main
- Belongs to CDK module:
- smiles
- Keywords:
- SMILES, generator
-
Constructor Summary
Constructor: Description
- SmilesGenerator(): Deprecated.
- SmilesGenerator(int flavour): Create a SMILES generator with the specified SmiFlavor.
Method Summary
Modifier and Type, Method: Description
- static SmilesGenerator absolute(): Create an absolute SMILES generator.
- SmilesGenerator aromatic(): Deprecated. Configure with SmiFlavor.
- String create(IAtomContainer molecule): Generate SMILES for the provided molecule.
- String create(IAtomContainer molecule, int[] order): Creates a SMILES string of the flavour specified in the constructor and writes the output order to the provided array.
- static String create(IAtomContainer molecule, int flavour, int[] order): Creates a SMILES string of the flavour specified as a parameter and writes the output order to the provided array.
- String create(IReaction reaction): Create a SMILES for a reaction of the flavour specified in the constructor.
- String create(IReaction, int[]): Create a SMILES for a reaction of the flavour specified in the constructor and write the output order to the provided array.
- static Comparator<IAtom> createComparator(IAtomContainer mol, int flavor)
- String createReactionSMILES(IReaction reaction): Deprecated.
- String createSMILES(IAtomContainer molecule): Deprecated. Use create.
- String createSMILES(IReaction reaction): Deprecated. Use createReactionSMILES.
- static SmilesGenerator generic(): Create a generator for generic SMILES.
- static SmilesGenerator isomeric(): Convenience method for creating an isomeric generator.
- void setUseAromaticityFlag(boolean useAromaticityFlag): Deprecated. Since 1.5.6, use aromatic(); invoking this method does nothing.
- static SmilesGenerator unique(): Create a unique SMILES generator.
- SmilesGenerator withAtomClasses(): Deprecated. Configure with SmiFlavor.
-
Constructor Details
-
SmilesGenerator
Deprecated. Use SmilesGenerator(int), configuring with SmiFlavor.
Create the SMILES generator. The default output is described by SmiFlavor.Default, but it is best to choose and set the flavour explicitly.
- See Also:
-
SmilesGenerator
public SmilesGenerator(int flavour)
Create a SMILES generator with the specified SmiFlavor.
    SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Stereo | SmiFlavor.Canonical);
- Parameters:
flavour
- SMILES flavour flags, see SmiFlavor
-
-
Method Details
-
aromatic
Deprecated. Configure with SmiFlavor.
Derive a new generator that writes aromatic atoms in lower case. The preferred way of doing this is now to use the SmilesGenerator(int) constructor:
    SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.UseAromaticSymbols);
- Returns:
- a generator for aromatic SMILES
-
withAtomClasses
Deprecated. Configure with SmiFlavor.
Specifies that the generator should write atom classes in SMILES. Atom classes are provided by the CDKConstants.ATOM_ATOM_MAPPING property. This method returns a new SmilesGenerator to use.
    IAtomContainer container = ...;
    SmilesGenerator smilesGen = SmilesGenerator.unique().withAtomClasses();
    smilesGen.createSMILES(container); // C[CH2:4]O second atom has class = 4
- Returns:
- a generator for SMILES with atom classes
-
generic
Create a generator for generic SMILES. Generic SMILES are non-canonical and useful for storing information when they are not used as an index (i.e. as unique keys). The generated SMILES is dependent on the input order of the atoms.
- Returns:
- a new arbitrary SMILES generator
-
isomeric
Convenience method for creating an isomeric generator. Isomeric SMILES are non-unique but contain isotope numbers (e.g. [13C]) and stereochemistry.
- Returns:
- a new isomeric SMILES generator
-
unique
Create a unique SMILES generator. Unique SMILES use a fast canonisation algorithm but do not encode isotope or stereochemistry.
- Returns:
- a new unique SMILES generator
-
absolute
Create an absolute SMILES generator. Absolute SMILES use the InChI to canonise SMILES and encode isotope and stereochemistry. The InChI module is not a dependency of the SMILES module but should be present on the classpath when generating absolute SMILES.
- Returns:
- a new absolute SMILES generator
-
createSMILES
Deprecated. Use create.
Create a SMILES string for the provided molecule.
- Parameters:
molecule
- the molecule to create the SMILES of
- Returns:
- a SMILES string
-
createSMILES
Deprecated. Use createReactionSMILES.
Create a SMILES string for the provided reaction.
- Parameters:
reaction
- the reaction to create the SMILES of
- Returns:
- a reaction SMILES string
-
create
Generate SMILES for the provided molecule.
- Parameters:
molecule
- the molecule to evaluate
- Returns:
- the SMILES string
- Throws:
CDKException
- SMILES could not be created
-
create
Creates a SMILES string of the flavour specified in the constructor and writes the output order to the provided array.
The output order allows one to arrange auxiliary atom data in the order that a SMILES string will be read. A simple example is seen below where 2D coordinates are stored with a SMILES string. This method forms the basis of CXSMILES.
    IAtomContainer mol = ...;
    SmilesGenerator sg = new SmilesGenerator(SmiFlavor.Generic);
    int n = mol.getAtomCount();
    int[] order = new int[n];
    // the order array is filled up as the SMILES is generated
    String smi = sg.create(mol, order);
    // load the coordinates array such that they are in the order the atoms
    // are read when parsing the SMILES
    Point2d[] coords = new Point2d[mol.getAtomCount()];
    for (int i = 0; i < coords.length; i++)
        coords[order[i]] = mol.getAtom(i).getPoint2d();
    // SMILES string suffixed by the coordinates
    String smi2d = smi + " " + Arrays.toString(coords);
- Parameters:
molecule
- the molecule to write
order
- array to store the output order of atoms
- Returns:
- the SMILES string
- Throws:
CDKException
- SMILES could not be created
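The role of the order array can be illustrated without any CDK types. The sketch below assumes, as the coordinate example for this method implies, that order[i] is the position at which input atom i is written in the SMILES; the class name, labels and order values are hypothetical.

```java
import java.util.Arrays;

/** Sketch of applying an output-order array; no CDK types are involved. */
final class OutputOrderSketch {
    /**
     * Rearranges per-atom data from input (atom index) order into output order.
     * Assumes order[i] is the position at which input atom i is written.
     */
    static String[] toOutputOrder(String[] data, int[] order) {
        if (data.length != order.length)
            throw new IllegalArgumentException("data and order must have equal length");
        String[] out = new String[data.length];
        for (int i = 0; i < data.length; i++)
            out[order[i]] = data[i];
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical three-atom molecule: labels are indexed by input atom.
        String[] labels = {"a0", "a1", "a2"};
        // Hypothetical order array as it might be filled during generation.
        int[] order = {2, 0, 1};
        // prints [a1, a2, a0]
        System.out.println(Arrays.toString(toOutputOrder(labels, order)));
    }
}
```

Any per-atom data (coordinates, labels, values) can be permuted the same way, so that it lines up with the atoms as a parser would read them back from the SMILES string.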
-
create
Creates a SMILES string of the flavour specified as a parameter and writes the output order to the provided array.
The output order allows one to arrange auxiliary atom data in the order that a SMILES string will be read. A simple example is seen below where 2D coordinates are stored with a SMILES string. This method forms the basis of CXSMILES.
    IAtomContainer mol = ...;
    int n = mol.getAtomCount();
    int[] order = new int[n];
    // the order array is filled up as the SMILES is generated
    String smi = SmilesGenerator.create(mol, SmiFlavor.Generic, order);
    // load the coordinates array such that they are in the order the atoms
    // are read when parsing the SMILES
    Point2d[] coords = new Point2d[mol.getAtomCount()];
    for (int i = 0; i < coords.length; i++)
        coords[order[i]] = mol.getAtom(i).getPoint2d();
    // SMILES string suffixed by the coordinates
    String smi2d = smi + " " + Arrays.toString(coords);
- Parameters:
molecule
- the molecule to write
flavour
- SMILES flavour flags, see SmiFlavor
order
- array to store the output order of atoms
- Returns:
- the SMILES string
- Throws:
CDKException
- a valid SMILES could not be created
-
createReactionSMILES
Deprecated.
Create a SMILES for a reaction.
- Parameters:
reaction
- CDK reaction instance
- Returns:
- reaction SMILES
- Throws:
CDKException
- a valid SMILES could not be created
-
create
Create a SMILES for a reaction of the flavour specified in the constructor.
- Parameters:
reaction
- CDK reaction instance
- Returns:
- reaction SMILES
- Throws:
CDKException
-
create
Create a SMILES for a reaction of the flavour specified in the constructor and write the output order to the provided array.
- Parameters:
reaction
- CDK reaction instance
- Returns:
- reaction SMILES
- Throws:
CDKException
-
setUseAromaticityFlag
Deprecated. Since 1.5.6, use aromatic(); invoking this method does nothing.
Indicates whether output should be an aromatic SMILES.
- Parameters:
useAromaticityFlag
- if false, only SP2-hybridized atoms will be lower case (default); if true, SP2 hybridization or aromaticity triggers lower case
-
createComparator