Class SmilesParser
- java.lang.Object
-
- org.openscience.cdk.smiles.SmilesParser
-
public final class SmilesParser extends Object
Read molecules and reactions from a SMILES [?Authors?, SMILES Tutorial] string. Example usage
Reading Aromatic SMILES
    try {
        SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
        IAtomContainer m = sp.parseSmiles("c1ccccc1");
    } catch (InvalidSmilesException e) {
        System.err.println(e.getMessage());
    }
Aromatic SMILES are automatically kekulised, producing a structure with assigned bond orders. The aromatic specification on the atoms is maintained from the SMILES even if the structures are not considered aromatic. For example, 'c1ccc1' will correctly have two pi bonds assigned but the atoms/bonds will still be flagged as aromatic. Recomputing or clearing the aromaticity will remove these erroneous flags.
If a kekulé structure cannot be assigned, this is considered an error. The most common example is the omission of hydrogens on aromatic nitrogens (aromatic pyrrole is specified as '[nH]1cccc1', not 'n1cccc1'). These structures cannot be corrected without modifying their formula. If there are multiple locations where a hydrogen could be placed, the returned structure would differ depending on the atom input order. If you wish to skip the kekulisation (not recommended), it can be disabled with kekulise(false).
SMILES can be verified for validity with the DEPICT service.
Unsupported Features
The following features are not supported by this parser.
- variable order of bracket atom attributes, '[C-H]', '[CH@]' are considered invalid. The predefined order required by this parser follows the OpenSMILES specification of 'isotope', 'symbol', 'chiral', 'hydrogens', 'charge', 'atom class'
- atom class indication - this information is loaded but not annotated on the structure
- extended tetrahedral stereochemistry (cumulated double bonds)
- trigonal bipyramidal stereochemistry
- octahedral stereochemistry
The atom class is stored as the CDKConstants.ATOM_ATOM_MAPPING property.
    SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
    IAtomContainer m = sp.parseSmiles("c1[cH:5]cccc1");
    Integer c1 = m.getAtom(1).getProperty(CDKConstants.ATOM_ATOM_MAPPING); // 5
    Integer c2 = m.getAtom(2).getProperty(CDKConstants.ATOM_ATOM_MAPPING); // null
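The fixed attribute order for bracket atoms listed above ('isotope', 'symbol', 'chiral', 'hydrogens', 'charge', 'atom class') can be illustrated with a small regular expression. This is only a rough sketch using the Java standard library, not the parser's actual implementation: the class name `BracketAtomOrderSketch` and method `followsOrder` are hypothetical, and the pattern is deliberately simplified (for example, it only recognises `@` and `@@` as chirality and does not validate element symbols).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of CDK: checks only the ORDER of bracket atom attributes.
final class BracketAtomOrderSketch {

    private static final Pattern BRACKET_ATOM = Pattern.compile(
        "\\["
        + "(\\d+)?"                        // isotope
        + "([A-Z][a-z]?|[a-z]{1,2}|\\*)"   // symbol (not validated against the periodic table)
        + "(@{1,2})?"                      // chiral (simplified to @ and @@)
        + "(H\\d?)?"                       // hydrogens
        + "(\\+{1,2}|-{1,2}|[+-]\\d+)?"    // charge
        + "(:\\d+)?"                       // atom class
        + "\\]");

    /** Returns true if the attributes of a single bracket atom appear in the expected order. */
    static boolean followsOrder(String bracketAtom) {
        return BRACKET_ATOM.matcher(bracketAtom).matches();
    }

    public static void main(String[] args) {
        for (String atom : new String[] {"[13CH4]", "[C@H]", "[cH:5]", "[C-H]", "[CH@]"}) {
            System.out.println(atom + " -> " + followsOrder(atom));
        }
    }
}
```

Under these assumptions, '[C-H]' and '[CH@]' are rejected because the charge or chirality appears after an attribute that should follow it, while '[cH:5]' is accepted.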
- Author:
- Christoph Steinbeck, Egon Willighagen, John May
- Source code:
- main
- Belongs to CDK module:
- smiles
- Keywords:
- SMILES, parser
- Created on:
- 2002-04-29
-
-
Constructor Summary
Constructor and Description
- SmilesParser(IChemObjectBuilder builder): Create a new SMILES parser which will create IAtomContainers with the specified builder.
-
Method Summary
Modifier and Type, Method, Description
- boolean isPreservingAromaticity(): Deprecated.
- void kekulise(boolean kekulise): Indicates whether structures should be automatically kekulised if they are provided as aromatic.
- IReactionSet parseReactionSetSmiles(String smiles): Parse a SMILES that describes a set of reactions representing multiple synthesis steps or a metabolic pathway.
- IReaction parseReactionSmiles(String smiles): Parse a reaction SMILES.
- IAtomContainer parseSmiles(String smiles): Parses a SMILES string and returns a structure (IAtomContainer).
- void setPreservingAromaticity(boolean preservingAromaticity): Deprecated.
- void setStrict(boolean strict): Sets whether the parser is in strict mode.
-
-
-
Constructor Detail
-
SmilesParser
public SmilesParser(IChemObjectBuilder builder)
Create a new SMILES parser which will create IAtomContainers with the specified builder.
- Parameters:
builder
- used to create the CDK domain objects
-
-
Method Detail
-
setStrict
public void setStrict(boolean strict)
Sets whether the parser is in strict mode. In non-strict mode (default) recoverable issues with SMILES are reported as warnings.
- Parameters:
strict
- strict mode true/false.
-
parseReactionSmiles
public IReaction parseReactionSmiles(String smiles) throws InvalidSmilesException
Parse a reaction SMILES.
- Parameters:
smiles
- The SMILES string to parse
- Returns:
- An instance of IReaction
- Throws:
InvalidSmilesException
- if the string cannot be parsed
- See Also:
parseSmiles(String)
-
parseReactionSetSmiles
public IReactionSet parseReactionSetSmiles(String smiles) throws InvalidSmilesException
Parse a SMILES that describes a set of reactions representing multiple synthesis steps or a metabolic pathway. This is a logical extension to the SMILES reaction syntax. The basic idea is that the product(s) of the previous step become the reactants of the next step.
    {reactant}>{agent_1}>{product_1}>{agent_2}>{product_2}
Results in a reaction set with two reactions:
    {reactant}>{agent_1}>{product_1}    step 1
    {product_1}>{agent_2}>{product_2}   step 2
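The chaining rule can be sketched with plain string handling. The following is an illustrative assumption about how the steps relate, not the parser's implementation: the class `ReactionStepSketch` and method `splitSteps` are hypothetical names, and the sketch simply splits on '>' without validating the individual SMILES.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CDK: splits a multi-step reaction SMILES into single steps.
final class ReactionStepSketch {

    static List<String> splitSteps(String smiles) {
        // Keep empty fields so that an empty agent part ("A>>B") is preserved.
        String[] parts = smiles.split(">", -1);
        // n steps give 2n + 1 parts: one reactant followed by n (agent, product) pairs.
        if (parts.length < 3 || parts.length % 2 == 0) {
            throw new IllegalArgumentException("Not a chained reaction SMILES: " + smiles);
        }
        List<String> steps = new ArrayList<>();
        for (int i = 0; i + 2 < parts.length; i += 2) {
            // The product of the previous step (parts[i]) is the reactant of this step.
            steps.add(parts[i] + ">" + parts[i + 1] + ">" + parts[i + 2]);
        }
        return steps;
    }

    public static void main(String[] args) {
        // Ethanol to acetaldehyde to acetic acid, with the agents left empty.
        for (String step : splitSteps("CCO>>CC=O>>CC(=O)O")) {
            System.out.println(step);
        }
    }
}
```

With this input the sketch yields the two steps 'CCO>>CC=O' and 'CC=O>>CC(=O)O', mirroring the step 1 / step 2 breakdown above.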
- Parameters:
smiles
- the SMILES input string
- Returns:
- the reaction set
- Throws:
InvalidSmilesException
- the input was invalid (with reason)
-
parseSmiles
public IAtomContainer parseSmiles(String smiles) throws InvalidSmilesException
Parses a SMILES string and returns a structure (IAtomContainer).
- Parameters:
smiles
- A SMILES string
- Returns:
- A structure representing the provided SMILES
- Throws:
InvalidSmilesException
- thrown when the SMILES string is invalid
-
setPreservingAromaticity
@Deprecated public void setPreservingAromaticity(boolean preservingAromaticity)
Deprecated.
Makes the SMILES parser set aromaticity as provided in the SMILES itself, without detecting it. Default false. Atoms will not be typed when set to true.
- Parameters:
preservingAromaticity
- boolean to indicate if aromaticity is to be preserved.
- See Also:
kekulise
-
isPreservingAromaticity
@Deprecated public boolean isPreservingAromaticity()
Deprecated.
Gets the (default false) setting to preserve aromaticity as provided in the SMILES itself.
- Returns:
- true or false indicating if aromaticity is preserved.
-
kekulise
public void kekulise(boolean kekulise)
Indicates whether structures should be automatically kekulised if they are provided as aromatic. Kekulisation is on by default but can be turned off if it is believed the structures can be handled without assigned bond orders (not recommended).
- Parameters:
kekulise
- should structures be kekulised
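To see why 'n1cccc1' cannot be kekulised while '[nH]1cccc1' and 'c1ccc1' can, it may help to view kekulisation as pairing up every atom that still needs a double bond with exactly one such neighbour. The sketch below is a simplified backtracking illustration of that idea, not the algorithm this parser uses. The class `KekuleMatchingSketch`, its methods, and the hand-built ring inputs are all hypothetical; which atoms need a double bond is supplied by the caller rather than derived from a SMILES.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration, not part of CDK: kekulisation viewed as a perfect matching.
final class KekuleMatchingSketch {

    /** Bonds of a simple ring with n atoms: 0-1, 1-2, ..., (n-1)-0. */
    static int[][] ring(int n) {
        int[][] bonds = new int[n][];
        for (int i = 0; i < n; i++) bonds[i] = new int[] {i, (i + 1) % n};
        return bonds;
    }

    /** Returns the bonds chosen as double bonds, or empty if no assignment exists. */
    static Optional<List<int[]>> assignDoubleBonds(int[][] bonds, boolean[] needsDouble) {
        List<int[]> chosen = new ArrayList<>();
        boolean ok = match(0, bonds, needsDouble, new boolean[needsDouble.length], chosen);
        return ok ? Optional.of(chosen) : Optional.empty();
    }

    private static boolean match(int atom, int[][] bonds, boolean[] needsDouble,
                                 boolean[] matched, List<int[]> chosen) {
        // Skip atoms that need no double bond or already have one.
        while (atom < needsDouble.length && (!needsDouble[atom] || matched[atom])) atom++;
        if (atom == needsDouble.length) return true; // every atom is satisfied
        for (int[] bond : bonds) {
            int other;
            if (bond[0] == atom) other = bond[1];
            else if (bond[1] == atom) other = bond[0];
            else continue;
            if (!needsDouble[other] || matched[other]) continue;
            matched[atom] = matched[other] = true;
            chosen.add(bond);
            if (match(atom + 1, bonds, needsDouble, matched, chosen)) return true;
            chosen.remove(chosen.size() - 1); // undo and try the next neighbour
            matched[atom] = matched[other] = false;
        }
        return false;
    }

    public static void main(String[] args) {
        boolean[] all4 = {true, true, true, true};
        boolean[] all5 = {true, true, true, true, true};
        boolean[] pyrrole = {false, true, true, true, true}; // atom 0 is [nH]
        System.out.println("c1ccc1     -> " + assignDoubleBonds(ring(4), all4).map(List::size));
        System.out.println("n1cccc1    -> " + assignDoubleBonds(ring(5), all5).map(List::size));
        System.out.println("[nH]1cccc1 -> " + assignDoubleBonds(ring(5), pyrrole).map(List::size));
    }
}
```

In this model the five-membered ring with no hydrogen on nitrogen has an odd number of atoms that all need a double bond, so no pairing exists. Marking the nitrogen as '[nH]' removes it from the pairing and leaves four carbons, which pair into two double bonds.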
-
-