Class SMARTSQueryTool
java.lang.Object
org.openscience.cdk.smiles.smarts.SMARTSQueryTool
Deprecated.
This class provides an easy-to-use wrapper around SMARTS matching functionality. User code that wants to do
SMARTS matching should use this rather than using SMARTSParser (and UniversalIsomorphismTester) directly. Example
usage would be:
SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
IAtomContainer atomContainer = sp.parseSmiles("CC(=O)OC(=O)C");
// the constructor takes the SMARTS string and an IChemObjectBuilder
SMARTSQueryTool querytool = new SMARTSQueryTool("O=CO", DefaultChemObjectBuilder.getInstance());
boolean status = querytool.matches(atomContainer);
if (status) {
    int nmatch = querytool.countMatches();
    List<List<Integer>> mappings = querytool.getMatchingAtoms();
    for (int i = 0; i < nmatch; i++) {
        // indices of the target atoms that take part in the i-th match
        List<Integer> atomIndices = mappings.get(i);
    }
}
SMARTS Extensions
Currently the CDK supports the following SMARTS symbols that are not described in the Daylight specification. However, they are supported by other packages and are noted as such.
Symbol | Meaning | Default | Notes |
---|---|---|---|
Gx | Periodic group number | None | x must be specified and must be a number between 1 and 18. This symbol is supported by the MOE SMARTS implementation. |
#X | Any non-carbon heavy element | None | This symbol is supported by the MOE SMARTS implementation. |
^x | Any atom with a specified hybridization state | None | x must be specified and should be between 1 and 8 (inclusive), corresponding to SP1, SP2, SP3, SP3D1, SP3D2, SP3D3, SP3D4 and SP3D5. Supported by the OpenEye SMARTS implementation. |
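
The numeric arguments in the table can be checked before a pattern string is assembled. The following is a minimal sketch in plain Java, assuming hypothetical helper names (SmartsExtensionArgs, hybridizationLabel, isValidGroup) that are not part of the CDK; it only encodes the ranges and the label order stated in the table.

```java
class SmartsExtensionArgs {
    // Labels in the order the table lists them for ^1 through ^8.
    private static final String[] HYBRIDIZATIONS = {
        "SP1", "SP2", "SP3", "SP3D1", "SP3D2", "SP3D3", "SP3D4", "SP3D5"
    };

    // Returns the hybridization label for the x in a ^x primitive.
    static String hybridizationLabel(int x) {
        if (x < 1 || x > HYBRIDIZATIONS.length) {
            throw new IllegalArgumentException("^x requires 1 <= x <= 8, got " + x);
        }
        return HYBRIDIZATIONS[x - 1];
    }

    // Returns true if x is a legal periodic group number for a Gx primitive.
    static boolean isValidGroup(int x) {
        return x >= 1 && x <= 18;
    }

    public static void main(String[] args) {
        System.out.println("^3 -> " + hybridizationLabel(3));
        System.out.println("G18 valid: " + isValidGroup(18));
        System.out.println("G19 valid: " + isValidGroup(19));
    }
}
```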
Notes
- As described by Craig James, the h<n> SMARTS pattern should not be used. It was included in the Daylight spec for backwards compatibility. To match hydrogens, use the H<n> pattern.
- The wild card pattern (*) will not match hydrogens (explicit or implicit) unless an isotope is specified. In other words, * gives two hits against C[2H] but one hit against C[H]. This also means that it gives no hits against [H][H]. This is contrary to what is shown by Daylight's depictmatch service, but is based on this discussion. A workaround to get * to match [H][H] is to write it in the form [1H][1H]. It is not entirely clear what the behavior of * should be with respect to hydrogens. It is possible that the code will be updated so that * will not match any hydrogen in the future.
- The org.openscience.cdk.aromaticity.CDKHueckelAromaticityDetector only considers single rings and two fused non-spiro rings. As a result, it does not properly detect aromaticity in polycyclic systems such as [O-]C(=O)c1ccccc1c2c3ccc([O-])cc3oc4cc(=O)ccc24. Thus SMARTS patterns that depend on proper aromaticity detection may not work correctly in such polycyclic systems.
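
The wild card rule from the notes can be restated as a small predicate. This is only an illustrative sketch in plain Java, not the CDK matcher: the class WildcardHydrogenRule and its Atom type are hypothetical, and the rule it encodes is just the one stated above (a hydrogen is matched by * only when an isotope is given).

```java
class WildcardHydrogenRule {
    // One atom of a target: element symbol plus an optional mass number
    // (null when no isotope is written, as in [H]).
    static final class Atom {
        final String symbol;
        final Integer massNumber;

        Atom(String symbol, Integer massNumber) {
            this.symbol = symbol;
            this.massNumber = massNumber;
        }
    }

    // The rule from the notes: * matches every atom except a hydrogen
    // that carries no isotope specification.
    static boolean wildcardMatches(Atom atom) {
        return !"H".equals(atom.symbol) || atom.massNumber != null;
    }

    // Counts how many atoms of a target the wild card would hit.
    static int countHits(Atom... atoms) {
        int hits = 0;
        for (Atom atom : atoms) {
            if (wildcardMatches(atom)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("C[2H]:    " + countHits(new Atom("C", null), new Atom("H", 2)));
        System.out.println("C[H]:     " + countHits(new Atom("C", null), new Atom("H", null)));
        System.out.println("[H][H]:   " + countHits(new Atom("H", null), new Atom("H", null)));
        System.out.println("[1H][1H]: " + countHits(new Atom("H", 1), new Atom("H", 1)));
    }
}
```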
- Author: Rajarshi Guha
- Source code: main
- Belongs to CDK module: smarts
- Keywords: SMARTS, substructure search
- Created on: 2007-04-08
-
Constructor Summary
Constructor | Description |
---|---|
SMARTSQueryTool(String smarts, IChemObjectBuilder builder) | Deprecated. Create a new SMARTS query tool for the specified SMARTS string. |
-
Method Summary
Modifier and Type | Method | Description |
---|---|---|
int | countMatches() | Deprecated. Returns the number of times the pattern was found in the target molecule. |
List<List<Integer>> | getMatchingAtoms() | Deprecated. Get the atoms in the target molecule that match the query pattern. |
String | getSmarts() | Deprecated. Returns the current SMARTS pattern being used. |
List<List<Integer>> | getUniqueMatchingAtoms() | Deprecated. Get the atoms in the target molecule that match the query pattern. |
boolean | matches(IAtomContainer atomContainer) | Deprecated. Perform a SMARTS match and check whether the query is present in the target molecule. |
boolean | matches(IAtomContainer atomContainer, boolean forceInitialization) | Deprecated. Perform a SMARTS match and check whether the query is present in the target molecule. |
void | setAromaticity(Aromaticity aromaticity) | Deprecated. Set the aromaticity perception to use. |
void | setQueryCacheSize(int maxEntries) | Deprecated. Set the maximum size of the query cache. |
void | setSmarts(String smarts) | Deprecated. Set a new SMARTS pattern. |
void | useEssentialRings() | Deprecated. Indicates that ring properties should use the Essential Rings (default). |
void | useRelevantRings() | Deprecated. Indicates that ring properties should use the Relevant Rings. |
void | useSmallestSetOfSmallestRings() | Deprecated. Indicates that ring properties should use the Smallest Set of Smallest Rings. |
-
Constructor Details
-
SMARTSQueryTool
Deprecated. Create a new SMARTS query tool for the specified SMARTS string. Query objects will contain a reference to the specified IChemObjectBuilder.
- Parameters:
  - smarts - SMARTS query string
- Throws:
  - IllegalArgumentException - if the SMARTS string cannot be handled
-
-
Method Details
-
setQueryCacheSize
public void setQueryCacheSize(int maxEntries)
Deprecated. Set the maximum size of the query cache.
- Parameters:
  - maxEntries - The maximum number of entries
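
The documentation does not say how the query cache evicts entries once it reaches maxEntries. As a rough illustration of a size-bounded cache, the sketch below assumes least-recently-used eviction and uses a hypothetical class BoundedQueryCache built on java.util.LinkedHashMap; it is not the CDK's internal implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedQueryCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedQueryCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the
    // least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedQueryCache<String, String> cache = new BoundedQueryCache<>(2);
        cache.put("O=CO", "parsed query A");
        cache.put("c1ccccc1", "parsed query B");
        cache.get("O=CO");                   // marks O=CO as recently used
        cache.put("[#7]", "parsed query C"); // evicts c1ccccc1
        System.out.println(cache.keySet());
    }
}
```

Keying such a cache on the SMARTS string means a pattern that is set repeatedly would only need to be parsed once while it stays in the cache.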
-
useSmallestSetOfSmallestRings
public void useSmallestSetOfSmallestRings()
Deprecated. Indicates that ring properties should use the Smallest Set of Smallest Rings. The set is not unique and may lead to ambiguous matches.
-
useRelevantRings
public void useRelevantRings()
Deprecated. Indicates that ring properties should use the Relevant Rings. The set is unique and includes all of the SSSR but may be exponential in size.
-
useEssentialRings
public void useEssentialRings()
Deprecated. Indicates that ring properties should use the Essential Rings (default). The set is unique but only includes a subset of the SSSR.
-
setAromaticity
Deprecated. Set the aromaticity perception to use. Different aromaticity models may require certain attributes to be set (e.g. atom typing). These will not be automatically configured and should be preset before matching.
SMARTSQueryTool sqt = new SMARTSQueryTool(...);
sqt.setAromaticity(new Aromaticity(ElectronDonation.cdk(), Cycles.cdkAromaticSet()));
for (IAtomContainer molecule : molecules) {
    // CDK Aromatic model needs atom types
    AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(molecule);
    sqt.matches(molecule);
}
- Parameters:
  - aromaticity - the new aromaticity perception
-
getSmarts
Deprecated. Returns the current SMARTS pattern being used.
- Returns:
  - The SMARTS pattern
-
setSmarts
Deprecated. Set a new SMARTS pattern.
- Parameters:
  - smarts - The new SMARTS pattern
- Throws:
  - CDKException - if there is an error in parsing the pattern
-
matches
Deprecated. Perform a SMARTS match and check whether the query is present in the target molecule. This function simply checks whether the query pattern matches the specified molecule. However, the function also saves, internally, the mapping of query atoms to the target molecule.
Note: This method uses a simple caching scheme, comparing the current molecule to the previous molecule by reference. If you repeatedly match different SMARTS on the same molecule, this method avoids initializing the molecule (ring perception, aromaticity etc.) each time. If, however, you modify the molecule between such matchings, you should use the other form of this method to force initialization.
- Parameters:
  - atomContainer - The target molecule
- Returns:
  - true if the pattern is found in the target molecule, false otherwise
- Throws:
  - CDKException - if there is an error in ring, aromaticity or isomorphism perception
-
matches
public boolean matches(IAtomContainer atomContainer, boolean forceInitialization) throws CDKException
Deprecated. Perform a SMARTS match and check whether the query is present in the target molecule. This function simply checks whether the query pattern matches the specified molecule. However, the function also saves, internally, the mapping of query atoms to the target molecule.
- Parameters:
  - atomContainer - The target molecule
  - forceInitialization - If true, then the molecule is initialized (ring perception, aromaticity etc.). If false, the molecule is only initialized if it is a different object (in terms of object reference) from the one supplied in the previous call to this method.
- Returns:
  - true if the pattern is found in the target molecule, false otherwise
- Throws:
  - CDKException - if there is an error in ring, aromaticity or isomorphism perception
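
The interplay between the by-reference cache and forceInitialization can be sketched in plain Java. The class InitializationCache below is hypothetical and stands in for the expensive initialization step with a counter; it only mirrors the behavior described for the two matches methods and is not taken from the CDK source.

```java
class InitializationCache {
    private Object lastTarget;   // previous molecule, compared by reference
    private int initializations; // how often the expensive step has run

    // Returns true if the (stand-in) initialization was performed for this call.
    boolean prepare(Object target, boolean forceInitialization) {
        boolean run = forceInitialization || target != lastTarget;
        if (run) {
            initializations++;   // ring perception, aromaticity etc. would go here
            lastTarget = target;
        }
        return run;
    }

    int initializations() {
        return initializations;
    }

    public static void main(String[] args) {
        InitializationCache cache = new InitializationCache();
        Object molecule = new Object();
        System.out.println(cache.prepare(molecule, false)); // first use: initialized
        System.out.println(cache.prepare(molecule, false)); // same reference: skipped
        System.out.println(cache.prepare(molecule, true));  // forced after an edit
        System.out.println("initializations: " + cache.initializations());
    }
}
```

Because the comparison is by reference, an in-place change to the molecule goes unnoticed unless the caller passes true for forceInitialization.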
-
countMatches
public int countMatches()
Deprecated. Returns the number of times the pattern was found in the target molecule. This function should be called after matches(org.openscience.cdk.interfaces.IAtomContainer). If not, the results may be undefined.
- Returns:
  - The number of times the pattern was found in the target molecule
-
getMatchingAtoms
Deprecated. Get the atoms in the target molecule that match the query pattern. Since there may be multiple matches, the return value is a List of List objects. Each List object contains the indices of the atoms in the target molecule that match the query pattern.
- Returns:
  - A List of List of atom indices in the target molecule
-
getUniqueMatchingAtoms
Deprecated. Get the atoms in the target molecule that match the query pattern. Since there may be multiple matches, the return value is a List of List objects. Each List object contains the unique set of indices of the atoms in the target molecule that match the query pattern.
- Returns:
  - A List of List of atom indices in the target molecule
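
The description does not define uniqueness beyond "unique set of indices". One plausible reading, shown here as a sketch rather than the CDK's actual logic, is that two mappings count as duplicates when they cover the same atom indices in any order. The class UniqueMappings and its unique method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class UniqueMappings {
    // Keeps the first mapping seen for each distinct set of atom indices.
    static List<List<Integer>> unique(List<List<Integer>> mappings) {
        Set<Set<Integer>> seen = new HashSet<>();
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> mapping : mappings) {
            // A set ignores atom order, so [0, 1, 2] and [2, 1, 0] collide.
            if (seen.add(new TreeSet<>(mapping))) {
                result.add(mapping);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> mappings = Arrays.asList(
            Arrays.asList(0, 1, 2),
            Arrays.asList(2, 1, 0),  // same atoms, reversed order
            Arrays.asList(3, 4, 5));
        System.out.println(unique(mappings));
    }
}
```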
-
SmartsPattern