Class SMARTSQueryTool


  • @Deprecated
    public class SMARTSQueryTool
    extends Object
    Deprecated.
    This class provides an easy-to-use wrapper around SMARTS matching functionality. User code that wants to do SMARTS matching should use this class rather than using SMARTSParser (and UniversalIsomorphismTester) directly. Example usage:
    
     SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
     IAtomContainer atomContainer = sp.parseSmiles("CC(=O)OC(=O)C");
     // the constructor takes the SMARTS string and an IChemObjectBuilder
     SMARTSQueryTool querytool = new SMARTSQueryTool("O=CO", DefaultChemObjectBuilder.getInstance());
     boolean status = querytool.matches(atomContainer);
     if (status) {
        int nmatch = querytool.countMatches();
        List<List<Integer>> mappings = querytool.getMatchingAtoms();
        for (int i = 0; i < nmatch; i++) {
           // indices of the target atoms in the i-th match
           List<Integer> atomIndices = mappings.get(i);
        }
     }
     
    SMARTS Extensions
    Currently the CDK supports the following SMARTS symbols, which are not described in the Daylight specification. They are, however, supported by other packages, as noted in the table.
    Table 1 - Supported Extensions
    Symbol | Meaning | Default | Notes
    Gx | Periodic group number | None | x must be specified and must be a number between 1 and 18. Supported by the MOE SMARTS implementation.
    #X | Any non-carbon heavy element | None | Supported by the MOE SMARTS implementation.
    ^x | Any atom with a specified hybridization state | None | x must be specified and should be between 1 and 8 (inclusive), corresponding to SP1, SP2, SP3, SP3D1, SP3D2, SP3D3, SP3D4 and SP3D5. Supported by the OpenEye SMARTS implementation.
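    The numeric ranges in the table can be illustrated with a small stand-alone sketch. The class HybridizationSymbol and its methods are hypothetical names chosen for illustration; they are not part of the CDK API and only restate the bounds and the ordering given above.

```java
// Hypothetical helper, not part of the CDK: restates the ranges from Table 1.
final class HybridizationSymbol {
    // Names in the order listed in the table: x = 1 is SP1, x = 8 is SP3D5.
    private static final String[] STATES = {
        "SP1", "SP2", "SP3", "SP3D1", "SP3D2", "SP3D3", "SP3D4", "SP3D5"
    };

    // Decodes the x of a ^x primitive into the hybridization name it stands for.
    static String forIndex(int x) {
        if (x < 1 || x > STATES.length) {
            throw new IllegalArgumentException("x must be between 1 and 8, got " + x);
        }
        return STATES[x - 1];
    }

    // Checks the x of a Gx primitive: periodic group numbers run from 1 to 18.
    static boolean isValidGroup(int x) {
        return x >= 1 && x <= 18;
    }
}
```

    Under these assumptions, ^3 corresponds to SP3, and G19 would be rejected as out of range.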
    Notes
    • As described by Craig James, the h<n> SMARTS pattern should not be used. It was included in the Daylight spec for backwards compatibility. To match hydrogens, use the H<n> pattern.
    • The wildcard pattern (*) will not match hydrogens (explicit or implicit) unless an isotope is specified. In other words, * gives two hits against C[2H] but one hit against C[H]. This also means that it gives no hits against [H][H]. This is contrary to what is shown by Daylight's depictmatch service, but is based on this discussion. A workaround to get * to match [H][H] is to write the target in the form [1H][1H]. It is not entirely clear what the behavior of * should be with respect to hydrogens. It is possible that the code will be updated in the future so that * will not match any hydrogen.
    • The org.openscience.cdk.aromaticity.CDKHueckelAromaticityDetector only considers single rings and two fused non-spiro rings. As a result, it does not properly detect aromaticity in polycyclic systems such as [O-]C(=O)c1ccccc1c2c3ccc([O-])cc3oc4cc(=O)ccc24. Thus, SMARTS patterns that depend on proper aromaticity detection may not work correctly in such polycyclic systems.
    Author:
    Rajarshi Guha
    Source code:
    main
    Belongs to CDK module:
    smarts
    Keywords:
    SMARTS, substructure search
    Created on:
    2007-04-08
    • Constructor Detail

      • SMARTSQueryTool

        public SMARTSQueryTool​(String smarts,
                               IChemObjectBuilder builder)
        Deprecated.
        Create a new SMARTS query tool for the specified SMARTS string. Query objects will contain a reference to the specified IChemObjectBuilder.
        Parameters:
        smarts - SMARTS query string
        builder - the IChemObjectBuilder that query objects will hold a reference to
        Throws:
        IllegalArgumentException - if the SMARTS string cannot be handled
    • Method Detail

      • setQueryCacheSize

        public void setQueryCacheSize​(int maxEntries)
        Deprecated.
        Set the maximum size of the query cache.
        Parameters:
        maxEntries - The maximum number of entries
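        The documentation does not say how the cache decides which entries to drop. As a minimal sketch, assuming a least-recently-used policy, a size-bounded cache can be built on java.util.LinkedHashMap. BoundedQueryCache and setMaxEntries are hypothetical names for illustration and are not the CDK implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a bounded query cache keyed by SMARTS string.
// Assumes least-recently-used eviction; the CDK's actual policy may differ.
final class BoundedQueryCache<V> extends LinkedHashMap<String, V> {
    private static final long serialVersionUID = 1L;
    private int maxEntries;

    BoundedQueryCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Changes the limit; existing entries are only trimmed on later insertions,
    // one entry per insertion.
    void setMaxEntries(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Called by put(): drop the least recently used entry once over the limit
        return size() > maxEntries;
    }
}
```

        With a limit of two, inserting a third pattern evicts whichever of the first two was used least recently.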
      • useSmallestSetOfSmallestRings

        public void useSmallestSetOfSmallestRings()
        Deprecated.
        Indicates that ring properties should use the Smallest Set of Smallest Rings. The set is not unique and may lead to ambiguous matches.
        See Also:
        useEssentialRings(), useRelevantRings()
      • useRelevantRings

        public void useRelevantRings()
        Deprecated.
        Indicates that ring properties should use the Relevant Rings. The set is unique and includes all of the SSSR but may be exponential in size.
        See Also:
        useSmallestSetOfSmallestRings(), useEssentialRings()
      • useEssentialRings

        public void useEssentialRings()
        Deprecated.
        Indicates that ring properties should use the Essential Rings (default). The set is unique but only includes a subset of the SSSR.
        See Also:
        useSmallestSetOfSmallestRings(), useRelevantRings()
      • setAromaticity

        public void setAromaticity​(Aromaticity aromaticity)
        Deprecated.
        Set the aromaticity perception to use. Different aromaticity models may require certain attributes to be set (e.g. atom typing). These will not be automatically configured and should be preset before matching.
         SMARTSQueryTool sqt = new SMARTSQueryTool(...);
         sqt.setAromaticity(new Aromaticity(ElectronDonation.cdk(),
                                            Cycles.cdkAromaticSet()));
         for (IAtomContainer molecule : molecules) {
        
             // CDK Aromatic model needs atom types
             AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(molecule);
        
             sqt.matches(molecule);
         }
         
        Parameters:
        aromaticity - the new aromaticity perception
        See Also:
        ElectronDonation, Cycles
      • getSmarts

        public String getSmarts()
        Deprecated.
        Returns the current SMARTS pattern being used.
        Returns:
        The SMARTS pattern
      • setSmarts

        public void setSmarts​(String smarts)
                       throws CDKException
        Deprecated.
        Set a new SMARTS pattern.
        Parameters:
        smarts - The new SMARTS pattern
        Throws:
        CDKException - if there is an error in parsing the pattern
      • matches

        public boolean matches​(IAtomContainer atomContainer)
                        throws CDKException
        Deprecated.
        Perform a SMARTS match and check whether the query is present in the target molecule. This function simply checks whether the query pattern matches the specified molecule. However, the function will also, internally, save the mapping of query atoms to the target molecule. Note: This method performs a simple caching scheme, comparing the current molecule to the previous molecule by reference. If you repeatedly match different SMARTS on the same molecule, this method will avoid initializing the molecule (ring perception, aromaticity etc.) each time. If, however, you modify the molecule between such matchings, you should use the other form of this method to force initialization.
        Parameters:
        atomContainer - The target molecule
        Returns:
        true if the pattern is found in the target molecule, false otherwise
        Throws:
        CDKException - if there is an error in ring, aromaticity or isomorphism perception
        See Also:
        getMatchingAtoms(), countMatches(), matches(org.openscience.cdk.interfaces.IAtomContainer, boolean)
      • matches

        public boolean matches​(IAtomContainer atomContainer,
                               boolean forceInitialization)
                        throws CDKException
        Deprecated.
        Perform a SMARTS match and check whether the query is present in the target molecule. This function simply checks whether the query pattern matches the specified molecule. However, the function will also, internally, save the mapping of query atoms to the target molecule.
        Parameters:
        atomContainer - The target molecule
        forceInitialization - If true, then the molecule is initialized (ring perception, aromaticity etc.). If false, the molecule is only initialized if it is a different object (compared by reference) from the one supplied in the previous call to this method.
        Returns:
        true if the pattern is found in the target molecule, false otherwise
        Throws:
        CDKException - if there is an error in ring, aromaticity or isomorphism perception
        See Also:
        getMatchingAtoms(), countMatches(), matches(org.openscience.cdk.interfaces.IAtomContainer)
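        The reference-based caching rule shared by both forms of matches can be sketched without any CDK types. InitializationGuard is a hypothetical class for illustration only; it models when initialization would run, not what the initialization does.

```java
// Hypothetical sketch of the caching rule described for matches(); not CDK code.
final class InitializationGuard {
    private Object lastTarget;   // the target supplied in the previous call
    private int initializations; // how many times the costly setup ran

    // Returns true when setup (ring perception, aromaticity etc.) would be performed.
    boolean prepare(Object target, boolean forceInitialization) {
        // Identity (!=), not equals(): in-place edits to the same object go unnoticed
        boolean needsInit = forceInitialization || target != lastTarget;
        if (needsInit) {
            initializations++; // stands in for the real perception work
            lastTarget = target;
        }
        return needsInit;
    }

    int initializations() {
        return initializations;
    }
}
```

        Because the comparison is by reference, a molecule that is modified in place looks unchanged to the guard. This is the case in which the two-argument form with forceInitialization set to true is needed.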
      • countMatches

        public int countMatches()
        Deprecated.
        Returns the number of times the pattern was found in the target molecule. This function should be called after matches(org.openscience.cdk.interfaces.IAtomContainer). If not, the results may be undefined.
        Returns:
        The number of times the pattern was found in the target molecule
      • getMatchingAtoms

        public List<List<Integer>> getMatchingAtoms()
        Deprecated.
        Get the atoms in the target molecule that match the query pattern. Since there may be multiple matches, the return value is a List of List objects. Each List object contains the indices of the atoms in the target molecule that match the query pattern.
        Returns:
        A List of List of atom indices in the target molecule
      • getUniqueMatchingAtoms

        public List<List<Integer>> getUniqueMatchingAtoms()
        Deprecated.
        Get the atoms in the target molecule that match the query pattern. Since there may be multiple matches, the return value is a List of List objects. Each List object contains the unique set of indices of the atoms in the target molecule that match the query pattern.
        Returns:
        A List of List of atom indices in the target molecule
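        One plausible reading of "unique set of indices" is that mappings covering the same target atoms in a different order are collapsed into one. The sketch below shows that reading on plain lists. UniqueMappings is a hypothetical helper, and the behaviour is an assumption rather than a statement of what the CDK method does internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: collapses mappings that hit the same set of target atoms.
final class UniqueMappings {
    static List<List<Integer>> unique(List<List<Integer>> mappings) {
        // LinkedHashSet keeps the first occurrence of each atom set, in input order
        Set<List<Integer>> seen = new LinkedHashSet<>();
        for (List<Integer> mapping : mappings) {
            // Sort the indices so that the order of the query atoms is ignored
            seen.add(new ArrayList<>(new TreeSet<>(mapping)));
        }
        return new ArrayList<>(seen);
    }
}
```

        For a symmetric query such as CC, the mappings [0, 1] and [1, 0] refer to the same two atoms and would be reported once under this reading.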