Class ExtendedFingerprinter

  • All Implemented Interfaces:
    IFingerprinter

    public class ExtendedFingerprinter
    extends Object
    implements IFingerprinter
    Generates an extended fingerprint for a given IAtomContainer that extends the Fingerprinter with 25 additional bits describing ring features and isotopic masses. JWM comment: it would be better to hash the rings over the entire length, simply using a different seed. The original version of the class used a non-unique SSSR, which does not work for substructure screening, so this fingerprint can only be used for similarity.
    Author:
    shk3
    See Also:
    Fingerprinter
    Source code:
    main
    Belongs to CDK module:
    fingerprint
    Keywords:
    fingerprint, similarity
    Created on:
    2006-01-13
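    Because this fingerprint is suited to similarity rather than substructure screening, a typical use is to compare two fingerprints with a Tanimoto coefficient. The following is a minimal sketch over plain java.util.BitSet values (for example as obtained from IBitFingerprint.asBitSet()). The class name TanimotoSketch and the convention of returning 0.0 for two empty fingerprints are assumptions made for illustration, not part of this class.

```java
import java.util.BitSet;

// Hypothetical helper, not part of the CDK API.
final class TanimotoSketch {

    /** Returns |a AND b| / |a OR b|, or 0.0 when both bit sets are empty (assumed convention). */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);                 // bits set in both fingerprints
        BitSet either = (BitSet) a.clone();
        either.or(b);                  // bits set in at least one fingerprint
        int union = either.cardinality();
        return union == 0 ? 0.0 : (double) common.cardinality() / union;
    }
}
```

    A value of 1.0 means the two bit sets are identical; values near 0.0 mean they share few set bits.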
    • Constructor Detail

      • ExtendedFingerprinter

        public ExtendedFingerprinter()
        Creates a fingerprint generator of length DEFAULT_SIZE and with a search depth of DEFAULT_SEARCH_DEPTH.
      • ExtendedFingerprinter

        public ExtendedFingerprinter(int size)
      • ExtendedFingerprinter

        public ExtendedFingerprinter(int size,
                                     int searchDepth)
        Constructs a fingerprint generator that creates fingerprints of the given size, using a generation algorithm with the given search depth.
        Parameters:
        size - The desired size of the fingerprint
        searchDepth - The desired depth of search
    • Method Detail

      • getBitFingerprint

        public IBitFingerprint getBitFingerprint(IAtomContainer container)
                                          throws CDKException
        Generates a fingerprint of the default size for the given AtomContainer, using path and ring metrics. It contains the information from getBitFingerprint() plus bits that tell whether the structure has 0 rings, 1 or fewer rings, 2 or fewer rings, ..., 10 or fewer rings (referring to the smallest set of smallest rings), and bits that tell whether there is a fused ring system with 1, 2, ..., 8 or more rings in it.
        Specified by:
        getBitFingerprint in interface IFingerprinter
        Parameters:
        container - The AtomContainer for which a Fingerprint is generated
        Returns:
        a bit fingerprint for the given IAtomContainer.
        Throws:
        CDKException - may be thrown if there is an error during aromaticity detection or (for key based fingerprints) if there is a SMARTS parsing error
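        To make the ring-feature bits described above concrete, the sketch below reads that description literally: one bit per threshold "n or fewer rings" for n = 0..10, and one bit per threshold "a fused system with at least k rings" for k = 1..8. The class name RingBitsSketch, the method names, the bit ordering and the "at least k" reading are all assumptions for illustration. The conditions and bit positions the library actually uses are not shown here and may differ.

```java
import java.util.BitSet;

// Hypothetical illustration of the documented ring bits, not the library's code.
final class RingBitsSketch {
    static final int RING_COUNT_BITS = 11;   // "0 rings" through "10 or fewer rings"
    static final int FUSED_SYSTEM_BITS = 8;  // "1" through "8 or more rings"

    /** Bit n is set when the SSSR ring count is n or fewer (literal reading of the text). */
    static BitSet ringCountBits(int ringCount) {
        BitSet bits = new BitSet(RING_COUNT_BITS);
        for (int n = 0; n < RING_COUNT_BITS; n++) {
            if (ringCount <= n) {
                bits.set(n);
            }
        }
        return bits;
    }

    /** Bit k-1 is set when the largest fused ring system has at least k rings (assumed reading). */
    static BitSet fusedSystemBits(int largestSystemRingCount) {
        BitSet bits = new BitSet(FUSED_SYSTEM_BITS);
        for (int k = 1; k <= FUSED_SYSTEM_BITS; k++) {
            if (largestSystemRingCount >= k) {
                bits.set(k - 1);
            }
        }
        return bits;
    }
}
```

        Under this reading, threshold bits form a contiguous run, so two molecules with similar ring counts share most of these bits, which suits similarity comparison.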
      • getRawFingerprint

        public Map<String,Integer> getRawFingerprint(IAtomContainer container)
                                                    throws CDKException
        Returns the raw representation of the fingerprint for the given IAtomContainer. The raw representation contains counts as well as the key strings.
        Specified by:
        getRawFingerprint in interface IFingerprinter
        Parameters:
        container - IAtomContainer for which the fingerprint should be calculated.
        Returns:
        the raw fingerprint
        Throws:
        CDKException
      • getBitFingerprint

        public IBitFingerprint getBitFingerprint(IAtomContainer atomContainer,
                                                 IRingSet ringSet,
                                                 List<IRingSet> rslist)
                                          throws CDKException
        Generates a fingerprint of the default size for the given AtomContainer, using path and ring metrics. It contains the information from getBitFingerprint() plus bits that tell whether the structure has 0 rings, 1 or fewer rings, 2 or fewer rings, ..., 10 or fewer rings, and bits that tell whether there is a fused ring system with 1, 2, ..., 8 or more rings in it. The RingSet used is passed via the ringSet parameter and must be a smallest set of smallest rings (SSSR). The list must contain all ring systems in the molecule.
        Parameters:
        atomContainer - The AtomContainer for which a Fingerprint is generated
        ringSet - An SSSR RingSet of atomContainer (if not available, use getBitFingerprint(IAtomContainer), which does the calculation)
        rslist - A list of all ring systems in atomContainer
        Returns:
        an IBitFingerprint representing the fingerprint
        Throws:
        CDKException - for example if the input cannot be cloned.
      • getSize

        public int getSize()
        Returns the size (or length) of the fingerprint.
        Specified by:
        getSize in interface IFingerprinter
        Returns:
        the size of the fingerprint
      • getVersionDescription

        public String getVersionDescription()
        Description copied from interface: IFingerprinter
        Generates a fingerprint type version description in chemfp's FPS format. We report the library version rather than an individual version per fingerprint, although this is awkward because many fingerprints don't or won't change between releases. Where we cannot keep compatibility, we guarantee that we document how the fingerprint was encoded.
        Examples:
         #type=CDK-Fingerprinter/2.0 searchDepth=7 pathLimit=2000 hashPseudoAtoms=true
         #type=CDK-CircularFingerprint/2.0 classType=ECFP4
         
        Specified by:
        getVersionDescription in interface IFingerprinter
        Returns:
        version description.
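        A consumer of such a description may want to split it into a type name, a version and key=value options. The sketch below assumes the layout seen in the two example lines above (an optional "#type=" prefix, then name/version, then space-separated key=value pairs). The class FpsTypeLineSketch and its fields are hypothetical and not part of CDK or chemfp.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for lines shaped like the examples above.
final class FpsTypeLineSketch {
    final String name;
    final String version;
    final Map<String, String> params;

    private FpsTypeLineSketch(String name, String version, Map<String, String> params) {
        this.name = name;
        this.version = version;
        this.params = params;
    }

    static FpsTypeLineSketch parse(String line) {
        String text = line.trim();
        String prefix = "#type=";
        if (text.startsWith(prefix)) {
            text = text.substring(prefix.length());   // the prefix is treated as optional
        }
        String[] tokens = text.trim().split("\\s+");
        // First token is "name/version"; a missing version yields an empty string.
        String[] nameAndVersion = tokens[0].split("/", 2);
        String version = nameAndVersion.length > 1 ? nameAndVersion[1] : "";
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            int eq = tokens[i].indexOf('=');
            if (eq > 0) {                              // tokens without '=' are skipped
                params.put(tokens[i].substring(0, eq), tokens[i].substring(eq + 1));
            }
        }
        return new FpsTypeLineSketch(nameAndVersion[0], version, params);
    }
}
```

        Comparing the parsed name, version and options of two descriptions is one way to check that two fingerprint sets were generated with the same settings before comparing them.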
      • setPathLimit

        public void setPathLimit(int pathLimit)
        Sets the pathLimit for the base Daylight/path fingerprint. An exception is thrown if too many paths are generated from a single atom.
        Parameters:
        pathLimit - the number of paths to generate from a node
        See Also:
        Fingerprinter
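        The idea of a per-atom path limit can be sketched independently of CDK: enumerate simple paths from one start atom up to a search depth and fail once the count exceeds the limit. The class PathLimitSketch, the adjacency-list input and the use of IllegalStateException are assumptions for illustration only. The base fingerprinter's own traversal and exception type are not shown here.

```java
import java.util.List;

// Hypothetical illustration of a path limit, not the library's traversal.
final class PathLimitSketch {

    /**
     * Counts simple paths that start at 'start' and contain at most maxDepth atoms.
     * Throws IllegalStateException as soon as the count exceeds pathLimit.
     */
    static int countPaths(List<List<Integer>> adjacency, int start, int maxDepth, int pathLimit) {
        boolean[] visited = new boolean[adjacency.size()];
        int[] count = {0};
        walk(adjacency, start, 1, maxDepth, pathLimit, visited, count);
        return count[0];
    }

    private static void walk(List<List<Integer>> adjacency, int atom, int depth, int maxDepth,
                             int pathLimit, boolean[] visited, int[] count) {
        // Each call corresponds to one path ending at 'atom'.
        if (++count[0] > pathLimit) {
            throw new IllegalStateException("more than " + pathLimit + " paths from one atom");
        }
        if (depth == maxDepth) {
            return;
        }
        visited[atom] = true;
        for (int next : adjacency.get(atom)) {
            if (!visited[next]) {
                walk(adjacency, next, depth + 1, maxDepth, pathLimit, visited, count);
            }
        }
        visited[atom] = false;   // backtrack so other branches may reuse this atom
    }
}
```

        Highly fused or densely connected structures make the path count grow quickly with search depth, which is the situation a limit of this kind guards against.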
      • setHashPseudoAtoms

        public void setHashPseudoAtoms(boolean hashPseudoAtoms)
        Sets hashPseudoAtoms for the base Daylight/path fingerprint. This indicates whether pseudo-atoms should be hashed. For substructure screening this is not desirable, but this fingerprint uses SSSR and so cannot be used for substructure screening regardless.
        Parameters:
        hashPseudoAtoms - whether pseudo-atoms should be hashed
        See Also:
        Fingerprinter