Class ExtendedFingerprinter

java.lang.Object
org.openscience.cdk.fingerprint.ExtendedFingerprinter
All Implemented Interfaces:
IFingerprinter

public class ExtendedFingerprinter extends Object implements IFingerprinter
Generates an extended fingerprint for a given IAtomContainer that extends the Fingerprinter with 25 additional bits describing ring features and isotopic masses. JWM comment: it would be better to hash the rings over the entire length of the fingerprint, simply using a different seed. The original version of the class used a non-unique SSSR, which does not work for substructure screening, so this fingerprint can only be used for similarity.
Author:
shk3
Source code:
main
Belongs to CDK module:
fingerprint
Keywords:
fingerprint, similarity
Created on:
2006-01-13
  • Constructor Details

    • ExtendedFingerprinter

      public ExtendedFingerprinter()
      Creates a fingerprint generator of length DEFAULT_SIZE and with a search depth of DEFAULT_SEARCH_DEPTH.
    • ExtendedFingerprinter

      public ExtendedFingerprinter(int size)
    • ExtendedFingerprinter

      public ExtendedFingerprinter(int size, int searchDepth)
      Constructs a fingerprint generator that creates fingerprints of the given size, using a generation algorithm with the given search depth.
      Parameters:
      size - The desired size of the fingerprint
      searchDepth - The desired depth of search
  • Method Details

    • getBitFingerprint

      public IBitFingerprint getBitFingerprint(IAtomContainer container) throws CDKException
      Generates a fingerprint of the default size for the given AtomContainer, using path and ring metrics. It contains the information from the base Fingerprinter's getBitFingerprint(), bits that indicate whether the structure has 0 rings, 1 or fewer rings, 2 or fewer rings ... 10 or fewer rings (referring to the smallest set of smallest rings), and bits that indicate whether there is a fused ring system with 1, 2 ... 8 or more rings in it.
      Specified by:
      getBitFingerprint in interface IFingerprinter
      Parameters:
      container - The AtomContainer for which a Fingerprint is generated
      Returns:
      a bit fingerprint for the given IAtomContainer.
      Throws:
      CDKException - may be thrown if there is an error during aromaticity detection or (for key based fingerprints) if there is a SMARTS parsing error
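      The cumulative ring bits described above can be illustrated with a small JDK-only sketch. It follows a literal reading of the description ("k or fewer rings", "fused system with k or more rings"). The class name RingBitsSketch, the method ringBits and the bit offsets are hypothetical and are not taken from the CDK implementation, whose actual bit positions are not documented here.

```java
import java.util.BitSet;

/** Illustrative only: a cumulative ("thermometer") encoding of ring features. */
final class RingBitsSketch {
    static final int MAX_RING_COUNT = 10; // "0 rings ... 10 or fewer rings"
    static final int MAX_FUSED_SIZE = 8;  // "1, 2 ... 8 or more rings"

    /**
     * HYPOTHETICAL layout (not CDK's): bits 0..10 mean "at most k rings",
     * bits 11..18 mean "some fused ring system has at least k rings" for k = 1..8.
     */
    static BitSet ringBits(int ringCount, int largestFusedSystem) {
        BitSet bits = new BitSet(MAX_RING_COUNT + 1 + MAX_FUSED_SIZE);
        for (int k = 0; k <= MAX_RING_COUNT; k++) {
            if (ringCount <= k) {
                bits.set(k);
            }
        }
        for (int k = 1; k <= MAX_FUSED_SIZE; k++) {
            if (largestFusedSystem >= k) {
                bits.set(MAX_RING_COUNT + k);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // Two rings fused into one system of size two (a naphthalene-like case).
        System.out.println(ringBits(2, 2));
    }
}
```

      With an encoding of this kind, structures with similar ring counts share most of their ring bits, which suits similarity comparison.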
    • getRawFingerprint

      public Map<String,Integer> getRawFingerprint(IAtomContainer container) throws CDKException
      Returns the raw representation of the fingerprint for the given IAtomContainer. The raw representation contains counts as well as the key strings.
      Specified by:
      getRawFingerprint in interface IFingerprinter
      Parameters:
      container - IAtomContainer for which the fingerprint should be calculated.
      Returns:
      the raw fingerprint
      Throws:
      CDKException
    • getBitFingerprint

      public IBitFingerprint getBitFingerprint(IAtomContainer atomContainer, IRingSet ringSet, List<IRingSet> rslist) throws CDKException
      Generates a fingerprint of the default size for the given AtomContainer, using path and ring metrics. It contains the information from the base Fingerprinter's getBitFingerprint(), bits that indicate whether the structure has 0 rings, 1 or fewer rings, 2 or fewer rings ... 10 or fewer rings, and bits that indicate whether there is a fused ring system with 1, 2 ... 8 or more rings in it. The RingSet used is passed via the ringSet parameter and must be a smallest set of smallest rings (SSSR). The List must be a list of all ring systems in the molecule.
      Parameters:
      atomContainer - The AtomContainer for which a Fingerprint is generated
      ringSet - An SSSR RingSet of atomContainer (if not available, use getBitFingerprint(IAtomContainer), which does the calculation)
      rslist - A list of all ring systems in atomContainer
      Returns:
      an IBitFingerprint representing the fingerprint
      Throws:
      CDKException - for example, if the input cannot be cloned.
    • getSize

      public int getSize()
      Returns the size (or length) of the fingerprint.
      Specified by:
      getSize in interface IFingerprinter
      Returns:
      the size of the fingerprint
    • getCountFingerprint

      public ICountFingerprint getCountFingerprint(IAtomContainer container) throws CDKException
      Returns the count fingerprint for the given IAtomContainer.
      Specified by:
      getCountFingerprint in interface IFingerprinter
      Parameters:
      container - IAtomContainer for which the fingerprint should be calculated.
      Returns:
      the count fingerprint
      Throws:
      CDKException - if there is an error during aromaticity detection or (for key based fingerprints) if there is a SMARTS parsing error.
    • getVersionDescription

      public String getVersionDescription()
      Description copied from interface: IFingerprinter
      Generate a fingerprint type version description in chemfp's FPS format. We report the library version rather than an individual version per fingerprint. This is awkward, as many fingerprints don't or won't change between releases. Although we cannot keep compatibility, we guarantee that we document how the fingerprint was encoded.
      Examples:
       #type=CDK-Fingerprinter/2.0 searchDepth=7 pathLimit=2000 hashPseudoAtoms=true
       #type=CDK-CircularFingerprint/2.0 classType=ECFP4
       
      Specified by:
      getVersionDescription in interface IFingerprinter
      Returns:
      version description.
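      A consumer of such a description may want to split it into the type name, the version and the key=value options. The following is a minimal JDK-only sketch of that parsing, based solely on the two example lines above. The class FpsTypeLine and its fields are hypothetical, and because it is not stated here whether the returned string includes the "#type=" prefix, the sketch accepts both forms.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative parser for a "name/version key=value ..." type description. */
final class FpsTypeLine {
    final String name;
    final String version;
    final Map<String, String> options;

    private FpsTypeLine(String name, String version, Map<String, String> options) {
        this.name = name;
        this.version = version;
        this.options = options;
    }

    static FpsTypeLine parse(String line) {
        String text = line.trim();
        // Assumption: the "#type=" prefix is optional.
        if (text.startsWith("#type=")) {
            text = text.substring("#type=".length());
        }
        String[] tokens = text.split("\\s+");
        // First token is "name/version"; a missing version becomes an empty string.
        String[] nameAndVersion = tokens[0].split("/", 2);
        String version = nameAndVersion.length > 1 ? nameAndVersion[1] : "";
        // Remaining tokens are key=value pairs, kept in their original order.
        Map<String, String> options = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            String[] keyValue = tokens[i].split("=", 2);
            options.put(keyValue[0], keyValue.length > 1 ? keyValue[1] : "");
        }
        return new FpsTypeLine(nameAndVersion[0], version, options);
    }

    public static void main(String[] args) {
        FpsTypeLine parsed = parse("#type=CDK-CircularFingerprint/2.0 classType=ECFP4");
        System.out.println(parsed.name + " " + parsed.version + " " + parsed.options);
    }
}
```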
    • getFingerprint

      public BitSet getFingerprint(IAtomContainer mol) throws CDKException
      Description copied from interface: IFingerprinter
      Generate a binary fingerprint as a BitSet. This method will usually delegate to IFingerprinter.getBitFingerprint(IAtomContainer) and invoke IBitFingerprint.asBitSet(); it is included for backwards compatibility.
      Specified by:
      getFingerprint in interface IFingerprinter
      Parameters:
      mol - molecule
      Returns:
      BitSet
      Throws:
      CDKException - problem generating fingerprint
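      Since this fingerprint is intended for similarity rather than substructure screening, the BitSet returned here would typically be compared with a Tanimoto coefficient. The sketch below uses only java.util.BitSet so that it compiles without CDK. The class TanimotoSketch is hypothetical, and returning 0.0 for two empty fingerprints is a convention chosen for this example.

```java
import java.util.BitSet;

/** Illustrative Tanimoto similarity over two java.util.BitSet fingerprints. */
final class TanimotoSketch {

    /** Returns |A AND B| / |A OR B|, or 0.0 when neither fingerprint has a bit set. */
    static double tanimoto(BitSet a, BitSet b) {
        // Work on clones so the caller's fingerprints are not modified.
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionCount = union.cardinality();
        return unionCount == 0 ? 0.0 : (double) intersection.cardinality() / unionCount;
    }

    public static void main(String[] args) {
        BitSet first = new BitSet(1024);
        first.set(3);
        first.set(17);
        first.set(512);
        BitSet second = new BitSet(1024);
        second.set(3);
        second.set(512);
        second.set(900);
        // 2 shared bits out of 4 distinct bits.
        System.out.println(tanimoto(first, second));
    }
}
```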
    • setPathLimit

      public void setPathLimit(int pathLimit)
      Set the pathLimit for the base daylight/path fingerprint. If too many paths are generated from a single atom an exception is thrown.
      Parameters:
      pathLimit - the number of paths to generate from a node
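      To illustrate why a per-atom path limit is useful, the sketch below enumerates simple paths from one start atom up to a maximum depth and aborts once a limit is exceeded. It is not the CDK implementation: the class PathLimitSketch, the adjacency-list input, measuring depth in atoms, and the use of IllegalStateException (rather than a CDK exception type) are all assumptions made for this example.

```java
import java.util.List;

/** Illustrative depth-limited path enumeration with a per-start-atom path limit. */
final class PathLimitSketch {

    /**
     * Counts simple paths that start at 'start' and contain at most maxDepth atoms.
     * Throws IllegalStateException as soon as more than pathLimit paths are found.
     */
    static int countPaths(List<List<Integer>> adjacency, int start, int maxDepth, int pathLimit) {
        boolean[] visited = new boolean[adjacency.size()];
        int[] counter = {0};
        walk(adjacency, start, 1, maxDepth, pathLimit, visited, counter);
        return counter[0];
    }

    private static void walk(List<List<Integer>> adjacency, int atom, int depth, int maxDepth,
                             int pathLimit, boolean[] visited, int[] counter) {
        // Every call corresponds to one path ending at 'atom'.
        if (++counter[0] > pathLimit) {
            throw new IllegalStateException("more than " + pathLimit + " paths from one atom");
        }
        if (depth == maxDepth) {
            return;
        }
        visited[atom] = true;
        for (int next : adjacency.get(atom)) {
            if (!visited[next]) {
                walk(adjacency, next, depth + 1, maxDepth, pathLimit, visited, counter);
            }
        }
        visited[atom] = false; // backtrack so other branches may pass through this atom
    }
}
```

      In densely connected or highly fused structures the number of such paths grows quickly with depth, which is the situation a limit of this kind guards against.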
    • setHashPseudoAtoms

      public void setHashPseudoAtoms(boolean hashPseudoAtoms)
      Set the hashPseudoAtoms for the base daylight/path fingerprint. This indicates whether pseudo-atoms should be hashed. For substructure screening this is not desirable, but this fingerprint uses SSSR and so cannot be used for substructure screening regardless.
      Parameters:
      hashPseudoAtoms - whether pseudo-atoms should be hashed