Package org.openscience.cdk.fingerprint
Class Fingerprinter
java.lang.Object
org.openscience.cdk.fingerprint.AbstractFingerprinter
org.openscience.cdk.fingerprint.Fingerprinter
- All Implemented Interfaces:
IFingerprinter
- Direct Known Subclasses:
GraphOnlyFingerprinter, HybridizationFingerprinter
Generates a fingerprint for a given AtomContainer. Fingerprints are
one-dimensional bit arrays in which bits are set according to the
occurrence of particular structural features (see, for example, the
Daylight theory manual for more information). Fingerprints allow for
a fast screening step to exclude candidates for a substructure search in a
database. They are also a means of determining the similarity of chemical
structures.
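The two uses described above can be sketched with plain java.util.BitSet objects. This is a minimal illustration, not CDK code: the class FingerprintScreenSketch and its methods are hypothetical, the bit positions are made up, and the Tanimoto coefficient is assumed here as the similarity measure because the text does not name one.

```java
import java.util.BitSet;

/** Toy illustration of fingerprint screening and similarity; hypothetical, not CDK code. */
final class FingerprintScreenSketch {

    /** Screening: true if every bit set in the query is also set in the candidate. */
    static boolean mayContain(BitSet candidate, BitSet query) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(candidate);          // bits the query needs but the candidate lacks
        return missing.isEmpty();
    }

    /** Tanimoto coefficient |A and B| / |A or B|; defined here as 1.0 when both sets are empty. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet and = (BitSet) a.clone();
        and.and(b);
        BitSet or = (BitSet) a.clone();
        or.or(b);
        return or.isEmpty() ? 1.0 : (double) and.cardinality() / or.cardinality();
    }

    /** Builds a 1024-bit set with the given (made-up) positions switched on. */
    static BitSet bits(int... positions) {
        BitSet set = new BitSet(1024);
        for (int p : positions) set.set(p);
        return set;
    }

    public static void main(String[] args) {
        BitSet candidate = bits(3, 17, 256, 900);
        BitSet query = bits(17, 900);
        System.out.println(mayContain(candidate, query));   // true: candidate survives the screen
        System.out.println(mayContain(query, candidate));   // false: excluded without a full search
        System.out.println(tanimoto(candidate, query));     // 0.5 (2 shared bits, 4 bits in the union)
    }
}
```

A candidate that passes the screen still needs a real substructure match, because a passed screen only says the query might be present.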
A fingerprint is generated for an AtomContainer with this code:
    Molecule molecule = new Molecule();
    IFingerprinter fingerprinter = new Fingerprinter();
    IBitFingerprint fingerprint = fingerprinter.getBitFingerprint(molecule);
    fingerprint.size(); // returns 1024 by default
    fingerprint.length(); // returns the highest set bit
The Fingerprinter can optionally include explicit hydrogens
(setHashExplicitHydrogens(boolean)) and pseudo atoms
(setHashPseudoAtoms(boolean)). Both are ignored by default, which ensures the
fingerprint can be used for substructure screening.
Warning: The Daylight manual says: "Fingerprints are not so definite: if a fingerprint indicates a pattern is missing then it certainly is, but it can only indicate a pattern's presence with some probability." In the case of very small molecules, the probability that you get the same fingerprint for different molecules is high.
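The asymmetry in the quoted statement comes from several features sharing one bit. The sketch below shows this under stated assumptions: the class CollisionSketch is hypothetical, path strings are reduced to single atom symbols, and the hashing scheme (String.hashCode folded by modulo) is chosen for illustration only. How this class actually hashes paths is not described here.

```java
import java.util.BitSet;

/** Toy illustration of bit collisions in a folded fingerprint; not CDK's hashing scheme. */
final class CollisionSketch {

    /** Maps a path string to a bit position (assumed scheme: String.hashCode folded by modulo). */
    static int bitFor(String path, int size) {
        return Math.floorMod(path.hashCode(), size);
    }

    /** Sets one bit per path in a fingerprint of the given size. */
    static BitSet fingerprint(int size, String... paths) {
        BitSet fp = new BitSet(size);
        for (String path : paths) fp.set(bitFor(path, size));
        return fp;
    }

    public static void main(String[] args) {
        // "C" hashes to 67 and "S" to 83; both are 3 modulo 8, so they share a bit.
        BitSet tiny = fingerprint(8, "C");
        System.out.println(tiny.get(bitFor("S", 8)));      // true: a false positive for "S"
        System.out.println(tiny.get(bitFor("N", 8)));      // false: "N" is certainly absent

        // With 1024 bits the same two paths land on different bits (67 and 83).
        BitSet wide = fingerprint(1024, "C");
        System.out.println(wide.get(bitFor("S", 1024)));   // false
    }
}
```

An unset bit rules a feature out, while a set bit may have been set by a different feature. Larger fingerprints lower the collision rate but do not remove it.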
- Author:
- steinbeck
- Keywords:
- fingerprint, similarity
- Created on:
- 2002-02-24
Field Summary
Fields
Modifier and Type | Field | Description
static final int | DEFAULT_SEARCH_DEPTH | The default search depth used to create the fingerprints.
static final int | DEFAULT_SIZE | The default length of created fingerprints.
Constructor Summary
Constructors
Constructor | Description
Fingerprinter() | Creates a fingerprint generator of length DEFAULT_SIZE and with a search depth of DEFAULT_SEARCH_DEPTH.
Fingerprinter(int size) |
Fingerprinter(int size, int searchDepth) | Constructs a fingerprint generator that creates fingerprints of the given size, using a generation algorithm with the given search depth.
Method Summary
Modifier and Type | Method | Description
protected void | encodePaths(IAtomContainer mol, int depth, BitSet fp, int size) |
protected int[] | findPathes(IAtomContainer container, int searchDepth) | Deprecated.
IBitFingerprint | getBitFingerprint(IAtomContainer container) | Generates a fingerprint of the default size for the given AtomContainer.
IBitFingerprint | getBitFingerprint(IAtomContainer container, AllRingsFinder ringFinder) | Generates a fingerprint of the default size for the given AtomContainer.
protected String | getBondSymbol(IBond bond) | Gets the bondSymbol attribute of the Fingerprinter class.
 | getCountFingerprint(IAtomContainer container) | Returns the count fingerprint for the given IAtomContainer.
 | getParameters() | Subclasses should override this method to report the parameters they are configured with.
 | getRawFingerprint(IAtomContainer container) | Returns the raw representation of the fingerprint for the given IAtomContainer.
int | getSearchDepth() |
int | getSize() | Returns the size (or length) of the fingerprint.
void | setHashExplicitHydrogens(boolean value) | Include explicit hydrogen atoms in the fingerprint.
void | setHashPseudoAtoms(boolean value) | Include pseudo/query atoms in the fingerprint with atomic number 0.
void | setPathLimit(int limit) |
Methods inherited from class org.openscience.cdk.fingerprint.AbstractFingerprinter
getFingerprint, getVersionDescription
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.openscience.cdk.fingerprint.IFingerprinter
getFingerprint, getVersionDescription
Field Details
DEFAULT_SIZE
public static final int DEFAULT_SIZE
The default length of created fingerprints.
DEFAULT_SEARCH_DEPTH
public static final int DEFAULT_SEARCH_DEPTH
The default search depth used to create the fingerprints.
Constructor Details
Fingerprinter
public Fingerprinter()
Creates a fingerprint generator of length DEFAULT_SIZE and with a search depth of DEFAULT_SEARCH_DEPTH.
Fingerprinter
public Fingerprinter(int size)
Fingerprinter
public Fingerprinter(int size, int searchDepth)
Constructs a fingerprint generator that creates fingerprints of the given size, using a generation algorithm with the given search depth.
- Parameters:
- size - The desired size of the fingerprint
- searchDepth - The desired depth of search (number of bonds)
Method Details
getParameters
Description copied from class: AbstractFingerprinter
Subclasses should override this method to report the parameters they are configured with.
- Overrides:
- getParameters in class AbstractFingerprinter
- Returns:
- The key=value pairs of configured parameters
getBitFingerprint
public IBitFingerprint getBitFingerprint(IAtomContainer container, AllRingsFinder ringFinder) throws CDKException
Generates a fingerprint of the default size for the given AtomContainer.
- Parameters:
- container - The AtomContainer for which a fingerprint is generated
- ringFinder - An instance of AllRingsFinder
- Returns:
- An IBitFingerprint representing the fingerprint
- Throws:
- CDKException - if there is a timeout in ring or aromaticity perception
getBitFingerprint
public IBitFingerprint getBitFingerprint(IAtomContainer container) throws CDKException
Generates a fingerprint of the default size for the given AtomContainer.
- Specified by:
- getBitFingerprint in interface IFingerprinter
- Parameters:
- container - The AtomContainer for which a fingerprint is generated
- Returns:
- the bit fingerprint
- Throws:
- CDKException - may be thrown if there is an error during aromaticity detection or (for key-based fingerprints) if there is a SMARTS parsing error
getRawFingerprint
Returns the raw representation of the fingerprint for the given IAtomContainer. The raw representation contains counts as well as the key strings.
- Specified by:
- getRawFingerprint in interface IFingerprinter
- Parameters:
- container - IAtomContainer for which the fingerprint should be calculated.
- Returns:
- the raw fingerprint
- Throws:
- CDKException
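To make "counts as well as the key strings" concrete, the sketch below tallies how often each feature key occurs. It is an illustration only: the class RawFingerprintSketch, the use of a Map from String to Integer, and the feature strings are all assumptions, and whether or how this class fills in a raw fingerprint is not stated here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy "raw fingerprint": each feature key mapped to its occurrence count; hypothetical, not CDK code. */
final class RawFingerprintSketch {

    /** Counts how many times each feature key occurs in the input list. */
    static Map<String, Integer> rawCounts(List<String> features) {
        Map<String, Integer> counts = new TreeMap<>();      // sorted keys for stable output
        for (String feature : features) {
            counts.merge(feature, 1, Integer::sum);         // insert 1 or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up features for a three-atom C-C-O skeleton.
        List<String> features = Arrays.asList("C", "C", "O", "CC", "CO");
        System.out.println(rawCounts(features));            // {C=2, CC=1, CO=1, O=1}
    }
}
```

A bit fingerprint keeps only whether a key occurred, so the counts are the extra information a raw or count representation carries.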
getCountFingerprint
Description copied from interface: IFingerprinter
Returns the count fingerprint for the given IAtomContainer.
- Specified by:
- getCountFingerprint in interface IFingerprinter
- Parameters:
- container - IAtomContainer for which the fingerprint should be calculated.
- Returns:
- the count fingerprint
- Throws:
- CDKException - if there is an error during aromaticity detection or (for key-based fingerprints) if there is a SMARTS parsing error.
findPathes
@Deprecated protected int[] findPathes(IAtomContainer container, int searchDepth) throws CDKException
Deprecated.
Get all paths of lengths 0 to the specified length. This method will find all paths up to length N starting from each atom in the molecule and return the unique set of such paths.
- Parameters:
- container - The molecule to search
- searchDepth - The maximum path length desired
- Returns:
- A Map of path strings, keyed on themselves
- Throws:
- CDKException
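The enumeration described above (all paths of 0 to N bonds from every atom, reduced to a unique set) can be sketched on a toy labelled graph. This is a hedged illustration rather than this class's implementation: PathEnumerationSketch is a hypothetical name, atoms are single-character symbols, bond types are ignored, and a path is merged with its reverse by keeping the lexicographically smaller string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy path enumeration on a labelled graph; a sketch of the idea, not CDK's implementation. */
final class PathEnumerationSketch {

    /** Returns the unique simple paths of 0..searchDepth bonds as concatenated atom symbols. */
    static Set<String> findPaths(String[] symbols, int[][] bonds, int searchDepth) {
        List<List<Integer>> neighbours = new ArrayList<>();
        for (int i = 0; i < symbols.length; i++) neighbours.add(new ArrayList<>());
        for (int[] bond : bonds) {                  // undirected: record both directions
            neighbours.get(bond[0]).add(bond[1]);
            neighbours.get(bond[1]).add(bond[0]);
        }
        Set<String> paths = new TreeSet<>();
        boolean[] visited = new boolean[symbols.length];
        for (int start = 0; start < symbols.length; start++) {
            walk(start, symbols, neighbours, visited, new StringBuilder(), searchDepth, paths);
        }
        return paths;
    }

    /** Depth-first walk that records the current path, then extends it while bonds remain. */
    private static void walk(int atom, String[] symbols, List<List<Integer>> neighbours,
                             boolean[] visited, StringBuilder path, int bondsLeft, Set<String> out) {
        visited[atom] = true;
        int mark = path.length();
        path.append(symbols[atom]);
        out.add(canonical(path.toString()));
        if (bondsLeft > 0) {
            for (int next : neighbours.get(atom)) {
                if (!visited[next]) walk(next, symbols, neighbours, visited, path, bondsLeft - 1, out);
            }
        }
        path.setLength(mark);                       // backtrack
        visited[atom] = false;
    }

    /** Keeps the smaller of a path and its reverse; assumes single-character atom symbols. */
    private static String canonical(String path) {
        String reversed = new StringBuilder(path).reverse().toString();
        return path.compareTo(reversed) <= 0 ? path : reversed;
    }

    public static void main(String[] args) {
        String[] symbols = {"C", "C", "O"};         // heavy atoms of a C-C-O chain
        int[][] bonds = {{0, 1}, {1, 2}};
        System.out.println(findPaths(symbols, bonds, 2));   // [C, CC, CCO, CO, O]
    }
}
```

Raising searchDepth adds longer paths and so more distinct features, at the cost of more enumeration work on branched or cyclic graphs.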
encodePaths
protected void encodePaths(IAtomContainer mol, int depth, BitSet fp, int size) throws CDKException
- Throws:
- CDKException
getBondSymbol
protected String getBondSymbol(IBond bond)
Gets the bondSymbol attribute of the Fingerprinter class.
- Parameters:
- bond - Description of the Parameter
- Returns:
- The bondSymbol value
setPathLimit
public void setPathLimit(int limit)
setHashPseudoAtoms
public void setHashPseudoAtoms(boolean value)
Include pseudo/query atoms in the fingerprint with atomic number 0. Generally this is not wanted for substructure screening, where path-based fingerprints are most useful.
- Parameters:
- value - the setting (false by default)
setHashExplicitHydrogens
public void setHashExplicitHydrogens(boolean value)
Include explicit hydrogen atoms in the fingerprint. This means you get a different fingerprint depending on whether hydrogens are implicit or explicit. Generally this is not wanted for substructure screening, where path-based fingerprints are most useful.
- Parameters:
- value - the setting (false by default)
getSearchDepth
public int getSearchDepth()
getSize
public int getSize()
Description copied from interface: IFingerprinter
Returns the size (or length) of the fingerprint.
- Specified by:
- getSize in interface IFingerprinter
- Returns:
- the size of the fingerprint
encodePaths(IAtomContainer, int, BitSet, int)