Class PubchemFingerprinter

All Implemented Interfaces: IFingerprinter

public class PubchemFingerprinter extends AbstractFingerprinter implements IFingerprinter
Generates a Pubchem fingerprint for a molecule. These fingerprints are described here and are of the structural key type, of length 881. See Fingerprinter for a more detailed description of fingerprints in general. This implementation is based on the public domain code made available by the NCGC here.
A fingerprint is generated for an AtomContainer with this code:
   Molecule molecule = new Molecule();
   PubchemFingerprinter fprinter = new PubchemFingerprinter();
   BitSet fingerprint = fprinter.getBitFingerprint(molecule);
   fprinter.getSize(); // returns 881
   fingerprint.length(); // returns the index of the highest set bit, plus one
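The difference between the fixed fingerprint size and BitSet.length() can be checked with plain java.util.BitSet, without any CDK classes. The following is a minimal sketch; the class name FingerprintLengthDemo, the helper withBits and the chosen bit positions are illustrative assumptions and are not part of the CDK API.

```java
import java.util.BitSet;

final class FingerprintLengthDemo {
    // Nominal fingerprint size quoted in the description above (getSize() returns 881).
    static final int FP_SIZE = 881;

    /** Illustrative helper: builds a BitSet with the given bit positions set. */
    static BitSet withBits(int... positions) {
        BitSet bits = new BitSet(FP_SIZE);
        for (int p : positions) {
            bits.set(p);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet fp = withBits(0, 9, 344);
        System.out.println(fp.length());      // 345: index of the highest set bit (344) plus one
        System.out.println(fp.cardinality()); // 3: number of set bits
        System.out.println(new BitSet(FP_SIZE).length()); // 0: no bit is set
    }
}
```

Because length() depends on which bits are set, it should not be used as the fingerprint size; getSize() is the value to rely on for that.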
Note that the fingerprinter assumes that you have detected aromaticity and atom types before evaluating the fingerprint. It also expects explicit H's to be present.
Note that this fingerprint is not particularly fast, as it performs ring detection using AllRingsFinder as well as multiple SMARTS queries.
Some SMARTS patterns have been modified from the original code, since they were based on explicit H matching. The explicit H's are replaced with a query of the form #<N>&!H0, where <N> is the atomic number. Thus bit 344 was originally [#6](~[#6])([H]) but is written here as [#6&!H0]~[#6]. In some cases, where the H count can be reduced to a single possibility, that H count is used directly. An example is bit 35, which was [#6](~[#6])(~[#6])(~[#6])([H]) and is rewritten as [#6H1](~[#6])(~[#6])(~[#6]).
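The two rewritten atom query forms can be illustrated with simple string building. This is only a sketch of the notation: the class SmartsHydrogenRewriteSketch and the helpers atLeastOneH and exactH are hypothetical names, do not exist in the CDK, and do not parse or validate SMARTS.

```java
final class SmartsHydrogenRewriteSketch {
    /** Atom query for an element carrying at least one hydrogen: [#<N>&!H0]. */
    static String atLeastOneH(int atomicNumber) {
        return "[#" + atomicNumber + "&!H0]";
    }

    /** Atom query for an element with an exact hydrogen count: [#<N>H<count>]. */
    static String exactH(int atomicNumber, int hCount) {
        return "[#" + atomicNumber + "H" + hCount + "]";
    }

    public static void main(String[] args) {
        // Bit 344: [#6](~[#6])([H]) is written as [#6&!H0]~[#6]
        System.out.println(atLeastOneH(6) + "~[#6]");
        // Bit 35: [#6](~[#6])(~[#6])(~[#6])([H]) is written as [#6H1](~[#6])(~[#6])(~[#6])
        System.out.println(exactH(6, 1) + "(~[#6])(~[#6])(~[#6])");
    }
}
```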
Warning - this class is not thread-safe and stores intermediate results internally. Please use a separate instance of the class for each thread.
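One common way to give each thread its own instance is a ThreadLocal. The sketch below uses a hypothetical stand-in class, StatefulFingerprinter, in place of PubchemFingerprinter so that it compiles without the CDK; its toy fingerprint method and the helper fingerprintAll are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerThreadFingerprinterSketch {
    /** Hypothetical stand-in for a fingerprinter that keeps scratch state between steps. */
    static final class StatefulFingerprinter {
        private final BitSet scratch = new BitSet(881);

        BitSet fingerprint(String input) {
            scratch.clear();                  // shared mutable state: unsafe across threads
            for (char c : input.toCharArray()) {
                scratch.set(c % 881);         // toy bit assignment, not a real fingerprint
            }
            return (BitSet) scratch.clone();
        }
    }

    // Each worker thread lazily creates and then reuses its own instance.
    private static final ThreadLocal<StatefulFingerprinter> LOCAL =
            ThreadLocal.withInitial(StatefulFingerprinter::new);

    static List<BitSet> fingerprintAll(List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<BitSet>> futures = new ArrayList<>();
            for (String s : inputs) {
                Callable<BitSet> task = () -> LOCAL.get().fingerprint(s);
                futures.add(pool.submit(task));
            }
            List<BitSet> results = new ArrayList<>();
            for (Future<BitSet> f : futures) {
                results.add(f.get());         // results keep the input order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With the real class, the ThreadLocal would supply a PubchemFingerprinter per thread instead of the stand-in; creating a new instance inside each task is a simpler alternative when construction cost is not a concern.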
Important! This fingerprint cannot be used for substructure screening.
Author: Rajarshi Guha
Source code:
Belongs to CDK module:
Thread Safe: No
Keywords: fingerprint, similarity