public class PubchemFingerprinter extends AbstractFingerprinter implements IFingerprinter
See Fingerprinter for a more detailed description of fingerprints in general. This implementation is based on the public domain code made available by the NCGC.
A fingerprint is generated for an AtomContainer with this code:

    IChemObjectBuilder builder = SilentChemObjectBuilder.getInstance();
    IAtomContainer molecule = builder.newInstance(IAtomContainer.class);
    PubchemFingerprinter fprinter = new PubchemFingerprinter(builder);
    IBitFingerprint fingerprint = fprinter.getBitFingerprint(molecule);
    fprinter.getSize(); // returns 881
    fingerprint.asBitSet().length(); // returns the index of the highest set bit plus one

Note that the fingerprinter assumes that you have detected aromaticity and atom types before evaluating the fingerprint. It also expects explicit hydrogens to be present.

Note that this fingerprint is not particularly fast, as it performs ring detection using AllRingsFinder as well as multiple SMARTS queries.
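The comments in the usage snippet above distinguish the fixed fingerprint size (881) from the position of the highest set bit. The following standalone sketch uses only java.util.BitSet to illustrate that difference. The class name BitSetLengthDemo is hypothetical, the constant is copied from the documented size, and the bit positions are arbitrary rather than taken from a real molecule.

```java
import java.util.BitSet;

public class BitSetLengthDemo {
    // Same value that getSize() is documented to return; copied so the sketch is standalone.
    static final int FP_SIZE = 881;

    /** Builds a BitSet with a few arbitrary bits set (not a real fingerprint). */
    static BitSet exampleBits() {
        BitSet bits = new BitSet(FP_SIZE);
        bits.set(0);
        bits.set(35);
        bits.set(344);
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = exampleBits();
        // length() is the index of the highest set bit plus one, not the fingerprint size.
        System.out.println(bits.length());
        // cardinality() is the number of bits that are set.
        System.out.println(bits.cardinality());
        // size() is the allocated capacity, which is at least the requested number of bits.
        System.out.println(bits.size() >= FP_SIZE);
    }
}
```

Code that needs the nominal fingerprint width should therefore rely on getSize() rather than on the length of the underlying BitSet.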
Some SMARTS patterns have been modified from the original code, since they were based on explicit H matching. As a result, we replace the explicit H's with a query of the form #<N>&!H0, where <N> is the atomic number. Thus bit 344 was originally [#6](~[#6])([H]) but is written here as [#6&!H0]~[#6]. In some cases, where the H count can be reduced to a single possibility, we directly use that H count. An example is bit 35, which was [#6](~[#6])(~[#6])(~[#6])([H]) and is rewritten as [#6H1](~[#6])(~[#6])(~[#6]).
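The two substitution rules described above can be sketched as small string helpers. This is only an illustration of the rewritten atom expressions, not the code the fingerprinter uses; the class and method names are hypothetical, and the helpers do not parse or validate SMARTS.

```java
public class HydrogenQueryDemo {
    /** Atom expression for element N with at least one hydrogen: [#N&!H0]. */
    static String atLeastOneH(int atomicNumber) {
        return "[#" + atomicNumber + "&!H0]";
    }

    /** Atom expression for element N with exactly hCount hydrogens: [#NHk]. */
    static String exactH(int atomicNumber, int hCount) {
        return "[#" + atomicNumber + "H" + hCount + "]";
    }

    public static void main(String[] args) {
        // Bit 344: [#6](~[#6])([H]) becomes [#6&!H0]~[#6]
        System.out.println(atLeastOneH(6) + "~[#6]");
        // Bit 35: [#6](~[#6])(~[#6])(~[#6])([H]) becomes [#6H1](~[#6])(~[#6])(~[#6])
        System.out.println(exactH(6, 1) + "(~[#6])(~[#6])(~[#6])");
    }
}
```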
Modifier and Type | Field and Description |
---|---|
static int | FP_SIZE: Number of bits in this fingerprint. |
Constructor and Description |
---|
PubchemFingerprinter(IChemObjectBuilder builder) |
Modifier and Type | Method and Description |
---|---|
static BitSet | decode(String enc): Returns a fingerprint from a Base64 encoded Pubchem fingerprint. |
IBitFingerprint | getBitFingerprint(IAtomContainer atomContainer): Calculate 881 bit Pubchem fingerprint for a molecule. |
ICountFingerprint | getCountFingerprint(IAtomContainer container): Returns the count fingerprint for the given IAtomContainer. |
byte[] | getFingerprintAsBytes(): Returns the fingerprint generated for a molecule as a byte[]. |
Map<String,Integer> | getRawFingerprint(IAtomContainer iAtomContainer): Returns the raw representation of the fingerprint for the given IAtomContainer. |
int | getSize(): Get the size of the fingerprint. |
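Methods inherited from class AbstractFingerprinter: getFingerprint, getParameters, getVersionDescription
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface IFingerprinter: getFingerprint, getVersionDescription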
public static final int FP_SIZE
public PubchemFingerprinter(IChemObjectBuilder builder)
public IBitFingerprint getBitFingerprint(IAtomContainer atomContainer) throws CDKException
Specified by: getBitFingerprint in interface IFingerprinter
Parameters: atomContainer - the molecule to consider
Throws: CDKException - if there is an error during substructure searching or atom typing
See Also: getFingerprintAsBytes()
public Map<String,Integer> getRawFingerprint(IAtomContainer iAtomContainer) throws CDKException
Specified by: getRawFingerprint in interface IFingerprinter
Parameters: iAtomContainer - IAtomContainer for which the fingerprint should be calculated.
Throws: CDKException
public int getSize()
Specified by: getSize in interface IFingerprinter
public byte[] getFingerprintAsBytes()
See Also: getBitFingerprint(org.openscience.cdk.interfaces.IAtomContainer)
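Both getFingerprintAsBytes() and decode(String) deal with a packed byte form of the fingerprint. The sketch below shows one way such a round trip can work using only the Java standard library. It assumes a layout of a 4-byte big-endian bit-length prefix followed by the bits packed most-significant-bit first; that layout, the class name PackedFingerprintDemo, and both helper methods are assumptions made for illustration and are not taken from this class's implementation.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.BitSet;

public class PackedFingerprintDemo {
    static final int FP_SIZE = 881;      // documented fingerprint size
    static final int PREFIX_BYTES = 4;   // assumed length prefix
    static final int TOTAL_BYTES = PREFIX_BYTES + (FP_SIZE + 7) / 8;

    /** Packs the bits behind the assumed length prefix and Base64 encodes the result. */
    static String encode(BitSet bits) {
        byte[] packed = new byte[TOTAL_BYTES];
        ByteBuffer.wrap(packed).putInt(FP_SIZE); // ByteBuffer is big-endian by default
        for (int i = bits.nextSetBit(0); i >= 0 && i < FP_SIZE; i = bits.nextSetBit(i + 1)) {
            // Bit i goes into byte i / 8, counting from the most significant bit.
            packed[PREFIX_BYTES + (i >> 3)] |= (byte) (0x80 >> (i & 7));
        }
        return Base64.getEncoder().encodeToString(packed);
    }

    /** Reverses encode(); rejects input whose length or prefix does not match. */
    static BitSet decode(String enc) {
        byte[] packed = Base64.getDecoder().decode(enc);
        if (packed.length < TOTAL_BYTES || ByteBuffer.wrap(packed).getInt() != FP_SIZE) {
            throw new IllegalArgumentException("Not a packed " + FP_SIZE + "-bit fingerprint");
        }
        BitSet bits = new BitSet(FP_SIZE);
        for (int i = 0; i < FP_SIZE; i++) {
            if ((packed[PREFIX_BYTES + (i >> 3)] & (0x80 >> (i & 7))) != 0) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet original = new BitSet(FP_SIZE);
        original.set(0);
        original.set(35);
        original.set(880);
        BitSet restored = decode(encode(original));
        System.out.println(restored.equals(original));
    }
}
```

Validating the prefix before unpacking keeps malformed or truncated input from being silently interpreted as a fingerprint.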
public static BitSet decode(String enc)
Parameters: enc - The Base64 encoded fingerprint
public ICountFingerprint getCountFingerprint(IAtomContainer container) throws CDKException
Returns the count fingerprint for the given IAtomContainer.
Specified by: getCountFingerprint in interface IFingerprinter
Parameters: container - IAtomContainer for which the fingerprint should be calculated.
Throws: CDKException - if there is an error during aromaticity detection or (for key based fingerprints) if there is a SMARTS parsing error.
Copyright © 2022. All rights reserved.