@TestClass(value="org.openscience.cdk.fingerprint.PubchemFingerprinterTest") public class PubchemFingerprinter extends Object implements IFingerprinter
See Fingerprinter for a more detailed description of fingerprints in general. This implementation is based on the public domain code made available by the NCGC.
A fingerprint is generated for an AtomContainer with this code:

    Molecule molecule = new Molecule();
    PubchemFingerprinter fprinter = new PubchemFingerprinter();
    BitSet fingerprint = fprinter.getFingerprint(molecule);
    fprinter.getSize(); // returns 881
    fingerprint.length(); // returns the index of the highest set bit, plus one

Note that the fingerprinter assumes that you have detected aromaticity and atom types before evaluating the fingerprint. The fingerprinter also expects explicit H's to be present.

Note that this fingerprint is not particularly fast, as it performs ring detection using AllRingsFinder as well as multiple SMARTS queries.
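The difference between the fixed fingerprint size and the value reported by the returned BitSet can be illustrated with the standard library alone. The sketch below does not call the fingerprinter: the class name `BitSetLengthDemo` and the bits it sets are hypothetical stand-ins, chosen only to show how `java.util.BitSet.length()` behaves.

```java
import java.util.BitSet;

class BitSetLengthDemo {
    // Fixed fingerprint size quoted in the description above.
    static final int FP_SIZE = 881;

    // Builds a stand-in fingerprint; the set bits are arbitrary, not real structural keys.
    static BitSet sample() {
        BitSet fingerprint = new BitSet(FP_SIZE);
        fingerprint.set(0);
        fingerprint.set(344);
        return fingerprint;
    }

    public static void main(String[] args) {
        BitSet fingerprint = sample();
        // length() is the index of the highest set bit plus one, so it depends on the content.
        System.out.println("length      = " + fingerprint.length());
        // cardinality() counts the set bits.
        System.out.println("cardinality = " + fingerprint.cardinality());
        // The nominal size stays constant regardless of which bits are set.
        System.out.println("FP_SIZE     = " + FP_SIZE);
    }
}
```

Code that needs a fixed-width vector should therefore rely on `getSize()` rather than on `length()` of the returned BitSet.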
Some SMARTS patterns have been modified from the original code, since they were based on explicit H matching. As a result, we replace the explicit H's with a query of the form #N&!H0, where N is the atomic number. Thus bit 344 was originally [#6](~[#6])([H]) but is written here as [#6&!H0]~[#6]. In some cases, where the H count can be reduced to a single possibility, we directly use that H count. An example is bit 35, which was [#6](~[#6])(~[#6])(~[#6])([H]) and is rewritten as [#6H1](~[#6])(~[#6])(~[#6]).
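The explicit-H replacement described above can be sketched as a plain string rewrite. This is only an illustration, not the code used by the fingerprinter: the class `ExplicitHRewriteSketch` and its `rewrite` method are hypothetical, and the sketch assumes that the pattern starts with an atom of the form [#N] and that every ([H]) branch hangs off that first atom.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExplicitHRewriteSketch {
    // Matches a leading atomic-number primitive such as [#6].
    private static final Pattern LEADING_ATOM = Pattern.compile("^\\[#(\\d+)\\]");
    // An explicit hydrogen written as a branch.
    private static final String EXPLICIT_H_BRANCH = "([H])";

    // Drops ([H]) branches and requires "at least one H" on the first atom instead.
    // Assumption: all ([H]) branches belong to the first atom of the pattern.
    static String rewrite(String smarts) {
        Matcher m = LEADING_ATOM.matcher(smarts);
        if (!m.find() || !smarts.contains(EXPLICIT_H_BRANCH)) {
            return smarts; // nothing this sketch knows how to rewrite
        }
        String withoutH = smarts.replace(EXPLICIT_H_BRANCH, "");
        return "[#" + m.group(1) + "&!H0]" + withoutH.substring(m.end());
    }

    public static void main(String[] args) {
        // Original form of bit 344 as quoted in the text.
        System.out.println(rewrite("[#6](~[#6])([H])"));
    }
}
```

For the bit 344 input this yields [#6&!H0](~[#6]), which is the branch notation for the same query as [#6&!H0]~[#6]. The sketch never narrows the query to an exact H count such as H1; that step depends on the valence of the atom and is left out here.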
Modifier and Type | Field and Description |
---|---|
static int | FP_SIZE |
Constructor and Description |
---|
PubchemFingerprinter() |
Modifier and Type | Method and Description |
---|---|
static BitSet | decode(String enc) Returns a fingerprint from a Base64 encoded Pubchem fingerprint. |
BitSet | getFingerprint(IAtomContainer atomContainer) Calculate 881 bit Pubchem fingerprint for a molecule. |
byte[] | getFingerprintAsBytes() Returns the fingerprint generated for a molecule as a byte[]. |
int | getSize() Get the size of the fingerprint. |
public static final int FP_SIZE
@TestMethod(value="testFingerprint") public BitSet getFingerprint(IAtomContainer atomContainer) throws CDKException
Specified by: getFingerprint in interface IFingerprinter
Parameters: atomContainer - the molecule to consider
Throws: CDKException - if there is an error during substructure searching or atom typing
See Also: getFingerprintAsBytes()
@TestMethod(value="testGetSize") public int getSize()
Specified by: getSize in interface IFingerprinter
@TestMethod(value="testGetFingerprintAsBytes") public byte[] getFingerprintAsBytes()
See Also: getFingerprint(org.openscience.cdk.interfaces.IAtomContainer)
@TestMethod(value="testDecode,testDecode_invalid") public static BitSet decode(String enc)
Parameters: enc - The Base64 encoded fingerprint
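How a Base64 string might be turned into a BitSet can be sketched with the standard library. The byte layout used below is an assumption rather than something stated on this page: a 4-byte big-endian length prefix equal to 881, followed by the bits packed most-significant-bit first. The class `Base64FingerprintSketch` and its `encode` helper are hypothetical and exist only to make the round trip testable; the actual `decode` method may differ.

```java
import java.util.Base64;
import java.util.BitSet;

class Base64FingerprintSketch {
    static final int FP_SIZE = 881;
    private static final int HEADER_BYTES = 4;                // assumed length prefix
    private static final int PAYLOAD_BYTES = (FP_SIZE + 7) / 8;

    // Decodes under the assumed layout: big-endian bit count, then MSB-first bits.
    static BitSet decode(String enc) {
        byte[] raw = Base64.getDecoder().decode(enc);
        if (raw.length < HEADER_BYTES + PAYLOAD_BYTES) {
            throw new IllegalArgumentException("input too short for an 881 bit fingerprint");
        }
        int len = ((raw[0] & 0xff) << 24) | ((raw[1] & 0xff) << 16)
                | ((raw[2] & 0xff) << 8) | (raw[3] & 0xff);
        if (len != FP_SIZE) {
            throw new IllegalArgumentException("unexpected fingerprint length: " + len);
        }
        BitSet bits = new BitSet(FP_SIZE);
        for (int i = 0; i < FP_SIZE; i++) {
            int b = raw[HEADER_BYTES + (i >> 3)] & 0xff;
            if ((b & (0x80 >> (i & 7))) != 0) {
                bits.set(i);
            }
        }
        return bits;
    }

    // Hypothetical inverse of decode, used only to build test input for the sketch.
    static String encode(BitSet bits) {
        byte[] raw = new byte[HEADER_BYTES + PAYLOAD_BYTES];
        raw[2] = (byte) (FP_SIZE >> 8);
        raw[3] = (byte) (FP_SIZE & 0xff);
        for (int i = bits.nextSetBit(0); i >= 0 && i < FP_SIZE; i = bits.nextSetBit(i + 1)) {
            raw[HEADER_BYTES + (i >> 3)] |= (byte) (0x80 >> (i & 7));
        }
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        BitSet in = new BitSet(FP_SIZE);
        in.set(0);
        in.set(880);
        System.out.println(decode(encode(in)).equals(in));
    }
}
```

Rejecting input with a missing or wrong length prefix mirrors the fact that the test list for `decode` includes an invalid-input case; the exact exception type used by the real method is not shown on this page.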