Package org.openscience.cdk.graph
Class TripletShortCycles
- java.lang.Object
-
- org.openscience.cdk.graph.TripletShortCycles
-
public final class TripletShortCycles extends Object
Compute the shortest cycles through each vertex triple. This allows one to directly obtain the envelope rings of bicyclic fused system. These cycles can be thought of as the 'ESSSR' (extended smallest set of smallest rings) and 'envelope' rings as used by PubChem fingerprints (CACTVS Substructure Keys). The PubChem fingerprint documentation exclusively refers to the ESSSR and envelopes as just the ESSSR and the rest of this documentation does the same. This class provides the cycles (vertex paths) for each ring in the ESSSR. The ESSSR should not be confused with the extended set of smallest rings (ESSR) [Downs, G et. al.. J. Chem. Inf. Comput. Sci. 1989. 29]. Algorithm To our knowledge no algorithm has been published for the ESSSR. The PubChem Specifications states - "An ESSSR ring is any ring which does not share three consecutive atoms with any other ring in the chemical structure. For example, naphthalene has three ESSSR rings (two phenyl fragments and the 10-membered envelope), while biphenyl will yield a count of only two ESSSR rings". The name implies the use of the smallest set of smallest rings (SSSR). Not every graph has an SSSR and so the minimum cycle basis is used instead. With this modification the algorithm is outlined below.- Compute a minimum cycle basis (or SSSR) of the graph (may not be unique)
- For each vertex v and two adjacent vertices (u and w) check if the path -u-v-w- belongs to any cycles already in the basis
- If no such cycle can be found compute the shortest cycle which travels through -u-v-w- and add it to the basis. The shortest cycle is the shortest path from u to w which does not travel through v
RelevantCycles
which is the the smallest set of short cycles which is uniquely defined for a graph. To better explain the issue with the canonical labelling several examples are shown below. The table outlining the size of rings found for each molecule when using canonical and non-canonical generation. Also shown are the sizes of rings stored in the PubChem fingerprint associated with the entry. The fingerprints were obtained directly from PubChem and decoded using the specification. Sizes underlined and coloured red represent rings which may or may not be present depending on the atom ordering. It can be seen from the PubChem fingerprint that even using a consistent canonical labelling rings may be absent which would be present if the subgraph was used.PubChem CID Diagram Size of Rings in ESSSR
(fingerprints only store cycles |C| <= 10)Source CID 135973 {3, 3, 4} {3, 3, 4} {3, 3, 4} Canonical Non-canonical PubChem Fingerprint CID 9249 {3, 3, 4, 6, 6} {3, 3, 4, 6, 6} {3, 3, 6, 6} Canonical - 4 member cycle only added if found before larger 6 member cycles Non-canonical PubChem Fingerprint - 4 member cycle not found CID 931 {6, 6, 10} {6, 6, 10} {6, 6, 10} Canonical Non-canonical PubChem Fingerprint CID 5702 {6, 6, 6, 6, 10, 10, 20, 22, 22, 24, 24} {6, 6, 6, 6, 10, 10, 20, 22, 22, 24, 24} {6, 6, 6, 6} Canonical - 10 member cycles only added if found before larger cycles Non-canonical PubChem Fingerprint - 10 member cycles not found CID 1211 {6, 6, 6, 6, 6, 6, 10, 10, 18, 18, 20, 20, 22, 22, 22} {6, 6, 6, 6, 6, 6, 10, 10, 18, 18, 20, 20, 22, 22, 22} {6, 6, 6, 6, 6, 6, 10, 10} Canonical - 10 member cycles only added if found before larger cycles Non-canonical PubChem Fingerprint - 10 member cycles were found CID 17858819 {5, 6, 9} {5, 6, 9} {5, 6, 9} Canonical Non-canonical PubChem Fingerprint CID 1909 {5, 5, 5, 6, 9, 16, 17, 17, 17, 18} {5, 5, 5, 6, 9, 16, 17, 17, 17, 18} {5, 5, 5, 6} Canonical - 9 member cycle only added if found before larger cycles Non-canonical PubChem Fingerprint - 9 member cycle not found - Author:
- John May
- See Also:
MinimumCycleBasis
,RelevantCycles
- Source code:
- main
- Belongs to CDK module:
- core
- Keywords:
- ESSSR, ring, cycle
-
-
Constructor Summary
Constructors Constructor Description TripletShortCycles(MinimumCycleBasis mcb, boolean canonical)
Compute the cycles of the extended smallest set of smallest rings (ESSSR) for an existing minimum cycle basis.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description int[][]
paths()
Access the vertex paths for all cycles of the basis.int
size()
Size of the cycle basis, cardinality of the ESSSR.
-
-
-
Constructor Detail
-
TripletShortCycles
public TripletShortCycles(MinimumCycleBasis mcb, boolean canonical)
Compute the cycles of the extended smallest set of smallest rings (ESSSR) for an existing minimum cycle basis. Choosing the set to be canonical means the set depends on the order of the vertices and may not be consistent in subgraphs. Given a different order of vertices the same cycles may not be found.- Parameters:
mcb
- minimum cycle basiscanonical
- should the set be canonical (non-unique)
-
-