Class Canon
- java.lang.Object
-
- org.openscience.cdk.graph.invariant.Canon
-
public final class Canon extends Object
An implementation based on the canon algorithm [Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29]. The algorithm starts from an initial set of invariants, each of which is assigned a rank. Equivalent ranks are then shattered using an unambiguous function (in this case, the product of primes of adjacent ranks). Once no more equivalent ranks can be shattered, ties are artificially broken and rank shattering continues. Unlike the original description, rank stability is not maintained, which reduces the number of values to rank at each stage to only those that are equivalent. The initial set of invariants is basic and is "sufficient for the purpose of obtaining unique notation for simple SMILES, but it is not necessarily a “complete” set. No “perfect” set of invariants is known that will distinguish all possible graph asymmetries. However, for any given set of structures, a set of invariants can be devised to provide the necessary discrimination" [Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29]. As such, this producer should not be considered a complete canonical labeller, but in practice it performs well. For a more accurate but computationally more expensive labelling, please use the InChINumbersTools.

    IAtomContainer m = ...;
    int[][] g = GraphUtil.toAdjList(m);

    // obtain canon labelling
    long[] labels = Canon.label(m, g);

    // obtain symmetry classes
    long[] symmetry = Canon.symmetry(m, g);
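The rank shattering step described above can be illustrated with a small, self-contained sketch. This is not the CDK implementation: the class `PrimeRefinementSketch` and its methods are hypothetical names chosen for illustration, ranks are assigned densely starting at 1, the artificial tie-breaking stage is left out, and overflow of the prime product is ignored because the sketch targets small graphs only.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration of prime-product rank refinement; not CDK code.
final class PrimeRefinementSketch {

    /** First n primes by trial division; adequate for small graphs. */
    static long[] firstPrimes(int n) {
        long[] primes = new long[n];
        int count = 0;
        for (long c = 2; count < n; c++) {
            boolean isPrime = true;
            for (int i = 0; i < count && primes[i] * primes[i] <= c; i++) {
                if (c % primes[i] == 0) { isPrime = false; break; }
            }
            if (isPrime) primes[count++] = c;
        }
        return primes;
    }

    /** Dense 1-based ranks ordered by (prev, key); equal pairs share a rank. */
    static long[] rank(long[] prev, long[] key) {
        int n = key.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingLong((Integer i) -> prev[i])
                                     .thenComparingLong(i -> key[i]));
        long[] ranks = new long[n];
        long r = 0;
        for (int k = 0; k < n; k++) {
            int v = order[k];
            if (k == 0 || prev[v] != prev[order[k - 1]] || key[v] != key[order[k - 1]]) r++;
            ranks[v] = r;
        }
        return ranks;
    }

    /** Refines ranks until the number of distinct ranks stops growing. */
    static long[] refine(int[][] g, long[] invariants) {
        int n = g.length;
        long[] primes = firstPrimes(n);
        long[] ranks = rank(new long[n], invariants); // ranks from invariants alone
        long distinct = Arrays.stream(ranks).max().orElse(0);
        while (true) {
            long[] products = new long[n];
            for (int v = 0; v < n; v++) {
                long p = 1;
                // multiply the primes that correspond to the neighbours' ranks
                for (int w : g[v]) p *= primes[(int) ranks[w] - 1];
                products[v] = p;
            }
            // the previous rank is the primary key, so classes only ever split
            long[] next = rank(ranks, products);
            long nextDistinct = Arrays.stream(next).max().orElse(0);
            if (nextDistinct == distinct) return next;
            ranks = next;
            distinct = nextDistinct;
        }
    }
}
```

On a five-vertex chain with identical starting invariants, this sketch separates the end, inner, and central positions into three classes, while the two mirror-image halves stay tied. Resolving such ties is the job of the tie-breaking stage that the sketch omits.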
- Author:
- John May
- Source code:
- main
- Belongs to CDK module:
- standard
-
-
Method Summary
Modifier and Type, Method, Description
- static long[] basicInvariants(IAtomContainer container, int[][] graph)
- static long[] basicInvariants(IAtomContainer container, int[][] graph, int flav)
  Generate the initial invariants for each atom in the container.
- static long[] label(IAtomContainer container, int[][] g)
  Compute the canonical labels for the provided structure.
- static long[] label(IAtomContainer container, int[][] g, int opts)
  Compute the canonical labels for the provided structure.
- static long[] label(IAtomContainer container, int[][] g, long[] initial)
  Compute the canonical labels for the provided structure.
- static long[] label(IAtomContainer container, int[][] g, Comparator<IAtom> cmp)
  Compute the canonical labels for the provided structure.
- static long[] symmetry(IAtomContainer container, int[][] g)
  Compute the symmetry classes for the provided structure.
- static long[] symmetry(IAtomContainer container, int[][] g, int opts)
  Compute the symmetry classes for the provided structure.
-
-
-
Method Detail
-
label
public static long[] label(IAtomContainer container, int[][] g, int opts)
Compute the canonical labels for the provided structure. The labelling does not consider isomer information or stereochemistry. The current implementation does not fully distinguish all structure topologies but in practice performs well in the majority of cases. A complete canonical labelling can be obtained using the InChINumbersTools but is computationally much more expensive.
- Parameters:
- container - structure
- g - adjacency list graph representation
- opts - canonical generation options, see CanonOpts
- Returns:
- the canonical labelling
- See Also:
- EquivalentClassPartitioner, InChINumbersTools
-
label
public static long[] label(IAtomContainer container, int[][] g)
Compute the canonical labels for the provided structure. The labelling does not consider isomer information or stereochemistry. The current implementation does not fully distinguish all structure topologies but in practice performs well in the majority of cases. A complete canonical labelling can be obtained using the InChINumbersTools but is computationally much more expensive.
- Parameters:
- container - structure
- g - adjacency list graph representation
- Returns:
- the canonical labelling
- See Also:
- EquivalentClassPartitioner, InChINumbersTools
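A common use of a canonical labelling is to put atoms into a reproducible order. The sketch below shows one way to do that with plain arrays. It assumes the returned array holds one distinct label per atom index; the class `CanonicalOrderSketch` and the method `orderByLabel` are hypothetical helpers, not part of CDK.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper; assumes labels[i] is the canonical label of atom i.
final class CanonicalOrderSketch {

    /** Returns the atom indices sorted by ascending label. */
    static int[] orderByLabel(long[] labels) {
        Integer[] idx = new Integer[labels.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, Comparator.comparingLong((Integer i) -> labels[i]));
        int[] order = new int[idx.length];
        for (int i = 0; i < order.length; i++) order[i] = idx[i];
        return order;
    }
}
```

With labels `{3, 1, 2}` the resulting order is atom 1, then atom 2, then atom 0.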
-
label
public static long[] label(IAtomContainer container, int[][] g, long[] initial)
Compute the canonical labels for the provided structure. The labelling does not consider isomer information or stereochemistry. This method allows a custom array of initial invariants to be provided. The current implementation does not fully distinguish all structure topologies but in practice performs well in the majority of cases. A complete canonical labelling can be obtained using the InChINumbersTools but is computationally much more expensive.
- Parameters:
- container - structure
- g - adjacency list graph representation
- initial - initial seed invariants
- Returns:
- the canonical labelling
- See Also:
- EquivalentClassPartitioner, InChINumbersTools
-
label
public static long[] label(IAtomContainer container, int[][] g, Comparator<IAtom> cmp)
Compute the canonical labels for the provided structure. The initial labelling is seeded with the provided atom comparator cmp, allowing arbitrary properties to be distinguished or ignored.
- Parameters:
- container - structure
- g - adjacency list graph representation
- cmp - comparator to compare atoms
- Returns:
- the canonical labelling
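To make the idea of seeding from a comparator concrete, here is a minimal sketch of how a comparator could be turned into initial ranks: items that compare as equal share a rank, so whatever the comparator ignores stays undistinguished at the start. How CDK does this internally is not stated here; the class `ComparatorSeedSketch`, the method `seedRanks`, and the use of a generic element type in place of `IAtom` are assumptions made so the example runs without CDK.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; T stands in for an atom type such as IAtom.
final class ComparatorSeedSketch {

    /** Assigns dense 1-based ranks; items the comparator treats as equal tie. */
    static <T> long[] seedRanks(List<T> atoms, Comparator<? super T> cmp) {
        int n = atoms.size();
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // sort the indices by comparing the items they refer to
        Arrays.sort(order, (a, b) -> cmp.compare(atoms.get(a), atoms.get(b)));
        long[] seed = new long[n];
        long rank = 0;
        for (int k = 0; k < n; k++) {
            // start a new rank whenever the comparator sees a difference
            if (k == 0 || cmp.compare(atoms.get(order[k - 1]), atoms.get(order[k])) != 0) rank++;
            seed[order[k]] = rank;
        }
        return seed;
    }
}
```

A comparator that always returns 0 gives every item the same starting rank, which leaves all discrimination to the graph refinement.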
-
symmetry
public static long[] symmetry(IAtomContainer container, int[][] g, int opts)
Compute the symmetry classes for the provided structure. There are known examples where symmetry is incorrectly found. The EquivalentClassPartitioner gives more accurate symmetry perception, but this method is very quick and in practice successfully partitions the majority of chemical structures.
- Parameters:
- container - structure
- g - adjacency list graph representation
- opts - canonical generation options, see CanonOpts
- Returns:
- symmetry classes
- See Also:
- EquivalentClassPartitioner
-
symmetry
public static long[] symmetry(IAtomContainer container, int[][] g)
Compute the symmetry classes for the provided structure. There are known examples where symmetry is incorrectly found. The EquivalentClassPartitioner gives more accurate symmetry perception, but this method is very quick and in practice successfully partitions the majority of chemical structures.
- Parameters:
- container - structure
- g - adjacency list graph representation
- Returns:
- symmetry classes
- See Also:
- EquivalentClassPartitioner, basicInvariants(IAtomContainer, int[][], int)
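Symmetry classes are often easier to work with once atoms are grouped by class. The following sketch does this for any `long[]` in which atoms with the same value are considered equivalent, which is the assumed meaning of the returned array. The class `SymmetryGroupsSketch` and the method `groupByClass` are hypothetical helpers, not part of CDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; assumes classes[i] is the symmetry class of atom i.
final class SymmetryGroupsSketch {

    /** Maps each class value to the atom indices that carry it, in index order. */
    static Map<Long, List<Integer>> groupByClass(long[] classes) {
        Map<Long, List<Integer>> groups = new TreeMap<>();
        for (int i = 0; i < classes.length; i++) {
            groups.computeIfAbsent(classes[i], k -> new ArrayList<>()).add(i);
        }
        return groups;
    }
}
```

Any group with more than one member then identifies a set of atoms that the method treats as symmetry-equivalent.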
-
basicInvariants
public static long[] basicInvariants(IAtomContainer container, int[][] graph)
- Parameters:
- container - an atom container to generate labels for
- graph - graph representation (adjacency list)
- Returns:
- the initial invariants
- See Also:
- basicInvariants(IAtomContainer, int[][], int)
-
basicInvariants
public static long[] basicInvariants(IAtomContainer container, int[][] graph, int flav)
Generate the initial invariants for each atom in the container. The labels use the invariants described in [Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29]. The low 32 bits are laid out as:
0000000000xxxxXXXXeeeeeeescchhhh
where:
- 0: padding
- x: number of connections
- X: number of non-hydrogen bonds
- e: atomic number
- s: sign of charge
- c: absolute charge
- h: number of attached hydrogens
These invariants are basic and there are known cases they do not distinguish. One example is [O]C=O, where both oxygens have no hydrogens and a single connection but the atoms are not equivalent. Including a better initial partition is more expensive.
- Parameters:
- container - an atom container to generate labels for
- graph - graph representation (adjacency list)
- flav - bit mask canon flavor (see CanonOpts)
- Returns:
- initial invariants
- Throws:
- NullPointerException - if an atom has an unset atomic number, hydrogen count or formal charge
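The bit layout above can be reproduced with a few shifts and masks. The sketch below packs the six fields in the documented order and widths (4, 4, 7, 1, 2 and 4 bits). It is an illustration rather than the CDK source: the class `InvariantBitsSketch`, the method `pack`, the choice of 1 as the sign bit for negative charges, and the silent masking of values that exceed a field's width are all assumptions.

```java
// Hypothetical illustration of the documented bit layout; not CDK code.
final class InvariantBitsSketch {

    /**
     * Packs the fields as 0000000000xxxxXXXXeeeeeeescchhhh.
     * Values too large for their field are truncated by the masks.
     */
    static long pack(int connections, int heavyBonds, int atomicNumber,
                     int charge, int hydrogens) {
        long label = 0;
        label |= connections & 0xF;        // x: number of connections (4 bits)
        label <<= 4;
        label |= heavyBonds & 0xF;         // X: non-hydrogen bonds (4 bits)
        label <<= 7;
        label |= atomicNumber & 0x7F;      // e: atomic number (7 bits)
        label <<= 1;
        label |= (charge < 0) ? 1 : 0;     // s: sign of charge, assumed 1 = negative
        label <<= 2;
        label |= Math.abs(charge) & 0x3;   // c: absolute charge (2 bits)
        label <<= 4;
        label |= hydrogens & 0xF;          // h: attached hydrogens (4 bits)
        return label;
    }
}
```

Under these assumptions both oxygens of [O]C=O, each with one connection, one non-hydrogen bond, no charge and no hydrogens, pack to the same value, which is the limitation the text describes.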
-
-