Class Canon
 java.lang.Object

 org.openscience.cdk.graph.invariant.Canon

public final class Canon extends Object
An implementation based on the canon algorithm [Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29]. The algorithm uses an initial set of invariants, each of which is assigned a rank. Equivalent ranks are then shattered using an unambiguous function (in this case, the product of primes of adjacent ranks). Once no more equivalent ranks can be shattered, ties are artificially broken and rank shattering continues. Unlike the original description, rank stability is not maintained, which reduces the number of values to rank at each stage to only those that are equivalent.

The initial set of invariants is basic and is "sufficient for the purpose of obtaining unique notation for simple SMILES, but it is not necessarily a “complete” set. No “perfect” set of invariants is known that will distinguish all possible graph asymmetries. However, for any given set of structures, a set of invariants can be devised to provide the necessary discrimination" [Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29]. As such, this producer should not be considered a complete canonical labeller, but in practice it performs well. For a more accurate but more computationally expensive labelling, use the InChINumbersTools.

 IAtomContainer m = ...;
 int[][] g = GraphUtil.toAdjList(m);
 // obtain canon labelling
 long[] labels = Canon.label(m, g);
 // obtain symmetry classes
 long[] symmetryClasses = Canon.symmetry(m, g);
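The rank shattering step described above can be illustrated without any CDK types. The following is a minimal sketch, not the CDK implementation: the class name PrimeRefinementSketch and its methods are hypothetical, ranks are held as int rather than long, the prime table only covers graphs of up to 10 vertices, and the artificial tie-breaking stage is left out, so the result corresponds to symmetry classes rather than a full canonical labelling.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of CDK.
final class PrimeRefinementSketch {

    // Small prime table: rank r maps to PRIMES[r - 1]. Assumes at most 10 vertices.
    private static final long[] PRIMES = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29};

    /** Assign dense ranks 1..k; vertices share a rank only if both prev and value match. */
    static int[] rank(int[] prev, long[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort by previous rank first so refinement never merges existing classes.
        Arrays.sort(order, (a, b) -> prev[a] != prev[b]
                ? Integer.compare(prev[a], prev[b])
                : Long.compare(values[a], values[b]));
        int[] ranks = new int[n];
        int r = 0;
        for (int i = 0; i < n; i++) {
            if (i == 0
                    || prev[order[i]] != prev[order[i - 1]]
                    || values[order[i]] != values[order[i - 1]]) {
                r++;
            }
            ranks[order[i]] = r;
        }
        return ranks;
    }

    /** Shatter equivalent ranks until the number of classes stops growing. */
    static int[] refine(int[][] g, long[] invariants) {
        int n = g.length;
        int[] ranks = rank(new int[n], invariants);
        int classes = Arrays.stream(ranks).max().orElse(0);
        while (true) {
            // Product of the primes that correspond to the ranks of adjacent vertices.
            long[] products = new long[n];
            for (int v = 0; v < n; v++) {
                long p = 1;
                for (int w : g[v]) p *= PRIMES[ranks[w] - 1];
                products[v] = p;
            }
            int[] next = rank(ranks, products);
            int nextClasses = Arrays.stream(next).max().orElse(0);
            if (nextClasses == classes) return ranks; // nothing more to shatter
            ranks = next;
            classes = nextClasses;
        }
    }

    public static void main(String[] args) {
        // A five-vertex path (0-1-2-3-4) with identical initial invariants.
        int[][] path = {{1}, {0, 2}, {1, 3}, {2, 4}, {3}};
        int[] ranks = refine(path, new long[]{1, 1, 1, 1, 1});
        System.out.println(Arrays.toString(ranks)); // [1, 2, 3, 2, 1]
    }
}
```

In this sketch the two terminal vertices end in one class, their neighbours in a second, and the central vertex in a third. A canonical labelling would additionally need the tie-breaking stage to separate vertices that remain equivalent.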
 Author:
 John May
 Source code:
 main
 Belongs to CDK module:
 standard


Method Summary
Modifier and Type, Method, Description

static long[] basicInvariants(IAtomContainer container, int[][] graph)

static long[] basicInvariants(IAtomContainer container, int[][] graph, int flav)
 Generate the initial invariants for each atom in the container.

static long[] label(IAtomContainer container, int[][] g)
 Compute the canonical labels for the provided structure.

static long[] label(IAtomContainer container, int[][] g, int opts)
 Compute the canonical labels for the provided structure.

static long[] label(IAtomContainer container, int[][] g, long[] initial)
 Compute the canonical labels for the provided structure.

static long[] label(IAtomContainer container, int[][] g, Comparator<IAtom> cmp)
 Compute the canonical labels for the provided structure.

static long[] symmetry(IAtomContainer container, int[][] g)
 Compute the symmetry classes for the provided structure.

static long[] symmetry(IAtomContainer container, int[][] g, int opts)
 Compute the symmetry classes for the provided structure.



Method Detail

label
public static long[] label(IAtomContainer container, int[][] g, int opts)
Compute the canonical labels for the provided structure. The labelling does not consider isomer information or stereochemistry. The current implementation does not fully distinguish all structure topologies but in practice performs well in the majority of cases. A complete canonical labelling can be obtained using the InChINumbersTools but is computationally much more expensive.
 Parameters:
 container - structure
 g - adjacency list graph representation
 opts - canonical generation options, see CanonOpts
 Returns:
 the canonical labelling
 See Also:
 EquivalentClassPartitioner, InChINumbersTools

label
public static long[] label(IAtomContainer container, int[][] g)
Compute the canonical labels for the provided structure. The labelling does not consider isomer information or stereochemistry. The current implementation does not fully distinguish all structure topologies but in practice performs well in the majority of cases. A complete canonical labelling can be obtained using the InChINumbersTools but is computationally much more expensive.
 Parameters:
 container - structure
 g - adjacency list graph representation
 Returns:
 the canonical labelling
 See Also:
 EquivalentClassPartitioner, InChINumbersTools

label
public static long[] label(IAtomContainer container, int[][] g, long[] initial)
Compute the canonical labels for the provided structure. The labelling does not consider isomer information or stereochemistry. This method allows provision of a custom array of initial invariants. The current implementation does not fully distinguish all structure topologies but in practice performs well in the majority of cases. A complete canonical labelling can be obtained using the InChINumbersTools but is computationally much more expensive.
 Parameters:
 container - structure
 g - adjacency list graph representation
 initial - initial seed invariants
 Returns:
 the canonical labelling
 See Also:
 EquivalentClassPartitioner, InChINumbersTools

label
public static long[] label(IAtomContainer container, int[][] g, Comparator<IAtom> cmp)
Compute the canonical labels for the provided structure. The initial labelling is seeded with the provided atom comparator cmp, allowing arbitrary properties to be distinguished or ignored.
 Parameters:
 container - structure
 g - adjacency list graph representation
 cmp - comparator to compare atoms
 Returns:
 the canonical labelling
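One plausible way a comparator can seed an initial labelling is to sort the atoms with it and give atoms that compare as equal the same starting rank. The sketch below shows that idea only; it is an assumption about the general technique, not a description of how Canon uses cmp internally. The class ComparatorSeedSketch and its seed method are hypothetical, and plain element symbols stand in for IAtom instances.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not part of CDK.
final class ComparatorSeedSketch {

    /** Items that compare as equal share a rank; ranks start at 1. */
    static <T> long[] seed(List<T> items, Comparator<? super T> cmp) {
        int n = items.size();
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort indices so that the items they refer to are in comparator order.
        Arrays.sort(order, (a, b) -> cmp.compare(items.get(a), items.get(b)));
        long[] ranks = new long[n];
        long r = 0;
        for (int i = 0; i < n; i++) {
            // Start a new rank whenever the comparator separates neighbours.
            if (i == 0 || cmp.compare(items.get(order[i - 1]), items.get(order[i])) != 0) {
                r++;
            }
            ranks[order[i]] = r;
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Element symbols stand in for atoms; the comparator decides what is distinguished.
        List<String> atoms = Arrays.asList("C", "O", "C", "N");
        long[] ranks = seed(atoms, Comparator.<String>naturalOrder());
        System.out.println(Arrays.toString(ranks)); // [1, 3, 1, 2]
    }
}
```

A comparator that returns 0 for every pair would put all atoms in one initial class, which corresponds to ignoring atom properties altogether.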

symmetry
public static long[] symmetry(IAtomContainer container, int[][] g, int opts)
Compute the symmetry classes for the provided structure. There are known examples where symmetry is incorrectly found. The EquivalentClassPartitioner gives more accurate symmetry perception, but this method is very quick and in practice successfully partitions the majority of chemical structures.
 Parameters:
 container - structure
 g - adjacency list graph representation
 opts - canonical generation options, see CanonOpts
 Returns:
 symmetry classes
 See Also:
 EquivalentClassPartitioner

symmetry
public static long[] symmetry(IAtomContainer container, int[][] g)
Compute the symmetry classes for the provided structure. There are known examples where symmetry is incorrectly found. The EquivalentClassPartitioner gives more accurate symmetry perception, but this method is very quick and in practice successfully partitions the majority of chemical structures.
 Parameters:
 container - structure
 g - adjacency list graph representation
 Returns:
 symmetry classes
 See Also:
 EquivalentClassPartitioner, basicInvariants(IAtomContainer, int[][], int)

basicInvariants
public static long[] basicInvariants(IAtomContainer container, int[][] graph)
 Parameters:
 container - an atom container to generate labels for
 graph - graph representation (adjacency list)
 Returns:
 the initial invariants
 See Also:
basicInvariants(IAtomContainer, int[][], int)

basicInvariants
public static long[] basicInvariants(IAtomContainer container, int[][] graph, int flav)
Generate the initial invariants for each atom in the container. The labels use the invariants described in [Weininger, David et al. Journal of Chemical Information and Computer Sciences. 1989. 29]. The low 32 bits are laid out as:
 0000000000xxxxXXXXeeeeeeescchhhh
where:
 0: padding
 x: number of connections
 X: number of non-hydrogen bonds
 e: atomic number
 s: sign of charge
 c: absolute charge
 h: number of attached hydrogens
These invariants are basic and do not distinguish every case. One example is [O]C=O, where both oxygens have no hydrogens and a single connection but the atoms are not equivalent. Including a better initial partition is more expensive.
 Parameters:
 container - an atom container to generate labels for
 graph - graph representation (adjacency list)
 flav - bit mask canon flavor (see CanonOpts)
 Returns:
 initial invariants
 Throws:
 NullPointerException - an atom had unset atomic number, hydrogen count or formal charge
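
The bit layout above can be reproduced with plain shifts and masks. The following is a minimal sketch that follows the documented field widths (4, 4, 7, 1, 2 and 4 bits); the class InvariantPackingSketch and its pack method are hypothetical, and encoding a negative charge as sign bit 1 is an assumption rather than something stated in this documentation.

```java
// Hypothetical illustration class; not part of CDK.
final class InvariantPackingSketch {

    /** Pack the fields into the layout xxxxXXXXeeeeeeescchhhh (low 22 bits used). */
    static long pack(int connections, int heavyBonds, int atomicNumber,
                     int charge, int hydrogens) {
        long label = connections & 0xF;                   // x: number of connections (4 bits)
        label = (label << 4) | (heavyBonds & 0xF);        // X: non-hydrogen bonds (4 bits)
        label = (label << 7) | (atomicNumber & 0x7F);     // e: atomic number (7 bits)
        label = (label << 1) | (charge < 0 ? 1 : 0);      // s: sign of charge (assumed 1 = negative)
        label = (label << 2) | (Math.abs(charge) & 0x3);  // c: absolute charge (2 bits)
        label = (label << 4) | (hydrogens & 0xF);         // h: attached hydrogens (4 bits)
        return label;
    }

    public static void main(String[] args) {
        // Both oxygens of [O]C=O: one connection, one heavy-atom bond, Z = 8,
        // no charge, no hydrogens. Bond order is not part of the layout.
        long radicalOxygen = pack(1, 1, 8, 0, 0);
        long carbonylOxygen = pack(1, 1, 8, 0, 0);
        System.out.println(radicalOxygen == carbonylOxygen); // true
    }
}
```

Because none of the fields records bond order or radical state, the two oxygens receive the same value, which is the limitation the [O]C=O example describes.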

