Class HOSECodeGenerator

    • Field Detail

      • sphereNodes

        protected List<org.openscience.cdk.tools.HOSECodeGenerator.TreeNode> sphereNodes
        Container for the nodes in a sphere.
      • sphere

        protected int sphere
        Counter for the sphere in which we currently work.
      • spheres

        protected List<org.openscience.cdk.tools.HOSECodeGenerator.TreeNode>[] spheres
        Here we store the spheres that we assemble, in order to parse them into a code later.
      • HOSECode

        protected StringBuffer HOSECode
        The HOSECode string that we assemble.
      • atomContainer

        protected IAtomContainer atomContainer
        The molecular structure on which we work.
      • bondSymbols

        protected String[] bondSymbols
        The bond symbols used for bond orders "single", "double", "triple" and "aromatic"
      • centerCode

        protected String centerCode
      • LEGACY_MODE

        public static final int LEGACY_MODE
        Flag to ignore parent ordering when sorting spheres (legacy mode), for compatibility with existing ML/AI models.
        See Also:
        Constant Field Values
    • Constructor Detail

      • HOSECodeGenerator

        public HOSECodeGenerator(int flags)
        Constructor for the HOSECodeGenerator.

        Important!

        A critical bug was discovered in the implementation (see PR 828) which gave the wrong nesting in some cases. Fixing this behaviour invalidates any ML/AI models trained on the incorrect values. If you have a model built with the old algorithm that cannot be retrained, pass LEGACY_MODE in flags.
        Parameters:
        flags - option flags, e.g. LEGACY_MODE (default: no flags set)
        See Also:
        PR 828
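
        The int flags parameter suggests a bit-mask style option set. The following is a minimal sketch of that pattern, not CDK code: the class FlagSketch, its isLegacy() method, and the value 0x1 for LEGACY_MODE are assumptions made for illustration (the real value is listed under "Constant Field Values").

```java
/** Hypothetical stand-in for a generator configured by bit flags; not CDK code. */
final class FlagSketch {

    /** Assumed value, for illustration only. */
    static final int LEGACY_MODE = 0x1;

    private final int flags;

    /** Mirrors a constructor that takes option flags. */
    FlagSketch(int flags) {
        this.flags = flags;
    }

    /** Mirrors a no-argument constructor: no flags set, so the corrected behaviour applies. */
    FlagSketch() {
        this(0);
    }

    /** True if the legacy bit is set in the flags passed at construction. */
    boolean isLegacy() {
        return (flags & LEGACY_MODE) != 0;
    }
}
```

        With the class documented here, the corresponding call would presumably be new HOSECodeGenerator(HOSECodeGenerator.LEGACY_MODE).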
      • HOSECodeGenerator

        public HOSECodeGenerator()
        Constructor for the HOSECodeGenerator.

        Important!

        A critical bug was discovered in the implementation (see PR 828) which gave the wrong nesting in some cases. Fixing this behaviour invalidates any ML/AI models trained on the incorrect values. If you have a model built with the old algorithm that cannot be retrained, use the HOSECodeGenerator(int flags) constructor with LEGACY_MODE instead.
        See Also:
        PR 828
    • Method Detail

      • getSpheres

        public List<IAtom>[] getSpheres(IAtomContainer ac,
                                        IAtom root,
                                        int noOfSpheres,
                                        boolean ringsize)
                                 throws CDKException
        This method is intended to be used to get the atoms around an atom in spheres. It is not used in this class, but is provided for other classes to use. It also creates the HOSE code in HOSECode as a side-effect.
        Parameters:
        ac - The IAtomContainer with the molecular skeleton in which the root atom resides.
        root - The root atom for which to produce the spheres.
        noOfSpheres - The number of spheres to look at.
        ringsize - Whether the center code should include the ring size. Set to true only if you need the HOSE code afterwards; otherwise pass false.
        Returns:
        An array of lists. The list at index i-1 contains the atoms in sphere i.
        Throws:
        CDKException
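
        To illustrate the indexing of the returned array (the list at index i-1 holds sphere i), here is a minimal sketch of a sphere-by-sphere traversal over a plain adjacency list. It is not the CDK implementation: the class SphereSketch, the integer atom indices, and the visited-set breadth-first search are assumptions for illustration, and the sketch ignores bond symbols, ring sizes, and the HOSE code side-effect.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of sphere-by-sphere traversal; not the CDK implementation. */
final class SphereSketch {

    /**
     * Collects atom indices sphere by sphere around root.
     * adjacency.get(a) lists the neighbours of atom a.
     * The list at index i-1 of the result holds the atoms in sphere i.
     */
    @SuppressWarnings("unchecked")
    static List<Integer>[] spheres(List<List<Integer>> adjacency, int root, int noOfSpheres) {
        List<Integer>[] result = new List[noOfSpheres];
        Set<Integer> visited = new HashSet<>();
        visited.add(root);
        List<Integer> frontier = List.of(root);
        for (int i = 0; i < noOfSpheres; i++) {
            List<Integer> next = new ArrayList<>();
            for (int atom : frontier) {
                for (int neighbour : adjacency.get(atom)) {
                    // Set.add returns false for atoms already placed in an earlier sphere.
                    if (visited.add(neighbour)) {
                        next.add(neighbour);
                    }
                }
            }
            result[i] = next;
            frontier = next;
        }
        return result;
    }
}
```

        For a four-atom chain 0-1-2-3 with root 1 and two spheres, the sketch places atoms 0 and 2 at index 0 (sphere 1) and atom 3 at index 1 (sphere 2).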
      • getHOSECode

        public String getHOSECode(IAtomContainer ac,
                                  IAtom root,
                                  int noOfSpheres)
                           throws CDKException
        Produces a HOSE code for the atom root in the IAtomContainer ac, covering the number of spheres given by noOfSpheres. IMPORTANT: for aromaticity to be included in the code, run the IAtomContainer ac through the CDKHueckelAromaticityDetector before calling getHOSECode(). This method only gives proper results if the molecule is fully saturated (otherwise, the order of the HOSE code might depend on atoms in higher spheres). It is known to fail for protons in some cases. IMPORTANT: the molecule must contain implicit or explicit hydrogens for this method to work properly.
        Parameters:
        ac - The IAtomContainer with the molecular skeleton in which the root atom resides
        root - The root atom for which to produce the HOSE code
        noOfSpheres - The number of spheres to look at
        Returns:
        The HOSECode value
        Throws:
        CDKException - Thrown if something is wrong
      • getHOSECode

        public String getHOSECode(IAtomContainer ac,
                                  IAtom root,
                                  int noOfSpheres,
                                  boolean ringsize)
                           throws CDKException
        Produces a HOSE code for the atom root in the IAtomContainer ac, covering the number of spheres given by noOfSpheres. IMPORTANT: for aromaticity to be included in the code, run the IAtomContainer ac through the CDKHueckelAromaticityDetector before calling getHOSECode(). This method only gives proper results if the molecule is fully saturated (otherwise, the order of the HOSE code might depend on atoms in higher spheres). It is known to fail for protons in some cases. IMPORTANT: the molecule must contain implicit or explicit hydrogens for this method to work properly.
        Parameters:
        ac - The IAtomContainer with the molecular skeleton in which the root atom resides
        root - The root atom for which to produce the HOSE code
        noOfSpheres - The number of spheres to look at
        ringsize - Whether the size of the ring(s) containing the root atom is included in the center atom code
        Returns:
        The HOSECode value
        Throws:
        CDKException - Thrown if something is wrong
      • makeBremserCompliant

        public String makeBremserCompliant(String code)
      • getNodesInSphere

        public List<IAtom> getNodesInSphere(int sphereNumber)