Class MolecularFormulaManipulator


  • public class MolecularFormulaManipulator
    extends Object
    Class with convenience methods for manipulating IMolecularFormula objects.
    Author:
    miguelrojasch
    Source code:
    main
    Belongs to CDK module:
    formula
    Created on:
    2007-11-20
    • Constructor Detail

      • MolecularFormulaManipulator

        public MolecularFormulaManipulator()
    • Method Detail

      • getAtomCount

        public static int getAtomCount​(IMolecularFormula formula)
        Sums the occurrences of every isotope in the molecular formula. In short, the total number of atoms.
        Parameters:
        formula - The MolecularFormula to check
        Returns:
        The occurrence total
      • getElementCount

        public static int getElementCount​(IMolecularFormula formula,
                                          IElement element)
        Sums the occurrences of all isotopes in the molecular formula that belong to a particular IElement. Returns 0 if the element is not present. The search is based only on the IElement.
        Parameters:
        formula - The MolecularFormula to check
        element - The IElement object
        Returns:
        The occurrence of this element in this molecular formula
      • getElementCount

        public static int getElementCount​(IMolecularFormula formula,
                                          String symbol)
        Occurrences of a given element in a molecular formula.
        Parameters:
        formula - the formula
        symbol - element symbol (e.g. C for carbon)
        Returns:
        number of the times the element occurs
        See Also:
        getElementCount(IMolecularFormula, IElement)
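        The counting methods above can be pictured with a plain map from isotope label to occurrence count. The following is a minimal sketch under that assumption, not CDK code: the class CountSketch, the label format (optional mass number followed by the element symbol) and the helper symbolOf are hypothetical names introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a formula: isotope label -> occurrence count.
final class CountSketch {

    // Strips a leading mass number, e.g. "13C" -> "C" (assumed label format).
    static String symbolOf(String isotopeLabel) {
        int i = 0;
        while (i < isotopeLabel.length() && Character.isDigit(isotopeLabel.charAt(i))) {
            i++;
        }
        return isotopeLabel.substring(i);
    }

    // Total number of atoms: the sum of all isotope counts.
    static int atomCount(Map<String, Integer> isotopes) {
        int total = 0;
        for (int n : isotopes.values()) {
            total += n;
        }
        return total;
    }

    // Atoms of one element: sums every isotope whose symbol matches, 0 if absent.
    static int elementCount(Map<String, Integer> isotopes, String symbol) {
        int total = 0;
        for (Map.Entry<String, Integer> e : isotopes.entrySet()) {
            if (symbolOf(e.getKey()).equals(symbol)) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> formula = new LinkedHashMap<>();
        formula.put("C", 5);    // carbon, isotope unspecified
        formula.put("13C", 1);  // one carbon-13
        formula.put("H", 6);
        System.out.println(atomCount(formula));          // 12
        System.out.println(elementCount(formula, "C"));  // 6
        System.out.println(elementCount(formula, "N"));  // 0
    }
}
```

        Note how the element count merges both carbon isotopes, which mirrors the statement that the search is based only on the element.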
      • getIsotopes

        public static List<IIsotope> getIsotopes​(IMolecularFormula formula,
                                                 IElement element)
        Gets the list of IIsotope objects in the molecular formula that belong to the given IElement. The search is based only on the IElement.
        Parameters:
        formula - The MolecularFormula to check
        element - The IElement object
        Returns:
        The list with the IIsotopes in this molecular formula
      • elements

        public static List<IElement> elements​(IMolecularFormula formula)
        Gets the list of all elements contained in the molecular formula.
        Parameters:
        formula - The MolecularFormula to check
        Returns:
        The list with the IElements in this molecular formula
      • containsElement

        public static boolean containsElement​(IMolecularFormula formula,
                                              IElement element)
        Returns true if the MolecularFormula contains the given element as an IIsotope object.
        Parameters:
        formula - IMolecularFormula molecularFormula
        element - The element this MolecularFormula is searched for
        Returns:
        True, if the MolecularFormula contains the given element object
      • removeElement

        public static IMolecularFormula removeElement​(IMolecularFormula formula,
                                                      IElement element)
        Removes all isotopes of the given element from the MolecularFormula.
        Parameters:
        formula - IMolecularFormula molecularFormula
        element - The IElement of the IIsotopes to be removed
        Returns:
        The molecularFormula with the isotopes removed
      • getString

        public static String getString​(IMolecularFormula formula,
                                       String[] orderElements,
                                       boolean setOne,
                                       boolean setMassNumber)
        Returns the string representation of the molecular formula.
        Parameters:
        formula - The IMolecularFormula Object
        orderElements - The order of Elements
        setOne - True if the count 1 should be written explicitly for elements with a single atom
        setMassNumber - If the formula contains an isotope of an element that is the non-major isotope, the element is represented as [XE] where X is the mass number and E is the element symbol
        Returns:
        A String containing the molecular formula
        See Also:
        getHTML(IMolecularFormula), generateOrderEle(), generateOrderEle_Hill_NoCarbons(), generateOrderEle_Hill_WithCarbons()
      • getString

        public static String getString​(IMolecularFormula formula)
        Returns the string representation of the molecular formula. Based on Hill System. The Hill system is a system of writing chemical formulas such that the number of carbon atoms in a molecule is indicated first, the number of hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.
        Parameters:
        formula - The IMolecularFormula Object
        Returns:
        A String containing the molecular formula
        See Also:
        getHTML(IMolecularFormula)
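        The Hill ordering described above can be sketched on a simple map of element counts. This is an illustrative sketch only, not the CDK implementation: the class HillSketch and its hill method are hypothetical, and the input is assumed to be a map from element symbol to a positive count. The setOne flag mimics the parameter of the same name on the getString overloads.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that formats element counts in Hill order.
final class HillSketch {

    // Appends one element; the count 1 is written only when setOne is true.
    private static void append(StringBuilder sb, String symbol, int count, boolean setOne) {
        sb.append(symbol);
        if (count > 1 || setOne) {
            sb.append(count);
        }
    }

    static String hill(Map<String, Integer> counts, boolean setOne) {
        // A TreeMap iterates the symbols alphabetically.
        Map<String, Integer> sorted = new TreeMap<>(counts);
        StringBuilder sb = new StringBuilder();
        if (sorted.containsKey("C")) {
            // With carbon present: C first, then H, then the rest alphabetically.
            append(sb, "C", sorted.remove("C"), setOne);
            Integer h = sorted.remove("H");
            if (h != null) {
                append(sb, "H", h, setOne);
            }
        }
        // Without carbon nothing was removed, so H is sorted with the others.
        for (Map.Entry<String, Integer> e : sorted.entrySet()) {
            append(sb, e.getKey(), e.getValue(), setOne);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> ethanol = new HashMap<>();
        ethanol.put("O", 1);
        ethanol.put("H", 6);
        ethanol.put("C", 2);
        System.out.println(hill(ethanol, false)); // C2H6O
        System.out.println(hill(ethanol, true));  // C2H6O1

        Map<String, Integer> sulfuricAcid = new HashMap<>();
        sulfuricAcid.put("H", 2);
        sulfuricAcid.put("S", 1);
        sulfuricAcid.put("O", 4);
        System.out.println(hill(sulfuricAcid, false)); // H2O4S
    }
}
```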
      • getString

        public static String getString​(IMolecularFormula formula,
                                       boolean setOne)
        Returns the string representation of the molecular formula. Based on Hill System. The Hill system is a system of writing chemical formulas such that the number of carbon atoms in a molecule is indicated first, the number of hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.
        Parameters:
        formula - The IMolecularFormula Object
        setOne - True if the count 1 should be written explicitly for elements with a single atom
        Returns:
        A String containing the molecular formula
        See Also:
        getHTML(IMolecularFormula)
      • getString

        public static String getString​(IMolecularFormula formula,
                                       boolean setOne,
                                       boolean setMassNumber)
        Returns the string representation of the molecular formula. Based on Hill System. The Hill system is a system of writing chemical formulas such that the number of carbon atoms in a molecule is indicated first, the number of hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.
        Parameters:
        formula - The IMolecularFormula Object
        setOne - True if the count 1 should be written explicitly for elements with a single atom
        setMassNumber - If the formula contains an isotope of an element that is the non-major isotope, the element is represented as [XE] where X is the mass number and E is the element symbol
        Returns:
        A String containing the molecular formula
        See Also:
        getHTML(IMolecularFormula)
      • getHTML

        public static String getHTML​(IMolecularFormula formula)
        Returns the string representation of the molecular formula based on Hill System with numbers wrapped in <sub></sub> tags. Useful for displaying formulae in Swing components or on the web.
        Parameters:
        formula - The IMolecularFormula object
        Returns:
        An HTML representation of the molecular formula
        See Also:
        getHTML(IMolecularFormula, boolean, boolean)
      • getHTML

        public static String getHTML​(IMolecularFormula formula,
                                     boolean chargeB,
                                     boolean isotopeB)
        Returns the string representation of the molecular formula based on Hill System with numbers wrapped in <sub></sub> tags and the isotope of each Element in <sup></sup> tags and the total charge of IMolecularFormula in <sup></sup> tags. Useful for displaying formulae in Swing components or on the web.
        Parameters:
        formula - The IMolecularFormula object
        chargeB - True if the charge should be shown
        isotopeB - True if the isotope mass should be shown
        Returns:
        An HTML representation of the molecular formula
        See Also:
        getHTML(IMolecularFormula)
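        The tag wrapping described for the getHTML methods can be illustrated with two regular-expression passes over a plain formula string. This is only a sketch and not how CDK builds its output: the class HtmlSketch is hypothetical, charges are not handled, and the input is assumed to use the [XE] bracket notation (mass number X, element symbol E) for isotopes, with any count written after the bracket.

```java
// Hypothetical converter from a plain formula string to HTML markup.
final class HtmlSketch {

    static String toHtml(String plainFormula) {
        // Pass 1: digits directly after a letter or a closing bracket are counts.
        String withSub = plainFormula.replaceAll("(?<=[A-Za-z\\]])(\\d+)", "<sub>$1</sub>");
        // Pass 2: "[13C]" becomes a superscript mass number followed by the symbol.
        return withSub.replaceAll("\\[(\\d+)([A-Z][a-z]?)\\]", "<sup>$1</sup>$2");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("C6H12O6"));
        // C<sub>6</sub>H<sub>12</sub>O<sub>6</sub>
        System.out.println(toHtml("[13C]2H6"));
        // <sup>13</sup>C<sub>2</sub>H<sub>6</sub>
    }
}
```

        The lookbehind in the first pass keeps the mass number inside the brackets from being mistaken for a count.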
      • getHTML

        public static String getHTML​(IMolecularFormula formula,
                                     String[] orderElements,
                                     boolean showCharge,
                                     boolean showIsotopes)
        Returns the string representation of the molecular formula with numbers wrapped in <sub></sub> tags, the isotope of each Element in <sup></sup> tags, and the total charge of the IMolecularFormula in <sup></sup> tags. Useful for displaying formulae in Swing components or on the web.
        Parameters:
        formula - The IMolecularFormula object
        orderElements - The order of Elements
        showCharge - True if the charge should be shown
        showIsotopes - True if the isotope mass should be shown
        Returns:
        An HTML representation of the molecular formula
        See Also:
        getHTML(IMolecularFormula)
      • getMolecularFormula

        public static IMolecularFormula getMolecularFormula​(String stringMF,
                                                            IChemObjectBuilder builder)
        Constructs an instance of IMolecularFormula, initialized with a molecular formula string. The string is immediately analyzed and a set of Nodes is built based on this analysis.

        The hydrogens must be implicit.

        Parameters:
        stringMF - The molecularFormula string
        builder - an IChemObjectBuilder which is used to construct atoms
        Returns:
        The filled IMolecularFormula
        See Also:
        getMolecularFormula(String,IMolecularFormula)
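        The analysis of a formula string can be pictured as tokenizing it into element symbols and counts. The sketch below assumes a deliberately restricted input (element symbols with optional counts, no charges, brackets or isotope labels) and is not the parser CDK uses. The class ParseSketch and its parse method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tokenizer for simple formula strings such as "C2H6O".
final class ParseSketch {

    // One token: an element symbol followed by an optional count.
    private static final Pattern TOKEN = Pattern.compile("([A-Z][a-z]?)(\\d*)");

    static Map<String, Integer> parse(String formula) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = TOKEN.matcher(formula);
        int pos = 0;
        while (m.find()) {
            if (m.start() != pos) {
                throw new IllegalArgumentException("unexpected input at index " + pos);
            }
            int n = m.group(2).isEmpty() ? 1 : Integer.parseInt(m.group(2));
            // Repeated symbols (e.g. in "CH3CH2OH") are summed.
            counts.merge(m.group(1), n, Integer::sum);
            pos = m.end();
        }
        if (pos != formula.length()) {
            throw new IllegalArgumentException("unexpected input at index " + pos);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(parse("C2H6O"));    // {C=2, H=6, O=1}
        System.out.println(parse("CH3CH2OH")); // {C=2, H=6, O=1}
    }
}
```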
      • getMajorIsotopeMolecularFormula

        public static IMolecularFormula getMajorIsotopeMolecularFormula​(String stringMF,
                                                                        IChemObjectBuilder builder)
        Constructs an instance of IMolecularFormula, initialized with a molecular formula string. The string is immediately analyzed and a set of Nodes is built based on this analysis. The hydrogens must be implicit. Major isotopes are used.
        Parameters:
        stringMF - The molecularFormula string
        builder - an IChemObjectBuilder which is used to construct atoms
        Returns:
        The filled IMolecularFormula
        See Also:
        getMolecularFormula(String,IMolecularFormula)
      • getMolecularFormula

        public static IMolecularFormula getMolecularFormula​(String stringMF,
                                                            IMolecularFormula formula)
        Adds the elements extracted from a molecular formula string to an existing instance of IMolecularFormula. The string is immediately analyzed and a set of Nodes is built based on this analysis.

        The hydrogens must be implicit.

        Parameters:
        stringMF - The molecularFormula string
        formula - The IMolecularFormula to which the elements are added
        Returns:
        The filled IMolecularFormula
        See Also:
        getMolecularFormula(String, IChemObjectBuilder)
      • getTotalMassNumber

        public static double getTotalMassNumber​(IMolecularFormula formula)
        Gets the summed mass number of all isotopes in a MolecularFormula. It assumes isotope masses to be preset, and returns 0.0 if not.
        Parameters:
        formula - The IMolecularFormula to calculate
        Returns:
        The summed nominal mass of all atoms in this MolecularFormula
      • getMass

        public static double getMass​(IMolecularFormula mf,
                                     int flav)
        Calculates the mass of a formula. This function takes a 'mass flavour' that switches the computation type. The key distinction is how specified/unspecified isotopes are handled. A specified isotope is an atom that has either IIsotope.setMassNumber(Integer) or IIsotope.setExactMass(Double) set to non-null and non-zero.
        The flavours are:
        • MolWeight (default) - uses the exact mass of each atom when an isotope is specified, if not specified the average mass of the element is used.
        • MolWeightIgnoreSpecified - uses the average mass of each element, ignoring any isotopic/exact mass specification
        • MonoIsotopic - uses the exact mass of each atom when an isotope is specified, if not specified the major isotope mass for that element is used.
        • MostAbundant - uses the exact mass of each atom when specified, if not specified a distribution is calculated and the most abundant isotope pattern is used.
        Parameters:
        mf - molecular formula
        flav - the mass flavour, one of the flavours listed above
        Returns:
        the mass of the molecule
        See Also:
        getMass(IMolecularFormula, int), MolWeight, MolWeightIgnoreSpecified, MonoIsotopic, MostAbundant
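        The difference between the flavours comes down to which mass is picked per atom. The sketch below models that choice for three of the four flavours (MostAbundant is left out because it needs an isotope distribution). It is an illustration under stated assumptions, not CDK code: the enum Flavour, the class Atom and the rounded mass values are hypothetical stand-ins, and the flavour names are not the CDK integer constants.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the per-atom mass choice made by each flavour.
final class MassSketch {

    // Names mirror the flavours listed above; these are not the CDK constants.
    enum Flavour { MOL_WEIGHT, MOL_WEIGHT_IGNORE_SPECIFIED, MONO_ISOTOPIC }

    // One atom: element data plus an optional exact mass (null = unspecified).
    static final class Atom {
        final double averageMass;
        final double majorIsotopeMass;
        final Double exactMass;

        Atom(double averageMass, double majorIsotopeMass, Double exactMass) {
            this.averageMass = averageMass;
            this.majorIsotopeMass = majorIsotopeMass;
            this.exactMass = exactMass;
        }
    }

    static double mass(List<Atom> atoms, Flavour flavour) {
        double sum = 0;
        for (Atom a : atoms) {
            // "Specified" means a non-null, non-zero exact mass.
            boolean specified = a.exactMass != null && a.exactMass != 0.0;
            switch (flavour) {
                case MOL_WEIGHT:
                    sum += specified ? a.exactMass : a.averageMass;
                    break;
                case MOL_WEIGHT_IGNORE_SPECIFIED:
                    sum += a.averageMass;
                    break;
                case MONO_ISOTOPIC:
                    sum += specified ? a.exactMass : a.majorIsotopeMass;
                    break;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Rounded illustrative values for carbon: average 12.011, 12C 12.0, 13C 13.00335.
        Atom carbon = new Atom(12.011, 12.0, null);
        Atom carbon13 = new Atom(12.011, 12.0, 13.00335);
        List<Atom> atoms = Arrays.asList(carbon, carbon13);
        for (Flavour f : Flavour.values()) {
            System.out.println(f + " " + mass(atoms, f));
        }
    }
}
```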
      • getMass

        public static double getMass​(IMolecularFormula mf)
        Calculates the mass of a formula using the default 'mass flavour' (MolWeight). The flavour switches the computation type. The key distinction is how specified/unspecified isotopes are handled. A specified isotope is an atom that has either IIsotope.setMassNumber(Integer) or IIsotope.setExactMass(Double) set to non-null and non-zero.
        The flavours are:
        • MolWeight (default) - uses the exact mass of each atom when an isotope is specified, if not specified the average mass of the element is used.
        • MolWeightIgnoreSpecified - uses the average mass of each element, ignoring any isotopic/exact mass specification
        • MonoIsotopic - uses the exact mass of each atom when an isotope is specified, if not specified the major isotope mass for that element is used.
        • MostAbundant - uses the exact mass of each atom when specified, if not specified a distribution is calculated and the most abundant isotope pattern is used.
        Parameters:
        mf - molecular formula
        Returns:
        the mass of the molecule
        See Also:
        getMass(IMolecularFormula, int), MolWeight, MolWeightIgnoreSpecified, MonoIsotopic, MostAbundant
      • getTotalNaturalAbundance

        public static double getTotalNaturalAbundance​(IMolecularFormula formula)
        Gets the summed natural abundance of all isotopes in a MolecularFormula. Assumes abundances to be preset, and will return 0.0 if not.
        Parameters:
        formula - The IMolecularFormula to calculate
        Returns:
        The summed natural abundance of all isotopes in this MolecularFormula
      • getDBE

        public static double getDBE​(IMolecularFormula formula)
                             throws CDKException
        Returns the number of double bond equivalents in this molecule.
        Parameters:
        formula - The IMolecularFormula to calculate
        Returns:
        The number of DBEs
        Throws:
        CDKException - if DBE cannot be evaluated
        Keywords:
        DBE, double bond equivalent
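        For orientation, double bond equivalents are commonly computed as 1 plus half the sum of count times (valence - 2) over all elements. The sketch below applies that textbook formula with a small fixed valence table. Both the table and the class DbeSketch are assumptions made for illustration; how CDK assigns valences, and when it throws CDKException, is not specified here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical DBE calculator using fixed, assumed valences.
final class DbeSketch {

    private static final Map<String, Integer> VALENCE = new HashMap<>();
    static {
        VALENCE.put("C", 4);
        VALENCE.put("N", 3);
        VALENCE.put("O", 2);
        VALENCE.put("S", 2);
        VALENCE.put("H", 1);
        VALENCE.put("F", 1);
        VALENCE.put("Cl", 1);
        VALENCE.put("Br", 1);
        VALENCE.put("I", 1);
    }

    // DBE = 1 + sum(count * (valence - 2)) / 2
    static double dbe(Map<String, Integer> counts) {
        double sum = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            Integer v = VALENCE.get(e.getKey());
            if (v == null) {
                throw new IllegalArgumentException("no valence known for " + e.getKey());
            }
            sum += e.getValue() * (v - 2);
        }
        return 1 + sum / 2.0;
    }

    public static void main(String[] args) {
        Map<String, Integer> benzene = new HashMap<>();
        benzene.put("C", 6);
        benzene.put("H", 6);
        System.out.println(dbe(benzene)); // 4.0 (one ring plus three double bonds)

        Map<String, Integer> ethanol = new HashMap<>();
        ethanol.put("C", 2);
        ethanol.put("H", 6);
        ethanol.put("O", 1);
        System.out.println(dbe(ethanol)); // 0.0
    }
}
```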
      • getMolecularFormula

        public static IMolecularFormula getMolecularFormula​(IAtomContainer atomContainer,
                                                            IMolecularFormula formula)
        Converts the atomContainer to an IMolecularFormula by adding its isotopes to the given IMolecularFormula.

        The hydrogens must be implicit.

        Parameters:
        atomContainer - IAtomContainer object
        formula - IMolecularFormula molecularFormula to put the new Isotopes
        Returns:
        the filled IMolecularFormula
        See Also:
        getMolecularFormula(IAtomContainer)
      • getAtomContainer

        public static IAtomContainer getAtomContainer​(IMolecularFormula formula,
                                                      IAtomContainer atomContainer)
        Converts the IMolecularFormula to an IAtomContainer by adding its elements to the given IAtomContainer.

        The hydrogens must be implicit.

        Parameters:
        formula - IMolecularFormula object
        atomContainer - IAtomContainer to put the new Elements
        Returns:
        the filled AtomContainer
        See Also:
        getAtomContainer(IMolecularFormula)
      • getAtomContainer

        public static IAtomContainer getAtomContainer​(String formulaString,
                                                      IChemObjectBuilder builder)
        Converts a formula string (like "C2H4") into an atom container with atoms but no bonds.
        Parameters:
        formulaString - the formula to convert
        builder - a chem object builder
        Returns:
        atoms wrapped in an atom container
      • generateOrderEle

        public static String[] generateOrderEle()
        Returns the Elements ordered according to (approximate) probability of occurrence.

        This begins with the "elements of life" C, H, O, N, (Si, P, S, F, Cl), then continues with the "common" chemical synthesis ingredients, closing off with the tail-end of the periodic table in atom-number order and finally the generic R-group.

        Returns:
        fixed-order array
      • compare

        public static boolean compare​(IMolecularFormula formula1,
                                      IMolecularFormula formula2)
        Compare two IMolecularFormula looking at type and number of IIsotope and charge of the formula.
        Parameters:
        formula1 - The first IMolecularFormula
        formula2 - The second IMolecularFormula
        Returns:
        True if both IMolecularFormula are the same
      • getHeavyElements

        public static List<IElement> getHeavyElements​(IMolecularFormula formula)
        Returns the elements of the formula, excluding all hydrogens.
        Parameters:
        formula - The IMolecularFormula
        Returns:
        The heavy elements as a List
        Keywords:
        hydrogen, removal
      • simplifyMolecularFormula

        public static String simplifyMolecularFormula​(String formula)
        Simplifies the molecular formula. For example, the dot '.' character convention is used when dividing a formula into parts. In this case any numeral following a dot refers to all the elements within that part of the formula that follow it.
        Parameters:
        formula - The molecular formula
        Returns:
        The simplified molecular formula
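        The dot convention can be made concrete with a small sketch that expands each part by its leading numeral and sums the element counts. This is an assumption-laden illustration, not the CDK algorithm: the class DotSketch is hypothetical, only plain element symbols and counts are recognized (other characters are skipped), and elements are emitted in order of first appearance, which may differ from the ordering CDK produces.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical expansion of dot-separated formulas such as "CuSO4.5H2O".
final class DotSketch {

    private static final Pattern TOKEN = Pattern.compile("([A-Z][a-z]?)(\\d*)");

    static String simplify(String formula) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String part : formula.split("\\.")) {
            // A leading numeral multiplies every element in this part.
            int i = 0;
            while (i < part.length() && Character.isDigit(part.charAt(i))) {
                i++;
            }
            int multiplier = (i == 0) ? 1 : Integer.parseInt(part.substring(0, i));
            Matcher m = TOKEN.matcher(part.substring(i));
            while (m.find()) {
                int n = m.group(2).isEmpty() ? 1 : Integer.parseInt(m.group(2));
                totals.merge(m.group(1), n * multiplier, Integer::sum);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : totals.entrySet()) {
            sb.append(e.getKey());
            if (e.getValue() > 1) {
                sb.append(e.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Copper sulfate pentahydrate: the 5 applies to both H2 and O.
        System.out.println(simplify("CuSO4.5H2O")); // CuSO9H10
        System.out.println(simplify("C2H6O"));      // C2H6O
    }
}
```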
      • adjustProtonation

        public static boolean adjustProtonation​(IMolecularFormula mf,
                                                int hcnt)
        Adjust the protonation of a molecular formula. This utility method adjusts the hydrogen isotope count and charge at the same time.
         IMolecularFormula mf = MolecularFormulaManipulator.getMolecularFormula("[C6H5O]-", bldr);
         MolecularFormulaManipulator.adjustProtonation(mf, +1); // now "C6H6O"
         MolecularFormulaManipulator.adjustProtonation(mf, -1); // now "C6H5O-"
         
        The return value indicates whether the protonation could be adjusted:
         IMolecularFormula mf = MolecularFormulaManipulator.getMolecularFormula("[Cl]-", bldr);
         MolecularFormulaManipulator.adjustProtonation(mf, +0); // false still "[Cl]-"
         MolecularFormulaManipulator.adjustProtonation(mf, +1); // true now "HCl"
         MolecularFormulaManipulator.adjustProtonation(mf, -1); // true now "[Cl]-" (again)
         MolecularFormulaManipulator.adjustProtonation(mf, -1); // false still "[Cl]-" (no H to remove!)
         
        The method tries to select an existing hydrogen isotope to augment. If no hydrogen isotopes are found a new major isotope (1H) is created.
        Parameters:
        mf - molecular formula
        hcnt - the number of hydrogens to add/remove (>0: protonate, <0: deprotonate)
        Returns:
        whether the protonation could be adjusted
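        The coupling between hydrogen count and charge can be modelled with two integers. The sketch below reproduces the documented "[Cl]-" sequence under simple assumptions and is not the CDK implementation: the class ProtonationSketch is hypothetical, and rejecting a removal that exceeds the available hydrogens without changing any state is an assumption of this sketch.

```java
// Hypothetical two-field model: hydrogen count and formal charge move together.
final class ProtonationSketch {

    int hydrogens;
    int charge;

    ProtonationSketch(int hydrogens, int charge) {
        this.hydrogens = hydrogens;
        this.charge = charge;
    }

    // Returns true if the protonation was changed.
    boolean adjust(int hcnt) {
        if (hcnt == 0) {
            return false; // nothing to do
        }
        if (hcnt < 0 && hydrogens + hcnt < 0) {
            return false; // not enough hydrogens to remove (assumed behaviour)
        }
        hydrogens += hcnt; // each added H+ raises the charge by one
        charge += hcnt;
        return true;
    }

    public static void main(String[] args) {
        ProtonationSketch chloride = new ProtonationSketch(0, -1); // "[Cl]-"
        System.out.println(chloride.adjust(0));  // false, still "[Cl]-"
        System.out.println(chloride.adjust(+1)); // true, now "HCl"
        System.out.println(chloride.adjust(-1)); // true, "[Cl]-" again
        System.out.println(chloride.adjust(-1)); // false, no H to remove
    }
}
```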
      • getMostAbundant

        public static IMolecularFormula getMostAbundant​(IMolecularFormula mf)
        Computes the most abundant MF. Given the MF C6Br6 this function rapidly computes the most abundant MF as [12C]6[79Br]3[81Br]3.
        Parameters:
        mf - a molecular formula with unspecified isotopes
        Returns:
        the most abundant MF, or null if it could not be computed
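        The C6Br6 example can be checked by hand: for an element with two isotopes, the number of atoms of each isotope follows a binomial distribution, and the most abundant composition is its mode. The sketch below finds that mode by brute force. It is an illustration only and says nothing about the algorithm CDK uses. The class AbundanceSketch is hypothetical, and the abundances used (79Br about 50.69%, 12C about 98.93%) are rounded natural abundances.

```java
// Hypothetical brute-force search for the most likely isotope split.
final class AbundanceSketch {

    // Returns k, the number of atoms (out of n) of the isotope with abundance p
    // that maximises the binomial probability C(n,k) * p^k * (1-p)^(n-k).
    static int mostLikelyCount(int n, double p) {
        int best = 0;
        double bestProb = -1;
        double binom = 1; // C(n, 0)
        for (int k = 0; k <= n; k++) {
            double prob = binom * Math.pow(p, k) * Math.pow(1 - p, n - k);
            if (prob > bestProb) {
                bestProb = prob;
                best = k;
            }
            binom = binom * (n - k) / (k + 1); // step to C(n, k + 1)
        }
        return best;
    }

    public static void main(String[] args) {
        int br79 = mostLikelyCount(6, 0.5069); // bromine: 79Br vs 81Br
        int c12 = mostLikelyCount(6, 0.9893);  // carbon: 12C vs 13C
        System.out.println("12C: " + c12 + ", 79Br: " + br79 + ", 81Br: " + (6 - br79));
    }
}
```

        For six bromine atoms the mode is three of each isotope, and for six carbon atoms it is six 12C, which agrees with the composition quoted above.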
      • getMostAbundant

        public static IMolecularFormula getMostAbundant​(IAtomContainer mol)
        Computes the most abundant MF. Given a molecule C6Br6 this function rapidly computes the most abundant MF as [12C]6[79Br]3[81Br]3.
        Parameters:
        mol - a molecule with unspecified isotopes
        Returns:
        the most abundant MF, or null if it could not be computed