Class MolecularFormulaManipulator

java.lang.Object
org.openscience.cdk.tools.manipulator.MolecularFormulaManipulator

public class MolecularFormulaManipulator extends Object
Class with convenience methods to manipulate IMolecularFormula objects.
Author:
miguelrojasch
Source code:
main
Belongs to CDK module:
formula
Created on:
2007-11-20
  • Field Details

  • Constructor Details

    • MolecularFormulaManipulator

      public MolecularFormulaManipulator()
  • Method Details

    • getAtomCount

      public static int getAtomCount(IMolecularFormula formula)
      Sums the occurrences of every isotope in the molecular formula; in short, the total number of atoms.
      Parameters:
      formula - The MolecularFormula to check
      Returns:
      The total number of atoms
    • getElementCount

      public static int getElementCount(IMolecularFormula formula, IElement element)
      Counts the occurrences of all isotopes of a particular IElement in the molecular formula. Returns 0 if the element is not present. The search is based only on the IElement.
      Parameters:
      formula - The MolecularFormula to check
      element - The IElement object
      Returns:
      The occurrence of this element in this molecular formula
    • getElementCount

      public static int getElementCount(IMolecularFormula formula, IIsotope isotope)
      Occurrences of a given element from an isotope in a molecular formula.
      Parameters:
      formula - the formula
      isotope - isotope of an element
      Returns:
      number of times the element occurs
    • getElementCount

      public static int getElementCount(IMolecularFormula formula, String symbol)
      Occurrences of a given element in a molecular formula.
      Parameters:
      formula - the formula
      symbol - element symbol (e.g. C for carbon)
      Returns:
      number of times the element occurs
    • getIsotopes

      public static List<IIsotope> getIsotopes(IMolecularFormula formula, IElement element)
      Get the list of IIsotopes of a given IElement that are contained in the molecular formula. The search is based only on the IElement.
      Parameters:
      formula - The MolecularFormula to check
      element - The IElement object
      Returns:
      The list with the IIsotopes in this molecular formula
    • elements

      public static List<IElement> elements(IMolecularFormula formula)
      Get a list of all IElements that are contained in the molecular formula.
      Parameters:
      formula - The MolecularFormula to check
      Returns:
      The list with the IElements in this molecular formula
    • containsElement

      public static boolean containsElement(IMolecularFormula formula, IElement element)
      True, if the MolecularFormula contains the given element as IIsotope object.
      Parameters:
      formula - IMolecularFormula molecularFormula
      element - The element this MolecularFormula is searched for
      Returns:
      True, if the MolecularFormula contains the given element object
    • removeElement

      public static IMolecularFormula removeElement(IMolecularFormula formula, IElement element)
      Removes all isotopes from a given element in the MolecularFormula.
      Parameters:
      formula - IMolecularFormula molecularFormula
      element - The IElement of the IIsotopes to be removed
      Returns:
      The molecularFormula with the isotopes removed
    • getString

      public static String getString(IMolecularFormula formula, String[] orderElements, boolean setOne)
      Returns the string representation of the molecular formula.
      Parameters:
      formula - The IMolecularFormula Object
      orderElements - The order of Elements
      setOne - True if the count 1 should be written explicitly for elements with a single atom
      Returns:
      A String containing the molecular formula
    • getString

      public static String getString(IMolecularFormula formula, String[] orderElements, boolean setOne, boolean setMassNumber)
      Returns the string representation of the molecular formula.
      Parameters:
      formula - The IMolecularFormula Object
      orderElements - The order of Elements
      setOne - True if the count 1 should be written explicitly for elements with a single atom
      setMassNumber - If the formula contains an isotope of an element that is the non-major isotope, the element is represented as [XE] where X is the mass number and E is the element symbol
      Returns:
      A String containing the molecular formula
    • getString

      public static String getString(IMolecularFormula formula)
      Returns the string representation of the molecular formula. Based on Hill System. The Hill system is a system of writing chemical formulas such that the number of carbon atoms in a molecule is indicated first, the number of hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.
      Parameters:
      formula - The IMolecularFormula Object
      Returns:
      A String containing the molecular formula
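      The Hill ordering described above can be sketched in plain Java. This is a minimal illustration, not the CDK implementation: the class HillSketch and its method hill are hypothetical names, and the formula is assumed to be given as a map from element symbol to atom count.

```java
import java.util.Map;
import java.util.TreeMap;

final class HillSketch {
    /** Formats element counts in Hill order; a count of 1 is written only if setOne is true. */
    static String hill(Map<String, Integer> counts, boolean setOne) {
        // A TreeMap keeps the symbols in alphabetical order.
        TreeMap<String, Integer> rest = new TreeMap<>(counts);
        StringBuilder sb = new StringBuilder();
        if (rest.containsKey("C")) {
            // With carbon present: carbon first, then hydrogen.
            append(sb, "C", rest.remove("C"), setOne);
            if (rest.containsKey("H")) {
                append(sb, "H", rest.remove("H"), setOne);
            }
        }
        // Everything left (including hydrogen when there is no carbon) is alphabetical.
        for (Map.Entry<String, Integer> e : rest.entrySet()) {
            append(sb, e.getKey(), e.getValue(), setOne);
        }
        return sb.toString();
    }

    private static void append(StringBuilder sb, String symbol, int count, boolean setOne) {
        sb.append(symbol);
        if (count != 1 || setOne) {
            sb.append(count);
        }
    }
}
```

      For example, the counts {C=2, H=6, O=1} give "C2H6O", while the carbon-free counts {H=2, O=4, S=1} give "H2O4S".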
    • getString

      public static String getString(IMolecularFormula formula, boolean setOne)
      Returns the string representation of the molecular formula. Based on Hill System. The Hill system is a system of writing chemical formulas such that the number of carbon atoms in a molecule is indicated first, the number of hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.
      Parameters:
      formula - The IMolecularFormula Object
      setOne - True if the count 1 should be written explicitly for elements with a single atom
      Returns:
      A String containing the molecular formula
    • getString

      public static String getString(IMolecularFormula formula, boolean setOne, boolean setMassNumber)
      Returns the string representation of the molecular formula. Based on Hill System. The Hill system is a system of writing chemical formulas such that the number of carbon atoms in a molecule is indicated first, the number of hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.
      Parameters:
      formula - The IMolecularFormula Object
      setOne - True if the count 1 should be written explicitly for elements with a single atom
      setMassNumber - If the formula contains an isotope of an element that is the non-major isotope, the element is represented as [XE] where X is the mass number and E is the element symbol
      Returns:
      A String containing the molecular formula
    • putInOrder

      public static List<IIsotope> putInOrder(String[] orderElements, IMolecularFormula formula)
    • getHillString

      @Deprecated public static String getHillString(IMolecularFormula formula)
    • getHTML

      public static String getHTML(IMolecularFormula formula)
      Returns the string representation of the molecular formula based on Hill System with numbers wrapped in <sub></sub> tags. Useful for displaying formulae in Swing components or on the web.
      Parameters:
      formula - The IMolecularFormula object
      Returns:
      An HTML representation of the molecular formula
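      The subscript markup can be illustrated with a short plain-Java sketch. It assumes the input is an already formatted, uncharged formula string without isotope labels; HtmlFormulaSketch and toHtml are hypothetical names and do not reflect how CDK builds its output.

```java
final class HtmlFormulaSketch {
    /** Wraps every run of digits in <sub></sub> tags. */
    static String toHtml(String formula) {
        // $1 refers to the captured run of digits.
        return formula.replaceAll("(\\d+)", "<sub>$1</sub>");
    }
}
```

      With this sketch, "C2H6O" becomes "C<sub>2</sub>H<sub>6</sub>O".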
    • getHTML

      public static String getHTML(IMolecularFormula formula, boolean chargeB, boolean isotopeB)
      Returns the string representation of the molecular formula based on Hill System with numbers wrapped in <sub></sub> tags and the isotope of each Element in <sup></sup> tags and the total charge of IMolecularFormula in <sup></sup> tags. Useful for displaying formulae in Swing components or on the web.
      Parameters:
      formula - The IMolecularFormula object
      chargeB - True if the charge should be shown
      isotopeB - True if the isotope mass should be shown
      Returns:
      An HTML representation of the molecular formula
    • getHTML

      public static String getHTML(IMolecularFormula formula, String[] orderElements, boolean showCharge, boolean showIsotopes)
      Returns the string representation of the molecular formula with numbers wrapped in <sub></sub> tags, the isotope of each Element in <sup></sup> tags, and the total charge of the IMolecularFormula in <sup></sup> tags. Useful for displaying formulae in Swing components or on the web.
      Parameters:
      formula - The IMolecularFormula object
      orderElements - The order of Elements
      showCharge - True if the charge should be shown
      showIsotopes - True if the isotope mass should be shown
      Returns:
      An HTML representation of the molecular formula
    • getMolecularFormula

      public static IMolecularFormula getMolecularFormula(String stringMF, IChemObjectBuilder builder)
      Construct an instance of IMolecularFormula, initialized with a molecular formula string. The string is immediately analyzed and a set of Nodes is built based on this analysis

      The hydrogens must be implicit.

      Parameters:
      stringMF - The molecularFormula string
      builder - an IChemObjectBuilder which is used to construct atoms
      Returns:
      The filled IMolecularFormula
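      As a rough illustration of how a formula string can be analyzed into element counts, the following plain-Java sketch tokenizes a simple string such as "C2H4". It is an assumption-laden stand-in rather than CDK's parser: FormulaParseSketch and parse are hypothetical names, and charges, isotope labels, brackets and dots are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FormulaParseSketch {
    // An element symbol (one upper-case letter, optional lower-case letter) and an optional count.
    private static final Pattern TOKEN = Pattern.compile("([A-Z][a-z]?)(\\d*)");

    /** Parses a simple formula string into element counts, in order of first appearance. */
    static Map<String, Integer> parse(String formula) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = TOKEN.matcher(formula);
        int end = 0;
        while (m.find()) {
            if (m.start() != end) {
                throw new IllegalArgumentException("Unexpected text at index " + end);
            }
            int n = m.group(2).isEmpty() ? 1 : Integer.parseInt(m.group(2));
            // Repeated symbols (e.g. the two C in "CH3COOH") are summed.
            counts.merge(m.group(1), n, Integer::sum);
            end = m.end();
        }
        if (end != formula.length()) {
            throw new IllegalArgumentException("Unexpected text at index " + end);
        }
        return counts;
    }
}
```

      Parsing "CH3COOH" with this sketch yields {C=2, H=4, O=2}.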
    • getMajorIsotopeMolecularFormula

      public static IMolecularFormula getMajorIsotopeMolecularFormula(String stringMF, IChemObjectBuilder builder)
      Construct an instance of IMolecularFormula, initialized with a molecular formula string. The string is immediately analyzed and a set of Nodes is built based on this analysis. The hydrogens must be implicit. Major isotopes are being used.
      Parameters:
      stringMF - The molecularFormula string
      builder - an IChemObjectBuilder which is used to construct atoms
      Returns:
      The filled IMolecularFormula
    • getMolecularFormula

      public static IMolecularFormula getMolecularFormula(String stringMF, IMolecularFormula formula)
      Adds to an existing IMolecularFormula instance the elements extracted from a molecular formula string. The string is immediately analyzed and a set of Nodes is built based on this analysis.

      The hydrogens must be implicit.

      Parameters:
      stringMF - The molecularFormula string
      formula - The IMolecularFormula to which the elements are added
      Returns:
      The filled IMolecularFormula
    • getTotalExactMass

      @Deprecated public static double getTotalExactMass(IMolecularFormula formula)
      Deprecated.
      calls getMass(IMolecularFormula, int) with option MonoIsotopic and adjusts for charge with correctMass(double, Integer). These functions should be used directly.
    • getTotalMassNumber

      public static double getTotalMassNumber(IMolecularFormula formula)
      Get the summed mass number of all isotopes from a MolecularFormula. It assumes isotope masses to be preset, and returns 0.0 if not.
      Parameters:
      formula - The IMolecularFormula to calculate
      Returns:
      The summed nominal mass of all atoms in this MolecularFormula
    • getMass

      public static double getMass(IMolecularFormula mf, int flav)
      Calculate the mass of a formula. This function takes an optional 'mass flavour' that switches the computation type. The key distinction is how specified/unspecified isotopes are handled. A specified isotope is an atom that has either IIsotope.setMassNumber(Integer) or IIsotope.setExactMass(Double) set to non-null and non-zero.
      The flavours are:
      • MolWeight (default) - uses the exact mass of each atom when an isotope is specified, if not specified the average mass of the element is used.
      • MolWeightIgnoreSpecified - uses the average mass of each element, ignoring any isotopic/exact mass specification
      • MonoIsotopic - uses the exact mass of each atom when an isotope is specified, if not specified the major isotope mass for that element is used.
      • MostAbundant - uses the exact mass of each atom when specified, if not specified a distribution is calculated and the most abundant isotope pattern is used.
      Parameters:
      mf - molecular formula
      flav - flavor
      Returns:
      the mass of the molecule
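      To make the difference between the MolWeight and MonoIsotopic flavours concrete for a formula with unspecified isotopes, here is a small plain-Java sketch. It does not call CDK: MassFlavourSketch and mass are hypothetical names, and the two mass tables cover only carbon and hydrogen with commonly quoted values.

```java
import java.util.Map;

final class MassFlavourSketch {
    // Illustrative values in unified atomic mass units, carbon and hydrogen only.
    static final Map<String, Double> AVERAGE_MASS = Map.of("C", 12.011, "H", 1.008);
    static final Map<String, Double> MAJOR_ISOTOPE_MASS = Map.of("C", 12.0, "H", 1.00782503);

    /**
     * Sums per-element masses for a formula without specified isotopes.
     * monoIsotopic = false mimics the idea of MolWeight (average masses),
     * monoIsotopic = true mimics the idea of MonoIsotopic (major isotope masses).
     */
    static double mass(Map<String, Integer> counts, boolean monoIsotopic) {
        Map<String, Double> table = monoIsotopic ? MAJOR_ISOTOPE_MASS : AVERAGE_MASS;
        double sum = 0.0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            // Elements missing from the table are not supported by this sketch
            // (table.get would return null and unboxing would fail).
            sum += table.get(e.getKey()) * e.getValue();
        }
        return sum;
    }
}
```

      For methane (CH4) the sketch gives roughly 16.043 with average masses and roughly 16.0313 with major isotope masses.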
    • getMass

      public static double getMass(IMolecularFormula mf)
      Calculate the mass of a formula. This function takes an optional 'mass flavour' that switches the computation type. The key distinction is how specified/unspecified isotopes are handled. A specified isotope is an atom that has either IIsotope.setMassNumber(Integer) or IIsotope.setExactMass(Double) set to non-null and non-zero.
      The flavours are:
      • MolWeight (default) - uses the exact mass of each atom when an isotope is specified, if not specified the average mass of the element is used.
      • MolWeightIgnoreSpecified - uses the average mass of each element, ignoring any isotopic/exact mass specification
      • MonoIsotopic - uses the exact mass of each atom when an isotope is specified, if not specified the major isotope mass for that element is used.
      • MostAbundant - uses the exact mass of each atom when specified, if not specified a distribution is calculated and the most abundant isotope pattern is used.
      Parameters:
      mf - molecular formula
      Returns:
      the mass of the molecule
    • getNaturalExactMass

      @Deprecated public static double getNaturalExactMass(IMolecularFormula formula)
    • getMajorIsotopeMass

      @Deprecated public static double getMajorIsotopeMass(IMolecularFormula formula)
      Deprecated.
    • getTotalNaturalAbundance

      public static double getTotalNaturalAbundance(IMolecularFormula formula)
      Get the summed natural abundance of all isotopes from a MolecularFormula. Assumes abundances to be preset, and will return 0.0 if not.
      Parameters:
      formula - The IMolecularFormula to calculate
      Returns:
      The summed natural abundance of all isotopes in this MolecularFormula
    • getDBE

      public static double getDBE(IMolecularFormula formula) throws CDKException
      Returns the number of double bond equivalents in this molecule.
      Parameters:
      formula - The IMolecularFormula to calculate
      Returns:
      The number of DBEs
      Throws:
      CDKException - if DBE cannot be evaluated
      Keywords:
      DBE, double bond equivalent
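      A common textbook way to compute double bond equivalents is DBE = 1 + (sum over elements of n_i * (v_i - 2)) / 2, where n_i is the atom count and v_i the usual valence of element i. The sketch below applies that formula in plain Java; it is not taken from CDK, the names DbeSketch and dbe are hypothetical, and the small valence table is an assumption that covers only a few common elements.

```java
import java.util.Map;

final class DbeSketch {
    // Assumed standard valences for a handful of elements.
    static final Map<String, Integer> VALENCE = Map.of(
            "C", 4, "H", 1, "N", 3, "O", 2, "F", 1, "Cl", 1, "Br", 1, "I", 1);

    /** DBE = 1 + sum(n_i * (v_i - 2)) / 2 over all elements in the formula. */
    static double dbe(Map<String, Integer> counts) {
        double sum = 0.0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            Integer valence = VALENCE.get(e.getKey());
            if (valence == null) {
                throw new IllegalArgumentException("No valence known for " + e.getKey());
            }
            sum += e.getValue() * (valence - 2);
        }
        return 1.0 + sum / 2.0;
    }
}
```

      Under these assumptions benzene (C6H6) has 4 double bond equivalents and ethanol (C2H6O) has 0.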
    • getMolecularFormula

      public static IMolecularFormula getMolecularFormula(IAtomContainer atomContainer)
      Converts the atomContainer to an IMolecularFormula.

      The hydrogens must be implicit.

      Parameters:
      atomContainer - IAtomContainer object
      Returns:
      a molecular formula object
    • getMolecularFormula

      public static IMolecularFormula getMolecularFormula(IAtomContainer atomContainer, IMolecularFormula formula)
      Converts the atomContainer to an IMolecularFormula, adding the isotopes to the given IMolecularFormula.

      The hydrogens must be implicit.

      Parameters:
      atomContainer - IAtomContainer object
      formula - IMolecularFormula molecularFormula to put the new Isotopes
      Returns:
      the filled IMolecularFormula
    • getAtomContainer

      public static IAtomContainer getAtomContainer(IMolecularFormula formula)
      Converts the IMolecularFormula to an IAtomContainer.

      The hydrogens must be implicit.

      Parameters:
      formula - IMolecularFormula object
      Returns:
      the filled AtomContainer
    • getAtomContainer

      public static IAtomContainer getAtomContainer(IMolecularFormula formula, IAtomContainer atomContainer)
      Converts the IMolecularFormula to an IAtomContainer, adding the atoms to the given IAtomContainer.

      The hydrogens must be implicit.

      Parameters:
      formula - IMolecularFormula object
      atomContainer - IAtomContainer to put the new Elements
      Returns:
      the filled AtomContainer
    • getAtomContainer

      public static IAtomContainer getAtomContainer(String formulaString, IChemObjectBuilder builder)
      Converts a formula string (like "C2H4") into an atom container with atoms but no bonds.
      Parameters:
      formulaString - the formula to convert
      builder - a chem object builder
      Returns:
      atoms wrapped in an atom container
    • generateOrderEle

      public static String[] generateOrderEle()
      Returns the Elements ordered according to (approximate) probability of occurrence.

      This begins with the "elements of life" C, H, O, N, (Si, P, S, F, Cl), then continues with the "common" chemical synthesis ingredients, closing off with the tail-end of the periodic table in atom-number order and finally the generic R-group.

      Returns:
      fixed-order array
    • compare

      public static boolean compare(IMolecularFormula formula1, IMolecularFormula formula2)
      Compare two IMolecularFormula looking at type and number of IIsotope and charge of the formula.
      Parameters:
      formula1 - The first IMolecularFormula
      formula2 - The second IMolecularFormula
      Returns:
      True, if both IMolecularFormula are the same
    • getHeavyElements

      public static List<IElement> getHeavyElements(IMolecularFormula formula)
      Returns the elements of the formula, excluding hydrogen.
      Parameters:
      formula - The IMolecularFormula
      Returns:
      A List of the heavy elements
      Keywords:
      hydrogen, removal
    • simplifyMolecularFormula

      public static String simplifyMolecularFormula(String formula)
      Simplify the molecular formula. For example, the dot '.' character convention is used when dividing a formula into parts. In this case any numeral following a dot applies to all the elements in the part of the formula that follows it.
      Parameters:
      formula - The molecular formula
      Returns:
      The simplified molecular formula
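      The dot convention can be illustrated with a plain-Java sketch that multiplies each part by its leading numeral and merges the counts. This is an assumption-based illustration, not CDK's algorithm: DotFormulaSketch and simplify are hypothetical names, brackets and charges are ignored, and the output lists elements in order of first appearance, which may differ from what CDK returns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DotFormulaSketch {
    private static final Pattern TOKEN = Pattern.compile("([A-Z][a-z]?)(\\d*)");

    /** Merges dot-separated parts such as "CuSO4.5H2O" into a single formula string. */
    static String simplify(String formula) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String part : formula.split("\\.")) {
            // A numeral directly after the dot multiplies every element in that part.
            int i = 0;
            while (i < part.length() && Character.isDigit(part.charAt(i))) {
                i++;
            }
            int multiplier = (i == 0) ? 1 : Integer.parseInt(part.substring(0, i));
            Matcher m = TOKEN.matcher(part.substring(i));
            while (m.find()) {
                int n = m.group(2).isEmpty() ? 1 : Integer.parseInt(m.group(2));
                counts.merge(m.group(1), n * multiplier, Integer::sum);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sb.append(e.getKey());
            if (e.getValue() != 1) {
                sb.append(e.getValue());
            }
        }
        return sb.toString();
    }
}
```

      In this sketch "CuSO4.5H2O" is merged into "CuSO9H10": the 5 after the dot applies to both H2 and O.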
    • adjustProtonation

      public static boolean adjustProtonation(IMolecularFormula mf, int hcnt)
      Adjust the protonation of a molecular formula. This utility method adjusts the hydrogen isotope count and charge at the same time.
       IMolecularFormula mf = MolecularFormulaManipulator.getMolecularFormula("[C6H5O]-", bldr);
       MolecularFormulaManipulator.adjustProtonation(mf, +1); // now "C6H6O"
       MolecularFormulaManipulator.adjustProtonation(mf, -1); // now "C6H5O-"
       
      The return value indicates whether the protonation could be adjusted:
       IMolecularFormula mf = MolecularFormulaManipulator.getMolecularFormula("[Cl]-", bldr);
       MolecularFormulaManipulator.adjustProtonation(mf, +0); // false still "[Cl]-"
       MolecularFormulaManipulator.adjustProtonation(mf, +1); // true now "HCl"
       MolecularFormulaManipulator.adjustProtonation(mf, -1); // true now "[Cl]-" (again)
       MolecularFormulaManipulator.adjustProtonation(mf, -1); // false still "[Cl]-" (no H to remove!)
       
      The method tries to select an existing hydrogen isotope to augment. If no hydrogen isotopes are found a new major isotope (1H) is created.
      Parameters:
      mf - molecular formula
      hcnt - the number of hydrogens to add/remove (>0: protonate, <0: deprotonate)
      Returns:
      true if the protonation could be adjusted, false otherwise
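      The bookkeeping behind the examples above (hydrogen count and charge change together, and the return value reports whether anything changed) can be modelled in a few lines of plain Java. This is a simplified model for illustration only: ProtonationSketch and adjust are hypothetical names, and isotope selection is not represented.

```java
final class ProtonationSketch {
    int hydrogens; // number of hydrogen atoms in the formula
    int charge;    // total formal charge of the formula

    ProtonationSketch(int hydrogens, int charge) {
        this.hydrogens = hydrogens;
        this.charge = charge;
    }

    /** Adds (hcnt > 0) or removes (hcnt < 0) protons; returns false if nothing could be changed. */
    boolean adjust(int hcnt) {
        if (hcnt == 0) {
            return false; // nothing requested
        }
        if (hydrogens + hcnt < 0) {
            return false; // not enough hydrogens to remove
        }
        hydrogens += hcnt;
        charge += hcnt; // each proton carries a +1 charge
        return true;
    }
}
```

      Starting from new ProtonationSketch(0, -1), which models "[Cl]-", the calls adjust(0), adjust(1), adjust(-1), adjust(-1) return false, true, true, false, matching the sequence shown above.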
    • getMostAbundant

      public static IMolecularFormula getMostAbundant(IMolecularFormula mf)
      Compute the most abundant MF. Given the MF C6Br6 this function rapidly computes the most abundant MF as ¹²C₆⁷⁹Br₃⁸¹Br₃.
      Parameters:
      mf - a molecular formula with unspecified isotopes
      Returns:
      the most abundant MF, or null if it could not be computed
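      Why C6Br6 resolves to three ⁷⁹Br and three ⁸¹Br can be checked with a small plain-Java sketch. For an element with two isotopes, the number of heavy-isotope atoms among n atoms follows a binomial distribution, whose most likely value is floor((n + 1) * p) for heavy-isotope abundance p. This is only a sketch of the idea under that two-isotope assumption, not the algorithm CDK uses; MostAbundantSketch and mostLikelyHeavyCount are hypothetical names, and the abundances quoted below (about 0.4931 for ⁸¹Br and about 0.0107 for ¹³C) are commonly cited values.

```java
final class MostAbundantSketch {
    /**
     * Most likely number of heavy-isotope atoms among n atoms of a two-isotope element,
     * i.e. the mode of a binomial distribution with success probability heavyAbundance.
     */
    static int mostLikelyHeavyCount(int n, double heavyAbundance) {
        int mode = (int) Math.floor((n + 1) * heavyAbundance);
        return Math.min(n, mode); // guard the edge case heavyAbundance == 1.0
    }
}
```

      With n = 6 the sketch returns 3 for bromine (three ⁸¹Br, hence three ⁷⁹Br) and 0 for carbon (all six atoms are ¹²C).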
    • getMostAbundant

      public static IMolecularFormula getMostAbundant(IAtomContainer mol)
      Compute the most abundant MF. Given a molecule C6Br6 this function rapidly computes the most abundant MF as ¹²C₆⁷⁹Br₃⁸¹Br₃.
      Parameters:
      mol - a molecule with unspecified isotopes
      Returns:
      the most abundant MF, or null if it could not be computed