Class Abbreviations

java.lang.Object
org.openscience.cdk.depict.Abbreviations
All Implemented Interfaces:
Iterable<String>

public class Abbreviations extends Object implements Iterable<String>
Utility class for abbreviating (sub)structures. Using either self-assigned structural motifs or a pre-loaded common set, a structure depiction can be made more concise with the use of abbreviations (sometimes called superatoms).

Basic usage:


 Abbreviations abrv = new Abbreviations();

 // add some abbreviations; when they overlap (e.g. Me, Et, tBu) the first one wins
 abrv.add("[Na+].[H-] NaH");
 abrv.add("*c1ccccc1 Ph");
 abrv.add("*C(C)(C)C tBu");
 abrv.add("*CC Et");
 abrv.add("*C Me");

 // maybe we don't want 'Me' in the depiction
 abrv.setEnabled("Me", false);

 // assign abbreviations with some filters
 int numAdded = abrv.apply(mol);

 // generate all but don't assign; they need to be added manually by
 // setting/updating the CDKConstants.CTAB_SGROUPS property of mol
 List<Sgroup> sgroups = abrv.generate(mol);
 

Predefined sets of abbreviations can be loaded; the following is on the classpath.


 // https://www.github.com/openbabel/superatoms
 abrv.loadFromFile("obabel_superatoms.smi");
 
Keywords:
abbreviate, depict, superatom
  • Constructor Details

    • Abbreviations

      public Abbreviations()
  • Method Details

    • iterator

      public Iterator<String> iterator()
      Iterate over loaded abbreviations. Both enabled and disabled abbreviations are listed.
      Specified by:
      iterator in interface Iterable<String>
      Returns:
      the abbreviations labels (e.g. Ph, Et, Me, OAc, etc.)
    • isEnabled

      public boolean isEnabled(String label)
      Check whether an abbreviation is enabled.
      Parameters:
      label - the label to check (e.g. Ph, Et, Me, OAc, etc.)
      Returns:
      whether the label is enabled
    • setEnabled

      public boolean setEnabled(String label, boolean enabled)
      Set whether an abbreviation is enabled or disabled.
      Parameters:
      label - the label (e.g. Ph, Et, Me, OAc, etc.)
      enabled - flag the label as enabled or disabled
      Returns:
      the label state was modified
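
      The return value of setEnabled is easiest to read as "did this call change anything". The sketch below is a hypothetical stand-in for that bookkeeping, not the CDK implementation: the class name LabelToggleSketch, the addLabel helper, and the treatment of unknown labels (never enabled, never modified) are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical label bookkeeping; illustrates the documented return values only. */
class LabelToggleSketch {
    private final Set<String> labels = new LinkedHashSet<>(); // all loaded labels, in insertion order
    private final Set<String> disabled = new HashSet<>();     // labels currently switched off

    void addLabel(String label) {
        labels.add(label);
    }

    boolean isEnabled(String label) {
        // assumption: a label that was never loaded is reported as not enabled
        return labels.contains(label) && !disabled.contains(label);
    }

    /** Returns true only if the call changed the state of a known label. */
    boolean setEnabled(String label, boolean enabled) {
        if (!labels.contains(label)) {
            return false; // assumption: unknown labels are never modified
        }
        // Set.remove/Set.add report whether the set actually changed
        return enabled ? disabled.remove(label) : disabled.add(label);
    }

    public static void main(String[] args) {
        LabelToggleSketch sketch = new LabelToggleSketch();
        sketch.addLabel("Me");
        System.out.println(sketch.setEnabled("Me", false)); // first call changes the state
        System.out.println(sketch.setEnabled("Me", false)); // second call is a no-op
    }
}
```

      Under these assumptions, disabling an already disabled label returns false, so callers can use the result to detect redundant calls.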
    • setContractOnHetero

      public void setContractOnHetero(boolean val)
      Set whether abbreviations should be further contracted when they are connected to a heteroatom, for example -NH-Boc becomes -NHBoc. By default this option is enabled.
      Parameters:
      val - on/off
    • setContractToSingleLabel

      public void setContractToSingleLabel(boolean val)
    • generate

      public List<Sgroup> generate(IAtomContainer mol)
      Find all enabled abbreviations in the provided molecule. They are not added to the existing Sgroups and may need filtering.
      Parameters:
      mol - molecule
      Returns:
      list of new abbreviation Sgroups
    • apply

      public int apply(IAtomContainer mol)
      Generates and assigns abbreviations to a molecule. Abbreviations are first generated with generate(org.openscience.cdk.interfaces.IAtomContainer) and then filtered based on coverage. Currently only abbreviations that cover 100% or less than 40% of the atoms are assigned.
      Parameters:
      mol - molecule
      Returns:
      number of new abbreviations
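
      The coverage rule described for apply can be written as a small predicate. This is a minimal sketch under stated assumptions, not the CDK source: the class and method names are hypothetical, and it assumes coverage is measured per abbreviation as covered atoms divided by the total atom count of the molecule.

```java
/** Hypothetical illustration of the "100% or below 40%" coverage rule. */
class CoverageFilterSketch {

    /**
     * @param coveredAtoms number of atoms covered by one abbreviation
     * @param totalAtoms   number of atoms in the whole molecule
     * @return true if the abbreviation would pass the coverage filter
     */
    static boolean passesCoverage(int coveredAtoms, int totalAtoms) {
        if (totalAtoms <= 0 || coveredAtoms < 0 || coveredAtoms > totalAtoms) {
            return false; // nonsensical input, nothing to assign
        }
        if (coveredAtoms == totalAtoms) {
            return true; // covers the whole structure (100%)
        }
        // strictly less than 40%, using integer arithmetic to avoid rounding issues
        return coveredAtoms * 100L < totalAtoms * 40L;
    }

    public static void main(String[] args) {
        System.out.println(passesCoverage(10, 10)); // whole structure
        System.out.println(passesCoverage(3, 10));  // 30% coverage
        System.out.println(passesCoverage(6, 10));  // 60% coverage
    }
}
```

      Under this reading, an abbreviation covering exactly 40% of the atoms is rejected, because the documented threshold is a strict "less than".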
    • add

      public boolean add(String line) throws InvalidSmilesException
      Convenience method to add an abbreviation from a SMILES string.
      Parameters:
      line - the SMILES to add with a title (the label)
      Returns:
      the abbreviation was added; false if no title was supplied
      Throws:
      InvalidSmilesException - the SMILES was not valid
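
      The input line pairs a SMILES string with its title. The following sketch shows one plausible way to split such a line and to detect a missing title; it is an illustration only. AbbreviationLineSketch and its split method are hypothetical names, and splitting at the first run of whitespace is an assumption about the format.

```java
/** Hypothetical helper that separates "SMILES title" into its two parts. */
class AbbreviationLineSketch {

    /**
     * Splits a line at the first run of whitespace.
     *
     * @return a two-element array {smiles, label}, or null if no title is present
     */
    static String[] split(String line) {
        if (line == null) {
            return null;
        }
        // limit of 2 keeps any later spaces inside the label part
        String[] parts = line.trim().split("\\s+", 2);
        if (parts.length < 2 || parts[0].isEmpty()) {
            return null; // blank line or SMILES without a title
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = split("[Na+].[H-] NaH");
        System.out.println(parts[0] + " -> " + parts[1]);
        System.out.println(split("*C") == null); // no title supplied
    }
}
```

      A null result here corresponds to the documented case where add returns false because no title was supplied.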
    • add

      public boolean add(IAtomContainer mol, String label)
      Add an abbreviation to the factory. Abbreviations can be of various flavours based on the number of attachments:

      Detached - zero attachments, the abbreviation covers the whole structure (e.g. THF)
      Terminal - one attachment, covers substituents (e.g. Ph for Phenyl)
      Linker - [NOT SUPPORTED YET] two attachments, covers long repeated chains (e.g. PEG4)

      Attachment points (if present) must be specified with zero-element atoms (atomic number 0, written as * in SMILES).

       *c1ccccc1 Ph
       *OC(=O)C OAc
       
      Parameters:
      mol - the fragment to abbreviate
      label - the label of the fragment
      Returns:
      the abbreviation was added
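
      The three flavours are distinguished purely by the number of attachment points. The sketch below classifies a SMILES string by counting * characters. That is a textual shortcut chosen for brevity (a real implementation would count atoms with atomic number 0 on the parsed fragment), and AttachmentFlavourSketch and its enum are hypothetical names.

```java
/** Hypothetical classifier for abbreviation flavours based on attachment count. */
class AttachmentFlavourSketch {

    enum Flavour { DETACHED, TERMINAL, LINKER, UNSUPPORTED }

    /** Counts '*' characters as a stand-in for zero-element attachment atoms. */
    static int countAttachments(String smiles) {
        int count = 0;
        for (int i = 0; i < smiles.length(); i++) {
            if (smiles.charAt(i) == '*') {
                count++;
            }
        }
        return count;
    }

    static Flavour classify(String smiles) {
        switch (countAttachments(smiles)) {
            case 0:  return Flavour.DETACHED; // whole structure, e.g. THF
            case 1:  return Flavour.TERMINAL; // substituent, e.g. Ph
            case 2:  return Flavour.LINKER;   // documented as not supported yet
            default: return Flavour.UNSUPPORTED;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("C1CCOC1"));   // THF
        System.out.println(classify("*c1ccccc1")); // Ph
        System.out.println(classify("*OCCO*"));    // a two-ended fragment
    }
}
```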
    • loadFromFile

      public int loadFromFile(String path) throws IOException
      Load a set of abbreviations from a classpath resource or file in SMILES format. The title is separated from the SMILES by a space.
       *c1ccccc1 Ph
       *OC(=O)C OAc
       

      Available:

      obabel_superatoms.smi
      https://www.github.com/openbabel/superatoms
      Parameters:
      path - classpath or filesystem path to a SMILES file
      Returns:
      the number of loaded abbreviations
      Throws:
      IOException
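
      The "classpath resource or file" lookup and the returned count can be sketched roughly as follows. This is an illustration under assumptions, not the CDK code: the class SmilesFileLoadSketch and its methods are hypothetical, trying the classpath before the filesystem is an assumed order, and lines without a title are simply skipped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;

/** Hypothetical loader for "SMILES title" lines; counts the entries it accepts. */
class SmilesFileLoadSketch {

    /** Tries the classpath first, then falls back to the filesystem (assumed order). */
    static Reader open(String path) throws IOException {
        InputStream in = SmilesFileLoadSketch.class.getClassLoader().getResourceAsStream(path);
        if (in == null) {
            in = Files.newInputStream(Paths.get(path));
        }
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }

    /** Reads every line, stores label to SMILES, and returns the number of entries read. */
    static int load(Reader reader, Map<String, String> labelToSmiles) throws IOException {
        int count = 0;
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.trim().split("\\s+", 2);
                if (parts.length < 2 || parts[0].isEmpty()) {
                    continue; // blank line or SMILES without a title
                }
                labelToSmiles.put(parts[1], parts[0]);
                count++;
            }
        }
        return count;
    }

    /** Convenience wrapper for in-memory content, useful for trying the sketch out. */
    static int loadFromString(String content, Map<String, String> labelToSmiles) {
        try {
            return load(new StringReader(content), labelToSmiles);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      With this sketch, load(open("obabel_superatoms.smi"), map) would read the bundled file if it is on the classpath, and otherwise look for a file of that name in the working directory.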