Class DescriptorEngine

java.lang.Object
org.openscience.cdk.qsar.DescriptorEngine

public class DescriptorEngine extends Object
A class that provides access to automatic descriptor calculation and more.

The aim of this class is to provide an easy-to-use interface for automatically evaluating all the CDK descriptors for a given molecule. Note that at any given time this class evaluates either all atomic or all molecular descriptors, but not both.

The available descriptors are determined by scanning all the jar files in the user's CLASSPATH and selecting classes that belong to the CDK QSAR atomic or molecular descriptors package.

An example of its usage:

 IAtomContainer someMolecule;
 ...
 DescriptorEngine descriptorEngine = new DescriptorEngine(IMolecularDescriptor.class, null);
 descriptorEngine.process(someMolecule);
 

The class allows the user to obtain a List of all the available descriptors in terms of their Java class names as well as instances of each descriptor class. For each descriptor, it is possible to obtain its classification as described in the CDK descriptor-algorithms OWL dictionary.

See Also:
Source code:
main
Belongs to CDK module:
qsarmolecular
Created on:
2004-12-02
  • Constructor Details

    • DescriptorEngine

      public DescriptorEngine(List<String> classNames, IChemObjectBuilder builder)
      Instantiates the DescriptorEngine. This constructor creates the engine but does not perform any initialization, so calling process() directly will fail. To use the engine via this constructor, use code such as the following:
       List<String> classNames = DescriptorEngine.getDescriptorClassNameByPackage("org.openscience.cdk.qsar.descriptors.molecular",
                                                                null);
       DescriptorEngine engine = new DescriptorEngine(classNames, null);
       
       List<IDescriptor> instances = engine.instantiateDescriptors(classNames);
       List<IImplementationSpecification> specs = engine.initializeSpecifications(instances);
       engine.setDescriptorInstances(instances);
       engine.setDescriptorSpecifications(specs);
       
       engine.process(someAtomContainer);
       
      This approach also allows classes to be found using the interface-based approach (getDescriptorClassNameByInterface(String, String[])). With that method it is preferable to specify the jar files to examine.
    • DescriptorEngine

      public DescriptorEngine(Class<? extends IDescriptor> c, IChemObjectBuilder builder)
      Creates a descriptor engine for all descriptors of the given type. Descriptors are loaded using the service provider mechanism. To include custom descriptors, add to META-INF/services a file named after the interface being implemented (e.g. org.openscience.cdk.qsar.IMolecularDescriptor). This file lists, as class names, the implementations provided by the jar.
      Parameters:
      c - class of the descriptor to use (e.g. IMolecularDescriptor.class)
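      The service provider lookup described above can be sketched with the JDK alone. This is a minimal illustration, assuming the mechanism referred to is java.util.ServiceLoader; the HypotheticalDescriptor interface and the providerClassNames helper are stand-ins invented for the example and are not part of CDK. A provider file is a plain text file under META-INF/services, named after the interface, with one implementation class name per line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class ProviderLookupSketch {

    /** Hypothetical stand-in for a descriptor interface such as IMolecularDescriptor. */
    interface HypotheticalDescriptor {
        String name();
    }

    /** Returns the class names of all implementations registered for the given service interface. */
    static <T> List<String> providerClassNames(Class<T> service) {
        List<String> names = new ArrayList<>();
        // ServiceLoader reads META-INF/services/<interface name> from each classpath entry
        // and instantiates every class listed there via its no-argument constructor.
        for (T provider : ServiceLoader.load(service)) {
            names.add(provider.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // No provider file exists for the stand-in interface, so the list is empty here.
        System.out.println(providerClassNames(HypotheticalDescriptor.class));
    }
}
```

      With a real descriptor jar on the classpath, the same call made with the descriptor interface would be expected to list the classes named in that jar's provider file.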
  • Method Details

    • process

      public void process(IAtomContainer molecule) throws CDKException
      Calculates all available (or only the specified) descriptors for a molecule. The result for a given descriptor, together with its associated parameters and specification, is used to create a DescriptorValue object, which is then added to the molecule as a property keyed on the DescriptorSpecification object for that descriptor.
      Parameters:
      molecule - The molecule for which we want to calculate descriptors
      Throws:
      CDKException - if an error occurred during descriptor calculation or the descriptors and/or specifications have not been initialized
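      The storage scheme described above (one result per descriptor, stored on the molecule under its specification object) can be modelled in a few lines. This is only a sketch of the idea: Spec, Value and Molecule are hypothetical stand-ins written for this example, not the CDK classes, and the reference string is made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class PropertyKeySketch {

    /** Hypothetical stand-in for a descriptor specification, usable as a map key. */
    static final class Spec {
        final String reference;
        Spec(String reference) { this.reference = reference; }
        @Override public boolean equals(Object o) {
            return o instanceof Spec && ((Spec) o).reference.equals(reference);
        }
        @Override public int hashCode() { return Objects.hash(reference); }
    }

    /** Hypothetical stand-in for a descriptor result bundled with its specification. */
    static final class Value {
        final Spec spec;
        final double result;
        Value(Spec spec, double result) { this.spec = spec; this.result = result; }
    }

    /** Hypothetical stand-in for a molecule with a property list. */
    static final class Molecule {
        private final Map<Object, Object> properties = new HashMap<>();
        void setProperty(Object key, Object value) { properties.put(key, value); }
        Object getProperty(Object key) { return properties.get(key); }
    }

    /** Mimics the described behaviour: wrap the result and store it under its specification. */
    static void process(Molecule molecule, Spec spec, double result) {
        molecule.setProperty(spec, new Value(spec, result));
    }

    public static void main(String[] args) {
        Spec spec = new Spec("hypothetical#atomCount");
        Molecule molecule = new Molecule();
        process(molecule, spec, 12.0);
        // The specification is the lookup key for the stored value.
        Value value = (Value) molecule.getProperty(spec);
        System.out.println(value.result);
    }
}
```

      With the real engine, the keys would come from getDescriptorSpecifications(), and each would be used to look up the corresponding DescriptorValue in the molecule's properties after process() returns.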
    • getDictionaryType

      public String getDictionaryType(String identifier)
      Returns the type of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the first child element called "isClassifiedAs" is returned. Note that the OWL descriptor spec allows both the class of descriptor (electronic, topological etc.) and the type of descriptor (molecular, atomic) to be specified in an "isClassifiedAs" element. Any such element that indicates the descriptor's class is therefore ignored. The method assumes that any descriptor entry has only one "isClassifiedAs" entry describing the descriptor's type. The descriptor can be identified either by the name of the class implementing the descriptor or by the specification reference value of the descriptor, which can be obtained from an instance of the descriptor class.
      Parameters:
      identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
      Returns:
      The type of the descriptor as stored in the dictionary, null if no entry is found matching the supplied identifier
    • getDictionaryType

      public String getDictionaryType(IImplementationSpecification descriptorSpecification)
      Returns the type of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the first child element called "isClassifiedAs" is returned. Note that the OWL descriptor spec allows both the class of descriptor (electronic, topological etc.) and the type of descriptor (molecular, atomic) to be specified in an "isClassifiedAs" element. Any such element that indicates the descriptor's class is therefore ignored. The method assumes that any descriptor entry has only one "isClassifiedAs" entry describing the descriptor's type. The descriptor is identified by its DescriptorSpecification object.
      Parameters:
      descriptorSpecification - A DescriptorSpecification object
      Returns:
      The type of the descriptor as stored in the dictionary, null if no entry is found matching the supplied identifier
    • getDictionaryClass

      public String[] getDictionaryClass(String identifier)
      Returns the class(es) of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the meta-data list is examined for a dictRef attribute that contains a descriptorClass value. If such an attribute is found, the value of the contents attribute is added to a list. Since a descriptor may be classed in multiple ways (geometric and electronic, for example), a given descriptor will in general have multiple classes associated with it. The descriptor can be identified either by the name of the class implementing the descriptor or by the specification reference value of the descriptor, which can be obtained from an instance of the descriptor class.
      Parameters:
      identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
      Returns:
      An array containing the names of the QSAR descriptor classes that this descriptor was declared to belong to. If no entry is found for the specified identifier, null is returned.
    • getDictionaryClass

      public String[] getDictionaryClass(IImplementationSpecification descriptorSpecification)
      Returns the class(es) of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the meta-data list is examined for a dictRef attribute that contains a descriptorClass value. If such an attribute is found, the value of the contents attribute is added to a list. Since a descriptor may be classed in multiple ways (geometric and electronic, for example), a given descriptor will in general have multiple classes associated with it. The descriptor is identified by its DescriptorSpecification object.
      Parameters:
      descriptorSpecification - A DescriptorSpecification object
      Returns:
      An array containing the names of the QSAR descriptor classes that this descriptor was declared to belong to. If no entry is found for the specified identifier, null is returned.
    • getDictionaryDefinition

      public String getDictionaryDefinition(String identifier)
      Gets the definition of the descriptor. All descriptors in the descriptor dictionary have a definition element, and this method returns the value of that element. Many descriptors also have a more detailed description element. However, the value of that element can contain arbitrary markup (such as MathML), and it is unclear in what form it should be returned, so only the definition is returned.
      Parameters:
      identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
      Returns:
      The definition
    • getDictionaryDefinition

      public String getDictionaryDefinition(DescriptorSpecification descriptorSpecification)
      Gets the definition of the descriptor. All descriptors in the descriptor dictionary have a definition element, and this method returns the value of that element. Many descriptors also have a more detailed description element. However, the value of that element can contain arbitrary markup (such as MathML), and it is unclear in what form it should be returned, so only the definition is returned.
      Parameters:
      descriptorSpecification - A DescriptorSpecification object
      Returns:
      The definition
    • getDictionaryTitle

      public String getDictionaryTitle(String identifier)
      Gets the label (title) of the descriptor.
      Parameters:
      identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
      Returns:
      The title
    • getDictionaryTitle

      public String getDictionaryTitle(DescriptorSpecification descriptorSpecification)
      Gets the label (title) of the descriptor.
      Parameters:
      descriptorSpecification - The specification object
      Returns:
      The title
    • getDescriptorSpecifications

      public List<IImplementationSpecification> getDescriptorSpecifications()
      Returns the DescriptorSpecification objects for all available descriptors.
      Returns:
      A list of DescriptorSpecification objects. These are the keys with which the DescriptorValue objects can be obtained from a molecule's property list.
    • setDescriptorSpecifications

      public void setDescriptorSpecifications(List<IImplementationSpecification> specs)
      Set the list of DescriptorSpecification objects.
      Parameters:
      specs - A list of specification objects
    • getDescriptorClassNames

      public List<String> getDescriptorClassNames()
      Returns a list containing the names of the classes implementing the descriptors.
      Returns:
      A list of class names.
    • getDescriptorInstances

      public List<IDescriptor> getDescriptorInstances()
      Returns a List containing the instantiated descriptor classes.
      Returns:
      A List containing the descriptor instances
    • setDescriptorInstances

      public void setDescriptorInstances(List<IDescriptor> descriptors)
      Set the list of Descriptor objects.
      Parameters:
      descriptors - A List of descriptor objects
    • getAvailableDictionaryClasses

      public String[] getAvailableDictionaryClasses()
      Gets all the unique dictionary classes that the descriptors belong to.
      Returns:
      An array containing the unique dictionary classes.
    • getDescriptorClassNameByInterface

      public static List<String> getDescriptorClassNameByInterface(String interfaceName, String[] jarFileNames)
      Returns a list containing the classes that implement a specific interface. The interface name can be null or an empty string, in which case it is automatically set to IDescriptor. Specifying IDescriptor returns all available descriptor classes. Valid interface names are
      • IMolecularDescriptor
      • IAtomicDescriptor
      • IBondDescriptor
      • IDescriptor
      Parameters:
      interfaceName - The name of the interface that classes should implement
      jarFileNames - A String[] containing the fully qualified names of the jar files to examine for descriptor classes. In general this can be set to null, in which case the system classpath is examined for available jar files. This parameter is useful in situations where the system classpath is not available or has been modified, such as in an application container.
      Returns:
      A list containing the classes implementing the specified interface, null if an invalid interface is specified
    • getDescriptorClassNameByPackage

      public static List<String> getDescriptorClassNameByPackage(String packageName, String[] jarFileNames)
      Returns a list containing the classes found in the specified descriptor package. The package name can be null or an empty string, in which case it is automatically set to "org.openscience.cdk.qsar.descriptors", so the result contains classes for both atomic and molecular descriptors.
      Parameters:
      packageName - The name of the package containing the required descriptor
      jarFileNames - A String[] containing the fully qualified names of the jar files to examine for descriptor classes. In general this can be set to null, in which case the system classpath is examined for available jar files. This parameter is useful in situations where the system classpath is not available or has been modified, such as in an application container.
      Returns:
      A list containing the classes in the specified package
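      The package-based lookup described above amounts to listing the .class entries of each jar that fall under the package path. The following is a rough sketch of that idea using only the JDK; it is not the CDK implementation, and the helper names classNamesInPackage and jarsOnClasspath are invented for this example. Skipping nested classes (entries containing '$') is an assumption made here to keep the output readable.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class JarScanSketch {

    /** Lists the top-level classes under packageName (including sub-packages) in the given jars. */
    static List<String> classNamesInPackage(String packageName, String[] jarFileNames) throws IOException {
        String prefix = packageName.replace('.', '/') + "/";
        List<String> classNames = new ArrayList<>();
        for (String jarFileName : jarFileNames) {
            try (JarFile jar = new JarFile(jarFileName)) {
                Enumeration<JarEntry> entries = jar.entries();
                while (entries.hasMoreElements()) {
                    String entry = entries.nextElement().getName();
                    // Keep class files under the package path; '$' marks nested or anonymous classes.
                    if (entry.startsWith(prefix) && entry.endsWith(".class") && entry.indexOf('$') < 0) {
                        String path = entry.substring(0, entry.length() - ".class".length());
                        classNames.add(path.replace('/', '.'));
                    }
                }
            }
        }
        return classNames;
    }

    /** Collects the jar files named on the system classpath, for use when no jars are given. */
    static String[] jarsOnClasspath() {
        List<String> jars = new ArrayList<>();
        for (String element : System.getProperty("java.class.path", "").split(File.pathSeparator)) {
            if (element.endsWith(".jar")) {
                jars.add(element);
            }
        }
        return jars.toArray(new String[0]);
    }

    public static void main(String[] args) throws IOException {
        // Jar paths may be passed as arguments; otherwise the system classpath is scanned.
        String[] jars = args.length > 0 ? args : jarsOnClasspath();
        System.out.println(classNamesInPackage("org.openscience.cdk.qsar.descriptors", jars));
    }
}
```

      Passing explicit jar paths corresponds to the jarFileNames parameter; the classpath fallback mirrors what the documentation describes for a null argument.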
    • instantiateDescriptors

      public List<IDescriptor> instantiateDescriptors(List<String> descriptorClassNames)
    • initializeSpecifications

      public List<IImplementationSpecification> initializeSpecifications(List<IDescriptor> descriptors)