Class DescriptorEngine


  • public class DescriptorEngine
    extends Object
    A class that provides access to automatic descriptor calculation and more.

    The aim of this class is to provide an easy-to-use interface for automatically evaluating all the CDK descriptors for a given molecule. Note that at any given time this class evaluates either all atomic descriptors or all molecular descriptors, but not both.

    The available descriptors are determined by scanning all the jar files in the user's CLASSPATH and selecting classes that belong to the CDK QSAR atomic or molecular descriptors package.

    An example of its usage would be

     IAtomContainer someMolecule;
     ...
     DescriptorEngine descriptorEngine = new DescriptorEngine(IMolecularDescriptor.class, null);
     descriptorEngine.process(someMolecule);
     

    The class allows the user to obtain a List of all the available descriptors in terms of their Java class names as well as instances of each descriptor class. For each descriptor, it is possible to obtain its classification as described in the CDK descriptor-algorithms OWL dictionary.

    See Also:
    DescriptorSpecification, Dictionary, OWLFile
    Source code:
    main
    Belongs to CDK module:
    qsarmolecular
    Created on:
    2004-12-02
    • Constructor Detail

      • DescriptorEngine

        public DescriptorEngine​(List<String> classNames,
                                IChemObjectBuilder builder)
        Instantiates the DescriptorEngine. This constructor instantiates the engine but does not perform any initialization. As a result, calling the process() method will fail. To use the engine via this constructor, use the following code:
         List<String> classNames = DescriptorEngine.getDescriptorClassNameByPackage("org.openscience.cdk.qsar.descriptors.molecular",
                                                                  null);
         DescriptorEngine engine = new DescriptorEngine(classNames, null);
         
         List instances = engine.instantiateDescriptors(classNames);
         List specs = engine.initializeSpecifications(instances);
         engine.setDescriptorInstances(instances);
         engine.setDescriptorSpecifications(specs);
         
         engine.process(someAtomContainer);
         
        This approach also allows one to find classes using the interface-based approach (getDescriptorClassNameByInterface(String, String[])). If you use this method, it is preferable to specify the jar files to examine.
      • DescriptorEngine

        public DescriptorEngine​(Class<? extends IDescriptor> c,
                                IChemObjectBuilder builder)
        Creates a descriptor engine for all descriptors of the given type. Descriptors are loaded using the service provider mechanism. To include custom descriptors, add to META-INF/services a file named after the interface you are providing (e.g. org.openscience.cdk.qsar.IMolecularDescriptor). This file lists, as class names, the implementations provided by the jar.
        Parameters:
        c - class of the descriptor to use (e.g. IMolecularDescriptor.class)
        See Also:
        Service Provider Interface (SPI) Introduction
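        The service provider lookup described above can be illustrated with the standard java.util.ServiceLoader class. This is a minimal plain-Java sketch of that mechanism, not CDK's own loading code: Descriptor is a hypothetical stand-in for an interface such as IMolecularDescriptor, and SpiSketch and loadAll are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class SpiSketch {
    /** Hypothetical stand-in for a descriptor interface such as IMolecularDescriptor. */
    interface Descriptor {
        String name();
    }

    /**
     * Collects every implementation of the given service type that some jar
     * on the classpath declares in a META-INF/services file named after the type.
     */
    static <T> List<T> loadAll(Class<T> serviceType) {
        List<T> found = new ArrayList<>();
        for (T implementation : ServiceLoader.load(serviceType)) {
            found.add(implementation);
        }
        return found;
    }

    public static void main(String[] args) {
        // No jar declares a provider for this stand-in interface,
        // so nothing is expected to be discovered here.
        List<Descriptor> descriptors = loadAll(Descriptor.class);
        System.out.println(descriptors.size() + " provider(s) found");
    }
}
```

        For a custom molecular descriptor, the provider file would be META-INF/services/org.openscience.cdk.qsar.IMolecularDescriptor, containing one implementation class name per line (for example a hypothetical com.example.MyDescriptor).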
    • Method Detail

      • process

        public void process​(IAtomContainer molecule)
                     throws CDKException
        Calculates all available (or only those specified) descriptors for a molecule. The result for each descriptor, together with its associated parameters and specification, is used to create a DescriptorValue object, which is then added to the molecule as a property keyed on the DescriptorSpecification object for that descriptor.
        Parameters:
        molecule - The molecule for which we want to calculate descriptors
        Throws:
        CDKException - if an error occurred during descriptor calculation or the descriptors and/or specifications have not been initialized
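        The storage scheme described above (one value per descriptor, attached to the molecule as a property keyed on the specification object) can be modelled in plain Java. The sketch below does not call CDK; Spec, Value and Molecule are hypothetical stand-ins for the specification, DescriptorValue and molecule types, and the stored numbers are placeholders rather than real descriptor results.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PropertyKeySketch {
    /** Hypothetical stand-in for a descriptor specification; used only as a map key. */
    static final class Spec {
        final String reference;
        Spec(String reference) { this.reference = reference; }
    }

    /** Hypothetical stand-in for a DescriptorValue: the specification plus a result. */
    static final class Value {
        final Spec spec;
        final double result;
        Value(Spec spec, double result) { this.spec = spec; this.result = result; }
    }

    /** Hypothetical stand-in for a molecule with an object-keyed property map. */
    static final class Molecule {
        private final Map<Object, Object> properties = new HashMap<>();
        void setProperty(Object key, Object value) { properties.put(key, value); }
        Object getProperty(Object key) { return properties.get(key); }
    }

    /** Mimics process(): stores one Value per specification, keyed on the specification object. */
    static void process(Molecule molecule, List<Spec> specs) {
        double placeholder = 1.0; // placeholder numbers, not real descriptor results
        for (Spec spec : specs) {
            molecule.setProperty(spec, new Value(spec, placeholder));
            placeholder += 1.0;
        }
    }

    public static void main(String[] args) {
        List<Spec> specs = new ArrayList<>();
        specs.add(new Spec("hypothetical#descriptorA"));
        specs.add(new Spec("hypothetical#descriptorB"));

        Molecule molecule = new Molecule();
        process(molecule, specs);

        // The specification objects double as the lookup keys afterwards.
        for (Spec spec : specs) {
            Value value = (Value) molecule.getProperty(spec);
            System.out.println(spec.reference + " = " + value.result);
        }
    }
}
```

        With the real engine, the keys for this kind of lookup would come from getDescriptorSpecifications().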
      • getDictionaryType

        public String getDictionaryType​(String identifier)
        Returns the type of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the first child element called "isClassifiedAs" is returned. Note that the OWL descriptor spec allows both the class of a descriptor (electronic, topological, etc.) and the type of a descriptor (molecular, atomic) to be specified in an "isClassifiedAs" element, so any such element that indicates the descriptor's class is ignored. The method assumes that any descriptor entry has only one "isClassifiedAs" element describing the descriptor's type. The descriptor can be identified either by the name of the class implementing it or by its specification reference value, which can be obtained from an instance of the descriptor class.
        Parameters:
        identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
        Returns:
        The type of the descriptor as stored in the dictionary, null if no entry is found matching the supplied identifier
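        The selection rule described above (skip classifications that name a descriptor class, return the first one that names a descriptor type, and return null when nothing matches) can be sketched in plain Java. This is a simplified model and not the CDK implementation: the label strings and the idea of passing the "isClassifiedAs" values as a list are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DictionaryTypeSketch {
    /** Assumed labels that denote a descriptor type rather than a descriptor class. */
    private static final Set<String> TYPE_LABELS =
            new HashSet<>(Arrays.asList("molecularDescriptor", "atomicDescriptor"));

    /**
     * Returns the first classification that names a descriptor type, or null
     * when the entry is missing (null list) or carries no type classification.
     */
    static String dictionaryType(List<String> isClassifiedAs) {
        if (isClassifiedAs == null) {
            return null; // no dictionary entry matched the identifier
        }
        for (String label : isClassifiedAs) {
            if (TYPE_LABELS.contains(label)) {
                return label; // class labels such as "topologicalDescriptor" are skipped
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> entry = Arrays.asList("topologicalDescriptor", "molecularDescriptor");
        System.out.println(dictionaryType(entry));
    }
}
```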
      • getDictionaryType

        public String getDictionaryType​(IImplementationSpecification descriptorSpecification)
        Returns the type of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the first child element called "isClassifiedAs" is returned. Note that the OWL descriptor spec allows both the class of a descriptor (electronic, topological, etc.) and the type of a descriptor (molecular, atomic) to be specified in an "isClassifiedAs" element, so any such element that indicates the descriptor's class is ignored. The method assumes that any descriptor entry has only one "isClassifiedAs" element describing the descriptor's type. The descriptor is identified by its DescriptorSpecification object.
        Parameters:
        descriptorSpecification - A DescriptorSpecification object
        Returns:
        The type of the descriptor as stored in the dictionary, null if no entry is found matching the supplied identifier
      • getDictionaryClass

        public String[] getDictionaryClass​(String identifier)
        Returns the class(es) of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the meta-data list is examined for a dictRef attribute that contains a descriptorClass value. If such an attribute is found, the value of the contents attribute is added to a list. Since a descriptor may be classed in multiple ways (geometric and electronic, for example), a given descriptor will in general have multiple classes associated with it. The descriptor can be identified either by the name of the class implementing it or by its specification reference value, which can be obtained from an instance of the descriptor class.
        Parameters:
        identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
        Returns:
        An array containing the names of the QSAR descriptor classes that this descriptor was declared to belong to. If no entry is found for the specified identifier, null is returned.
      • getDictionaryClass

        public String[] getDictionaryClass​(IImplementationSpecification descriptorSpecification)
        Returns the class(es) of the descriptor as defined in the descriptor dictionary. The method looks for the identifier specified by the user in the QSAR descriptor dictionary. If a corresponding entry is found, the meta-data list is examined for a dictRef attribute that contains a descriptorClass value. If such an attribute is found, the value of the contents attribute is added to a list. Since a descriptor may be classed in multiple ways (geometric and electronic, for example), a given descriptor will in general have multiple classes associated with it. The descriptor is identified by its DescriptorSpecification object.
        Parameters:
        descriptorSpecification - A DescriptorSpecification object
        Returns:
        An array containing the names of the QSAR descriptor classes that this descriptor was declared to belong to. If no entry is found for the specified identifier, null is returned.
      • getDictionaryDefinition

        public String getDictionaryDefinition​(String identifier)
        Gets the definition of the descriptor. All descriptors in the descriptor dictionary have a definition element, and this method returns the value of that element. Many descriptors also have a description element, which is more detailed. However, the value of a description element can contain arbitrary markup (such as MathML), and the form in which it should be returned is undecided, so only the definition is returned.
        Parameters:
        identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
        Returns:
        The definition
      • getDictionaryDefinition

        public String getDictionaryDefinition​(DescriptorSpecification descriptorSpecification)
        Gets the definition of the descriptor. All descriptors in the descriptor dictionary have a definition element, and this method returns the value of that element. Many descriptors also have a description element, which is more detailed. However, the value of a description element can contain arbitrary markup (such as MathML), and the form in which it should be returned is undecided, so only the definition is returned.
        Parameters:
        descriptorSpecification - A DescriptorSpecification object
        Returns:
        The definition
      • getDictionaryTitle

        public String getDictionaryTitle​(String identifier)
        Gets the label (title) of the descriptor.
        Parameters:
        identifier - A String containing either the descriptor's fully qualified class name or the descriptor's specification reference
        Returns:
        The title
      • getDictionaryTitle

        public String getDictionaryTitle​(DescriptorSpecification descriptorSpecification)
        Gets the label (title) of the descriptor.
        Parameters:
        descriptorSpecification - The specification object
        Returns:
        The title
      • getDescriptorSpecifications

        public List<IImplementationSpecification> getDescriptorSpecifications()
        Returns the DescriptorSpecification objects for all available descriptors.
        Returns:
        A list of DescriptorSpecification objects. These are the keys with which the DescriptorValue objects can be obtained from a molecule's property list.
      • getDescriptorClassNames

        public List<String> getDescriptorClassNames()
        Returns a list containing the names of the classes implementing the descriptors.
        Returns:
        A list of class names.
      • getDescriptorInstances

        public List<IDescriptor> getDescriptorInstances()
        Returns a List containing the instantiated descriptor classes.
        Returns:
        A List containing the descriptor instances
      • setDescriptorInstances

        public void setDescriptorInstances​(List<IDescriptor> descriptors)
        Sets the list of descriptor objects.
        Parameters:
        descriptors - A List of descriptor objects
        See Also:
        getDescriptorInstances()
      • getAvailableDictionaryClasses

        public String[] getAvailableDictionaryClasses()
        Gets all the unique dictionary classes that the descriptors belong to.
        Returns:
        An array containing the unique dictionary classes.
      • getDescriptorClassNameByInterface

        public static List<String> getDescriptorClassNameByInterface​(String interfaceName,
                                                                     String[] jarFileNames)
        Returns a list containing the classes that implement a specific interface. The interface name specified can be null or an empty string, in which case it is automatically set to IDescriptor. Specifying IDescriptor returns all available descriptor classes. Valid interface names are
        • IMolecularDescriptor
        • IAtomicDescriptor
        • IBondDescriptor
        • IDescriptor
        Parameters:
        interfaceName - The name of the interface that classes should implement
        jarFileNames - A String[] containing the fully qualified names of the jar files to examine for descriptor classes. In general this can be set to null, in which case the system classpath is examined for available jar files. This parameter can be set for situations where the system classpath is not available or has been modified, such as in an application container.
        Returns:
        A list containing the classes implementing the specified interface, null if an invalid interface is specified
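        The handling of the interfaceName argument described above (null or empty falls back to IDescriptor, and an unrecognised name yields null) can be sketched in plain Java. This models only the documented contract and is not CDK code; InterfaceNameSketch and resolve are illustrative names, and returning a null name here stands in for the method returning a null list.

```java
import java.util.Arrays;
import java.util.List;

class InterfaceNameSketch {
    /** The interface names listed as valid in the documentation above. */
    private static final List<String> VALID_NAMES = Arrays.asList(
            "IMolecularDescriptor", "IAtomicDescriptor", "IBondDescriptor", "IDescriptor");

    /**
     * Returns the interface name that would be searched for:
     * IDescriptor when none is given, the name itself when it is valid,
     * and null when it is not one of the valid names.
     */
    static String resolve(String interfaceName) {
        if (interfaceName == null || interfaceName.isEmpty()) {
            return "IDescriptor";
        }
        return VALID_NAMES.contains(interfaceName) ? interfaceName : null;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));
        System.out.println(resolve("IBondDescriptor"));
        System.out.println(resolve("NotADescriptorInterface"));
    }
}
```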
      • getDescriptorClassNameByPackage

        public static List<String> getDescriptorClassNameByPackage​(String packageName,
                                                                   String[] jarFileNames)
        Returns a list containing the classes found in the specified descriptor package. The package name specified can be null or an empty string, in which case it is automatically set to "org.openscience.cdk.qsar.descriptors"; the result then contains classes corresponding to both atomic and molecular descriptors.
        Parameters:
        packageName - The name of the package containing the required descriptor
        jarFileNames - A String[] containing the fully qualified names of the jar files to examine for descriptor classes. In general this can be set to null, in which case the system classpath is examined for available jar files. This parameter can be set for situations where the system classpath is not available or has been modified, such as in an application container.
        Returns:
        A list containing the classes in the specified package
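        A package-based scan of the kind described above can be approximated with the standard java.util.jar API. The sketch below is one possible way to do it and is not taken from CDK: PackageScanSketch, toClassName and classNamesInJar are illustrative names, and skipping nested classes (entry names containing '$') is an assumption of this sketch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class PackageScanSketch {
    /** Default package named in the documentation above. */
    static final String DEFAULT_PACKAGE = "org.openscience.cdk.qsar.descriptors";

    /**
     * Converts a jar entry name to a class name if it is a top-level class
     * in or below the given package; returns null otherwise.
     */
    static String toClassName(String entryName, String packageName) {
        String pkg = (packageName == null || packageName.isEmpty()) ? DEFAULT_PACKAGE : packageName;
        if (!entryName.endsWith(".class") || entryName.contains("$")) {
            return null; // not a class file, or a nested class (skipped by assumption)
        }
        String className = entryName
                .substring(0, entryName.length() - ".class".length())
                .replace('/', '.');
        return className.startsWith(pkg + ".") ? className : null;
    }

    /** Lists the matching class names found in a single jar file. */
    static List<String> classNamesInJar(String jarPath, String packageName) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            for (JarEntry entry : Collections.list(jar.entries())) {
                String name = toClassName(entry.getName(), packageName);
                if (name != null) {
                    names.add(name);
                }
            }
        }
        return names;
    }
}
```

        Because the default package is matched as a prefix, classes in both the atomic and the molecular sub-packages are returned when no package name is given.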