Class RdfileReader

java.lang.Object
org.openscience.cdk.io.RdfileReader
All Implemented Interfaces:
Closeable, AutoCloseable, Iterator<RdfileRecord>

public final class RdfileReader extends Object implements Closeable, Iterator<RdfileRecord>
Iterating reader for RDFiles.

This class facilitates reading RDFiles. The RDFile specification was initially published in [Dalby, A. et al. Journal of Chemical Information and Computer Sciences. 1992. 32] and is now maintained by Dassault Systèmes [Dassault Systèmes, CTFile Formats Biovia Databases 2020, 2020, Dassault Systèmes, https://discover.3ds.com/sites/default/files/2020-08/biovia_ctfileformats_2020.pdf].

An RDFile is composed of
  1. an RDFile header
  2. one or more records where each record comprises
    1. an optional internal or external registry number
    2. a molecule represented as a MolFile in V2000 or V3000 format or a reaction represented as an RxnFile in V2000 or V3000 format
    3. an optional data block that consists of one or more (data field identifier, data) pairs
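The layout described above can be made concrete with a small, self-contained sketch. It is not how RdfileReader is implemented; it only illustrates the header/record/data-block structure. The keywords it relies on ($MFMT and $RFMT as record starts, $DTYPE and $DATUM for data pairs) are assumptions taken from the CTFile specification rather than from this page. The class name RdfileLayoutSketch and its methods are hypothetical, and multi-line data values or structures embedded inside a data value are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the RDFile layout; NOT part of CDK.
final class RdfileLayoutSketch {

    /**
     * Splits RDFile text into records. Assumption: a record starts at a line
     * beginning with $MFMT (molecule) or $RFMT (reaction). Lines before the
     * first record (the RDFile header) are dropped.
     */
    static List<List<String>> splitRecords(String rdfText) {
        List<List<String>> records = new ArrayList<>();
        List<String> current = null;
        for (String line : rdfText.split("\\R")) {
            if (line.startsWith("$MFMT") || line.startsWith("$RFMT")) {
                current = new ArrayList<>();
                records.add(current);
            }
            if (current != null) {
                current.add(line);
            }
        }
        return records;
    }

    /**
     * Collects the (data field identifier, data) pairs of one record.
     * Assumption: each pair is a "$DTYPE name" line followed by a single
     * "$DATUM value" line.
     */
    static Map<String, String> dataBlock(List<String> record) {
        Map<String, String> data = new LinkedHashMap<>();
        String key = null;
        for (String line : record) {
            if (line.startsWith("$DTYPE ")) {
                key = line.substring("$DTYPE ".length()).trim();
            } else if (line.startsWith("$DATUM ") && key != null) {
                data.put(key, line.substring("$DATUM ".length()).trim());
                key = null;
            }
        }
        return data;
    }
}
```

Calling splitRecords on the text of a file and then dataBlock on each returned record yields one map of data fields per record. For real files, RdfileReader should be used instead, since it also parses the embedded MolFile or RxnFile.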
Here is an example of how to read an RDF that is expected to only contain molecules:
 // read an RDF that is expected to only contain molecules
 List<IAtomContainer> molecules = new ArrayList<>();
 try (RdfileReader rdfileReader = new RdfileReader(new FileReader("molecules.rdf"), SilentChemObjectBuilder.getInstance())) {
     while (rdfileReader.hasNext()) {
         final RdfileRecord rdfileRecord = rdfileReader.next();
         if (rdfileRecord.isMolfile()) {
             molecules.add(rdfileRecord.getAtomContainer());
         } else {
             // create log entry or throw exception as only molecules are expected in this RDF
         }
     }
 }
 

By default, any remaining records are skipped if an error is encountered in a record. This can be changed by using one of the constructors that accept a boolean continueOnError argument: RdfileReader(InputStream, IChemObjectBuilder, boolean) or RdfileReader(Reader, IChemObjectBuilder, boolean).
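The difference between the two policies can be modelled in a few lines of plain Java. This is a simplified sketch of the behaviour described above, not CDK code: the class ErrorPolicySketch, the method readAll, and the use of a RuntimeException to signal a bad record are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical model of the two error policies; NOT part of CDK.
final class ErrorPolicySketch {

    /**
     * Parses raw records in order. When a record fails to parse:
     * if continueOnError is false, it and all remaining records are skipped;
     * if continueOnError is true, only the failing record is skipped.
     */
    static <T> List<T> readAll(List<String> rawRecords,
                               Function<String, T> parser,
                               boolean continueOnError) {
        List<T> parsed = new ArrayList<>();
        for (String raw : rawRecords) {
            try {
                parsed.add(parser.apply(raw));
            } catch (RuntimeException e) {
                if (!continueOnError) {
                    break; // stop here: the remaining records are skipped
                }
                // otherwise move on to the next record
            }
        }
        return parsed;
    }
}
```

With the inputs "1", "x", "3" and a parser based on Integer.parseInt, the sketch returns only the first value when continueOnError is false, and the first and third values when it is true. With RdfileReader itself, the equivalent choice is made by passing true or false as the third constructor argument.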

Author:
Uli Fechner
  • Constructor Details

    • RdfileReader

      public RdfileReader(InputStream in, IChemObjectBuilder chemObjectBuilder)
      Creates a new RdfileReader instance with the given InputStream and IChemObjectBuilder.
      Parameters:
      in - the InputStream serving the RDFile data
      chemObjectBuilder - the IChemObjectBuilder for creating CDK objects
    • RdfileReader

      public RdfileReader(InputStream in, IChemObjectBuilder chemObjectBuilder, boolean continueOnError)
      Creates a new RdfileReader instance with the given InputStream and IChemObjectBuilder.

      If continueOnError is true, the remaining records are processed when an error is encountered; if false, all remaining records in the file are skipped.

      Parameters:
      in - the InputStream serving the RDFile data
      chemObjectBuilder - the IChemObjectBuilder for creating CDK objects
      continueOnError - determines whether to continue processing records in case an error is encountered
    • RdfileReader

      public RdfileReader(Reader reader, IChemObjectBuilder chemObjectBuilder, boolean continueOnError)
      Creates a new RdfileReader instance with the given Reader and IChemObjectBuilder.

      If continueOnError is true, the remaining records are processed when an error is encountered; if false, all remaining records in the file are skipped.

      Parameters:
      reader - the Reader providing the RDFile data
      chemObjectBuilder - the IChemObjectBuilder for creating CDK objects
      continueOnError - determines whether to continue processing records in case an error is encountered
  • Method Details