java.lang.Object
- org.openscience.cdk.isomorphism.Mappings

All Implemented Interfaces:: Iterable<int[]>

public final class Mappings
extends Object
implements Iterable<int[]>

A fluent interface for handling (sub)-graph mappings from a query to a target structure. The utility allows one to modify the mappings and provides convenience utilities. Mappings are obtained from a (sub)-graph matching using Pattern.

 IAtomContainer query  = ...;
 IAtomContainer target = ...;

 Mappings mappings = Pattern.findSubstructure(query)
                            .matchAll(target);

The primary function is to provide an iterable of matches - each match is a permutation (mapping) of the query graph indices (atom indices).


 for (int[] p : mappings) {
     for (int i = 0; i < p.length; i++) {
         // query.getAtom(i) is mapped to target.getAtom(p[i])
     }
 }

The matches can be filtered to provide only those that have valid stereochemistry.

 for (int[] p : mappings.stereochemistry()) {
     // ...
 }

Unique matches can be obtained for both atoms and bonds.

 for (int[] p : mappings.uniqueAtoms()) {
     // ...
 }

 for (int[] p : mappings.uniqueBonds()) {
     // ...
 }

Matches may be lazily generated, so iterating over the mappings twice (as above) will actually perform two graph matchings. If the mappings are needed for subsequent use, toArray() provides the permutations as a fixed-size array.

 int[][] ps = mappings.toArray();
 for (int[] p : ps) {
    // ...
 }
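
The cost of iterating twice can be illustrated without CDK. The following is a minimal sketch in plain Java: LazyMatchesDemo is a hypothetical stand-in for a lazily generated Iterable<int[]>, and its searches counter is an assumption added only to make the repeated work visible. It does not reproduce how Mappings is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a lazily generated set of matches.
class LazyMatchesDemo implements Iterable<int[]> {
    // how many times the (pretend) search has been started
    int searches = 0;

    @Override
    public Iterator<int[]> iterator() {
        searches++; // every new iterator restarts the search
        return Arrays.asList(new int[]{0, 1}, new int[]{1, 0}).iterator();
    }

    // copy all matches in a single pass, similar in spirit to toArray()
    int[][] toArray() {
        List<int[]> out = new ArrayList<>();
        for (int[] p : this) out.add(p);
        return out.toArray(new int[0][]);
    }

    // iterate the lazy source twice: the search runs twice
    static int searchesWhenIteratingTwice() {
        LazyMatchesDemo m = new LazyMatchesDemo();
        for (int[] p : m) { /* first use */ }
        for (int[] p : m) { /* second use */ }
        return m.searches;
    }

    // fix the matches first, then iterate the array twice: one search
    static int searchesWhenFixedFirst() {
        LazyMatchesDemo m = new LazyMatchesDemo();
        int[][] ps = m.toArray();
        for (int[] p : ps) { /* first use */ }
        for (int[] p : ps) { /* second use */ }
        return m.searches;
    }
}
```

In this sketch the first method reports two searches and the second reports one, which is the trade-off described above: the array costs memory but avoids repeating the match.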

Graphs with a high number of automorphisms can produce many valid matchings. Operations can be combined, for example to limit the number of matches retrieved.

 // first ten matches
 for (int[] p : mappings.limit(10)) {
     // ...
 }

 // first 10 unique matches
 for (int[] p : mappings.uniqueAtoms()
                        .limit(10)) {
     // ...
 }

 // ensure we don't waste memory and only 'fix' up to 100 unique matches
 int[][] ps = mappings.uniqueAtoms()
                      .limit(100)
                      .toArray();

There are no restrictions on which operations can be applied or how many times, but the order of operations may change the result.

 // first 100 unique matches
 Mappings m = mappings.uniqueAtoms()
                      .limit(100);

 // unique matches in the first 100 matches
 Mappings m = mappings.limit(100)
                      .uniqueAtoms();

 // first 10 unique matches in the first 100 matches
 Mappings m = mappings.limit(100)
                      .uniqueAtoms()
                      .limit(10);

 // number of unique atom matches
 int n = mappings.countUnique();

 // number of unique atom matches with correct stereochemistry
 int n = mappings.stereochemistry()
                 .countUnique();
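
Why the order matters can be seen with a small plain-Java model. This is a sketch under stated assumptions: mappings are represented as a fixed list of int[] rather than a real Mappings instance, and the helper names (atomSetKey, uniqueAtoms, limit) are hypothetical functions written for this illustration, where "unique" means "covers a set of target atoms not seen before".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OrderOfOperationsDemo {
    // order-independent key for the set of target atoms a mapping covers
    static String atomSetKey(int[] p) {
        int[] sorted = p.clone();
        Arrays.sort(sorted);
        return Arrays.toString(sorted);
    }

    // keep the first mapping seen for each distinct set of target atoms
    static List<int[]> uniqueAtoms(List<int[]> ms) {
        Set<String> seen = new HashSet<>();
        List<int[]> out = new ArrayList<>();
        for (int[] p : ms)
            if (seen.add(atomSetKey(p))) out.add(p);
        return out;
    }

    // keep at most the first n mappings
    static List<int[]> limit(List<int[]> ms, int n) {
        return ms.subList(0, Math.min(n, ms.size()));
    }

    // four mappings covering two distinct atom sets: {0,1} and {1,2}
    static List<int[]> sample() {
        return Arrays.asList(new int[]{0, 1}, new int[]{1, 0},
                             new int[]{1, 2}, new int[]{2, 1});
    }

    public static void main(String[] args) {
        List<int[]> all = sample();
        // first 2 unique matches
        System.out.println(limit(uniqueAtoms(all), 2).size());
        // unique matches within the first 2 matches
        System.out.println(uniqueAtoms(limit(all, 2)).size());
    }
}
```

With this sample, taking the unique matches first and then limiting to two keeps both atom sets, while limiting to two first leaves only mappings that cover the same atoms, so a single unique match remains.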

Author:: John May
See Also:: Pattern
Source code:: main
Belongs to CDK module:: isomorphism
Keywords:: substructure search, structure search, mappings, matching

Method Summary

Modifier and Type	Method	Description
`boolean`	`atLeast(int n)`	Efficiently determine if there are at least 'n' matches.
`int`	`count()`	Convenience method to count the number of mappings.
`int`	`countUnique()`	Convenience method to count the number of unique atom mappings.
`Mappings`	`exclusiveAtoms()`	Filter the mappings for those which cover an exclusive set of atoms in the target.
`Mappings`	`filter(Predicate<int[]> predicate)`	Filter the mappings and keep only those which match the provided predicate (Guava).
`int[]`	`first()`	Obtain the first match - if there is no first match an empty array is returned.
`Iterator<int[]>`	`iterator()`
`Mappings`	`limit(int limit)`	Limit the number of mappings - only this number of mappings will be generated.
`<T> Iterable<T>`	`map(Function<int[],T> f)`	Map the mappings to another type.
`Mappings`	`stereochemistry()`	Deprecated. Results now automatically consider stereo if it's present, to match without stereochemistry remove the stereo features.
`Stream<int[]>`	`stream()`	Convert the Mappings to a Java 8 `Stream`.
`int[][]`	`toArray()`	Mappings are lazily generated and best used in a loop.
`Iterable<Map<IChemObject,IChemObject>>`	`toAtomBondMap()`	Convert the permutations to an atom-atom bond-bond map.
`Iterable<Map<IAtom,IAtom>>`	`toAtomMap()`	Convert the permutations to an atom-atom map.
`Iterable<Map<IBond,IBond>>`	`toBondMap()`	Convert the permutations to a bond-bond map.
`Iterable<IChemObject>`	`toChemObjects()`	Obtain the chem objects (atoms and bonds) that have 'hit' in the target molecule.
`Iterable<IAtomContainer>`	`toSubstructures()`	Obtain the mapped substructures (atoms/bonds) of the target compound.
`Stream<IAtomContainer>`	`toSubstructuresStream()`	Obtain the mapped substructures (atoms/bonds) of the target compound.
`Mappings`	`uniqueAtoms()`	Filter the mappings for those which cover a unique set of atoms in the target.
`Mappings`	`uniqueBonds()`	Filter the mappings for those which cover a unique set of bonds in the target.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.lang.Iterable
forEach, spliterator

Method Detail

filter

public Mappings filter(Predicate<int[]> predicate)

Filter the mappings and keep only those which match the provided predicate (Guava).



     final IAtomContainer query;
     final IAtomContainer target;

     // obtain only the mappings where the first atom in the query is
     // mapped to the first atom in the target
     Mappings mappings = Pattern.findSubstructure(query)
                                .matchAll(target)
                                .filter(new Predicate<int[]>() {
                                    public boolean apply(int[] input) {
                                        return input[0] == 0;
                                    }});

Parameters:: predicate - a predicate
Returns:: fluent-api reference

map

public <T> Iterable<T> map(Function<int[],T> f)

Map the mappings to another type. Each mapping is transformed using the provided function.



     final IAtomContainer query;
     final IAtomContainer target;

     Mappings mappings = Pattern.findSubstructure(query)
                                .matchAll(target);

     // a string that indicates the mapping of atom elements and numbers
     Iterable<String> strs = mappings.map(new Function<int[], String>() {
         public String apply(int[] input) {
             StringBuilder sb = new StringBuilder();
             for (int i = 0; i < input.length; i++) {
                 if (i > 0) sb.append(", ");
                 sb.append(query.getAtom(i))
                   .append(i + 1)
                   .append(" -> ")
                   .append(target.getAtom(input[i]))
                   .append(input[i] + 1);
             }
             return sb.toString();
         }});

Parameters:: f - function to transform a mapping
Returns:: iterable of the transformed type

limit
```
public Mappings limit(int limit)
```
Limit the number of mappings - only this number of mappings will be generated.

Parameters:

limit - the number of mappings

Returns:

fluent-api instance

stereochemistry
```
@Deprecated
public Mappings stereochemistry()
```
Deprecated.
Results now automatically consider stereo if it's present, to match without stereochemistry remove the stereo features.

Filter the mappings for those which preserve stereochemistry specified in the query.

Returns:

fluent-api instance

uniqueAtoms
```
public Mappings uniqueAtoms()
```
Filter the mappings for those which cover a unique set of atoms in the target. The unique atom mappings are a subset of the unique bond matches.

Returns:

fluent-api instance

See Also:

uniqueBonds(), exclusiveAtoms()

exclusiveAtoms
```
public Mappings exclusiveAtoms()
```
Filter the mappings for those which cover an exclusive set of atoms in the target. If a match overlaps with another one it is not returned. For example, suppose we had the query C~O and matched it against a carboxylic acid *C(O)=O: there are 2 unique matches but only 1 exclusive match. If we had two -CO2 groups, as in c1ccc(C(O)=O)cc1C(O)=O, there are 4 unique matches and 2 exclusive matches. The exclusive atom mappings are therefore a subset of the unique atom matches.

Returns:

fluent-api instance

See Also:

uniqueAtoms(), ExclusiveAtomMatches
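
The carboxylic acid example can be worked through with plain index arrays. The sketch below is an assumption-laden model, not the CDK implementation: the mappings are written out by hand from the two SMILES strings (atom indices follow the order of atoms in the string), and exclusiveAtoms here is a hypothetical greedy filter that accepts a mapping only if none of its target atoms were covered by a previously accepted mapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ExclusiveAtomsDemo {
    // keep the first mapping seen for each distinct set of target atoms
    static List<int[]> uniqueAtoms(List<int[]> ms) {
        Set<String> seen = new HashSet<>();
        List<int[]> out = new ArrayList<>();
        for (int[] p : ms) {
            int[] sorted = p.clone();
            Arrays.sort(sorted);
            if (seen.add(Arrays.toString(sorted))) out.add(p);
        }
        return out;
    }

    // greedy sketch: accept a mapping only if it shares no target atom
    // with any mapping accepted before it
    static List<int[]> exclusiveAtoms(List<int[]> ms) {
        BitSet used = new BitSet();
        List<int[]> out = new ArrayList<>();
        for (int[] p : ms) {
            boolean overlaps = false;
            for (int v : p)
                if (used.get(v)) { overlaps = true; break; }
            if (overlaps) continue;
            for (int v : p) used.set(v);
            out.add(p);
        }
        return out;
    }

    // C~O against *C(O)=O, atoms indexed as 0=*, 1=C, 2=O, 3=O
    static List<int[]> oneAcid() {
        return Arrays.asList(new int[]{1, 2}, new int[]{1, 3});
    }

    // C~O against c1ccc(C(O)=O)cc1C(O)=O, acid carbons at 4 and 9
    static List<int[]> twoAcids() {
        return Arrays.asList(new int[]{4, 5}, new int[]{4, 6},
                             new int[]{9, 10}, new int[]{9, 11});
    }
}
```

Both mappings of the single acid share the carbon at index 1, so only the first survives the exclusive filter; with two acid groups one mapping per group survives.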

uniqueBonds
```
public Mappings uniqueBonds()
```
Filter the mappings for those which cover a unique set of bonds in the target.

Returns:

fluent-api instance

See Also:

uniqueAtoms()
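
The difference between unique atoms and unique bonds shows up when several mappings cover the same atoms through different bonds. As a hedged illustration in plain Java (no CDK types; the query, the target and all helper names are assumptions made for this sketch), consider a three-atom path query 0-1-2 matched onto a three-membered ring with atoms 0, 1 and 2. Every mapping covers the same three atoms, but which two ring bonds are covered depends on which target atom is in the middle.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

class UniqueAtomsVsBondsDemo {
    // query bonds as pairs of query atom indices: the path 0-1-2
    static final int[][] QUERY_BONDS = {{0, 1}, {1, 2}};

    // key for the set of target atoms covered by a mapping
    static String atomKey(int[] p) {
        int[] sorted = p.clone();
        Arrays.sort(sorted);
        return Arrays.toString(sorted);
    }

    // key for the set of target bonds covered by a mapping; each bond
    // is written as "min-max" of its two target atom indices
    static String bondKey(int[] p) {
        TreeSet<String> bonds = new TreeSet<>();
        for (int[] b : QUERY_BONDS) {
            int u = p[b[0]], v = p[b[1]];
            bonds.add(Math.min(u, v) + "-" + Math.max(u, v));
        }
        return bonds.toString();
    }

    // number of distinct keys among the mappings
    static int countDistinct(List<int[]> ms, Function<int[], String> key) {
        Set<String> seen = new HashSet<>();
        for (int[] p : ms) seen.add(key.apply(p));
        return seen.size();
    }

    // all six ways to place the path on the three-membered ring
    static List<int[]> triangleMappings() {
        return Arrays.asList(
            new int[]{0, 1, 2}, new int[]{0, 2, 1}, new int[]{1, 0, 2},
            new int[]{1, 2, 0}, new int[]{2, 0, 1}, new int[]{2, 1, 0});
    }
}
```

In this model the six mappings collapse to one distinct atom set but three distinct bond sets, which is consistent with the unique atom mappings being a subset of the unique bond matches.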

toArray

public int[][] toArray()

Mappings are lazily generated and best used in a loop. However, if all mappings are required, this method provides a fixed-size array of mappings.


 IAtomContainer query  = ...;
 IAtomContainer target = ...;

 Pattern pat = Pattern.findSubstructure(query);

 // lazy iterator
 for (int[] mapping : pat.matchAll(target)) {
     // logic...
 }

 int[][] mappings = pat.matchAll(target)
                       .toArray();

 // same as the lazy iterator but we can now refer to and pass 'mappings'
 // to other methods without regenerating the graph match
 for (int[] mapping : mappings) {
     // logic...
 }

The method can be used in combination with other modifiers.


 IAtomContainer query  = ...;
 IAtomContainer target = ...;

 Pattern pat = Pattern.findSubstructure(query);

 // array of the first 5 unique atom mappings
 int[][] mappings = pat.matchAll(target)
                       .uniqueAtoms()
                       .limit(5)
                       .toArray();

Returns:: array of mappings

toAtomMap

public Iterable<Map<IAtom,IAtom>> toAtomMap()

Convert the permutations to an atom-atom map.

 for (Map<IAtom,IAtom> map : mappings.toAtomMap()) {
     for (Map.Entry<IAtom,IAtom> e : map.entrySet()) {
         IAtom queryAtom  = e.getKey();
         IAtom targetAtom = e.getValue();
     }
 }

Returns:: iterable of atom-atom mappings

toBondMap

public Iterable<Map<IBond,IBond>> toBondMap()

Convert the permutations to a bond-bond map.

 for (Map<IBond,IBond> map : mappings.toBondMap()) {
     for (Map.Entry<IBond,IBond> e : map.entrySet()) {
         IBond queryBond  = e.getKey();
         IBond targetBond = e.getValue();
     }
 }

Returns:: iterable of bond-bond mappings

toAtomBondMap

public Iterable<Map<IChemObject,IChemObject>> toAtomBondMap()

Convert the permutations to an atom-atom bond-bond map.

 for (Map<IChemObject,IChemObject> map : mappings.toAtomBondMap()) {
     for (Map.Entry<IChemObject,IChemObject> e : map.entrySet()) {
         IChemObject queryObj  = e.getKey();
         IChemObject targetObj = e.getValue();
     }

     // look up by query atom or bond (i is any valid index); the map
     // values are IChemObject, so a cast is needed
     IAtom matchedAtom = (IAtom) map.get(query.getAtom(i));
     IBond matchedBond = (IBond) map.get(query.getBond(i));
 }

Returns:: iterable of atom-atom and bond-bond mappings

stream
```
public Stream<int[]> stream()
```
Convert the Mappings to a Java 8 Stream. The Stream API was written after this class and provides much of the same functionality (e.g. map(java.util.function.Function<int[], T>) is Stream.map(java.util.function.Function), etc.). Unlike an Iterable, a stream cannot be traversed more than once.

Returns:

the stream
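
The single-traversal restriction is a property of java.util.stream.Stream itself and can be shown without CDK. In the sketch below a plain list of int[] stands in for the mappings (an assumption for illustration), and StreamSupport is used to turn an Iterable into a Stream; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class SingleUseStreamDemo {
    // plain stand-in for an iterable of mappings
    static Iterable<int[]> matches() {
        return Arrays.asList(new int[]{0, 1}, new int[]{1, 0});
    }

    // Stream.map plays the role of a per-mapping transformation:
    // here, the target index that the first query atom is mapped to
    static List<Integer> firstAtoms() {
        return StreamSupport.stream(matches().spliterator(), false)
                            .map(p -> p[0])
                            .collect(Collectors.toList());
    }

    // a stream can be consumed once; a second terminal operation
    // throws IllegalStateException
    static boolean secondTraversalFails() {
        Stream<int[]> s = StreamSupport.stream(matches().spliterator(), false);
        s.count(); // first (and only permitted) traversal
        try {
            s.count();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

If the results are needed more than once, obtaining a fresh stream each time or collecting into a list or array avoids the exception.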

toChemObjects

public Iterable<IChemObject> toChemObjects()

Obtain the chem objects (atoms and bonds) that have 'hit' in the target molecule.

 for (IChemObject obj : mappings.toChemObjects()) {
   if (obj instanceof IAtom) {
      // this atom was 'hit' by the pattern
   }
 }

Returns:: non-lazy iterable of chem objects

toSubstructuresStream

public Stream<IAtomContainer> toSubstructuresStream()

Obtain the mapped substructures (atoms/bonds) of the target compound. The atoms and bonds are the same as in the target molecule but there may be fewer of them.

 IAtomContainer query, target;
 Mappings mappings = ...;
 for (IAtomContainer mol : mappings.toSubstructures()) {
    for (IAtom atom : mol.atoms())
      target.contains(atom); // always true
    for (IAtom atom : target.atoms())
      mol.contains(atom); // not always true
 }

Returns:: lazy stream of molecules

toSubstructures

public Iterable<IAtomContainer> toSubstructures()

Obtain the mapped substructures (atoms/bonds) of the target compound. The atoms and bonds are the same as in the target molecule but there may be fewer of them.

 IAtomContainer query, target;
 Mappings mappings = ...;
 for (IAtomContainer mol : mappings.toSubstructures()) {
    for (IAtom atom : mol.atoms())
      target.contains(atom); // always true
    for (IAtom atom : target.atoms())
      mol.contains(atom); // not always true
 }

Returns:: non-lazy iterable of molecules

atLeast

public boolean atLeast(int n)

Efficiently determine if there are at least 'n' matches.

 Mappings mappings = ...;

 if (mappings.atLeast(5)) {
     // set bit flag etc.
 }

 // are there at least 5 unique matches?
 if (mappings.uniqueAtoms().atLeast(5)) {
     // set bit etc.
 }

Parameters:: n - number of matches
Returns:: there are at least 'n' matches
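
One way such a check can avoid generating every match is to stop as soon as the n-th match is seen. The following plain-Java sketch shows that idea on a hypothetical endless lazy source; it is an assumption about the general technique, not a copy of the CDK implementation, and the generated counter exists only to show how much work was done.

```java
import java.util.Iterator;

class AtLeastDemo {
    // how many matches the lazy source has produced so far
    static int generated = 0;

    // hypothetical lazy source that never runs out of matches
    static Iterable<int[]> endlessMatches() {
        return () -> new Iterator<int[]>() {
            int counter = 0;
            public boolean hasNext() { return true; }
            public int[] next() {
                generated++;
                return new int[]{counter++};
            }
        };
    }

    // stop consuming as soon as n matches have been seen
    static boolean atLeast(Iterable<int[]> matches, int n) {
        if (n <= 0) return true;
        int seen = 0;
        for (int[] p : matches) {
            if (++seen >= n) return true;
        }
        return false; // source exhausted before reaching n
    }
}
```

Against the endless source, asking for at least five matches produces exactly five and then returns, whereas counting all matches first would never terminate.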

first
```
public int[] first()
```
Obtain the first match - if there is no first match an empty array is returned.

Returns:

first match

count
```
public int count()
```
Convenience method to count the number of mappings. Note that mappings are lazily generated, so checking the count and then iterating over the mappings currently performs two searches. If the mappings are also needed, it is more efficient to iterate over the mappings and count manually.

Returns:

number of matches

countUnique
```
public int countUnique()
```
Convenience method to count the number of unique atom mappings. Note that mappings are lazily generated, so checking the count and then iterating over the mappings currently performs two searches. If the mappings are also needed, it is more efficient to iterate over the mappings and count manually. The method simply invokes mappings.uniqueAtoms().count().

Returns:

number of matches

iterator
```
public Iterator<int[]> iterator()
```
Specified by:

iterator in interface Iterable<int[]>
