Package org.openscience.cdk.isomorphism
Class Mappings
java.lang.Object
    org.openscience.cdk.isomorphism.Mappings

All Implemented Interfaces:
    Iterable<int[]>
public final class Mappings extends Object implements Iterable<int[]>
A fluent interface for handling (sub)-graph mappings from a query to a target structure. The utility allows one to modify the mappings and provides convenience utilities. Mappings are obtained from a (sub)-graph matching using Pattern.

    IAtomContainer query = ...;
    IAtomContainer target = ...;
    Mappings mappings = Pattern.findSubstructure(query)
                               .matchAll(target);

The primary function is to provide an iterable of matches. Each match is a permutation (mapping) of the query graph indices (atom indices).

    for (int[] p : mappings) {
        for (int i = 0; i < p.length; i++) {
            // query.getAtom(i) is mapped to target.getAtom(p[i])
        }
    }

The matches can be filtered to provide only those that have valid stereochemistry. Note that stereochemistry() is deprecated: results now consider stereo automatically if it is present.

    for (int[] p : mappings.stereochemistry()) {
        // ...
    }

Unique matches can be obtained for both atoms and bonds.

    for (int[] p : mappings.uniqueAtoms()) {
        // ...
    }
    for (int[] p : mappings.uniqueBonds()) {
        // ...
    }

As matches may be lazily generated, iterating over the matches twice (as above) will actually perform two graph matchings. If the mappings are needed for subsequent use, toArray() provides the permutations as a fixed size array.

    int[][] ps = mappings.toArray();
    for (int[] p : ps) {
        // ...
    }

Graphs with a high number of automorphisms can produce many valid matchings. Operations can be combined, for example to limit the number of matches retrieved.

    // first ten matches
    for (int[] p : mappings.limit(10)) {
        // ...
    }
    // first 10 unique matches
    for (int[] p : mappings.uniqueAtoms().limit(10)) {
        // ...
    }
    // ensure we don't waste memory and only 'fix' up to 100 unique matches
    int[][] ps = mappings.uniqueAtoms()
                         .limit(100)
                         .toArray();

There are no restrictions on which operations can be applied or how many times, but the order of operations may change the result.

    // first 100 unique matches
    Mappings firstHundredUnique = mappings.uniqueAtoms().limit(100);
    // unique matches in the first 100 matches
    Mappings uniqueInFirstHundred = mappings.limit(100).uniqueAtoms();
    // first 10 unique matches in the first 100 matches
    Mappings tenUniqueInFirstHundred = mappings.limit(100).uniqueAtoms().limit(10);
    // number of unique atom matches
    int nUnique = mappings.countUnique();
    // number of unique atom matches with correct stereochemistry
    int nUniqueStereo = mappings.stereochemistry().countUnique();
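The effect of operation order can be reproduced without CDK. The following is a minimal sketch using plain Java streams over hand-written permutations; the class OperationOrderSketch, its helper methods, and the sample data are hypothetical and are not part of the CDK API, and uniqueness is assumed here to mean "covers a distinct set of target atoms".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OperationOrderSketch {

    // The set of target atom indices covered by one permutation.
    static Set<Integer> atomSet(int[] p) {
        return Arrays.stream(p).boxed().collect(Collectors.toSet());
    }

    // Keep the first permutation seen for each distinct set of target atoms.
    // (Stateful filter: only safe on a sequential stream, as used here.)
    static Stream<int[]> uniqueAtoms(Stream<int[]> matches) {
        Set<Set<Integer>> seen = new HashSet<>();
        return matches.filter(p -> seen.add(atomSet(p)));
    }

    // Hypothetical matches of a two-atom query; {0,1} and {1,0} cover the same atoms.
    static List<int[]> sample() {
        return Arrays.asList(
                new int[]{0, 1}, new int[]{1, 0},
                new int[]{2, 3}, new int[]{3, 2},
                new int[]{4, 5});
    }

    // Analogue of mappings.uniqueAtoms().limit(n)
    static long uniqueThenLimit(int n) {
        return uniqueAtoms(sample().stream()).limit(n).count();
    }

    // Analogue of mappings.limit(n).uniqueAtoms()
    static long limitThenUnique(int n) {
        return uniqueAtoms(sample().stream().limit(n)).count();
    }

    public static void main(String[] args) {
        System.out.println(uniqueThenLimit(2)); // 2: the first two unique matches
        System.out.println(limitThenUnique(2)); // 1: the first two matches cover the same atoms
    }
}
```

With this sample data, taking the unique matches first yields two results, whereas limiting first leaves only one because the first two permutations are symmetric variants of each other.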
-
Method Summary
Modifier and Type | Method | Description
boolean | atLeast(int n) | Efficiently determine if there are at least 'n' matches.
int | count() | Convenience method to count the number of mappings.
int | countUnique() | Convenience method to count the number of unique atom mappings.
Mappings | exclusiveAtoms() | Filter the mappings for those which cover an exclusive set of atoms in the target.
Mappings | filter(Predicate<int[]> predicate) | Filter the mappings and keep only those which match the provided predicate (Guava).
int[] | first() | Obtain the first match; if there is no first match an empty array is returned.
Iterator<int[]> | iterator() |
Mappings | limit(int limit) | Limit the number of mappings; only this number of mappings will be generated.
<T> Iterable<T> | map(Function<int[],T> f) | Map the mappings to another type.
Mappings | stereochemistry() | Deprecated. Results now automatically consider stereo if it is present; to match without stereochemistry, remove the stereo features.
Stream<int[]> | stream() | Convert the Mappings to a Java 8 Stream.
int[][] | toArray() | Mappings are lazily generated and best used in a loop.
Iterable<Map<IChemObject,IChemObject>> | toAtomBondMap() | Convert the permutations to an atom-atom bond-bond map.
Iterable<Map<IAtom,IAtom>> | toAtomMap() | Convert the permutations to an atom-atom map.
Iterable<Map<IBond,IBond>> | toBondMap() | Convert the permutations to a bond-bond map.
Iterable<IChemObject> | toChemObjects() | Obtain the chem objects (atoms and bonds) that have 'hit' in the target molecule.
Iterable<IAtomContainer> | toSubstructures() | Obtain the mapped substructures (atoms/bonds) of the target compound.
Stream<IAtomContainer> | toSubstructuresStream() | Obtain the mapped substructures (atoms/bonds) of the target compound.
Mappings | uniqueAtoms() | Filter the mappings for those which cover a unique set of atoms in the target.
Mappings | uniqueBonds() | Filter the mappings for those which cover a unique set of bonds in the target.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Method Detail
-
filter
public Mappings filter(Predicate<int[]> predicate)
Filter the mappings and keep only those which match the provided predicate (Guava).

    final IAtomContainer query = ...;
    final IAtomContainer target = ...;
    // obtain only the mappings where the first atom in the query is
    // mapped to the first atom in the target
    Mappings mappings = Pattern.findSubstructure(query)
                               .matchAll(target)
                               .filter(new Predicate<int[]>() {
                                   public boolean apply(int[] input) {
                                       return input[0] == 0;
                                   }});

Parameters:
    predicate - a predicate
Returns:
    fluent-api reference
-
map
public <T> Iterable<T> map(Function<int[],T> f)
Map the mappings to another type. Each mapping is transformed using the provided function.

    final IAtomContainer query = ...;
    final IAtomContainer target = ...;
    Mappings mappings = Pattern.findSubstructure(query)
                               .matchAll(target);
    // a string that indicates the mapping of atom elements and numbers
    Iterable<String> strs = mappings.map(new Function<int[], String>() {
        public String apply(int[] input) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < input.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(query.getAtom(i).getSymbol())
                  .append(i + 1)
                  .append(" -> ")
                  .append(target.getAtom(input[i]).getSymbol())
                  .append(input[i] + 1);
            }
            return sb.toString();
        }});

Parameters:
    f - function to transform a mapping
Returns:
    iterable of the transformed type
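As a standalone illustration of transforming each permutation into another type, the sketch below formats the index pairs only, without element symbols, using plain Java collections. MapSketch, describe, and mapAll are hypothetical names introduced for this example; it does not call CDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class MapSketch {

    // Render a permutation as 1-based "query -> target" index pairs.
    static String describe(int[] p) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < p.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(i + 1).append(" -> ").append(p[i] + 1);
        }
        return sb.toString();
    }

    // Apply a function to every permutation, collecting the results in order.
    static <T> List<T> mapAll(List<int[]> matches, Function<int[], T> f) {
        List<T> out = new ArrayList<>();
        for (int[] p : matches) out.add(f.apply(p));
        return out;
    }

    public static void main(String[] args) {
        List<int[]> matches = Arrays.asList(new int[]{2, 0, 1}, new int[]{0, 1, 2});
        for (String s : mapAll(matches, MapSketch::describe)) {
            System.out.println(s);
        }
        // prints "1 -> 3, 2 -> 1, 3 -> 2" then "1 -> 1, 2 -> 2, 3 -> 3"
    }
}
```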
-
limit
public Mappings limit(int limit)
Limit the number of mappings; only this number of mappings will be generated.

Parameters:
    limit - the number of mappings
Returns:
    fluent-api instance
-
stereochemistry
@Deprecated public Mappings stereochemistry()
Deprecated. Results now automatically consider stereo if it is present; to match without stereochemistry, remove the stereo features.

Filter the mappings for those which preserve stereochemistry specified in the query.

Returns:
    fluent-api instance
-
uniqueAtoms
public Mappings uniqueAtoms()
Filter the mappings for those which cover a unique set of atoms in the target. The unique atom mappings are a subset of the unique bond matches.

Returns:
    fluent-api instance
See Also:
    uniqueBonds(), exclusiveAtoms()
-
exclusiveAtoms
public Mappings exclusiveAtoms()
Filter the mappings for those which cover an exclusive set of atoms in the target. If a match overlaps with another one it is not returned. For example, suppose we had the query C~O and matched it against a carboxylic acid, *C(O)=O: there are 2 unique matches but only 1 exclusive match. If we had two -CO2 groups (c1ccc(C(O)=O)cc1C(O)=O) there are 4 unique matches and 2 exclusive matches. The exclusive atom mappings are therefore a subset of the unique atom matches.

Returns:
    fluent-api instance
See Also:
    uniqueAtoms(), ExclusiveAtomMatches
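The difference between unique and exclusive matches in the C~O example can be checked with a small standalone sketch. It assumes a greedy, first-come rule for exclusivity (a match is kept only if it shares no atom with a previously kept match); whether CDK's ExclusiveAtomMatches uses exactly this rule is an assumption. The class name, atom numbering, and hand-written match lists are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ExclusiveVsUniqueSketch {

    // The set of target atom indices covered by one permutation.
    static BitSet atomSet(int[] p) {
        BitSet s = new BitSet();
        for (int v : p) s.set(v);
        return s;
    }

    // Unique: keep one match per distinct set of target atoms.
    static List<int[]> unique(List<int[]> matches) {
        Set<BitSet> seen = new HashSet<>();
        List<int[]> kept = new ArrayList<>();
        for (int[] p : matches) {
            if (seen.add(atomSet(p))) kept.add(p);
        }
        return kept;
    }

    // Exclusive (greedy assumption): keep a match only if none of its
    // atoms has been used by a match that was kept earlier.
    static List<int[]> exclusive(List<int[]> matches) {
        BitSet used = new BitSet();
        List<int[]> kept = new ArrayList<>();
        for (int[] p : matches) {
            BitSet s = atomSet(p);
            if (!s.intersects(used)) {
                used.or(s);
                kept.add(p);
            }
        }
        return kept;
    }

    // C~O against *C(O)=O, atoms numbered 0:*, 1:C, 2:O, 3:O
    static List<int[]> oneAcid() {
        return Arrays.asList(new int[]{1, 2}, new int[]{1, 3});
    }

    // C~O against c1ccc(C(O)=O)cc1C(O)=O, carboxyl carbons at 4 and 9
    static List<int[]> twoAcids() {
        return Arrays.asList(new int[]{4, 5}, new int[]{4, 6},
                             new int[]{9, 10}, new int[]{9, 11});
    }

    public static void main(String[] args) {
        System.out.println(unique(oneAcid()).size() + " unique, "
                + exclusive(oneAcid()).size() + " exclusive");   // 2 unique, 1 exclusive
        System.out.println(unique(twoAcids()).size() + " unique, "
                + exclusive(twoAcids()).size() + " exclusive");  // 4 unique, 2 exclusive
    }
}
```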
-
uniqueBonds
public Mappings uniqueBonds()
Filter the mappings for those which cover a unique set of bonds in the target.

Returns:
    fluent-api instance
See Also:
    uniqueAtoms()
-
toArray
public int[][] toArray()
Mappings are lazily generated and best used in a loop. However, if all mappings are required this method can provide a fixed size array of mappings.

    IAtomContainer query = ...;
    IAtomContainer target = ...;
    Pattern pat = Pattern.findSubstructure(query);
    // lazy iterator
    for (int[] mapping : pat.matchAll(target)) {
        // logic...
    }
    int[][] mappings = pat.matchAll(target)
                          .toArray();
    // same as the lazy iterator but we can now refer to and pass 'mappings'
    // to other methods without regenerating the graph match
    for (int[] mapping : mappings) {
        // logic...
    }

The method can be used in combination with other modifiers.

    IAtomContainer query = ...;
    IAtomContainer target = ...;
    Pattern pat = Pattern.findSubstructure(query);
    // array of the first 5 unique atom mappings
    int[][] mappings = pat.matchAll(target)
                          .uniqueAtoms()
                          .limit(5)
                          .toArray();

Returns:
    array of mappings
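To see why fixing the mappings in an array avoids repeated work, the sketch below uses a stand-in Iterable that counts how often a new traversal is started. CountingMatches is a hypothetical class that only imitates lazy generation; each call to iterator() is treated as one "search". It is not how CDK is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LazyVersusArraySketch {

    // Stand-in for lazily generated mappings: every new iterator counts as one search.
    static final class CountingMatches implements Iterable<int[]> {
        int searches = 0;
        private final List<int[]> results =
                Arrays.asList(new int[]{0, 1}, new int[]{1, 0});

        @Override
        public Iterator<int[]> iterator() {
            searches++;
            return results.iterator();
        }

        // Materialise all permutations with a single traversal.
        int[][] toArray() {
            List<int[]> all = new ArrayList<>();
            for (int[] p : this) all.add(p);
            return all.toArray(new int[0][]);
        }
    }

    // Iterating the lazy source twice starts two searches.
    static int searchesWhenIteratingTwice() {
        CountingMatches m = new CountingMatches();
        for (int[] p : m) { /* first use */ }
        for (int[] p : m) { /* second use */ }
        return m.searches;
    }

    // Iterating the fixed array twice needs only the one search done by toArray().
    static int searchesWhenUsingArray() {
        CountingMatches m = new CountingMatches();
        int[][] fixed = m.toArray();
        for (int[] p : fixed) { /* first use */ }
        for (int[] p : fixed) { /* second use */ }
        return m.searches;
    }

    public static void main(String[] args) {
        System.out.println(searchesWhenIteratingTwice()); // 2
        System.out.println(searchesWhenUsingArray());     // 1
    }
}
```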
-
toAtomMap
public Iterable<Map<IAtom,IAtom>> toAtomMap()
Convert the permutations to an atom-atom map.

    for (Map<IAtom,IAtom> map : mappings.toAtomMap()) {
        for (Map.Entry<IAtom,IAtom> e : map.entrySet()) {
            IAtom queryAtom = e.getKey();
            IAtom targetAtom = e.getValue();
        }
    }

Returns:
    iterable of atom-atom mappings
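Conceptually, an atom-atom map pairs the query atom at index i with the target atom at index p[i]. The sketch below shows that conversion with plain lists of labels standing in for atoms; PermutationToMapSketch, toMap, and the labels are hypothetical and the code does not use CDK types.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PermutationToMapSketch {

    // Pair the i-th query object with the p[i]-th target object.
    static <Q, T> Map<Q, T> toMap(int[] p, List<Q> queryAtoms, List<T> targetAtoms) {
        Map<Q, T> map = new LinkedHashMap<>();
        for (int i = 0; i < p.length; i++) {
            map.put(queryAtoms.get(i), targetAtoms.get(p[i]));
        }
        return map;
    }

    // Hypothetical two-atom query matched against a three-atom target.
    static Map<String, String> example() {
        List<String> query = Arrays.asList("qC", "qO");
        List<String> target = Arrays.asList("tO", "tN", "tC");
        return toMap(new int[]{2, 0}, query, target);
    }

    public static void main(String[] args) {
        System.out.println(example()); // {qC=tC, qO=tO}
    }
}
```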
-
toBondMap
public Iterable<Map<IBond,IBond>> toBondMap()
Convert the permutations to a bond-bond map.

    for (Map<IBond,IBond> map : mappings.toBondMap()) {
        for (Map.Entry<IBond,IBond> e : map.entrySet()) {
            IBond queryBond = e.getKey();
            IBond targetBond = e.getValue();
        }
    }

Returns:
    iterable of bond-bond mappings
-
toAtomBondMap
public Iterable<Map<IChemObject,IChemObject>> toAtomBondMap()
Convert the permutations to an atom-atom bond-bond map.

    for (Map<IChemObject,IChemObject> map : mappings.toAtomBondMap()) {
        for (Map.Entry<IChemObject,IChemObject> e : map.entrySet()) {
            IChemObject queryObj = e.getKey();
            IChemObject targetObj = e.getValue();
        }
        // i is the index of a query atom or bond of interest;
        // the map values are IChemObject, so a cast is needed
        IAtom matchedAtom = (IAtom) map.get(query.getAtom(i));
        IBond matchedBond = (IBond) map.get(query.getBond(i));
    }

Returns:
    iterable of atom-atom and bond-bond mappings
-
stream
public Stream<int[]> stream()
Convert the Mappings to a Java 8 Stream. The Stream API was written after this class and provides much of the same functionality (e.g. map(java.util.function.Function<int[], T>) is Stream.map(java.util.function.Function), etc.). Unlike an Iterable, a stream cannot be traversed more than once.

Returns:
    the stream
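The single-traversal restriction is standard java.util.stream behaviour and can be demonstrated without CDK. In the sketch below, SingleUseStreamSketch and its methods are hypothetical names, and a hand-written list of permutations stands in for the mappings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class SingleUseStreamSketch {

    static List<int[]> sample() {
        return Arrays.asList(new int[]{0, 1}, new int[]{1, 0});
    }

    // A second terminal operation on the same Stream throws IllegalStateException.
    static boolean secondTraversalFails() {
        Stream<int[]> s = sample().stream();
        s.forEach(p -> { });          // first traversal consumes the stream
        try {
            s.forEach(p -> { });      // the stream has already been operated upon
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // An Iterable, by contrast, hands out a fresh iterator for each loop.
    static int visitsWhenIteratingTwice() {
        Iterable<int[]> it = sample();
        int visits = 0;
        for (int[] p : it) visits++;
        for (int[] p : it) visits++;
        return visits;
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails());     // true
        System.out.println(visitsWhenIteratingTwice()); // 4
    }
}
```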
-
toChemObjects
public Iterable<IChemObject> toChemObjects()
Obtain the chem objects (atoms and bonds) that have 'hit' in the target molecule.

    for (IChemObject obj : mappings.toChemObjects()) {
        if (obj instanceof IAtom) {
            // this atom was 'hit' by the pattern
        }
    }

Returns:
    non-lazy iterable of chem objects
-
toSubstructuresStream
public Stream<IAtomContainer> toSubstructuresStream()
Obtain the mapped substructures (atoms/bonds) of the target compound. The atoms and bonds are the same as in the target molecule but there may be fewer of them.

    IAtomContainer query = ...;
    IAtomContainer target = ...;
    Mappings mappings = ...;
    mappings.toSubstructuresStream().forEach(mol -> {
        for (IAtom atom : mol.atoms())
            target.contains(atom); // always true
        for (IAtom atom : target.atoms())
            mol.contains(atom); // not always true
    });

Returns:
    lazy stream of molecules
-
toSubstructures
public Iterable<IAtomContainer> toSubstructures()
Obtain the mapped substructures (atoms/bonds) of the target compound. The atoms and bonds are the same as in the target molecule but there may be fewer of them.

    IAtomContainer query = ...;
    IAtomContainer target = ...;
    Mappings mappings = ...;
    for (IAtomContainer mol : mappings.toSubstructures()) {
        for (IAtom atom : mol.atoms())
            target.contains(atom); // always true
        for (IAtom atom : target.atoms())
            mol.contains(atom); // not always true
    }

Returns:
    non-lazy iterable of molecules
-
atLeast
public boolean atLeast(int n)
Efficiently determine if there are at least 'n' matches.

    Mappings mappings = ...;
    if (mappings.atLeast(5)) {
        // set bit flag etc.
    }
    // are there at least 5 unique matches?
    if (mappings.uniqueAtoms().atLeast(5)) {
        // set bit etc.
    }

Parameters:
    n - number of matches
Returns:
    true if there are at least 'n' matches
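One way such a check can be efficient is to stop generating matches as soon as 'n' have been seen. The sketch below illustrates that short-circuit idea with a counting iterator; it is an assumption about the general technique, not a description of CDK's implementation, and AtLeastSketch and CountingSource are hypothetical names.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class AtLeastSketch {

    // Lazily yields {0}, {1}, ... up to 'total' items and records how many were generated.
    static final class CountingSource implements Iterator<int[]> {
        final int total;
        int generated = 0;

        CountingSource(int total) {
            this.total = total;
        }

        @Override
        public boolean hasNext() {
            return generated < total;
        }

        @Override
        public int[] next() {
            if (!hasNext()) throw new NoSuchElementException();
            return new int[]{generated++};
        }
    }

    // Pull at most n items; stop early once n have been seen.
    static boolean atLeast(Iterator<int[]> matches, int n) {
        int seen = 0;
        while (seen < n && matches.hasNext()) {
            matches.next();
            seen++;
        }
        return seen >= n;
    }

    // How many items a source of 'total' matches generates when asked atLeast(n).
    static int generatedFor(int total, int n) {
        CountingSource src = new CountingSource(total);
        atLeast(src, n);
        return src.generated;
    }

    public static void main(String[] args) {
        System.out.println(atLeast(new CountingSource(1000), 5)); // true
        System.out.println(generatedFor(1000, 5));                // 5, not 1000
        System.out.println(atLeast(new CountingSource(3), 5));    // false
    }
}
```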
-
first
public int[] first()
Obtain the first match; if there is no first match an empty array is returned.

Returns:
    first match
-
count
public int count()
Convenience method to count the number of mappings. Note that mappings are lazily generated, so checking the count and then iterating over the mappings currently performs two searches. If the mappings are also needed, it is more efficient to iterate over the mappings and count manually.

Returns:
    number of matches
-
countUnique
public int countUnique()
Convenience method to count the number of unique atom mappings. Note that mappings are lazily generated, so checking the count and then iterating over the mappings currently performs two searches. If the mappings are also needed, it is more efficient to iterate over the mappings and count manually. The method simply invokes mappings.uniqueAtoms().count().

Returns:
    number of matches
-