Class SMARTSParser
java.lang.Object
org.openscience.cdk.smiles.smarts.parser.SMARTSParser
- All Implemented Interfaces:
SMARTSParserConstants, SMARTSParserTreeConstants
public class SMARTSParser
extends Object
implements SMARTSParserTreeConstants, SMARTSParserConstants
This parser implements a nearly complete subset of the SMARTS syntax as defined on the Daylight website.
Example code using SMARTS substructure search looks like:
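// parse(String, IChemObjectBuilder) is the signature listed in the Method Summary.
// The builder-based SmilesParser constructor and the instance-style isSubgraph call assume a recent CDK release.
IChemObjectBuilder builder = DefaultChemObjectBuilder.getInstance();
SmilesParser sp = new SmilesParser(builder);
IAtomContainer atomContainer = sp.parseSmiles("CC(=O)OC(=O)C");
QueryAtomContainer query = SMARTSParser.parse("C*C", builder);
boolean queryMatch = new UniversalIsomorphismTester().isSubgraph(atomContainer, query);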
See cdk.test.smiles.smarts.parser.ParserTest for examples of the implemented subset. This parser is based on JJTree and generates an AST (Abstract Syntax Tree).
To get the AST, the code looks like:
SMARTSParser parser = new SMARTSParser(new java.io.StringReader("C*C"));
ASTStart start = parser.Start(); // Start() throws ParseException on invalid input
- Author:
- Dazhi Jiao
- Keywords:
- SMARTS, substructure search
- Created on:
- 2007-04-23
- Requires:
- ant1.6
-
Field Summary
Fields
Modifier and Type / Field / Description
jj_nt: Next token.
protected JJTSMARTSParserState jjtree
token: Current token.
token_source: Generated Token Manager.
Fields inherited from interface org.openscience.cdk.smiles.smarts.parser.SMARTSParserConstants
a, A, AC, AG, AL, AM, ANY_BOND, AR, AR_BOND, as, AS, AT, ATOM_EXPRESSION, AU, B, BA, BE, BI, BK, BR, c, C, CA, CARET, CD, CE, CF, CL, CM, CO, CR, CS, CU, D, D_BOND, DEFAULT, DIGIT, DN_OR_UNSPECIFIED_S_BOND, DN_S_BOND, DOLLAR, DY, EOF, ER, ES, EU, F, FE, FM, FR, G, GA, GD, GE, h, H, H_AND, HE, HF, HG, HO, HX, I, IN, IR, K, KR, L_AND, L_BRACKET, L_PAREN, LA, LI, LR, LU, MD, MG, MN, MO, n, N, NA, NB, ND, NE, NI, NO, NOT, NP, o, O, OR, OS, p, P, PA, PB, PD, PLUS, PM, PO, PR, PT, PU, Q_MARK, r, R, R_BOND, R_BRACKET, R_PAREN, RA, RB, RE, RH, RN, RU, s, S, S_BOND, SB, SC, se, SE, SI, SM, SN, SR, T_BOND, TA, TB, TC, TE, TH, TI, TL, TM, tokenImage, U, UP_OR_UNSPECIFIED_S_BOND, UP_S_BOND, v, V, W, WILDCARD, WS, x, X, XE, Y, YB, ZN, ZR
Fields inherited from interface org.openscience.cdk.smiles.smarts.parser.SMARTSParserTreeConstants
JJTALIPHATIC, JJTANYATOM, JJTAROMATIC, JJTATOM, JJTATOMICMASS, JJTATOMICNUMBER, JJTCHARGE, JJTCHIRALITY, JJTELEMENT, JJTEXPLICITATOM, JJTEXPLICITCONNECTIVITY, JJTEXPLICITHIGHANDBOND, JJTEXPLICITHIGHANDEXPRESSION, JJTGROUP, JJTHYBRDIZATIONNUMBER, JJTIMPLICITHCOUNT, JJTIMPLICITHIGHANDBOND, JJTIMPLICITHIGHANDEXPRESSION, JJTLOWANDBOND, JJTLOWANDEXPRESSION, jjtNodeName, JJTNONCHHEAVYATOM, JJTNOTBOND, JJTNOTEXPRESSION, JJTORBOND, JJTOREXPRESSION, JJTPERIODICGROUPNUMBER, JJTREACTION, JJTRECURSIVESMARTSEXPRESSION, JJTRINGCONNECTIVITY, JJTRINGIDENTIFIER, JJTRINGMEMBERSHIP, JJTSIMPLEBOND, JJTSMALLESTRINGSIZE, JJTSMARTS, JJTSTART, JJTTOTALCONNECTIVITY, JJTTOTALHCOUNT, JJTVALENCE, JJTVOID
-
Constructor Summary
Constructors
Constructor / Description
SMARTSParser(InputStream stream): Constructor with InputStream.
SMARTSParser(InputStream stream, String encoding): Constructor with InputStream and supplied encoding.
SMARTSParser(Reader stream): Constructor.
SMARTSParser (token manager variant): Constructor with generated Token Manager.
-
Method Summary
Modifier and Type / Method / Description
final void Aliphatic
final void AnyAtom()
final void Aromatic()
final org.openscience.cdk.smiles.smarts.parser.ASTAtom AtomExpression()
final void AtomicMass
final void AtomicNumber
final void Charge()
final void Chirality
final void disable_tracing(): Disable tracing.
final void enable_tracing(): Enable tracing.
final void ExplicitAtomExpression
final void ExplicitConnectivity
final void ExplicitHighAndBond
final void ExplicitHighAndExpression
generateParseException: Generate ParseException.
final Token getNextToken: Get the next Token.
final Token getToken(int index): Get the specific Token.
final void GroupExpression
final void HybridizationNumber
final void ImplicitHCount
final void ImplicitHighAndBond
final void ImplicitHighAndExpression
final void LowAndBond
final void LowAndExpression
final void NoHydrogenElement
final void NonCHHeavyAtom
final void NotBond()
final void NotExpression
final void OrBond()
final void OrExpression
static QueryAtomContainer parse(String smarts, IChemObjectBuilder builder): This method parses a SMARTS string and returns an instance of QueryAtomContainer.
final void PeriodicGroupNumber
final void PrimitiveAtomExpression
final void ReactionExpression
final void RecursiveSmartsExpression
void ReInit(InputStream stream): Reinitialise.
void ReInit(InputStream stream, String encoding): Reinitialise.
void ReInit: Reinitialise.
void ReInit: Reinitialise.
final void RingConnectivity
final void RingIdentifier
final void RingMembership
final void SimpleBond
final void SmallestRingSize
final void SmartsExpression
final org.openscience.cdk.smiles.smarts.parser.ASTStart Start(): Top-level production, Start ::= <ReactionExpression> <#_WS>. The full grammar is listed under Start in Method Details.
final void TotalConnectivity
final void TotalHCount
final boolean trace_enabled(): Trace enabled.
final void Valence()
-
Field Details
-
jjtree
-
token_source
Generated Token Manager. -
token
Current token. -
jj_nt
Next token.
-
-
Constructor Details
-
SMARTSParser
Constructor with InputStream. -
SMARTSParser
Constructor with InputStream and supplied encoding -
SMARTSParser
Constructor. -
SMARTSParser
Constructor with generated Token Manager.
-
-
Method Details
-
parse
This method parses a SMARTS string and returns an instance of QueryAtomContainer. -
Start
Start ::= <ReactionExpression> <#_WS>
ReactionExpression ::= <GroupExpression>? (">" <GroupExpression>? ">" <GroupExpression>?)?
GroupExpression ::= ["("] <SmartsExpression> [")"] ( "." ["("] <SmartsExpression> [")"] )*
SmartsExpression ::= <AtomExpression> ( ( [ <LowAndBond> ] ( <Digit> | <AtomExpression> ) ) | ( "(" [ <LowAndBond> ] <SmartsExpression> ")" ) )*
AtomExpression ::= ( "[" [ <AtomicMass> ] <LowAndExpression> [:<Digit>+] "]" ) | <ExplicitAtomExpression>
LowAndBond ::= <OrBond> [ ";" <AndBond> ]
OrBond ::= <ExplicitHighAndBond> [ "," <OrBond> ]
ExplicitHighAndBond ::= <ImplicitHighAndBond> [ "&" <ExplicitHighAndBond> ]
ImplicitHighAndBond ::= <NotBond> [ <ImplicitHighAndBond> ]
NotBond ::= [ "!" ] <SimpleBond>
SimpleBond ::= "/" | "\\" | "/?" | "\\?" | "=" | "#" | "~" | "@"
ExplicitAtomExpression ::= [ "B" | "C" | "N" | "O" | "P" | "S" | "F" | "CL" | "BR" | "I" | "c" | "o" | "n" | "*" | "A" | "a" | "p" | "as" | "se" ]
LowAndExpression ::= <OrExpression> ( ";" <LowAndExpression> )?
OrExpression ::= <ExplicitHighAndExpression> ( "," <OrExpression> )?
ExplicitHighAndExpression ::= <ImplicitHighAndExpression> ( "&" <ExplicitHighAndExpression> )?
ImplicitHighAndExpression ::= <NotExpression> ( <ImplicitHighAndExpression> )?
NotExpression ::= "!" ( <PrimitiveAtomExpression> | <RecursiveSmartsExpression> )
RecursiveSmartsExpression ::= "$" "(" <SmartsExpression> ")"
PrimitiveAtomExpression ::= <AtomicMass> | <NonHydrogenElement> | "*" | "A" | "a" | "D" (<Digits>)? | "H" (<Digits>)? | "h" (<Digits>)? | "R" (<Digit>+)? | "r" (<Digit>+)? | "v" (<Digit>+)? | "#X" | "G" (<DIGIT>+) | "X" (<Digit>+)? | "x" (<Digit>+)? | "^" (<DIGIT>) | ("+" | "-") (<Digit>+)? | "#" (<Digit>+) | "@" | "@@" | <Digit>+
Digit ::= ( "0" - "9")
NonHydrogenElement ::= [ "HE" | "LI" | "BE" | "NE" | "NA" | "MG" | "AL" | "SI" | "AR" | "CA" | "SC" | "TI" | "CR" | "MN" | "FE" | "CO" | "NI" | "CU" | "ZN" | "GA" | "GE" | "AS" | "SE" | "BR" | "KR" | "RB" | "SR" | "ZR" | "NB" | "MO" | "TC" | "RU" | "RH" | "PD" | "AG" | "CD" | "IN" | "SN" | "SB" | "TE" | "XE" | "CS" | "BA" | "LA" | "HF" | "TA" | "RE" | "OS" | "IR" | "PT" | "AU" | "HG" | "TL" | "PB" | "BI" | "PO" | "AT" | "RN" | "FR" | "RA" | "AC" | "TH" | "PA" | "B" | "C" | "N" | "O" | "F" | "P" | "S" | "K" | "V" | "Y" | "I" | "U" | "c" | "o" | "n" | "p" | "as" | "se" ]
- Throws:
ParseException
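The LowAndExpression, OrExpression, ExplicitHighAndExpression and ImplicitHighAndExpression productions nest so that ";" binds loosest, then ",", then "&" and juxtaposition, with "!" binding tightest. The following is a minimal, stdlib-only sketch of that layering and is not the generated CDK parser. The class SmartsPrecedenceSketch, its evaluate method, and the use of single letters as stand-in primitives are all hypothetical, and the sketch assumes "!" is optional in front of a primitive.

```java
import java.util.Set;

/** Hypothetical illustration of the logical-operator layering; not part of CDK. */
final class SmartsPrecedenceSketch {
    private final String src;
    private final Set<Character> truePrimitives; // primitives assumed to hold for one atom
    private int pos = 0;

    private SmartsPrecedenceSketch(String src, Set<Character> truePrimitives) {
        this.src = src;
        this.truePrimitives = truePrimitives;
    }

    /** Evaluates an expression such as "C,N;R" against the primitives that hold. */
    public static boolean evaluate(String expr, Set<Character> truePrimitives) {
        SmartsPrecedenceSketch p = new SmartsPrecedenceSketch(expr, truePrimitives);
        boolean result = p.lowAnd();
        if (p.pos != expr.length()) {
            throw new IllegalArgumentException("Unexpected character at position " + p.pos);
        }
        return result;
    }

    // LowAndExpression ::= <OrExpression> ( ";" <LowAndExpression> )?
    private boolean lowAnd() {
        boolean left = or();
        if (peek() == ';') { pos++; boolean right = lowAnd(); return left && right; }
        return left;
    }

    // OrExpression ::= <ExplicitHighAndExpression> ( "," <OrExpression> )?
    private boolean or() {
        boolean left = explicitHighAnd();
        if (peek() == ',') { pos++; boolean right = or(); return left || right; }
        return left;
    }

    // ExplicitHighAndExpression ::= <ImplicitHighAndExpression> ( "&" <ExplicitHighAndExpression> )?
    private boolean explicitHighAnd() {
        boolean left = implicitHighAnd();
        if (peek() == '&') { pos++; boolean right = explicitHighAnd(); return left && right; }
        return left;
    }

    // ImplicitHighAndExpression ::= <NotExpression> ( <ImplicitHighAndExpression> )?
    private boolean implicitHighAnd() {
        boolean left = not();
        char c = peek();
        if (c == '!' || Character.isLetter(c)) { boolean right = implicitHighAnd(); return left && right; }
        return left;
    }

    // Optional "!" followed by a single-letter stand-in primitive (an assumption of this sketch).
    private boolean not() {
        boolean negate = false;
        if (peek() == '!') { pos++; negate = true; }
        char c = peek();
        if (!Character.isLetter(c)) {
            throw new IllegalArgumentException("Expected a primitive at position " + pos);
        }
        pos++;
        boolean value = truePrimitives.contains(c);
        return negate ? !value : value;
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    public static void main(String[] args) {
        Set<Character> atom = Set.of('C', 'R');
        System.out.println(evaluate("C,N;R", atom)); // (C or N) and R
        System.out.println(evaluate("N,C&R", atom)); // N or (C and R)
        System.out.println(evaluate("!C,N", atom));  // (not C) or N
    }
}
```

Each grammar level calls the next tighter one first, which is what gives ";" the lowest precedence without any explicit precedence table.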
-
ReactionExpression
- Throws:
ParseException
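According to the grammar listed under Start, a ReactionExpression is either a plain group expression or three optional group expressions separated by exactly two ">" characters. A rough sketch of that split on the raw string follows. The class ReactionSmartsSplitSketch and its split method are hypothetical helpers, not CDK API, and the sketch assumes ">" occurs only as the reaction separator.

```java
/** Hypothetical helper that mirrors the ReactionExpression production on a raw string. */
final class ReactionSmartsSplitSketch {

    /**
     * Returns a one-element array for a non-reaction pattern, or
     * {reactants, agents, products} (possibly empty strings) for a reaction pattern.
     */
    public static String[] split(String smarts) {
        int first = smarts.indexOf('>');
        if (first < 0) {
            return new String[] { smarts }; // no ">" at all: a plain group expression
        }
        int second = smarts.indexOf('>', first + 1);
        if (second < 0 || smarts.indexOf('>', second + 1) >= 0) {
            throw new IllegalArgumentException("Expected exactly two '>' separators: " + smarts);
        }
        return new String[] {
            smarts.substring(0, first),          // reactants
            smarts.substring(first + 1, second), // agents
            smarts.substring(second + 1)         // products
        };
    }

    public static void main(String[] args) {
        for (String part : split("CC>>CO")) {
            System.out.println("[" + part + "]");
        }
    }
}
```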
-
GroupExpression
- Throws:
ParseException
-
SmartsExpression
- Throws:
ParseException
-
AtomExpression
public final org.openscience.cdk.smiles.smarts.parser.ASTAtom AtomExpression() throws ParseException
- Throws:
ParseException
-
LowAndBond
- Throws:
ParseException
-
OrBond
- Throws:
ParseException
-
ExplicitHighAndBond
- Throws:
ParseException
-
ImplicitHighAndBond
- Throws:
ParseException
-
NotBond
- Throws:
ParseException
-
SimpleBond
- Throws:
ParseException
-
ExplicitAtomExpression
- Throws:
ParseException
-
LowAndExpression
- Throws:
ParseException
-
OrExpression
- Throws:
ParseException
-
ExplicitHighAndExpression
- Throws:
ParseException
-
ImplicitHighAndExpression
- Throws:
ParseException
-
NotExpression
- Throws:
ParseException
-
RecursiveSmartsExpression
- Throws:
ParseException
-
PrimitiveAtomExpression
- Throws:
ParseException
-
TotalHCount
- Throws:
ParseException
-
ImplicitHCount
- Throws:
ParseException
-
ExplicitConnectivity
- Throws:
ParseException
-
AtomicNumber
- Throws:
ParseException
-
HybridizationNumber
- Throws:
ParseException
-
Charge
- Throws:
ParseException
-
RingConnectivity
- Throws:
ParseException
-
PeriodicGroupNumber
- Throws:
ParseException
-
TotalConnectivity
- Throws:
ParseException
-
Valence
- Throws:
ParseException
-
RingMembership
- Throws:
ParseException
-
SmallestRingSize
- Throws:
ParseException
-
Aliphatic
- Throws:
ParseException
-
NonCHHeavyAtom
- Throws:
ParseException
-
Aromatic
- Throws:
ParseException
-
AnyAtom
- Throws:
ParseException
-
AtomicMass
- Throws:
ParseException
-
RingIdentifier
- Throws:
ParseException
-
Chirality
- Throws:
ParseException
-
NoHydrogenElement
- Throws:
ParseException
-
ReInit
Reinitialise. -
ReInit
Reinitialise. -
ReInit
Reinitialise. -
ReInit
Reinitialise. -
getNextToken
Get the next Token. -
getToken
Get the specific Token. -
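Parsers generated by JavaCC typically keep tokens in a singly linked list, so getNextToken consumes one token while getToken(int index) peeks ahead without consuming. The sketch below imitates that pattern with stdlib-only code. The class TokenLookaheadSketch, its nested Token class, and the one-character-per-token scanner are hypothetical stand-ins and are not the generated SMARTSParser code.

```java
/** Hypothetical imitation of the lookahead pattern; not the generated parser. */
final class TokenLookaheadSketch {

    /** Minimal stand-in for a token: its text plus a link to the following token. */
    static final class Token {
        final String image;
        Token next;
        Token(String image) { this.image = image; }
    }

    private final String input;
    private int pos = 0;
    private Token token = new Token(""); // current token; starts as an empty placeholder

    TokenLookaheadSketch(String input) { this.input = input; }

    /** Hypothetical token source: one character per token, "" once the input is exhausted. */
    private Token scan() {
        return pos < input.length()
                ? new Token(String.valueOf(input.charAt(pos++)))
                : new Token("");
    }

    /** Consumes and returns the next token. */
    public Token getNextToken() {
        if (token.next == null) {
            token.next = scan();
        }
        token = token.next;
        return token;
    }

    /** Returns the token 'index' positions ahead of the current one without consuming it. */
    public Token getToken(int index) {
        Token t = token;
        for (int i = 0; i < index; i++) {
            if (t.next == null) {
                t.next = scan(); // scanned tokens stay linked, so later reads reuse them
            }
            t = t.next;
        }
        return t;
    }

    public static void main(String[] args) {
        TokenLookaheadSketch p = new TokenLookaheadSketch("C*C");
        System.out.println(p.getToken(2).image);    // peek two tokens ahead
        System.out.println(p.getNextToken().image); // consume the first token
    }
}
```

In this sketch getToken(0) returns the current (most recently consumed) token, which matches the usual JavaCC convention as far as I am aware.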
generateParseException
Generate ParseException. -
trace_enabled
public final boolean trace_enabled()
Trace enabled.
-
enable_tracing
public final void enable_tracing()
Enable tracing.
-
disable_tracing
public final void disable_tracing()
Disable tracing.
-