For such a task, a simple
Example: the following sample shows how to extract French vehicle registration plates.
<TRules xmlns="exa:com.exalead.mot.components.transducer">
  <!-- SIV (e.g. AA-229-AA) -->
  <TRule priority="0">
    <MatchAnnotation kind="NE.plates.SIV"/>
    <Seq>
      <Or>
        <TokenRegexp value="[A-Za-z]{2}"/>
        <Word value="W" level="exact"/> <!-- to match temporary plates starting with only one W (e.g. W-001-AA) -->
      </Or>
      <Opt> <Word value="-" level="exact"/> </Opt>
      <TokenRegexp value="[0-9]{3}"/> <!-- digits may include 0, as in W-001-AA -->
      <Opt> <Word value="-" level="exact"/> </Opt>
      <TokenRegexp value="[A-Za-z]{2}"/>
    </Seq>
  </TRule>
  <!-- FNI (e.g. 1233 CD 33) -->
  <TRule priority="1">
    <MatchAnnotation kind="NE.plates.FNI"/>
    <Or>
      <Seq>
        <TokenRegexp value="[1-9]{1,4}"/>
        <TokenRegexp value="[A-Za-z]{1,3}"/>
        <Or>
          <TokenRegexp value="[1-9]{2}"/>
          <TokenRegexp value="2[AB]"/> <!-- Corse -->
          <TokenRegexp value="97[1-8]"/> <!-- DOM-TOM -->
        </Or>
      </Seq>
      <Seq>
        <TokenRegexp value="[1-9]{1,6}"/>
        <Or>
          <Word value="NC" level="exact"/> <!-- New Caledonia -->
          <Word value="P" level="exact"/> <!-- French Polynesia -->
        </Or>
      </Seq>
      <Seq> <!-- TAAF - Kerguelen islands -->
        <TokenRegexp value="[05-9][1-9]"/>
        <TokenRegexp value="[1-9]{4}"/>
      </Seq>
      <Seq> <!-- Wallis-and-Futuna -->
        <TokenRegexp value="[1-9]{1,4}"/>
        <Word value="WF" level="exact"/>
      </Seq>
    </Or>
  </TRule>
</TRules>
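To try the SIV pattern outside the product, the rule can be approximated with an ordinary regular expression. The sketch below is only an illustration in Python and is not part of the transducer: it works on a raw string rather than on a token sequence, it models the optional separator as a dash only, and it accepts any three digits so that the W-001-AA example from the rule comment matches. The names `SIV_PATTERN` and `find_siv_plates` are hypothetical.

```python
import re
from typing import List

# Plain-regex approximation of the SIV rule. Assumption: the input is a raw
# string, whereas the transducer matches a sequence of tokens.
SIV_PATTERN = re.compile(
    r"\b(?:[A-Za-z]{2}|W)"  # two letters, or a single W for temporary plates
    r"-?"                    # optional dash
    r"[0-9]{3}"              # three digits
    r"-?"                    # optional dash
    r"[A-Za-z]{2}\b"         # two letters
)


def find_siv_plates(text: str) -> List[str]:
    """Return every substring of `text` that looks like an SIV plate."""
    return SIV_PATTERN.findall(text)


if __name__ == "__main__":
    print(find_siv_plates("Seen AA-229-AA and W-001-AA today"))
```

The word boundaries stand in for token boundaries, so a string such as AAA-123-BB is rejected instead of being matched from its second letter.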
To use it, add a RulesMatcher in the configuration file:
<RulesMatcher name="platesMatcher" resourceFile="resource:///tutorial-mot/plates.xml" />
In this example, we used the NETVIBES resource protocol. It implies that the resource is relative to the
Token[AA] kind[ALPHA] lng[fr] offset[0]
Annotation[aa] tag[LOWERCASE] nbTokens[1]
Annotation[aa] tag[NORMALIZE] nbTokens[1]
Annotation[AA-123-BB] tag[NE.plates.SIV] nbTokens[5]
Token[-] kind[DASH] lng[fr] offset[2]
Token[123] kind[NUMBER] lng[fr] offset[3][...]