edu.udo.cs.wvtool.generic.tokenizer
Interface WVTTokenizer

All Known Implementing Classes:
NGramTokenizer, SimpleTokenizer

public interface WVTTokenizer

An interface representing a mechanism that converts a stream of characters into a stream of tokens, deleting all separators.

Version:
$Id: WVTTokenizer.java,v 1.2 2006/06/06 11:45:24 mjwurst Exp $
Author:
Michael Wurst

Method Summary
 TokenEnumeration tokenize(java.io.Reader source, WVTDocumentInfo d)
          Tokenize a character stream.
 

Method Detail

tokenize

TokenEnumeration tokenize(java.io.Reader source,
                          WVTDocumentInfo d)
                          throws WVToolException
Tokenize a character stream.

Parameters:
source - the Reader from which to get the character stream
d - the WVTDocumentInfo value, describing the document being processed
Returns:
a TokenEnumeration
Throws:
WVToolException - if an error occurs
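
Example:
The following is a minimal, self-contained sketch of the idea behind tokenize: read characters from a Reader, drop the separators, and hand back the remaining tokens one at a time. It does not depend on the library. TokenStream is a hypothetical stand-in for TokenEnumeration (whose actual methods are not listed on this page), SeparatorDroppingTokenizer is a hypothetical class, and treating every non-alphanumeric character as a separator is an assumption made for illustration. The WVTDocumentInfo parameter is omitted, and I/O failures are rethrown unchecked where a real implementation would presumably throw WVToolException.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the library's TokenEnumeration; the real
// interface may declare different methods.
interface TokenStream {
    boolean hasMoreTokens();
    String nextToken();
}

// Hypothetical sketch; not the library's SimpleTokenizer or NGramTokenizer.
class SeparatorDroppingTokenizer {

    // Reads the whole character stream and returns its tokens.
    // Assumption: any character that is not a letter or digit is a separator.
    static TokenStream tokenize(Reader source) {
        final List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            int c;
            while ((c = source.read()) != -1) {
                if (Character.isLetterOrDigit(c)) {
                    current.append((char) c);
                } else if (current.length() > 0) {
                    // A separator ends the current token; the separator is dropped.
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            }
        } catch (IOException e) {
            // A real WVTTokenizer would presumably wrap this in WVToolException.
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return new TokenStream() {
            private int next = 0;

            @Override
            public boolean hasMoreTokens() {
                return next < tokens.size();
            }

            @Override
            public String nextToken() {
                if (!hasMoreTokens()) {
                    throw new NoSuchElementException("no more tokens");
                }
                return tokens.get(next++);
            }
        };
    }

    // Convenience helper: collects all remaining tokens into a list.
    static List<String> drain(TokenStream stream) {
        List<String> result = new ArrayList<>();
        while (stream.hasMoreTokens()) {
            result.add(stream.nextToken());
        }
        return result;
    }
}
```

Calling SeparatorDroppingTokenizer.tokenize(new java.io.StringReader("Hello, world! 42")) and draining the result yields the tokens Hello, world and 42, with the punctuation and whitespace removed. The sketch buffers the whole input for simplicity; an implementation meant for large documents would more likely produce tokens lazily as the Reader is consumed.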