edu.udo.cs.wvtool.generic.tokenizer
Interface WVTTokenizer
- All Known Implementing Classes:
- NGramTokenizer, SimpleTokenizer
public interface WVTTokenizer
An interface representing a mechanism that converts a stream of characters
into a stream of tokens, deleting all separators.
- Version:
- $Id: WVTTokenizer.java,v 1.2 2006/06/06 11:45:24 mjwurst Exp $
- Author:
- Michael Wurst
tokenize
TokenEnumeration tokenize(java.io.Reader source, WVTDocumentInfo d) throws WVToolException
- Tokenize a character stream.
- Parameters:
- source - the Reader from which to get the character stream
- d - the WVTDocumentInfo value describing the document being processed
- Returns:
- a TokenEnumeration
- Throws:
- java.lang.Exception - if an error occurs
- WVToolException
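
The contract above can be illustrated with a minimal sketch of an implementing class. This is not the code of SimpleTokenizer or NGramTokenizer. To keep the example compilable on its own, it declares hypothetical stand-ins for WVToolException, WVTDocumentInfo, TokenEnumeration and WVTTokenizer; in particular, the hasMoreTokens()/nextToken() shape of TokenEnumeration and the exception constructor are assumptions, not taken from this page. LetterTokenizer and the collect helper are illustrative names only. The sketch treats every character that is not a letter or digit as a separator and drops it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

public class TokenizerSketch {

    // Hypothetical stand-ins for the WVTool types (assumed shapes, not the real classes).
    static class WVToolException extends Exception {
        WVToolException(String message, Throwable cause) { super(message, cause); }
    }

    static class WVTDocumentInfo { }

    interface TokenEnumeration {
        boolean hasMoreTokens();
        String nextToken() throws WVToolException;
    }

    interface WVTTokenizer {
        TokenEnumeration tokenize(Reader source, WVTDocumentInfo d) throws WVToolException;
    }

    // Illustrative tokenizer: runs of letters/digits are tokens, everything else is a separator.
    static class LetterTokenizer implements WVTTokenizer {
        @Override
        public TokenEnumeration tokenize(Reader source, WVTDocumentInfo d) throws WVToolException {
            return new LetterTokenEnumeration(source);
        }
    }

    static class LetterTokenEnumeration implements TokenEnumeration {
        private final Reader source;
        private String pending; // next token to hand out, or null once the stream is exhausted

        LetterTokenEnumeration(Reader source) throws WVToolException {
            this.source = source;
            this.pending = readToken();
        }

        @Override
        public boolean hasMoreTokens() {
            return pending != null;
        }

        @Override
        public String nextToken() throws WVToolException {
            if (pending == null) {
                throw new NoSuchElementException("no more tokens");
            }
            String result = pending;
            pending = readToken(); // read one token ahead so hasMoreTokens() stays cheap
            return result;
        }

        // Reads characters until a separator ends a non-empty token or the stream ends.
        private String readToken() throws WVToolException {
            StringBuilder buf = new StringBuilder();
            try {
                int c;
                while ((c = source.read()) != -1) {
                    if (Character.isLetterOrDigit((char) c)) {
                        buf.append((char) c);
                    } else if (buf.length() > 0) {
                        break; // separator after a token: the token is complete
                    }
                    // separators before a token are skipped
                }
            } catch (IOException e) {
                throw new WVToolException("Could not read from source", e);
            }
            return buf.length() > 0 ? buf.toString() : null;
        }
    }

    // Convenience helper for the usage example: drains the enumeration into a list.
    static List<String> collect(String text) throws WVToolException {
        WVTTokenizer tokenizer = new LetterTokenizer();
        TokenEnumeration tokens = tokenizer.tokenize(new StringReader(text), new WVTDocumentInfo());
        List<String> result = new ArrayList<>();
        while (tokens.hasMoreTokens()) {
            result.add(tokens.nextToken());
        }
        return result;
    }

    public static void main(String[] args) throws WVToolException {
        System.out.println(collect("Hello, world! 42")); // prints [Hello, world, 42]
    }
}
```

The enumeration reads lazily from the Reader, one token ahead, so the whole document never has to be held in memory. An I/O failure is wrapped in the stand-in WVToolException, matching the throws clause of tokenize.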