A B C D E F G H I J K L M N O P R S T U V W X

A

AbstractStemmer - Class in edu.udo.cs.wvtool.generic.stemmer
An abstract stemmer.
AbstractStemmer() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.AbstractStemmer
 
AbstractStopWordFilter - Class in edu.udo.cs.wvtool.generic.wordfilter
Abstract class implementing basic functionality for several word filters.
AbstractStopWordFilter(int) - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
Constructor for AbstractStopWordFilter.
AbstractStopWordFilter() - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
Constructor for AbstractStopWordFilter.
AbstractWordNetStemmer - Class in edu.udo.cs.wvtool.generic.stemmer
An abstract stemming class using the WordNet dictionary.
AbstractWordNetStemmer(SimpleStemmer, int) - Constructor for class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
 
AbstractWordNetStemmer() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
 
add(char) - Method in class edu.udo.cs.wvtool.external.Stemmer
Add a character to the word being stemmed.
add(char[], int) - Method in class edu.udo.cs.wvtool.external.Stemmer
Adds wLen characters to the word being stemmed contained in a portion of a char[] array.
addEntry(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.main.WVTFileInputList
Add an entry to the list.
additionalMap - Variable in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
addMappings - Variable in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
addOccurance() - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Add an occurrence of the word for the document currently processed.
addRegularExpression(String, String) - Method in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
addTermMapping(String, String) - Method in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
addWordOccurance(String) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Count the occurrence of the given word.
appendWords - Variable in class edu.udo.cs.wvtool.wordlist.WVTWordList
indicates whether missing words should be added to the list
attributeCount - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
attributes - Variable in class edu.udo.cs.wvtool.external.XmlReader
 

B

b - Variable in class edu.udo.cs.wvtool.external.Stemmer
 
BinaryOccurrences - Class in edu.udo.cs.wvtool.generic.vectorcreation
Create the vector by taking the number of occurrences.
BinaryOccurrences() - Constructor for class edu.udo.cs.wvtool.generic.vectorcreation.BinaryOccurrences
 

C

CDSECT - Static variable in class edu.udo.cs.wvtool.external.XmlReader
 
classCount - Variable in class edu.udo.cs.wvtool.wordlist.WVTWord
counter for the class frequencies
className - Variable in class edu.udo.cs.wvtool.config.WVTConfigurationFact
the name of the class from which to create an object as component
classValue - Variable in class edu.udo.cs.wvtool.main.WVTDocumentInfo
the class value, which is assigned to the document.
close() - Method in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
close(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.loader.SourceAsTextLoader
Close the resource from which the given document has been read.
close(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.loader.UniversalLoader
 
close(WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.loader.WVTDocumentLoader
Close the resource from which the given document has been read.
close() - Method in class edu.udo.cs.wvtool.generic.output.WordVectorWriter
 
closeDocument(boolean) - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Close the processing of the current document, no class value is provided for this document.
closeDocument(int, boolean) - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Close the processing of the current document, if a class value is provided.
closeDocument(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Used to reset the calculation for individual documents after the given document has been processed.
column - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
component - Variable in class edu.udo.cs.wvtool.config.WVTConfigurationFact
the object that represents the component
configure(Crawler) - Method in class edu.udo.cs.wvtool.crawler.CrawledInputList
 
cons(int) - Method in class edu.udo.cs.wvtool.external.Stemmer
 
containsLettersOnly(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
contentEncoding - Variable in class edu.udo.cs.wvtool.crawler.WVToolCrawler
the encoding of the crawled document
contentEncoding - Variable in class edu.udo.cs.wvtool.main.WVTDocumentInfo
the encoding of the document
contentLanguage - Variable in class edu.udo.cs.wvtool.crawler.WVToolCrawler
the language the documents are written in (english, german, ...)
contentLanguage - Variable in class edu.udo.cs.wvtool.main.WVTDocumentInfo
the language the document is written in (english, german, ...)
contentType - Variable in class edu.udo.cs.wvtool.crawler.WVToolCrawler
the MIME content type of the crawled documents
contentType - Variable in class edu.udo.cs.wvtool.main.WVTDocumentInfo
the MIME content type of the document
convertChars(Reader, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.charmapper.DummyCharConverter
 
convertChars(Reader, WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.charmapper.WVTCharConverter
Convert characters.
convertToPlainText(InputStream, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.inputfilter.TextInputFilter
 
convertToPlainText(InputStream, WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.inputfilter.WVTInputFilter
Convert the input stream to plain natural text.
convertToPlainText(InputStream, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.inputfilter.XMLInputFilter
 
CrawledInputList - Class in edu.udo.cs.wvtool.crawler
Input list obtained by crawling from a set of initial URLs.
CrawledInputList(WVToolCrawler[]) - Constructor for class edu.udo.cs.wvtool.crawler.CrawledInputList
 
CrawledInputList(WVToolCrawler) - Constructor for class edu.udo.cs.wvtool.crawler.CrawledInputList
 
crawlers - Variable in class edu.udo.cs.wvtool.crawler.CrawledInputList
 
createVector(int[], int, WVTWordList, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.vectorcreation.BinaryOccurrences
 
createVector(int[], int, WVTWordList, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.vectorcreation.TermFrequency
 
createVector(int[], int, WVTWordList, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.vectorcreation.TermOccurrences
 
createVector(int[], int, WVTWordList, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.vectorcreation.TFIDF
 
createVector(int[], int, WVTWordList, WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.vectorcreation.WVTVectorCreator
Create a word vector from term frequencies and a word list.
createVector(String, WVTDocumentInfo, WVTConfiguration, WVTWordList) - Method in class edu.udo.cs.wvtool.main.WVTool
Create a single word vector.
createVector(String, WVTWordList) - Method in class edu.udo.cs.wvtool.main.WVTool
Create an individual word vector from a String using TF/IDF weights and standard configuration.
createVectors(WVTInputList, WVTConfiguration, int, int) - Method in class edu.udo.cs.wvtool.main.WVTool
Create a word list and after this word vectors, both from the same input list.
createVectors(WVTInputList, WVTConfiguration) - Method in class edu.udo.cs.wvtool.main.WVTool
Deprecated. Please use the method createVectors(WVTInputList input, WVTConfiguration config, int pruneMin, int pruneMax)
createVectors(WVTInputList, WVTConfiguration, WVTWordList) - Method in class edu.udo.cs.wvtool.main.WVTool
Create word vectors from an input list.
createWordList(WVTInputList, WVTConfiguration) - Method in class edu.udo.cs.wvtool.main.WVTool
Create a word list from scratch based on the given texts.
createWordList(WVTInputList, WVTConfiguration, List, boolean) - Method in class edu.udo.cs.wvtool.main.WVTool
Create a word list based on an existing word list.
current - Variable in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
currentInputStream - Variable in class edu.udo.cs.wvtool.generic.loader.UniversalLoader
 
currentToken - Variable in class edu.udo.cs.wvtool.generic.tokenizer.SimpleTokenizer
The token that is currently provided.
currentToken - Variable in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
The token that is currently provided.
currentTokens - Variable in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
The token that is currently provided.
cvc(int) - Method in class edu.udo.cs.wvtool.external.Stemmer
 
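The createVector entries above distinguish binary occurrences, term frequencies and TF/IDF weights. The following is a minimal, self-contained sketch of how such weightings are commonly computed; it is not the WVTool implementation. The class VectorSketch, its method names, and the particular TF/IDF variant (tf * ln(N / df), without normalisation) are assumptions made for illustration only.

```java
/** Hypothetical helper for illustration; not part of WVTool. */
final class VectorSketch {

    /** 1.0 where a term occurs at least once in the document, 0.0 otherwise. */
    static double[] binary(int[] counts) {
        double[] v = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            v[i] = counts[i] > 0 ? 1.0 : 0.0;
        }
        return v;
    }

    /** Raw counts divided by the total number of term occurrences in the document. */
    static double[] termFrequency(int[] counts, int numTermOccurrences) {
        double[] v = new double[counts.length];
        if (numTermOccurrences <= 0) {
            return v; // empty document: all zeros
        }
        for (int i = 0; i < counts.length; i++) {
            v[i] = (double) counts[i] / numTermOccurrences;
        }
        return v;
    }

    /** Assumed variant: tf * ln(numDocs / docFreq); terms with docFreq 0 get weight 0. */
    static double[] tfidf(int[] counts, int numTermOccurrences, int[] docFreq, int numDocs) {
        double[] tf = termFrequency(counts, numTermOccurrences);
        double[] v = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0 && docFreq[i] > 0) {
                v[i] = tf[i] * Math.log((double) numDocs / docFreq[i]);
            }
        }
        return v;
    }
}
```

A term that occurs in every document receives a TF/IDF weight of zero under this variant, which is why very frequent words contribute little to the vector.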

D

DEFAULT_LANGUAGE - Static variable in class edu.udo.cs.wvtool.generic.stemmer.SnowballStemmerWrapper
 
DEFAULT_MIN_CHARS - Static variable in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
 
DEFAULT_MIN_NUM_CHARS - Static variable in class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapper
 
DEFAULT_MIN_NUM_CHARS - Static variable in class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapperGerman
 
DEFAULT_PRUNE_MAX - Static variable in class edu.udo.cs.wvtool.main.WVTool
upper boundary for automatic pruning
DEFAULT_PRUNE_MIN - Static variable in class edu.udo.cs.wvtool.main.WVTool
lower boundary for automatic pruning
defineCharacterEntity(String, String) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
degenerated - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
depth - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
dictionary - Variable in class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
 
DictionaryStemmer - Class in edu.udo.cs.wvtool.generic.stemmer
A stemmer that is based on an explicit dictionary containing pairs of terms and base forms.
DictionaryStemmer() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
DictionaryStemmer(Reader) - Constructor for class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
DictionaryStemmer(Reader, SimpleStemmer, boolean) - Constructor for class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
docInfos - Variable in class edu.udo.cs.wvtool.crawler.CrawledInputList
 
documentCount - Variable in class edu.udo.cs.wvtool.wordlist.WVTWord
counter for the document frequencies
documentInfo - Variable in class edu.udo.cs.wvtool.main.WVTWordVector
reference to the document information
doublec(int) - Method in class edu.udo.cs.wvtool.external.Stemmer
 
DummyCharConverter - Class in edu.udo.cs.wvtool.generic.charmapper
A simple Dummy, which does nothing at all (returns exactly what it gets).
DummyCharConverter() - Constructor for class edu.udo.cs.wvtool.generic.charmapper.DummyCharConverter
 
DummyStemmer - Class in edu.udo.cs.wvtool.generic.stemmer
Does not do anything with the tokens.
DummyStemmer() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.DummyStemmer
 
DummyWordFilter - Class in edu.udo.cs.wvtool.generic.wordfilter
Dummy Wrapper for the stop word class (without performing any filtering).
DummyWordFilter() - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.DummyWordFilter
 
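The DictionaryStemmer entries above describe a stemmer driven by explicit pairs of terms and base forms, together with a fallBackStemmer field. The sketch below shows one plausible way such a lookup with fallback could behave. The class DictionaryStemmerSketch is hypothetical; only the method names addTermMapping and getBase are borrowed from the index, and their behaviour here is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of a dictionary stemmer; not the WVTool class. */
final class DictionaryStemmerSketch {
    private final Map<String, String> baseForms = new HashMap<>();
    private final UnaryOperator<String> fallback;

    /** The fallback is applied to terms that have no dictionary entry. */
    DictionaryStemmerSketch(UnaryOperator<String> fallback) {
        this.fallback = fallback;
    }

    /** Register an explicit term -> base form pair. */
    void addTermMapping(String term, String base) {
        baseForms.put(term, base);
    }

    /** Return the dictionary base form if present, otherwise the fallback result. */
    String getBase(String term) {
        String base = baseForms.get(term);
        return base != null ? base : fallback.apply(term);
    }
}
```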

E

edu.udo.cs.wvtool.config - package edu.udo.cs.wvtool.config
Provides classes connected with the dynamic configuration of the tool.
edu.udo.cs.wvtool.crawler - package edu.udo.cs.wvtool.crawler
Provides classes and interfaces for using the WebSphinx crawler with the wvtool.
edu.udo.cs.wvtool.external - package edu.udo.cs.wvtool.external
Contains individual classes by other people.
edu.udo.cs.wvtool.generic.charmapper - package edu.udo.cs.wvtool.generic.charmapper
Provides classes and interfaces related to char conversion.
edu.udo.cs.wvtool.generic.inputfilter - package edu.udo.cs.wvtool.generic.inputfilter
Provides classes and interfaces for document input filters.
edu.udo.cs.wvtool.generic.loader - package edu.udo.cs.wvtool.generic.loader
Provides classes and interfaces for document loading.
edu.udo.cs.wvtool.generic.output - package edu.udo.cs.wvtool.generic.output
Provides classes and interfaces for output filters.
edu.udo.cs.wvtool.generic.stemmer - package edu.udo.cs.wvtool.generic.stemmer
Provides classes and interfaces for stemmers.
edu.udo.cs.wvtool.generic.tokenizer - package edu.udo.cs.wvtool.generic.tokenizer
Provides classes and interfaces for tokenization.
edu.udo.cs.wvtool.generic.vectorcreation - package edu.udo.cs.wvtool.generic.vectorcreation
Provides classes and interfaces for vector creation.
edu.udo.cs.wvtool.generic.wordfilter - package edu.udo.cs.wvtool.generic.wordfilter
Provides classes and interfaces for the filtering of words.
edu.udo.cs.wvtool.main - package edu.udo.cs.wvtool.main
The main classes, which contain most of the relevant interfaces.
edu.udo.cs.wvtool.util - package edu.udo.cs.wvtool.util
Provides some utility classes.
edu.udo.cs.wvtool.wordlist - package edu.udo.cs.wvtool.wordlist
Provides classes connected with the creation/modification of the word list.
elementStack - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
END_DOCUMENT - Static variable in class edu.udo.cs.wvtool.external.XmlReader
Signal logical end of XML document
END_TAG - Static variable in class edu.udo.cs.wvtool.external.XmlReader
End tag was just read
ends(String) - Method in class edu.udo.cs.wvtool.external.Stemmer
 
ensureCapacity(String[], int) - Static method in class edu.udo.cs.wvtool.external.XmlReader
 
ENTITY_REF - Static variable in class edu.udo.cs.wvtool.external.XmlReader
 
entityMap - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
eof - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
equals(Object) - Method in class edu.udo.cs.wvtool.main.WVTDocumentInfo
 
exception(String) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
EXCEPTION - Static variable in class edu.udo.cs.wvtool.util.WVToolLogger
 

F

fallBackStemmer - Variable in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
filter(TokenEnumeration, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
 
filter(TokenEnumeration, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.wordfilter.SelectingWordFilter
 
filter - Variable in class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapperGerman
 
filter(TokenEnumeration, WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.wordfilter.WVTWordFilter
Filter tokens from a token stream.
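The filter entries above remove tokens from a token stream, and AbstractStopWordFilter additionally exposes a minimal number of characters (see getMinNumChars). A minimal sketch of that idea follows. StopWordFilterSketch is a hypothetical class; treating too-short words as stop words and comparing case-insensitively are assumptions, not documented WVTool behaviour.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stop-word filter for illustration; not the WVTool class. */
final class StopWordFilterSketch {
    private final Set<String> stopWords;
    private final int minNumChars;

    StopWordFilterSketch(Set<String> stopWords, int minNumChars) {
        this.stopWords = new HashSet<>(stopWords);
        this.minNumChars = minNumChars;
    }

    /** Assumption: words shorter than minNumChars are dropped like stop words. */
    boolean isStopword(String word) {
        return word.length() < minNumChars
                || stopWords.contains(word.toLowerCase(Locale.ROOT));
    }

    /** Keep only the tokens that are not filtered, preserving their order. */
    List<String> filter(Iterable<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!isStopword(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```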

G

getAttributeCount() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getAttributeName(int) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getAttributeValue(int) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getAttributeValue(String) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.AbstractStemmer
 
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
 
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.DummyStemmer
 
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.LovinsStemmerWrapper
 
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.PorterStemmerWrapper
 
getBase(String) - Method in interface edu.udo.cs.wvtool.generic.stemmer.SimpleStemmer
Produce the base form of a given term.
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.SnowballStemmerWrapper
 
getBase(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.ToLowerCaseConverter
 
getClassFrequencies(int) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Get the document frequencies of documents having a given class value.
getClassFrequency(int) - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Return the class frequency of this word for a given class.
getClassValue() - Method in class edu.udo.cs.wvtool.main.WVTDocumentInfo
Getter for property classValue.
getColumnNumber() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getComponentForStep(String, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.config.WVTConfiguration
Get the object to use in a given step according to the given document information.
getContentEncoding() - Method in class edu.udo.cs.wvtool.main.WVTDocumentInfo
Getter for property contentEncoding.
getContentLanguage() - Method in class edu.udo.cs.wvtool.main.WVTDocumentInfo
Getter for property contentLanguage.
getContentType() - Method in class edu.udo.cs.wvtool.main.WVTDocumentInfo
Getter for property contentType.
getDepth() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getDocumentFrequencies() - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Get the document frequencies.
getDocumentFrequency() - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Return the document frequency for this word.
getDocumentInfo() - Method in class edu.udo.cs.wvtool.main.WVTWordVector
Returns the documentInfo.
getEntries() - Method in class edu.udo.cs.wvtool.crawler.CrawledInputList
 
getEntries(boolean) - Method in class edu.udo.cs.wvtool.main.WVTFileInputList
Return the list of entries
getEntries() - Method in class edu.udo.cs.wvtool.main.WVTFileInputList
 
getEntries() - Method in interface edu.udo.cs.wvtool.main.WVTInputList
Return the list of entries
getFrequenciesForCurrentDocument() - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Get the word frequencies for the document that is currently processed.
getFrequencyByRank(int) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Returns the document frequency of the word that is on the p-th rank, assuming that each word occupies exactly one rank.
getGlobalLogger() - Static method in class edu.udo.cs.wvtool.util.WVToolLogger
Get the global logging instance.
getIndexWord(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
 
getLineNumber() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getLocalFrequency() - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Return the frequency for the current document.
getLogLevel() - Method in class edu.udo.cs.wvtool.util.WVToolLogger
Get the current log level.
getMatchingComponent(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.config.WVTConfigurationFact
 
getMatchingComponent(WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.config.WVTConfigurationRule
Get a component object for a given document info
getMinNumChars() - Method in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
Get the minimal number of characters a word must contain to be processed.
getName() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getNumClasses() - Method in class edu.udo.cs.wvtool.crawler.CrawledInputList
 
getNumClasses() - Method in class edu.udo.cs.wvtool.main.WVTFileInputList
Returns the numClasses.
getNumClasses() - Method in interface edu.udo.cs.wvtool.main.WVTInputList
Returns the number of classes
getNumDocuments() - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Returns the numDocuments.
getNumWords() - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Return the number of words in the list.
getPositionDescription() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getResultBuffer() - Method in class edu.udo.cs.wvtool.external.Stemmer
Returns a reference to a character buffer containing the results of the stemming process.
getResultLength() - Method in class edu.udo.cs.wvtool.external.Stemmer
Returns the length of the word resulting from the stemming process.
getSourceName() - Method in class edu.udo.cs.wvtool.main.WVTDocumentInfo
Getter for property sourceName.
getTermCountForCurrentDocument() - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
 
getText() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getType() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
getURLS() - Method in class edu.udo.cs.wvtool.crawler.WVToolCrawler
 
getValues() - Method in class edu.udo.cs.wvtool.main.WVTWordVector
Returns the values.
getWord() - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Returns the word.
getWord(int) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Returns the WVTWord with the given index.
getWordForm(IndexWord) - Method in class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
Obtain a derived form of the specified word.
getWordForm(IndexWord) - Method in class edu.udo.cs.wvtool.generic.stemmer.WordNetHypernymStemmer
 
getWordForm(IndexWord) - Method in class edu.udo.cs.wvtool.generic.stemmer.WordNetSynonymStemmer
 

H

hasClassValue - Variable in class edu.udo.cs.wvtool.main.WVTDocumentInfo
does the document have a class value assigned to it?
hasClassValue() - Method in class edu.udo.cs.wvtool.main.WVTDocumentInfo
Returns whether the document has a class value assigned to it (a boolean).
hasMoreTokens() - Method in class edu.udo.cs.wvtool.generic.stemmer.AbstractStemmer
 
hasMoreTokens() - Method in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
 
hasMoreTokens() - Method in class edu.udo.cs.wvtool.generic.tokenizer.SimpleTokenizer
 
hasMoreTokens() - Method in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
 
hasMoreTokens() - Method in interface edu.udo.cs.wvtool.util.TokenEnumeration
Determine whether there are tokens left in the Enumeration.
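The hasMoreTokens and nextToken entries above suggest an enumeration with one token of lookahead (compare the currentToken field and readNextToken method listed for SimpleTokenizer). The sketch below illustrates that pattern over a java.io.Reader. LetterTokenizerSketch is hypothetical, and the rule that a token is a maximal run of letters is an assumption rather than the documented behaviour of any WVTool tokenizer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.NoSuchElementException;

/** Hypothetical lookahead tokenizer for illustration; not a WVTool class. */
final class LetterTokenizerSketch {
    private final Reader input;
    private String currentToken; // null once the stream is exhausted

    LetterTokenizerSketch(Reader input) {
        this.input = input;
        readNextToken(); // fill the lookahead before the first call
    }

    boolean hasMoreTokens() {
        return currentToken != null;
    }

    String nextToken() {
        if (currentToken == null) {
            throw new NoSuchElementException("no more tokens");
        }
        String result = currentToken;
        readNextToken();
        return result;
    }

    /** Read the next maximal run of letters into currentToken, or null at end of input. */
    private void readNextToken() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isLetter(c)) {
                    sb.append((char) c);
                } else if (sb.length() > 0) {
                    break; // a separator ends the current token
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        currentToken = sb.length() > 0 ? sb.toString() : null;
    }
}
```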

I

i - Variable in class edu.udo.cs.wvtool.external.Stemmer
 
i_end - Variable in class edu.udo.cs.wvtool.external.Stemmer
 
INC - Static variable in class edu.udo.cs.wvtool.external.Stemmer
 
input - Variable in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
 
input - Variable in class edu.udo.cs.wvtool.generic.tokenizer.SimpleTokenizer
The underlying character stream of the currently tokenized document
input - Variable in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
the token stream from which the characters are read
inputList - Variable in class edu.udo.cs.wvtool.main.WVTFileInputList
the list of input files
isAppendWords() - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Returns the appendWords.
isEmptyElementTag() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
isStopword(String) - Static method in class edu.udo.cs.wvtool.external.Stopwords
Returns true if the given string is a stop word.
isStopword(String) - Method in class edu.udo.cs.wvtool.external.StopWordsGerman
 
isStopword(String) - Method in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
Determines whether the specified word is stopword.
isStopword(String) - Method in class edu.udo.cs.wvtool.generic.wordfilter.DummyWordFilter
 
isStopword(String) - Method in class edu.udo.cs.wvtool.generic.wordfilter.SelectingWordFilter
 
isStopword(String) - Method in class edu.udo.cs.wvtool.generic.wordfilter.StopWordFilterFile
 
isStopword(String) - Method in class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapper
 
isStopword(String) - Method in class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapperGerman
 
isUpdateOnlyCurrent() - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Returns the updateOnlyCurrent.
isWhitespace - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
isWhitespace() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
iterateWords(WVTInputList, WVTConfiguration, WVToolWordListener) - Method in class edu.udo.cs.wvtool.main.WVTool
Process the specified documents using the configured steps and send all encountered words to a listener class.

J

j - Variable in class edu.udo.cs.wvtool.external.Stemmer
 

K

k - Variable in class edu.udo.cs.wvtool.external.Stemmer
 

L

LEGACY - Static variable in class edu.udo.cs.wvtool.external.XmlReader
 
line - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
loadDocument(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.loader.SourceAsTextLoader
Open the document and return an input stream on it.
loadDocument(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.loader.UniversalLoader
 
loadDocument(WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.loader.WVTDocumentLoader
Open the document and return an input stream on it.
localCount - Variable in class edu.udo.cs.wvtool.wordlist.WVTWord
counter for the term frequencies in the currently processed document
logException(String, Exception) - Method in class edu.udo.cs.wvtool.util.WVToolLogger
Log an exception.
logger - Static variable in class edu.udo.cs.wvtool.util.WVToolLogger
 
logLevel - Static variable in class edu.udo.cs.wvtool.util.WVToolLogger
 
logMessage(String, int) - Method in class edu.udo.cs.wvtool.util.StdOutLogger
 
logMessage(String, int) - Method in class edu.udo.cs.wvtool.util.WVToolLogger
Log a message if the current log level is equal to or higher than that of the message.
LovinsStemmer - Class in edu.udo.cs.wvtool.external
Implements the Lovins stemmer.
LovinsStemmer() - Constructor for class edu.udo.cs.wvtool.external.LovinsStemmer
 
LovinsStemmerWrapper - Class in edu.udo.cs.wvtool.generic.stemmer
Wrapper for the Lovins stemmer.
LovinsStemmerWrapper() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.LovinsStemmerWrapper
Constructor for LovinsStemmerWrapper.

M

m() - Method in class edu.udo.cs.wvtool.external.Stemmer
 
m_CompMode - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
C version compatibility mode (emulates bugs in original C implementation)
m_l1 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l10 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l11 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
The hash tables containing the list of endings.
m_l2 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l3 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l4 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l5 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l6 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l7 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l8 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_l9 - Static variable in class edu.udo.cs.wvtool.external.LovinsStemmer
 
m_Stopwords - Static variable in class edu.udo.cs.wvtool.external.Stopwords
The hashtable containing the list of stopwords
m_Stopwords - Static variable in class edu.udo.cs.wvtool.external.StopWordsGerman
The hashtable containing the list of stopwords
main(String[]) - Static method in class edu.udo.cs.wvtool.external.LovinsStemmer
Stems text coming into stdin and writes it to stdout.
main(String[]) - Static method in class edu.udo.cs.wvtool.external.Stemmer
Test program for demonstrating the Stemmer.
main(String[]) - Static method in class edu.udo.cs.wvtool.util.TextGenerator
 
maxSenses - Variable in class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
 
minNumChars - Variable in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
 

N

n - Variable in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
 
name - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
next() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
nextToken() - Method in class edu.udo.cs.wvtool.generic.stemmer.AbstractStemmer
 
nextToken() - Method in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
 
nextToken() - Method in class edu.udo.cs.wvtool.generic.tokenizer.SimpleTokenizer
 
nextToken() - Method in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
 
nextToken() - Method in interface edu.udo.cs.wvtool.util.TokenEnumeration
Return the next token from the stream.
NGramTokenizer - Class in edu.udo.cs.wvtool.generic.tokenizer
Creates tokens by creating ngrams of the tokens received from an inner tokenizer.
NGramTokenizer(int, WVTTokenizer) - Constructor for class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
 
numClasses - Variable in class edu.udo.cs.wvtool.main.WVTFileInputList
the number of classes
numClasses - Variable in class edu.udo.cs.wvtool.wordlist.WVTWordList
the number of possible class values
numDocuments - Variable in class edu.udo.cs.wvtool.wordlist.WVTWordList
the number of documents processed so far
numLocalTerms - Variable in class edu.udo.cs.wvtool.wordlist.WVTWordList
the number of terms processed in the current document so far
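The NGramTokenizer entry above creates n-grams from the tokens of an inner tokenizer. The index does not say whether these are character or word n-grams; the sketch below assumes character n-grams taken from each token and assumes that tokens no longer than n are passed through unchanged. NGramSketch and charNGrams are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical n-gram helper for illustration; not the WVTool tokenizer. */
final class NGramSketch {

    /** Return all overlapping character n-grams of the token, in order. */
    static List<String> charNGrams(String token, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        if (token.length() <= n) {
            // Assumption: short tokens are kept whole rather than dropped.
            if (!token.isEmpty()) {
                grams.add(token);
            }
            return grams;
        }
        for (int i = 0; i + n <= token.length(); i++) {
            grams.add(token.substring(i, i + n));
        }
        return grams;
    }
}
```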

O

openNewDocument(WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.main.WVToolWordListener
Invoked as a new document is opened for processing.
out - Variable in class edu.udo.cs.wvtool.generic.output.WordVectorWriter
 

P

parseDoctype() - Method in class edu.udo.cs.wvtool.external.XmlReader
precondition:
parseEndTag() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
parseLegacy(boolean) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
parseStartTag() - Method in class edu.udo.cs.wvtool.external.XmlReader
Sets name and attributes
peek0 - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
peek1 - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
peekType() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
pop(int) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
PorterStemmerWrapper - Class in edu.udo.cs.wvtool.generic.stemmer
Wrapper for the Porter Stemmer provided with the system.
PorterStemmerWrapper() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.PorterStemmerWrapper
 
pos - Variable in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
processWord(String) - Method in interface edu.udo.cs.wvtool.main.WVToolWordListener
Invoked as a word is processed in the current document.
pruneByFrequency(int, int) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Prune the word list by document frequencies.
push(int) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
pushEntity() - Method in class edu.udo.cs.wvtool.external.XmlReader
result: isWhitespace; if the setName parameter is set, the name of the entity is stored in "name"
pushText(int) - Method in class edu.udo.cs.wvtool.external.XmlReader
types: '<': parse to any token (for nextToken ()); '"': parse to quote; ' ': parse to whitespace or '>'
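The pruneByFrequency(int, int) entry above prunes the word list by document frequency, and WVTool lists DEFAULT_PRUNE_MIN and DEFAULT_PRUNE_MAX as the boundaries for automatic pruning. The sketch below shows the general idea on a plain map from word to document frequency. PruneSketch is hypothetical, and treating both bounds as inclusive is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical pruning helper for illustration; not the WVTWordList method. */
final class PruneSketch {

    /** Keep only words whose document frequency lies in [minDf, maxDf]. */
    static Map<String, Integer> pruneByFrequency(Map<String, Integer> docFreq,
                                                 int minDf, int maxDf) {
        Map<String, Integer> kept = new LinkedHashMap<>(); // preserves input order
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            int df = e.getValue();
            if (df >= minDf && df <= maxDf) {
                kept.put(e.getKey(), df);
            }
        }
        return kept;
    }
}
```

Pruning at the lower end removes rare words (often typos), while pruning at the upper end removes words that appear in almost every document and carry little discriminating information.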

R

r(String) - Method in class edu.udo.cs.wvtool.external.Stemmer
 
read() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
read(char) - Method in class edu.udo.cs.wvtool.external.XmlReader
 
read() - Method in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
read(char[], int, int) - Method in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
reader - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
reader - Variable in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
readName() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
readNextToken() - Method in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
Read a token from the character stream and store it into currentToken.
readNextToken() - Method in class edu.udo.cs.wvtool.generic.tokenizer.SimpleTokenizer
Read a token from the character stream and store it into currentToken.
readNextValidToken() - Method in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
Read tokens from the stream until a token is found that is not filtered or the end of the stream is reached.
readText() - Method in class edu.udo.cs.wvtool.external.XmlReader
If the current event is text, the value of getText is returned and next() is called.
readText() - Method in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
recodeEnding(String) - Method in class edu.udo.cs.wvtool.external.LovinsStemmer
Recodes ending of given word.
regExprList - Variable in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
relaxed - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
removeEnding(String) - Method in class edu.udo.cs.wvtool.external.LovinsStemmer
Finds and removes ending from given word.
removeEntry(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.main.WVTFileInputList
Remove an entry from the list.
require(int, String) - Method in class edu.udo.cs.wvtool.external.XmlReader
Tests whether the current event is of the given type and whether the name matches.
ruleSet - Variable in class edu.udo.cs.wvtool.config.WVTConfiguration
data structure to store the rules for the individual steps

S

SelectingWordFilter - Class in edu.udo.cs.wvtool.generic.wordfilter
Class that dynamically selects the word filter according to the selected language.
SelectingWordFilter() - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.SelectingWordFilter
 
SelectingWordFilter(Map) - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.SelectingWordFilter
 
setAppendWords(boolean) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Sets the appendWords.
setClassFrequency(int, int) - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Set the class frequency
setConfigurationRule(String, WVTConfigurationRule) - Method in class edu.udo.cs.wvtool.config.WVTConfiguration
Set a rule for a given step.
setDocumentFrequency(int) - Method in class edu.udo.cs.wvtool.wordlist.WVTWord
Set the document frequency
setDocumentInfo(WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.main.WVTWordVector
Sets the documentInfo.
setGlobalLogger(WVToolLogger) - Static method in class edu.udo.cs.wvtool.util.WVToolLogger
Set the global logging instance.
setLanguage(String) - Method in class edu.udo.cs.wvtool.generic.stemmer.SnowballStemmerWrapper
 
setLogLevel(int) - Method in class edu.udo.cs.wvtool.util.WVToolLogger
Set the log level.
setMinNumChars(int) - Method in class edu.udo.cs.wvtool.generic.wordfilter.AbstractStopWordFilter
Set the minimal number of characters a word must contain to be processed.
setMinNumChars(int) - Method in class edu.udo.cs.wvtool.generic.wordfilter.SelectingWordFilter
 
setto(String) - Method in class edu.udo.cs.wvtool.external.Stemmer
 
setUpdateOnlyCurrent(boolean) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Sets the updateOnlyCurrent.
setValues(double[]) - Method in class edu.udo.cs.wvtool.main.WVTWordVector
Sets the values.
SimpleStemmer - Interface in edu.udo.cs.wvtool.generic.stemmer
Interface to a stemming algorithm.
SimpleTokenizer - Class in edu.udo.cs.wvtool.generic.tokenizer
This class implements a simple tokenizer.
SimpleTokenizer() - Constructor for class edu.udo.cs.wvtool.generic.tokenizer.SimpleTokenizer
 
skip() - Method in class edu.udo.cs.wvtool.external.XmlReader
 
skipErrors - Variable in class edu.udo.cs.wvtool.main.WVTool
Whether errors should be skipped.
SnowballStemmerWrapper - Class in edu.udo.cs.wvtool.generic.stemmer
Wrapper for the snowball stemmer package.
SnowballStemmerWrapper() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.SnowballStemmerWrapper
Constructor for SnowballStemmerWrapper.
source - Variable in class edu.udo.cs.wvtool.generic.stemmer.AbstractStemmer
 
SourceAsTextLoader - Class in edu.udo.cs.wvtool.generic.loader
This loader simply uses the defined source as text.
SourceAsTextLoader() - Constructor for class edu.udo.cs.wvtool.generic.loader.SourceAsTextLoader
 
sourceName - Variable in class edu.udo.cs.wvtool.main.WVTDocumentInfo
the source of the document as String.
sparse - Variable in class edu.udo.cs.wvtool.generic.output.WordVectorWriter
 
srcBuf - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
srcCount - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
srcPos - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
START_DOCUMENT - Static variable in class edu.udo.cs.wvtool.external.XmlReader
Return value of getType before first call to next()
START_TAG - Static variable in class edu.udo.cs.wvtool.external.XmlReader
Start tag was just read
STATUS - Static variable in class edu.udo.cs.wvtool.util.WVToolLogger
 
StdOutLogger - Class in edu.udo.cs.wvtool.util
Logging class that writes all messages to stdout.
StdOutLogger() - Constructor for class edu.udo.cs.wvtool.util.StdOutLogger
 
stem(String) - Method in class edu.udo.cs.wvtool.external.LovinsStemmer
Returns the stemmed version of the given word.
stem() - Method in class edu.udo.cs.wvtool.external.Stemmer
Stem the word placed into the Stemmer buffer through calls to add().
stem(TokenEnumeration, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.stemmer.AbstractStemmer
 
stem(TokenEnumeration, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.stemmer.SnowballStemmerWrapper
 
stem(TokenEnumeration, WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.stemmer.WVTStemmer
Convert a list of tokens to a list of stems.
Stemmer - Class in edu.udo.cs.wvtool.external
Stemmer, implementing the Porter Stemming Algorithm. The Stemmer class transforms a word into its root form.
Stemmer() - Constructor for class edu.udo.cs.wvtool.external.Stemmer
 
stemmer - Variable in class edu.udo.cs.wvtool.generic.stemmer.AbstractWordNetStemmer
 
stemmer - Variable in class edu.udo.cs.wvtool.generic.stemmer.LovinsStemmerWrapper
the stemmer itself
stemmer - Variable in class edu.udo.cs.wvtool.generic.stemmer.PorterStemmerWrapper
the stemmer itself
stemmer - Variable in class edu.udo.cs.wvtool.generic.stemmer.SnowballStemmerWrapper
the stemmer itself
stemMethod - Variable in class edu.udo.cs.wvtool.generic.stemmer.SnowballStemmerWrapper
 
stemString(String) - Method in class edu.udo.cs.wvtool.external.LovinsStemmer
Stems everything in the given string.
step1() - Method in class edu.udo.cs.wvtool.external.Stemmer
 
step2() - Method in class edu.udo.cs.wvtool.external.Stemmer
 
step3() - Method in class edu.udo.cs.wvtool.external.Stemmer
 
step4() - Method in class edu.udo.cs.wvtool.external.Stemmer
 
step5() - Method in class edu.udo.cs.wvtool.external.Stemmer
 
step6() - Method in class edu.udo.cs.wvtool.external.Stemmer
 
STEP_CHAR_MAPPER - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
STEP_INPUT_FILTER - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
STEP_LOADER - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
STEP_OUTPUT - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
STEP_STEMMER - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
STEP_TOKENIZER - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
STEP_VECTOR_CREATION - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
STEP_WORDFILTER - Static variable in class edu.udo.cs.wvtool.config.WVTConfiguration
 
StopWordFilterFile - Class in edu.udo.cs.wvtool.generic.wordfilter
Filters all words specified in a file.
StopWordFilterFile(int, Reader) - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.StopWordFilterFile
 
Stopwords - Class in edu.udo.cs.wvtool.external
Class that can test whether a given string is a stop word.
Stopwords() - Constructor for class edu.udo.cs.wvtool.external.Stopwords
 
stopWords - Static variable in class edu.udo.cs.wvtool.external.StopWordsGerman
 
stopWords - Variable in class edu.udo.cs.wvtool.generic.wordfilter.StopWordFilterFile
 
StopWordsGerman - Class in edu.udo.cs.wvtool.external
 
StopWordsGerman() - Constructor for class edu.udo.cs.wvtool.external.StopWordsGerman
 
StopWordsWrapper - Class in edu.udo.cs.wvtool.generic.wordfilter
Wrapper for the stop word class.
StopWordsWrapper() - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapper
Constructor for StopWordsWrapper.
StopWordsWrapper(int) - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapper
Constructor for StopWordsWrapper.
StopWordsWrapperGerman - Class in edu.udo.cs.wvtool.generic.wordfilter
A wrapper for the German stop word list.
StopWordsWrapperGerman(int) - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapperGerman
 
StopWordsWrapperGerman() - Constructor for class edu.udo.cs.wvtool.generic.wordfilter.StopWordsWrapperGerman
 
stopWordWrapperMap - Variable in class edu.udo.cs.wvtool.generic.wordfilter.SelectingWordFilter
 
store(Writer) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Write the wordlist to a stream.
storePlain(Writer) - Method in class edu.udo.cs.wvtool.wordlist.WVTWordList
Write the wordlist to a stream without any additional info.
stream - Variable in class edu.udo.cs.wvtool.generic.loader.SourceAsTextLoader
 

T

TagIgnoringReader - Class in edu.udo.cs.wvtool.generic.inputfilter
A reader that ignores all tags.
TagIgnoringReader(Reader) - Constructor for class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
TermFrequency - Class in edu.udo.cs.wvtool.generic.vectorcreation
Generate word vector by simply using term frequencies.
TermFrequency() - Constructor for class edu.udo.cs.wvtool.generic.vectorcreation.TermFrequency
 
termMap - Variable in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
TermOccurrences - Class in edu.udo.cs.wvtool.generic.vectorcreation
Create the vector by taking the number of occurrences.
TermOccurrences() - Constructor for class edu.udo.cs.wvtool.generic.vectorcreation.TermOccurrences
 
TEXT - Static variable in class edu.udo.cs.wvtool.external.XmlReader
Text was just read
text - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
TextGenerator - Class in edu.udo.cs.wvtool.util
Generate many random text documents.
TextGenerator() - Constructor for class edu.udo.cs.wvtool.util.TextGenerator
 
TextInputFilter - Class in edu.udo.cs.wvtool.generic.inputfilter
A simple input filter for documents already in plain text format.
TextInputFilter() - Constructor for class edu.udo.cs.wvtool.generic.inputfilter.TextInputFilter
 
TFIDF - Class in edu.udo.cs.wvtool.generic.vectorcreation
This class represents a mechanism to create TFIDF word vectors.
TFIDF() - Constructor for class edu.udo.cs.wvtool.generic.vectorcreation.TFIDF
 
TokenEnumeration - Interface in edu.udo.cs.wvtool.util
Interface for an enumeration of tokens.
tokenize(Reader, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
 
tokenize(Reader, WVTDocumentInfo) - Method in class edu.udo.cs.wvtool.generic.tokenizer.SimpleTokenizer
 
tokenize(Reader, WVTDocumentInfo) - Method in interface edu.udo.cs.wvtool.generic.tokenizer.WVTTokenizer
Tokenize a character stream.
tokenizer - Variable in class edu.udo.cs.wvtool.generic.tokenizer.NGramTokenizer
 
ToLowerCaseConverter - Class in edu.udo.cs.wvtool.generic.stemmer
Converts the tokens to lower case.
ToLowerCaseConverter() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.ToLowerCaseConverter
 
toString() - Method in class edu.udo.cs.wvtool.external.Stemmer
After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer and getResultLength (which is generally more efficient.)
txtBuf - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
txtPos - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
type - Variable in class edu.udo.cs.wvtool.external.XmlReader
 
TYPES - Variable in class edu.udo.cs.wvtool.external.XmlReader
 

U

UNEXPECTED_EOF - Static variable in class edu.udo.cs.wvtool.external.XmlReader
 
UniversalLoader - Class in edu.udo.cs.wvtool.generic.loader
Loader that is able to load individual documents from a URL or simply from a local file given by the source name in the document information.
UniversalLoader() - Constructor for class edu.udo.cs.wvtool.generic.loader.UniversalLoader
Constructor for UniversalLoader.
updateOnlyCurrent - Variable in class edu.udo.cs.wvtool.wordlist.WVTWordList
Indicates whether the document and class frequencies should be updated as well, or only the frequencies for the current document.
urlsToVectorize - Variable in class edu.udo.cs.wvtool.crawler.WVToolCrawler
 

V

values - Variable in class edu.udo.cs.wvtool.main.WVTWordVector
the values
vectorizePage(Page) - Method in class edu.udo.cs.wvtool.crawler.WVToolCrawler
 
visit(Page) - Method in class edu.udo.cs.wvtool.crawler.WVToolCrawler
 
vowelinstem() - Method in class edu.udo.cs.wvtool.external.Stemmer
 

W

WARNING - Static variable in class edu.udo.cs.wvtool.util.WVToolLogger
 
word - Variable in class edu.udo.cs.wvtool.wordlist.WVTWord
the word
wordList - Variable in class edu.udo.cs.wvtool.wordlist.WVTWordList
A sequential indexing structure, to ensure a fixed order of all words in the list
wordMap - Variable in class edu.udo.cs.wvtool.wordlist.WVTWordList
A Hash used to find words efficiently
WordNetHypernymStemmer - Class in edu.udo.cs.wvtool.generic.stemmer
Replaces a word by a hypernym.
WordNetHypernymStemmer() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.WordNetHypernymStemmer
 
WordNetHypernymStemmer(SimpleStemmer, int) - Constructor for class edu.udo.cs.wvtool.generic.stemmer.WordNetHypernymStemmer
 
WordNetSynonymStemmer - Class in edu.udo.cs.wvtool.generic.stemmer
Replaces a word by the first representative of its synset.
WordNetSynonymStemmer() - Constructor for class edu.udo.cs.wvtool.generic.stemmer.WordNetSynonymStemmer
 
WordNetSynonymStemmer(SimpleStemmer, int) - Constructor for class edu.udo.cs.wvtool.generic.stemmer.WordNetSynonymStemmer
 
WordVectorWriter - Class in edu.udo.cs.wvtool.generic.output
This class represents a mechanism, by which word vectors are stored to a character stream.
WordVectorWriter(Writer, boolean) - Constructor for class edu.udo.cs.wvtool.generic.output.WordVectorWriter
Create a new instance of WordVectorWriter.
write(WVTWordVector) - Method in class edu.udo.cs.wvtool.generic.output.WordVectorWriter
 
write(WVTWordVector) - Method in interface edu.udo.cs.wvtool.generic.output.WVTOutputFilter
Store a word vector.
writeAddedMappings(Writer) - Method in class edu.udo.cs.wvtool.generic.stemmer.DictionaryStemmer
 
WVTCharConverter - Interface in edu.udo.cs.wvtool.generic.charmapper
This interface represents a mechanism to convert char encodings.
WVTConfigException - Exception in edu.udo.cs.wvtool.config
Exception that is thrown if a configuration problem is encountered.
WVTConfigException() - Constructor for exception edu.udo.cs.wvtool.config.WVTConfigException
 
WVTConfiguration - Class in edu.udo.cs.wvtool.config
WVTool Configuration.
WVTConfiguration(Reader) - Constructor for class edu.udo.cs.wvtool.config.WVTConfiguration
Creates a new instance of WVTConfiguration by reading a configuration from a stream.
WVTConfiguration() - Constructor for class edu.udo.cs.wvtool.config.WVTConfiguration
Creates a new instance of WVTConfiguration, setting up a standard configuration
WVTConfigurationFact - Class in edu.udo.cs.wvtool.config
Class used to simplify the creation of rules in cases where simply a constant value is returned.
WVTConfigurationFact(String) - Constructor for class edu.udo.cs.wvtool.config.WVTConfigurationFact
Constructor for a configuration fact.
WVTConfigurationFact(Object) - Constructor for class edu.udo.cs.wvtool.config.WVTConfigurationFact
Constructor for a configuration fact.
WVTConfigurationRule - Interface in edu.udo.cs.wvtool.config
This interface abstracts from rules, used to select an appropriate component for a given document.
WVTDocumentInfo - Class in edu.udo.cs.wvtool.main
Represents relevant information about a document.
WVTDocumentInfo(String, String, String, String, int) - Constructor for class edu.udo.cs.wvtool.main.WVTDocumentInfo
Creates a new instance of WVTDocumentInfo
WVTDocumentInfo(String, String, String, String) - Constructor for class edu.udo.cs.wvtool.main.WVTDocumentInfo
Creates a new instance of WVTDocumentInfo
WVTDocumentLoader - Interface in edu.udo.cs.wvtool.generic.loader
This interface represents a mechanism by which a document is loaded.
WVTFileInputList - Class in edu.udo.cs.wvtool.main
This class represents a list of information items describing the documents, from which a word list or word vectors should be created.
WVTFileInputList(int) - Constructor for class edu.udo.cs.wvtool.main.WVTFileInputList
Creates a new empty instance of WVTInputList
WVTFileInputList(int, Reader) - Constructor for class edu.udo.cs.wvtool.main.WVTFileInputList
Creates a new instance of WVTInputList by reading an XML file
WVTInputFilter - Interface in edu.udo.cs.wvtool.generic.inputfilter
This interface represents a mechanism, which converts content given by an input stream to a stream of characters.
WVTInputList - Interface in edu.udo.cs.wvtool.main
Represents a set of resources that should be processed by the word vector tool.
WVTool - Class in edu.udo.cs.wvtool.main
Main class of the word vector tool.
WVTool(boolean) - Constructor for class edu.udo.cs.wvtool.main.WVTool
Create a new WVTool instance.
WVToolCrawler - Class in edu.udo.cs.wvtool.crawler
An abstract class that must be overridden by all specialized crawlers that are used to construct a crawled input list.
WVToolCrawler(String, String, String) - Constructor for class edu.udo.cs.wvtool.crawler.WVToolCrawler
 
WVToolException - Exception in edu.udo.cs.wvtool.util
General exception thrown if a problem occurs during processing.
WVToolException() - Constructor for exception edu.udo.cs.wvtool.util.WVToolException
 
WVToolException(String, Throwable) - Constructor for exception edu.udo.cs.wvtool.util.WVToolException
 
WVToolException(String) - Constructor for exception edu.udo.cs.wvtool.util.WVToolException
 
WVToolIOException - Exception in edu.udo.cs.wvtool.util
Exception thrown if a problem occurs during i/o processing.
WVToolIOException() - Constructor for exception edu.udo.cs.wvtool.util.WVToolIOException
 
WVToolIOException(String, Throwable) - Constructor for exception edu.udo.cs.wvtool.util.WVToolIOException
 
WVToolLogger - Class in edu.udo.cs.wvtool.util
Singleton logging class.
WVToolLogger() - Constructor for class edu.udo.cs.wvtool.util.WVToolLogger
 
WVToolWordListener - Interface in edu.udo.cs.wvtool.main
An interface for an algorithm that listens to the documents and words that are processed by the word vector tool.
WVTOutputFilter - Interface in edu.udo.cs.wvtool.generic.output
This class represents a mechanism by which word vectors are stored.
WVTStemmer - Interface in edu.udo.cs.wvtool.generic.stemmer
This interface represents a mechanism to convert a stream of tokens to a stream of word stems.
WVTTokenizer - Interface in edu.udo.cs.wvtool.generic.tokenizer
Interface that represents a mechanism to convert a stream of characters to a stream of tokens, deleting all separators.
WVTVectorCreator - Interface in edu.udo.cs.wvtool.generic.vectorcreation
This interface represents a mechanism by which an individual word vector is created.
WVTWord - Class in edu.udo.cs.wvtool.wordlist
Class that represents an individual word and its occurrences (class and document frequencies).
WVTWord(String, int) - Constructor for class edu.udo.cs.wvtool.wordlist.WVTWord
Create a new instance of Word.
WVTWord(String) - Constructor for class edu.udo.cs.wvtool.wordlist.WVTWord
Create a new instance of Word, not supporting class values.
WVTWordFilter - Interface in edu.udo.cs.wvtool.generic.wordfilter
This interface represents a mechanism by which tokens (words) are filtered from a stream of tokens.
WVTWordList - Class in edu.udo.cs.wvtool.wordlist
This class represents a word list.
WVTWordList(int) - Constructor for class edu.udo.cs.wvtool.wordlist.WVTWordList
Create a new instance of WVTWordList.
WVTWordList(List, int) - Constructor for class edu.udo.cs.wvtool.wordlist.WVTWordList
 
WVTWordList(Reader) - Constructor for class edu.udo.cs.wvtool.wordlist.WVTWordList
Create a new instance of WVTWordList by reading it from a stream.
WVTWordVector - Class in edu.udo.cs.wvtool.main
Represents an individual word vector in non-sparse form.
WVTWordVector() - Constructor for class edu.udo.cs.wvtool.main.WVTWordVector
Creates a new instance of WVTWordVector

X

XMLInputFilter - Class in edu.udo.cs.wvtool.generic.inputfilter
Read XML Input ignoring the tags.
XMLInputFilter() - Constructor for class edu.udo.cs.wvtool.generic.inputfilter.XMLInputFilter
Constructor for XMLInputFilter.
XmlReader - Class in edu.udo.cs.wvtool.external
A minimalistic XML pull parser, similar to kXML, but not supporting namespaces or legacy events.
XmlReader(Reader) - Constructor for class edu.udo.cs.wvtool.external.XmlReader
 
xr - Variable in class edu.udo.cs.wvtool.generic.inputfilter.TagIgnoringReader
 
