java.lang.Object
  edu.udo.cs.wvtool.main.WVTool
public class WVTool
Main class of the word vector tool. It provides all the functionality and can be called directly from Java.
Field Summary | |
---|---|
private static int |
DEFAULT_PRUNE_MAX
upper boundary for automatic pruning |
private static int |
DEFAULT_PRUNE_MIN
lower boundary for automatic pruning |
private boolean |
skipErrors
should errors be skipped |
Constructor Summary | |
---|---|
WVTool(boolean skipErrors)
Create a new WVTool instance. |
Method Summary | |
---|---|
WVTWordVector |
createVector(java.lang.String text,
WVTDocumentInfo d,
WVTConfiguration config,
WVTWordList wordList)
Create a single word vector. |
WVTWordVector |
createVector(java.lang.String text,
WVTWordList wordList)
Create an individual word vector from a String using TF/IDF weights and the standard configuration. |
void |
createVectors(WVTInputList input,
WVTConfiguration config)
Deprecated. Please use the method createVectors(WVTInputList input, WVTConfiguration config, int pruneMin, int pruneMax) |
void |
createVectors(WVTInputList input,
WVTConfiguration config,
int pruneMin,
int pruneMax)
Create a word list and then word vectors, both from the same input list. |
void |
createVectors(WVTInputList input,
WVTConfiguration config,
WVTWordList wordList)
Create word vectors from an input list. |
WVTWordList |
createWordList(WVTInputList input,
WVTConfiguration config)
Create a word list from scratch based on the given texts. |
WVTWordList |
createWordList(WVTInputList input,
WVTConfiguration config,
java.util.List initialWords,
boolean addWords)
Create a word list based on an existing word list. |
void |
iterateWords(WVTInputList input,
WVTConfiguration config,
WVToolWordListener listener)
Process the specified documents using the configured steps and send all encountered words to a listener class. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private static final int DEFAULT_PRUNE_MIN
private static final int DEFAULT_PRUNE_MAX
private boolean skipErrors
Constructor Detail |
---|
public WVTool(boolean skipErrors)
skipErrors
- whether errors should be skipped (and only written to the error
log) or an Exception should be thrown
Method Detail |
---|
public WVTWordList createWordList(WVTInputList input, WVTConfiguration config) throws WVToolException
input
- the input list from which the word list is created
config
- the underlying configuration
java.lang.Exception
WVToolException
public WVTWordList createWordList(WVTInputList input, WVTConfiguration config, java.util.List initialWords, boolean addWords) throws WVToolException
input
- the input list from which the word list is created
config
- the underlying configuration
initialWords
- initial list of words to use
addWords
- whether words that appear in the texts but not in the initial
list should be added to the list
java.lang.Exception
WVToolException
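The effect of the initialWords and addWords parameters can be illustrated with a small standalone sketch. It does not use or reproduce WVTool code: the class WordListSketch, the method buildWordList, and the whitespace tokenization with lower-casing are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of WVTool.
class WordListSketch {

    /**
     * Builds a vocabulary that starts from initialWords. When addWords is
     * true, words found in the texts but missing from the initial list are
     * appended in order of first appearance; otherwise the list stays fixed.
     */
    static List<String> buildWordList(List<String> texts,
                                      List<String> initialWords,
                                      boolean addWords) {
        Set<String> vocabulary = new LinkedHashSet<>(initialWords);
        if (addWords) {
            for (String text : texts) {
                // Assumption: tokens are lower-cased and split on whitespace.
                for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
                    if (!token.isEmpty()) {
                        vocabulary.add(token);
                    }
                }
            }
        }
        return new ArrayList<>(vocabulary);
    }
}
```

With addWords set to false the result is exactly the initial list, which is presumably the case to use when vectors must stay comparable to an existing vocabulary.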
public void createVectors(WVTInputList input, WVTConfiguration config, int pruneMin, int pruneMax) throws WVToolException
input
- the input list
config
- the configuration
pruneMin
- the minimal number of occurrences of a word to be
considered
pruneMax
- the maximum number of occurrences of a word to be
considered
WVToolException
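The pruning bounds can be pictured with the following standalone sketch. It is not WVTool's implementation: the class PruneSketch and its methods are hypothetical, and it assumes that "occurrences" means the number of documents containing a word and that both bounds are inclusive, neither of which is stated on this page.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for illustration; not part of WVTool.
class PruneSketch {

    /** Counts, for each word, the number of documents it occurs in. */
    static Map<String, Integer> documentFrequencies(List<String> documents) {
        Map<String, Integer> frequencies = new TreeMap<>();
        for (String document : documents) {
            Set<String> seen = new HashSet<>();
            for (String token : document.toLowerCase(Locale.ROOT).split("\\s+")) {
                // Count each word at most once per document.
                if (!token.isEmpty() && seen.add(token)) {
                    frequencies.merge(token, 1, Integer::sum);
                }
            }
        }
        return frequencies;
    }

    /** Keeps only words whose count lies in [pruneMin, pruneMax] (assumed inclusive). */
    static Map<String, Integer> prune(Map<String, Integer> frequencies,
                                      int pruneMin, int pruneMax) {
        Map<String, Integer> kept = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : frequencies.entrySet()) {
            int count = entry.getValue();
            if (count >= pruneMin && count <= pruneMax) {
                kept.put(entry.getKey(), count);
            }
        }
        return kept;
    }
}
```

A lower bound removes rare words such as typos, and an upper bound removes words so common that they carry little information about any single document.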
public void createVectors(WVTInputList input, WVTConfiguration config) throws WVToolException
input
- the input list
config
- the configuration
WVToolException
public void createVectors(WVTInputList input, WVTConfiguration config, WVTWordList wordList) throws WVToolException
input
- the input list
config
- the configuration
wordList
- a word list (possibly containing document and class
frequencies).
java.lang.Exception
WVToolException
public WVTWordVector createVector(java.lang.String text, WVTDocumentInfo d, WVTConfiguration config, WVTWordList wordList) throws WVToolException
text
- the underlying text
d
- information about the text
config
- the configuration to use (though it will be only partly
used)
wordList
- the word list to use
WVToolException
public WVTWordVector createVector(java.lang.String text, WVTWordList wordList) throws WVToolException
text
- the underlying text
wordList
- a wordlist (for IDF)
java.lang.Exception
WVToolException
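The role of the word list "for IDF" can be seen in a textbook TF/IDF computation, sketched below. This is not WVTool's weighting code: the class TfIdfSketch is hypothetical, and the formula (relative term frequency times the natural logarithm of total documents over document frequency, without length normalization) is an assumption. The tool's actual variant may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for illustration; not part of WVTool.
class TfIdfSketch {

    /**
     * Computes tf * idf for every token that appears in the word list.
     * documentFrequency maps each known word to the number of documents
     * containing it (assumed to be at least 1); numDocuments is the size
     * of the collection the word list was built from.
     */
    static Map<String, Double> tfIdf(List<String> tokens,
                                     Map<String, Integer> documentFrequency,
                                     int numDocuments) {
        // Count occurrences of known words only; unknown words are ignored.
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : tokens) {
            if (documentFrequency.containsKey(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        Map<String, Double> weights = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            double tf = (double) entry.getValue() / tokens.size();
            double idf = Math.log((double) numDocuments
                    / documentFrequency.get(entry.getKey()));
            weights.put(entry.getKey(), tf * idf);
        }
        return weights;
    }
}
```

Under this formula a word that occurs in every document gets weight zero, so only the document frequencies stored with the word list decide how much a term contributes.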
public void iterateWords(WVTInputList input, WVTConfiguration config, WVToolWordListener listener) throws WVToolException
input
- the input list
config
- the configuration
listener
- a call back class that is invoked on every processed
document and word
WVToolException
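The callback style described for iterateWords can be sketched in isolation as follows. The interface WordCallback and its two methods are hypothetical and are not the signatures of WVToolWordListener, which this page does not show. The whitespace tokenization is likewise an assumption standing in for the configured processing steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of WVTool.
class ListenerSketch {

    /** Hypothetical callback interface; NOT the real WVToolWordListener. */
    interface WordCallback {
        void documentStarted(String documentName);
        void wordEncountered(String word);
    }

    /** Walks all documents and reports each document and word to the callback. */
    static void iterate(Map<String, String> documents, WordCallback callback) {
        for (Map.Entry<String, String> document : documents.entrySet()) {
            callback.documentStarted(document.getKey());
            // Assumption: tokens are lower-cased and split on whitespace.
            for (String token : document.getValue().toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    callback.wordEncountered(token);
                }
            }
        }
    }

    /** Example callback that records every event in order. */
    static class RecordingCallback implements WordCallback {
        final List<String> events = new ArrayList<>();

        @Override
        public void documentStarted(String documentName) {
            events.add("doc:" + documentName);
        }

        @Override
        public void wordEncountered(String word) {
            events.add("word:" + word);
        }
    }
}
```

A listener of this kind lets the caller collect statistics or stream words elsewhere without building a word list or vectors first.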