edu.udo.cs.wvtool.external
Class Stemmer

java.lang.Object
  extended by edu.udo.cs.wvtool.external.Stemmer

public class Stemmer
extends java.lang.Object

Stemmer, implementing the Porter Stemming Algorithm. The Stemmer class transforms a word into its root form. The input word can be provided one character at a time (by calling add(char)), or in bulk by calling add(char[], int); in both cases, call stem() afterwards to run the algorithm.


Field Summary
private  char[] b
           
private  int i
           
private  int i_end
           
private static int INC
           
private  int j
           
private  int k
           
 
Constructor Summary
Stemmer()
           
 
Method Summary
 void add(char ch)
          Add a character to the word being stemmed.
 void add(char[] w, int wLen)
          Adds wLen characters from a char[] array to the word being stemmed.
private  boolean cons(int i)
           
private  boolean cvc(int i)
           
private  boolean doublec(int j)
           
private  boolean ends(java.lang.String s)
           
 char[] getResultBuffer()
          Returns a reference to a character buffer containing the results of the stemming process.
 int getResultLength()
          Returns the length of the word resulting from the stemming process.
private  int m()
           
static void main(java.lang.String[] args)
          Test program for demonstrating the Stemmer.
private  void r(java.lang.String s)
           
private  void setto(java.lang.String s)
           
 void stem()
          Stem the word placed into the Stemmer buffer through calls to add().
private  void step1()
           
private  void step2()
           
private  void step3()
           
private  void step4()
           
private  void step5()
           
private  void step6()
           
 java.lang.String toString()
          After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength() (which is generally more efficient).
private  boolean vowelinstem()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

b

private char[] b

i

private int i

i_end

private int i_end

j

private int j

k

private int k

INC

private static final int INC
See Also:
Constant Field Values
Constructor Detail

Stemmer

public Stemmer()
Method Detail

add

public void add(char ch)
Add a character to the word being stemmed. When you are finished adding characters, you can call stem() to stem the word.


add

public void add(char[] w,
                int wLen)
Adds wLen characters from the char[] array w to the word being stemmed. This is like repeated calls of add(char ch), but faster.


toString

public java.lang.String toString()
After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength() (which is generally more efficient).

Overrides:
toString in class java.lang.Object


getResultLength

public int getResultLength()
Returns the length of the word resulting from the stemming process.


getResultBuffer

public char[] getResultBuffer()
Returns a reference to a character buffer containing the results of the stemming process. You also need to consult getResultLength() to determine the length of the result.
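
The following is a minimal sketch of why the buffer and the length have to be read together. It assumes, as the description above suggests, that the returned buffer may hold more characters than the stemmed word. The class name ResultBufferDemo, the helper resultToString, and the sample buffer contents are hypothetical and only serve the illustration.

```java
final class ResultBufferDemo {

    /**
     * Builds a String from the first `length` characters of `buffer`.
     * Anything stored past `length` is treated as leftover data, not as
     * part of the stemmed word.
     */
    static String resultToString(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    public static void main(String[] args) {
        // Hypothetical state after stemming: the buffer still contains the
        // characters of the input word, but only the first 3 are the result.
        char[] buffer = {'r', 'u', 'n', 'n', 'i', 'n', 'g'};
        int length = 3;
        System.out.println(resultToString(buffer, length)); // prints "run"
    }
}
```

With a real Stemmer instance s, the equivalent call would be resultToString(s.getResultBuffer(), s.getResultLength()).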


cons

private final boolean cons(int i)

m

private final int m()

vowelinstem

private final boolean vowelinstem()

doublec

private final boolean doublec(int j)

cvc

private final boolean cvc(int i)

ends

private final boolean ends(java.lang.String s)

setto

private final void setto(java.lang.String s)

r

private final void r(java.lang.String s)

step1

private final void step1()

step2

private final void step2()

step3

private final void step3()

step4

private final void step4()

step5

private final void step5()

step6

private final void step6()

stem

public void stem()
Stem the word placed into the Stemmer buffer through calls to add(). The method is declared void and returns no value. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
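
The call sequence described above (add, then stem, then read the result) can be sketched as follows. To keep the example self-contained it does not use the real class: StemmerLike is a hypothetical interface that mirrors three of the public methods documented on this page, and TrailingSStemmer is a stand-in that only strips one trailing 's'. It is not the Porter algorithm.

```java
/** Hypothetical interface mirroring public methods documented on this page. */
interface StemmerLike {
    void add(char[] w, int wLen);
    void stem();
    String toString();
}

/** Stand-in for illustration only: strips one trailing 's'. NOT Porter stemming. */
final class TrailingSStemmer implements StemmerLike {
    private final StringBuilder buf = new StringBuilder();

    @Override
    public void add(char[] w, int wLen) {
        buf.append(w, 0, wLen); // copy wLen characters starting at index 0
    }

    @Override
    public void stem() {
        int n = buf.length();
        if (n > 1 && buf.charAt(n - 1) == 's') {
            buf.setLength(n - 1);
        }
    }

    @Override
    public String toString() {
        return buf.toString();
    }
}

final class StemUsage {

    /** Feeds a whole word in one add() call, stems it, and reads the result. */
    static String stemWord(StemmerLike stemmer, String word) {
        char[] chars = word.toCharArray();
        stemmer.add(chars, chars.length);
        stemmer.stem();
        return stemmer.toString();
    }

    public static void main(String[] args) {
        // A fresh instance per word avoids relying on any reset behaviour.
        System.out.println(stemWord(new TrailingSStemmer(), "cats")); // prints "cat"
    }
}
```

With the real library on the classpath, the same three calls would be made on an edu.udo.cs.wvtool.external.Stemmer instance. The sketch creates a new instance for every word because this page does not state whether the buffer is cleared after stem().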


main

public static void main(java.lang.String[] args)
Test program for demonstrating the Stemmer. It reads text from a list of files, stems each word, and writes the result to standard output. Note that the word to be stemmed is expected to be in lower case: forcing lower case must be done outside the Stemmer class. Usage: Stemmer file-name file-name ...
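
Because lower-casing is left to the caller, a small preprocessing step is needed before words reach the stemmer. The sketch below is one possible way to do it, assuming that a "word" is a maximal run of letters; the class name LowerCaseWords and the method lowerCaseWords are hypothetical and not part of this package.

```java
import java.util.ArrayList;
import java.util.List;

final class LowerCaseWords {

    /**
     * Splits text into maximal runs of letters and lower-cases each run,
     * so every returned word is ready to be passed to a stemmer.
     */
    static List<String> lowerCaseWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int idx = 0; idx < text.length(); idx++) {
            char ch = text.charAt(idx);
            if (Character.isLetter(ch)) {
                current.append(Character.toLowerCase(ch));
            } else if (current.length() > 0) {
                // A non-letter ends the current word.
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            words.add(current.toString()); // flush the last word
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseWords("Running, Runs!")); // prints "[running, runs]"
    }
}
```

Each element of the returned list can then be handed to add(char[], int) followed by stem().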