|
Train and Test
Description: |
Holding out a test set in addition to the training set is a general method
for estimating the accuracy of a concept learning algorithm (classifier).
Given a sample of classified instances and a concept learning
algorithm, the method works as follows:
- Split the data into two disjoint parts, a training set and a test set.
-
Run the learner on the training set only; the test set is withheld from it.
Let h denote the hypothesis output by the learner.
-
Use h to classify all instances of the test set. The fraction
of correctly classified instances is the estimated accuracy.
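
The three steps above can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not a reference implementation: the
helper names (train_test_split, majority_learner, estimated_accuracy), the
25% test fraction, the toy data, and the trivial majority-class learner are
all hypothetical stand-ins for whatever learner and sample are actually used.

```python
import random


def train_test_split(instances, test_fraction=0.25, seed=0):
    """Shuffle the labelled instances and split them into (train, test)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test shuffled instances form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


def majority_learner(train):
    """Toy learner (assumption): h always predicts the most frequent training label."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return lambda instance: majority


def estimated_accuracy(h, test):
    """Fraction of test instances that h classifies correctly."""
    correct = sum(1 for instance, label in test if h(instance) == label)
    return correct / len(test)


# Hypothetical sample: integers labelled True when divisible by 3.
data = [(i, i % 3 == 0) for i in range(300)]

train, test = train_test_split(data)   # step 1: split the data
h = majority_learner(train)            # step 2: learn on the training set only
acc = estimated_accuracy(h, test)      # step 3: evaluate h on the test set
print(len(train), len(test), acc)
```

The learner never sees the test instances, so acc estimates how well h
performs on data it was not trained on.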
Usually 20-30% of the available data is chosen as the test set.
This is a good option if the resulting test set contains more than
1000 instances.
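
One way to see why the size of the test set matters is to look at the
standard error of the accuracy estimate. The sketch below assumes, beyond
what the text states, that the test instances are drawn independently, so
that the number of correct classifications follows a binomial distribution.
The function name accuracy_standard_error is hypothetical.

```python
import math


def accuracy_standard_error(accuracy, n_test):
    """Binomial standard error of an accuracy estimated on n_test instances."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)


# The worst case is accuracy = 0.5; larger test sets give tighter estimates.
for n_test in (100, 1000, 10000):
    print(n_test, round(accuracy_standard_error(0.5, n_test), 4))
# prints:
# 100 0.05
# 1000 0.0158
# 10000 0.005
```

Under this assumption, a test set of 1000 instances keeps the standard error
of the estimated accuracy below about 1.6 percentage points.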
|
|
|