
record-classification

This project provides an automatic record classification tool.

train Command

The train command trains the classifier with the loaded gold standard records. Two conditions must be met before using this command: first, the classifier must be set via the set command; second, at least one training record must be loaded, i.e. at least one gold standard record collection must be loaded with a training ratio of more than 0.0 (see load command). The train command offers one option, -it, which sets the internal training ratio: the proportion of the loaded training records that is used for training.

For example, the following command:

train -it 0.9

trains the classifier with 90% of the loaded training records, where the remaining 10% of the training records are used by the classifier to evaluate its own performance.

Note: internal evaluation is required in order to calculate the confidence measure for a classification. If the internal training ratio is set to 1.0 (i.e. 100%), all of the loaded training records are used for training the classifier. The classifier then performs no internal evaluation and is therefore unable to calculate a confidence for any classified record.
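The effect of the internal training ratio can be illustrated with a small Python sketch. This is not the tool's own implementation: the function name split_training_records, the shuffling step, the rounding rule, and the accepted range of the ratio are all assumptions made for illustration. It only shows how a ratio of 0.9 leaves 10% of the records for internal evaluation, while a ratio of 1.0 leaves none.

```python
import random


def split_training_records(records, internal_training_ratio, seed=0):
    """Split records into (training, internal_evaluation) subsets.

    Hypothetical helper: the real tool may select records differently.
    """
    # Assumed valid range; the source only documents 0.9 and 1.0.
    if not 0.0 < internal_training_ratio <= 1.0:
        raise ValueError("internal training ratio must be in (0.0, 1.0]")

    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability

    # Number of records kept for training (rounding rule is an assumption).
    cut = round(len(shuffled) * internal_training_ratio)
    return shuffled[:cut], shuffled[cut:]


records = [f"record-{i}" for i in range(100)]

# Equivalent in spirit to: train -it 0.9
training, evaluation = split_training_records(records, 0.9)
print(len(training), len(evaluation))

# Equivalent in spirit to: train -it 1.0
training_all, evaluation_none = split_training_records(records, 1.0)

# With no records held back, there is nothing to evaluate against,
# so no confidence measure can be derived.
can_compute_confidence = len(evaluation_none) > 0
print(can_compute_confidence)
```

With 100 loaded training records, the first call keeps 90 for training and holds back 10 for evaluation; the second call holds back none, which corresponds to the case in the note where no confidence can be calculated.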
