You can download the evaluation component from our Downloads page. To use it for the evaluation of a Distributional Thesaurus (DT), you need pre-computed similarity scores.
If you run the .jar file, you will see a list of possible tasks (the bold tasks are only available in the GPL component):
    $ java -jar org.jobimtext.evaluation-gpl.jar
    java -jar org.jobimtext.evaluation-gpl.jar task [options]
    Please select a task:
      ws/WordSampling: Perform Word Sampling.
      wsv/WordSamplingValidation: Perform Word Sampling Validation.
      wsim/WordSimilarity: Perform Word Similarity (Distributional Theasurus Evaluation task).
      smg/SenseMerger: Perform Sense Merger.
      wnsim/WordNetSimilarityCalculation: Perform WordNet Similarity Score Extraction.
      gnsim/GermaNetSimilarityCalculation: Perform GermaNet Similarity Score Extraction.
Pre-computation of Similarity Scores / Score Extraction
Run the wnsim (WordNet) or gnsim (GermaNet) task of the GPL-licensed component.
WordNet Score Extraction
    $ java -jar org.jobimtext.evaluation-gpl.jar wnsim
    Usage: java -jar org.jobimtext.evaluation-gpl.jar wnsim [options] <path to wordnet> <desired POS> <desired measure>
      <desired POS> can be 'verb' or 'noun'
      <desired measure> can be 'hso' or 'path'
    usage: options [-c <Candidate Wordlist>] [-o <prefix>]
     -c <Candidate Wordlist>   Candidate Wordlist [Default: none]
     -o <prefix>               Prefix for the output directory where all
                               similarity score files will be written to
                               [Default: WordNet-pos-pos-measureName-pairs-candidate_wordlist_file_name/]
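As an illustration of the usage message above, the following sketch assembles a wnsim call for noun similarities with the 'path' measure. The WordNet location, the candidate word list file, and the output prefix are hypothetical placeholders, and the format of the candidate word list is not described here, so treat this as a template under those assumptions rather than a verified invocation.

```shell
#!/bin/sh
# All paths and names below are hypothetical placeholders; adjust them to your setup.
JAR="org.jobimtext.evaluation-gpl.jar"
WORDNET_DIR="/path/to/wordnet"   # assumption: local WordNet installation
CANDIDATES="candidates.txt"      # assumption: candidate word list file
OUT_PREFIX="wn-noun-path"        # assumption: prefix for the output directory

# Options come first, then <path to wordnet> <desired POS> <desired measure>,
# following the usage message printed by the wnsim task.
set -- java -jar "$JAR" wnsim -c "$CANDIDATES" -o "$OUT_PREFIX" "$WORDNET_DIR" noun path

# Print the assembled command so it can be checked before running.
CMD="$*"
echo "$CMD"

# Execute only when the jar is present in the current directory.
if [ -f "$JAR" ]; then
  "$@"
fi
```

For GermaNet, the same pattern should apply with the gnsim task and the path to the GermaNet XML files folder in place of the WordNet path.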
GermaNet Score Extraction
    $ java -jar org.jobimtext.evaluation-gpl.jar gnsim
    Usage: java -jar org.jobimtext.evaluation-gpl.jar gnsim [options] <path to germanet xmlfiles folder> <desired POS> <desired measure>
      <desired POS> can be 'verb' or 'noun'
      <desired measure> can be 'hso' or 'path'
    usage: options [-c <Candidate Wordlist>] [-o <prefix>]
     -c <Candidate Wordlist>   Candidate Wordlist [Default: none]
     -o <prefix>               Prefix for the output directory where all
                               similarity score files will be written to
                               [Default:GermaNet-pos-pos-measureName-pairs/]
Evaluation of DTs with Similarity Scores from WordNet/GermaNet
To evaluate a Distributional Thesaurus, you need to run the wsim task.
    $ java -jar org.jobimtext.evaluation-gpl.jar wsim
    Usage: java -jar org.jobimtext.evaluation-gpl.jar wsim [options] <path to sim score files folder> <candidate words> <Distributional Theasurus file 1> <Distributional Theasurus file 2>... < Distributional Theasurus file n>
    usage: options [-no_gz] [-o <prefix>] [-pos <tag>] [-ps <separator>] [-t <NUMBER,..>]
     -no_gz            Expect the DT not to be compressed.
     -o <prefix>       Prefix for the output file (name of the output file)
                       [Default: dtEvaloutput]
     -pos <tag>        Only the entries with specified POS tag will be checked.
                       If none is specified no POS tags will be checked.
                       [Default: none].
     -ps <separator>   Separation String for POS Tags (the last occurrence for
                       that separation string will be used for splitting. To
                       specify a hash sign as separator, escape it: '\#' If none
                       is specified no truncation of POS tags is performed.
                       [Default: none].
     -t <NUMBER,..>    Evaluate the top N entries of the DT entries. It is also
                       possible to specify more then one value. [Default: 10]
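A possible wsim invocation, sketched from the usage message above, might look as follows. The score folder, candidate word file, DT file name, and output prefix are hypothetical. The comma-separated form of the -t value is an assumption read off the `<NUMBER,..>` placeholder, and the DT file is assumed to be compressed because -no_gz is not passed.

```shell
#!/bin/sh
# All paths and names below are hypothetical placeholders; adjust them to your setup.
JAR="org.jobimtext.evaluation-gpl.jar"
SIM_DIR="/path/to/sim-scores"    # assumption: folder written by wnsim or gnsim
CANDIDATES="candidates.txt"      # assumption: candidate word file
DT_FILE="my-dt.gz"               # assumption: compressed DT (no -no_gz given)
OUT_PREFIX="dtEval-noun"         # assumption: prefix for the output file

# Evaluate the top 10 and top 50 entries; the comma-separated list is an
# assumption based on the "-t <NUMBER,..>" placeholder in the usage message.
set -- java -jar "$JAR" wsim -o "$OUT_PREFIX" -t 10,50 "$SIM_DIR" "$CANDIDATES" "$DT_FILE"

# Print the assembled command so it can be checked before running.
WSIM_CMD="$*"
echo "$WSIM_CMD"

# Execute only when the jar is present in the current directory.
if [ -f "$JAR" ]; then
  "$@"
fi
```

Further DT files can be appended after the first one to evaluate several thesauri in a single run, as the usage message indicates.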