We have released an English trigram model, which consists of a Distributional Thesaurus (Simsort.gz), word counts (wordcount.gz), significance scores between terms and features (LMI.gz) and sense clusters with IS-A labels (cluster_isa.gz).
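The holing operation was performed with the TrigramHolingAnnotator, which takes the middle word of each trigram as the target (Jo) and its left and right neighbors as the context feature (Bim); the @ marks the position of the target, and a missing neighbor at a sentence boundary is left empty. For the example sentence, the features look like the following: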
Sentence: Mary likes candy
Features:
Jo | Bim |
Mary | 3-gram2(_@_likes) |
likes | 3-gram2(Mary_@_candy) |
candy | 3-gram2(likes_@_) |
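The feature format in the table above can be reproduced with a few lines of Python. This is only a minimal sketch for illustration, not the TrigramHolingAnnotator itself: the function name trigram_holing is hypothetical, and the input is assumed to be a sentence that has already been split into tokens.

```python
def trigram_holing(tokens):
    """Return (Jo, Bim) pairs for a tokenized sentence.

    Hypothetical helper: mimics the feature strings shown in the table,
    it is not the actual TrigramHolingAnnotator.
    """
    pairs = []
    for i, jo in enumerate(tokens):
        # Neighbors are left empty at the sentence boundaries.
        left = tokens[i - 1] if i > 0 else ""
        right = tokens[i + 1] if i + 1 < len(tokens) else ""
        # "@" marks the hole where the target word (Jo) was removed.
        pairs.append((jo, f"3-gram2({left}_@_{right})"))
    return pairs


# Assumption: simple whitespace tokenization of the example sentence.
for jo, bim in trigram_holing("Mary likes candy".split()):
    print(jo, "|", bim)
```

For the example sentence, this yields the same three Jo/Bim pairs as listed in the table.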
This dataset is included in the JoBimText web demo, where it can be tried out.