Today, we released the Wikipedia trigram model for English in the web demo. It was computed from a 2013 English Wikipedia dump and is one of the first models that contains a Bim DT (Feature DT).
The Bim DT is a “reversed Distributional Thesaurus”: instead of similar words, it lists similar context features, which users can draw on when they need more contexts to reduce sparsity issues. The Bim DT was computed with the roles reversed, using words as “Bims” and features as “Jos”. This demonstrates the general JoBimText approach, where users can define any type of Jos and Bims for their tasks.
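To illustrate the role reversal, here is a minimal Python sketch, not the actual JoBimText pipeline. It assumes a toy list of (word, feature) observations, a hypothetical `left_@_right` notation for trigram features, and a deliberately simplified similarity score (the number of shared Bims) in place of whatever significance measure and pruning the real model uses. The helper name `build_dt` is made up for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_dt(pairs):
    """Build a toy thesaurus from (jo, bim) pairs.

    Returns {jo: {other_jo: number_of_shared_bims}}.
    Simplification: similarity is a plain count of shared Bims.
    """
    # Group the Jos that were observed with each Bim.
    jos_by_bim = defaultdict(set)
    for jo, bim in pairs:
        jos_by_bim[bim].add(jo)

    # Every pair of Jos sharing a Bim gets one similarity point.
    dt = defaultdict(lambda: defaultdict(int))
    for jos in jos_by_bim.values():
        for a, b in combinations(sorted(jos), 2):
            dt[a][b] += 1
            dt[b][a] += 1
    return {jo: dict(sims) for jo, sims in dt.items()}


# Hypothetical observations: (word, trigram feature).
observations = [
    ("cat", "the_@_sat"),
    ("dog", "the_@_sat"),
    ("cat", "the_@_ran"),
    ("dog", "the_@_ran"),
    ("dog", "a_@_barked"),
    ("mouse", "the_@_ran"),
]

# Regular DT: words are the Jos, features are the Bims.
word_dt = build_dt(observations)

# Bim DT: swap the roles, so features are the Jos and words are the Bims.
feature_dt = build_dt([(feature, word) for word, feature in observations])

print(word_dt["cat"])
print(feature_dt["the_@_sat"])
```

The same function produces both thesauri; only the order of the pair elements changes, which is the point of letting users choose their own Jos and Bims.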
Currently, we are using the Wikipedia trigram model to develop in-text contextualization, which assigns the induced word senses to words in text. Because contexts are sparse, the Bim DT helps in this application.