Using the API for DT Access

Connecting to the database

To instantiate the API, pass the path of the MySQL configuration file to the constructor of the class DatabaseThesaurusDatastructure.
This class is an implementation of IThesaurus, which declares all methods that can be available for a full model. The DatabaseThesaurusDatastructure class returns all results wrapped in datatypes of the framework. If you prefer working with HashMaps, consider the implementation DatabaseThesaurusMap instead, which returns the results as Strings within a HashMap.

String config = "conf_mysql_de_70M_trigram.xml";
DatabaseThesaurusDatastructure dt = new DatabaseThesaurusDatastructure(config);
dt.connect();

When you are finished with the API, close the connection again:

dt.destroy();
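
Because the connection has to be closed explicitly, it can help to pair connect() and destroy() in a try/finally block so that the connection is released even when a query throws. The following is a minimal sketch of that pattern, not part of the API: the Connectable interface, the RecordingStub class, and the withConnection helper are hypothetical stand-ins so that the example runs without a database. It also assumes that destroy() may safely be called after a failed query.

```java
import java.util.ArrayList;
import java.util.List;

public class ConnectionLifecycleSketch {

    /** Hypothetical stand-in for the two lifecycle calls shown above. */
    interface Connectable {
        void connect();
        void destroy();
    }

    /** Stub that records calls instead of opening a MySQL connection. */
    static class RecordingStub implements Connectable {
        final List<String> calls = new ArrayList<>();

        @Override
        public void connect() {
            calls.add("connect");
        }

        @Override
        public void destroy() {
            calls.add("destroy");
        }
    }

    /** Runs the work between connect() and destroy(); destroy() runs even if the work throws. */
    static void withConnection(Connectable dt, Runnable work) {
        dt.connect();
        try {
            work.run();
        } finally {
            dt.destroy();
        }
    }

    public static void main(String[] args) {
        RecordingStub dt = new RecordingStub();
        try {
            withConnection(dt, () -> {
                throw new IllegalStateException("query failed");
            });
        } catch (IllegalStateException e) {
            // The failure still reaches the caller, but the connection was closed first.
        }
        System.out.println(dt.calls); // prints [connect, destroy]
    }
}
```

With the real class, the same shape would presumably be dt.connect(), followed by the queries inside try, and dt.destroy() inside finally.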

Functions of the API

The API has several methods to access the data. The most important ones are briefly described here. More detailed documentation of all classes can be found in the JavaDoc. Additionally, the class MySqlExamples illustrates the usage of the API and prints results using both models.

Here is an overview of the API methods:

  • getSimilarTerms(term): List<Order2>
    Returns all terms similar to the given term as a list of Order2 instances. Each Order2 object contains the similar word and its similarity score. [The contextScore is not yet used]
  • getSimilarTerms(term,N:Integer): List<Order2>
    Returns the N most similar terms to the given term.
  • getSimilarTerms(term,D:Double): List<Order2>
    Returns all similar terms with a similarity score above D.
  • getTermCount(term): long
    Returns the frequency of the term within the processed corpus.
  • getContextsCount(context): long
    Returns the frequency of the context within the processed corpus.
  • getTermContextsCount(term, context): long
    Returns the number of co-occurrences of the term and the context.
  • getTermContextsScore(term, context): long
    Returns the significance score of the term and the context.
  • getTermContextsScores(term): List<Order1>
    Returns the significant contexts (Order1 objects) for the given term.
    An Order1 object contains the context feature as well as the significance score and the frequency of the term and the context.
  • getTermContextsScores(term,N:Integer): List<Order1>
    Returns the top N contexts for the given term.
  • getTermContextsScores(term,D:double): List<Order1>
    Returns the contexts with a significance score greater than D.
  • getSensesTypes(): String[]
    Returns the names of the available sense computations.
  • getSenses(term): List<Sense>
    Returns a list of Sense objects of the standard sense type. Each Sense object contains a list of words belonging to the sense and a list of IS-A terms that label the sense.
  • getSenses(term,sense_type): List<Sense>
    Returns a list of Sense objects for the specified sense type.
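
To make the difference between the three getSimilarTerms overloads concrete, the sketch below reproduces the top-N and threshold behaviour on a small in-memory list. It is an illustration only and not the library's implementation: ScoredTerm is a hypothetical stand-in for Order2, the sample scores are made up, and both the descending sort order and the strict "greater than D" comparison are assumptions based on the descriptions above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SimilarTermsSketch {

    /** Hypothetical stand-in for Order2: a similar term plus its similarity score. */
    static final class ScoredTerm {
        final String term;
        final double score;

        ScoredTerm(String term, double score) {
            this.term = term;
            this.score = score;
        }

        @Override
        public String toString() {
            return term + "=" + score;
        }
    }

    /** Mirrors getSimilarTerms(term, N): keep the N highest-scoring entries. */
    static List<ScoredTerm> topN(List<ScoredTerm> all, int n) {
        List<ScoredTerm> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingDouble((ScoredTerm s) -> s.score).reversed());
        int limit = Math.min(Math.max(n, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, limit));
    }

    /** Mirrors getSimilarTerms(term, D): keep entries with a score strictly above D. */
    static List<ScoredTerm> aboveThreshold(List<ScoredTerm> all, double d) {
        List<ScoredTerm> kept = new ArrayList<>();
        for (ScoredTerm s : all) {
            if (s.score > d) {
                kept.add(s);
            }
        }
        return kept;
    }

    /** Made-up example data, not taken from any real model. */
    static List<ScoredTerm> sample() {
        List<ScoredTerm> terms = new ArrayList<>();
        terms.add(new ScoredTerm("horse", 40.0));
        terms.add(new ScoredTerm("cat", 120.0));
        terms.add(new ScoredTerm("puppy", 95.0));
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(topN(sample(), 2));               // prints [cat=120.0, puppy=95.0]
        System.out.println(aboveThreshold(sample(), 100.0)); // prints [cat=120.0]
    }
}
```

The same two selection modes apply to getTermContextsScores, with Order1 objects and significance scores in place of Order2 objects and similarity scores.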

 
