Connecting to the database
To instantiate the API, pass the path of the MySQL configuration file to the constructor of the class DatabaseThesaurusDatastructure.
This class is an implementation of IThesaurus, which contains all methods that can be available for a full model. The DatabaseThesaurusDatastructure class returns all results wrapped in a datatype of the framework. If you prefer working with HashMaps, consider the implementation DatabaseThesaurusMap instead, which wraps the results into Strings within a HashMap.
String config = "conf_mysql_de_70M_trigram.xml";
DatabaseThesaurusDatastructure dt = new DatabaseThesaurusDatastructure(config);
dt.connect();
When you are done using the API, close the connection again:
dt.destroy();
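The connect() and destroy() calls above are easiest to keep paired with a try/finally block, so that the connection is closed even when a query throws. The following is only a sketch of that pattern: the Connection interface and RecordingConnection class are hypothetical stand-ins that record calls instead of talking to MySQL, and they are not part of the API. With the real class, dt.connect() and dt.destroy() would take their place.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {

    /** Hypothetical stand-in for the two lifecycle calls; not the real API class. */
    interface Connection {
        void connect();
        void destroy();
    }

    /** Records the calls it receives so the order can be inspected without a database. */
    static class RecordingConnection implements Connection {
        final List<String> calls = new ArrayList<>();
        public void connect() { calls.add("connect"); }
        public void destroy() { calls.add("destroy"); }
    }

    /** Runs work between connect() and destroy(); destroy() runs even if work throws. */
    static void withConnection(Connection c, Runnable work) {
        c.connect();
        try {
            work.run();
        } finally {
            c.destroy();
        }
    }

    public static void main(String[] args) {
        RecordingConnection c = new RecordingConnection();
        try {
            // Simulate a failing query to show that destroy() is still called.
            withConnection(c, () -> { throw new IllegalStateException("query failed"); });
        } catch (IllegalStateException e) {
            // Expected in this demonstration.
        }
        System.out.println(c.calls); // prints [connect, destroy]
    }
}
```

Note that connect() is called before the try block on purpose: if connecting itself fails, there is nothing to close.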
Functions of the API
The API has several methods to access the data. The most important ones are briefly described here. More detailed documentation of all classes can be found in the JavaDoc. Additionally, the class MySqlExamples illustrates the usage of the API and prints results using both models.
Here is an overview of the API methods:
getSimilarTerms(term): List<Order2>
This function returns all terms similar to a given term. The result is a list of Order2 instances. Each Order2 object contains the similar word and its similarity score. [The contextScore is not yet used]
getSimilarTerms(term,N:Integer): List<Order2>
This function returns the N most similar terms to a given term.
getSimilarTerms(term,D:Double): List<Order2>
This function returns all similar terms with a similarity score above D.
getTermCount(term): long
This function returns the frequency of the term within the processed corpus.
getContextsCount(context): long
This function returns the frequency of the context within the processed corpus.
getTermContextsCount(term, context): long
This function returns the number of co-occurrences of term and context.
getTermContextsScore(term, context): long
This function returns the significance score of the term and context.
getTermContextsScores(term): List<Order1>
This function returns the significant contexts (Order1 objects) for a given term. An Order1 object contains the context feature as well as the significance score and the frequency of the term and the context.
getTermContextsScores(term,N:Integer): List<Order1>
This function returns the top N contexts for a given term.
getTermContextsScores(term,D:double): List<Order1>
This function returns contexts with a significance score greater than D.
getSensesTypes(): String[]
This function gets the names of the available sense computations.
getSenses(term): List<Sense>
This function gets a list of Sense objects of the standard sense type. Each Sense object contains a list of words belonging to the sense and a list of IS-As which label the sense.
getSenses(term,sense_type): List<Sense>
This function returns a list of Senses for a specified sense type.
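Several of the methods above come in two filtered variants, one taking a count N and one taking a score threshold D. The following sketch illustrates the difference between the two filters on an in-memory list. It does not call the API: ScoredTerm is a hypothetical stand-in for a word with a score (the real Order2 and Order1 classes may look different), and the assumption that the threshold comparison is strict is taken from the wording "above D".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SimilarTermFilters {

    /** Hypothetical stand-in for a (word, score) pair; not the real Order2 class. */
    static final class ScoredTerm {
        final String word;
        final double score;
        ScoredTerm(String word, double score) { this.word = word; this.score = score; }
        @Override public String toString() { return word + "=" + score; }
    }

    /** Keeps the n highest-scoring entries, in descending score order. */
    static List<ScoredTerm> topN(List<ScoredTerm> all, int n) {
        List<ScoredTerm> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingDouble((ScoredTerm t) -> t.score).reversed());
        int limit = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }

    /** Keeps entries whose score is strictly above d, in their original order. */
    static List<ScoredTerm> aboveThreshold(List<ScoredTerm> all, double d) {
        List<ScoredTerm> kept = new ArrayList<>();
        for (ScoredTerm t : all) {
            if (t.score > d) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up scores, for illustration only.
        List<ScoredTerm> all = new ArrayList<>();
        all.add(new ScoredTerm("cat", 12.0));
        all.add(new ScoredTerm("dog", 30.0));
        all.add(new ScoredTerm("fish", 5.0));

        System.out.println(topN(all, 2));              // prints [dog=30.0, cat=12.0]
        System.out.println(aboveThreshold(all, 10.0)); // prints [cat=12.0, dog=30.0]
    }
}
```

The count variant always returns at most N entries regardless of how low their scores are, while the threshold variant can return anything from an empty list to the full list.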