This release includes optimizations to the new PattaMaika component:
This release brings some minor improvements and new Holing Operations:
- MateTools Parser (GPL) for English and German
- MaltParser (ASL) for English and German
- N-gram Holing operation
- Updated build process reduces project size by about 30%.
- Build options available for the GPL components (documentation)
- Updates to the Hadoop pipeline generators to accommodate the new parsers and their memory/time demands
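To give a rough idea of what an n-gram holing operation produces, here is a minimal Python sketch. It assumes that a holing operation splits each observation into a term and a context feature in which the term is replaced by a placeholder. The function name `ngram_holing`, the `@` placeholder, and the output format are illustrative assumptions, not the pipeline's actual API.

```python
from typing import Iterator, List, Tuple

HOLE = "@"  # assumed placeholder symbol marking the position of the term


def ngram_holing(tokens: List[str], n: int = 3) -> Iterator[Tuple[str, str]]:
    """Yield (term, context) pairs for every token of every n-gram.

    Each token of an n-gram is taken as the term in turn, and the
    context is the same n-gram with that token replaced by HOLE.
    """
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        for i, term in enumerate(window):
            context = window[:i] + [HOLE] + window[i + 1:]
            yield term, " ".join(context)


pairs = list(ngram_holing(["the", "cat", "sat", "down"], n=3))
for term, context in pairs:
    print(f"{term}\t{context}")
```

For the four-token input, the sketch emits six pairs: two trigrams, each with one hole position per token.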
You can download the new release from Sourceforge.
Today, we have released new JoBimText models for German news. They are the first released models based on the new JoBimText 0.1.0 pipeline. The provided models feature sense clusterings at different granularities.
The models are free for any use. We also provide them in the JoBimText web demo. The demo is now capable of parsing German sentences.
We are proud to announce the next release of the JoBimText pipeline. The main addition in version 0.1.0 is the pattern matching engine PattaMaika, which can run both locally and on Hadoop. The engine extracts hierarchical relations between terms and is very flexible. It uses the Apache UIMA Ruta annotation engine to tag patterns. For more information on the pattern engine, consult the PattaMaika project page.
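As an illustration of the kind of hierarchical relation such patterns can capture, the following Python sketch matches a single "X such as A, B and C" pattern with a plain regular expression. It is a stand-in under stated assumptions: PattaMaika itself works with UIMA Ruta rules, and the names `HEARST_SUCH_AS` and `extract_is_a`, as well as the choice of this particular pattern, are hypothetical.

```python
import re
from typing import List, Tuple

# One lexical pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Only single-word terms are handled; this is a deliberate simplification.
HEARST_SUCH_AS = re.compile(
    r"\b(?P<hyper>\w+),? such as "
    r"(?P<hypos>\w+(?:, (?!(?:and|or)\b)\w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_is_a(sentence: str) -> List[Tuple[str, str]]:
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    relations = []
    for match in HEARST_SUCH_AS.finditer(sentence):
        hypernym = match.group("hyper")
        # Split the enumeration on commas and on the final "and"/"or".
        for hyponym in re.split(r",? (?:and|or) |, ", match.group("hypos")):
            relations.append((hyponym, hypernym))
    return relations


relations = extract_is_a("He likes fruits such as apples, pears and plums.")
print(relations)
```

A rule-based engine would typically match on token and part-of-speech annotations rather than raw strings, which is what makes multi-word terms and more varied patterns tractable.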
Other improvements include the reorganization of third-party models. Since their number grows with the increasing number of components (segmenters, taggers, parsers), they are now structured. Additionally, the build scripts have been updated.
You can download the new release from Sourceforge: JoBimText pipeline 0.1.0.