This release includes optimizations to the new PattaMaika component:
- Lemmatization: patterns like “cats ISA animals” are now lemmatized into “cat ISA animal”. This matters because labels are then assigned to the singular form of a noun.
- Extended lexico-syntactic patterns: the patterns from Klaussner & Zhekova (2011) are now implemented.
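
The lemmatization step above can be illustrated with a small sketch. This is not PattaMaika's code: the names `toy_lemma`, `lemma_term` and `lemmatize_pattern` are hypothetical, and the rule-based singularizer is a deliberately crude stand-in for whatever lemmatizer the component actually uses. It only shows the idea of normalizing both sides of an extracted ISA pattern.

```python
# Toy illustration only; PattaMaika's real lemmatizer is not shown here.
# All names below are hypothetical.

# A few irregular plurals, just enough for the example.
IRREGULAR = {"mice": "mouse", "children": "child", "geese": "goose"}


def toy_lemma(word: str) -> str:
    """Crude English noun singularizer (stand-in for a real lemmatizer)."""
    lower = word.lower()
    if lower in IRREGULAR:
        return IRREGULAR[lower]
    if lower.endswith("ies") and len(lower) > 4:
        return lower[:-3] + "y"
    if lower.endswith("s") and not lower.endswith("ss"):
        return lower[:-1]
    return lower


def lemma_term(term: str) -> str:
    """Lemmatize only the last token of a (possibly multi-word) term."""
    tokens = term.split()
    if not tokens:
        return term
    tokens[-1] = toy_lemma(tokens[-1])
    return " ".join(tokens)


def lemmatize_pattern(pattern: str, relation: str = "ISA") -> str:
    """Normalize both arguments of a pattern such as 'cats ISA animals'."""
    left, sep, right = pattern.partition(f" {relation} ")
    if not sep:
        raise ValueError(f"no {relation!r} relation in pattern: {pattern!r}")
    return f"{lemma_term(left)} {relation} {lemma_term(right)}"


if __name__ == "__main__":
    print(lemmatize_pattern("cats ISA animals"))
```

With the release's own example as input, the sketch maps “cats ISA animals” to “cat ISA animal”; a real lemmatizer would of course handle far more cases than these three rules.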
This release brings some minor improvements and new Holing Operations:
- MateTools Parser (GPL) for English and German
- MaltParser (ASL) for English and German
- N-gram Holing operation
- Updated build process reduces project size by about 30%.
- Build options available for the GPL components (documentation)
- Updates to the Hadoop pipeline generators to accommodate the new parsers and their memory/time demands
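
For readers unfamiliar with the N-gram Holing operation listed above, here is a minimal sketch of the general idea, not the project's implementation. It assumes that a holing operation splits each observation into a term and a context in which the term's position is replaced by a hole symbol. The function name `ngram_holing`, the `@` hole symbol, the `_` separator and the default of a trigram window with the hole in the middle are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple

HOLE = "@"  # assumed placeholder for the position of the removed term


def ngram_holing(
    tokens: List[str], n: int = 3, hole_pos: int = 1
) -> Iterator[Tuple[str, str]]:
    """Yield (term, context) pairs from sliding n-gram windows.

    For each window of n tokens, the token at index hole_pos is the term,
    and the context is the window with that token replaced by HOLE.
    """
    if n < 1 or not 0 <= hole_pos < n:
        raise ValueError("need n >= 1 and 0 <= hole_pos < n")
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        term = window[hole_pos]
        context = window[:hole_pos] + [HOLE] + window[hole_pos + 1:]
        yield term, "_".join(context)


if __name__ == "__main__":
    for term, context in ngram_holing("the cat sat down".split()):
        print(term, context)
```

Because the operation needs only tokenized text, it is much cheaper than the parser-based holing operations, which is presumably why the pipeline generators treat the parsers' memory and time demands separately.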
You can download the new release from SourceForge: