A Part-of-Speech Tagging Model Employing Word Clustering and Syntactic Parsing
-
Abstract
Part-of-speech (POS) tagging is a basic task in natural language processing. This paper builds a POS tagger on an improved hidden Markov model (HMM) that employs word clustering and syntactic parsing. First, to overcome the defects of the classical HMM, we introduce a new statistical model, the Markov family model (MFM). Second, to address data sparseness, we propose a bottom-up hierarchical word clustering algorithm. Third, we combine syntactic parsing with POS tagging. Experiments show that the improved POS tagging model outperforms the HMM under the same testing conditions: precision rises from 94.642% to 97.235%.
-
-