Fusion of Word Clustering Features for Tibetan Part of Speech Tagging Based on Maximum Entropy Model

Special Issues Editor (Nottingham Trent University, United Kingdom (Great Britain))

Tibetan part-of-speech (POS) tagging, a foundation of Tibetan natural language processing, assigns each word a category based on its context. Within the framework of the maximum entropy model, this paper studies the fusion of morphological features with word clustering features for Tibetan POS tagging. Experimental results show that maximum entropy based Tibetan POS tagging achieves much better results and that word cluster features significantly improve tagging performance. In addition, the accuracy of the maximum entropy based tagger is 0.81% higher than that of the baseline system.
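As a rough illustration of what fusing word cluster features into a maximum entropy tagger can look like, the sketch below builds context features for one token, appends cluster features, and scores candidate tags with the standard maximum entropy (softmax) form. The feature templates, the `clusters` lookup table, the `UNK` back-off, and the weights are all assumptions made for illustration; the abstract does not specify the paper's actual feature set or training procedure.

```python
import math
from typing import Dict, List, Tuple


def extract_features(words: List[str], i: int, clusters: Dict[str, str]) -> List[str]:
    """Context features for position i, plus hypothetical word cluster features."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    feats = [f"w0={words[i]}", f"w-1={prev_w}", f"w+1={next_w}"]
    # Cluster features (assumed templates): words without a cluster ID map to "UNK".
    feats.append(f"c0={clusters.get(words[i], 'UNK')}")
    feats.append(f"c-1={clusters.get(prev_w, 'UNK')}")
    return feats


def tag_probabilities(
    feats: List[str],
    weights: Dict[Tuple[str, str], float],
    tags: List[str],
) -> Dict[str, float]:
    """p(tag | context) = exp(sum of weights of active features) / Z(context)."""
    scores = {t: sum(weights.get((f, t), 0.0) for f in feats) for t in tags}
    top = max(scores.values())  # subtract the max score for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}
```

In a setup like this, a rare word that shares a cluster with frequent words can still activate a well-trained `c0=...` feature, which is one plausible reason cluster features help on sparse data.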

Journal: International Journal of Simulation: Systems, Science & Technology, IJSSST V17

Published: Feb 28, 2016

DOI: 10.5013/IJSSST.a.17.08.19