A Feedforward Neural Model for Morphological Stemming of Uzbek Words
Abstract
Morphological analysis plays a vital role in natural language processing (NLP), especially for agglutinative languages such as Uzbek, where words are formed by attaching multiple affixes to a stem. This paper presents a character-level feedforward neural network for morphological stemming in Uzbek, which automatically identifies the stem within inflected or derived word forms. The model represents each character as a binary vector and is trained by backpropagation with a sigmoid activation function and a mean squared error loss. Unlike traditional rule-based or dictionary-driven approaches, the proposed method learns morphological patterns directly from annotated data and requires no manually defined linguistic rules. Experimental results show that the model achieves an average stemming accuracy of 88% over 10 training epochs, with consistent convergence. These findings highlight the effectiveness of simple neural architectures in capturing complex structure in morphologically rich languages. The study offers a scalable, data-driven solution for Uzbek stemming and lays a foundation for further advances in Uzbek NLP applications.
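The kind of model described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the alphabet, the 5-bit character code, the fixed word length, the hidden-layer size, the learning rate, the per-character stem mask used as the output, and the four toy word/stem pairs are all assumptions made for illustration. Only the general ingredients (binary character vectors, a feedforward network, sigmoid activations, mean squared error, backpropagation) come from the text.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz'"  # assumed simplified Latin alphabet
BITS = 5       # assumed bits per character (2**5 = 32 codes, 0 is padding)
MAX_LEN = 8    # assumed fixed word length; shorter words are zero-padded


def encode(word):
    """Flatten a word into a binary vector, BITS bits per character."""
    vec = []
    for i in range(MAX_LEN):
        code = ALPHABET.index(word[i]) + 1 if i < len(word) else 0
        vec.extend(float((code >> b) & 1) for b in reversed(range(BITS)))
    return vec


def stem_mask(stem):
    """Assumed target: 1.0 for positions inside the stem, 0.0 elsewhere."""
    return [1.0 if i < len(stem) else 0.0 for i in range(MAX_LEN)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class StemNet:
    """One hidden layer, sigmoid units, trained with plain SGD on MSE."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        xb = x + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Output deltas: error times the sigmoid derivative y * (1 - y).
        d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
        # Hidden deltas, computed before the output weights are changed.
        d_hid = [hj * (1.0 - hj) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v
        return sum((yk - tk) ** 2 for yk, tk in zip(y, target)) / len(y)


def train(net, pairs, epochs):
    """Return the mean per-example MSE recorded for each epoch."""
    losses = []
    for _ in range(epochs):
        total = sum(net.train_step(encode(w), stem_mask(s)) for w, s in pairs)
        losses.append(total / len(pairs))
    return losses


def predict_stem(net, word):
    """Keep the characters whose predicted mask value is at least 0.5."""
    _, y = net.forward(encode(word))
    return "".join(c for c, p in zip(word, y) if p >= 0.5)


# Toy word/stem pairs for illustration only, not the paper's dataset.
DATA = [("kitoblar", "kitob"), ("uylar", "uy"),
        ("maktabda", "maktab"), ("bolalar", "bola")]

net = StemNet(n_in=BITS * MAX_LEN, n_hidden=12, n_out=MAX_LEN)
losses = train(net, DATA, epochs=300)  # far more epochs than the paper's 10
print(f"first-epoch MSE {losses[0]:.4f}, last-epoch MSE {losses[-1]:.4f}")
print(predict_stem(net, "kitoblar"))
```

With only four training pairs the network can do little more than memorize them, so the sketch shows the mechanics of encoding and training and says nothing about the reported accuracy. Predicting a per-character mask is one of several plausible output designs; the source does not specify which one the model uses.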