Development of a Legal Document Recognition Algorithm for the Karakalpak Language
Abstract
In this study, the authors propose an algorithm for recognizing legal documents in Karakalpak texts. To develop the algorithm, they reviewed related scientific works and identified the relevance of the present work and the problems it needs to solve. The proposed algorithm is rule-based: it relies on the morphological rules of the Karakalpak language and on a dictionary of tagged words used to identify legal words and phrases in texts. The dictionary contains more than 12,000 tagged words, including both word roots and forms in which roots are combined with grammatical affixes. The authors also tested the algorithm and report high accuracy in identifying the target words. For testing, three samples were formed, each containing a certain number of legal terms and phrases. The paper concludes with proposed improvements and prospects for further development of the algorithm.
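The dictionary-plus-morphology approach summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary entries, the affix list, and the function names `find_root` and `recognize_legal_terms` are hypothetical placeholders chosen for illustration, and the real system uses a dictionary of more than 12,000 tagged words together with a fuller set of Karakalpak morphological rules.

```python
# Illustrative sketch only. The entries and affixes below are assumed
# placeholders, not taken from the authors' tagged dictionary or rule set.
LEGAL_DICTIONARY = {
    "nızam": "LEGAL",  # assumed example entry ("law")
    "sud": "LEGAL",    # assumed example entry ("court")
}

# A tiny assumed sample of grammatical affixes (plural and genitive forms).
AFFIXES = ["lar", "ler", "dıń", "nıń"]


def find_root(word):
    """Return the dictionary root behind `word`, or None if there is none.

    The word is looked up directly first; otherwise known affixes are
    stripped from the end, one at a time, and the remainder is retried.
    Case normalization is deliberately left out of this sketch.
    """
    if word in LEGAL_DICTIONARY:
        return word
    for affix in AFFIXES:
        if word.endswith(affix) and len(word) > len(affix):
            root = find_root(word[: -len(affix)])
            if root is not None:
                return root
    return None


def recognize_legal_terms(text):
    """Return (token, root) pairs for tokens that resolve to a legal root."""
    matches = []
    for raw in text.split():
        token = raw.strip(".,;:!?()\"'")
        root = find_root(token) if token else None
        if root is not None:
            matches.append((token, root))
    return matches


if __name__ == "__main__":
    print(recognize_legal_terms("sudlar nızamlardıń kitap"))
```

In this sketch a word counts as a legal term if it is in the dictionary directly or becomes a dictionary root after suffix stripping. Since the abstract says the dictionary also stores affixed forms, many lookups in the real system would presumably succeed without any stripping.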