Article

Historical Text Analysis Using Transformer Models for Language Decoding and Translation

Saodat ToshtemirovaSvetlana UmirovaSamarkand State University Named After Sharof Rashidov,Samarkand,UzbekistanShokhrukhbek IkromovAndijan State Institute of Foreign Languages,UzbekistanFayzullo BegovNational Research University,Tashkent Institute of Irrigation and Agricultural Mechanization EngineersDilshodbek AllayarovUgiljon QushnazarovaUrgench State University,Department of Pedagogy and Psychology,Urgench City,Uzbekistan

2025en

ABI

Abstract

Ancient text deciphering and translation are essential to know about past civilizations, but standard linguistic approaches do not work because contextual details are missing and language patterns change over time. Breakthroughs in transformer models present potential solutions for language decoding and translation. Current approaches are based primarily on rule-based or statistical methods, which perform poorly with incomplete texts, low-frequency linguistic patterns, and insufficient annotated datasets, resulting in poor-quality translations. Further, most available deep learning models are non-transferable to less familiar ancient scripts. To offset these constraints, we propose the Decode and Translate Ancient Languages using BERT (DTAL-BERT) approach, which leverages BERT-based transformers in contextual language modeling. DTAL-BERT fuses self-attention mechanisms and masked language modeling to reconstruct missing segments, improving translation precision and contextual perception.Moreover, it applies transfer learning from recent language corpora to enhance recognition of ancient text. The DTALBERT model enables the decoding and semantic analysis of old languages more specifically and efficiently, aiding linguists and historians in better interpreting text. Experimental validations confirm that DTAL-BERT translates more efficiently and accurately and outperforms the contextualization and robustness of conventional models in reconstructing fragmented texts. The framework contributes significantly to preserving and comprehending ancient languages through AI-based methods.

Topics

Natural Language Processing Techniques

Identifiers

DOI: 10.1109/iccies63851.2025.11032731

Citations and references

Cited by 014 references

Metrics — AkademScholar