Article

Corpus-based Uncertainty Analysis of Multilingual Media under Language Policy

Sulieman Ibraheem Shelash Al-Hawary
    Electronic Marketing and Social Media, Economic and Administrative Sciences, Zarqa University, Zarqa 13110, Jordan
    Faculty of Business and Communications, INTI International University, Nilai 71800, Malaysia
Yogeesh Nijalingappa
    Department of Mathematics, Government First Grade College, Tumkur 572101, India
Hanan Jadallah
    Electronic Marketing and Social Media, Economic and Administrative Sciences, Zarqa University, Zarqa 13110, Jordan
Naveed Iqbal Raja
    Department of Visual Communication, Sathyabama Institute of Science and Technology, Chennai 600119, India
Azizbek Qaraqulov
    Department of Uzbek Language and Literature, Termez University of Economics and Service, Termez 190111, Uzbekistan
Asokan Vasudevan
    Faculty of Business and Communications, INTI International University, Nilai 71800, Malaysia
    Faculty of Management, Shinawatra University, Sam Khok 12160, Thailand
    Department of Business Studies, Wekerle Business School, 1083 Budapest, Hungary
Sadoqat Masharipova
    Department of Roman-Germanic Philology, Mamun University, Khiva 220900, Uzbekistan

Abstract

This paper presents a mathematical framework for quantifying graded language mixing in media texts surrounding a policy reform. We model each document as generated by probabilistic n-gram models for two languages, interpret the resulting posterior probabilities as soft-membership degrees, and apply Shannon entropy to measure per-document mixing. A fuzzification exponent controls assignment sharpness, and aggregate entropy across documents yields a corpus-level metric tracked over pre- and post-reform intervals. In a case study of 20 headlines, mean entropy rose from 0.52 to 0.68 nats (∆ = 0.16), indicating increased code-mixing after the policy change. Statistical validation via a paired t-test (t = 3.27, p < 0.01) and a permutation test (p = 0.005) confirms the significance of this shift. Analysis of soft-membership distributions reveals a drop in average English membership from 0.77 to 0.52, further illustrating editorial adaptation. The modular implementation enables scalable analysis of large corpora, and an open-source toolkit is provided to promote reproducibility and extension to other bilingual or multilingual settings. We discuss limitations related to parameter sensitivity, model assumptions, and sample size, and outline future extensions involving imprecise-probability bounds, contextual embeddings, dynamic time-series modeling, and topic-augmented uncertainty. Our results demonstrate the power of information-theoretic tools for detecting subtle shifts in media discourse in response to regulatory changes.
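The pipeline described in the abstract (language-model scores, soft memberships, per-document Shannon entropy, corpus-level mean) can be illustrated with a minimal Python sketch. This is not the authors' toolkit. The function names, the uniform default prior, and in particular the form of the fuzzification step (raising the unnormalized posterior to the power 1/m and renormalizing) are assumptions made here for illustration; the abstract does not state how the exponent enters the model.

```python
import math


def soft_memberships(log_likelihoods, log_priors=None, m=1.0):
    """Turn per-language log-likelihoods of one document into soft memberships.

    ASSUMPTION: the fuzzification exponent m tempers the posterior,
    mu_k proportional to (P(doc | L_k) * P(L_k)) ** (1 / m).
    With m = 1 this is the ordinary Bayes posterior; larger m gives
    softer (more uniform) assignments.
    """
    if m <= 0:
        raise ValueError("fuzzification exponent m must be positive")
    k = len(log_likelihoods)
    if log_priors is None:
        # ASSUMPTION: uniform prior over languages when none is given.
        log_priors = [math.log(1.0 / k)] * k
    scores = [(ll + lp) / m for ll, lp in zip(log_likelihoods, log_priors)]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def shannon_entropy(memberships):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in memberships if p > 0.0)


def corpus_entropy(documents, m=1.0):
    """Mean per-document entropy over a list of log-likelihood vectors."""
    entropies = [shannon_entropy(soft_memberships(d, m=m)) for d in documents]
    return sum(entropies) / len(entropies)
```

For two languages the entropy is bounded above by ln 2 ≈ 0.693 nats, reached when a document is assigned equally to both. Under the assumptions above, `corpus_entropy` would be evaluated separately on the pre-reform and post-reform document sets, and the difference of the two means corresponds to the ∆ reported in the abstract.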
