Weighted lexicon and hybrid neural models for fake news detection in low-resource media
Abstract
This study addresses the detection of fake news in low-resource languages. It proposes a hybrid framework that combines weighted lexical indicators with neural representations, pairing the transparency of lexicon-based scoring with the adaptability of transformer-based models. Bigram lexicons form the first layer: their categorical labels are converted into statistically derived weights that express each term’s discriminative power. To improve reliability, lexical signals are aggregated not only with the conventional maximum rule but also with more general aggregation procedures that capture evidence dispersed across a text. The lexicon is then expanded automatically with contextual embedding models, which extends it beyond the manually constructed seed vocabulary. The weighted lexical features serve two roles: they drive interpretable rule-based scoring, and, when integrated into transformer classifiers, they take part in probabilistic fusion at the neural level. In the hybrid setting, the ensemble is built by stacking.
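The lexical layer described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the example bigrams, the Laplace-smoothed class ratio used as the weight, and the choice of noisy-OR as one possible "more general" aggregation alongside the maximum rule.

```python
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def weight_from_counts(fake: int, real: int) -> float:
    """Hypothetical weighting: Laplace-smoothed share of 'fake' occurrences."""
    return (fake + 1) / (fake + real + 2)


# Illustrative lexicon with made-up weights in [0, 1].
LEXICON: Dict[Bigram, float] = {
    ("shocking", "truth"): 0.9,
    ("doctors", "hate"): 0.8,
    ("anonymous", "source"): 0.4,
}


def matched_weights(text: str, lexicon: Dict[Bigram, float]) -> List[float]:
    """Return the weights of all lexicon bigrams found in the text."""
    tokens = text.lower().split()
    return [lexicon[b] for b in zip(tokens, tokens[1:]) if b in lexicon]


def max_rule(weights: List[float]) -> float:
    """Conventional aggregation: only the strongest single cue counts."""
    return max(weights, default=0.0)


def noisy_or(weights: List[float]) -> float:
    """Assumed alternative: several weak cues accumulate into a higher score."""
    remaining = 1.0
    for w in weights:
        remaining *= 1.0 - w
    return 1.0 - remaining


weights = matched_weights("Shocking truth from an anonymous source", LEXICON)
print(max_rule(weights), noisy_or(weights))
```

Under the maximum rule the second, weaker bigram has no effect on the score, whereas the noisy-OR score rises with each additional match; this is the sense in which a more general aggregation captures dispersed evidence.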