Information-Measuring Approach to Multimodal Educational Content Quality via Neuro-Symbolic AI
Abstract
The digital transformation of education has led to a proliferation of multimodal content, creating a need for scalable, pedagogically grounded methods of quality assessment. This paper introduces and empirically evaluates a novel neuro-symbolic framework for this task. The architecture uses a unified multimodal Transformer encoder to jointly process video, audio, and text streams and to learn cross-modal representations. These dense neural representations are then interpreted by a symbolic reasoning engine that encodes established pedagogical principles (e.g., Cognitive Load Theory) as a fuzzy logic system. We validate the framework against strong baselines, including a CNN+BERT architecture, on a large-scale educational dataset. The results yield a crucial insight: while a conventional late-fusion baseline achieves a higher correlation (r = 0.976) with engagement-based proxy labels, the proposed neuro-symbolic model also performs strongly (r = 0.903). More importantly, its interpretability reveals a critical misalignment between established pedagogical theories and the engagement metrics used as ground truth in real-world datasets. This analysis highlights the practical challenges of grounding symbolic reasoning and points to a clear direction for future research on more robust and trustworthy hybrid AI systems for education.
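To make the idea of encoding a pedagogical principle as fuzzy rules more concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes two hypothetical scalar features (`speech_rate_wpm` and `visual_density`) standing in for the encoder's outputs, and the membership thresholds and rule outputs are invented for illustration only.

```python
def ramp_up(x, lo, hi):
    """Fuzzy membership that rises linearly from 0 at `lo` to 1 at `hi`."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def quality_score(speech_rate_wpm, visual_density):
    """Toy rule base loosely inspired by Cognitive Load Theory.

    All thresholds and rule outputs are illustrative assumptions,
    not values taken from the paper.
    """
    fast = ramp_up(speech_rate_wpm, 120.0, 200.0)   # degree of "speech is fast"
    dense = ramp_up(visual_density, 0.3, 0.8)       # degree of "visuals are dense"

    # Rule 1: IF fast AND dense THEN quality is low (AND = min).
    overload = min(fast, dense)
    # Rule 2: IF NOT fast AND NOT dense THEN quality is high.
    comfortable = min(1.0 - fast, 1.0 - dense)
    # Rule 3: otherwise quality is medium.
    mixed = 1.0 - max(overload, comfortable)

    strengths = [overload, comfortable, mixed]
    outputs = [0.2, 0.9, 0.5]  # crisp consequents (zero-order Sugeno style)

    # Defuzzify with a firing-strength-weighted average.
    # The strengths sum to at least 1, so the division is safe.
    return sum(s * o for s, o in zip(strengths, outputs)) / sum(strengths)


if __name__ == "__main__":
    print(quality_score(100.0, 0.1))  # slow speech, sparse visuals
    print(quality_score(220.0, 0.9))  # fast speech, dense visuals
```

Because every rule's firing strength is exposed, a low score can be traced back to the rule that caused it, which is the kind of interpretability the abstract attributes to the symbolic component.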