Article

Hybrid Big Data Analytics: Integrating Structured and Unstructured Data for Predictive Intelligence

Renas Rajab Asaad, Department of Computer Science, Khazar University, Baku, Azerbaijan
Rahman Ali, College of Science, Computer Science Department, Nawroz University, Duhok, Iraq
Saman M. Almufti, Computer and Software Engineering, L.N. Gumilyov Eurasian National University, Astana, Kazakhstan
2022
ABI

Abstract

Hybrid big data analytics has emerged as a compelling paradigm for predictive intelligence, yet most operational pipelines still privilege a single modality—either structured relational data or unstructured text—thereby under-exploiting complementary signals. This paper proposes a unified framework that integrates structured records (e.g., time-series sensors, tabular attributes) with unstructured corpora (e.g., clinical narratives, web-scale text) through a multi-modal deep learning architecture coupled with scalable clustering and query optimization. The method fuses static encoders, temporal CNN/LSTM modules, and text representations (e.g., document embeddings with BiLSTM/CNN) in a learned fusion layer, and augments inference with a Gaussian Mixture Model optimized by a bio-inspired Salp Swarm Algorithm for low-latency, distributed querying. Experiments across two representative domains—infectious-disease forecasting and Industry 4.0 cycle-time projection—demonstrate consistent gains over single-modality baselines in AUROC, F1, MAE, and AUPRC, while preserving near real-time responsiveness on commodity GPU/CPU clusters. We discuss integration complexity, interpretability challenges, and deployment constraints, and delineate practical pathways for edge-side execution, transfer learning across domains, and explainability overlays. By systematically bridging structured and unstructured modalities, the study evidences material performance improvements and offers a robust template for multimodal analytics in high-stakes environments.
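
The abstract pairs a Gaussian Mixture Model with a Salp Swarm Algorithm but does not say which parameters the swarm tunes. The sketch below is therefore only an illustration under stated assumptions: a toy 1-D mixture with equal weights and fixed unit variance, where the swarm searches for the component means that minimize the negative log-likelihood. The function names (`gmm_nll`, `salp_swarm`), the synthetic data, the choice of treating the first half of the chain as leaders, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import math
import random


def gmm_nll(means, data, sigma=1.0):
    """Negative log-likelihood of an equal-weight 1-D Gaussian mixture."""
    norm = 1.0 / (len(means) * sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for x in data:
        density = norm * sum(math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in means)
        total -= math.log(max(density, 1e-300))  # guard against log(0)
    return total


def salp_swarm(objective, dim, lb, ub, n_salps=30, n_iter=100, seed=0):
    """Minimize `objective` over the box [lb, ub]^dim with a salp chain."""
    rng = random.Random(seed)
    salps = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_salps)]
    fitness = [objective(s) for s in salps]
    best_idx = min(range(n_salps), key=fitness.__getitem__)
    food, food_fit = salps[best_idx][:], fitness[best_idx]
    history = [food_fit]  # best objective value seen so far, per iteration

    for it in range(1, n_iter + 1):
        # c1 decays over time: wide exploration early, fine search late.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iter) ** 2))
        for i in range(n_salps):
            if i < n_salps // 2:
                # Leaders (assumption: first half of the chain) move around the food source.
                for j in range(dim):
                    step = c1 * ((ub - lb) * rng.random() + lb)
                    if rng.random() >= 0.5:
                        salps[i][j] = food[j] + step
                    else:
                        salps[i][j] = food[j] - step
            else:
                # Followers move to the midpoint between themselves and their predecessor.
                for j in range(dim):
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
            # Keep every salp inside the search box.
            salps[i] = [min(max(v, lb), ub) for v in salps[i]]

        # The food source is the best position found so far.
        for s in salps:
            f = objective(s)
            if f < food_fit:
                food, food_fit = s[:], f
        history.append(food_fit)

    return food, food_fit, history


# Synthetic data (hypothetical): two clusters centred near -2 and 3.
data_rng = random.Random(1)
data = [data_rng.gauss(-2.0, 1.0) for _ in range(100)]
data += [data_rng.gauss(3.0, 1.0) for _ in range(100)]

means, nll, history = salp_swarm(lambda m: gmm_nll(m, data), dim=2, lb=-10.0, ub=10.0)
print("estimated means:", sorted(means), "NLL:", nll)
```

A production pipeline would more likely fit weights and covariances as well, for example by using the swarm only to initialize or tune a standard EM fit. The abstract does not specify which variant the authors use.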

Citations and sources

Citations: 2. Sources used: 0.