Article

A Unified Big Data Analytics Framework using AutoML and Deep Learning for Real-time Business Intelligence

Varsha Mittal, Graphic Era Deemed to be University, Department of Computer Science & Engineering, Dehradun, India

Samariddin Makhmudov, Termez University of Economics and Service, Department of Finance and Tourism, Termez

Kunduz Bakdurdiyeva, Urgench State University, Department of History, Urgench, Uzbekistan

Shakhista Allamova, Urgench Innovation University, Department of Pedagogy and Primary Education Methodology, Urgench, Uzbekistan

Momogul Ismailova, Urgench State Pedagogical Institute, Department of Technological Education, Urgench, Uzbekistan

Sindor Sapaev, Urgench State University, Department of Economy, Urgench, Uzbekistan

2025

ABI

Abstract

This paper presents an artefact-centric analytics framework that reconciles predictive utility, low-latency inference, and auditable traceability for real-time business intelligence (BI). Modern BI systems increasingly require low-latency, auditable predictive analytics, but they suffer from gaps between offline model development and production serving. These gaps are caused by feature-parity breaks, schema drift, tail latency, and weak KPI → exemplar traceability. Our design couples a parity-preserving feature fabric (materialised Delta views) with constrained AutoML using multi-fidelity search and a compilation/distillation pipeline that produces registry-tracked ONNX/TVM artefacts. A gated serving policy and a distilled fast path reconcile ensemble-quality decisions with median- and tail-latency budgets, while provenance capture and schema/version governance enable KPI → exemplar traceability and auditable rollbacks. Evaluation on a production-like BI workload shows that the AutoML-selected ensemble achieves F1 = 0.768, while the distilled fast path recovers F1 = 0.754 and meets median latency targets by limiting ensemble invocations to ≤10%. Ablation studies show that multi-fidelity evaluation reduces search cost with modest utility loss, and drift-injection experiments show that automated warm-start retraining restores KPIs within a single retraining cycle (∼3–4 hours) when thresholds are exceeded. Contributions include an artefact-centric pipeline enforcing offline/online parity and traceability; a constrained AutoML plus compilation workflow that meets deployment budgets; and an operational governance stack validating automated recovery across diverse BI domains.
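The gated serving idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the gate decides, so the code below assumes one plausible rule: escalate to the ensemble only when the fast-path score falls in an uncertainty band around 0.5, and only while escalations stay within a 10% share of requests. The class name `GatedServer`, the `margin` value, and the scorer signatures are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Features = Sequence[float]
Scorer = Callable[[Features], float]  # assumed to return P(positive) in [0, 1]


@dataclass
class GatedServer:
    """Hypothetical gate: distilled fast path by default, ensemble on demand."""

    fast_path: Scorer                # stand-in for the distilled model
    ensemble: Scorer                 # stand-in for the AutoML-selected ensemble
    margin: float = 0.15             # assumed uncertainty band around 0.5
    max_ensemble_rate: float = 0.10  # invocation budget (10% per the abstract)
    served: int = 0
    escalated: int = 0

    def predict(self, x: Features) -> Tuple[float, str]:
        """Return (score, route), where route names the model that answered."""
        self.served += 1
        score = self.fast_path(x)
        uncertain = abs(score - 0.5) < self.margin
        # Escalate only if doing so keeps escalated / served within budget.
        within_budget = (self.escalated + 1) <= self.max_ensemble_rate * self.served
        if uncertain and within_budget:
            self.escalated += 1
            return self.ensemble(x), "ensemble"
        return score, "fast_path"


if __name__ == "__main__":
    # Toy scorers for illustration only; real models would replace these.
    server = GatedServer(fast_path=lambda x: x[0], ensemble=lambda x: min(1.0, x[0] + 0.2))
    for value in (0.95, 0.52, 0.10, 0.48):
        print(value, server.predict([value]))
    print("ensemble share:", server.escalated / server.served)
```

Because the budget is checked before each escalation, the ensemble share never exceeds the configured rate, at the cost of answering some uncertain requests from the fast path. A production gate would likely also account for tail latency, which this sketch ignores.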

Topics

Machine Learning and Data Classification; Big Data and Business Intelligence; Data Stream Mining Techniques

Identifiers

DOI: 10.1109/ic-eeta66496.2025.11548328

Citations and references

Cited by 0 · 17 references
