An RL-Enhanced Multi-Agent Framework for Scalable and Intelligent Business Intelligence Systems
Abstract
In many organizations, business intelligence systems support analytical reporting and operational decision making. As data volumes grow and analytical tasks become more complex, architectures based on centralized processing pipelines increasingly face limitations related to scalability and timely response. These challenges motivate the development of alternative architectural approaches capable of operating efficiently in data-intensive environments. This study presents a modular multi-agent business intelligence framework that distributes analytical tasks across autonomous agents and applies lightweight reinforcement learning at the decision-making stage. The analytical workflow is decomposed into agents responsible for data collection, preprocessing, analytical modeling, and decision execution. Decision adaptation relies on localized policy updates driven by operational feedback, which avoids complex learning coordination and helps preserve system stability and interpretability. The proposed framework is evaluated using real-world transactional data from an electronic commerce setting. Experimental results show that the approach consistently outperforms centralized analytical pipelines and non-agent machine learning baselines in terms of processing efficiency, classification accuracy, and balanced classification performance. Threshold-independent evaluation further confirms stronger discriminative behavior across varying decision thresholds. In addition, stability analysis across repeated experimental runs indicates reduced performance variance and more predictable system behavior. These findings suggest that the proposed multi-agent business intelligence framework provides a practical and scalable alternative to centralized analytical architectures for data-intensive decision-support environments, while maintaining the robustness and transparency required in enterprise systems. 
Experiments on the Online Retail dataset (UCI Machine Learning Repository) yield a mean accuracy of 0.89 ± 0.012 (baseline: 0.74 ± 0.029) and a mean decision latency of 94 ± 9 ms (baseline: 137 ± 16 ms) across 10 independent runs, indicating stable behavior under repeated execution. The evaluation is limited to a single dataset and a single classification task, so these results should be interpreted within that scope.
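As an illustration only, the agent decomposition and the localized policy update described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation evaluated in this study: the epsilon-greedy bandit over decision thresholds, the synthetic records, and all names (`ThresholdPolicy`, `collect`, `preprocess`, `score`, `run_pipeline`) are hypothetical stand-ins chosen for brevity.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ThresholdPolicy:
    """Hypothetical lightweight RL component: an epsilon-greedy bandit
    that picks a decision threshold and updates only its own value table."""
    thresholds: tuple = (0.3, 0.5, 0.7)
    epsilon: float = 0.1
    values: list = field(init=False)
    counts: list = field(init=False)

    def __post_init__(self):
        self.values = [0.0] * len(self.thresholds)
        self.counts = [0] * len(self.thresholds)

    def select(self, rng):
        # Explore with probability epsilon, otherwise exploit the best arm.
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.thresholds))
        return max(range(len(self.thresholds)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        # Incremental mean: a purely local update, no cross-agent coordination.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def collect(rng, n):
    """Data collection agent: emits synthetic (amount, label) records.
    The data are fabricated for the sketch, not drawn from Online Retail."""
    records = []
    for _ in range(n):
        label = rng.random() < 0.5
        amount = rng.gauss(70.0 if label else 30.0, 15.0)
        records.append((amount, label))
    return records


def preprocess(amount):
    """Preprocessing agent: clip to [0, 100] and scale to [0, 1]."""
    return min(max(amount, 0.0), 100.0) / 100.0


def score(x):
    """Analytical modeling agent: identity scoring as a stand-in for a model."""
    return x


def run_pipeline(n=2000, seed=0):
    """Decision execution agent: thresholds each score, then feeds the
    operational outcome (correct or not) back into the local policy."""
    rng = random.Random(seed)
    policy = ThresholdPolicy()
    correct = 0.0
    for amount, label in collect(rng, n):
        s = score(preprocess(amount))
        arm = policy.select(rng)
        decision = s >= policy.thresholds[arm]
        reward = 1.0 if decision == label else 0.0
        policy.update(arm, reward)
        correct += reward
    return policy, correct / n


if __name__ == "__main__":
    policy, accuracy = run_pipeline()
    print("estimated value per threshold:", policy.values)
    print("online accuracy:", accuracy)
```

In this sketch the feedback loop touches only the decision agent's value table, which is one simple way to keep adaptation local and inspectable; the upstream agents remain stateless functions.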