Article

Data Science Optimization Frameworks for Computationally Efficient Scientific Simulation: A Cross-Domain Narrative Review

Sabir Z. Sharipov, Independent Researcher, Bukhara, Uzbekistan

Abstract

This is a preprint of a narrative review with original experimental validation. Artificial intelligence and scientific simulation workloads impose substantial computational and energy costs, which limits access for researchers and institutions without large budgets. This paper proposes a four-tier data science optimization framework for computationally efficient scientific simulation across five domains: healthcare digital twins, computational chemistry, computational neuroscience, manufacturing optimization, and digital olfaction. The review synthesizes evidence for the four tiers: (1) domain-informed feature engineering and dimensionality reduction, (2) surrogate modeling and physics-informed learning, (3) transfer learning and data-efficient adaptation, and (4) model compression and inference acceleration. Original experiments on six publicly available proxy datasets and a controlled ANI-2x molecular benchmark identify empirical boundary conditions for the framework's applicability. In particular, transfer learning can degrade rather than improve performance when source and target domains have different generative processes; the framework fails when baseline predictability is too low (R² < 0.3); and compression effectiveness depends on upstream model quality. A controlled wall-clock benchmark against the ANI-2x physics potential demonstrates a 10²–10³× surrogate speedup with R² = 0.984 on identical molecular conformations. All experiments were conducted using freely available computing resources. Working Paper / Preprint: not yet peer-reviewed.
