Статья

Learning Adaptive Routing Policies Under Stochastic Service Times

Yelbek AbdumajitovDept. of Computer Science New Uzbekistan University,Tashkent,UzbekistanOybek AbdukhalimovDept. of Computer Science New Uzbekistan University,Tashkent,UzbekistanAzamkhon AzamovKhalifa University,Dept. of Mathematics,Abu Dhabi,UAEShirali KadyrovNew Uzbekistan University,Dept. of General Education,Tashkent,Uzbekistan

2026

ABI

Аннотация

Field service routing is strongly affected by uncertainty in on-site service durations, where deviations from nominal times propagate through the schedule and induce tardiness. We study the Multi-Vehicle Routing Problem with Stochastic Service Times (MVRP-SST), in which customer service times are revealed only upon arrival and follow customer-specific Gamma distributions. We formulate the problem as a cooperative Dec-POMDP and propose a Multi-Agent Proximal Policy Optimization (MAPPO) approach with centralized training and decentralized execution to learn adaptive routing policies that react to realized service times. The objective minimizes a weighted cost comprising travel distance and tardiness (with an additional penalty for unserved customers). Experiments on synthetic instances with <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$K=3$</tex> vehicles and problem sizes <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$N \in\{30,50,100\}$</tex> show that MAPPO consistently improves the overall objective compared with classical constructive heuristics and simulated annealing, with particularly strong gains at larger scales. Statistical testing confirms significant differences across methods via the Friedman test, and post-hoc Wilcoxon signedrank tests with Holm correction indicate that MAPPO significantly outperforms all baselines for <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$N=50$</tex> and <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$N=100$</tex>, and all except simulated annealing for <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$N=30$</tex>.

Перевод пока недоступен

Темы

Network Traffic and Congestion Control Advanced Optical Network Technologies Age of Information Optimization

Идентификаторы

DOI: 10.1109/icecco67619.2026.11488784

Цитирования и источники

Цитирований: 0Использованных источников: 10

Показатели — AkademScholar