Q-Learning-Based Feedback Optimization for Operator Training in Industrial Robotics
Abstract
In today's manufacturing environments, improving efficiency, safety, and accuracy in the use of industrial robots depends on adequate operator training. Although effective skill training can yield substantial efficiency gains, traditional training approaches slow skill acquisition because they do not provide immediate, adaptive feedback. We therefore propose the Q-Learning-Based Feedback Optimization (QLFO) approach, which uses reinforcement learning to optimize the feedback given during training sessions. The feedback process is modeled as a Markov Decision Process (MDP); the Q-learning algorithm observes the operator's actions and the system's states and learns the best feedback policy. The policy is adjusted in real time based on the operator's actions to improve training outcomes. The system was implemented in a simulated robotic training environment and evaluated against static feedback methods. The experimental results suggest that QLFO shortens training time and improves operator skill and task completion accuracy. QLFO reduces training time to 5 minutes and achieves 95% task accuracy, outperforming DQN-RAR, RLJSS, and Q-FTTC. It also raises the skill improvement rate to 0.47 and reduces the error rate to 1.8%, improving feedback efficiency. These results demonstrate the potential of reinforcement learning to enhance human-robot interaction through adaptive training feedback in industrial robotics operator skill development.
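To make the idea of learning a feedback policy over an MDP concrete, the following is a minimal tabular Q-learning sketch. It is an illustration under stated assumptions, not the QLFO implementation: the state set (five coarse operator skill levels), the action set (three feedback types), the toy operator model `simulate_operator`, the reward values, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical MDP (assumption): states are coarse operator skill levels,
# actions are the feedback types the training system can give.
N_SKILL_LEVELS = 5  # the last level (4) means the task is mastered (terminal)
FEEDBACK_ACTIONS = ["none", "hint", "detailed"]


def simulate_operator(skill, action, rng):
    """Toy operator model (assumption): returns (next_skill, reward)."""
    # Assumed chance that a feedback type lifts the operator one skill level:
    # detailed feedback suits novices, short hints suit experienced operators.
    if skill < 2:
        p_improve = {"none": 0.05, "hint": 0.2, "detailed": 0.9}[action]
    else:
        p_improve = {"none": 0.1, "hint": 0.9, "detailed": 0.2}[action]
    next_skill = skill + 1 if rng.random() < p_improve else skill
    # Assumed reward: -1 per training step (time cost), +10 on reaching mastery.
    reward = 10.0 if next_skill == N_SKILL_LEVELS - 1 else -1.0
    return next_skill, reward


def train_q_table(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    n_actions = len(FEEDBACK_ACTIONS)
    q = [[0.0] * n_actions for _ in range(N_SKILL_LEVELS)]
    for _ in range(episodes):
        skill = 0  # every simulated operator starts as a novice
        while skill < N_SKILL_LEVELS - 1:
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda i: q[skill][i])  # exploit
            next_skill, reward = simulate_operator(skill, FEEDBACK_ACTIONS[a], rng)
            # The terminal state has no future value.
            terminal = next_skill == N_SKILL_LEVELS - 1
            future = 0.0 if terminal else max(q[next_skill])
            # Standard Q-learning update toward the bootstrapped target.
            q[skill][a] += alpha * (reward + gamma * future - q[skill][a])
            skill = next_skill
    return q


def greedy_policy(q):
    """Best feedback type for each non-terminal skill level."""
    return [
        FEEDBACK_ACTIONS[max(range(len(row)), key=lambda i: row[i])]
        for row in q[:-1]
    ]


if __name__ == "__main__":
    q_table = train_q_table()
    for level, feedback in enumerate(greedy_policy(q_table)):
        print(f"skill level {level}: give '{feedback}' feedback")
```

Under this toy operator model, the learned policy should favor detailed feedback at low skill levels and hints at higher ones, which is the kind of adaptation a static feedback scheme cannot provide. In a real training system the states, rewards, and transition behavior would come from observed operator performance rather than from a hand-written simulator.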