Research on Comprehensive Unit Price Estimation for Temporary Repair of Ship Equipment Based on the PPO Algorithm
Zhiyin Wang, Li Xie

After temporary repair of naval ship equipment is completed, cost settlement has long relied on ex post auditing, which leads to long settlement cycles and leaves the military without an immediate pricing reference. To address this issue, a comprehensive unit price estimation method based on Proximal Policy Optimization (PPO) is proposed. It rapidly generates a reasonable unit price for each process once the repair is completed, providing a quantitative benchmark for negotiation. The unit price estimation problem is formulated as a Markov decision process, and a multi-objective reward function is designed that combines a range reward, a compliance penalty, and a final accuracy reward. To alleviate the sparse reward problem, potential-based reward shaping using the Critic network is introduced, which distributes the final accuracy signal across the individual pricing steps. The clipping mechanism of PPO limits the magnitude of each policy update, which improves training stability. Experimental results on 12,000 desensitized real repair records show that the proposed method achieves a mean absolute percentage error (MAPE) of 11.3%, a coefficient of determination (R²) of 0.913, and an abnormal estimation rate (AER) of 3.5%. Compared with standard PPO, the AER is reduced by 59%. The proposed method sequentially outputs reasonable unit prices after repair completion, exploring a technical pathway for moving temporary repair funding from ex post auditing to immediate verification.
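
The two mechanisms named in the abstract, potential-based reward shaping and the PPO clipped objective, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of a plain callable `potential` in place of the Critic network, the zero potential at the terminal state, and the default values of `gamma` and `eps` are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def shaped_rewards(
    rewards: Sequence[float],
    states: Sequence[object],
    potential: Callable[[object], float],
    gamma: float = 0.99,
) -> List[float]:
    """Potential-based shaping: r'_t = r_t + gamma * phi(s_{t+1}) - phi(s_t).

    `states` holds len(rewards) + 1 entries; the last one is the terminal
    state, whose potential is fixed to 0 (an assumption of this sketch).
    `potential` is a hypothetical stand-in for a Critic value estimate.
    """
    if len(states) != len(rewards) + 1:
        raise ValueError("states must contain one more entry than rewards")
    # Potential of every non-terminal state, then 0 for the terminal state.
    phis = [potential(s) for s in states[:-1]] + [0.0]
    return [r + gamma * phis[t + 1] - phis[t] for t, r in enumerate(rewards)]


def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO clipped objective for one sample.

    `ratio` is pi_new(a|s) / pi_old(a|s). Clipping it to [1 - eps, 1 + eps]
    and taking the minimum removes the incentive for large policy changes.
    """
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped_ratio * advantage)
```

In this sketch, an episode whose only nonzero reward arrives at the last step (a sparse final accuracy signal) receives nonzero shaped rewards at earlier steps. Because the shaping terms telescope, the discounted return of the episode changes only by the potential of the initial state.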