Performance Optimization of Distributed Data Processing in Centralized Control System Based on Spark and GPU Collaboration
Xunting Wang, Cheng Xie, Jinjin Ding, Bin Xu, Jianlin Li, Weimin Huang

Constrained by the computational limits of the CPU (Central Processing Unit), the traditional Spark architecture struggles to achieve high throughput and low latency under the dual pressure of large data scale and real-time requirements in centralized control systems. Furthermore, Spark-GPU (Graphics Processing Unit) collaboration suffers from load imbalance caused by heterogeneous resource scheduling and communication overhead, and therefore fails to reach its performance potential. This paper proposes a Spark-GPU fusion acceleration path with three key components: first, it integrates the RAPIDS accelerator; second, it designs a GPU-aware partitioning and task co-scheduling strategy; and third, it optimizes the zero-copy data path. Together, these components realize an integrated collaboration of heterogeneous resources. This work uses a publicly available CNC (Computer Numerical Control) milling dataset as a functional validation proxy for time-series data processing, then extends validation to a large-scale synthetic power transmission grid dataset. Validation yields the following results. In real-time aggregation scenarios, the proposed solution improves throughput by a factor of 3.7 over the pure CPU baseline and reduces end-to-end latency by 62%. Compared with the basic GPU solution, GPU utilization rises from 51.7% to 72.3%, a relative improvement of 39.8%. Furthermore, the solution meets industrial-grade high availability requirements. This research thus provides a feasible technical route for highly concurrent centralized control scenarios with strict real-time demands, such as the electric power industry and manufacturing.
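
The relative figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the proposed system: the helper names `relative_improvement` and `remaining_fraction` are hypothetical, and the only inputs are the percentages reported in the abstract.

```python
def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage of `before`."""
    return (after - before) / before * 100.0


def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the baseline that remains after a percentage reduction."""
    return 1.0 - reduction_pct / 100.0


# GPU utilization reported in the abstract: 51.7% (basic GPU) -> 72.3% (proposed).
gpu_util_gain = relative_improvement(51.7, 72.3)

# A 62% reduction in end-to-end latency leaves this fraction of the CPU baseline.
latency_ratio = remaining_fraction(62.0)

print(f"GPU utilization gain: {gpu_util_gain:.1f}%")        # 39.8%
print(f"Latency relative to baseline: {latency_ratio:.2f}")  # 0.38
```

Read this way, the 39.8% figure is a relative gain over the basic GPU solution (an absolute gain of 20.6 percentage points), and the 62% latency reduction means the proposed path runs at roughly 0.38 of the baseline latency.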