Dynamic Pipeline Scheduling for Hybrid Edge Collaboration in Multi‐Modal LLM Inference

DOI: 10.1049/aie2.70020 ISSN: 3067-249X

He Li

ABSTRACT

Multi‐modal large language models (MLLMs) are revolutionising AI‐enabled technologies, enabling more intelligent and human‐like multimedia data analysis. The application of MLLMs has the potential to accelerate processes while reducing human resource costs. However, the substantial computational overhead of supporting MLLMs poses challenges even during inference. This paper introduces a paradigm that leverages high‐performance, low‐cost edge devices to enhance MLLM processing. The proposed approach employs in‐device and across‐device collaboration to improve overall system utilisation and reduce MLLM inference latency. Additionally, a multilevel pipeline scheduling method addresses bottlenecks in the edge system. Comprehensive experimental results demonstrate that the proposed system significantly accelerates MLLM inference for multi‐modal data processing.
