[Journal article][Full-length article]


PMP: A partition-match parallel mechanism for DNN inference acceleration in cloud–edge collaborative environments

Authors:
Zhuofan Liao, Xiangyu Zhang, Shiming He, Qiang Tang

Year: 2023

Pages: 103720
Publisher: Elsevier BV


Abstract:

To address delay-sensitive deep learning tasks, Deep Neural Network (DNN) models are often partitioned and deployed across cloud–edge environments for parallel, collaborative inference. However, existing parallel coordination mechanisms are ill-suited to the cloud–edge environment: the strong inter-layer dependence of DNNs increases transmission latency and inference wait times, which undermines the low-latency advantage of edge computing. To resolve this contradiction, the proposed Partition-Match Parallel (PMP) mechanism accounts for the inter-layer transfer dependence of partitioning solutions and employs a multi-objective equalization algorithm to derive DNN model partitioning strategies suited to multi-way parallel computing. Moreover, the mechanism builds a DNN inference time prediction model on these partitions and uses an iterative matching algorithm to approximate an optimal DNN inference workflow. Extensive evaluations across various DNN models demonstrate its superiority over existing schemes, namely local execution, CoEdge, and EdgeFlow, with total inference latency reduced by 80.9%, 37.9%, and 9.1%, respectively.
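The abstract does not specify PMP's multi-objective equalization or iterative matching algorithms, so the sketch below only illustrates the general idea such mechanisms build on: cutting a layer chain into contiguous partitions, predicting latency from per-layer compute plus boundary-activation transfer, and searching for the cut that balances the stages. All layer costs, activation sizes, the two-device setup, and the bottleneck-latency model are illustrative assumptions, not the paper's method.

```python
from itertools import combinations

def stage_bounds(cuts, n):
    """Convert cut indices into (start, end) layer ranges per stage.
    A cut at index c means the chain is split after layer c."""
    starts = [0] + [c + 1 for c in cuts]
    ends = list(cuts) + [n - 1]
    return list(zip(starts, ends))

def bottleneck_latency(compute_ms, boundary_kb, cuts, bw_kb_per_ms):
    """Predicted latency of the slowest stage: its compute time plus the
    transfer time of the activation it receives from the previous stage."""
    lat = 0.0
    for i, (s, e) in enumerate(stage_bounds(cuts, len(compute_ms))):
        stage = sum(compute_ms[s:e + 1])
        if i > 0:  # inbound activation crosses the cloud-edge link
            stage += boundary_kb[s - 1] / bw_kb_per_ms
        lat = max(lat, stage)
    return lat

def best_partition(compute_ms, boundary_kb, num_devices, bw_kb_per_ms):
    """Exhaustively try every contiguous split and keep the best one."""
    n = len(compute_ms)
    best = (float("inf"), ())
    for cuts in combinations(range(n - 1), num_devices - 1):
        lat = bottleneck_latency(compute_ms, boundary_kb, cuts, bw_kb_per_ms)
        best = min(best, (lat, cuts))
    return best

# Toy numbers: a 6-layer chain split across 2 devices over a 100 KB/ms link.
compute = [4.0, 8.0, 6.0, 3.0, 5.0, 2.0]   # per-layer compute time (ms)
acts = [200, 150, 80, 60, 40, 0]           # activation size after each layer (KB)
lat, cuts = best_partition(compute, acts, 2, 100.0)
print(lat, cuts)  # cutting after layer 1 balances the two stages
```

Note how the search trades compute balance against transfer cost: the cheapest cut is not the one that splits compute exactly in half, but the one whose boundary activation is small enough that the transfer does not dominate the bottleneck stage.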



Keywords:

None


Journal:
Journal of Network and Computer Applications
ISSN: 1084-8045
Source: Elsevier BV