Sample-aware early-exit prediction for efficient device-edge collaborative inference

Dong, Rongkang

Full metadata record

DC Field	Value	Language
dc.contributor	Department of Electronic and Information Engineering	en_US
dc.contributor.advisor	Mao, Yuyi (EIE)	en_US
dc.creator	Dong, Rongkang	-
dc.identifier.uri	https://theses.lib.polyu.edu.hk/handle/200/12062	-
dc.language	English	en_US
dc.publisher	Hong Kong Polytechnic University	en_US
dc.rights	All rights reserved	en_US
dc.title	Sample-aware early-exit prediction for efficient device-edge collaborative inference	en_US
dcterms.abstract	Artificial intelligence for mobile applications has attracted wide research attention. Deploying cumbersome deep neural networks (DNNs) on resource-constrained mobile devices, however, introduces significant latency. Device-edge co-inference is a flexible solution in which a DNN is partitioned into two parts: a head network on the device and a tail network on the edge server. Early-exit neural networks provide a dynamic inference method that terminates inference for some samples at early layers. However, inserting early exits incurs additional computational overhead, which burdens resource-limited devices when the exits are placed in the on-device network. In this dissertation, a new methodology called early-exit prediction is proposed to alleviate the computational overhead introduced by the early exits. The system consists of an early-exit network and a low-cost Exit Predictor. The Exit Predictor guides inference to skip the computation of the early exits, so that some "hard" samples can be inferred directly by the backbone network without being processed by any early exit. To verify the effectiveness of early-exit prediction, extensive experiments are conducted using three DNN models (AlexNet, VGG16-BN, and ResNet44) on the CIFAR10 and CIFAR100 datasets. The experimental results show that the early-exit prediction method reduces on-device computation by over 20% without substantially degrading classification accuracy.	en_US
dcterms.abstract	In practical applications, some of the sensed data contain no information of interest (called "task-irrelevant data"); such data need not be fed to the early-exit neural network and can be filtered out at an early stage. Moreover, the intermediate features computed by the early layers of commonly used backbone DNNs are larger in data size than a raw image, which burdens the communication network in device-edge co-inference. Therefore, we further adopt irrelevant-data filtering and feature compression to improve the overall performance of the system. Specifically, irrelevant-data filtering, which discards distinctly task-irrelevant data, is performed by a low-cost model. In addition, an autoencoder consisting of a convolutional encoder and decoder reduces the transmitted data size by compressing the features before transmission and recovering them before they are processed by the edge server. Under the combined effect of early-exit prediction and irrelevant-data filtering, the end-to-end inference latency can be reduced significantly compared with the original early-exit neural network.	en_US
dcterms.extent	viii, 51 pages : color illustrations	en_US
dcterms.isPartOf	PolyU Electronic Theses	en_US
dcterms.issued	2022	en_US
dcterms.educationalLevel	M.Sc.	en_US
dcterms.educationalLevel	All Master	en_US
dcterms.LCSH	Neural networks (Computer science)	en_US
dcterms.LCSH	Artificial intelligence	en_US
dcterms.LCSH	Edge computing	en_US
dcterms.LCSH	Hong Kong Polytechnic University -- Dissertations	en_US
dcterms.accessRights	restricted access	en_US
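
Illustrative note: the inference flow described in the abstract can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the dissertation's implementation. The function name `co_inference`, the confidence-threshold exit rule, the `threshold` value, and all toy stand-in models are hypothetical; the abstract only states that a low-cost Exit Predictor lets "hard" samples bypass the early exits and go straight through the backbone.

```python
from typing import Callable, List, Sequence, Tuple

Features = List[float]


def co_inference(
    sample: Features,
    exit_predictor: Callable[[Features], bool],
    head_blocks: Sequence[Callable[[Features], Features]],
    early_exits: Sequence[Callable[[Features], Tuple[str, float]]],
    tail: Callable[[Features], str],
    threshold: float = 0.9,  # assumed exit rule: confidence >= threshold
) -> Tuple[str, str]:
    """Return (label, location), where location is "device" or "edge"."""
    # The predictor runs once, before the head network. False means the
    # sample is predicted to be "hard", so every early exit is skipped.
    use_exits = exit_predictor(sample)
    features = sample
    for block, exit_head in zip(head_blocks, early_exits):
        features = block(features)  # on-device backbone block
        if use_exits:
            label, confidence = exit_head(features)
            if confidence >= threshold:
                return label, "device"  # inference terminates early
    # No exit fired (or exits were skipped): the tail network on the
    # edge server finishes the inference.
    return tail(features), "edge"


# Toy stand-ins (hypothetical) to exercise the control flow.
exit_calls = [0]  # counts how often an early exit is actually evaluated


def toy_block(x: Features) -> Features:
    return [v * 2 for v in x]


def toy_exit(x: Features) -> Tuple[str, float]:
    exit_calls[0] += 1
    return ("cat", 0.95) if x[0] > 1.0 else ("cat", 0.5)


def toy_tail(x: Features) -> str:
    return "dog"


def toy_predictor(x: Features) -> bool:
    return x[0] > 0.4


easy = co_inference([1.0], toy_predictor, [toy_block] * 2, [toy_exit] * 2, toy_tail)
hard = co_inference([0.1], toy_predictor, [toy_block] * 2, [toy_exit] * 2, toy_tail)
```

In this toy run the "easy" sample leaves at the first on-device exit, while the "hard" sample reaches the edge tail without evaluating any early exit, which is where the on-device saving described in the abstract would come from.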

Files in This Item:

File	Description	Size	Format
6519.pdf	For All Users (off-campus access for PolyU Staff & Students only)	5.31 MB	Adobe PDF

Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12062