Full metadata record
DC Field | Value | Language
dc.contributor | Department of Electronic and Information Engineering | en_US
dc.contributor.advisor | Mao, Yuyi (EIE) | en_US
dc.creator | Dong, Rongkang | -
dc.identifier.uri | https://theses.lib.polyu.edu.hk/handle/200/12062 | -
dc.language | English | en_US
dc.publisher | Hong Kong Polytechnic University | en_US
dc.rights | All rights reserved | en_US
dc.title | Sample-aware early-exit prediction for efficient device-edge collaborative inference | en_US
dcterms.abstract | Artificial intelligence for mobile applications has recently attracted wide research attention. Deploying cumbersome deep neural networks (DNNs) on resource-constrained mobile devices introduces significant latency. Device-edge co-inference is a flexible solution in which a DNN is partitioned into two parts: a head network on the device and a tail network on the edge server. Early-exit neural networks provide a dynamic inference method by terminating inference for some samples at the early layers. However, inserting early exits generates additional computational overhead, which burdens resource-limited devices when the early exits are placed in the on-device network. In this dissertation, a new methodology, called early-exit prediction, is proposed to alleviate the computational overhead introduced by the early exits. The system consists of an early-exit network and a low-cost Exit Predictor. The Exit Predictor guides inference to skip the computation of the early exits, so that some "hard" samples can be inferred directly by the backbone network without being processed by any early exit. To verify the effectiveness of early-exit prediction, extensive experiments are conducted using three DNN models (AlexNet, VGG16-BN, and ResNet44) on the CIFAR10 and CIFAR100 datasets. The experimental results show that the early-exit prediction method reduces on-device computation by over 20% without significantly degrading classification accuracy. | en_US
dcterms.abstract | In practical applications, some of the sensed data contain no information of interest (called "task-irrelevant data"); such data do not need to be fed to the early-exit neural network and can be filtered out at an early stage. Moreover, the intermediate features computed by the early layers of commonly used backbone DNNs are larger in data size than a raw image, which burdens the communication network in device-edge co-inference. Therefore, we further adopt irrelevant-data filtering and feature compression to improve the overall performance of the system. Specifically, irrelevant-data filtering, which discards distinctly task-irrelevant data, is performed by a low-cost model. In addition, an autoencoder consisting of a convolutional encoder and decoder reduces the transmitted data size by compressing the features and recovering them before they are processed by the edge server. With early-exit prediction and irrelevant-data filtering combined, the end-to-end inference latency is reduced significantly compared with the original early-exit neural network. | en_US
dcterms.extent | viii, 51 pages : color illustrations | en_US
dcterms.isPartOf | PolyU Electronic Theses | en_US
dcterms.issued | 2022 | en_US
dcterms.educationalLevel | M.Sc. | en_US
dcterms.educationalLevel | All Master | en_US
dcterms.LCSH | Neural networks (Computer science) | en_US
dcterms.LCSH | Artificial intelligence | en_US
dcterms.LCSH | Edge computing | en_US
dcterms.LCSH | Hong Kong Polytechnic University -- Dissertations | en_US
dcterms.accessRights | restricted access | en_US
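
Illustrative note (not part of the thesis record): the control flow described in the abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the author's implementation. It assumes the Exit Predictor makes a single yes/no decision per sample before the head network runs, and that an early exit fires when its top softmax probability reaches a threshold. The abstract specifies neither detail, and every name here (`co_inference`, `exit_predictor`, `head_blocks`, `early_exits`, `tail_network`, `threshold`) is hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def co_inference(
    sample,
    exit_predictor: Callable[[object], bool],
    head_blocks: Sequence[Callable],
    early_exits: Sequence[Callable],
    tail_network: Callable,
    threshold: float = 0.9,
) -> Tuple[int, str]:
    """Return (predicted class index, where the prediction was made)."""
    # Assumption: the predictor guesses once, up front, whether this
    # sample is likely to be resolved by an early exit.
    try_exits = exit_predictor(sample)

    features = sample
    for i, (block, exit_head) in enumerate(zip(head_blocks, early_exits)):
        features = block(features)  # on-device backbone computation
        if not try_exits:
            continue  # "hard" sample: skip the early-exit computation
        probs = softmax(exit_head(features))
        confidence = max(probs)
        if confidence >= threshold:  # assumed confidence-based exit rule
            return probs.index(confidence), f"exit{i}"

    # No early exit fired (or exits were skipped): the tail network,
    # which would run on the edge server, finishes the inference.
    probs = softmax(tail_network(features))
    return probs.index(max(probs)), "edge"
```

In this sketch the saving comes from the `continue` branch: a sample the predictor labels as hard never pays for any early-exit head, and only the backbone blocks run on the device.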

Files in This Item:
File | Description | Size | Format
6519.pdf | For All Users (off-campus access for PolyU Staff & Students only) | 5.31 MB | Adobe PDF


Please use this identifier to cite or link to this item: https://theses.lib.polyu.edu.hk/handle/200/12062