Author: Yu, Le
Title: Identifying privacy issues in mobile apps via synthesizing static analysis and NLP
Advisors: Luo, Xiapu (COMP)
Degree: Ph.D.
Year: 2021
Subject: Privacy, Right of
Application software
Mobile computing -- Security measures
Hong Kong Polytechnic University -- Dissertations
Department: Department of Computing
Pages: xxvi, 251 pages : color illustrations
Language: English
Abstract: Recent years have witnessed a sharp increase in malicious apps that access or steal users' personal information. To address users' concerns about privacy risks, researchers have proposed a promising detection approach that looks for inconsistencies between an app's permissions and its description. Unfortunately, relying on descriptions and permissions alone leads to many false positives because descriptions often fail to declare all sensitive operations. Moreover, permissions are coarse-grained: they cannot indicate which type of personal information is accessed by the app itself or by a third-party library. In this thesis, we combine static analysis, which discovers the behaviors contained in bytecode, with natural language processing (NLP), which processes software artifacts (i.e., descriptions, privacy policies, and user reviews), so that we can identify privacy issues precisely. We detect the privacy issues of mobile apps in the following steps: (1) We exploit the app's privacy policy and its bytecode to remove false alerts raised when identifying inconsistencies between an app's permissions and its description. In addition, when users report bugs in their reviews, we locate those bugs in the app bytecode to help developers fix them. (2) When app developers provide a privacy policy to notify users which types of personal information the app accesses, we propose a novel approach that automatically identifies five kinds of problems in the privacy policy in order to determine whether the policy is trustworthy. (3) For apps that do not provide privacy policies, we develop a novel system that automatically constructs correct and readable descriptions to facilitate the generation of privacy policies for Android apps. (4) We propose a system to determine whether an app complies with privacy requirements.
For (1), to remove the false alerts of state-of-the-art systems that identify privacy issues in apps (e.g., AutoCog, CHABADA), we develop a system, TAPVerifier, which automatically analyzes the bytecode to discover over-claimed permissions and extracts the behaviors of accessing personal information from the privacy policy. The results show that our system can remove up to 59.4 percent of the false alerts of the state-of-the-art systems. When users describe function errors in their reviews, we propose a system, ReviewSolver, that locates these errors in the app bytecode by exploiting the context information in user reviews and then correlating the reviews and bytecode through their semantic meanings. The results show that ReviewSolver outperforms ChangeAdvisor by correctly mapping more reviews to code. For (2), we develop a system, PPChecker, that employs NLP techniques to analyze privacy policies and adopts program analysis approaches to inspect apps in order to identify five kinds of problems in privacy policies. Applying PPChecker to 2,500 popular apps, we find that 1,850 apps (i.e., 74.0%) have at least one kind of problem. For (3), to generate privacy policies for apps, we further propose a system, AutoPPG, that conducts static code analysis to characterize an app's behaviors related to users' personal information and then applies NLP techniques to generate correct and accessible sentences describing these behaviors. Experimental results indicate that AutoPPG creates correct and easy-to-understand descriptions for privacy policies, and that the privacy policies constructed by AutoPPG usually reveal more operations related to users' personal information than existing privacy policies. For (4), we first summarize existing privacy requirements from both governments and app markets. Then, we propose a system, PrivacyPromoter, to check whether the app bytecode and privacy policy comply with these privacy requirements. The experimental results show that our system can detect violations with high precision and recall.
Rights: All rights reserved
Access: open access

Files in This Item:
File: 5846.pdf
Description: For All Users
Size: 5.59 MB
Format: Adobe PDF

