|Title:||Towards laying the foundation of firmware analysis|
|Advisors:||Luo, Xiapu (COMP)|
|Department:||Department of Computing|
|Pages:||xxvi, 149 pages : color illustrations|
|Abstract:||Embedded devices are becoming ubiquitous, and there is a pressing need to perform security assessments of the software (i.e., firmware) running on these devices. Static analysis and dynamic analysis are both widely used for firmware analysis. This thesis aims to lay the foundation of firmware analysis. Specifically, it explores the limitations and implementation errors of both static and dynamic firmware analysis tools, improves the stability and reliability of these tools, and proposes a new technique and analysis framework to increase the capability and scalability of firmware analysis tools.|
Because embedded devices use many different types of peripherals, emulating their firmware at scale, which dynamic analysis requires, is challenging. Static analysis is therefore still widely used. To conduct static analysis, existing works usually rely on off-the-shelf tools to disassemble stripped binaries and (implicitly) assume that reliably disassembling binaries is a solved problem. However, whether this assumption actually holds is unknown. We conduct the first comprehensive study of ARM disassembly tools, as ARM is becoming the dominant architecture among embedded devices. Specifically, we build 1,896 ARM binaries (including 248 obfuscated ones) with different compilers, compiler options, and obfuscation methods. We then evaluate eight state-of-the-art ARM disassembly tools (both commercial and noncommercial, in different versions) on these binaries, measuring their ability to locate instruction boundaries, function boundaries, and function signatures. Instruction and function boundaries are the two fundamental primitives upon which the other primitives build, while function signatures are significant for control flow integrity (CFI) techniques. Our work reveals observations that have not previously been systematically summarized and/or confirmed. For instance, we find that the coexistence of the ARM and Thumb instruction sets, and the reuse of the BL instruction for both function calls and branches, pose serious challenges to disassembly tools. Our evaluation sheds light on the limitations of state-of-the-art disassembly tools and points out potential directions for improvement.
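As an illustration of how instruction-boundary accuracy could be scored, the following minimal sketch compares the instruction start addresses reported by a tool against ground truth using precision and recall. The metric choice, the function name `boundary_scores`, and the example addresses are assumptions made for illustration; the abstract does not state which metrics the thesis uses.

```python
from typing import Iterable, Tuple


def boundary_scores(reported: Iterable[int],
                    ground_truth: Iterable[int]) -> Tuple[float, float]:
    """Return (precision, recall) of reported instruction start addresses.

    Hypothetical helper: both arguments are collections of addresses at
    which an instruction is claimed (or known) to start.
    """
    reported_set = set(reported)
    truth_set = set(ground_truth)
    true_positives = len(reported_set & truth_set)
    precision = true_positives / len(reported_set) if reported_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


# Made-up example: four 2-byte Thumb instructions that a tool wrongly
# decodes as two 4-byte ARM instructions.
truth = [0x1000, 0x1002, 0x1004, 0x1006]
tool_output = [0x1000, 0x1004]
precision, recall = boundary_scores(tool_output, truth)
print(precision, recall)
```

In this made-up case every reported boundary is real, but half of the true boundaries are missed, which is one way confusing the ARM and Thumb instruction sets can show up in such a score.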
Apart from the widely used static analysis, various dynamic analysis frameworks based on a full-system emulator (i.e., QEMU) have been proposed for firmware analysis. Emulators are widely used to build dynamic analysis frameworks because of their fine-grained tracing capability, full-system monitoring functionality, and ability to run on different operating systems and architectures. However, whether an emulator behaves consistently with real devices is unknown. To understand this problem, we aim to automatically locate inconsistent instructions, i.e., instructions that behave differently on emulators and real devices. We target the ARM architecture, which provides a machine-readable specification. Based on this specification, we build a test case generator by designing and implementing the first symbolic execution engine for the ARM Architecture Specification Language (ASL). We generate 2,774,649 representative instruction streams and use them to conduct differential testing between four real ARM devices of different architecture versions (i.e., ARMv5, ARMv6, ARMv7, and ARMv8) and three state-of-the-art emulators (i.e., QEMU, Unicorn, and Angr). We locate a large number of inconsistent instruction streams (171,857 for QEMU, 223,264 for Unicorn, and 120,169 for Angr). We find that undefined implementation in the ARM manual and implementation bugs in QEMU are the major causes of the inconsistencies. Furthermore, we discover 12 bugs that affect commonly used instructions (e.g., BLX). Using the inconsistent instructions, we build three security applications that demonstrate the capability of these instructions for detecting emulators, anti-emulation, and anti-fuzzing.
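The differential testing idea can be sketched as follows: run the same instruction stream on two executors and flag the stream when the resulting states differ. This is a simplified sketch, not the thesis's implementation. The `State` layout, the function `find_inconsistent_streams`, and the two toy executors are hypothetical stand-ins; they do not call any real device or emulator API.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical final CPU state: register name -> value, plus a "signal" entry.
State = Dict[str, int]
Executor = Callable[[bytes], State]


def find_inconsistent_streams(streams: Iterable[bytes],
                              run_on_device: Executor,
                              run_on_emulator: Executor) -> List[bytes]:
    """Return the streams whose final state differs between the executors."""
    inconsistent = []
    for stream in streams:
        if run_on_device(stream) != run_on_emulator(stream):
            inconsistent.append(stream)
    return inconsistent


# Toy stand-ins for illustration only (no real hardware or emulator involved).
def toy_device(stream: bytes) -> State:
    return {"r0": len(stream), "signal": 0}


def toy_emulator(stream: bytes) -> State:
    # Pretend the emulator raises signal 4 on one particular encoding.
    signal = 4 if stream == b"\xff\xff\xff\xff" else 0
    return {"r0": len(stream), "signal": signal}


streams = [b"\x00\x00\xa0\xe1", b"\xff\xff\xff\xff"]
print(find_inconsistent_streams(streams, toy_device, toy_emulator))
```

Comparing whole final states keeps the check independent of any single register; a real harness would also need to decide which parts of the state are comparable across a device and an emulator.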
Although many dynamic firmware analysis frameworks have been proposed, booting the Linux kernel of an embedded device in QEMU (a process we call rehosting the Linux kernel in this thesis) is still an unsolved problem. This is because embedded devices usually use different system-on-chips (SoCs) from multiple vendors, and only a limited number of SoCs are currently supported in QEMU. To increase the scalability of dynamic firmware analysis frameworks, we propose a technique called peripheral transplantation. The main idea is to transplant the device drivers of designated peripherals into the Linux kernel. Doing so replaces the peripherals in the kernel that are currently unsupported in QEMU with supported ones, thus making the Linux kernel rehostable. Various applications can then be built upon the rehosted kernel. We implemented this technique in a prototype system called ECMO and applied it to 815 firmware images, which cover 20 kernel versions, 37 device models, and 24 vendors. The results show that ECMO successfully transplants peripherals for all 815 Linux kernels. Among them, 710 kernels are successfully rehosted, i.e., they launch a user-space shell (an 87.1% success rate). The failures are mainly because the root file system format (ramfs) is not supported by the kernel. We further build three applications, i.e., kernel crash analysis, rootkit forensic analysis, and kernel fuzzing, based on the rehosted kernels to demonstrate the usage scenarios of ECMO.
|Rights:||All rights reserved|