Despite over a decade of research, it is still challenging for mobile UI testing tools to achieve satisfactory effectiveness, especially on industrial apps with rich features and large code bases. Our experiences suggest that existing mobile UI testing tools are prone to exploration tarpits, where the tools get stuck with a small fraction of app functionalities for an extensive amount of time. For example, a tool logs out an app at early stages without being able to log back in, and since then the tool gets stuck with exploring the app's pre-login functionalities (i.e., exploration tarpits) instead of its main functionalities. While tool vendors/users can manually hardcode rules for the tools to avoid specific exploration tarpits, these rules can hardly generalize, being fragile in face of diverted testing environments and fast app iterations. To identify and resolve exploration tarpits, we propose VET, a general approach including a supporting system for the given specific Android UI testing tool on the given specific app under test (AUT). VET runs the tool on the AUT for some time and records UI traces, based on which VET identifies exploration tarpits by recognizing their patterns in the UI traces. VET then pinpoints the actions (e.g.,more »
An infrastructure approach to improving effectiveness of Android UI testing tools
Due to the importance of Android app quality assurance, many Android UI testing tools have been developed by researchers over the years. However, recent studies show that these tools typically achieve low code coverage on popular industrial apps. In fact, given a reasonable amount of run time, most state-of-the-art tools cannot even outperform a simple tool, Monkey, on popular industrial apps with large codebases and sophisticated functionalities. Our motivating study finds that these tools perform two types of operations, UI Hierarchy Capturing (capturing information about the contents on the screen) and UI Event Execution (executing UI events, such as clicks), often inefficiently using UIAutomator, a component of the Android framework. In total, these two types of operations use on average 70% of the given test time. Based on this finding, to improve the effectiveness of Android testing tools, we propose TOLLER, a tool consisting of infrastructure enhancements to the Android operating system. TOLLER injects itself into the same virtual machine as the app under test, giving TOLLER direct access to the app’s runtime memory. TOLLER is thus able to directly (1) access UI data structures, and thus capture contents on the screen without the overhead of invoking the Android framework services more »
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- 30th ACM SIGSOFT International Symposium on Software Testing and Analysis
- Page Range or eLocation-ID:
- 165 to 176
- Sponsoring Org:
- National Science Foundation
More Like this
Writing and maintaining UI tests for mobile apps is a time-consuming and tedious task. While decades of research have produced auto- mated approaches for UI test generation, these approaches typically focus on testing for crashes or maximizing code coverage. By contrast, recent research has shown that developers prefer usage-based tests, which center around specific uses of app features, to help support activities such as regression testing. Very few existing techniques support the generation of such tests, as doing so requires automating the difficult task of understanding the semantics of UI screens and user inputs. In this paper, we introduce Avgust, which automates key steps of generating usage-based tests. Avgust uses neural models for image understanding to process video recordings of app uses to synthesize an app-agnostic state-machine encoding of those uses. Then, Avgust uses this encoding to synthesize test cases for a new target app. We evaluate Avgust on 374 videos of common uses of 18 popular apps and show that 69% of the tests Avgust generates successfully execute the desired usage, and that Avgust’s classifiers outperform the state of the art.
Why Johnny Can't Make Money With His Contents: Pitfalls of Designing and Implementing Content Delivery AppsMobile devices are becoming the default platform for multimedia content consumption. Such a thriving business ecosystem has drawn interests from content distributors to develop apps that can reach a large number of audience. The business-edge of content delivery apps crucially relies on being able to effectively arbitrate the purchase and delivery of contents, and govern the access of contents with respect to usage control policies, on a plethora of consumer devices. Content protection on mobile platforms, especially in the absence of Trusted Execution Environment (TEE), is a challenging endeavor where developers often have to resort to ad-hoc deterrence-based defenses. This work evaluates the effectiveness of content protection mechanisms embraced by vendors of content delivery apps, with respect to a hierarchy of adversaries with varying real-world capabilities. Our analysis of 141 vulnerable apps uncovered that, in many cases, due to developers’ unjustified trust assumptions about the underlying technologies, adversaries can obtain unauthorized and unrestricted access to contents of apps, sometimes without even needing to reverse engineer the deterrence-based defenses. Some weaknesses in the apps can also severely impact app users’ security and privacy. All our "findings have been responsibly disclosed to the corresponding app vendors.
Mobile devices are becoming a more common part of the education experience. Students can access their devices at any time to perform assignments or review material. Mobile apps can have the added advantage of being able to automatically grade student work and provide instantaneous feedback. However, numerous challenges remain in implementing effective mobile educational apps. One challenge is the small screen size of smartphones, which was a concern for a spatial visualization training app where students sketch isometric and orthographic drawings. This app was originally developed for iPads, but the wide prevalence of smartphones led to porting the software to iPhone and Android phones. The sketching assignments on a smartphone screen required more frequent zooming and panning, and one of the hypotheses of this study was that the educational effectiveness on smartphones was the same as on the larger screen sizes using iPad tablets. The spatial visualization mobile sketching app was implemented in a college freshman engineering graphics course to teach students how to sketch orthographic and isometric assignments. The app provides automatic grading and hint feedback to help students when they are stuck. Students in this pilot were assigned sketching problems as homework using their personal devices. Students weremore »
Cryptographic (crypto) algorithms are the essential ingredients of all secure systems: crypto hash functions and encryption algorithms, for example, can guarantee properties such as integrity and confidentiality. Developers, however, can misuse the application programming interfaces (API) of such algorithms by using constant keys and weak passwords. This paper presents CRYLOGGER, the first open-source tool to detect crypto misuses dynamically. CRYLOGGER logs the parameters that are passed to the crypto APIs during the execution and checks their legitimacy offline by using a list of crypto rules. We compare CRYLOGGER with CryptoGuard, one of the most effective static tools to detect crypto misuses. We show that our tool complements the results of CryptoGuard, making the case for combining static and dynamic approaches. We analyze 1780 popular Android apps downloaded from the Google Play Store to show that CRYLOGGER can detect crypto misuses on thousands of apps dynamically and automatically. We reverse-engineer 28 Android apps and confirm the issues flagged by CRYLOGGER. We also disclose the most critical vulnerabilities to app developers and collect their feedback.