skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Not Your Father's Big Data (Lighting Talk)
Embedded database libraries provide developers with a com- mon and convenient data persistence layer. They have spread to many systems, including interactive devices like smart- phones, appearing in all major mobile systems. Their perfor- mance affects the response times and resource consumption of millions of phone apps and billions of phone users. It is thus critical that we better understand how they work, so they can be used more efficiently, and so developers can make faster libraries. Mobile databases differ significantly from server-class storage in terms of platform, usage, and measurement. Phones are multi-tenant, end-user devices that the database must share with other apps. Contrary to traditional database design goals, workloads on phones are single-app, bursty, and rarely saturate the CPU. We argue that mobile storage design should refocus on what matters on the mobile platform: latency and energy. As accurate per- formance measurement tools are necessary to evaluation of good database design, this uncovers another issue: Tradi- tional database benchmarking methods produce misleading results when applied to mobile devices, due to evaluating performance at saturation. Development of databases and measurements specifically designed for the mobile platform is necessary to optimize user experience of the most common database usage in the world.  more » « less
Award ID(s):
1617586
PAR ID:
10175110
Author(s) / Creator(s):
Date Published:
Journal Name:
Conference on Innovative Data Systems Research (CIDR)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    This work presents the first-ever detailed and large-scale measurement analysis of storage consumption behavior of applications (apps) on smart mobile devices. We start by carrying out a five-year longitudinal static analysis of millions of Android apps to study the increase in their sizes over time and identify various sources of app storage consumption. Our study reveals that mobile apps have evolved as large monolithic packages that are packed with features to monetize/engage users and optimized for performance at the cost of redundant storage consumption. We also carry out a mobile storage usage study with 140 Android participants. We built and deployed a lightweight context-aware storage tracing tool, called cosmos, on each participant's device. Leveraging the traces from our user study, we show that only a small fraction of apps/features are actively used and usage is correlated to user context. Our findings suggest a high degree of app feature bloat and unused functionality, which leads to inefficient use of storage. Furthermore, we found that apps are not constrained by storage quota limits, and developers freely abuse persistent storage by frequently caching data, creating debug logs, user analytics, and downloading advertisements as needed. Finally, drawing upon our findings, we discuss the need for efficient mobile storage management, and propose an elastic storage design to reclaim storage space when unused. We further identify research challenges and quantify expected storage savings from such a design. We believe our findings will be valuable to the storage research community as well as mobile app developers. 
    more » « less
  2. Use of mobile phones today has become pervasive throughout society. A common use of a phone involves calling another person using VoIP apps. However the OSes on mobile devices are prone to compromise creating a risk for users who want to have private conversations when calling someone. Mobile devices today provide a hardware-protected mode called trusted execution environment (TEE) to protect users from a compromised OS. In this paper we propose a design to allow a user to make a secure end-to-end protected VoIP call from a compromised mobile phone. We implemented our design, TruzCall using Android OS and TrustZone TEE running OP-TEE OS. We built a prototype using the TrustZone-enabled Hikey development board and tested our design using the open source VoIP app Linphone. Our testing utilizes a simulation based environment that allows a Hikey board to use a real phone for audio hardware. 
    more » « less
  3. Consumer mobile spyware apps covertly monitor a user's activities (i.e., text messages, phone calls, e-mail, location, etc.) and transmit that information over the Internet to support remote surveillance. Unlike conceptually similar apps used for state espionage, so-called stalkerware apps are mass-marketed to consumers on a retail basis and expose a far broader range of victims to invasive monitoring. Today the market for such apps is large enough to support dozens of competitors, with individual vendors reportedly monitoring hundreds of thousands of phones. However, while the research community is well aware of the existence of such apps, our understanding of the mechanisms they use to operate remains ad hoc. In this work, we perform an in-depth technical analysis of 14 distinct leading mobile spyware apps targeting Android phones. We document the range of mechanisms used to monitor user activity of various kinds (e.g., photos, text messages, live microphone access) — primarily through the creative abuse of Android APIs. We also discover previously undocumented methods these apps use to hide from detection and to achieve persistence. Additionally, we document the measures taken by each app to protect the privacy of the sensitive data they collect, identifying a range of failings on the part of spyware vendors (including privacy-sensitive data sent in the clear or stored in the cloud with little or no protection). 
    more » « less
  4. There has been a proliferation of mobile apps in the Medical, as well as Health&Fitness categories. These apps have a wide audience, from medical providers, to patients, to end users who want to track their fitness goals. The low barrier to entry on mobile app stores raises questions about the diligence and competence of the developers who publish these apps, especially regarding the practices they use for user data collection, processing, and storage. To help understand the nature of data that is collected, and how it is processed, as well as where it is sent, we developed a tool named PIT (Personal Information Tracker) and made it available as open source. We used PIT to perform a multi-faceted study on 2832 Android apps: 2211 Medical apps and 621 Health&Fitness apps. We first define Personal Information (PI) as 17 different groups of sensitive information, e.g., user’s identity, address and financial information, medical history or anthropometric data. PIT first extracts the elements in the app’s User Interface (UI) where this information is collected. The collected information could be processed by the app’s own code or third-party code; our approach disambiguates between the two. Next, PIT tracks, via static analysis, where the information is “leaked”, i.e., it escapes the scope of the app, either locally on the phone or remotely via the network. Then, we conduct a link analysis that examines the URLs an app connects with, to understand the origin and destination of data that apps collect and process. We found that most apps leak 1–5 PI items (email, credit card, phone number, address, name, being the most frequent). Leak destinations include the network (25%), local databases (37%), logs (23%), and files or I/O (15%). While Medical apps have more leaks overall, as they collect data on medical history, surprisingly, Health&Fitness apps also collect, and leak, medical data. We also found that leaks that are due to third-party code (e.g., code for ads, analytics, or user engagement) are much more numerous (2x–12x) than leaks due to app’s own code. Finally, our link analysis shows that most apps access 20–80 URLs (typically third-party URLs and Cloud APIs) though some apps could access more than 1,000 URLs. 
    more » « less
  5. Integration of third-party SDKs are essential in the development of mobile apps. However, the rise of in-app privacy threat against mobile SDKs — called cross-library data harvesting (XLDH), targets social media/platform SDKs (called social SDKs) that handles rich user data. Given the widespread integration of social SDKs in mobile apps, XLDH presents a significant privacy risk, as well as raising pressing concerns regarding legal compliance for app developers, social media/platform stakeholders, and policymakers. The emerging XLDH threat, coupled with the increasing demand for privacy and compliance in line with societal expectations, introduces unique challenges that cannot be addressed by existing protection methods against privacy threats or malicious code on mobile platforms. In response to the XLDH threats, in our study, we generalize and define the concept of privacypreserving social SDKs and their in-app usage, characterize fundamental challenges for combating the XLDH threat and ensuring privacy in design and utilization of social SDKs. We introduce a practical, clean-slate design and end-to-end systems, called PESP, to facilitate privacy-preserving social SDKs. Our thorough evaluation demonstrates its satisfactory effectiveness, performance overhead and practicability for widespread adoption. 
    more » « less