Smartphones enjoy high adoption rates around the globe. Rarely more than an arm's length away, these sensor-rich devices can easily be repurposed to collect rich and extensive records of their users' behaviors (e.g., location, communication, media consumption), posing serious threats to individual privacy. Here we examine the extent to which individuals' Big Five personality dimensions can be predicted on the basis of six different classes of behavioral information collected via sensor and log data harvested from smartphones. Taking a machine-learning approach, we predict personality at the broad domain level (median r = 0.37) and the narrow facet level (median r = 0.40) based on behavioral data collected from 624 volunteers over 30 consecutive days (25,347,089 logging events). Our cross-validated results reveal that specific behavioral patterns in the domains of (1) communication and social behavior, (2) music consumption, (3) app usage, (4) mobility, (5) overall phone activity, and (6) day- and night-time activity are distinctively predictive of the Big Five personality traits. The accuracy of these predictions is similar to that found for predictions based on digital footprints from social media platforms and demonstrates the possibility of obtaining information about individuals' private traits from behavioral patterns passively collected from their smartphones. Overall, our results point to both the benefits (e.g., in research settings) and dangers (e.g., privacy implications, psychological targeting) presented by the widespread collection and modeling of behavioral data obtained from smartphones.
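At its core, the modeling pipeline described here is cross-validated regression from behavioral features to self-reported trait scores, with accuracy summarized as the correlation between observed scores and out-of-sample predictions. A minimal sketch of that evaluation loop, using synthetic placeholder data and ridge regression via scikit-learn; the feature set, model, and fold count are assumptions, not the authors' exact pipeline:

```python
# Hedged sketch: cross-validated prediction of one Big Five trait from
# behavioral features. Data here are random placeholders, so r will be
# near zero; the point is the evaluation procedure, not the result.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_users, n_features = 624, 50                 # 624 volunteers, as in the study
X = rng.normal(size=(n_users, n_features))    # e.g., app-usage, mobility stats
y = rng.normal(size=n_users)                  # self-reported trait score

model = Ridge(alpha=1.0)
y_hat = cross_val_predict(model, X, y, cv=10)  # out-of-sample predictions
r, _ = pearsonr(y, y_hat)
print(f"cross-validated r = {r:.2f}")
```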
                            A Framework for Autonomic Computing for In Situ Imageomics
                        
                    
    
In situ imageomics is a new approach to studying ecological, biological, and evolutionary systems wherein large image and video data sets are captured in the wild and machine learning methods are used to infer biological traits of individual organisms, animal social groups, species, and even whole ecosystems. Monitoring biological traits over large spaces and long periods of time could enable new, data-driven approaches to wildlife conservation, biodiversity, and sustainable ecosystem management. However, to accurately infer biological traits, machine learning methods for images require voluminous, high-quality data. Adaptive, data-driven approaches are hamstrung by the speed at which data can be captured and processed. Camera traps and unmanned aerial vehicles (UAVs) produce voluminous data, but they lose track of individuals over large areas, fail to capture social dynamics, and waste time and storage on images with poor lighting and view angles. In this vision paper, we make the case for a research agenda for in situ imageomics that depends on significant advances in autonomic and self-aware computing. Specifically, we seek autonomous data collection that manages camera angles, aircraft positioning, conflicting actions for multiple traits of interest, energy availability, and cost factors. Given tools to detect objects and identify individuals, we propose a research challenge: which optimization model should the data collection system employ to accurately identify, characterize, and draw inferences from biological traits while respecting a budget? Using zebra and giraffe behavioral data collected over three weeks at the Mpala Research Centre in Laikipia County, Kenya, we quantify the volume and quality of data collected using existing approaches. Our proposed autonomic navigation policy for in situ imageomics collection achieves an F1 score of 82% relative to an expert pilot while providing greater safety and consistency, suggesting great potential for state-of-the-art autonomic approaches if they can be scaled up to fully address the problem.
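As one concrete starting point for the budgeted-collection question posed above, consider treating candidate sensing actions as knapsack items, each with an expected utility for trait inference and a cost, and planning greedily by utility-to-cost ratio. The sketch below is a hedged illustration of that framing; the Action fields, utilities, and costs are illustrative assumptions, not the paper's model:

```python
# Hedged sketch of a budgeted data-collection planner: greedily select
# imaging actions (e.g., UAV repositioning, camera re-aiming) by expected
# information gain per unit cost, subject to an energy/cost budget.
from dataclasses import dataclass

@dataclass
class Action:
    name: str       # e.g., "reposition UAV over herd"
    utility: float  # expected gain in trait-inference quality (assumed known)
    cost: float     # energy / flight-time / storage cost

def plan(actions: list[Action], budget: float) -> list[Action]:
    """Greedy knapsack: take actions with the best utility-to-cost ratio."""
    chosen, spent = [], 0.0
    for a in sorted(actions, key=lambda a: a.utility / a.cost, reverse=True):
        if spent + a.cost <= budget:
            chosen.append(a)
            spent += a.cost
    return chosen

candidates = [
    Action("track zebra individual Z12", utility=0.9, cost=3.0),
    Action("capture giraffe social group", utility=1.4, cost=5.0),
    Action("re-angle camera for lighting", utility=0.5, cost=1.0),
]
print([a.name for a in plan(candidates, budget=6.0)])
```

A real system would need to estimate utilities online and handle conflicting actions for multiple traits of interest, which is precisely where the open research challenge lies.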
- Award ID(s): 2118240
- PAR ID: 10530238
- Publisher / Repository: IEEE
- Date Published:
- ISBN: 979-8-3503-3744-0
- Page Range / eLocation ID: 11 to 16
- Subject(s) / Keyword(s): Social groups; Ecosystems; Wildlife; Machine learning; Data collection; Cameras; Object recognition
- Format(s): Medium: X
- Location: Toronto, ON, Canada
- Sponsoring Org: National Science Foundation
More Like this
The availability of large datasets of organism images, combined with advances in artificial intelligence (AI), has significantly enhanced the study of organisms through images, unveiling biodiversity patterns and macro-evolutionary trends. However, existing machine learning (ML)-ready organism datasets have several limitations. First, these datasets often focus on species classification only, overlooking tasks involving visual traits of organisms. Second, they lack detailed visual trait annotations, like pixel-level segmentation, that are crucial for in-depth biological studies. Third, these datasets predominantly feature organisms in their natural habitats, posing challenges for aquatic species like fish, where underwater images often suffer from poor visual clarity, obscuring critical biological traits. This gap hampers the study of aquatic biodiversity patterns, which is necessary for assessing climate change impacts, and evolutionary research on aquatic species morphology. To address this, we introduce the Fish-Visual Trait Analysis (Fish-Vista) dataset: a large, annotated collection of about 80K fish images spanning 3,000 different species, supporting several challenging and biologically relevant tasks including species classification, trait identification, and trait segmentation. These images have been curated through a sophisticated data processing pipeline applied to a cumulative set of images obtained from various museum collections. Fish-Vista ensures that visual traits of images are clearly visible and provides fine-grained labels of the various visual traits present in each image. It also offers pixel-level annotations of 9 different traits for about 7,000 fish images, facilitating additional trait segmentation and localization tasks. The ultimate goal of Fish-Vista is to provide a clean, carefully curated, high-resolution dataset that can serve as a foundation for accelerating biological discoveries using advances in AI. Finally, we provide a comprehensive analysis of state-of-the-art deep learning techniques on Fish-Vista.
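A multi-task dataset of this shape pairs each image with a species label and per-trait annotations. The sketch below shows one hypothetical way such records might be consumed in PyTorch; the CSV layout, column names, and trait list are stand-ins, not Fish-Vista's actual distribution format:

```python
# Hedged sketch: loading a multi-task fish-trait dataset. Each record
# carries an image path, a species id (classification target), and
# binary trait-presence flags (trait-identification targets). Column
# names below are hypothetical assumptions.
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset

TRAIT_COLUMNS = ["eye", "dorsal_fin", "pelvic_fin"]  # assumed trait flags

class FishTraitDataset(Dataset):
    def __init__(self, csv_path: str, transform=None):
        self.table = pd.read_csv(csv_path)  # columns: image_path, species_id, traits...
        self.transform = transform

    def __len__(self) -> int:
        return len(self.table)

    def __getitem__(self, idx: int):
        row = self.table.iloc[idx]
        image = Image.open(row["image_path"]).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        species = int(row["species_id"])                    # classification target
        traits = torch.tensor([float(row[c]) for c in TRAIT_COLUMNS])
        return image, species, traits
```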
Challenge: Most plant imaging systems focus predominantly on monitoring morphological traits. The challenge is to relate color information to measurements of physiological processes. Question: Can the color of individual leaves be measured and quantified over time to infer physiological information about the plant? Solution: We developed SMART (Speedy Measurement of Arabidopsis Rosette Traits), an open-source, affordable plant-phenotyping software pipeline for Arabidopsis thaliana that integrates a new color analysis algorithm to measure leaf surface temperature, leaf wilting, and zinc toxicity over time. Data Collection: We used public datasets to develop the algorithm [1] and validate morphological measurements. We also collected top-view images of the Arabidopsis rosette with the Open-Leaf…
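To make the color-analysis idea concrete, the sketch below shows the kind of per-image color summary a pipeline like this might compute on a top-view rosette image: mask green tissue by hue, then extract statistics that can be tracked over time. The HSV thresholds and chosen statistics are illustrative assumptions, not SMART's published algorithm:

```python
# Hedged sketch: quantify rosette color from a top-view image so a simple
# per-plant color trajectory can be tracked across a time series.
import cv2
import numpy as np

def rosette_color_stats(image_path: str) -> dict:
    bgr = cv2.imread(image_path)
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    # Crude green-tissue mask; OpenCV hue runs 0-179, so ~35-85 is green.
    mask = cv2.inRange(hsv, (35, 40, 40), (85, 255, 255))
    hue = hsv[..., 0][mask > 0].astype(np.float64)
    return {
        "rosette_area_px": int((mask > 0).sum()),  # simple morphological trait
        "mean_hue": float(hue.mean()) if hue.size else float("nan"),
        "hue_std": float(hue.std()) if hue.size else float("nan"),
    }
```

Applied to daily images of the same plant, shifts in mean hue or rosette area would serve as simple proxies for wilting or stress responses.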
We introduce DeepIR, a new thermal image processing framework that combines physically accurate sensor modeling with deep network-based image representation. Our key enabling observation is that the images captured by thermal sensors can be factored into slowly changing, scene-independent sensor non-uniformities (which can be accurately modeled using physics) and a scene-specific radiance flux (which is well represented using a deep network-based regularizer). DeepIR requires neither training data nor periodic ground-truth calibration with a known black-body target, making it well suited for practical computer vision tasks. We demonstrate the power of DeepIR by developing new denoising and super-resolution algorithms that exploit multiple images of the scene captured with camera jitter. Experiments on simulated and real data demonstrate that DeepIR can perform high-quality non-uniformity correction with as few as three images, achieving a 10 dB PSNR improvement over competing approaches.
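The factorization described above can be written as y_i ≈ g ⊙ shift_i(x) + o: per-pixel gain g and offset o capture the slowly varying sensor non-uniformities, while the shared scene radiance x is constrained by a deep network prior and tied across frames by the known camera jitter. A minimal PyTorch sketch of fitting that forward model follows; the tiny network, integer-shift warp, and optimization details are illustrative assumptions, not the authors' implementation:

```python
# Hedged sketch: jointly fit per-pixel gain/offset non-uniformities and a
# network-regularized scene estimate to several jittered thermal frames.
import torch
import torch.nn as nn

H = W = 64
frames = [torch.rand(1, 1, H, W) for _ in range(3)]  # three jittered captures
shifts = [(0, 0), (2, 0), (0, 2)]                    # known/estimated jitter (px)

gain = nn.Parameter(torch.ones(1, 1, H, W))    # scene-independent non-uniformity
offset = nn.Parameter(torch.zeros(1, 1, H, W))
prior = nn.Sequential(                          # tiny stand-in for a deep image prior
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
latent = torch.randn(1, 1, H, W)                # fixed network input

opt = torch.optim.Adam([gain, offset, *prior.parameters()], lr=1e-3)
for _ in range(200):
    x = prior(latent)                           # scene radiance estimate
    # Each frame is the shared scene, shifted by its jitter, then distorted
    # by the same gain/offset pattern; minimize the total reconstruction error.
    loss = sum(
        ((gain * torch.roll(x, s, dims=(-2, -1)) + offset - y) ** 2).mean()
        for s, y in zip(shifts, frames)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The jitter is what makes gain/offset separable from the scene: the non-uniformities stay fixed to the sensor grid while the scene content moves between frames.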