The rate at which humanity is producing data has increased significantly over the last decade. As organizations generate unprecedented amounts of data, storing, cleaning, integrating, and analyzing this data consumes significant (human and computational) resources. At the same time, organizations extract significant value from their data. In this work, we present our vision for developing an objective metric for the value of data based on the recently introduced concept of data relevance, outline proposals for how to efficiently compute and maintain such metrics, and describe how to utilize data value to improve data management, including storage organization, query performance, intelligent allocation of data collection and curation efforts, improving data catalogs, and making pricing decisions in data markets. While we mostly focus on tabular data, the concepts we introduce can also be applied to other data models such as semi-structured data (e.g., JSON) or property graphs. Furthermore, we discuss strategies for dealing with data and workloads that evolve, and how to deal with data that is currently not relevant but has potential value (we refer to this as dark data). Finally, we sketch ideas for measuring the value that a query or workload has for an organization and reason about the interaction between query and data value.
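To make the vision concrete, here is a minimal, hedged sketch of what a relevance-based data value metric could look like. The abstract only outlines the idea; the specific definition below (a tuple's relevance as the fraction of workload queries whose answers it contributes to, aggregated with query weights) is a hypothetical illustration, not the paper's metric.

```python
# Hypothetical illustration of a relevance-based data value metric.
# Assumes query provenance is available: for each query, the set of
# tuple ids that contributed to its answer.

def tuple_relevance(tuple_id, query_provenance):
    """Fraction of workload queries whose answer this tuple contributes to."""
    hits = sum(1 for tuples in query_provenance.values() if tuple_id in tuples)
    return hits / len(query_provenance)

def table_value(tuple_ids, query_provenance, query_weights):
    """Aggregate value of a set of tuples, weighted by query importance."""
    owned = set(tuple_ids)
    return sum(
        query_weights[q] * len(tuples & owned) / max(len(tuples), 1)
        for q, tuples in query_provenance.items()
    )

# Toy workload: three queries and the tuples feeding each answer.
provenance = {"q1": {1, 2}, "q2": {2, 3}, "q3": {4}}
weights = {"q1": 1.0, "q2": 0.5, "q3": 2.0}

print(tuple_relevance(2, provenance))           # in q1 and q2 -> 2/3
print(table_value([1, 2, 3], provenance, weights))
```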
                            A Balanced Scorecard for Maximizing Data Performance
                        
                    
    
A good performance monitoring system is crucial to knowing whether an organization's efforts are making its data capabilities better, the same, or worse. However, comprehensive performance measurement is costly: organizations need to expend time, resources, and personnel to design the metrics, to gather evidence for the metrics, to assess the metrics' value, and to determine whether any actions should be taken as a result of those metrics. Consequently, organizations need to be strategic in selecting their portfolio of performance indicators for evaluating how well their data initiatives are producing value for the organization. This paper proposes a balanced scorecard approach to aid organizations in designing a set of meaningful and coordinated metrics for maximizing the potential of their data assets. The paper also discusses implementation challenges and the need for further research in this area.
- Award ID(s): 1946391
- PAR ID: 10496979
- Publisher / Repository: Frontiers
- Date Published:
- Journal Name: Frontiers in Big Data
- Volume: 5
- ISSN: 2624-909X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- There are two strategic and longstanding questions about cyber risk that organizations largely have been unable to answer: What is an organization's estimated risk exposure, and how does its security compare with peers? Answering both requires industry-wide data on security posture, incidents, and losses that, until recently, have been too sensitive for organizations to share. Now, privacy-enhancing technologies (PETs) such as cryptographic computing can enable the secure computation of aggregate cyber risk metrics from a peer group of organizations while leaving sensitive input data undisclosed. As these new aggregate data become available, analysts need ways to integrate them into cyber risk models that can produce more reliable risk assessments and allow comparison to a peer group. This paper proposes a new framework for benchmarking cyber posture against peers and estimating cyber risk within specific economic sectors using the new variables emerging from secure computations. We introduce a new top-line variable called the Defense Gap Index, representing the weighted security gap between an organization and its peers, that can be used to forecast an organization's own security risk based on historical industry data. We apply this approach in a specific sector using data collected from 25 large firms, in partnership with an industry ISAO, to build an industry risk model and provide tools back to participants to estimate their own risk exposure and privately compare their security posture with their peers (one illustrative reading of such a weighted-gap score is sketched after this list).
- The future of work will be measured. The increasing and widespread adoption of analytics, the use of digital inputs and outputs to inform organizational decision making, makes the communication of data central to organizing. This article applies and extends signaling theory to provide a framework for the study of analytics as communication. We report three cases that offer examples of dubious, selective, and ambiguous signaling in the activities of workers seeking to shape the meaning of data within the practice of analytics. The analysis casts the future of work as a game of strategic moves between organizations, seeking to measure behaviors and quantify the performance of work, and workers, altering their behavioral signaling to meet situated goals. The framework developed offers a guide for future examinations of the asymmetric relationship between management and workers as organizations adopt metrics to monitor and evaluate work.
- Fair and Efficient Allocation Algorithms Do Not Require Knowing Exact Item Values. Food rescue organizations are tasked with allocating often-unpredictable donations to recipients who need them. For a large class of recipient valuation functions, this can be done in a fair and efficient manner as long as each recipient reports their value for each arriving donation. In practice, however, such valuations are rarely elicited. In "Dynamic Fair Division with Partial Information," Benadè, Halpern, and Psomas ask whether simultaneous fairness and efficiency remain possible when the allocator receives limited information about recipient valuations, even as little as a single binary signal. For recipients with i.i.d. or correlated values, the paper provides an algorithm that is envy-free and (1 − epsilon)-welfare-maximizing with high probability. Asymptotically tight results are also established for independent, nonidentical agents. This shows that fair and efficient online allocation algorithms do not critically rely on recipients being able to precisely report their utility functions (an illustrative binary-signal allocation rule is sketched after this list).
- In this paper, the impact of various data integrity attacks on the electric drive systems of electric vehicles is analyzed. Cyber-physical models of power electronics and electric drives are first proposed to investigate the interaction between physical systems and cyber systems. Then, a few predefined performance metrics are introduced to evaluate the impact of data integrity attacks on power electronics and electric drives. Simulations are conducted to quantitatively analyze the impact under different attack scenarios. The results show that the metrics are strongly affected by data integrity attacks and exhibit features clearly distinguishable from those under healthy conditions. For example, current distortion could be increased by over 70% by maliciously reducing the current feedback signal to 10% of its original value, and torque ripple could be increased to up to 300% of the healthy value by similar attacks (a toy simulation of such a feedback-scaling attack is sketched below).
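For the cyber risk item above, here is a minimal sketch of one plausible reading of a "weighted security gap" between an organization and peer aggregates. The control names, weights, scores, and aggregation rule are all hypothetical assumptions for illustration; the paper's actual Defense Gap Index may be defined differently, and in a real PET deployment only the aggregate peer statistics (not individual peers' scores) would be revealed.

```python
# Hypothetical weighted-gap score: weighted shortfall of an organization's
# per-control posture relative to peer-group means. Names and numbers are
# illustrative, not from the paper.

from statistics import mean

# Per-control posture scores in [0, 1] for a peer group. In practice these
# would stay private; only aggregates would emerge from secure computation.
PEER_SCORES = {
    "patching":     [0.90, 0.70, 0.80, 0.85],
    "mfa_coverage": [0.95, 0.90, 0.80, 0.70],
    "backup_tests": [0.60, 0.50, 0.70, 0.40],
}
WEIGHTS = {"patching": 0.5, "mfa_coverage": 0.3, "backup_tests": 0.2}

def defense_gap_index(own_scores):
    """Weighted sum of how far the organization falls below peer means."""
    gap = 0.0
    for control, weight in WEIGHTS.items():
        peer_mean = mean(PEER_SCORES[control])
        gap += weight * max(0.0, peer_mean - own_scores[control])
    return gap

print(defense_gap_index(
    {"patching": 0.7, "mfa_coverage": 0.9, "backup_tests": 0.3}
))
```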
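For the dynamic fair division item, the following sketch only illustrates the kind of partial information the paper studies: each recipient reports a single binary "high value?" signal per arriving item, and the allocator picks uniformly at random among high signalers. This is not the algorithm of Benadè, Halpern, and Psomas, just a toy rule showing that allocation can proceed without exact valuations.

```python
# Toy online allocation from binary signals only (illustrative rule,
# not the paper's algorithm).

import random

def allocate_stream(signals_per_item):
    """signals_per_item: one list of booleans per arriving item,
    one boolean per agent (True = 'my value for this item is high').
    Returns the chosen agent index for each item."""
    n_agents = len(signals_per_item[0])
    allocation = []
    for signals in signals_per_item:
        high = [a for a in range(n_agents) if signals[a]]
        pool = high if high else list(range(n_agents))  # nobody signals: any agent
        allocation.append(random.choice(pool))
    return allocation

# Three agents, four arriving donations.
stream = [[True, False, True], [False, False, True],
          [True, True, True], [False, False, False]]
print(allocate_stream(stream))
```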
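For the electric drive item, here is a toy closed-loop simulation of a feedback-scaling attack, assuming a PI current controller on a first-order R-L phase model. All parameters are invented for illustration and this is not the paper's drive model; it only shows the mechanism by which cutting the current feedback to 10% of its true value drives roughly 10x overcurrent.

```python
# Toy model: PI current control of an R-L load where an attacker scales
# the measured current. Parameters are hypothetical.

def mean_steady_current(feedback_gain, t_end=0.2, dt=1e-5):
    R, L = 0.5, 1e-3            # phase resistance [ohm], inductance [H]
    kp, ki = 2.0, 400.0         # PI gains (illustrative)
    i_ref = 10.0                # commanded current [A]
    i = integ = 0.0
    samples = []
    for _ in range(int(t_end / dt)):
        err = i_ref - feedback_gain * i             # attacked measurement
        integ += err * dt
        v = max(min(kp * err + ki * integ, 60.0), -60.0)  # inverter limit
        i += (v - R * i) / L * dt                   # R-L plant dynamics
        samples.append(i)
    tail = samples[len(samples) // 2:]              # steady-state window
    return sum(tail) / len(tail)

print("healthy : %.1f A" % mean_steady_current(1.0))
print("attacked: %.1f A" % mean_steady_current(0.1))  # feedback cut to 10%
```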