Title: Delegating Data Collection in Decentralized Machine Learning
Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal performance of any model. We show that a principal can cope with such asymmetry via simple linear contracts that achieve a $$1-1/e$$ fraction of the optimal utility. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract. We also analyze the optimal utility and linear contracts for the more complex setting of multiple interactions.
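As a rough illustration of what a linear contract means in this kind of setting, the sketch below uses a toy principal-agent model; it is not the model analyzed in the paper. Everything in it is an assumption made for illustration: the concave learning curve `quality`, the per-sample cost, and the function names. Under a linear contract the principal pays the agent a fixed share `alpha` of the realized model quality, the agent picks the number of samples that maximizes its own payoff, and the principal keeps the remaining share.

```python
import math


def quality(n):
    """Hypothetical concave learning curve: model quality after n samples."""
    return 1.0 - 1.0 / math.sqrt(n + 1)


def agent_best_response(alpha, cost_per_sample, max_samples=2000):
    """Sample count maximizing the agent's payoff alpha * quality(n) - cost * n."""
    return max(
        range(max_samples + 1),
        key=lambda n: alpha * quality(n) - cost_per_sample * n,
    )


def principal_utility(alpha, cost_per_sample):
    """Principal keeps the (1 - alpha) share of the quality the agent delivers."""
    n = agent_best_response(alpha, cost_per_sample)
    return (1.0 - alpha) * quality(n)


def best_linear_contract(cost_per_sample, grid=21):
    """Grid search over shares alpha in [0, 1] (a brute-force stand-in,
    not the convex program mentioned in the abstract)."""
    alphas = [i / (grid - 1) for i in range(grid)]
    return max(alphas, key=lambda a: principal_utility(a, cost_per_sample))


if __name__ == "__main__":
    alpha = best_linear_contract(cost_per_sample=0.001)
    print("share:", alpha, "samples:", agent_best_response(alpha, 0.001))
```

In this toy model a share of 0 gives the agent no reason to collect data and a share of 1 leaves the principal nothing, so the principal's best share lies strictly in between.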
Award ID(s):
2145898
PAR ID:
10573556
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal performance of any model. We show that a principal can cope with such asymmetry via simple linear contracts that achieve a $$1-1/e$$ fraction of the optimal utility. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract. We also analyze the optimal utility and linear contracts for the more complex setting of multiple interactions.
  2. Feldt, Robert; Zimmermann, Thomas; Basili, Victor R; Briand, Lionel C (Ed.)
    Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow, Scikit-learn, Keras, and PyTorch. For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key findings are that the most commonly needed contracts for ML APIs are either checking constraints on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages. 
  3. Public blockchains have spurred the growing popularity of decentralized transactions and smart contracts, especially in the financial market. However, public blockchains are limited in transaction throughput, storage availability, and compute capacity. To avoid transaction gridlock, public blockchains impose large fees and per-block resource limits, making it difficult to accommodate the ever-growing transaction demand. Previous research has sought to improve the scalability and performance of blockchains through various technologies, such as side-chaining, sharding, secured off-chain computation, communication network optimizations, and efficient consensus protocols. However, these approaches have not attained widespread adoption because they cannot deliver cloud-like performance in terms of scalability in transaction throughput, storage, and compute capacity. In this work, we determine that the major obstacle to public blockchain scalability is the underlying unstructured P2P network. We further show that a centralized network can support the deployment of decentralized smart contracts. We propose a novel approach for achieving scalable decentralization: instead of trying to make blockchain scalable, we deliver decentralization to the already scalable cloud by using an Ethereum smart contract. We introduce Blockumulus, a framework that can deploy decentralized cloud smart contract environments using a novel technique called overlay consensus. Through experiments, we demonstrate that Blockumulus is scalable in all three dimensions: computation, data storage, and transaction throughput. Besides eliminating the current code execution and storage restrictions, Blockumulus delivers a transaction latency between 2 and 5 seconds under normal load. Moreover, the stress test of our prototype reveals the ability to execute 20,000 simultaneous transactions in under 26 seconds, which is on par with the average throughput of worldwide credit card transactions.
  4. We consider the decentralized control of a discrete-time, linear system subject to exogenous disturbances and polyhedral constraints on the state and input trajectories. The underlying system is composed of a finite collection of dynamically coupled subsystems, where each subsystem is assumed to have a dedicated local controller. The decentralization of information is expressed according to sparsity constraints on the state measurements that each local controller has access to. In this context, we investigate the design of decentralized controllers that are affinely parameterized in their measurement history. For problems with partially nested information structures, the optimization over such policy spaces is known to be convex. Convexity is not, however, guaranteed under more general (nonclassical) information structures in which the information available to one local controller can be affected by control actions that it cannot access or reconstruct. With the aim of alleviating the nonconvexity that arises in such problems, we propose an approach to decentralized control design where the information-coupling states are effectively treated as disturbances whose trajectories are constrained to take values in ellipsoidal contract sets whose location, scale, and orientation are jointly optimized with the underlying affine decentralized control policy. We establish a natural structural condition on the space of allowable contracts that facilitates the joint optimization over the control policy and the contract set via semidefinite programming.
  5. Modern smart grid systems exploit a two-way interaction paradigm between the utility and the electricity user and promote the role of the prosumer, a new user type able to generate and sell energy or to consume it. Within such a setting, the prosumers and their interactions with the microgrid system become highly significant for its efficient operation. In this article, to model the corresponding interactions, we introduce a labor economics-based framework that exploits the principles of contract theory and jointly achieves the satisfaction of the various interacting system entities, i.e., the microgrid operator (MGO) and the prosumers. The MGO offers personalized rewards to the sellers and buyers to incentivize them to sell and purchase energy, respectively. To provide a stable and efficient operation point while jointly satisfying the profit and requirements of the involved competing parties, optimal personalized contracts, i.e., rewards and amounts of sold/purchased energy, are determined by formulating and solving contract-theoretic optimization problems between the MGO and the sellers or buyers. The analysis is provided for both cases of complete and incomplete information availability regarding the prosumers’ types. Detailed numerical results are presented to demonstrate the operation characteristics of the proposed framework under diverse scenarios.