

Title: Learning Implicitly with Noisy Data in Linear Arithmetic

Robust learning in expressive languages with real-world data continues to be a challenging task. Numerous conventional methods appeal to heuristics without any assurances of robustness. While probably approximately correct (PAC) Semantics offers strong guarantees, learning explicit representations is not tractable, even in propositional logic. However, recent work on so-called “implicit” learning has shown tremendous promise in terms of obtaining polynomial-time results for fragments of first-order logic. In this work, we extend implicit learning in PAC-Semantics to handle noisy data in the form of intervals and threshold uncertainty in the language of linear arithmetic. We prove that our extended framework keeps the existing polynomial-time complexity guarantees. Furthermore, we provide the first empirical investigation of this hitherto purely theoretical framework. Using benchmark problems, we show that our implicit approach to learning optimal linear programming objective constraints significantly outperforms an explicit approach in practice.
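To make the idea of answering a query directly from interval-valued examples more concrete, the following is a minimal sketch, not the paper's algorithm. It assumes an empty background theory, a single query of the form `coeffs . x <= bound`, and examples given as one closed interval per variable, so that entailment reduces to a closed-form interval bound; with a non-trivial background theory this check would instead call a linear programming or SMT solver. The names `max_over_box`, `example_entails`, and `decide_query`, and the simple `1 - epsilon` acceptance threshold, are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# One noisy observation: a closed interval [lo, hi] per variable (assumed format).
Interval = Tuple[float, float]


def max_over_box(coeffs: List[float], box: List[Interval]) -> float:
    """Largest value of sum(c_i * x_i) when each x_i ranges over its interval."""
    # A positive coefficient is maximised at the upper end, a negative one at the lower end.
    return sum(c * (hi if c >= 0 else lo) for c, (lo, hi) in zip(coeffs, box))


def example_entails(coeffs: List[float], bound: float, box: List[Interval]) -> bool:
    """True if every point of the box satisfies coeffs . x <= bound."""
    return max_over_box(coeffs, box) <= bound


def decide_query(coeffs: List[float], bound: float,
                 examples: List[List[Interval]], epsilon: float) -> bool:
    """Accept the query if at least a (1 - epsilon) fraction of examples entail it.

    No explicit hypothesis is ever constructed: the examples are only used
    to test the query, which is the sense of "implicit" assumed here.
    """
    entailed = sum(example_entails(coeffs, bound, box) for box in examples)
    return entailed >= (1.0 - epsilon) * len(examples)


if __name__ == "__main__":
    # Hypothetical interval observations of two variables (x, y).
    examples = [
        [(0.9, 1.1), (1.8, 2.2)],
        [(1.0, 1.2), (2.0, 2.5)],
        [(0.5, 0.7), (3.0, 3.6)],
        [(2.0, 2.4), (2.9, 3.3)],
    ]
    # Query: x + y <= 4, tested at two tolerance levels.
    print(decide_query([1.0, 1.0], 4.0, examples, epsilon=0.5))
    print(decide_query([1.0, 1.0], 4.0, examples, epsilon=0.25))
```

In this toy data only the first two examples entail the query, so it is accepted at the looser tolerance and rejected at the stricter one. The sample size needed for a given confidence is deliberately left out, since the abstract does not state it.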

 
Award ID(s):
1939677
NSF-PAR ID:
10297331
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Page Range / eLocation ID:
1410 to 1417
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Robustly learning in expressive languages with real-world data continues to be a challenging task. Numerous conventional methods appeal to heuristics without any assurances of robustness. While PAC-Semantics offers strong guarantees, learning explicit representations is not tractable even in a propositional setting. However, recent work on so-called "implicit" learning has shown tremendous promise in terms of obtaining polynomial-time results for fragments of first-order logic. In this work, we extend implicit learning in PAC-Semantics to handle noisy data in the form of intervals and threshold uncertainty in the language of linear arithmetic. We prove that our extended framework keeps the existing polynomial-time complexity guarantees. Furthermore, we provide the first empirical investigation of this hitherto purely theoretical framework. Using benchmark problems, we show that our implicit approach to learning optimal linear programming objective constraints significantly outperforms an explicit approach in practice.
  2. We consider the problem of answering queries about formulas of first-order logic based on background knowledge partially represented explicitly as other formulas, and partially represented as examples independently drawn from a fixed probability distribution. PAC semantics, introduced by Valiant, is one rigorous, general proposal for learning to reason in formal languages: although weaker than classical entailment, it allows for a powerful model-theoretic framework for answering queries while requiring minimal assumptions about the form of the distribution in question. To date, however, the most significant limitation of that approach, and more generally of most machine learning approaches with robustness guarantees, is that the logical language is ultimately essentially propositional, with finitely many atoms. Indeed, the theoretical findings on the learning of relational theories in such generality have been resoundingly negative. This is despite the fact that first-order logic is widely argued to be most appropriate for representing human knowledge. In this work, we present a new theoretical approach to robustly learning to reason in first-order logic, and consider universally quantified clauses over a countably infinite domain. Our results exploit symmetries exhibited by constants in the language, and generalize the notion of implicit learnability to show how queries can be computed against (implicitly) learned first-order background knowledge.
  3. The tension between deduction and induction is perhaps the most fundamental issue in areas such as philosophy, cognition, and artificial intelligence. In an influential paper, Valiant recognized that the challenge of learning should be integrated with deduction. In particular, he proposed a semantics to capture the quality possessed by the output of probably approximately correct (PAC) learning algorithms when formulated in a logic. Although weaker than classical entailment, it allows for a powerful model-theoretic framework for answering queries. In this paper, we provide a new technical foundation to demonstrate PAC learning with multi-agent epistemic logics. To circumvent the negative results in the literature on the difficulty of robust learning with the PAC semantics, we consider so-called implicit learning, where we are able to incorporate observations into the background theory in service of deciding the entailment of an epistemic query. We prove correctness of the learning procedure and discuss results on the sample complexity, that is, how many observations we will need to provably assert that the query is entailed given a user-specified error bound. Finally, we investigate under what circumstances this algorithm can be made efficient. On the last point, given that reasoning in epistemic logics, especially in multi-agent epistemic logics, is PSPACE-complete, it might seem like there is no hope for this problem. We leverage some recent results on the so-called Representation Theorem explored for single-agent and multi-agent epistemic logics with the only knowing operator to reduce modal reasoning to propositional reasoning.
  4. In this paper, we consider the linear convection-diffusion equation in one dimension with periodic boundary conditions, and analyze the stability of fully discrete methods that are defined with local discontinuous Galerkin (LDG) methods in space and several implicit-explicit (IMEX) Runge-Kutta methods in time. By using the forward temporal differences and backward temporal differences, respectively, we establish two general frameworks of the energy-method based stability analysis. From here, the fully discrete schemes being considered are shown to have monotonicity stability, i.e. the $L^2$ norm of the numerical solution does not increase in time, under the time step condition $\tau \le \mathcal{F}(h/c, d/c^2)$, with the convection coefficient $c$, the diffusion coefficient $d$, and the mesh size $h$. The function $\mathcal{F}$ depends on the specific IMEX temporal method, the polynomial degree $k$ of the discrete space, and the mesh regularity parameter. Moreover, the time step condition becomes $\tau \lesssim h/c$ in the convection-dominated regime and it becomes $\tau \lesssim d/c^2$ in the diffusion-dominated regime. The result is improved for a first order IMEX-LDG method. To complement the theoretical analysis, numerical experiments are further carried out, leading to slightly stricter time step conditions that can be used by practitioners. Uniform stability with respect to the strength of the convection and diffusion effects can especially be relevant to guide the choice of time step sizes in practice, e.g. when the convection-diffusion equations are convection-dominated in some sub-regions.

     
  5. In this paper we present a method based on linear programming that facilitates reliable safety verification of hybrid dynamical systems subject to perturbation inputs over the infinite time horizon. The verification algorithm applies the probably approximately correct (PAC) learning framework and consequently can be regarded as statistically formal verification in the sense that it provides formal safety guarantees expressed using error probabilities and confidences. The safety of hybrid systems in this framework is verified via the computation of so-called PAC barrier certificates, which can be computed by solving a linear programming problem. Based on scenario approaches, the linear program is constructed by a family of independent and identically distributed state samples. In this way we can conduct verification of hybrid dynamical systems that existing methods are not capable of dealing with. Some preliminary experiments demonstrate the performance of our approach. 