Title: SafeNet: A Neural-Symbolic Network for Safe Planning in Robotic Systems using Formal Method-Guided LLM Fine-Tuning
Robotic systems present unique safety challenges due to their complex integration of computational and physical processes and their direct interaction with humans and environments. Traditional approaches to robot safety planning either rely on conventional methods, which struggle with the complexity of modern robotic systems, or on pure machine learning techniques, which lack formal safety guarantees. While recent advances in Large Language Models (LLMs) offer promising capabilities, pre-trained LLMs alone lack the specific domain expertise required for effective robotic safety planning. This paper introduces SafeNet, a novel neural-symbolic network architecture that enhances LLMs' safety planning capabilities through formal method-guided fine-tuning for robotic applications. Our approach integrates formal logical knowledge and reward machines into pre-trained LLMs through carefully designed fine-tuning, yielding a neural-symbolic method that combines the flexibility of neural networks with the precision of formal methods for robot trajectory generation and task planning. Experimental results demonstrate significant improvements in safe trajectory generation for robotic systems, with planning success rates increasing from 1.17% to 91.60% on the block manipulation task and from 7.23% to 90.63% on the robotic path planning task.
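The abstract mentions reward machines as the symbolic component guiding fine-tuning. As a rough illustration of the idea (not the paper's actual implementation — the `RewardMachine` class, states, and event labels below are hypothetical), a reward machine is a finite-state machine over labeled events that pays reward only for trajectories satisfying a temporal spec, such as "pick before place, and never collide":

```python
class RewardMachine:
    """Minimal reward machine over event labels emitted along a trajectory."""

    def __init__(self, transitions, start, accepting, trap="unsafe"):
        self.transitions = transitions  # {(state, event): next_state}
        self.start = start
        self.accepting = accepting
        self.trap = trap                # absorbing state for safety violations

    def run(self, events):
        state = self.start
        for e in events:
            # Any (state, event) pair not in the spec falls into the trap state.
            state = self.transitions.get((state, e), self.trap)
            if state == self.trap:
                return 0.0  # no reward once safety is violated
        return 1.0 if state in self.accepting else 0.0

# Spec: "pick" must precede "place"; "collide" is unsafe in every state.
rm = RewardMachine(
    transitions={
        ("q0", "move"): "q0",
        ("q0", "pick"): "q1",
        ("q1", "move"): "q1",
        ("q1", "place"): "q2",
        ("q2", "move"): "q2",
    },
    start="q0",
    accepting={"q2"},
)
print(rm.run(["move", "pick", "move", "place"]))  # → 1.0
```

In a setup like the one the abstract describes, such a machine could score LLM-proposed trajectories during fine-tuning, rewarding only spec-satisfying plans.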
Award ID(s):
2442914 2333980
PAR ID:
10670645
Publisher / Repository:
IEEE
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper addresses temporal logic task planning problems for mobile robots. We consider missions that require accomplishing multiple high-level sub-tasks, expressed in natural language (NL), in a temporal and logical order. To formally define the mission, we treat these sub-tasks as atomic predicates in a Linear Temporal Logic (LTL) formula. We refer to this task specification framework as LTL-NL. Our goal is to design plans, defined as sequences of robot actions, accomplishing LTL-NL tasks. This action planning problem cannot be solved directly by existing LTL planners due to the NL nature of atomic predicates. Therefore, we propose HERACLEs, a hierarchical neuro-symbolic planner that relies on a novel integration of (i) existing symbolic planners generating high-level task plans determining the order at which the NL sub-tasks should be accomplished; (ii) pre-trained Large Language Models (LLMs) to design sequences of robot actions for each sub-task in these task plans; and (iii) conformal prediction acting as a formal interface between (i) and (ii) and managing uncertainties due to LLM imperfections. We show, both theoretically and empirically, that HERACLEs can achieve user-defined mission success rates. We demonstrate the efficiency of HERACLEs through comparative numerical experiments against recent LLM-based planners as well as hardware experiments on mobile manipulation tasks. Finally, we present examples demonstrating that our approach enhances user-friendliness compared to conventional symbolic approaches. 
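Abstract 1 uses conformal prediction as the formal interface between the symbolic planner and the LLM. A minimal sketch of split conformal calibration, assuming hypothetical nonconformity scores (e.g., one minus the LLM's confidence) on held-out sub-tasks — the function name and candidate actions below are illustrative, not from the paper:

```python
import math

def conformal_threshold(calib_scores, alpha=0.1):
    """Split conformal quantile: with n calibration scores, the
    ceil((n+1)(1-alpha))-th smallest score bounds the miscoverage at alpha."""
    n = len(calib_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(calib_scores)[min(k, n) - 1]

# Calibration scores collected on sub-tasks with known correct actions.
calib = [0.05, 0.12, 0.08, 0.30, 0.22, 0.15, 0.09, 0.11, 0.18, 0.25]
tau = conformal_threshold(calib, alpha=0.2)  # tau == 0.25 here

# At planning time, keep every candidate action whose score stays below tau;
# if no candidate qualifies, a HERACLEs-style planner would ask for help.
candidates = {"move_to_shelf": 0.07, "grasp_cup": 0.19, "open_drawer": 0.40}
plan_set = [a for a, s in candidates.items() if s <= tau]
```

The calibrated threshold is what turns raw LLM confidences into the user-defined mission success rates the abstract claims.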
  2. Early forecasting of student performance in a course is a critical component of building effective intervention systems. However, when the available student data is limited, accurate early forecasting is challenging. We present a language generation transfer learning approach that leverages the general knowledge of pre-trained language models to address this challenge. We hypothesize that early forecasting can be significantly improved by fine-tuning large language models (LLMs) via personalization and contextualization using data on students' distal factors (academic and socioeconomic) and proximal non-cognitive factors (e.g., motivation and engagement), respectively. Results obtained from extensive experimentation validate this hypothesis and thereby demonstrate the prowess of personalization and contextualization for tapping into the general knowledge of pre-trained LLMs for solving the downstream task of early forecasting. 
  3. Fine-tuning a pre-trained model on a downstream task often degrades its original capabilities, a phenomenon known as "catastrophic forgetting". This is especially an issue when one does not have access to the data and recipe used to develop the pre-trained model. Under this constraint, most existing methods for mitigating forgetting are inapplicable. To address this challenge, we propose a sample weighting scheme for the fine-tuning data solely based on the pre-trained model's losses. Specifically, we upweight the easy samples on which the pre-trained model's loss is low and vice versa to limit the drift from the pre-trained model. Our approach is orthogonal and yet complementary to existing methods; while such methods mostly operate on parameter or gradient space, we concentrate on the sample space. We theoretically analyze the impact of fine-tuning with our method in a linear setting, showing that it stalls learning in a certain subspace which inhibits overfitting to the target task. We empirically demonstrate the efficacy of our method on both language and vision tasks. As an example, when fine-tuning Gemma 2 2B on MetaMathQA, our method results in only a 0.8% drop in accuracy on GSM8K (another math dataset) compared to standard fine-tuning, while preserving 5.4% more accuracy on the pre-training datasets. 
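Abstract 3 describes weighting fine-tuning samples by the frozen pre-trained model's losses, upweighting easy (low-loss) samples to limit drift. One simple way to realize that monotone weighting — a sketch under our own assumptions, not necessarily the paper's exact scheme — is a softmax over negative pre-trained losses:

```python
import math

def forgetting_aware_weights(pretrained_losses, temperature=1.0):
    """Softmax over negative pre-trained losses: samples the frozen
    pre-trained model finds easy (low loss) receive the largest weights."""
    exps = [math.exp(-l / temperature) for l in pretrained_losses]
    total = sum(exps)
    return [e / total for e in exps]

# Per-sample losses of the frozen pre-trained model on the fine-tuning set.
losses = [0.2, 2.5, 0.4, 3.0]
weights = forgetting_aware_weights(losses)

# The weighted fine-tuning objective would be sum(w_i * finetune_loss_i),
# so easy samples dominate each update and drift from the base model shrinks.
```

The temperature controls how sharply the weighting concentrates on easy samples; at high temperature it approaches uniform (standard) fine-tuning.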
  4. Automated Planning and Scheduling is one of the growing areas of Artificial Intelligence (AI) in which the use of LLMs has gained popularity. Based on a comprehensive review of 126 papers, this paper investigates eight categories based on the unique applications of LLMs in addressing various aspects of planning problems: language translation, plan generation, model construction, multi-agent planning, interactive planning, heuristics optimization, tool integration, and brain-inspired planning. For each category, we articulate the issues considered and existing gaps. A critical insight resulting from our review is that the true potential of LLMs unfolds when they are integrated with traditional symbolic planners, pointing towards a promising neuro-symbolic approach. This approach effectively combines the generative aspects of LLMs with the precision of classical planning methods. By synthesizing insights from existing literature, we underline the potential of this integration to address complex planning challenges. Our goal is to encourage the ICAPS community to recognize the complementary strengths of LLMs and symbolic planners, advocating for a direction in automated planning that leverages these synergistic capabilities to develop more advanced and intelligent planning systems. We aim to keep the categorization of papers updated on https://ai4society.github.io/LLM-Planning-Viz/, a collaborative resource that allows researchers to contribute and add new literature to the categorization.
  5. Transfer learning using ImageNet pre-trained models has been the de facto approach in a wide range of computer vision tasks. However, fine-tuning still requires task-specific training data. In this paper, we propose N3 (Neural Networks from Natural Language) - a new paradigm of synthesizing task-specific neural networks from language descriptions and a generic pre-trained model. N3 leverages language descriptions to generate parameter adaptations as well as a new task-specific classification layer for a pre-trained neural network, effectively "fine-tuning" the network for a new task using only language descriptions as input. To the best of our knowledge, N3 is the first method to synthesize entire neural networks from natural language. Experimental results show that N3 can outperform previous natural-language based zero-shot learning methods across 4 different zero-shot image classification benchmarks. We also demonstrate a simple method to help identify keywords in language descriptions leveraged by N3 when synthesizing model parameters.