Title: Be sure to use the same writing style: Applying Authorship Verification on LLM-Generated Texts
The repository link contains a README that gives an overview of the files and the structure of the data. Additionally, for LLAMA and GPT2, the files follow the human_{llm_name}{i}.jsonl naming format, where {llm_name} is the name of the LLM and {i} is the partition index; the partitions can be concatenated to form the full dataset for that LLM.
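As an illustration, one model's partitions could be reassembled as follows. This is a minimal sketch based only on the human_{llm_name}{i}.jsonl naming described above; the example name "gpt2" and the flat directory layout are assumptions, not part of the record.

    # Minimal sketch: concatenate the human_{llm_name}{i}.jsonl partitions
    # for one LLM into a single in-memory dataset. The model name and
    # directory layout are assumptions for illustration only.
    import glob
    import json

    llm_name = "gpt2"  # assumed token; substitute the one used in the filenames
    records = []
    for path in sorted(glob.glob(f"human_{llm_name}*.jsonl")):
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    records.append(json.loads(line))
    print(f"Loaded {len(records)} records for {llm_name}")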
Award ID(s):
2341206
PAR ID:
10591687
Author(s) / Creator(s):
Publisher / Repository:
Zenodo
Date Published:
Format(s):
Medium: X
Location:
Zenodo
Right(s):
Creative Commons Attribution 4.0 International
Sponsoring Org:
National Science Foundation
More Like this
  1. Codes and data for "Large language models design sequence-defined macromolecules via evolutionary optimization" Note: this repository contains codes and data files for the manuscript. This is a snapshot of the repository, frozen at the time of submission. Codes: LLM codes, other algorithms, postprocessing, visualization. Data files: prompts, models, embeddings, LLM responses.
  2. Large Language Models (LLMs) have demonstrated significant potential across various applications, but their use as AI copilots in complex and specialized tasks is often hindered by AI hallucinations, where models generate outputs that seem plausible but are incorrect. To address this challenge, we develop AutoFEA, an intelligent system that integrates LLMs with Finite Element Analysis (FEA) to automate the generation of FEA input files. Our approach features a novel planning method and a graph convolutional network (GCN)-Transformer Link Prediction retrieval model, which enhances the accuracy and reliability of the generated simulations. The AutoFEA system proceeds with key steps: dataset preparation, step-by-step planning, GCN-Transformer Link Prediction retrieval, LLM-driven code generation, and simulation using CalculiX. In this workflow, the GCN-Transformer model predicts and retrieves relevant example codes based on relationships between different steps in the FEA process, guiding the LLM in generating accurate simulation codes. We validate AutoFEA using a specialized dataset of 512 meticulously prepared FEA projects, which provides a robust foundation for training and evaluation. Our results demonstrate that AutoFEA significantly reduces AI hallucinations by grounding LLM outputs in physically accurate simulation data, thereby improving the success rate and accuracy of FEA simulations and paving the way for future advancements in AI-assisted engineering tasks. 
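     As an illustration of the retrieve-then-generate step described above, here is a minimal sketch: candidate example snippets are scored against the current FEA step and the best match is placed into the LLM prompt. The toy bag-of-words scorer, the example snippets, and the prompt layout are assumptions for illustration; they stand in for, and do not reproduce, AutoFEA's GCN-Transformer link-prediction retriever or its LLM interface.

         # Minimal sketch: rank stored FEA example snippets against the current
         # step and build a retrieval-augmented prompt for code generation.
         # The scorer below is a toy stand-in for the GCN-Transformer retriever.
         from collections import Counter
         import math

         def similarity(query, doc):
             q, d = Counter(query.lower().split()), Counter(doc.lower().split())
             num = sum(q[w] * d[w] for w in set(q) & set(d))
             denom = (math.sqrt(sum(v * v for v in q.values()))
                      * math.sqrt(sum(v * v for v in d.values())))
             return num / denom if denom else 0.0

         # Hypothetical example library keyed by a short step description.
         examples = {
             "define elastic material": "*MATERIAL, NAME=STEEL\n*ELASTIC\n210000., 0.3",
             "apply load to node set": "*CLOAD\nNTOP, 2, -100.",
         }

         step = "apply load to the top node set"
         best_key = max(examples, key=lambda k: similarity(step, k))
         prompt = ("Generate the CalculiX input lines for this step: " + step
                   + "\nRelevant example:\n" + examples[best_key])
         # The prompt would then be sent to an LLM for input-file generation.
         print(prompt)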
  3. Topologically Interlocked Material (TIM) systems are a class of architectured materials. TIM systems are assembled from individual building blocks and are confined by an external frame. In particular, 2D, plate-type assemblies are considered. This publication contains files for the numerical analysis of the mechanical behavior of TIM systems through the use of finite element analysis. ABAQUS model files (inp format) for the study of the chiral/achiral response are provided.
     Files chirality_s1_in.inp are for type I square assemblies, n = 3, 5, 7, 9.
     Files chirality_s2_in.inp are for type II square assemblies, n = 4, 6, 8, 10.
     Files chirality_h1_in.inp are for type I hexagon assemblies, n = 2, 3, 4, 5.
     Files chirality_h2_in.inp are for type II hexagon assemblies, n = 2, 3, 4, 5.
     File chirality_s1i5_center_dissection.inp is for an assembly with a dissection of the central tile of a type I square assembly with n = 5.
     File chirality_s2i6_center_dissection.inp is for an assembly with a dissection of the central tile of a type II square assembly with n = 6.
     File chirality_s1i5_center_surrounding_dissection.inp is for an assembly with dissections of the tiles surrounding the center tile of a type I square assembly with n = 5.
     File chirality_h1i3_center_dissection.inp is for an assembly with a dissection of the central tile of a type I hexagon assembly with n = 3.
     File chirality_h2i3_center_dissection.inp is for an assembly with a dissection of the central tile of a type II hexagon assembly with n = 3.
     File chirality_h1i3_center_surrounding_dissection.inp is for an assembly with dissections of the tiles surrounding the center tile of a type I hexagon assembly with n = 3.
  4. Many scientific applications operate on data sets that span hundreds of gigabytes or even terabytes in size. Large data sets often use compression to reduce the size of the files. Yet as of today, parallel I/O libraries do not support reading and writing compressed files, necessitating either expensive sequential compression/decompression operations before/after the simulation, or omitting advanced features of parallel I/O libraries, such as collective I/O operations. This paper introduces parallel I/O on compressed data files; discusses the key challenges, requirements, and solutions for supporting compressed data files in MPI I/O; and describes limitations on some MPI I/O operations when using compressed data files. The paper details the handling of individual read and write operations on compressed data files, and presents an extension to the two-phase collective I/O algorithm to support data compression. The paper further presents and evaluates an implementation based on the Snappy compression library and the OMPIO parallel I/O framework. The performance evaluation using multiple data sets demonstrates significant performance benefits when using data compression on a parallel BeeGFS file system.
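     As a rough illustration of one key challenge, a minimal sketch follows: because compressed block sizes differ per rank, file offsets must be agreed on before writing. It uses mpi4py and the python-snappy bindings as assumed stand-ins and writes each rank's block independently; it is not the paper's OMPIO implementation or its two-phase collective algorithm.

         # Minimal sketch: each rank compresses its block with Snappy, ranks
         # exchange compressed sizes to compute file offsets, then write with
         # MPI I/O. Requires mpi4py and python-snappy; run under mpirun.
         from mpi4py import MPI
         import snappy

         comm = MPI.COMM_WORLD
         rank = comm.Get_rank()

         # Hypothetical per-rank payload; a real application would hold simulation data.
         payload = (f"rank {rank} data " * 1000).encode()
         compressed = snappy.compress(payload)

         # Compressed sizes differ across ranks, so offsets must be agreed on first.
         sizes = comm.allgather(len(compressed))
         offset = sum(sizes[:rank])

         fh = MPI.File.Open(comm, "blocks.snappy",
                            MPI.MODE_WRONLY | MPI.MODE_CREATE)
         fh.Write_at(offset, compressed)
         fh.Close()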
  5. Although Large Language Models (LLMs) succeed in human-guided conversations such as instruction following and question answering, the potential of LLM-guided conversations, where LLMs direct the discourse and steer the conversation's objectives, remains largely untapped. In this study, we provide an exploration of the LLM-guided conversation paradigm. Specifically, we first characterize LLM-guided conversation by three fundamental properties: (i) Goal Navigation; (ii) Context Management; (iii) Empathetic Engagement, and propose GuideLLM as a general framework for LLM-guided conversation. We then implement an autobiography interviewing environment as one demonstration of GuideLLM; such interviewing is a common practice in Reminiscence Therapy. In this environment, various techniques are integrated with GuideLLM to enhance the autonomy of LLMs, such as the Verbalized Interview Protocol (VIP) and Memory Graph Extrapolation (MGE) for goal navigation, and therapy strategies for empathetic engagement. We compare GuideLLM with baseline LLMs, such as GPT-4-turbo and GPT-4o, from the perspective of interviewing quality, conversation quality, and autobiography generation quality. Experimental results, encompassing both LLM-as-a-judge evaluations and human subject experiments involving 45 participants, indicate that GuideLLM significantly outperforms baseline LLMs in the autobiography interviewing task.
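     For illustration, a minimal sketch of the goal-navigation idea follows: the interviewer tracks the remaining topics and steers each question toward them. The chat() placeholder and the topic list are hypothetical stand-ins, not GuideLLM's actual API or its VIP/MGE implementation.

         # Minimal sketch of goal navigation in an LLM-guided interview.
         # chat() is a hypothetical placeholder, not GuideLLM's API; swap in a
         # real chat-completion client to run against an actual model.
         def chat(system_prompt, history):
             # Placeholder reply so the sketch runs without an LLM backend.
             return "Could you tell me more about that period of your life?"

         def next_question(goals_remaining, history):
             system_prompt = (
                 "You are an empathetic autobiography interviewer. Ask one question "
                 "that advances these remaining topics: " + ", ".join(goals_remaining)
             )
             return chat(system_prompt, history)

         goals = ["childhood home", "first job", "proudest moment"]  # assumed topics
         history = []
         question = next_question(goals, history)
         history.append({"role": "interviewer", "text": question})
         # After each answer, covered topics would be removed from goals (not shown).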