skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: ArgLegalSumm: Improving Abstractive Summarization of Legal Documents with Argument Mining
A challenging task when generating summaries of legal documents is the ability to address their argumentative nature. We introduce a simple technique to capture the argumentative structure of legal documents by integrating argument role labeling into the summarization process. Experiments with pretrained language models show that our proposed approach improves performance over strong baselines.  more » « less
Award ID(s):
2040490
PAR ID:
10383455
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the 29th International Conference on Computational Linguistics
Page Range / eLocation ID:
6187-6194
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Cloud Legal documents, like Privacy Policies and Terms of Services (ToS), include key terms and rules that enable consumers to continuously monitor the performance of the cloud services used in their organization. To ensure high consumer confidence in the cloud service, it is necessary that these documents are clear and comprehensible to the average consumer. However, in practice, service providers often use legalese and ambiguous language in cloud legal documents resulting in consumers consenting or rejecting the terms without understanding the details. A measure capturing ambiguity in the texts of cloud service documents will enable consumers to decide if they understand what they are agreeing to, and deciding whether that service will meet their organizational requirements. It will also allow them to compare the service policies across various vendors. We have developed a novel model, ViCLOUD, that defines a scoring method based on linguistic cues to measure ambiguity in cloud legal documents and compare them to other peer websites. In this paper, we describe the ViCLOUD model in detail along with the validation results when applying it to 112 privacy policies and 108 Terms of Service documents of 115 cloud service vendors. The score distribution gives us a landscape of current trends in cloud services and a scale of comparison for new documentation. Our model will be very useful to organizations in making judicious decisions when selecting their cloud service. 
    more » « less
  2. Understanding an online argumentative discussion is essential for understanding users' opinions on a topic and their underlying reasoning. A key challenge in determining completeness and persuasiveness of argumentative discussions is to assess how arguments under a topic are connected in a logical and coherent manner. Online argumentative discussions, in contrast to essays or face-to-face communication, challenge techniques for judging argument relevance because online discussions involve multiple participants and often exhibit incoherence in reasoning and inconsistencies in writing style. We define relevance as the logical and topical connections between small texts representing argument fragments in online discussions. We provide a corpus comprising pairs of sentences, labeled with argumentative relevance between the sentences in each pair. We propose a computational approach relying on content reduction and a Siamese neural network architecture for modeling argumentative connections and determining argumentative relevance between texts. Experimental results indicate that our approach is effective in measuring relevance between arguments, and outperforms strong and well-adopted baselines.Further analysis demonstrates the benefit of using our argumentative relevance encoding on a downstream task, predicting how impactful an online comment is to certain topic, comparing to encoding that does not consider logical connection. 
    more » « less
  3. How and when do opportunities for political participation through courts change under authoritarianism? Although China is better known for tight political control than for political expression, the 2008 Open Government Information (OGI) regulation ushered in a surge of political-legal activism. We draw on an original dataset of 57,095 OGI lawsuits, supplemented by interview data and government documents, to show how a feedback loop between judges and court users shaped possibilities for political activism and complaint between 2008 and 2019. Existing work suggests that authoritarian leaders crack down on legal action when they feel politically threatened. In contrast, we find that courts minted, defined, and popularized new legal labels to cut off access to justice for the super-active litigants whose lawsuits had come to dominate the OGI docket. This study underscores the power of procedural rules and frontline judges in shaping possibilities for political participation under authoritarianism. 
    more » « less
  4. Terms of service documents are a common feature of organizations' websites. Although there is no blanket requirement for organizations to provide these documents, their provision often serves essential legal purposes. Users of a website are expected to agree with the contents of a terms of service document, but users tend to ignore these documents as they are often lengthy and difficult to comprehend. As a step towards understanding the landscape of these documents at a large scale, we present a first-of-its-kind terms of service corpus containing 247,212 English language terms of service documents obtained from company websites sampled from Free Company Dataset. We examine the URLs and contents of the documents and find that some websites that purport to post terms of service actually do not provide them. We analyze reasons for unavailability and determine the overall availability of terms of service in a given set of website domains. We also identify that some websites provide an agreement that combines terms of service with a privacy policy, which is often an obligatory separate document. Using topic modeling, we analyze the themes in these combined documents by comparing them with themes found in separate terms of service and privacy policies. Results suggest that such single-page agreements miss some of the most prevalent topics available in typical privacy policies and terms of service documents and that many disproportionately cover privacy policy topics as compared to terms of service topics. 
    more » « less
  5. null (Ed.)
    A quiet revolution is afoot in the field of law. Technical systems employing algorithms are shaping and displacing professional decision making, and they are disrupting and restructuring relationships between law firms, lawyers, and clients. Decision-support systems marketed to legal professionals to support e-discovery—generally referred to as “technology assisted review” (TAR)—increasingly rely on “predictive coding”: machine-learning techniques to classify and predict which of the voluminous electronic documents subject to litigation should be withheld or produced to the opposing side. These systems and the companies offering them are reshaping relationships between lawyers and clients, introducing new kinds of professionals into legal practice, altering the discovery process, and shaping how lawyers construct knowledge about their cases and professional obligations. In the midst of these shifting relationships—and the ways in which these systems are shaping the construction and presentation of knowledge—lawyers are grappling with their professional obligations, ethical duties, and what it means for the future of legal practice. Through in-depth, semi-structured interviews of experts in the e-discovery technology space—the technology company representatives who develop and sell such systems to law firms and the legal professionals who decide whether and how to use them in practice—we shed light on the organizational structures, professional rules and norms, and technical system properties that are shaping and being reshaped by predictive coding systems. Our findings show that AI-supported decision systems such as these are reconfiguring professional work practices. In particular, they highlight concerns about potential loss of professional agency and skill, limited understanding and thereby both over- and under reliance on decision-support systems, and confusion about responsibility and accountability as new kinds of technical professionals and technologies are brought into legal practice. The introduction of predictive coding systems and the new professional and organizational arrangements they are ushering into legal practice compound general concerns over the opacity of technical systems with specific concerns about encroachments on the construction of expert knowledge, liability frameworks, and the potential (mis)alignment of machine reasoning with professional logic and ethics. Based on our findings, we conclude that predictive coding tools—and likely other algorithmic systems lawyers use to construct knowledge and reason about legal practice— challenge the current model for evaluating whether and how tools are appropriate for legal practice. As tools become both more complex and more consequential, it is unreasonable to rely solely on legal professionals—judges, law firms, and lawyers—to determine which technologies are appropriate for use. The legal professionals we interviewed report relying on the evaluation and judgment of a range of new technical experts within law firms and, increasingly, third-party vendors and their technical experts. This system for choosing technical systems upon which lawyers rely to make professional decisions—e.g., whether documents are responsive, or whether the standard of proportionality has been met—is no longer sufficient. As the tools of medicine are reviewed by appropriate experts before they are put out for consideration and adoption by medical professionals, we argue that the legal profession must develop new processes for determining which algorithmic tools are fit to support lawyers’ decision making. Relatedly, because predictive coding systems are used to produce lawyers’ professional judgment, we argue they must be designed for contestability— providing greater transparency, interaction, and configurability around embedded choices to ensure decisions about how to embed core professional judgments, such as relevance and proportionality, remain salient and demand engagement from lawyers, not just their technical experts. 
    more » « less