NSF PAR Search | NSF Public Access Repository

Building a Broad Infrastructure for Uniform Meaning Representations

Bonn, Julia; Buchholz, Matthew J; Chun, Jayeol; Cowell, Andrew; Croft, William; Denk, Lukas; Ge, Sijia; Hajič, Jan; Lai, Kenneth; Martin, James H; et al (May 2024, ELRA and ICCL)
Calzolari, Nicoletta; Kan, Min-Yen; Hoste, Veronique; Lenci, Alessandro; Sakti, Sakriani; Xue, Nianwen (Ed.)
This paper reports the first release of the UMR (Uniform Meaning Representation) data set. UMR is a graph-based meaning representation formalism consisting of a sentence-level graph and a document-level graph. The sentence-level graph represents predicate-argument structures, named entities, word senses, aspectuality of events, as well as person and number information for entities. The document-level graph represents coreferential, temporal, and modal relations that go beyond sentence boundaries. UMR is designed to capture the commonalities and variations across languages and this is done through the use of a common set of abstract concepts, relations, and attributes as well as concrete concepts derived from words from individual languages. This UMR release includes annotations for six languages (Arapaho, Chinese, English, Kukama, Navajo, Sanapana) that vary greatly in terms of their linguistic properties and resource availability. We also describe ongoing efforts to enlarge this data set and extend it to other genres and modalities. We also briefly describe the available infrastructure (UMR annotation guidelines and tools) that others can use to create similar data sets.
Full Text Available
Mapping AMR to UMR: Resources for Adapting Existing Corpora for Cross-Lingual Compatibility

Bonn, Julia; Myers, Skatje; Van Gysel, Jens E.; Denk, Lukas; Vigus, Meagan; Zhao, Jin; Cowell, Andrew; Croft, William; Hajič, Jan; Martin, James H.; et al (March 2023, Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT/SyntaxFest 2023))

This paper presents detailed mappings between the structures used in Abstract Meaning Representation (AMR) and those used in Uniform Meaning Representation (UMR). These structures include general semantic roles, rolesets, and concepts that are largely shared between AMR and UMR, but with crucial differences. While UMR annotation of new low-resource languages is ongoing, AMR-annotated corpora already exist for many languages, and these AMR corpora are ripe for conversion to UMR format. Rather than focusing on semantic coverage that is new to UMR (which will likely need to be dealt with manually), this paper serves as a resource (with illustrated mappings) for users looking to understand the fine-grained adjustments that have been made to the representation techniques for semantic categories present in both AMR and UMR.
Full Text Available
AutoAspect: Automatic Annotation of Tense and Aspect for Uniform Meaning Representations

https://doi.org/10.18653/v1/2021.law-1.4

Chen, Daniel; Palmer, Martha; Vigus, Meagan (January 2021, Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop)
Bonial, Claire; Xue, Nianwen (Ed.)
We present AutoAspect, a novel, rule-based annotation tool for labeling tense and aspect. The pilot version annotates English data. The aspect labels are designed specifically for Uniform Meaning Representations (UMR), an annotation schema that aims to encode crosslingual semantic information. The annotation tool combines syntactic and semantic cues to assign aspects on a sentence-by-sentence basis, following a sequence of rules that each output a UMR aspect. Identified events proceed through the sequence until they are assigned an aspect. We achieve a recall of 76.17% for identifying UMR events and an accuracy of 62.57% on all identified events, with high precision values for 2 of the aspect labels.
Full Text Available
Theoretical and Practical Issues in the Semantic Annotation of Four Indigenous Languages

https://doi.org/10.18653/v1/2021.law-1.2

Van Gysel, Jens E.; Vigus, Meagan; Denk, Lukas; Cowell, Andrew; Vallejos, Rosa; O’Gorman, Tim; Croft, William (January 2021, Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop)

Full Text Available
Cross-lingual annotation: a road map for low- and no-resource languages

Vigus, Meagan; Van Gysel, Jens E.; O’Gorman, Tim; Cowell, Andrew; Vallejos, Rosa; Croft, William (January 2020, Proceedings of the Second International Workshop on Designing Meaning Representations (DMR 2020))

This paper presents a “road map” for the annotation of semantic categories in typologically diverse languages, with potentially few linguistic resources, and often no existing computational resources. Past semantic annotation efforts have focused largely on high-resource languages, or relatively low-resource languages with a large number of native speakers. However, there are certain typological traits, namely the synthesis of multiple concepts into a single word, that are more common in languages with a smaller speech community. For example, what is expressed as a sentence in a more analytic language like English may be expressed as a single word in a more synthetic language like Arapaho. This paper proposes solutions for annotating analytic and synthetic languages in a comparable way based on existing typological research, and introduces a road map for the annotation of languages with a dearth of resources.
Full Text Available
A Dependency Structure Annotation for Modality

https://doi.org/10.18653/v1/W19-3321

Vigus, Meagan; Van Gysel, Jens E.; Croft, William (January 2019, Proceedings of the First International Workshop on Designing Meaning Representations (DMR 2019))
Xue, Nianwen; Croft, William; Hajič, Jan; Huang, Chu-Ren; Oepen, Stephan; Palmer, Martha; Pustejovsky, James (Ed.)
This paper presents an annotation scheme for modality that employs a dependency structure. Events and sources (here, conceivers) are represented as nodes and epistemic strength relations characterize the edges. The epistemic strength values are largely based on Saurí and Pustejovsky’s (2009) FactBank, while the dependency structure mirrors Zhang and Xue’s (2018b) approach to temporal relations. Six documents containing 377 events have been annotated by two expert annotators with high levels of agreement.
Full Text Available
Designing a Uniform Meaning Representation for Natural Language Processing

https://doi.org/10.1007/s13218-021-00722-w

Van Gysel, Jens E.; Vigus, Meagan; Chun, Jayeol; Lai, Kenneth; Moeller, Sarah; Yao, Jiarui; O’Gorman, Tim; Cowell, Andrew; Croft, William; Huang, Chu-Ren; et al (April 2021, KI - Künstliche Intelligenz)
In this paper we present Uniform Meaning Representation (UMR), a meaning representation designed to annotate the semantic content of a text. UMR is primarily based on Abstract Meaning Representation (AMR), an annotation framework initially designed for English, but also draws from other meaning representations. UMR extends AMR to other languages, particularly morphologically complex, low-resource languages. UMR also adds features to AMR that are critical to semantic interpretation and enhances AMR by proposing a companion document-level representation that captures linguistic phenomena such as coreference as well as temporal and modal dependencies that potentially go beyond sentence boundaries.
Full Text Available
Cross-linguistic semantic annotation: reconciling the language-specific and the universal

https://doi.org/10.18653/v1/W19-3301

Van Gysel, Jens E.; Vigus, Meagan; Kalm, Pavlína; Lee, Sook-kyung; Regan, Michael; Croft, William (January 2019, Proceedings of the First International Workshop on Designing Meaning Representations (DMR 2019))
Xue, Nianwen; Croft, William; Hajič, Jan; Huang, Chu-Ren; Oepen, Stephan; Palmer, Martha; Pustejovsky, James (Ed.)
Developers of cross-lingual semantic annotation schemes face a number of issues not encountered in monolingual annotation. This paper discusses four such issues, related to the establishment of annotation labels, and the treatment of languages with more fine-grained, more coarse-grained, and cross-cutting categories. We propose that a lattice-like architecture of the annotation categories can adequately handle all four issues, and at the same time remain both intuitive for annotators and faithful to typological insights. This position is supported by a brief annotation experiment.
Full Text Available
