Title: Improving Data Quality in Clinical Research Informatics Tools
Maintaining data quality is a fundamental requirement for any successful, long-term data management effort. Providing high-quality, reliable, and statistically sound data is a primary goal of clinical research informatics, and effective data governance and management are essential to ensuring accurate data counts, reports, and validation. As a crucial step in the clinical research process, organizations must establish and maintain organization-wide standards for data quality management to ensure consistency across all systems designed primarily for cohort identification, which allow users to perform an enterprise-wide search of a clinical research data repository to determine whether a set of patients meeting certain inclusion or exclusion criteria exists. Some of these clinical research tools are referred to as de-identified data tools. Assessing and improving the quality of the data used by clinical research informatics tools is both important and difficult. For the growing number of users who rely on information as one of their most important assets, enforcing high data quality levels represents a strategic investment to preserve the value of the data. In clinical research informatics, better data quality translates into better research results and better patient care. However, achieving high-quality data standards is a major task because of the variety of ways errors can be introduced into a system and the difficulty of correcting them systematically. Data quality problems tend to fall into two categories. The first concerns inconsistency among data resources, such as format, syntax, and semantic inconsistencies. The second concerns poor ETL and data mapping processes. In this paper, we describe a real-life case study on assessing and improving data quality at a healthcare organization.
This paper compares the results obtained from two de-identified data systems, i2b2 and Epic SlicerDicer, discusses the data quality dimensions specific to the clinical research informatics context, and examines the possible data quality issues between the de-identified systems. This work aims to propose steps and rules for maintaining data quality across different systems, to help data managers, information systems teams, and informaticists at any healthcare organization monitor and sustain data quality as part of their business intelligence, data governance, and data democratization processes.
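The cross-system comparison described above can be automated as a recurring check. The sketch below is a minimal, hypothetical illustration (the concept names, counts, and 5% tolerance are invented for this example, not taken from the paper) of flagging cohort-count discrepancies between two de-identified query systems:

```python
# Hypothetical sketch: cross-checking patient counts reported by two
# de-identified query systems for the same concepts. All names, counts,
# and the tolerance threshold are illustrative assumptions.
def compare_counts(counts_a, counts_b, tolerance=0.05):
    """Flag concepts whose patient counts diverge by more than `tolerance`
    (relative difference) between the two systems."""
    discrepancies = {}
    for concept in counts_a.keys() & counts_b.keys():
        a, b = counts_a[concept], counts_b[concept]
        denom = max(a, b)
        if denom and abs(a - b) / denom > tolerance:
            discrepancies[concept] = (a, b)
    return discrepancies

i2b2_counts = {"diabetes": 12040, "hypertension": 25110}
slicerdicer_counts = {"diabetes": 11980, "hypertension": 23400}
print(compare_counts(i2b2_counts, slicerdicer_counts))
# → {'hypertension': (25110, 23400)}
```

Small relative differences are expected between de-identified systems (refresh timing, de-identification rules), so a tolerance band rather than exact equality is the more practical validation rule.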
Award ID(s):
1946391
NSF-PAR ID:
10421457
Author(s) / Creator(s):
Date Published:
Journal Name:
Frontiers in Big Data
Volume:
5
ISSN:
2624-909X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper reflects on the significance of ABET’s “maverick evaluators” and what it says about the limits of accreditation as a mode of governance in US engineering education. The US system of engineering education operates as a highly complex system, whose diversity is an asset to robust knowledge production and the production of a varied workforce. ABET Inc., the principal accreditation agency for engineering degree programs in the US, attempts to uphold a set of professional standards for engineering education using a voluntary, peer-based system of evaluation. Key to their approach is a volunteer army of trained program evaluators (PEVs) assigned by the engineering professional societies, who serve as the frontline workers responsible for auditing the content, learning outcomes, and continuous improvement processes utilized by every engineering degree program accredited by ABET. We look specifically at those who become labeled “maverick evaluators” in order to better understand how this system functions, and to understand its limitations as a form of governance in maintaining educational quality and appropriate professional standards within engineering education. ABET was established in 1932 as the Engineers’ Council for Professional Development (ECPD). The Cold War consensus around the engineering sciences led to a more quantitative system of accreditation first implemented in 1956. However, the decline of the Cold War and rising concerns about national competitiveness prompted ABET to shift to a more neoliberal model of accountability built around outcomes assessment and modeled after total quality management / continuous process improvement (TQM/CPI) processes that nominally gave PEVs greater discretion in evaluating engineering degree programs. However, conflicts over how the PEVs exercised judgment point to conservative aspects in the structure of the ABET organization, and within the engineering profession at large.
This paper and the phenomena we describe here are one part of a broader, interview-based study of higher education governance and engineering educational reform within the United States. We have conducted over 300 interviews at more than 40 different academic institutions and professional organizations, where ABET and institutional responses to the reforms associated with “EC 2000,” which brought outcomes assessment to engineering education, are extensively discussed. The phenomenon of so-called “maverick evaluators” reveals the divergent professional interests that remain embedded within ABET and the engineering profession at large. Those associated with Civil and Environmental Engineering, and to a lesser extent Mechanical Engineering, continue to push for higher standards of accreditation grounded in a stronger vision for their professions. While the phenomenon is complex and more subtle than we can summarize in an abstract, “maverick evaluators” emerged as a label for PEVs who interpreted their role, including determinations about whether certain content was “appropriate to the field of study,” utilizing professional standards that lay outside of the consensus position held by the majority of the members of the Engineering Accreditation Commission. This, conjoined with the engineers’ epistemic aversion to uncertainty and concerns about the legal liability of their decisions, resulted in a narrower interpretation of key accreditation criteria. The organization then designed and used a “due-process” review process to discipline identified shortcomings in order to limit divergent interpretations. The net result is that the bureaucratic process ABET built to obtain uniformity in accreditation outcomes simultaneously blunts the organization’s capacity to support varied interpretations of professional standards at the program level.
The apparatus has also contributed to ABET’s reputation as an organization focused on minimum standards, as opposed to one that functions as an effective driver for further change in engineering education. 
  2. Rather than treating symptoms of a destructive agri-food system, agricultural policy, research, and advocacy need both to address the root causes of dysfunction and to learn from longstanding interventions to counter it. Specifically, this paper focuses on agricultural parity policies – farmer-led, government-enacted programs to secure a price floor and manage supply to prevent the economic and ecological devastation of unfettered corporate agro-capitalism. Though these programs remain off the radar in dominant policy, scholarship, and civil society activism, in the past few years vast swaths of humanity have mobilized in India to call for agri-food systems transformation through farmgate pricing and market protections. This paper asks what constitutes true farm justice and how it could be updated and expanded as an avenue for radically reimagining agriculture and thus food systems at large. Parity refers to both a pricing ratio to ensure livelihood and a broader farm justice movement built on principles of fair farmgate prices and cooperatively coordinated supply management. The programs and principles are now mostly considered “radical,” deemed inefficient, irrelevant, obsolete, and grievous government overreach—but from the vantage, we argue, of a system that profits from commodity crop overproduction and agroindustry consolidation. However, by examining parity through a producer-centric lens cognizant of farmers’ ability, desire, and need to care for the land, ideas of price protection and supply coordination become foundational, so that farmers can make a dignified livelihood stewarding land and water while producing nourishing food. This paradox—that an agricultural governance principle can seem both radical and common sense, far-fetched and pragmatic—deserves attention and analysis.
As overall numbers of farmers decline in Global North contexts, their voices dwindle from these conversations, leaving space for worldviews favoring de-agrarianization altogether. In Global South contexts maintaining robust farming populations, such policies of deliberate de-agrarianization belie an aggression toward rural and peasant ways of life and land tenure. Alongside the history of parity programs, principles, and movements in the U.S., the paper examines a vast version of a parity program in India – the Minimum Support Price (MSP) system, which Indian farmers defended and now struggle to expand into a legal right. From East India to the plains of the United States and beyond, parity principles and programs have the potential to offer a pragmatic direction for countering global agro-industrial corporate capture, along with its de-agrarianization and environmental destruction. The paper explores the what and why of parity programs and movements, even as it addresses the complexity of how international parity agreements would unfold. It ends with the need for global supply coordination grounded in food sovereignty and solidarity, and thus the methodological urgency of centering farm justice and agrarian expertise.
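The "pricing ratio" the abstract mentions has a conventional arithmetic form: the U.S. parity ratio is the index of prices farmers receive divided by the index of prices they pay, scaled to 100, relative to a base period (historically 1910–1914). A minimal sketch, with invented index values for illustration:

```python
def parity_ratio(prices_received_index, prices_paid_index):
    """Parity ratio: prices-received index over prices-paid index, x 100.
    A value below 100 means farmgate prices have fallen behind input costs
    relative to the base period (historically 1910-1914 in the U.S.)."""
    return 100 * prices_received_index / prices_paid_index

# Illustrative index values, not actual USDA figures:
print(parity_ratio(90, 120))  # → 75.0
```

A ratio of 75 would mean farmers' purchasing power per unit sold is three-quarters of what it was in the base period, which is the gap that price-floor and supply-management programs aim to close.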
  3. There is growing recognition that unambiguous citation and tracking of physical samples allow previously impossible linking of samples to data and publications, enable linking and integration of sample-based observations across data systems, and pave the road towards advanced data mining of sample-based data. In recent years, there has been an uptake in the use of Persistent Identifiers (PIDs) for physical samples to support such citation and tracking. The IGSN (International Geo Sample Number) is a PID for physical samples. It was originally developed for the solid earth sciences and has evolved into an international PID system with members in five continents and a network of active allocating agents. It has been adopted by a growing number and range of stakeholders worldwide, including national geological surveys, research infrastructure providers, collection curators, researchers, and data managers, and by other disciplines that need to refer to physical samples. Nearly 6.9 million samples have been registered with IGSNs so far. The IGSN system uses the Handle System (Kahn and Wilensky 1995; see also Handle.Net ® ) and has an international organization, IGSN e.V., to manage its governance structure and technical architecture. The recent expansion of the IGSN beyond the geosciences into other domains such as biodiversity, archeology, and material sciences confirms the power of its concept and implementation, but imposes substantial pressures on the existing capacity and capabilities of the IGSN architecture and its governing organization. Modifications to the IGSN organizational and technical architecture are necessary at this point to keep pace with the growing demand and expectations. These changes are also necessary to ensure trustworthy and sustainable services for PID registration and resolution in a maturing research data ecosystem.
The essential criteria for a trustworthy system include an organizational foundation that ensures longevity, sustainability, proper governance, and regular quality assessment of registration services. It also includes a reliable and secure technical platform, based on open standards, which is sufficiently scalable and flexible to accommodate the growing diversity of specimen types, use cases, and stakeholder requirements. In 2018, a major planning project for the IGSN was funded by the Alfred P. Sloan Foundation. An international group of experts participates in re-designing and improving the existing organization and technical architecture of the IGSN system, revising the current business model of the IGSN e.V. and professionalizing its operations. The goal is for the IGSN system to be able to respond to, and support in a sustainable manner, the rapidly growing demands of a global and increasingly multi-disciplinary user community, and to ensure that the IGSN will be a trustworthy, stable, and adaptable persistent identifier system for material samples, both technically and organizationally. The end result should also satisfy and facilitate participation across research domains, and will be a reliable component of the evolving research data ecosystem. Finally, it will ensure that the IGSN is recognized as a trusted partner by data infrastructure providers and the science community alike. 
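Since the abstract notes that IGSN is built on the Handle System, resolution can be sketched as constructing a Handle proxy URL. The snippet below is an assumption-laden illustration: the `hdl.handle.net` proxy is the Handle System's conventional resolver, the `10273` handle prefix and the character-set regex are assumptions to verify against current IGSN documentation, and the sample IGSN is invented:

```python
import re

# Assumed permissible IGSN characters; check the IGSN spec before relying on this.
IGSN_PATTERN = re.compile(r"^[A-Za-z0-9.\-]+$")

def igsn_resolver_url(igsn, prefix="10273"):
    """Build a Handle System proxy URL for an IGSN.
    The 10273 prefix and hdl.handle.net proxy are assumptions based on the
    Handle-based design described above, not taken from this abstract."""
    if not IGSN_PATTERN.match(igsn):
        raise ValueError(f"malformed IGSN: {igsn!r}")
    return f"https://hdl.handle.net/{prefix}/{igsn}"

print(igsn_resolver_url("SSH000SUA"))  # → https://hdl.handle.net/10273/SSH000SUA
```

Keeping resolution behind a stable proxy like this is what lets the organizational and technical re-design discussed above proceed without breaking the millions of identifiers already registered.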
  4. Addressing the challenges of sustainable and equitable city management in the 21st century requires innovative solutions and integration from a range of dedicated actors. In order to form and fortify partnerships of multi-sectoral collaboration, expand effective governance, and build collective resiliency, it is important to understand the network of existing stewardship organizations. The term ‘stewardship’ encompasses a spectrum of local agents dedicated to the evolving process of community care and restoration. Groups involved in stewardship across Baltimore are catalysts of change through a variety of conservation, management, monitoring, transformation, education, and advocacy activities for the local environment – many with common goals of joint resource management, distributive justice, and community power sharing. The “environment” here is intentionally broadly defined as land, air, water, energy and more. The Stewardship Mapping and Assessment Project (STEW-MAP) is a method of data collection and visualization that tracks the characteristics of organizations and their financial and informational flows across sectors and geographic boundaries. The survey includes questions about three facets of environmental stewardship groups: 1) organizational characteristics, 2) collaboration networks, and 3) stewardship “turfs” where each organization works. The data have been analyzed alongside landcover and demographic data and used in multi-city studies incorporating similar datasets across major urban areas of the U.S. Additional information about the growing network of cities conducting STEW-MAP can be found here: https://www.nrs.fs.usda.gov/STEW-MAP/ 
Romolini, Michele; Grove, J. Morgan; Locke, Dexter H. 2013. Assessing and comparing relationships between urban environmental stewardship networks and land cover in Baltimore and Seattle. Landscape and Urban Planning. 120: 190-207. https://www.fs.usda.gov/research/treesearch/44985 
Johnson, M.; Locke, D. H.; Svendsen, E.; Campbell, L.; Westphal, L. M.; Romolini, M.; Grove, J. 2019. Context matters: influence of organizational, environmental, and social factors on civic environmental stewardship group intensity. Ecology and Society 24(4): 1. https://doi.org/10.5751/ES-10924-240401 
Ponte, S. 2023. Social-ecological processes and dynamics of urban forests as green stormwater infrastructure in Maryland, USA. Doctoral dissertation, University of Maryland, College Park, MD.
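The "collaboration networks" facet of the survey yields an edge list of who works with whom, from which simple network summaries follow. A minimal sketch, with invented organization names standing in for survey responses (not actual STEW-MAP data):

```python
from collections import defaultdict

# Hypothetical sketch: degree counts for a STEW-MAP-style collaboration
# network. Organization names and ties are invented for illustration.
edges = [
    ("Watershed Alliance", "City Parks Dept"),
    ("Watershed Alliance", "Harbor Cleanup Corps"),
    ("City Parks Dept", "Harbor Cleanup Corps"),
    ("Watershed Alliance", "Tree Stewards"),
]

degree = defaultdict(int)
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

# Organizations ranked by number of collaboration ties:
for org, d in sorted(degree.items(), key=lambda kv: -kv[1]):
    print(org, d)
```

Even this crude degree ranking surfaces the highly connected "broker" organizations that studies like Romolini et al. (2013) relate to land cover and governance outcomes.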
  5. CitSci.org is a global citizen science software platform and support organization housed at Colorado State University. The mission of CitSci is to help people do high quality citizen science by amplifying impacts and outcomes. This platform hosts over one thousand projects and a diverse volunteer base that has amassed over one million observations of the natural world, focused on biodiversity and ecosystem sustainability. It is a custom platform built using open source components including PostgreSQL, Symfony, and Vue.js, with React Native for the mobile apps. CitSci sets itself apart from other citizen science platforms through the flexibility in the types of projects it supports rather than having a singular focus. This flexibility allows projects to define their own datasheets and methodologies. The diversity of programs we host motivated us to take a founding role in the design of the PPSR Core, a set of global, transdisciplinary data and metadata standards for use in Public Participation in Scientific Research (Citizen Science) projects. Through an international partnership between the Citizen Science Association, European Citizen Science Association, and Australian Citizen Science Association, the PPSR team and associated standards enable interoperability of citizen science projects, datasets, and observations. Here we share our experience over the past 10+ years of supporting biodiversity research both as developers of the CitSci.org platform and as stewards of, and contributors to, the PPSR Core standard. Specifically, we share details about: the origin, development, and informatics infrastructure for CitSci; our support for biodiversity projects such as population and community surveys; our experiences in platform interoperability through PPSR Core working with the Zooniverse, SciStarter, and CyberTracker; data quality; and data sharing goals and use cases. We conclude by sharing overall successes, limitations, and recommendations as they pertain to trust and rigor in citizen science data sharing and interoperability. As the scientific community moves forward, we show that citizen science is a key tool for enabling a systems-based approach to ecosystem problems.
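The interoperability the abstract describes amounts to mapping each platform's internal project records onto shared metadata fields. The sketch below is purely illustrative: the field names approximate the idea of a PPSR-Core-style project record but are not the authoritative PPSR Core schema, and the internal record is invented:

```python
# Hypothetical sketch: mapping an internal project record to a minimal
# PPSR-Core-style metadata dict. Field names are illustrative assumptions,
# not the authoritative PPSR Core schema.
def to_ppsr_project(record):
    return {
        "projectId": record["id"],
        "name": record["title"],
        "description": record.get("summary", ""),
        "keywords": record.get("tags", []),
    }

internal = {"id": "csu-042", "title": "Urban Pollinator Watch",
            "tags": ["biodiversity", "pollinators"]}
print(to_ppsr_project(internal))
```

Because every participating platform (CitSci, Zooniverse, SciStarter, CyberTracker) writes a mapping like this once, projects and observations become exchangeable without any pairwise integrations.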