This article expands upon my presentation to the panel on “The Radical Prescription for Change” at the 2017 ASA (American Statistical Association) symposium on A World Beyond $p<0.05$. It emphasizes that, to greatly enhance the reliability of (and hence public trust in) statistical and data scientific findings, we need to take a holistic approach. We need to lead by example, incentivize study quality, and inoculate future generations with profound appreciations for the world of uncertainty and the uncertainty world. The four “radical” proposals in the title, with all their inherent defects and trade-offs, are designed to provoke reactions and actions. First, research methodologies are trustworthy only if they deliver what they promise, even if this means that they have to be overly protective, a necessary trade-off for practicing quality-guaranteed statistics. This guiding principle may compel us to double the variance in some situations, a strategy that also coincides with the call to raise the bar from $p<0.05$ to $p<0.005$ [3]. Second, teaching principled practicality or corner-cutting is a promising strategy to enhance the scientific community’s as well as the general public’s ability to spot, and hence to deter, flawed arguments or findings. A remarkable quick-and-dirty Bayes formula for rare events, which simply divides the prevalence by the sum of the prevalence and the false positive rate (or the total error rate), as featured by the popular radio show Car Talk, illustrates the effectiveness of this strategy. Third, it should be a routine mental exercise to put ourselves in the shoes of those who would be affected by our research findings, in order to combat the tendency to rush to conclusions or overstate confidence in our findings. A pufferfish/selfish test can serve as an effective reminder, and can help to institute the mantra “Thou shalt not sell what thou refuseth to buy” as the most basic professional decency. Considering personal stakes in our statistical endeavors also points to the concept of behavioral statistics, in the spirit of behavioral economics. Fourth, the current mathematical education paradigm that puts “deterministic first, stochastic second” is likely responsible for the general difficulties with reasoning under uncertainty, a situation that can be improved by introducing the concept of a histogram, or rather kidstogram, as early as the concept of counting.
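The quick-and-dirty Bayes formula described in the abstract can be compared against the exact calculation with a short sketch. The function names and the numbers below (a prevalence of 0.1%, a sensitivity of 99%, a false positive rate of 5%) are hypothetical choices for illustration and do not come from the article. The exact version is the standard Bayes' theorem for a binary test; the shortcut is the "prevalence over prevalence plus false positive rate" rule, which should be close only when the event is rare and the test's sensitivity is near 1.

```python
def exact_ppv(prevalence, sensitivity, false_positive_rate):
    """Exact P(condition | positive test) from Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


def quick_ppv(prevalence, false_positive_rate):
    """Rare-event shortcut: prevalence / (prevalence + false positive rate).

    Assumes sensitivity is close to 1 and prevalence is small, so that
    sensitivity * prevalence ~ prevalence and (1 - prevalence) ~ 1.
    """
    return prevalence / (prevalence + false_positive_rate)


# Hypothetical illustrative inputs (not taken from the article).
prevalence = 0.001
sensitivity = 0.99
false_positive_rate = 0.05

print(f"exact: {exact_ppv(prevalence, sensitivity, false_positive_rate):.4f}")
print(f"quick: {quick_ppv(prevalence, false_positive_rate):.4f}")
```

With these inputs both versions give a probability of roughly 2%, which is the point of the shortcut: even a fairly accurate test yields mostly false alarms when the condition is rare. The approximation degrades as prevalence grows or sensitivity drops, so it is best treated as a sanity check rather than a replacement for the exact formula.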
In the AI science boom, beware: your results are only as good as your data
Machine-learning systems are voracious data consumers — but trustworthy results require more vetting both before and after publication.
- Award ID(s): 2020026
- PAR ID: 10508131
- Publisher / Repository: SpringerNature
- Date Published:
- Journal Name: Nature
- ISSN: 0028-0836
- Subject(s) / Keyword(s): artificial intelligence; machine learning; data leakage; scientific reproducibility; scientific rigor
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Available benchmark suites are used to provide realistic workloads and to understand their run-time characteristics. However, they do not necessarily target the same platforms and often offer a diverse set of metrics, leading to the lack of a knowledge base that could be used for both systems and theoretical research. RT-Bench, a new benchmark framework environment, tries to address these issues by providing a uniform interface and metrics while maintaining portability. This demo illustrates how to leverage this framework and its recently added features to improve the understanding of the benchmarks’ interaction with their system.
Individual Development Plans (IDPs) have been used to support the career and professional development of graduate students across disciplines. This interactive session focuses on how IDPs can help you build your skills, form your professional identity, and take control of your career through an iterative process of self‐assessment, career exploration, decision making, and goal setting. To provide an introduction to the IDP process, attendees will be able to take a self‐assessment to learn about their particular strengths and begin to target their strengths toward their professional goals. Additionally, a co‐developer and researcher of IDP platforms will highlight the aspects that can help your IDP be more effective than others.

