Title: A Panel Data Set of Cryptocurrency Development Activity on GitHub
Cryptocurrencies are a significant development in recent years, featuring in global news, the financial sector, and academic research. They also hold a significant presence in open source development, comprising some of the most popular repositories on GitHub. Their openly developed software artifacts thus present a unique and exclusive avenue to quantitatively observe human activity, effort, and software growth for cryptocurrencies. Our data set marks the first concentrated effort toward high-fidelity panel data of cryptocurrency development for a wide range of metrics. The data set is foremost a quantitative measure of developer activity for budding open source cryptocurrency development. We collect metrics like daily commits, contributors, lines of code changes, stars, forks, and subscribers. We also include financial data for each cryptocurrency: the daily price and market capitalization. The data set includes data for 236 cryptocurrencies for 380 days (roughly January 2018 to January 2019). We discuss particularly interesting research opportunities for this combination of data, and release new tooling to enable continuing data collection for future research opportunities as development and application of cryptocurrencies mature.
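As a rough illustration of the panel layout the abstract describes (one observation per cryptocurrency per day), the sketch below models a few rows and aggregates commits per coin. The field names, coin identifiers, and values are hypothetical and are not taken from the released data set or its tooling. If every coin had an observation on every day, the full panel would hold 236 × 380 = 89,680 rows.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One panel row: one cryptocurrency on one day (field names are hypothetical)."""
    coin: str
    day: date
    commits: int
    contributors: int
    stars: int
    price_usd: float


def total_commits_by_coin(rows):
    """Sum daily commits per coin across the whole observation window."""
    totals = defaultdict(int)
    for row in rows:
        totals[row.coin] += row.commits
    return dict(totals)


# Made-up values for illustration only; not taken from the data set.
sample = [
    Observation("coin_a", date(2018, 1, 21), 4, 2, 100, 1.50),
    Observation("coin_a", date(2018, 1, 22), 6, 3, 101, 1.45),
    Observation("coin_b", date(2018, 1, 21), 0, 0, 12, 0.02),
    Observation("coin_b", date(2018, 1, 22), 1, 1, 12, 0.02),
]

totals = total_commits_by_coin(sample)
```

Keying each row by the (coin, day) pair is what makes this a panel rather than a single time series: the same metrics can be compared across coins on one day or followed over time for one coin.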
Award ID(s):
1750116
PAR ID:
10135838
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)
Page Range / eLocation ID:
186 to 190
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Interest in cryptocurrencies has surged in recent years. Today thousands of currencies are in circulation, collectively worth hundreds of billions of dollars. Software vulnerabilities have also proliferated, which poses new and unique challenges to the ecosystem as it has developed. This review article explains what is different about vulnerabilities and responsible disclosure in cryptocurrencies, identifying key problems and opportunities for research and development. Selected case studies of vulnerability disclosures are presented. We draw lessons and pose open questions that can inform the responsible disclosure debate in cryptocurrencies and beyond. 
  2. Cryptocurrencies and the underpinning blockchain technology have gained unprecedented public attention recently. In contrast to fiat currencies, transactions of cryptocurrencies, such as Bitcoin and Litecoin, are permanently recorded on distributed ledgers to be seen by the public. As a result, public availability of all cryptocurrency transactions allows us to create a complex network of financial interactions that can be used to study not only the blockchain graph, but also the relationship between various blockchain network features and cryptocurrency risk investment. We introduce a novel concept of chainlets, or blockchain motifs, to utilize this information. Chainlets allow us to evaluate the role of local topological structure of the blockchain on the joint Bitcoin and Litecoin price formation and dynamics. We investigate the predictive Granger causality of chainlets and identify certain types of chainlets that exhibit the highest predictive influence on cryptocurrency price and investment risk. More generally, while statistical aspects of blockchain data analytics remain virtually unexplored, the paper aims to highlight various emerging theoretical, methodological and applied research challenges of blockchain data analysis that will be of interest to the broad statistical community. The Canadian Journal of Statistics 48: 561–581; 2020 © 2020 Statistical Society of Canada
  3. Many previous studies have shown that open-source technologies help democratize information and foster collaborations to enable addressing global physical and societal challenges. The outbreak of the novel coronavirus has imposed unprecedented challenges to human society. It affects every aspect of livelihood, including health, environment, transportation, and economy. Open-source technologies provide a new ray of hope to collaboratively tackle the pandemic. The role of open source is not limited to sharing a source code. Rather open-source projects can be adopted as a software development approach to encourage collaboration among researchers. Open collaboration creates a positive impact in society and helps combat the pandemic effectively. Open-source technology integrated with geospatial information allows decision-makers to make strategic and informed decisions. It also assists them in determining the type of intervention needed based on geospatial information. The novelty of this paper is to standardize the open-source workflow for spatiotemporal research. The highlights of the open-source workflow include sharing data, analytical tools, spatiotemporal applications, and results and formalizing open-source software development. The workflow includes (i) developing open-source spatiotemporal applications, (ii) opening and sharing the spatiotemporal resources, and (iii) replicating the research in a plug and play fashion. Open data, open analytical tools and source code, and publicly accessible results form the foundation for this workflow. This paper also presents a case study with the open-source spatiotemporal application development for air quality analysis in California, USA. In addition to the application development, we shared the spatiotemporal data, source code, and research findings through the GitHub repository. 
  4. Over the past 20 years, the explosion of genomic data collection and the cloud computing revolution have made computational and data science research accessible to anyone with a web browser and an internet connection. However, students at institutions with limited resources have received relatively little exposure to curricula or professional development opportunities that lead to careers in genomic data science. To broaden participation in genomics research, the scientific community needs to support these programs in local education and research at underserved institutions (UIs). These include community colleges, historically Black colleges and universities, Hispanic-serving institutions, and tribal colleges and universities that support ethnically, racially, and socioeconomically underrepresented students in the United States. We have formed the Genomic Data Science Community Network to support students, faculty, and their networks to identify opportunities and broaden access to genomic data science. These opportunities include expanding access to infrastructure and data, providing UI faculty development opportunities, strengthening collaborations among faculty, recognizing UI teaching and research excellence, fostering student awareness, developing modular and open-source resources, expanding course-based undergraduate research experiences (CUREs), building curriculum, supporting student professional development and research, and removing financial barriers through funding programs and collaborator support. 
  5. A significant challenge in blockchain and cryptocurrencies is protecting private keys from potential hackers, because nobody can roll back a transaction made with a stolen key once the blockchain network confirms the transaction. The technical solution for protecting private keys is a cryptocurrency wallet: software, hardware, or a combination of the two that manages the keys. In this paper, we propose a multilayered architecture for cryptocurrency wallets based on a Defense-in-Depth strategy to protect private keys with a balance between convenience and security. The user protects the private keys in three restricted layers with different protection mechanisms. As a result, a single breach cannot threaten the entire fund, and the user gains time to respond. We implement a proof-of-concept of our proposed architecture on both a smart card hardware wallet and an Android smartphone wallet with no performance penalty. Furthermore, we analyze the security of our proposed architecture with two adversary models.