skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A recommendation system for scientific water data
Abstract We describe a recommendation system for HydroShare, a platform for scientific water data sharing. We discuss similarities, differences and challenges for implementing recommendation systems for scientific water data sharing. We discuss and analyze the behaviors that scientists exhibit in using HydroShare as documented by users’ activity logs. Unlike entertainment system users, users on HydroShare tend to be task-oriented, where the set of tasks of interest can change over time, and older interests are sometimes no longer relevant. By validating recommendation approaches against user behavior as expressed in activity logs, we conclude that a combination of content-based filtering and a latent Dirichlet allocation (LDA) topic modeling of user behavior—rather than and instead of LDA classification of dataset topics—provides a workable solution for HydroShare and compares this approach to existing recommendation methods.  more » « less
Award ID(s):
1664061 1664018
PAR ID:
10387619
Author(s) / Creator(s):
;
Publisher / Repository:
Springer Science + Business Media
Date Published:
Journal Name:
International Journal of Data Science and Analytics
Volume:
12
Issue:
1
ISSN:
2364-415X
Page Range / eLocation ID:
p. 61-75
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    In recent years, data and file sharing have advanced significantly, opening doors for engineers from all over the world to stay connected with each other and share data, models, scripts and other information required for scientific and engineering purposes. HydroShare (www.hydroshare.org) was developed by a consortium of universities sponsored by the National Science Foundation (NSF) as a means for improving data and model sharing. Originally released in 2014, and continually updated since that time, HydroShare has proven to be a valuable resource for a growing number of active users in the field of water resources and environmental research. The graphical user interface is relatively simple and easy to understand and the system provides users with a large amount of free data storage, which makes it particularly useful for academics, researchers, and scientists as well as practicing engineers. This project report presents the design and development of a web-based application (web app) that demonstrates all core functions of HydroShare via a published application programmer interface (API). The resulting web app was developed using the Tethys Platform which is intended for creating web-based applications with database and mapping capabilities. This app demonstrates the use of all of the core functions of the HydroShare Python REST client and includes sample code and instructions for using these functions. The overarching goal of this work is to increase the use and usability of HydroShare via its API and to simplify using the API for student and other programmers developing their own web applications. 
    more » « less
  2. Abstract Many have argued that datasets resulting from scientific research should be part of the scholarly record as first class research products. Data sharing mandates from funding agencies and scientific journal publishers along with calls from the scientific community to better support transparency and reproducibility of scientific research have increased demand for tools and support for publishing datasets. Hydrology domain‐specific data publication services have been developed alongside more general purpose and even commercial data repositories. Prominent among these are the Hydrologic Information System (HIS) and HydroShare repositories developed by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI). More broadly, however, multiple organizations have been involved in the practice of data publication in the hydrology domain, each having different roles that have shaped data publication and reuse. Bibliographic and archival approaches to data publication have been advanced, but both have limitations with respect to hydrologic data. Specific recommendations for improving data publication infrastructure, support, and practices to move beyond existing limitations and enable more effective data publication in support of scientific research in the hydrology domain include: improving support for journal article‐based data access and data citation, considering the workflow for data publication, enhancing support for reproducible science, encouraging publication of curated reference data collections, advancing interoperability standards for sharing data and metadata among repositories, developing partnerships with university libraries offering data services, and developing more specific data management plans. While presented in the context of CUAHSI's data repositories and experience, these recommendations are broadly applicable to other domains. This article is categorized under:Science of Water > Methods 
    more » « less
  3. With the increase in volume of daily online news items, it is more and more difficult for readers to identify news articles relevant to their interests. Thus, effective recommendation systems are critical for an effective user news consumption experience. Existing news recommendation methods usually rely on the news click history to model user interest. However, there are other signals about user behaviors, such as user commenting activity, which have not been used before. We propose a recommendation algorithm that predicts articles a user may be interested in, given her historical sequential commenting behavior on news articles. We show that following this sequential user behavior the news recommendation problem falls into in the class of session-based recommendation. The techniques in this class seek to model users' sequential and temporal behaviors. While we seek to follow the general directions in this space, we face unique challenges specific to news in modeling temporal dynamics, e.g., users' interests shift over time, users comment irregularly on articles, and articles are perishable items with limited lifespans. We propose a recency-regularized neural attentive framework for session-based news recommendation. The proposed method is able to capture the temporal dynamics of both users and news articles, while maintaining interpretability. We design a lag-aware attention and a recency regularization to model the time effect of news articles and comments. We conduct extensive empirical studies on 3 real-world news datasets to demonstrate the effectiveness of our method. 
    more » « less
  4. Fitness trackers are undoubtedly gaining in popularity. As fitness-related data are persistently captured, stored, and processed by these devices, the need to ensure users’ privacy is becoming increasingly urgent. In this paper, we apply a data-driven approach to the development of privacy-setting recommendations for fitness devices. We first present a fitness data privacy model that we defined to represent users’ privacy preferences in a way that is unambiguous, compliant with the European Union’s General Data Protection Regulation (GDPR), and able to represent both the user and the third party preferences. Our crowdsourced dataset is collected using current scenarios in the fitness domain and used to identify privacy profiles by applying machine learning techniques. We then examine different personal tracking data and user traits which can potentially drive the recommendation of privacy profiles to the users. Finally, a set of privacy-setting recommendation strategies with different guidance styles are designed based on the resulting profiles. Interestingly, our results show several semantic relationships among users’ traits, characteristics, and attitudes that are useful in providing privacy recommendations. Even though several works exist on privacy preference modeling, this paper makes a contribution in modeling privacy preferences for data sharing and processing in the IoT and fitness domain, with specific attention to GDPR compliance. Moreover, the identification of well-identified clusters of preferences and predictors of such clusters is a relevant contribution for user profiling and for the design of interactive recommendation strategies that aim to balance users’ control over their privacy permissions and the simplicity of setting these permissions. 
    more » « less
  5. Hashtags can greatly facilitate content navigation and improve user engagement in social media. Meaningful as it might be, recommending hashtags for photo sharing services such as Instagram and Pinterest remains a daunting task due to the following two reasons. On the endogenous side, posts in photo sharing services often contain both images and text, which are likely to be correlated with each other. Therefore, it is crucial to coherently model both image and text as well as the interaction between them. On the exogenous side, hashtags are generated by users and different users might come up with different tags for similar posts, due to their different preference and/or community effect. Therefore, it is highly desirable to characterize the users’ tagging habits. In this paper, we propose an integral and effective hashtag recommendation approach for photo sharing services. In particular, the proposed approach considers both the endogenous and exogenous effects by a content modeling module and a habit modeling module, respectively. For the content modeling module, we adopt the parallel co-attention mechanism to coherently model both image and text as well as the interaction between them; for the habit modeling module, we introduce an external memory unit to characterize the historical tagging habit of each user. The overall hashtag recommendations are generated on the basis of both the post features from the content modeling module and the habit influences from the habit modeling module. We evaluate the proposed approach on real Instagram data. The experimental results demonstrate that the proposed approach significantly outperforms the state-of-theart methods in terms of recommendation accuracy, and that both content modeling and habit modeling contribute significantly to the overall recommendation accuracy. 
    more » « less