Abstract Data reduction methods are frequently employed in large genomics and phenomics studies to extract core patterns, reduce dimensionality, and alleviate multiple testing effects. Principal component analysis (PCA), in particular, identifies the components that capture the most variance within omics datasets. While data reduction can simplify complex datasets, it remains unclear how the use of PCA impacts downstream analyses such as quantitative trait loci (QTL) or genome-wide association (GWA) approaches and their biological interpretation. In QTL studies, an alternative to data reduction is the use of post-hoc data summarization approaches, such as hotspot analysis, which involves mapping individual traits and consolidating results based on shared genomic locations. To evaluate how different analytical approaches may alter the biological insights derived from multi-dimensional QTL datasets, we compared individual trait hotspots with PCA-based QTL mapping using transcriptomic and metabolomic data from a structured recombinant inbred line population. Interestingly, these two approaches identified different genomic regions and genetic architectures. These findings suggest that mapping PCA-reduced data does not merely streamline analyses but may generate a fundamentally different view of the underlying genetic architecture compared to individual trait mapping and hotspot analysis. Thus, the use of PCA and other data reduction techniques prior to QTL or GWAS mapping should be carefully considered to ensure alignment with the specific biological question being addressed.
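As a hedged illustration of the workflow the abstract contrasts, the sketch below runs PCA on a simulated trait matrix and then relates the first component to a single marker. All data, effect sizes, and the correlation-based effect measure are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 lines x 50 traits, one biallelic marker (0/1).
n_lines, n_traits = 200, 50
marker = rng.integers(0, 2, size=n_lines)
traits = rng.normal(size=(n_lines, n_traits))
traits[:, :10] += marker[:, None] * 1.5   # marker shifts the first 10 traits

# PCA via SVD on the centered trait matrix.
centered = traits - traits.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
pc_scores = U * S                          # per-line scores on each component

# "Map" PC1 against the marker: squared correlation as a simple effect measure.
r = np.corrcoef(pc_scores[:, 0], marker)[0, 1]
print(f"PC1 share of variance: {S[0]**2 / (S**2).sum():.1%}; r^2 with marker: {r**2:.2f}")
```

Mapping each of the 50 traits individually and then looking for co-located signals (the hotspot approach) would interrogate the same matrix very differently, which is the contrast the study quantifies.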
Loki: Streamlining Integration and Enrichment
Data scientists frequently transform data from one form to another while cleaning, integrating, and enriching datasets. Writing such transformations, or “mapping functions”, is time-consuming and often involves significant code re-use. Unfortunately, when every dataset is slightly different from the last, finding the right mapping functions to re-use can be equally difficult. In this paper, we propose “Link Once and Keep It” (Loki), a system which consists of a repository of datasets and mapping functions and relates new datasets to datasets it already knows about, helping a data scientist to quickly locate and re-use mapping functions she developed for other datasets in the past. Loki represents a first step towards building and re-using repositories of domain-specific data integration pipelines.
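A minimal sketch of the core idea, assuming a repository indexed by column signatures and a Jaccard-overlap heuristic for matching new datasets; the class and method names are illustrative, not Loki's actual API.

```python
def jaccard(a, b):
    """Overlap between two column-name sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

class MappingRepo:
    """Hypothetical repository: mapping functions keyed by the column
    signature of the dataset each one was written for."""
    def __init__(self):
        self._entries = []   # (column_signature, mapping_fn)

    def register(self, columns, fn):
        self._entries.append((frozenset(columns), fn))

    def suggest(self, columns):
        """Return the stored mapping function whose source schema best
        overlaps the new dataset's columns."""
        return max(self._entries, key=lambda e: jaccard(e[0], columns))[1]

repo = MappingRepo()
repo.register({"first_name", "last_name"},
              lambda r: r["first_name"] + " " + r["last_name"])
repo.register({"lat", "lon"},
              lambda r: (float(r["lat"]), float(r["lon"])))

# A new dataset with an extra column still matches the name-merging function.
fn = repo.suggest({"first_name", "last_name", "email"})
print(fn({"first_name": "Ada", "last_name": "Lovelace"}))  # → Ada Lovelace
```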
- PAR ID:
- 10208623
- Editor(s):
- Abouzied, Azza; Amer-Yahia, Sihem; Ives, Zachary
- Date Published:
- Journal Name:
- Human in the Loop Data Analytics
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Registering functions (curves) using time warpings (re-parameterizations) is central to many computer vision and shape analysis solutions. While traditional registration methods minimize a penalized L2 norm, the elastic Riemannian metric and square-root velocity functions (SRVFs) have resulted in significant improvements in terms of theory and practical performance. This solution uses the dynamic programming algorithm to minimize the L2 norm between SRVFs of given functions. However, the computational cost of this elastic dynamic programming framework, O(nT²k), where T is the number of time samples along curves, n is the number of curves, and k < T is a parameter, limits its use in applications involving big data. This paper introduces a deep-learning approach, named SRVF Registration Net or SrvfRegNet, to overcome these limitations. The SrvfRegNet architecture trains by optimizing the elastic metric-based objective function on the training data and then applies this trained network to the test data to perform fast registration. In case the training and the test data are from different classes, it generalizes to the test data using transfer learning, i.e., retraining only the last few layers of the network. It achieves state-of-the-art alignment performance albeit at much reduced computational cost. We demonstrate the efficiency and efficacy of this framework using several standard curve datasets.
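A minimal sketch of the SRVF transform the abstract builds on, q = f′ / √|f′|, computed on a sampled curve and its warped copy; the epsilon guard, the example curves, and the unaligned RMS distance are illustrative assumptions, not the paper's method.

```python
import numpy as np

def srvf(f, t):
    """Square-root velocity function of a sampled curve f(t):
    q = f' / sqrt(|f'|), with a small floor guarding flat segments."""
    df = np.gradient(f, t)
    return df / np.sqrt(np.maximum(np.abs(df), 1e-8))

t = np.linspace(0.0, 1.0, 200)
f1 = np.sin(2 * np.pi * t)
gamma = t ** 2                      # a time warping (re-parameterization)
f2 = np.sin(2 * np.pi * gamma)     # the same curve, warped in time

q1, q2 = srvf(f1, t), srvf(f2, t)
# The elastic framework searches over warpings to minimize the distance
# between SRVFs; here we only show the un-aligned distance that the
# dynamic-programming step (or SrvfRegNet) would drive down.
dist = np.sqrt(np.mean((q1 - q2) ** 2))
print(f"RMS distance between SRVFs before alignment: {dist:.3f}")
```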
-
Abstract. One of the key components of this research has been the mapping of Antarctic bed topography and ice thickness parameters that are crucial for modelling ice flow and hence for predicting future ice loss and the ensuing sea level rise. Supported by the Scientific Committee on Antarctic Research (SCAR), the Bedmap3 Action Group aims not only to produce new gridded maps of ice thickness and bed topography for the international scientific community, but also to standardize and make available all the geophysical survey data points used in producing the Bedmap gridded products. Here, we document the survey data used in the latest iteration, Bedmap3, incorporating and adding to all of the datasets previously used for Bedmap1 and Bedmap2, including ice bed, surface and thickness point data from all Antarctic geophysical campaigns since the 1950s. More specifically, we describe the processes used to standardize and make these and future surveys and gridded datasets accessible under the Findable, Accessible, Interoperable, and Reusable (FAIR) data principles. With the goals of making the gridding process reproducible and allowing scientists to re-use the data freely for their own analysis, we introduce the new SCAR Bedmap Data Portal (https://bedmap.scar.org, last access: 1 March 2023) created to provide unprecedented open access to these important datasets through a web-map interface. We believe that this data release will be a valuable asset to Antarctic research and will greatly extend the life cycle of the data held within it. Data are available from the UK Polar Data Centre: https://data.bas.ac.uk (last access: 5 May 2023). See the Data availability section for the complete list of datasets.
-
Alaska has witnessed a significant increase in wildfire events in recent decades that have been linked to drier and warmer summers. Forest fuel maps play a vital role in wildfire management and risk assessment. Freely available multispectral datasets are widely used for land use and land cover mapping, but they have limited utility for fuel mapping due to their coarse spectral resolution. Hyperspectral datasets have a high spectral resolution, ideal for detailed fuel mapping, but they are limited and expensive to acquire. This study simulates hyperspectral data from Sentinel-2 multispectral data using the spectral response function of the Airborne Visible/Infrared Imaging Spectrometer-Next Generation (AVIRIS-NG) sensor, and normalized ground spectra of gravel, birch, and spruce. We used the Uniform Pattern Decomposition Method (UPDM) for spectral unmixing, which is a sensor-independent method, where each pixel is expressed as the linear sum of standard reference spectra. The simulated hyperspectral data have spectral characteristics of AVIRIS-NG and the reflectance properties of Sentinel-2 data. We validated the simulated spectra by visually and statistically comparing them with real AVIRIS-NG data. We observed a high correlation between the spectra of tree classes collected from AVIRIS-NG and simulated hyperspectral data. Upon performing species-level classification, we achieved a classification accuracy of 89% for the simulated hyperspectral data, which is better than the accuracy of Sentinel-2 data (77.8%). We generated a fuel map from the simulated hyperspectral image using the Random Forest classifier. Our study demonstrated that low-cost and high-quality hyperspectral data can be generated from Sentinel-2 data using UPDM for improved land cover and vegetation mapping in the boreal forest.
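A hedged sketch of the linear-mixing idea behind spectral unmixing, where each pixel is a weighted sum of reference spectra: the three random endmembers stand in for the gravel, birch, and spruce ground spectra, and ordinary least squares stands in for the UPDM solver.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands = 50
# Columns are hypothetical endmember (reference) spectra over 50 bands.
endmembers = rng.random((n_bands, 3))

# Synthesize a mixed pixel: 20% / 50% / 30% plus small sensor noise.
true_fractions = np.array([0.2, 0.5, 0.3])
pixel = endmembers @ true_fractions + rng.normal(0.0, 0.005, n_bands)

# Least squares recovers the per-pixel mixing coefficients.
fractions, *_ = np.linalg.lstsq(endmembers, pixel, rcond=None)
print(np.round(fractions, 2))
```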
-
Ishikawa, H.; Liu, CL.; Pajdla, T.; Shi, J. (Eds.)
We propose a novel technique to register sparse 3D scans in the absence of texture. While existing methods such as KinectFusion or Iterative Closest Points (ICP) heavily rely on dense point clouds, this task is particularly challenging under sparse conditions without RGB data. Sparse texture-less data does not come with high-quality boundary signal, and this prohibits the use of correspondences from corners, junctions, or boundary lines. Moreover, in the case of sparse data, it is incorrect to assume that the same point will be captured in two consecutive scans. We take a different approach and first re-parameterize the point cloud using a large number of line segments. In this re-parameterized data, there exists a large number of line intersection (and not correspondence) constraints that allow us to solve the registration task. We propose the use of a two-step alternating projection algorithm by formulating the registration as the simultaneous satisfaction of intersection and rigidity constraints. The proposed approach outperforms other top-scoring algorithms on both Kinect and LiDAR datasets. In Kinect, we can use 100X downsampled sparse data and still outperform competing methods operating on full-resolution data.
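A minimal sketch of the rigidity-projection step that an alternating-projection scheme like this needs: given matched point sets, the closest rigid motion (the classical Kabsch/Procrustes solution, shown here in 2D). The intersection-constraint projection, the heart of the paper, is omitted, and all data below are made up.

```python
import numpy as np

def best_rigid(P, Q):
    """Rotation R and translation t minimizing ||R @ P + t - Q||,
    for 2 x n point matrices P, Q in correspondence."""
    cP, cQ = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
    U, _, Vt = np.linalg.svd((Q - cQ) @ (P - cP).T)
    # Correct an improper (reflecting) solution if the SVD returns one.
    D = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    return R, cQ - R @ cP

# Ground-truth motion: 0.3 rad rotation plus a translation.
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
P = np.random.default_rng(2).random((2, 30))
Q = R_true @ P + np.array([[1.0], [2.0]])

R, t = best_rigid(P, Q)
print(np.allclose(R, R_true, atol=1e-8))  # → True
```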