Title: Synthetic data for learning-based knowledge discovery
Recent advances in deep learning have demonstrated the ability of learning-based methods to tackle very hard downstream tasks. Historically, this has been demonstrated in predictive tasks, while tasks closer to the traditional KDD (Knowledge Discovery in Databases) pipeline have enjoyed proportionally fewer advances. Can learning-based approaches help with inherently hard problems within the KDD pipeline, such as determining how many patterns are in the data, what the different structures in the data are, and how those structures can be extracted robustly? In this vision paper, we argue for synthetic data generators as a way to empower cheaply supervised, learning-based solutions for knowledge discovery. We describe the general idea, present early proof-of-concept results that speak to the viability of the paradigm, and outline a number of exciting challenges that await, together with a set of milestones for measuring success.
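To make the paradigm concrete, the following minimal sketch (a hypothetical illustration, not taken from the paper) shows how a synthetic data generator can provide cheap supervision: each generated dataset carries, by construction, a ground-truth label for "how many patterns are in the data", which a downstream learner can be trained to predict.

```python
# Minimal, hypothetical sketch of cheap supervision from a synthetic data
# generator: each generated dataset is labeled with the number of patterns
# (here, Gaussian clusters) it was built to contain, so a learner can be
# trained to answer "how many patterns are in the data?".
import numpy as np

def generate_labeled_dataset(rng, n_points=200, dim=2, max_clusters=5):
    """Sample a dataset with a known number of clusters; the label is free."""
    k = rng.integers(1, max_clusters + 1)          # ground-truth pattern count
    centers = rng.uniform(-10, 10, size=(k, dim))  # random cluster centers
    assignments = rng.integers(0, k, size=n_points)
    points = centers[assignments] + rng.normal(scale=1.0, size=(n_points, dim))
    return points, k

def make_training_corpus(n_datasets=1000, seed=0):
    """Build (dataset, pattern-count) pairs for supervised training."""
    rng = np.random.default_rng(seed)
    datasets, labels = [], []
    for _ in range(n_datasets):
        X, k = generate_labeled_dataset(rng)
        datasets.append(X)
        labels.append(k)
    return datasets, np.array(labels)

if __name__ == "__main__":
    data, counts = make_training_corpus()
    print(f"{len(data)} synthetic datasets; example label (cluster count): {counts[0]}")
    # A downstream model (e.g., a set encoder) could now be trained to map
    # a raw dataset to its pattern count, with no human labeling cost.
```

A model trained on such pairs never needs human-annotated labels; the generator supplies them for free, which is the sense in which the supervision is cheap.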
Award ID(s): 2112650
PAR ID: 10591466
Author(s) / Creator(s):
Publisher / Repository: ACM Digital Library
Date Published:
Journal Name: ACM SIGKDD Explorations Newsletter
Volume: 26
Issue: 1
ISSN: 1931-0145
Page Range / eLocation ID: 19 to 23
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Knowledge Discovery from Data (KDD) has mostly focused on understanding the available data. Statistically-sound KDD shifts the goal to understanding the partially unknown, random Data Generating Process (DGP) that generates the data. This shift is necessary to ensure that the results of data analysis constitute new knowledge about the DGP, as required by the practice of scientific research and by many industrial applications, in order to avoid costly false discoveries. In statistically-sound KDD, results obtained from the data are treated as hypotheses, and they must undergo statistical testing before being deemed significant, i.e., informative about the DGP. The challenges include (1) how to subject the hypotheses to severe testing, making it hard for them to be deemed significant; (2) treating the simultaneous testing of multiple hypotheses as the default setting, not as an afterthought; (3) offering flexible statistical guarantees at different stages of the discovery process; and (4) achieving scalability along multiple axes, from the size of the data to the number and complexity of the hypotheses to be tested. Success for statistically-sound KDD as a field will be achieved with (1) the introduction of a rich collection of null models that are representative of the KDD tasks and of field experts' existing knowledge of the DGP; (2) the development of scalable algorithms for testing results of many KDD tasks on different data types; and (3) the availability of benchmark dataset generators that make it possible to thoroughly evaluate these algorithms.
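To make the testing step concrete, the sketch below (an illustration under simplified assumptions, not code from the work summarized above) draws samples from a basic null model and computes an empirical p-value for an observed co-occurrence pattern; real statistically-sound KDD would use richer, task-specific null models and correct for the simultaneous testing of many hypotheses.

```python
# Illustrative Monte Carlo test of a candidate pattern against a null model.
# The null model here (independent column permutations, preserving item
# frequencies) is a deliberately simple stand-in for the richer null models
# discussed above.
import numpy as np

def cooccurrence(transactions, i, j):
    """Observed support of the itemset {i, j} in a 0/1 transaction matrix."""
    return int(np.sum(transactions[:, i] * transactions[:, j]))

def empirical_p_value(transactions, i, j, n_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    observed = cooccurrence(transactions, i, j)
    at_least_as_extreme = 0
    for _ in range(n_samples):
        # Sample from the null model: shuffle each column independently,
        # which keeps item frequencies but destroys item-item dependence.
        null = np.column_stack([rng.permutation(col) for col in transactions.T])
        if cooccurrence(null, i, j) >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_samples + 1)  # add-one correction

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.integers(0, 2, size=(500, 20))
    data[:, 1] = data[:, 0]  # plant a dependent pair so the test has signal
    print("p-value for items (0, 1):", empirical_p_value(data, 0, 1))
```

With many candidate patterns tested at once, such p-values would still need a multiple-testing correction before any single pattern is deemed significant.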
  2. Enabling effective and efficient machine learning (ML) over large-scale graph data (e.g., graphs with billions of edges) can have a great impact on both industrial and scientific applications. However, existing efforts to advance large-scale graph ML have been largely limited by the lack of a suitable public benchmark. Here we present OGB Large-Scale Challenge (OGB-LSC), a collection of three real-world datasets for facilitating the advancements in large-scale graph ML. The OGB-LSC datasets are orders of magnitude larger than existing ones, covering three core graph learning tasks—link prediction, graph regression, and node classification. Furthermore, we provide dedicated baseline experiments, scaling up expressive graph ML models to the massive datasets. We show that expressive models significantly outperform simple scalable baselines, indicating an opportunity for dedicated efforts to further improve graph ML at scale. Moreover, OGB-LSC datasets were deployed at ACM KDD Cup 2021 and attracted more than 500 team registrations globally, during which significant performance improvements were made by a variety of innovative techniques. We summarize the common techniques used by the winning solutions and highlight the current best practices in large-scale graph ML. Finally, we describe how we have updated the datasets after the KDD Cup to further facilitate research advances. 
  3. Data-driven approaches to materials exploration and discovery are building momentum due to emerging advances in machine learning. However, parsimonious representations of crystals for navigating the vast materials search space remain limited. To address this limitation, we introduce a materials discovery framework that utilizes natural language embeddings from language models as representations of compositional and structural features. The contextual knowledge encoded in these language representations conveys information about material properties and structures, enabling both similarity analysis to recall relevant candidates based on a query material and multi-task learning to share information across related properties. Applying this framework to thermoelectrics, we demonstrate diversified recommendations of prototype crystal structures and identify under-studied material spaces. Validation through first-principles calculations and experiments confirms the potential of the recommended materials as high-performance thermoelectrics. Language-based frameworks offer versatile and adaptable embedding structures for effective materials exploration and discovery, applicable across diverse material systems.
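A minimal sketch of the recall-by-similarity step: composition and structure descriptions are embedded as text, and candidates are ranked by cosine similarity to a query material. The specific sentence-encoder model and the toy material strings below are placeholder assumptions, not the framework's actual components.

```python
# Hypothetical sketch of similarity-based recall over language embeddings of
# material descriptions. The general-purpose sentence encoder is assumed for
# illustration; a real setup would use embeddings tailored to compositional
# and structural text.
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy text descriptions of candidate materials (composition + structure).
candidates = [
    "Bi2Te3, rhombohedral layered structure",
    "PbTe, rock-salt structure",
    "SnSe, orthorhombic layered structure",
    "Si, diamond cubic structure",
]
query = "Sb2Te3, rhombohedral layered structure"

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder choice
cand_vecs = model.encode(candidates, normalize_embeddings=True)
query_vec = model.encode([query], normalize_embeddings=True)[0]

# Cosine similarity reduces to a dot product on normalized embeddings.
scores = cand_vecs @ query_vec
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {candidates[idx]}")
```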
  4. Machine learning on graph-structured data has attracted much research interest due to its ubiquity in real-world data. However, how to efficiently represent graph data in a general way is still an open problem. Traditional methods use handcrafted graph features in a tabular form, but they require domain expertise and suffer from information loss. Graph representation learning overcomes these defects by automatically learning continuous representations from graph structures, but such methods require abundant training labels, which are often hard to obtain for graph-level prediction problems. In this work, we demonstrate that, when available, the domain expertise used for designing handcrafted graph features can improve graph-level representation learning when training labels are scarce. Specifically, we propose a multi-task knowledge distillation method. By incorporating network-theory-based graph metrics as auxiliary tasks, we show on both synthetic and real datasets that the proposed multi-task learning method can improve the prediction performance of the original learning task, especially when the training data size is small.
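The auxiliary-task idea can be sketched as follows: graph metrics from network theory (here, density and average clustering, computed with networkx) serve as label-free auxiliary regression targets alongside a label-scarce main task, with one shared encoder. The two-head architecture and loss weighting below are illustrative assumptions, not the paper's exact method.

```python
# Illustrative sketch of multi-task learning with network-theory metrics as
# auxiliary targets; the architecture and loss weighting are assumptions for
# illustration, not the paper's implementation.
import networkx as nx
import torch
import torch.nn as nn

def auxiliary_targets(graph):
    """Label-free auxiliary targets computable from graph structure alone."""
    return torch.tensor([nx.density(graph), nx.average_clustering(graph)],
                        dtype=torch.float32)

class SharedEncoderMultiTask(nn.Module):
    """One shared encoder feeding a main head and an auxiliary head."""
    def __init__(self, in_dim, hidden=32, n_aux=2):
        super().__init__()
        # in_dim stands in for a graph-level feature vector (e.g., a readout).
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.main_head = nn.Linear(hidden, 1)     # label-scarce prediction task
        self.aux_head = nn.Linear(hidden, n_aux)  # graph-metric regression task

    def forward(self, x):
        h = self.encoder(x)
        return self.main_head(h), self.aux_head(h)

def multi_task_loss(main_pred, main_true, aux_pred, aux_true, alpha=0.5):
    # The auxiliary term regularizes the shared encoder when main-task labels
    # are scarce; alpha trades off the two objectives.
    mse = nn.functional.mse_loss
    return mse(main_pred, main_true) + alpha * mse(aux_pred, aux_true)

if __name__ == "__main__":
    g = nx.karate_club_graph()
    print("auxiliary targets (density, avg clustering):", auxiliary_targets(g))
```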
  5. We discuss the emerging advances and opportunities at the intersection of machine learning (ML) and climate physics, highlighting the use of ML techniques, including supervised, unsupervised, and equation discovery, to accelerate climate knowledge discoveries and simulations. We delineate two distinct yet complementary aspects: (a) ML for climate physics and (b) ML for climate simulations. Although physics-free ML-based models, such as ML-based weather forecasting, have demonstrated success when data are abundant and stationary, the physics knowledge and interpretability of ML models become crucial in the small-data/nonstationary regime to ensure generalizability. Given the absence of observations, the long-term future climate falls into the small-data regime. Therefore, ML for climate physics holds a critical role in addressing the challenges of ML for climate simulations. We emphasize the need for collaboration among climate physics, ML theory, and numerical analysis to achieve reliable ML-based models for climate applications.
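As one concrete instance of the equation-discovery techniques mentioned above, the sketch below (illustrative only, and not taken from the article) recovers the terms of a simple dynamical law from data via sparse regression over a library of candidate functions, in the spirit of SINDy-style methods.

```python
# Minimal, illustrative equation-discovery sketch: sparse regression over a
# library of candidate terms, applied to synthetic data from a known law.
import numpy as np

# Synthetic trajectory from dx/dt = 1.5*x - 0.5*x^2 (logistic-type growth).
dt = 0.01
t = np.arange(0, 10, dt)
x = np.empty_like(t)
x[0] = 0.1
for i in range(len(t) - 1):
    x[i + 1] = x[i] + dt * (1.5 * x[i] - 0.5 * x[i] ** 2)

dxdt = np.gradient(x, dt)                                 # numerical derivative
library = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])
names = ["1", "x", "x^2", "x^3"]

# Least squares followed by hard thresholding to enforce sparsity.
coeffs, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
coeffs[np.abs(coeffs) < 0.1] = 0.0

terms = " + ".join(f"{c:.2f}*{n}" for c, n in zip(coeffs, names) if c != 0.0)
print("discovered model: dx/dt =", terms)
```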