NSF PAR Search | NSF Public Access Repository

TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules

Ren, Weijieying; Li, Xiaoting; Chen, Huiyuan; Rakesh, Vineeth; Wang, Zhuoyi; Das, Mahashweta; Honavar, Vasant (July 2024, Proceedings of Machine Learning Research: International Conference on Machine Learning)

We consider the problem of test-time adaptation of predictive models trained on tabular data. Effective solution of this problem requires adaptation of predictive models trained on the source domain to a target domain, using only unlabeled target domain data, without access to source domain data. Existing test-time adaptation methods for tabular data have difficulty coping with the heterogeneous features and their complex dependencies inherent in tabular data. To overcome these limitations, we consider test-time adaptation in the setting wherein the logical structure of the rules is assumed to remain invariant despite distribution shift between source and target domains whereas the numerical parameters associated with the rules and the weights assigned to them can vary to accommodate distribution shift. TabLog discretizes numerical features, models dependencies between heterogeneous features, introduces a novel contrastive loss for coping with distribution shift, and presents an end-to-end framework for efficient training and test-time adaptation by taking advantage of a logical neural network representation of a rule ensemble. We present results of experiments using several benchmark data sets that demonstrate TabLog is competitive with or improves upon the state-of-the-art methods for test-time adaptation of predictive models trained on tabular data. Our code is available at https://github.com/WeijieyingRen/TabLog.
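As a rough illustration of the setting described in this abstract (rule structure held fixed, numerical parameters free to move), the sketch below scores one hand-written rule with soft, differentiable predicates. It is not TabLog's implementation: the feature names `age` and `income`, the sigmoid relaxation, the product conjunction, and every function name are assumptions chosen only for illustration.

```python
import math

def soft_greater(x, threshold, temperature=1.0):
    """Smooth stand-in for the predicate `x > threshold` (assumed relaxation)."""
    return 1.0 / (1.0 + math.exp(-(x - threshold) / temperature))

def soft_and(truth_values):
    """Conjunction as a product of truth values in [0, 1] (assumed choice)."""
    result = 1.0
    for value in truth_values:
        result *= value
    return result

def rule_score(row, thresholds):
    """Hypothetical rule with fixed structure: age > t_age AND income > t_income.

    Only the numbers in `thresholds` would change under adaptation;
    the shape of the rule stays the same.
    """
    return soft_and([
        soft_greater(row["age"], thresholds["age"]),
        soft_greater(row["income"], thresholds["income"]),
    ])

row = {"age": 40.0, "income": 50.0}
source_thresholds = {"age": 40.0, "income": 50.0}   # hypothetical source-domain values
shifted_thresholds = {"age": 35.0, "income": 50.0}  # hypothetical adapted values

score_source = rule_score(row, source_thresholds)
score_shifted = rule_score(row, shifted_thresholds)
```

Lowering the `age` threshold raises the rule's score for the same row, which is the kind of parameter-only adjustment the abstract describes; how TabLog actually learns such parameters is documented in the paper and its repository, not here.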
Full Text Available
A Sparse Topic Model for Extracting Aspect-Specific Summaries from Online Reviews

https://doi.org/10.1145/3178876.3186069

Rakesh, Vineeth; Ding, Weicong; Ahuja, Aman; Rao, Nikhil; Sun, Yifan; Reddy, Chandan K. (April 2018, Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18)

Full Text Available
Probabilistic Social Sequential Model for Tour Recommendation

https://doi.org/10.1145/3018661.3018711

Rakesh, Vineeth; Jadhav, Niranjan; Kotov, Alexander; Reddy, Chandan K. (January 2017, Proceedings of the Tenth ACM International Conference on Web Search and Data Mining)

Full Text Available
Project Success Prediction in Crowdfunding Environments

https://doi.org/10.1145/2835776.2835791

Li, Yan; Rakesh, Vineeth; Reddy, Chandan K. (January 2016, Proceedings of the Ninth ACM International Conference on Web Search and Data Mining)

Crowdfunding has gained widespread attention in recent years. Despite the huge success of crowdfunding platforms, the percentage of projects that succeed in achieving their desired goal amount is only around 40%. Moreover, many of these crowdfunding platforms follow an "all-or-nothing" policy, which means the pledged amount is collected only if the goal is reached within a certain predefined time duration. Hence, estimating the probability of success for a project is one of the most important research challenges in the crowdfunding domain. To predict project success, there is a need for new prediction models that can potentially combine the power of both classification (which incorporates both successful and failed projects) and regression (for estimating the time for success). In this paper, we formulate project success prediction as a survival analysis problem and apply the censored regression approach, where one can perform regression in the presence of partial information. We rigorously study the project success time distribution of crowdfunding data and show that the logistic and log-logistic distributions are a natural choice for learning from such data. We investigate various censored regression models using comprehensive data of 18K Kickstarter (a popular crowdfunding platform) projects and 116K corresponding tweets collected from Twitter. We show that the models that take complete advantage of both the successful and failed projects during the training phase perform significantly better at predicting the success of future projects compared to the ones that only use the successful projects. We provide a rigorous evaluation on many sets of relevant features and show that adding a few temporal features that are obtained at the project's early stages can dramatically improve the performance.
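The censored-regression idea in this abstract can be sketched with the standard censored log-likelihood for a log-logistic distribution: a project that succeeded contributes the log-density at its success time, while a project that failed contributes only the log-probability of "no success by the deadline". This is a minimal sketch under stated assumptions, not the paper's code: the function names are hypothetical, and `alpha` (scale) and `beta` (shape) are passed as constants here, whereas a regression model would derive them from project features.

```python
import math

def loglogistic_log_pdf(t, alpha, beta):
    """Log-density of the log-logistic distribution at time t > 0."""
    z = (t / alpha) ** beta
    return (math.log(beta / alpha)
            + (beta - 1.0) * math.log(t / alpha)
            - 2.0 * math.log1p(z))

def loglogistic_log_survival(t, alpha, beta):
    """Log of S(t) = 1 / (1 + (t / alpha) ** beta)."""
    return -math.log1p((t / alpha) ** beta)

def censored_log_likelihood(durations, succeeded, alpha, beta):
    """Sum log f(t) over successes and log S(t) over censored (failed) projects.

    durations: time to success, or campaign length for failed projects.
    succeeded: True if the goal was reached, False if censored.
    """
    total = 0.0
    for t, ok in zip(durations, succeeded):
        if ok:
            total += loglogistic_log_pdf(t, alpha, beta)
        else:
            total += loglogistic_log_survival(t, alpha, beta)
    return total

# Hypothetical toy data: days until success, or the deadline for failures.
durations = [12.0, 30.0, 25.0, 30.0]
succeeded = [True, False, True, False]
ll = censored_log_likelihood(durations, succeeded, alpha=28.0, beta=2.0)
```

Dropping the censored terms would amount to training on successful projects only, which is the baseline the abstract reports as performing worse.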
Full Text Available
