NSF PAR Search | NSF Public Access Repository

Covering a Graph with Dense Subgraph Families, via Triangle-Rich Sets

https://doi.org/10.1145/3627673.3679578

Basu, Sabyasachi; Paul-Pena, Daniel; Qian, Kun; Seshadhri, C; Huang, Edward W; Subbian, Karthik (October 2024, ACM)

STARK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

Wu, Shirley; Zhao, Shiyu; Yasunaga, Michihiro; Huang, Kexin; Cao, Kaidi; Huang, Qian; Ioannidis, Vassilis N; Subbian, Karthik; Zou, James; Leskovec, Jure (December 2024, Advances in Neural Information Processing Systems)

AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning

Wu, Shirley; Zhao, Shiyu; Huang, Qian; Huang, Kexin; Yasunaga, Michihiro; Cao, Kaidi; Ioannidis, Vassilis N; Subbian, Karthik; Leskovec, Jure; Zou, James (December 2024, Advances in Neural Information Processing Systems)

Learning Backward Compatible Embeddings

https://doi.org/10.1145/3534678.3539194

Hu, Weihua; Bansal, Rajas; Cao, Kaidi; Rao, Nikhil; Subbian, Karthik; Leskovec, Jure (August 2022, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Embeddings, low-dimensional vector representations of objects, are fundamental in building modern machine learning systems. In industrial settings, there is usually an embedding team that trains an embedding model to solve intended tasks (e.g., product recommendation). The produced embeddings are then widely consumed by consumer teams to solve their unintended tasks (e.g., fraud detection). However, as the embedding model gets updated and retrained to improve performance on the intended task, the newly generated embeddings are no longer compatible with the existing consumer models. This means that either historical versions of the embeddings can never be retired, or all consumer teams have to retrain their models to make them compatible with the latest version of the embeddings; both options are extremely costly in practice. Here we study the problem of embedding version updates and their backward compatibility. We formalize the problem where the goal is for the embedding team to keep updating the embedding version, while the consumer teams do not have to retrain their models. We develop a solution based on learning backward compatible embeddings, which allows the embedding model version to be updated frequently, while also allowing the latest version of the embedding to be quickly transformed into any backward compatible historical version of it, so that consumer teams do not have to retrain their models. Our key idea is that whenever a new embedding model is trained, we learn it together with a light-weight backward compatibility transformation that aligns the new embedding to the previous version of it. Our learned backward transformations can then be composed to produce any historical version of the embedding. Under our framework, we explore six methods and systematically evaluate them on a real-world recommender system application. 
We show that the best method, which we call BC-Aligner, maintains backward compatibility with existing unintended tasks even after multiple model version updates. Simultaneously, BC-Aligner achieves the intended task performance similar to the embedding model that is solely optimized for the intended task.
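The composition step described in this abstract can be illustrated with a minimal sketch. It assumes each backward compatibility transformation is a plain linear map stored as a matrix; the abstract only calls the transformation "light-weight", so the linear form, the `backward` dictionary, and the function names are hypothetical. The sketch shows only how per-version transformations chain together, not how BC-Aligner or the other methods are trained.

```python
def apply_linear(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def to_historical(embedding, from_version, to_version, backward):
    """Map an embedding from `from_version` back to `to_version`.

    `backward[k]` is a hypothetical matrix that aligns a version-k
    embedding to version k-1. Chaining backward[k], backward[k-1], ...
    yields any earlier version.
    """
    if not 0 <= to_version <= from_version:
        raise ValueError("to_version must lie between 0 and from_version")
    out = list(embedding)
    for k in range(from_version, to_version, -1):
        out = apply_linear(backward[k], out)
    return out


# Toy transformations (assumed values, for illustration only):
# version 2 -> 1 halves every coordinate, version 1 -> 0 swaps the two.
backward = {
    1: [[0, 1], [1, 0]],
    2: [[0.5, 0], [0, 0.5]],
}

latest = [4.0, 2.0]  # an embedding produced by the version-2 model
as_v1 = to_historical(latest, from_version=2, to_version=1, backward=backward)
as_v0 = to_historical(latest, from_version=2, to_version=0, backward=backward)
```

Under these assumptions, a consumer model trained on version 0 would be fed `as_v0` and would not need retraining; only one transformation per version update has to be stored.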
AutoGDA: Automated Graph Data Augmentation for Node Classification

Zhao, Tong; Tang, Xianfeng; Zhang, Danqing; Jiang, Haoming; Rao, Nikhil; Song, Yiwei; Agrawal, Pallav; Subbian, Karthik; Yin, Bing; Jiang, Meng (December 2022, Proceedings of the First Learning on Graphs Conference)

ALLIE: Active Learning on Large-scale Imbalanced Graphs

https://doi.org/10.1145/3485447.3512229

Cui, Limeng; Tang, Xianfeng; Katariya, Sumeet; Rao, Nikhil; Agrawal, Pallav; Subbian, Karthik; Lee, Dongwon (April 2022, Proceedings of the ACM Web Conference 2022)

Do Diffusion Protocols Govern Cascade Growth?

Cheng, Justin; Kleinberg, Jon; Leskovec, Jure; Liben-Nowell, David; State, Bogdan; Subbian, Karthik; Adamic, Lada (June 2018, Proceedings of the ... International AAAI Conference on Weblogs and Social Media)

Large cascades can develop in online social networks as people share information with one another. Though simple reshare cascades have been studied extensively, the full range of cascading behaviors on social media is much more diverse. Here we study how diffusion protocols, or the social exchanges that enable information transmission, affect cascade growth, analogous to the way communication protocols define how information is transmitted from one point to another. Studying 98 of the largest information cascades on Facebook, we find a wide range of diffusion protocols - from cascading reshares of images, which use a simple protocol of tapping a single button for propagation, to the ALS Ice Bucket Challenge, whose diffusion protocol involved individuals creating and posting a video, and then nominating specific others to do the same. We find recurring classes of diffusion protocols, and identify two key counterbalancing factors in the construction of these protocols, with implications for a cascade's growth: the effort required to participate in the cascade, and the social cost of staying on the sidelines. Protocols requiring greater individual effort slow down a cascade's propagation, while those imposing a greater social cost of not participating increase the cascade's adoption likelihood. The predictability of transmission also varies with protocol. But regardless of mechanism, the cascades in our analysis all have a similar reproduction number (≈1.8), meaning that lower rates of exposure can be offset with higher per-exposure rates of adoption. Last, we show how a cascade's structure can not only differentiate these protocols, but also be modeled through branching processes. Together, these findings provide a framework for understanding how a wide variety of information cascades can achieve substantial adoption across a network.
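The trade-off between exposure and adoption, and the branching-process view mentioned in this abstract, can be sketched roughly as follows. The sketch assumes that every adopter exposes a fixed number of people and that each exposed person adopts independently with the same probability; the abstract does not give the paper's actual model, and the function names and the example values (18 exposures at 0.1, 6 exposures at 0.3) are hypothetical choices that both yield a reproduction number near 1.8.

```python
import random


def reproduction_number(exposures_per_adopter, adoption_probability):
    """Expected number of new adopters produced by one adopter."""
    return exposures_per_adopter * adoption_probability


def simulate_cascade(exposures_per_adopter, adoption_probability,
                     generations, seed=0, max_generation_size=100_000):
    """Simulate a simple branching process and return adopters per generation.

    Each adopter exposes `exposures_per_adopter` people; each of them
    adopts independently with `adoption_probability`. Generation sizes
    are capped at `max_generation_size` to bound the running time.
    """
    rng = random.Random(seed)
    sizes = [1]  # generation 0 is the single originator
    for _ in range(generations):
        new_adopters = 0
        for _ in range(sizes[-1]):
            new_adopters += sum(
                rng.random() < adoption_probability
                for _ in range(exposures_per_adopter)
            )
        new_adopters = min(new_adopters, max_generation_size)
        sizes.append(new_adopters)
        if new_adopters == 0:  # the cascade has died out
            break
    return sizes


# Two hypothetical protocols with about the same reproduction number:
# many exposures with low adoption, or few exposures with high adoption.
low_effort = reproduction_number(18, 0.1)
high_effort = reproduction_number(6, 0.3)

sizes = simulate_cascade(6, 0.3, generations=6, seed=1)
```

In this toy model the expected size of generation g is the reproduction number raised to the power g, so two protocols with the same product of exposures and adoption probability grow at the same expected rate even though individual runs differ.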
Accepted Tutorials at The Web Conference 2022

https://doi.org/10.1145/3487553.3547182

Tommasini, Riccardo; Basu Roy, Senjuti; Wang, Xuan; Wang, Hongwei; Ji, Heng; Han, Jiawei; Nakov, Preslav; Da San Martino, Giovanni; Alam, Firoj; Schedl, Markus; et al (April 2022, TWC 2022)
