
Title: Sequences of Sets
Sequential behavior such as sending emails, gathering in groups, tagging posts, or authoring academic papers may be characterized by a set of recipients, attendees, tags, or coauthors, respectively. Such "sequences of sets" show complex repetition behavior, sometimes repeating prior sets wholesale, and sometimes creating new sets from partial copies or partial merges of earlier sets. In this paper, we provide a stochastic model to capture these patterns. The model has two classes of parameters. First, a correlation parameter determines how much of an earlier set will contribute to a future set. Second, a vector of recency parameters captures the fact that a set in a sequence is more similar to recent sets than to more distant ones. Comparing against a strong baseline, we find that modeling both correlation and recency structures is required for high accuracy. We also find that both parameter classes vary widely across domains, so they must be optimized on a per-dataset basis. We present the model in detail, provide a theoretical examination of its asymptotic behavior, and perform a set of detailed experiments on its predictive performance.
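The generative idea described above can be sketched in a few lines: each new set is built by sampling an earlier set with recency-weighted probability, copying part of it, and filling the remainder with fresh elements. This is an illustrative toy only; the parameter names (corr, the 1/age recency weights) are hypothetical stand-ins for the paper's correlation and recency parameters, not its actual formulation.

```python
import random

def next_set(history, corr=0.6, size=5, universe=range(100), rng=random):
    """One step of a toy 'sequence of sets' generator: pick an earlier
    set with recency-weighted probability, copy roughly a corr-fraction
    of it, and fill the remainder with fresh elements from the universe.
    Sketch only; corr and the 1/age weights are hypothetical."""
    if not history:
        return set(rng.sample(list(universe), size))
    # More recent sets get higher weight (weight ~ 1 / age).
    weights = [1.0 / (len(history) - i) for i in range(len(history))]
    source = rng.choices(history, weights=weights, k=1)[0]
    n_copy = min(len(source), round(corr * size))
    new_set = set(rng.sample(sorted(source), n_copy))
    fresh = [x for x in universe if x not in new_set]
    while len(new_set) < size:
        new_set.add(rng.choice(fresh))
    return new_set
```

Running this step repeatedly produces a sequence that mixes wholesale repeats, partial copies, and novel elements, the qualitative behavior the model is designed to capture.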
Award ID(s):
1740822
PAR ID:
10064641
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Nonlinear response history analysis (NLRHA) is generally considered to be a reliable and robust method to assess the seismic performance of buildings under strong ground motions. While NLRHA is fairly straightforward for evaluating individual structures under a select set of ground motions at a specific building site, it becomes less practical for performing large numbers of analyses to evaluate either (1) multiple models of alternative design realizations with a site‐specific set of ground motions, or (2) individual archetype building models at multiple sites with multiple sets of ground motions. In this regard, surrogate models offer an alternative to running repeated NLRHAs for variable design realizations or ground motions. In this paper, a recently developed surrogate modeling technique, called probabilistic learning on manifolds (PLoM), is presented to estimate structural seismic response. Essentially, the PLoM method provides an efficient stochastic model to develop mappings between random variables, which can then be used to efficiently estimate the structural responses for systems with variations in design/modeling parameters or ground motion characteristics. The PLoM algorithm is introduced and then used in two case studies of 12‐story buildings for estimating probability distributions of structural responses. The first example focuses on the mapping between variable design parameters of a multidegree‐of‐freedom analysis model and its peak story drift and acceleration responses. The second example applies the PLoM technique to estimate structural responses for variations in site‐specific ground motion characteristics. In both examples, training data sets are generated for orthogonal input parameter grids, and test data sets are developed for input parameters with prescribed statistical distributions. Validation studies are performed to examine the accuracy and efficiency of the PLoM models.
Overall, both examples show good agreement between the PLoM model estimates and verification data sets. Moreover, in contrast to other common surrogate modeling techniques, the PLoM model is able to preserve correlation structure between peak responses. Parametric studies are conducted to understand the influence of different PLoM tuning parameters on its prediction accuracy. 
  2. Model-based approaches to navigation, control, and fault detection that utilize precise nonlinear models of vehicle plant dynamics will enable more accurate control and navigation, assured autonomy, and more complex missions for such vehicles. This paper reports novel theoretical and experimental results addressing the problem of parameter estimation of plant and actuator models for underactuated underwater vehicles operating in 6 degrees-of-freedom (DOF) whose dynamics are modeled by finite-dimensional Newton-Euler equations. This paper reports the first theoretical approach and experimental validation to identify simultaneously plant-model parameters (parameters such as mass, added mass, hydrodynamic drag, and buoyancy) and control-actuator parameters (control-surface models and thruster models) in 6-DOF. Most previously reported studies on parameter identification assume that the control-actuator parameters are known a priori. Moreover, this paper reports the first proof of convergence of the parameter estimates to the true set of parameters for this class of vehicles under a persistence of excitation condition. The reported adaptive identification (AID) algorithm does not require instrumentation of 6-DOF vehicle acceleration, which is required by conventional approaches to parameter estimation such as least squares. Additionally, the reported AID algorithm is applicable under any arbitrary open-loop or closed-loop control law. We report simulation and experimental results for identifying the plant-model and control-actuator parameters for an L3 OceanServer Iver3 autonomous underwater vehicle. We believe this general approach to AID could be extended to apply to other classes of machines and other classes of marine, land, aerial, and space vehicles. 
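The adaptive identification idea above can be illustrated on a much simpler plant than the paper's 6-DOF Newton-Euler dynamics. The sketch below uses a textbook gradient adaptive law for a scalar system xdot = a*x + b*u with a persistently exciting input; all gains and the 1-DOF setup are hypothetical simplifications, not the paper's algorithm. Note that, as in the paper's approach, the update laws use only the state x and input u, not the acceleration xdot.

```python
import math

def adaptive_id(a=-1.0, b=1.0, a_m=-2.0, gamma=5.0, dt=1e-3, steps=200_000):
    """Gradient-based adaptive identification for a scalar plant
    xdot = a*x + b*u with unknown (a, b), driven by the persistently
    exciting input u(t) = sin(t).  Textbook 1-DOF sketch only.
    Observer:  xhat_dot = a_m*(xhat - x) + a_hat*x + b_hat*u
    Laws:      a_hat_dot = -gamma*e*x,  b_hat_dot = -gamma*e*u
    with e = xhat - x and a_m < 0 so the error dynamics are stable."""
    x = xhat = 0.0
    a_hat = b_hat = 0.0
    for k in range(steps):
        u = math.sin(k * dt)
        e = xhat - x
        # Compute all derivatives from the current state, then Euler-step.
        dx = a * x + b * u
        dxhat = a_m * e + a_hat * x + b_hat * u
        da = -gamma * e * x
        db = -gamma * e * u
        x += dt * dx
        xhat += dt * dxhat
        a_hat += dt * da
        b_hat += dt * db
    return a_hat, b_hat
```

Under persistence of excitation, a standard Lyapunov argument (V = e^2/2 + (a_err^2 + b_err^2)/(2*gamma), giving Vdot = a_m*e^2 <= 0) shows the estimates converge to the true parameters, mirroring the convergence result the paper proves for the 6-DOF case.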
  3. The introduction of large-scale data sets in psychology allows for more robust accounts of various cognitive mechanisms, one of which is human learning. However, these data sets provide participants with complete autonomy over their own participation in the task, and therefore require precisely studying the factors influencing dropout alongside learning. In this work, we present such a data set where 1,234,844 participants play 10,874,547 games of a challenging variant of tic-tac-toe. We establish that there is a correlation between task performance and total experience, and independently analyze participants’ dropout behavior and learning trajectories. We find evidence for stopping patterns as a function of playing strength and investigate the processes underlying playing strength increases with experience using a set of metrics derived from a planning model. Finally, we develop a joint model to account for both dropout and learning functions which replicates our empirical findings. 
  4. Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case approximation ratio or runtime bound. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that a data-driven approach to parameter tuning can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide techniques for deriving generalization guarantees that bound the difference between the algorithm's average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research [e.g., 12, 16, 20, 62] has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We streamline these analyses with a general theorem that applies whenever an algorithm's performance is a piecewise-constant, piecewise-linear, or, more generally, piecewise-structured function of its parameters. Our results, which are tight up to logarithmic factors in the worst case, also imply novel bounds for configuring dynamic programming algorithms from computational biology.
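The empirical-risk-style tuning procedure that the guarantees above apply to is simple to state in code. This is a minimal sketch with hypothetical names; the paper's contribution is bounding the gap between the empirical average this procedure optimizes and the true expectation over the instance distribution.

```python
def tune(params, train_instances, perf):
    """Data-driven parameter tuning in its simplest form: return the
    parameter setting with the best average performance on a training
    set of instances drawn from the application-specific distribution.
    Names (params, perf) are hypothetical placeholders."""
    def avg_perf(p):
        return sum(perf(p, x) for x in train_instances) / len(train_instances)
    return max(params, key=avg_perf)
```

For example, tuning a scalar parameter against performance perf(p, x) = -|p - x| over instances [1.0, 2.0, 3.0] selects the candidate closest to the instances on average.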
  5. Inverse problems play a central role in data analysis across the fields of science. Many techniques and algorithms provide parameter estimation, including the best-fitting model and the parameter statistics. Here, we concern ourselves with the robustness of parameter estimation under constraints, with a focus on assimilation of noisy data with potential outliers, a situation all too familiar in Earth science, particularly in the analysis of remote-sensing data. We assume a linear, or linearized, forward model relating the model parameters to multiple data sets with a priori unknown uncertainties that are left to be characterized. This is relevant for global navigation satellite system and synthetic aperture radar data that involve intricate processing for which uncertainty estimation is not available. The model is constrained by additional equalities and inequalities resulting from the physics of the problem, but the weights of equalities are unknown. We formulate the problem from a Bayesian perspective with non-informative priors. The posterior distribution of the model parameters, weights, and outliers conditioned on the observations is then inferred via Gibbs sampling. We demonstrate the practical utility of the method based on a set of challenging inverse problems with both synthetic and real space-geodetic data associated with earthquakes and nuclear explosions. We provide the associated computer codes and expect the approach to be of practical interest for a wide range of applications.
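The Gibbs-sampling strategy for a linear forward model can be illustrated on the simplest possible case. The sketch below samples a Bayesian linear regression with a flat prior on the coefficients and a Jeffreys prior on the noise variance; it deliberately omits the paper's data-set weights, outlier indicators, and equality/inequality constraints, so it shows only the general mechanism of alternating conditional draws.

```python
import numpy as np

def gibbs_linear(X, y, n_iter=2000, seed=0):
    """Minimal Gibbs sampler for y = X b + e, e ~ N(0, s2 I), with a
    flat prior on b and Jeffreys prior p(s2) ~ 1/s2.  Illustrative
    sketch only; the paper's sampler additionally infers per-data-set
    weights and outliers under physical constraints."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    XtX = X.T @ X
    b_hat = np.linalg.solve(XtX, X.T @ y)  # OLS / conditional posterior mean
    XtX_inv = np.linalg.inv(XtX)
    s2 = 1.0
    samples = []
    for _ in range(n_iter):
        # Draw b | s2, y  ~  N(b_hat, s2 * (X'X)^{-1})
        b = rng.multivariate_normal(b_hat, s2 * XtX_inv)
        # Draw s2 | b, y  ~  Inv-Gamma(n/2, ||y - X b||^2 / 2)
        resid = y - X @ b
        s2 = (resid @ resid / 2) / rng.gamma(n / 2)
        samples.append((b, s2))
    return samples
```

Alternating these two conditional draws yields samples from the joint posterior; the paper's method extends the same alternation to the additional weight and outlier variables.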