title: Discovering and exploiting riffled independence relations in ranked data
speaker: Jonathan Huang (joint work with Carlos Guestrin)
abstract:
Representing distributions over permutations can be a daunting task due to the
fact that the number of permutations of n objects scales factorially in n. One
recent way that has been used to reduce storage complexity has been to exploit
probabilistic independence, but full independence assumptions impose
strong sparsity constraints on distributions and are unsuitable for modeling
rankings. I will discuss a novel class of independence structures, called riffled
independence, encompassing a more expressive family of distributions while
retaining many of the properties necessary for performing efficient inference and
reducing sample complexity. In riffled independence, one draws two permutations
independently, then performs the riffle shuffle, common in card games, to
combine the two permutations to form a single permutation. In ranking, riffled
independence corresponds to ranking disjoint sets of objects independently,
then interleaving those rankings.
In addition to pointing out ways in which riffled independence assumptions
can be leveraged during learning and inference, I will discuss strategies
for efficiently finding riffle independent subsets of objects from ranked
data and show some examples in real ranked datasets.