Obtaining accurate estimates of wireless link quality is important to
applications ranging from WiFi deployment, to multihop routing, to
mobile robotics. To address this need we have developed a novel
nonparametric probabilistic model for predicting link qualities in a
complex environment, which can flexibly represent the variability we
observe in the real world, incorporate multiple sources of
information, and exploit spatial structure to minimize training
requirements. We implemented our model and tested it on a deployment
of 31 wireless sensors (Motes) and found that it outperforms other
common approaches on several different criteria, both when data is
sparse and when it is abundant.

We study two seemingly very
different problems: placing sensors in a municipal water distribution
network in order to detect possible contaminations, and selecting
informative weblogs to read on the Internet. We show that both problems
share essential structure: Information cascades spreading over the
blogosphere can be modeled similarly as contaminations spreading
through water networks. We develop an approximation algorithm for
detecting such spreading cascades, which performs well on real blog
data, as well as on the Battle of Water Sensor Networks competition.

Stability is a desirable characteristic for linear dynamical systems, but is
often ignored by algorithms that learn these systems from data. We propose a
novel method for learning stable linear dynamical systems: we formulate an
approximation of the problem as a convex program, start with a solution to a
relaxed version of the program, and incrementally add constraints to improve
stability. Rather than continuing to generate constraints until we reach a
feasible solution, we test stability at each step; because the convex program
is only an approximation of the desired problem, this early stopping rule can
yield a higher-quality solution. We apply our algorithm to the task of
learning dynamic textures from image sequences as well as to modeling
biosurveillance drug sales data. The constraint generation approach leads to
noticeable improvement in the quality of simulated sequences. We compare our
method to those of Lacy and Bernstein, with positive results in terms of
accuracy, quality of simulated sequences, and efficiency.
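The overall loop can be sketched schematically. In this hypothetical simplification, a naive scaling step stands in for the paper's convex constraint-generation procedure; the early-stopping idea is the same: check stability at each step rather than solving to feasibility.

```python
import numpy as np

def learn_stable_lds(X, shrink=0.99):
    """Fit x_{t+1} ~ A x_t by least squares, then shrink A until its
    spectral radius is <= 1. The scaling step is a crude stand-in for
    the paper's convex constraint-generation procedure."""
    X0, X1 = X[:, :-1], X[:, 1:]
    A = X1 @ np.linalg.pinv(X0)               # unconstrained fit
    while max(abs(np.linalg.eigvals(A))) > 1.0:
        A *= shrink                           # step toward the stable region
    return A

# simulate a slightly unstable 2-D rotation, then re-learn a stable model
theta = 0.3
A_true = 1.02 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
rng = np.random.default_rng(0)
x = np.zeros((2, 101))
x[:, 0] = [1.0, 0.0]
for t in range(100):
    x[:, t + 1] = A_true @ x[:, t] + 0.01 * rng.standard_normal(2)
A_hat = learn_stable_lds(x)
rho = max(abs(np.linalg.eigvals(A_hat)))      # guaranteed <= 1 by the loop
```

The unconstrained least-squares fit recovers the unstable dynamics; the stabilization loop then trades a small amount of fit for a spectral radius at most one.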

We consider the problem of
cost-effectively instrumenting a chair with pressure sensors, in order
to enable
seating posture recognition. Our approach reduces the sensor hardware
cost by a factor of 30 compared to previous solutions, while achieving
comparable classification accuracy. We test our approach in a user
study on a real deployment.

When deploying a wireless sensor network, the sensors must both be
informative and communicate well. We develop
pSPIEL, an efficient algorithm for padded Sensor Placements at
Informative and cost-Effective Locations, which near-optimally trades
off informativeness and communication cost of a wireless sensor network
deployment.

Often, nodes in wireless networks need to perform probabilistic inference to combine local, noisy observations into a global, joint estimate of the system state. Suppose that each node carries a fragment of the probabilistic model, and the nodes wish to collaborate, in order to solve the inference task in a distributed manner. In this project, we examine the data structures and algorithms that lead to accurate solutions in presence of node and link failures.
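As a minimal illustration of the kind of inference involved, consider nodes fusing noisy Gaussian observations of a shared scalar state: the joint estimate is a precision-weighted average, which can be computed by circulating two running sums through the network. This is a hypothetical sketch, not the project's data structures or algorithms.

```python
import numpy as np

# Each "node" holds one noisy Gaussian observation of a shared scalar
# state. Fusing them is precision-weighted averaging, which only needs
# two running sums to be passed around the network.
rng = np.random.default_rng(0)
state = 5.0
sigmas = np.array([0.5, 1.0, 2.0, 0.8])      # per-node noise levels
obs = state + sigmas * rng.standard_normal(4)

prec = 1.0 / sigmas**2                        # per-node precisions
sum_prec = prec.sum()                         # aggregated running sum 1
sum_pm = (prec * obs).sum()                   # aggregated running sum 2
fused_mean = sum_pm / sum_prec                # joint estimate
fused_var = 1.0 / sum_prec                    # tighter than any single node
```

Because the sums are associative and commutative, any aggregation order over the network yields the same fused estimate, which is what makes robustness to node and link failures a question of how the sums are routed and cached.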

A key problem in deploying a network of cameras is calibration -
determining the location and orientation of each sensor so that
observations in an image can be mapped to locations in the real world.
In this project, we study distributed approaches to camera calibration.
We propose an effective parameterization of the camera state that lets
us approximate the posterior distribution with a simple Gaussian.

Many online services rely on recommender systems that provide users with personalized suggestions for products. A popular approach is collaborative filtering, where product ratings from many users are combined to learn a relation between users and products. Most of the current collaborative filtering algorithms are centralized, requiring a central server that stores the ratings of all the users. In this project, we propose algorithms that work with data that is distributed across nodes in a computer cluster or a peer-to-peer network.
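A toy sketch of the setting, assuming ratings partitioned into shards, one per hypothetical node, with each node applying local SGD updates to shared low-rank factors. This is illustrative only, not the proposed algorithms.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, k = 20, 15, 3
U_true = rng.standard_normal((n_users, k))
V_true = rng.standard_normal((n_items, k))
# observe roughly half of the exact rank-3 ratings
obs = [(u, i, float(U_true[u] @ V_true[i]))
       for u in range(n_users) for i in range(n_items) if rng.random() < 0.5]
shards = [obs[j::4] for j in range(4)]   # partition ratings over 4 "nodes"

U = 0.1 * rng.standard_normal((n_users, k))
V = 0.1 * rng.standard_normal((n_items, k))
lr = 0.01
for epoch in range(300):
    for shard in shards:                 # each node trains on its own shard
        for u, i, r in shard:
            err = float(U[u] @ V[i]) - r
            U[u], V[i] = U[u] - lr * err * V[i], V[i] - lr * err * U[u]

rmse = np.sqrt(np.mean([(float(U[u] @ V[i]) - r) ** 2 for u, i, r in obs]))
```

The interesting distributed-systems questions, which this sketch sidesteps, are how the shared factors are synchronized across nodes and what happens when ratings cannot leave the node that owns them.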

Permutations are ubiquitous in real
world problems
such as voting, rankings, and data association. Representing
uncertainty
over permutations is challenging, however, since there are n!
possibilities
and typical approaches such as graphical models do not capture
the inherent mutual exclusivity constraints. Instead, we use the "low
frequency"
terms of a Fourier decomposition to represent distributions compactly.
We develop intuitive and efficient approximate inference algorithms
defined directly in the Fourier domain and show that they are effective
in a real camera-based multi-person tracking setting.
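The lowest-frequency Fourier terms correspond to first-order marginals: an n x n matrix M with M[i, j] = P(item i maps to position j), stored in place of all n! probabilities. A small sketch of that compression:

```python
import itertools
import numpy as np

n = 4
perms = list(itertools.permutations(range(n)))   # all n! = 24 permutations
rng = np.random.default_rng(0)
p = rng.random(len(perms))
p /= p.sum()                                     # arbitrary distribution

# first-order marginals: M[i, j] = P(item i is mapped to position j)
M = np.zeros((n, n))
for prob, perm in zip(p, perms):
    for i, j in enumerate(perm):
        M[i, j] += prob

# M is doubly stochastic (rows and columns each sum to 1) and needs
# n^2 = 16 numbers instead of n! = 24; the gap widens rapidly with n.
```

Higher-frequency Fourier terms refine this summary toward the exact distribution, so the representation trades accuracy for space in a controlled way.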

How can we coordinate multiple
robots in order to effectively monitor environmental phenomena, such as
algal growth in lakes?
We develop an efficient approximation algorithm for this problem, and
demonstrate its effectiveness on several real-world monitoring
problems.

We propose Shotgun, a parallel coordinate descent algorithm for minimizing L1-regularized losses.
Though coordinate descent seems inherently sequential, we prove convergence bounds for Shotgun which predict linear speedups,
up to a problem-dependent limit.
We present a comprehensive empirical study of Shotgun for Lasso and sparse logistic regression.
Our theoretical predictions on the potential for parallelism closely match behavior on real data.
Shotgun outperforms other published solvers on a range of large problems, proving to be one of the most
scalable algorithms for L1-regularized learning.
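A minimal sketch of a Shotgun-style update for the Lasso: each round, P coordinates are updated from the same snapshot of the residual, the step that would run in parallel (simulated here sequentially; variable names are illustrative).

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def shotgun_lasso(X, y, lam, P=2, rounds=500, seed=0):
    """Shotgun-style Lasso: each round picks P coordinates and updates
    them from the same snapshot of the residual (the parallel step,
    simulated here sequentially)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    col_sq = (X ** 2).sum(axis=0)
    w = np.zeros(d)
    r = y.copy()                                  # residual y - X @ w
    for _ in range(rounds):
        js = rng.choice(d, size=P, replace=False)
        deltas = [(j, soft(w[j] + X[:, j] @ r / col_sq[j],
                           lam / col_sq[j]) - w[j]) for j in js]
        for j, dlt in deltas:                     # apply updates together
            w[j] += dlt
            r -= dlt * X[:, j]
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
w_true = np.zeros(10)
w_true[:3] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.standard_normal(100)
w_hat = shotgun_lasso(X, y, lam=0.1)
```

With few simultaneous updates and weakly correlated features the parallel steps rarely conflict; the paper's problem-dependent limit on P formalizes when this stops being true.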

How can we devise a building
management strategy that saves energy and satisfies personal
preferences?
We develop a utility theoretic approach towards optimal light control
using data from sensor networks.

Learning the structure of probabilistic graphical models that
permit efficient exact inference is a crucial requirement of many
applications. We show that such models, *thin junction trees*, can be
learned efficiently and with global quality guarantees.

Relational learning addresses the situation where the data are not just
a set of attributes over independent entities, but also include
relations that induce dependencies between entities -- e.g.,
relational databases. We propose a model for relational data, based on
collective matrix factorization, that yields a joint model of
correlated data types while maintaining well-understood linear models
on individual relations. We show that incorporating additional
relations yields better classification results, and that our model
allows one to embed different types of entities in a shared
low-dimensional space.