Obtaining accurate estimates of wireless link quality is important to applications ranging from WiFi deployment, to multihop routing, to mobile robotics. To address this need we have developed a novel nonparametric probabilistic model for predicting link qualities in a complex environment, which can flexibly represent the variability we observe in the real world, incorporate multiple sources of information, and exploit spatial structure to minimize training requirements. We implemented our model and tested it on a deployment of 31 wirelss sensors (Motes) and found that it outperforms other common approaches on several different criterion both when data is sparse and when data is abundant.
We study two seemingly very different problems: placing sensors in a municipal water distribution network in order to detect possible contaminations, and selecting informative weblogs to read on the Internet. We show that both problems share essential structure: Information cascades spreading over the blogosphere can be modeled similarly as contaminations spreading through water networks. We develop an approximation algorithm for detecting such spreading cascades, which performs well on real blog data, as well as on the Battle of Water Sensor Networks competition.
Stability is a desirable characteristic for linear dynamical systems, but is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug sales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein, with positive results in terms of accuracy, quality of simulated sequences, and efficiency.
We consider the problem of cost-effectively instrumenting a chair with pressure sensors, in order to enable seating posture recognition. Our approach reduces the sensor hardware cost by a factor of 30 compared to previous solutions, while achieving comparable classification accuracy. We test our approach in a user study on a real deployment.
When placing wireless sensor networks, the placed sensors both have to be informative and communicate well. We develop pSPIEL, an efficient algorithm for padded Sensor Placements at Informative and cost-Effective Locations, which near-optimally trades off informativeness and communication cost of a wireless sensor network deployment.

Often, nodes in wireless networks need to perform probabilistic inference to combine local, noisy observations into a global, joint estimate of the system state. Suppose that each node carries a fragment of the probabilistic model, and the nodes wish to collaborate, in order to solve the inference task in a distributed manner. In this project, we examine the data structures and algorithms that lead to accurate solutions in presence of node and link failures.

A key problem in deploying a network of cameras is calibration - determining the location and orientation of each sensor so that observations in an image can be mapped to locations in the real world. In this project, we study distributed approaches to camera calibration. We propose an effective parameterization of the camera state that lets us approximate the posterior distribution with a simple Gaussian.

Many online services rely on recommender systems that provide users with personalized suggestions for products. A popular approach is collaborative filtering, where product ratings from many users are combined to learn a relation between users and products. Most of the current collaborative filtering algorithms are centralized, requiring a central server that stores the ratings of all the users. In this project, we propose algorithms that work with data that is distributed across nodes in a computer cluster or a peer-to-peer network.

Permutations are ubiquitous in real world problems such as voting, rankings, and data association. Representing uncertainty over permutations is challenging, however, since there are n! possibilities and typical approaches such as graphical models do not capture the inherent mutual exclusivity constraints. Instead,we use the "low frequency" terms of a Fourier decomposition to represent distributions compactly. We develop intuitive and efficient approximate inference algorithms defined directly in the Fourier domain and show that they are effective in a real camera-based multi-person tracking setting.
How can we coordinate multiple robots in order to effectively monitor environmental phenomena, such as algal growth in lakes? We develop an efficient approximation algorithm for this problem, and demonstrate its effectiveness on several real-world monitoring problems.
We propose Shotgun, a parallel coordinate descent algorithm for minimizing L1-regularized losses. Though coordinate descent seems inherently sequential, we prove convergence bounds for Shotgun which predict linear speedups, up to a problem-dependent limit. We present a comprehensive empirical study of Shotgun for Lasso and sparse logistic regression. Our theoretical predictions on the potential for parallelism closely match behavior on real data. Shotgun outperforms other published solvers on a range of large problems, proving to be one of the most scalable algorithms for L1.
Project page, Supplementary Material and Code.
Light control
How can we devise a building management strategy which saves energy and satisfy personal preferences? We develop a utility theoretic approach towards optimal light control using data from sensor networks.
Junction tree
Learning the structure of probabilistic graphical models that permit efficient exact inference is a crucial requirement of many applications. We show that such models, thin junction trees, can be learned efficiently and with global quality guarantees.
ER Diagram
Relational data addresses the situation where the data are not just a set of attributes over independent entities; but also include relations that induce dependencies between entities -- e.g., relational databases. We propose a model for relational data, based on collective matrix factorization, that yields a joint model of correlated data types while maintaining well-understood linear models on individual relations. We show that incorporating additional relations yields better classification results, and that our model allows one to embed different types of entities in a shared low-dimensional space.