Deep learning on graphs
This talk is about deep learning applied to graphs, for predicting, inferring, and controlling complex systems. I’ll introduce my team’s “graph network” formalism and open-source library (github.com/deepmind/graph_nets), and describe how we’ve used these tools to learn physical reasoning, visual scene understanding, robotic control, multi-agent behavior, and physical construction.
Fully decentralized joint learning of personalized models and collaboration graphs
We consider the fully decentralized machine learning scenario where many users with personal datasets collaborate to learn models through local peer-to-peer exchanges, without a central coordinator. We propose to train personalized models that leverage a collaboration graph describing the relationships between the users’ personal tasks, which we learn jointly with the models. Our fully decentralized optimization procedure alternates between training nonlinear models given the graph in a greedy boosting manner, and updating the collaboration graph (with controlled sparsity) given the models. Throughout the process, users exchange messages only with a small number of peers (their direct neighbors in the graph and a few random users), ensuring that the procedure naturally scales to large numbers of users. We analyze the convergence rate, memory and communication complexity of our approach, and demonstrate its benefits compared to competing techniques on synthetic and real datasets.
Exact recovery in the Ising block model
We consider the problem associated to recovering the block structure of an Ising model given independent observations on the binary hypercube. This new model, called the Ising block model, is a perturbation of the mean eld approximation of the Ising model known as the Curie–Weiss model: the sites are partitioned into two blocks of equal size and the interaction between those of the same block is stronger than across blocks, to account for more order within each block. We study probabilistic, statistical and computational aspects of this model in the high-dimensional case when the number of sites may be much larger than the sample size.
[Joint work with P. Srivastava and P. Rigollet]
Functional protein design using geometric deep learning
Protein-based drugs are becoming some of the most important drugs of the XXI century. The typical mechanism of action of these drugs is a strong protein-protein interaction (PPI) between surfaces with complementary geometry and chemistry. Over the past three decades, large amounts of structural data on PPIs has been collected, creating opportunities for differentiable learning on the surface geometry and chemical properties of natural PPIs. Since the surface of these proteins has a non-Euclidean structure, it is a natural fit for geometric deep learning. In the talk, I will show how geometric deep learning methods can be used to address various problems in functional protein design such as interface site prediction, pocket classification, and search for surface motifs. I will present results on an ongoing work with collaborators Bruno Correia, Pablo Gainza-Cirauqui, and others from the EPFL Lab of Protein Design and Immunoengineering.
Computationally efficient estimation of the spectral gap of a Markov chain
We consider the problem of estimating from sample paths the absolute spectral gap of a reversible, irreducible and aperiodic Markov chain over a finite state space. We propose the UCPI (Upper Confidence Power Iteration) algorithm for this problem, a low-complexity algorithm which estimates the spectral gap in time O(n) and memory space O(㏒(n)2) given n samples. This is in stark contrast with most known methods which require at least memory space scaling linearly with the state space size so that they cannot be applied to large state spaces. Furthermore, UCPI is amenable to parallel implementation. More details at https://arxiv.org/abs/1806.060
Learning Quadratic Games on Networks
Individuals, or organisations, cooperate with or compete against one another in a wide range of practical situations. In the economics literature, such strategic interactions are often modelled as games played on networks, where an individual’s payoff depends not only on her action but also that of her neighbours. The current literature has largely focused on analysing the characteristics of network games in the scenario where the structure of the network, which is represented by a graph, is known beforehand. It is often the case, however, that the actions of the players are readily observable while the underlying interaction network remains hidden. In this talk, I will introduce two novel frameworks for learning, from the observations on individual actions, network games with linear-quadratic payoffs, and in particular the structure of the interaction network. Both frameworks are based on the Nash equilibrium of such games and involve solving a joint optimisation problem for the graph structure and the individual marginal benefits. We test the proposed frameworks on both synthetic and real world examples, and show that they can effectively and more accurately learn the games than the baselines. The proposed approach has both theoretical and practical implications for understanding strategic interactions in a network environment.
A general framework for structured prediction
Statistical learning theory offers powerful approaches to deal with supervised problems with linear output spaces (e.g. scalar values or vectors). However, problems with non-linear (and often non-convex/discrete) output spaces are becoming increasingly common. Examples include image segmentation or captioning, speech recognition, manifold regression, trajectory planning, protein folding, prediction of probability distributions or ranking to name a few. These settings are often referred to as “structured prediction” problems, since they require dealing with output spaces that have a specific structure, such as: strings, images, sequences, manifolds or graphs. In this talk we consider a novel structured prediction framework combining ideas from the literature on surrogate approaches and “likelihood estimation” methods. Within this setting we systematically design learning algorithms for a wide range of learning problems for which it is possible to prove strong theoretical guarantees, namely universal consistency and learning rates. The proposed analysis leverages techniques related to the theory of reproducing kernel Hilbert spaces and vector-valued regression in infinite dimensional settings.
Sparse network estimation
Embeddings for graph classification
New insights to partial monitoring
In a partial monitoring game, the learner repeatedly chooses an action, the environment responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his regret, which is the difference between his total cumulative loss and the total loss of the best fixed action in hindsight. In this paper, we characterize the minimax regret of any partial monitoring game with finitely many actions and outcomes. It turns out that the minimax regret of any such game is either zero, ?(T1/2), ?(T2/3), or ?(T). We provide computationally efficient learning algorithms that achieve the minimax regret within logarithmic factor for any game. In addition to the bounds on the minimax regret, if we assume that the outcomes are generated in an i.i.d. fashion, we prove individual upper bounds on the expected regret.
Predicting switching graph labelings with cluster specialists
We address the problem of predicting the labeling of a graph in an online setting when the labeling is changing over time. Our primary algorithm is based on a specialist approach; we develop the machinery of cluster specialists which probabilistically exploits the cluster structure in the graph. We show that one variant of this algorithm surprisingly only requires ?(㏒(n)) time on any trial t on an n-vertex graph. Our secondary algorithm is a quasi-Bayesian classifier which requires ?(t㏒(n)) time to predict at trial t. We prove switching mistake-bound guarantees for both algorithms. For our primary algorithm, the switching guarantee smoothly varies with the magnitude of the change between successive labelings. In preliminary experiments, we compare the performance of these algorithms against an existing algorithm (a kernelized Perceptron) and show that our algorithms perform better on synthetic data.
Towards decentralized reinforcement learning architectures for modelling agent societies
The evolution of cooperation in competitive environments has been relevant to studies in economics, game theory, psychology, social science and computing. Mathematical and computational models have been developed in order to provide insights about underlying mechanisms. More recently, we have witnessed the emergence of studies of societies based on agents that can learn their strategies as they interact. The applications of this work are many: from the design of self-organising agent systems to the understanding of the emergence of cooperation in human and animal societies (and, possibly, in the future, in mixed environments composed of humans and artificial agents).
In this talk I will give an overview of our ongoing work in modelling societies using multi-agent systems based on Reinforcement Learning. In particular, I will present our work in formalizing decentralized Reinforcement Learning architectures for modelling and studying social dilemmas. I will discuss the design of architectures composed of autonomous agents that do not rely on centralised coordination. Finally, I will discuss our ongoing work in this area and the open questions in this fascinating emerging field.
Flexible probabilistic models of networks in sequence data and in semi-supervised classification
Decentralized multi-armed bandits
We consider a stochastic bandit problem in a decentralized setting, in which all the nodes of the network play the same bandit problem and communicate with their neighbors to minimize future regret, measured as the sum of the regrets of each node. Using an approximate and delayed upper confidence bound algorithm and the averaging problem in decentralized networks we are able to propose an algorithm that improves the state of the art both theoretically and empirically. The algorithm assumes very little global information about the network and the regret scales naturally when the spectral gap decreases. At the same time, the asymptotic dependence of the regret on time does not worsen with respect to an optimal centralized algorithm. We provide lower bounds that show that the gap between the upper bound defined by our algorithm and the lower bound, although not optimal, is narrow.
Based on joint work with David Martínez-Rubio and Patrick Rebeschini.