Abstract:
We consider a complete dictionary recovery problem, in which we are given a data matrix Y, and the goal is to factor it into a product Y ~ A_0 X_0, with A_0 a square and invertible matrix and X_0 is a sparse matrix of coefficients. This is an abstraction of the dictionary learning problem, in which we try to learn a concise approximation to a given dataset. While dictionary learning is widely (and effectively!) used in signal processing and machine learning, relatively little is known about it in theory. Much of the difficulty owes to the fact that standard learning algorithms solve nonconvex problems, and are difficult to analyze globally.
We describe the first efficient algorithm which provably learns representations in which the matrix X_0 has as many as O(n) nonzeros per column, under a suitable probability model for X_0. Previous efficient algorithms either only worked for very sparse instances (O(n^{1/2}) nonzero per column) or required multiple rounds of SDP relaxation. Our results follow from a reformulation of the dictionary recovery problem as a nonconvex optimization over a high dimensional sphere. This particular nonconvex problem has a surprising property: once about n^3 data samples have been observed, with high probability the objective function has no spurious local minima. We prove this under probabilistic data models, and give numerical evidence that the same property holds for real imagery datasets. We use the geometric characterization of the objective landscape to develop a Riemannian trust region method which efficiently and provably obtains the target solution.
This geometric phenomenon, in which seemingly challenging nonconvex problems can be solved efficiently, also arises in problems such as tensor decomposition and phase recovery from magnitude measurements. We sketch these connections, and illustrate our results with applications to microscopy and computer vision. *Joint work with Ju Sun and Qing Qu (Columbia)
Bio:
John Wright is an Assistant Professor in the Electrical Engineering Department at Columbia University. He received his PhD in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2009, and was with Microsoft Research from 2009-2011. His research is in the area of high-dimensional data analysis. In particular, his recent research has focused on developing algorithms for robustly recovering structured signal representations from incomplete and corrupted observations, and applying them to practical problems in imaging and vision. His work has received a number of awards and honors, including the 2009 Lemelson-Illinois Prize for Innovation for his work on face recognition, the 2009 UIUC Martin Award for Excellence in Graduate Research, and a 2008-2010 Microsoft Research Fellowship, and the Best Paper Award from the Conference on Learning Theory (COLT) in 2012, the 2015 PAMI TC Young Researcher Award, and the 2015 SPARS Best Student Paper Award (for PhD Advisees Ju Sun and Qing Qu).
Ещё видео!