Abstract: Spectral methods are a simple yet effective approach to extracting information from high-dimensional data. In the context of inference from a generalized linear model (GLM), they are often used to obtain an initial estimate that can also be employed as a ‘warm start’ for other algorithms. Specifically, in a GLM, the goal is to estimate a d-dimensional signal x from an n-dimensional observation of the form f(Ax, w), where A is a design matrix and w is a noise vector. Here, the spectral estimator is the principal eigenvector of a data-dependent matrix, whose spectrum exhibits a phase transition.
In the talk, I will start by (i) discussing the emergence of this phase transition for an i.i.d. Gaussian design A, and (ii) combining spectral methods with Approximate Message Passing (AMP) algorithms, thus solving a problem related to their initialization. I will then focus on GLMs with a correlated Gaussian design, which are widely adopted in high-dimensional regression. To characterize spectral estimators in this challenging setup, I will propose a novel approach based on AMP, which makes it possible to systematically characterize key spectral properties, such as the location of outlier eigenvalues and the overlap between top eigenvectors and unknown informative components. I will conclude by showing the generality of this technique via an application to matrix denoising with doubly heteroscedastic noise.
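As a rough illustration of the spectral estimator described above, the following sketch simulates a noiseless GLM with an i.i.d. Gaussian design and computes the principal eigenvector of a data-dependent matrix. Several choices here are assumptions made for illustration and are not taken from the talk: the link f(Ax, w) = |Ax| (phase retrieval), the matrix form D = (1/n) Aᵀ diag(T(y)) A, the trimmed-square preprocessing function T, the unit-norm signal, and all names and sizes.

```python
import numpy as np


def spectral_estimator(A, y, preprocess):
    """Unit-norm principal eigenvector of D = (1/n) * A^T diag(T(y)) A.

    The form of D and the preprocessing function T are illustrative
    assumptions; the abstract only says the matrix is data-dependent.
    """
    n, _ = A.shape
    t = preprocess(y)                      # T(y_i), one weight per observation
    D = (A.T * t) @ A / n                  # d x d symmetric matrix A^T diag(t) A / n
    eigvals, eigvecs = np.linalg.eigh(D)   # eigenvalues returned in ascending order
    return eigvecs[:, -1], eigvals


def trimmed_square(y, cap=5.0):
    """Hypothetical preprocessing: y^2, capped to keep the weights bounded."""
    return np.minimum(y ** 2, cap)


rng = np.random.default_rng(0)
d, n = 200, 4000                           # assumed sizes; sampling ratio n/d = 20

x = rng.standard_normal(d)
x /= np.linalg.norm(x)                     # assumed normalization: unit-norm signal

A = rng.standard_normal((n, d))            # i.i.d. Gaussian design
y = np.abs(A @ x)                          # assumed link: noiseless phase retrieval

x_hat, eigvals = spectral_estimator(A, y, trimmed_square)

# The sign of x is not identifiable from |Ax|, so compare up to sign.
overlap = abs(x_hat @ x)
print(f"overlap = {overlap:.3f}")
print(f"top two eigenvalues = {eigvals[-1]:.3f}, {eigvals[-2]:.3f}")
```

Rerunning the sketch with a smaller ratio n/d should make the top eigenvalue move closer to the rest of the spectrum and the overlap shrink, which is one informal way to observe the phase transition mentioned in the abstract.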