### Uncertainty Propagation in Deep Neural Network Using Active Subspace

Posted on 21/10/2019, in Paper.
• Overview: Monte Carlo (MC) estimates of the uncertainty of a prediction must sample the inputs efficiently and evaluate the model on them to obtain the first and second moments of the output. This paper adopts the active subspace method proposed by Constantine, Dow and Wang (2014) and builds a complete workflow for propagating input uncertainty through a DNN.
• Active Subspace: Denote the DNN as $f(x)$ with $x \in \mathbb{R}^d$. We look for a matrix $S_{d\times r}$ with orthonormal columns such that: \begin{equation} f(x) \approx g(S^T x) \end{equation} The active subspace is then defined as $span(S)$. It can be found by an eigenvalue decomposition of the expected outer product of the gradient: \begin{equation} W\Lambda W^T = C = \int \nabla f(x) \nabla f(x)^T \pi_x(x) dx \end{equation} where $\pi_x(x)$ is the distribution of the input $x$, approximated empirically by samples. We take $S$ to be the first $r$ eigenvectors of $C$ (the first $r$ columns of $W$) and work in the space they span, provided their eigenvalues account for most of the variation of $f$.
• Response Surface: Substituting the projection $x_r = S^T x$, we define the response surface as: \begin{equation} RS(x_r) = g(x_r) \approx f(x) \end{equation}
• Estimate Output Distribution: We still sample $x$ from $\pi_x$, but project each sample onto the active subspace $span(S)$ and evaluate the response surface there, which reduces the computational cost.
• Result: Their experiment on MNIST shows that the first one or two eigenvectors capture the pixel input space well, and that the output uncertainty tends to have a linear relation with the input uncertainty across various noise levels. They also show that the first two moments obtained by MC sampling approximate the true ones well.
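
The workflow summarized above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the paper's implementation: the "network" `f` is a hypothetical toy function that varies along a single direction `a`, its gradient is written by hand (a real DNN would use automatic differentiation), $\pi_x$ is assumed to be a standard normal, and the response surface is assumed to be a degree-7 polynomial fitted by least squares. All names (`f`, `grad_f`, `active_subspace`) are made up for this example.

```python
import numpy as np

def f(x, a):
    # Hypothetical stand-in for the DNN: varies only along direction `a`.
    return np.sin(x @ a)

def grad_f(x, a):
    # Hand-written gradient of the stand-in, one row per sample.
    return np.cos(x @ a)[:, None] * a[None, :]

def active_subspace(grads, r):
    # Sample-average estimate of C = E[grad f grad f^T], then eigendecomposition.
    C = grads.T @ grads / grads.shape[0]
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    return eigvals[order], eigvecs[:, order[:r]]

rng = np.random.default_rng(0)
d, r, n_train, n_mc = 20, 1, 500, 20000
a = rng.normal(size=d)
a /= np.linalg.norm(a)

# Draw samples from pi_x (assumed standard normal) and find S (d x r).
x_train = rng.normal(size=(n_train, d))
eigvals, S = active_subspace(grad_f(x_train, a), r)

# Response surface g: polynomial fit in the reduced coordinate x_r = S^T x.
x_r = (x_train @ S)[:, 0]
coeffs = np.polyfit(x_r, f(x_train, a), deg=7)

# Propagate uncertainty: sample x, project, evaluate the cheap surrogate.
x_mc = rng.normal(size=(n_mc, d))
y_rs = np.polyval(coeffs, (x_mc @ S)[:, 0])
y_full = f(x_mc, a)                        # reference: full model on the same samples

mean_rs, var_rs = y_rs.mean(), y_rs.var()
mean_full, var_full = y_full.mean(), y_full.var()
print("leading eigenvalues:", eigvals[:3])
print("surrogate mean/var:", mean_rs, var_rs)
print("full-model mean/var:", mean_full, var_full)
```

In this toy case $C$ has rank one by construction, so a single eigenvector suffices; for a real network the choice of $r$ would be guided by how quickly the eigenvalues decay.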

Review Questions

• Q1. What does the matrix $C$ capture, and how does it identify the active subspace?