Overview: This paper is hard to classify, and I don't think I fully understand section 4. Nevertheless, it provides a completely new family of neural networks by interpreting residual-like networks as ODE integrators.

ResNet as ODE solver: A ResNet can be viewed as an Euler integrator for an ODE with initial and terminal conditions at times t=0 and t=1, while the hidden layers in the middle are the fixed-step evaluations.
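To make the correspondence concrete, here is a minimal sketch (my own toy example, not from the paper): a stack of residual blocks h_{k+1} = h_k + f(h_k) is exactly explicit Euler on dh/dt = f(h, t) with step size 1.

```python
import numpy as np

def f(h, t, W):
    # A toy "layer": the dynamics function f(h, t) with shared parameters W.
    # (It happens to ignore t, matching a weight-tied residual block.)
    return np.tanh(W @ h)

def resnet_forward(h, W, n_layers=10):
    # Residual blocks: h_{k+1} = h_k + f(h_k), i.e. Euler with step size 1.
    for k in range(n_layers):
        h = h + f(h, k, W)
    return h

def euler_forward(h, W, t0=0.0, t1=1.0, n_steps=10):
    # Explicit Euler on dh/dt = f(h, t): h <- h + dt * f(h, t).
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        h = h + dt * f(h, t, W)
        t += dt
    return h
```

With n_steps layers over a time span of length n_steps (so dt = 1), the two functions compute the same thing, which is the whole point of the interpretation.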

ODE Network: We can push this approach further by modeling the evaluation continuously and adaptively, i.e. evaluating with an off-the-shelf ODE solver. The hardest problem is computing the backward gradients; the paper shows, via the adjoint sensitivity method, that this can be done by calling another ODE solver backward in time.
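A rough sketch of the backward pass, using toy linear dynamics of my own choosing (so the Jacobian df/dh is just A) and SciPy's generic solver rather than anything from the paper: the adjoint a(t) = dL/dh(t) satisfies its own ODE, da/dt = -(df/dh)^T a, which is integrated from t1 back to t0 to recover the gradient with respect to the input state.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy linear dynamics dh/dt = A h; for this choice df/dh = A exactly.
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])

def forward(h0, t0=0.0, t1=1.0):
    # Forward pass: integrate the state ODE from t0 to t1.
    sol = solve_ivp(lambda t, h: A @ h, (t0, t1), h0,
                    rtol=1e-8, atol=1e-10)
    return sol.y[:, -1]

def adjoint_grad(dLdh1, t0=0.0, t1=1.0):
    # Backward pass: integrate the adjoint ODE da/dt = -(df/dh)^T a
    # from a(t1) = dL/dh(t1) back to t0; a(t0) is dL/dh(t0).
    sol = solve_ivp(lambda t, a: -A.T @ a, (t1, t0), dLdh1,
                    rtol=1e-8, atol=1e-10)
    return sol.y[:, -1]
```

In the full method the adjoint is augmented with extra states to also accumulate parameter gradients along the way; this sketch only recovers the gradient with respect to the initial state.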

Advantage: Some obvious advantages of this formulation are: a) memory (constant) and computational efficiency; b) scalable and invertible normalizing flows; c) continuous-time evaluation. I am not sure I understand b), but for c), under this abstraction ResNet is a special case of an RNN, and both are special cases of a continuous-time evaluation model where samples can arrive at any time t.