On the Power of Shallow Learning
We prove that the neural network Gaussian process (NNGP) kernel or neural tangent kernel (NTK) of any wide, deep fully-connected network architecture can be realized by a network with just a single hidden layer and a specially designed activation function. We validate this surprising result with several experiments.
J. B. Simon
M. R. DeWeese