
Machine learning: Deepest learning as statistical data assimilation problems

Title: Machine learning: Deepest learning as statistical data assimilation problems
Publication Type: Journal Article
Year of Publication: 2018
Authors: Abarbanel, H. D. I.; Rozdeba, P. J.; Shirman, S.
Journal: Neural Computation
Volume: 30
Pagination: 2025-2055
Date Published: 2018/08
Type of Article: Article
ISSN: 0899-7667
Accession Number: WOS:000445128000001
Keywords: algorithm; bound-constrained optimization; Computer Science; integrators; Neurosciences & Neurology
Abstract

We formulate an equivalence between machine learning and statistical data assimilation as widely used in the physical and biological sciences. The correspondence is that layer number in a feedforward artificial network is the analog of time in the data assimilation setting. This connection has been noted in the machine learning literature. We add a perspective that expands on how methods from statistical physics and aspects of Lagrangian and Hamiltonian dynamics play a role in how networks can be trained and designed. Within the discussion of this equivalence, we show that adding more layers (making the network deeper) is analogous to adding temporal resolution in a data assimilation framework. Extending this equivalence to recurrent networks is also discussed. We explore how one can find a candidate for the global minimum of the cost functions in the machine learning context using a method from data assimilation. Calculations on simple models from both sides of the equivalence are reported. Also discussed is a framework in which the time or layer label is taken to be continuous, providing a differential equation, the Euler-Lagrange equation with its boundary conditions, as a necessary condition for a minimum of the cost function. This shows that the problem being solved is a two-point boundary value problem familiar in the discussion of variational methods. The use of continuous layers is denoted "deepest learning." These problems respect a symplectic symmetry in continuous-layer phase space. Both Lagrangian and Hamiltonian versions of these problems are presented. Their well-studied implementation in discrete time/layer steps, while respecting the symplectic structure, is addressed. The Hamiltonian version provides a direct rationale for backpropagation as a solution method for a certain two-point boundary value problem.
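
The variational formulation the abstract describes can be made concrete with a short worked sketch. The LaTeX below is our reconstruction under standard calculus-of-variations notation (s for the continuous layer label, x(s) for the network state, A for the action-like cost, p for the canonical momentum); the paper's own symbols and choice of Lagrangian may differ.

\documentclass{article}
\usepackage{amsmath}
\begin{document}

% Sketch of the continuous-layer ("deepest learning") formulation described
% in the abstract; the notation is assumed, not taken from the paper.
With the layer label treated as a continuous variable $s$ (the analog of
time in data assimilation), training seeks a stationary point of an
action-like cost over network states $x(s)$:
\begin{equation}
  A[x] \;=\; \int_{s_0}^{s_f} L\bigl(x(s), \dot{x}(s), s\bigr)\, ds .
\end{equation}
The necessary condition for a minimum is the Euler--Lagrange equation
\begin{equation}
  \frac{d}{ds}\,\frac{\partial L}{\partial \dot{x}}
  \;-\; \frac{\partial L}{\partial x} \;=\; 0 ,
\end{equation}
with conditions at both ends: $x(s_0)$ set by the input data and a boundary
condition at $s_f$ coming from the output cost, i.e.\ a two-point boundary
value problem. Passing to the Hamiltonian picture with canonical momentum
$p = \partial L / \partial \dot{x}$ and $H(x, p, s) = p\,\dot{x} - L$ gives
the symplectic flow
\begin{equation}
  \dot{x} = \frac{\partial H}{\partial p}, \qquad
  \dot{p} = -\,\frac{\partial H}{\partial x},
\end{equation}
in which integrating $x$ forward and $p$ backward between the two boundaries
is the continuous-layer counterpart of the forward pass and backpropagation.

\end{document}

Read this only as an illustration of the two-point boundary value structure; in the data assimilation literature the Lagrangian typically combines a measurement-error term with a model-error term, and we have deliberately left L generic here.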

DOI: 10.1162/neco_a_01094
Short Title: Neural Comput.
Student Publication: No