Abstract
This paper carries out the convergence analysis and algorithm implementation of a novel sample-wise backpropagation method for training a class of stochastic neural networks (SNNs). A preliminary discussion of this SNN framework first appeared in [Archibald et al., Discrete Contin. Dyn. Syst. Ser. S, 15 (2022), pp. 2807-2835]. The structure of the SNN is formulated as a discretization of a stochastic differential equation (SDE). A stochastic optimal control framework is introduced to model the training procedure, and a sample-wise approximation scheme for the adjoint backward SDE is applied to improve the efficiency of the stochastic optimal control solver, which is equivalent to backpropagation for training the SNN. The convergence analysis is derived by introducing a novel joint conditional expectation for the gradient process. Under a convexity assumption, our result indicates that the number of SNN training steps should be proportional to the square of the number of layers. In the implementation of the sample-based SNN algorithm with the benchmark MNIST dataset, we adopt the convolutional neural network (CNN) architecture and demonstrate that our sample-based SNN algorithm is more robust than the conventional CNN.
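The abstract describes the SNN as a discretization of an SDE but does not give the equations. As a rough illustration only, the sketch below assumes an Euler-Maruyama step of the form X_{n+1} = X_n + h f(X_n, u_n) + sqrt(h) sigma_n xi_n, with one layer per time step. The scalar state, the tanh drift, and the names `snn_forward`, `weights`, `biases`, and `sigmas` are all hypothetical choices made for this example and are not taken from the paper.

```python
import math
import random


def snn_forward(x0, weights, biases, sigmas, T=1.0, rng=None):
    """Propagate a scalar state through N stochastic layers.

    Each layer is one Euler-Maruyama step of an assumed SDE
    dX = f(X, u) dt + sigma dW, where the drift f is a hypothetical
    tanh unit with parameters u_n = (w_n, b_n).
    """
    if rng is None:
        rng = random.Random()
    n_layers = len(weights)
    h = T / n_layers  # step size: the pseudo-time horizon T split over N layers
    x = x0
    path = [x]
    for w, b, s in zip(weights, biases, sigmas):
        drift = math.tanh(w * x + b)   # assumed drift f(x, u_n)
        xi = rng.gauss(0.0, 1.0)       # standard normal sample
        x = x + h * drift + math.sqrt(h) * s * xi
        path.append(x)
    return path
```

With every entry of `sigmas` set to zero the noise term vanishes and the update reduces to a deterministic residual-style layer, which is one way to see how an SNN of this kind could extend a conventional network.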
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 593-621 |
| Number of pages | 29 |
| Journal | SIAM Journal on Numerical Analysis |
| Volume | 62 |
| Issue number | 2 |
| Publication status | Published - 2024 |
| Externally published | Yes |
Keywords
- backward stochastic differential equations
- convergence analysis
- probabilistic learning
- stochastic gradient descent
- stochastic neural networks
Title
Numerical Analysis for Convergence of a Sample-Wise Backpropagation Method for Training Stochastic Neural Networks