Reasons to use many hidden layers

Igor Goldberg on 3 Feb 2022
Answered: Jayanti on 3 Jul 2025
Hi there. As far as I understand, according to the "universal approximation theorem", even if the model is nonlinear, a single hidden layer is enough to fit it, as shown here.
1) If so, what is the added value of using extra layers? The only guess I could come up with is that in transfer learning we want to freeze some of the layers while leaving others unfrozen, and if I use one hidden layer that seems impossible. Am I right?
2) Here Mr. Greg Heath says:
"The multilayer perceptron with one hidden layer is a universal approximator. The only reason to use more than one hidden layer is to reduce the total number of unknown weights by reducing the total number of hidden nodes (i.e., H1+H2 < H)."
Could you please explain that sentence to me in a little more detail?
3) Maybe there are other reasons to use many layers? I will be glad to hear your explanation.

Answers (1)

Jayanti on 3 Jul 2025
Hi Igor,
According to the "Universal Approximation Theorem", a neural network with just one hidden layer can approximate any nonlinear function, provided it has enough neurons and an appropriate activation function. However, in practice there are several important reasons why models with multiple hidden layers are commonly used:
  1. A single hidden layer often requires a very large number of neurons, which can lead to inefficient training and overfitting. Using multiple layers allows the network to learn hierarchical features, where earlier layers learn simple patterns and deeper layers build on them.
  2. Deeper networks often need fewer parameters to represent the same function, making them faster and easier to train.
  3. Deep architectures generally generalize better to unseen data when trained correctly.
  4. Also, transfer learning relies on freezing earlier layers while fine-tuning later ones, which is not possible with a single-hidden-layer architecture (see the sketch after this list).
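For point 4, here is a minimal sketch of the freezing idea, assuming the Deep Learning Toolbox; the layer sizes are hypothetical placeholders standing in for a pretrained network:
layers = [
    featureInputLayer(10)
    fullyConnectedLayer(20)      % earlier hidden layer, imagine it is pretrained
    reluLayer
    fullyConnectedLayer(20)      % later hidden layer, left trainable
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];
% Freeze the first hidden layer by zeroing its learn-rate factors, so only
% the deeper layers are updated during fine-tuning. With a single hidden
% layer there would be nothing to freeze while still leaving layers to tune.
layers(2).WeightLearnRateFactor = 0;
layers(2).BiasLearnRateFactor   = 0;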
Regarding Greg Heath’s statement, he is emphasizing that deeper networks can be more parameter-efficient for complex tasks: they can achieve the same or better representational power with fewer weights overall, making them easier to train and less prone to overfitting. For example, if a single hidden layer needs H = 100 neurons to reach a given accuracy but two hidden layers with H1 = H2 = 20 reach the same accuracy, then H1 + H2 = 40 < 100 and the total weight count drops accordingly.
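As a rough illustration of the weight-count argument, here is a minimal sketch assuming the Deep Learning Toolbox; the 10-input, 1-output problem size and the random placeholder data are hypothetical and are used only to size the networks:
% Hypothetical sizes: 10 inputs, 1 output; random placeholder data is used
% only so that configure() can determine the layer dimensions.
x = rand(10, 500);
t = rand(1, 500);
netShallow = configure(fitnet(100), x, t);      % one hidden layer, H = 100
netDeep    = configure(fitnet([20 20]), x, t);  % two hidden layers, H1 = H2 = 20
% numWeightElements counts all weights and biases:
% shallow: 10*100 + 100 + 100*1 + 1 = 1201
% deep:    10*20 + 20 + 20*20 + 20 + 20*1 + 1 = 661
fprintf('One hidden layer  (H = 100): %d parameters\n', netShallow.numWeightElements);
fprintf('Two hidden layers (20 + 20): %d parameters\n', netDeep.numWeightElements);
Whether 20 + 20 hidden neurons really match the accuracy of 100 depends on the problem, but when they do, the two-layer network gets there with roughly half the parameters, which is exactly the H1 + H2 < H argument.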
Hope this helps!
