Reasons to use many hidden layers

Igor Goldberg on 3 Feb 2022
Answered: Jayanti on 3 Jul 2025
Hi there. As far as I understand, according to the "universal approximation theorem", even if the model is nonlinear, a single hidden layer is enough to fit it, as shown here.
1) If so, what is the added value of using extra layers? The only guess I could come up with is that in transfer learning we want to freeze some of the layers while leaving others unfrozen, and if I use one hidden layer that seems impossible. Am I right?
2) Here Mr. Greg Heath says:
"The multilayer perceptron with one hidden layer is a universal approximator. The only reason to use more than one hidden layer is to reduce the total number of unknown weights by reducing the total number of hidden nodes (i.e., H1+H2 < H)."
Could you please explain that sentence to me in a little more detail?
3) Maybe there are other reasons to use many layers? I will be glad to hear your explanation.

Answers (1)

Jayanti on 3 Jul 2025
Hi Igor,
According to the "Universal Approximation Theorem", a neural network with just one hidden layer can approximate any nonlinear function, provided it has enough neurons and an appropriate activation function. However, in practice there are several important reasons why models with multiple hidden layers are commonly used:
  1. A single hidden layer often requires a very large number of neurons, which can lead to inefficient training and overfitting. Using multiple layers allows the network to learn hierarchical features, where earlier layers learn simple patterns and deeper layers build on them.
  2. Deeper networks often need fewer parameters to represent the same function, making them faster and easier to train.
  3. Deep architectures generally generalize better to unseen data when trained correctly.
  4. Also, transfer learning relies on freezing earlier layers while fine-tuning later ones, which is not possible with a single-hidden-layer architecture (see the sketch after this list).
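For point 4, here is a minimal sketch of the freezing idea, assuming the Deep Learning Toolbox; the layer sizes are hypothetical placeholders standing in for a pretrained network:
layers = [
    featureInputLayer(10)
    fullyConnectedLayer(20)      % earlier hidden layer, imagine it is pretrained
    reluLayer
    fullyConnectedLayer(20)      % later hidden layer, left trainable
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];
% Freeze the first hidden layer by zeroing its learn-rate factors, so only
% the deeper layers are updated during fine-tuning. With a single hidden
% layer there would be nothing to freeze while still leaving layers to tune.
layers(2).WeightLearnRateFactor = 0;
layers(2).BiasLearnRateFactor   = 0;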
Regarding Greg Heath’s statement, he is emphasizing that deeper networks can be more parameter-efficient for complex tasks: they can achieve the same or better representational power with fewer weights overall, making them easier to train and less prone to overfitting. For example, if a single hidden layer needs H = 100 neurons to reach a given accuracy but two hidden layers with H1 = H2 = 20 reach the same accuracy, then H1 + H2 = 40 < 100 and the total weight count drops accordingly.
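As a rough illustration of the weight-count argument, here is a minimal sketch assuming the Deep Learning Toolbox; the 10-input, 1-output problem size and the random placeholder data are hypothetical and are used only to size the networks:
% Hypothetical sizes: 10 inputs, 1 output; random placeholder data is used
% only so that configure() can determine the layer dimensions.
x = rand(10, 500);
t = rand(1, 500);
netShallow = configure(fitnet(100), x, t);      % one hidden layer, H = 100
netDeep    = configure(fitnet([20 20]), x, t);  % two hidden layers, H1 = H2 = 20
% numWeightElements counts all weights and biases:
% shallow: 10*100 + 100 + 100*1 + 1 = 1201
% deep:    10*20 + 20 + 20*20 + 20 + 20*1 + 1 = 661
fprintf('One hidden layer  (H = 100): %d parameters\n', netShallow.numWeightElements);
fprintf('Two hidden layers (20 + 20): %d parameters\n', netDeep.numWeightElements);
Whether 20 + 20 hidden neurons really match the accuracy of 100 depends on the problem, but when they do, the two-layer network gets there with roughly half the parameters, which is exactly the H1 + H2 < H argument.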
Hope this helps!
