Can a neural network be created without training?

Is it possible to transfer weights and parameters of a pre-trained network into another with a slightly different architecture WITHOUT having to do a finetuning?
All the tutorials show how to:
1 extract layers from a pre-trained network,
2 build a new architecture using them,
3 and then do a fine-tuning.
Without the fine-tuning, the layers alone cannot be used as if they were a complete SeriesNetwork (even if you do not need to modify the weights or the biases), and they therefore lack all those functions, such as "activations", that can be used on a SeriesNetwork but not on bare "layers".
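For concreteness, this is roughly the situation (a minimal sketch, assuming the AlexNet support package is installed; the layer name 'fc7' and the 227x227x3 input size are AlexNet-specific):
net = alexnet;                          % pre-trained SeriesNetwork
layers = net.Layers;                    % bare Layer array; the weights are still in there
im = rand(227, 227, 3, 'single');       % dummy input of the right size
feat = activations(net, im, 'fc7');     % works on the SeriesNetwork ...
% ... but there is no activations(layers, ...) for a bare Layer array,
% so the pre-trained weights cannot be used for inference as they are.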
2 Comments
Stefano on 18 Apr 2019
Thanks for the answer.
However, I've found the solution to the main problem:
just use the command "assembleNetwork".
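A minimal sketch of that route (assuming R2018b or later and the AlexNet support package; it works because every learnable layer of a pre-trained network already has non-empty Weights and Bias, so nothing needs training):
net = alexnet;
layers = net.Layers;                    % keep or lightly edit the pre-trained layers
newNet = assembleNetwork(layers);       % back to a usable network, without training
im = rand(227, 227, 3, 'single');
feat = activations(newNet, im, 'fc7');  % activations/predict work again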
Abhishek Singh on 9 Jun 2019
Maybe accept the answer, or write your own answer and accept it, so this could be helpful for others.


Answers (1)

Abhishek Singh on 15 Apr 2019
The question is too broad to have a single generic answer.
However, we can discuss the essence of your question and try to get to the answer in specific situations.
If you take the pre-trained parameters (weights + biases) and simply copy them over to an identical architecture, you will still need to add the identical activation functions for it to work properly.
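A hedged sketch of this case (the index 17 and AlexNet itself are only assumptions for illustration): the parameters are ordinary array-valued properties and can be copied layer by layer:
src = alexnet;
fcOld = src.Layers(17);                                 % a pre-trained fullyConnectedLayer
fcNew = fullyConnectedLayer(fcOld.OutputSize, 'Name', fcOld.Name);
fcNew.Weights = fcOld.Weights;                          % copy weights
fcNew.Bias    = fcOld.Bias;                             % copy bias
% The surrounding reluLayer, softmaxLayer, etc. must be recreated as well,
% otherwise the copied parameters end up in a different function composition.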
If you take the pre-trained parameters (weights + biases) and copy them over to a new architecture that has fewer layers than the original, it will also work, provided the identical activation functions are added.
If you take the pre-trained parameters and copy them over to a new architecture that has new layers that must be randomly initialized, then there is a very low probability that it will work on the first attempt. This depends on the initialization of the parameters in the new layers, whose pre-trained parameters don't exist in the old network. To be pedantic, if the task is very similar and the number of new layers added at the end of the network is very small (approaching 1), it might still work better than a network whose parameters were initialized from scratch.
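A hedged sketch of this last case (the layer indices, the 4096 input size and the 10-class task assume AlexNet and are only illustrative): reuse the pre-trained feature layers, append new manually initialized classification layers, and assemble without training:
net = alexnet;
old = net.Layers;
fcNew = fullyConnectedLayer(10, 'Name', 'fc_new');
fcNew.Weights = 0.01 * randn(10, 4096);   % random init; assembleNetwork needs non-empty parameters
fcNew.Bias    = zeros(10, 1);
outNew = classificationLayer('Name', 'out_new', 'Classes', categorical(1:10));
newNet = assembleNetwork([
    old(1:end-3)                          % everything up to (and including) the old 'drop7'
    fcNew
    softmaxLayer('Name', 'softmax_new')
    outNew]);
% Without fine-tuning, fc_new stays random, so predictions on the new task
% will be poor, even though the assembled network is fully usable for inference.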
However, there is a lot of literature on techniques for smartly initializing a network. Apparently, initialization of the parameters gets more and more important as the depth of the network grows. Good parameter initializations (like Kaiming He initialization) can help the training process tremendously. There is a paper on "Fixup initialization" which takes this idea to the extreme. Another sub-domain of deep learning research which might be of interest is "one-shot learning". I also remember a blog post where transfer learning was done by adding new layers in the middle of the network instead of at the end. The trick was to add these layers as "residual" layers with the parameters set to zero: at the beginning of training these layers acted as an identity function for the previous layers, but they soon began learning new weights where that helped optimize the loss (I searched everywhere but can't seem to find the blog now, sorry).
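A rough sketch of that zero-initialized residual insert (not the exact recipe from the blog; the sizes and layer names here are made up): the inserted branch starts as all zeros, so the addition behaves as an identity mapping at the start of training:
branch = convolution2dLayer(3, 64, 'Padding', 'same', 'Name', 'new_conv');
branch.Weights = zeros(3, 3, 64, 64);     % zero init: the branch outputs zeros at first
branch.Bias    = zeros(1, 1, 64);
lgraph = layerGraph([
    imageInputLayer([32 32 64], 'Name', 'in', 'Normalization', 'none')
    branch
    additionLayer(2, 'Name', 'add')]);    % add/in1 is fed by new_conv
lgraph = connectLayers(lgraph, 'in', 'add/in2');   % skip connection: identity path
% At initialization, add = new_conv(x) + x = 0 + x = x, so the surrounding
% pre-trained layers see exactly what they saw before the insertion.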
