How to use trainlm with L2 regularization
As a simple test problem, I am training a neural network with one hidden layer for function fitting. The training method "trainlm" works well when I set
net.performParam.regularization = 0.
To prevent over-fitting (among other purposes), I would like to introduce L2 regularization. However, when I set
net.performParam.regularization = 1e-6 (or any other positive number),
the training stops at iteration 3 with "Maximum Mu reached".
Can we use trainlm with L2 regularization at all?
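For reference, a minimal sketch of the kind of setup I mean (the data, hidden size, and target function here are placeholders, not my actual problem):
x = linspace(-1, 1, 201);                 % sample inputs
t = sin(2*pi*x) + 0.1*randn(size(x));     % noisy targets for function fitting
net = fitnet(10, 'trainlm');              % one hidden layer, trained with trainlm
net.performParam.regularization = 0;      % this trains fine
% net.performParam.regularization = 1e-6; % this stops early with "Maximum Mu reached"
net = train(net, x, t);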
3 Comments
Shivansh
on 11 Sep 2023
Hi Hongyun,
The above code can be executed with regularization, but the parameters need to be in sync with each other. Some possible workarounds:
- Decrease the complexity of the model (it works with hidden size = 32).
- Decrease the strength of the regularization (it works with net.performParam.regularization = 1e-7).
These actions will let the code run, but may not lead to optimal results.
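A rough sketch of those two workarounds, assuming a fitnet/train setup (x and t stand for your training inputs and targets):
net = fitnet(32, 'trainlm');             % workaround 1: reduced hidden size
net.performParam.regularization = 1e-7;  % workaround 2: weaker regularization
net = train(net, x, t);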
You get results when training first without regularization and then with regularization because the first run moves the initial weights close to an optimal setting, and the second run then refines the solution.
Another way to work around the problem with the same parameters is to run the regularized training first. In that order, the regularized training reaches the maximum mu value and terminates prematurely, as you mentioned. However, when you subsequently train the network without regularization, it starts from the weights obtained in the previous run and continues the optimization. Since the network is already initialized with weights close to the optimal solution, the unregularized training can further improve the performance and reach the desired goal.
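In code, the two-stage idea looks roughly like this (shown with the unregularized stage first; train continues from the network's current weights):
% Stage 1: train without regularization to move the weights near an optimum
net.performParam.regularization = 0;
net = train(net, x, t);
% Stage 2: retrain with regularization, starting from those weights
net.performParam.regularization = 1e-6;
net = train(net, x, t);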
Answers (1)
Ashu
on 6 Sep 2023
Hey Wang,
I understand that you are training a network with "trainlm" and the training stops with "Maximum Mu reached". The problem with regularization is that it is difficult to determine the optimum value for the ratio that balances the error term against the weight penalty. If the weight penalty is weighted too heavily, the network does not adequately fit the training data; if it is weighted too lightly, you can still get overfitting.
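For context, the regularization parameter blends the data-fit error with a mean-squared-weight penalty, roughly as in this sketch (my paraphrase of the documented behaviour, not the exact internal code; y is the network output):
r    = net.performParam.regularization;  % ratio between the two terms
y    = net(x);                           % network output
msE  = mean((t(:) - y(:)).^2);           % data-fit term (mean squared error)
msW  = mean(getwb(net).^2);              % penalty term (mean squared weights and biases)
perf = (1 - r)*msE + r*msW;              % regularized performance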
The following suggestions might help you train the network better.
1. Experiment with the training parameters of "trainlm", for example by increasing "net.trainParam.mu_max" or adjusting "net.trainParam.mu_dec" (see the sketch after this list). For the full list of parameters, please refer to the "trainlm" documentation.
2. Use automated regularization ("trainbr"): the weights and biases of the network are assumed to be random variables with specified distributions, and the regularization parameters are related to the unknown variances associated with these distributions. These parameters can then be estimated using statistical techniques. Please refer to the "trainbr" documentation to learn more.
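A sketch of both suggestions (the specific values are illustrative, not tuned):
% Suggestion 1: loosen the mu schedule for trainlm
net = fitnet(10, 'trainlm');
net.trainParam.mu_max = 1e12;            % default is 1e10; raising it delays "Maximum Mu reached"
net.trainParam.mu_dec = 0.5;             % mu decrease factor (default 0.1)
net.performParam.regularization = 1e-6;
net = train(net, x, t);
% Suggestion 2: Bayesian regularization chooses the penalty automatically
net = fitnet(10, 'trainbr');
net = train(net, x, t);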
I hope this was helpful.