7.5.1 Regularization
You can apply regularization to the training by setting the option Regularization to a positive number. The criterion function minimized in the training then becomes
$$W_N^{\delta}(\theta) = V_N(\theta) + \delta\,\theta^T\theta \tag{7.1}$$
instead of $V_N(\theta)$, given in Section 2.5.3, Training Feedforward and Radial Basis Function Networks, where $\delta$ is the number you specify with the option. The second term in Eq. (7.1) is called the regularization term; it acts like a spring pulling the parameters toward the origin. The spring only marginally influences those parameters that are important for the first term $V_N(\theta)$, while parameters that do not have any large impact on $V_N(\theta)$ are pulled to the origin by the regularization. Parameters in this second class are effectively excluded from the fit, which reduces the network's flexibility or, equivalently, the number of efficient parameters. The value of $\delta$ sets how important a parameter must be to the first term in order to resist this pull: the larger $\delta$ is, the more parameters are drawn toward the origin.
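The spring effect can be illustrated with a small numerical sketch. It is not the package's own code: it assumes a linear model in place of a network, takes the unregularized criterion to be the mean squared error, and uses hypothetical names (`fit_regularized`, `delta`) chosen only for this example. One input is scaled so that its parameter barely affects the fit, and the regularized solution is compared with the unregularized one.

```python
import numpy as np

def fit_regularized(X, y, delta):
    """Minimize mean((y - X @ theta)**2) + delta * theta @ theta.

    For a linear model the minimizer has a closed form:
    (X^T X / N + delta * I) theta = X^T y / N.
    """
    n_samples, n_params = X.shape
    A = X.T @ X / n_samples + delta * np.eye(n_params)
    b = X.T @ y / n_samples
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
n = 200

# Column 0 strongly affects the fit; column 1 is scaled down so that
# its parameter has almost no influence on the unregularized criterion.
X = np.column_stack([rng.normal(size=n), 0.01 * rng.normal(size=n)])
y = X @ np.array([1.0, 1.0]) + 0.01 * rng.normal(size=n)

theta_plain = fit_regularized(X, y, delta=0.0)   # no regularization
theta_reg = fit_regularized(X, y, delta=0.01)    # illustrative value

print("delta = 0:   ", theta_plain)
print("delta = 0.01:", theta_reg)
```

In this setup the first parameter should stay close to its unregularized value, while the second, which matters little to the fit, should be pulled most of the way to zero.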
The critical issue in using regularization is to choose a good value of the regularization parameter $\delta$. This may be done by trial and error using validation data: typically, you train with several different values and compare the resulting errors on data that was not used for training.
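The trial-and-error procedure can be sketched as a simple loop over candidate values. This is an illustration under stated assumptions rather than the package's interface: a linear model stands in for the network, the data is synthetic, and the candidate list and the helper names (`fit_regularized`, `mean_squared_error`, `make_data`) are hypothetical.

```python
import numpy as np

def fit_regularized(X, y, delta):
    """Closed-form minimizer of mean squared error + delta * theta @ theta."""
    n_samples, n_params = X.shape
    A = X.T @ X / n_samples + delta * np.eye(n_params)
    return np.linalg.solve(A, X.T @ y / n_samples)

def mean_squared_error(X, y, theta):
    """Unregularized criterion, used here to score validation data."""
    residual = y - X @ theta
    return float(np.mean(residual ** 2))

rng = np.random.default_rng(1)
n_train, n_val, n_params = 30, 200, 20

# Synthetic problem: only three of the twenty parameters matter.
theta_true = np.zeros(n_params)
theta_true[:3] = [2.0, -1.0, 0.5]

def make_data(n):
    X = rng.normal(size=(n, n_params))
    y = X @ theta_true + 0.5 * rng.normal(size=n)
    return X, y

X_train, y_train = make_data(n_train)
X_val, y_val = make_data(n_val)

# Train once per candidate value and score each fit on the validation data.
candidates = [0.0, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]
val_errors = {}
for delta in candidates:
    theta = fit_regularized(X_train, y_train, delta)
    val_errors[delta] = mean_squared_error(X_val, y_val, theta)
    print(f"delta = {delta:g}: validation error = {val_errors[delta]:.4f}")

best_delta = min(val_errors, key=val_errors.get)
print("selected delta:", best_delta)
```

The training data is kept small relative to the number of parameters so that the choice of value has a visible effect on the validation error. A logarithmically spaced grid is a common starting point, and the search can be refined around the best value found.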