7.6.1 Small Example
Read in the Neural Networks package.
In[1]:=
The following standard package is needed for the surface plot of the criterion.
In[2]:=
First, the "true" function has to be defined; it is then used to generate the data. To keep the example small, with one linear and one nonlinear parameter, a small FF network is chosen. It has one input and one output, no hidden layers, and no bias parameter. The network therefore has only two parameters, which are set to (2,1).
Define the "true" function.
In[3]:=
Generate data with the true function.
In[5]:=
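As a package-independent illustration of these two steps, the following Python sketch uses a stand-in for the network. The model form w_lin * sigmoid(w_nonlin * x), the reading of (2,1) as (nonlinear, linear), and the input grid are all assumptions made for illustration. None of them is given in the text, and the code does not reproduce the package commands.

```python
import math

def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w_nonlin, w_lin):
    """Assumed stand-in for the small network: w_lin * sigmoid(w_nonlin * x)."""
    return w_lin * sigmoid(w_nonlin * x)

# Assumed ordering of the true parameters: (nonlinear, linear).
TRUE_PARAMS = (2.0, 1.0)

# Hypothetical input grid; the data range is not given in the text.
xs = [-2.0 + 0.2 * i for i in range(21)]

# Noise-free outputs generated with the "true" function.
ys = [model(x, *TRUE_PARAMS) for x in xs]
```

Because the data are generated without noise, the criterion of fit is zero at the true parameter values.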
To illustrate the result with plots, you need the following function, which computes the criterion of fit described in Section 2.5.3, Training Feedforward and Radial Basis Function Networks.
In[8]:=
Look at the criterion as a function of the two parameters.
In[9]:=
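A rough way to inspect the criterion without a surface plot is to tabulate it on a coarse parameter grid. The sketch below assumes that the criterion is the mean squared error and that the network computes w_lin * sigmoid(w_nonlin * x). The data grid and the parameter grid are hypothetical choices, and the code is an illustration in Python, not the package's own function.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def criterion(w_nonlin, w_lin, xs, ys):
    """Mean squared error of the assumed model w_lin * sigmoid(w_nonlin * x)."""
    residuals = [y - w_lin * sigmoid(w_nonlin * x) for x, y in zip(xs, ys)]
    return sum(r * r for r in residuals) / len(residuals)

# Assumed noise-free data generated with the true values (2, 1).
xs = [-2.0 + 0.2 * i for i in range(21)]
ys = [1.0 * sigmoid(2.0 * x) for x in xs]

v_true = criterion(2.0, 1.0, xs, ys)   # criterion at the true parameters
v_init = criterion(1.1, 2.0, xs, ys)   # criterion at the starting point (1.1, 2)

# Coarse grid over both parameters (hypothetical ranges).
grid = [(0.5 * i, 0.5 * j) for i in range(1, 7) for j in range(1, 5)]
best = min(grid, key=lambda p: criterion(p[0], p[1], xs, ys))

for w_nonlin, w_lin in grid:
    print(f"{w_nonlin:4.1f} {w_lin:4.1f} {criterion(w_nonlin, w_lin, xs, ys):.5f}")
```

Reading the table row by row gives a crude picture of how the criterion changes with each parameter. A finer grid would be needed to resolve the shape of the valley.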

The linear parameter has the largest influence on the criterion. In each iteration, the separable algorithm minimizes the criterion in the direction of the linear parameter, so the iterative training follows the valley. The following computations make this clear.
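Why the linear parameter can be handled separately may be sketched as follows. The derivation assumes, for illustration only, that the network computes $\hat{y} = w_l\,\sigma(w_n x)$ with a sigmoid $\sigma$, and that the criterion is the mean squared error. The symbols $w_n$ (nonlinear parameter) and $w_l$ (linear parameter) are introduced here and do not appear in the text.

```latex
\begin{align}
V_N(w_n, w_l) &= \frac{1}{N}\sum_{i=1}^{N}\bigl(y_i - w_l\,\sigma(w_n x_i)\bigr)^2 \\
\frac{\partial V_N}{\partial w_l}
  &= -\frac{2}{N}\sum_{i=1}^{N}\sigma(w_n x_i)\bigl(y_i - w_l\,\sigma(w_n x_i)\bigr) = 0 \\
\Longrightarrow\quad
w_l^{*}(w_n) &= \frac{\sum_{i=1}^{N} y_i\,\sigma(w_n x_i)}{\sum_{i=1}^{N}\sigma(w_n x_i)^2}
\end{align}
```

For a fixed $w_n$ the criterion is quadratic in $w_l$, so its minimizer is available in closed form. Substituting $w_l^{*}(w_n)$ back into $V_N$ leaves a criterion that depends on $w_n$ alone, which is the valley floor seen in the surface plot.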
The network is now initialized at the point (1.1,2) in the parameter space, and separable training is compared with nonseparable training. Both use the default Levenberg-Marquardt training algorithm. You can repeat the example with Gauss-Newton or steepest descent training by changing the option Method.
Initialize the network and set its parameters to the starting point (1.1,2).
In[10]:=
Train with the separable algorithm.
In[12]:=

Form a list of the trajectory in the parameter space.
In[13]:=
Plot the trajectory and show it together with the criterion surface.
In[14]:=

As the plot shows, the parameter estimate is already at the bottom of the valley after the first iteration. The minimization problem has been reduced to a one-dimensional search along the valley instead of a search in the original two-dimensional space. The training converged after approximately five iterations.
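The reduction to a one-dimensional search can be sketched in Python as follows. This is not the package's separable Levenberg-Marquardt algorithm: it swaps in a golden-section search over the nonlinear parameter, with the linear parameter set to its closed-form least-squares value at every evaluation. The model form w_lin * sigmoid(w_nonlin * x), the data grid, and the search bracket [0.1, 5] are assumptions, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed noise-free data generated with the true values (2, 1).
xs = [-2.0 + 0.2 * i for i in range(21)]
ys = [1.0 * sigmoid(2.0 * x) for x in xs]

def best_linear(w_nonlin):
    """Closed-form least-squares value of w_lin for a fixed w_nonlin."""
    s = [sigmoid(w_nonlin * x) for x in xs]
    return sum(y * si for y, si in zip(ys, s)) / sum(si * si for si in s)

def reduced_criterion(w_nonlin):
    """Mean squared error with w_lin already set to its optimum."""
    w_lin = best_linear(w_nonlin)
    return sum((y - w_lin * sigmoid(w_nonlin * x)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

def golden_section(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

# One-dimensional search along the valley floor.
w_nonlin = golden_section(reduced_criterion, 0.1, 5.0)
w_lin = best_linear(w_nonlin)
print(w_nonlin, w_lin)
```

Every point this search evaluates lies on the valley floor by construction, which mirrors the behavior described above. Its iteration count is not comparable with the figure quoted for the package's algorithm.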
The calculations can now be repeated but without using the separable algorithm.
Train without the separable algorithm.
In[17]:=

Form a list of the trajectory in the parameter space.
In[18]:=
Plot the trajectory and show it together with the criterion surface.
In[19]:=

Without the separable algorithm, training is slowed down a little. Several iterations are needed before the bottom of the valley is reached, and convergence along the valley is also somewhat slower. The algorithm needs about eight iterations to converge.
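For comparison, a search over both parameters at once can be sketched as below. This is a minimal Levenberg-Marquardt-style loop written for illustration in Python, not the package's implementation. The model form w_lin * sigmoid(w_nonlin * x), the data grid, the reading of the starting point (1.1,2) as (nonlinear, linear), and the damping schedule are assumptions, and the function names are hypothetical. Its iteration count should not be expected to match the figures quoted above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed noise-free data generated with the true values (2, 1).
xs = [-2.0 + 0.2 * i for i in range(21)]
ys = [1.0 * sigmoid(2.0 * x) for x in xs]

def criterion(w_nonlin, w_lin):
    """Mean squared error of the assumed model."""
    return sum((y - w_lin * sigmoid(w_nonlin * x)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

def lm_train(w_nonlin, w_lin, lam=1e-3, max_iter=100, tol=1e-18):
    """Damped Gauss-Newton steps in both parameters; returns accepted points."""
    path = [(w_nonlin, w_lin)]
    v = criterion(w_nonlin, w_lin)
    for _ in range(max_iter):
        if v < tol:
            break
        # Accumulate J^T J and J^T r for the two-parameter model.
        hnn = hnl = hll = gn = gl = 0.0
        for x, y in zip(xs, ys):
            s = sigmoid(w_nonlin * x)
            r = y - w_lin * s
            jn = w_lin * x * s * (1.0 - s)  # d(prediction)/d(w_nonlin)
            jl = s                          # d(prediction)/d(w_lin)
            hnn += jn * jn
            hnl += jn * jl
            hll += jl * jl
            gn += jn * r
            gl += jl * r
        # Solve the damped 2x2 system (J^T J + lam I) step = J^T r by Cramer's rule.
        mnn, mll = hnn + lam, hll + lam
        det = mnn * mll - hnl * hnl
        dn = (gn * mll - hnl * gl) / det
        dl = (mnn * gl - hnl * gn) / det
        v_new = criterion(w_nonlin + dn, w_lin + dl)
        if v_new < v:
            # Accept the step and relax the damping.
            w_nonlin, w_lin, v = w_nonlin + dn, w_lin + dl, v_new
            lam /= 10.0
            path.append((w_nonlin, w_lin))
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
    return path

path = lm_train(1.1, 2.0)
for w_nonlin, w_lin in path:
    print(f"{w_nonlin:.6f} {w_lin:.6f} {criterion(w_nonlin, w_lin):.3e}")
```

The printed trajectory plays the role of the list of points formed above. Here the search moves in both parameter directions at every step, whereas a separable scheme keeps the linear parameter at its optimum throughout.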