5.3 Classification with Feedforward Networks
This section gives a small example showing how FF networks can be used for classification.
Load the Neural Networks package.
In[1]:=
The data consists of three classes, each divided into two clusters. The input data vectors are stored in x, and the class of each vector is indicated in y (output data). The data format is described in Section 3.2, Package Conventions.
Load the data vectors and the output indicating class.
In[2]:=
Look at the data.
In[3]:=

In classification problems it is important to have a differentiable nonlinearity at the output of the FF network model. The purpose of the nonlinearity is to keep the output values within the range spanned by the class indicators. This is done with the option OutputNonlinearity, whose default is None. Set it to Sigmoid so that its saturating values are 0 and 1, matching the output data of the classes. Note that the sigmoid never reaches exactly 0 or 1; in most problems this is of no practical importance.
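The saturating behavior described above can be illustrated outside the package. The following is a minimal Python sketch, assuming that Sigmoid denotes the standard logistic function (which is consistent with saturating values of 0 and 1); the function name `sigmoid` is chosen here for illustration and is not the package's implementation.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid (assumed form): maps a real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# For moderate inputs the output approaches the class targets 0 and 1
# without reaching them.
for z in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"z = {z:6.1f}  ->  sigmoid(z) = {sigmoid(z):.6f}")
```

Because the outputs only approach 0 and 1, a fit against targets that are exactly 0 or 1 always leaves a small residual, which is the point made in the note above.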
Initialize an FF network.
In[4]:=
Out[4]=
Train the initialized FF network.
In[5]:=

The trained classifier can now be used on the input data vectors.
Classify two data vectors.
In[6]:=
Out[6]=
Each data vector is assigned to the class with the largest output value. If several outputs give large values, or if none of them does, the classifier is considered highly unreliable for that data vector.
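The assignment rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated: the helper `assign_class` and the cutoff `high`, which decides what counts as a "large" output, are hypothetical and not part of the package.

```python
from typing import List, Optional

def assign_class(outputs: List[float], high: float = 0.5) -> Optional[int]:
    """Return the 1-based index of the class with the largest output,
    or None when the decision looks unreliable.

    `high` is a hypothetical cutoff for calling an output "large".
    """
    large = [o for o in outputs if o > high]
    if len(large) != 1:
        # Several large outputs, or none at all: do not trust the result.
        return None
    return outputs.index(max(outputs)) + 1

print(assign_class([0.02, 0.97, 0.01]))  # one clear winner -> 2
print(assign_class([0.91, 0.88, 0.03]))  # two large outputs -> None
print(assign_class([0.10, 0.05, 0.08]))  # no large output -> None
```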
The performance of the derived classifier can be illustrated in different ways using NetPlot. By the choice of the option DataFormat you can indicate the type of plot you want. If the data vectors are of dimension two, as in this example, nice plots of the classification boundaries can be obtained.
Plot classification borders together with the data.
In[7]:=

The previous plot showed the classification boundary for each class. It is also possible to look at the classification function as a function plot. Since there are three outputs of the network, you obtain three plots. The boundaries indicated in the previous plot are the level curves where the function output equals 0.5 in the function plot shown here.
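To make the connection between the boundary and the 0.5 level concrete, here is a simplified Python sketch. It replaces the trained network with a single sigmoid output unit on a two-dimensional input; the weights `W1`, `W2`, `B` and the helper `boundary_x1` are made up for illustration. The sketch locates one boundary point by bisection and confirms that the output there is 0.5.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights of a single sigmoid output unit (not a trained network).
W1, W2, B = 1.5, -2.0, 0.5

def output(x1: float, x2: float) -> float:
    return sigmoid(W1 * x1 + W2 * x2 + B)

def boundary_x1(x2: float, lo: float = -10.0, hi: float = 10.0,
                tol: float = 1e-9) -> float:
    """Bisect along x1 (x2 held fixed) for the point where the output crosses 0.5.

    Relies on the output increasing in x1, which holds here because W1 > 0.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if output(mid, x2) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x1 = boundary_x1(x2=1.0)
print(f"boundary at x1 = {x1:.6f}, output there = {output(x1, 1.0):.6f}")
```

For a network with hidden neurons the 0.5 level curve is generally not a straight line, but the same reading applies: on one side of the curve the output for that class exceeds 0.5, on the other side it falls below.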
Look at the function.
In[8]:=

The FunctionPlot option can be used for problems with one or two input signals.
By giving the option BarChart, you obtain bar charts showing the classification performance. Correctly classified data is found on the diagonal, and the misclassified data corresponds to the off-diagonal bars. Notice that, since the outputs of the FF network take values in the interval (0, 1), you do not obtain precise classifications but, rather, a "degree" of membership. This situation may be corrected by using a UnitStep output neuron with the option setting OutputNonlinearity -> UnitStep. Then the outputs will be either 0 or 1, as desired.
In[9]:=

On the x and y axes you have the class of the samples according to the output data and according to the network classifier. On the z axis is the number of samples. For example, bin (2,3) holds the number of data samples that belong to the second class according to the supplied output data but were classified into the third class by the network. Therefore, the diagonal bins correspond to correctly classified samples, that is, samples the network assigns to the same class as indicated in the output data.
In contrast to FunctionPlot and Classifier, the BarChart option can be used to visualize the performance of classifiers of any input dimension.
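The counts behind such a bar chart form what is usually called a confusion matrix. The following Python sketch shows how they can be tallied; the helper `confusion_counts` and the label lists are made up for illustration and are not the data of this example.

```python
from typing import List

def confusion_counts(true_classes: List[int], predicted: List[int],
                     n_classes: int) -> List[List[int]]:
    """counts[i][j] = number of samples of true class i+1 assigned to class j+1."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_classes, predicted):
        counts[t - 1][p - 1] += 1
    return counts

# Made-up labels, for illustration only.
true_classes = [1, 1, 2, 2, 2, 3, 3]
predicted    = [1, 1, 2, 3, 2, 3, 3]

counts = confusion_counts(true_classes, predicted, 3)
for row in counts:
    print(row)

# Diagonal entries are the correctly classified samples.
correct = sum(counts[i][i] for i in range(3))
print("correctly classified:", correct, "of", len(true_classes))
```

In this made-up data, bin (2,3) contains one sample: a member of the second class that was assigned to the third. Nothing in the tally depends on the dimension of the input vectors, which is why this view works for any input dimension.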
So far you have evaluated the end result of the training, that is, the derived FF network. It is also possible to display the same plots as a function of the training iterations. Consider the training record.
In[10]:=
Out[10]=
The first component is just a copy of the FF network model. The second component contains several items of information about the training. Section 7.8, The Training Record, shows you how to extract the information from the training record. Here, you will see how this information can be illustrated in different ways using NetPlot, depending on which DataFormat option is chosen.
Look at the classification performance for each class during training. Correctly classified samples are marked with diamonds and a solid line; incorrectly assigned samples are indicated with stars and a dashed line.
In[11]:=

The training progress of the classifier may be viewed as a function of iteration using the option setting DataFormat -> Classifier. By default, the display shows the evolving boundaries at every (5 × report frequency) iterations, where the report frequency is determined by the option ReportFrequency of NeuralFit. The display frequency may be changed from 5 to any other positive integer by explicitly setting Intervals to a desired value, such as 4 in the present example.
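The spacing rule amounts to simple arithmetic, sketched below in Python. The helper `displayed_iterations` is hypothetical, as are the assumptions that the display starts at iteration 0, that the report frequency is 1, and that training runs for 30 iterations; the text does not say whether the package also adds the final iteration when it falls off the grid.

```python
from typing import List

def displayed_iterations(total_iterations: int,
                         report_frequency: int = 1,
                         intervals: int = 5) -> List[int]:
    """Iterations shown when plotting every (intervals x report_frequency) steps.

    Assumes the display starts at iteration 0 (an assumption, not confirmed
    by the text).
    """
    step = intervals * report_frequency
    return list(range(0, total_iterations + 1, step))

print(displayed_iterations(30))               # default spacing of 5
print(displayed_iterations(30, intervals=4))  # spacing of 4
```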
Plot the classifier at every four training iterations.
In[12]:=

If you prefer, the progress can be animated as described in Section 5.2.1, Function Approximation in One Dimension, instead of being given in a graphics array.
The BarChart option can also be used to evaluate the progress of the training. Changing the nonlinearity at the output from the smooth sigmoid to a discrete step makes the output take only the values 0 and 1.
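The effect of swapping the output nonlinearity can be sketched in Python as follows. The sketch assumes the logistic form of the sigmoid and a unit step that returns 1 for non-negative arguments; the pre-activation values are made up for illustration.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def unit_step(z: float) -> float:
    """Discrete step (assumed convention): 1 for z >= 0, otherwise 0."""
    return 1.0 if z >= 0 else 0.0

# Made-up values fed into the output neuron.
pre_activations = [-3.0, -0.2, 0.4, 5.0]

soft = [round(sigmoid(z), 3) for z in pre_activations]  # degrees of membership
hard = [unit_step(z) for z in pre_activations]          # crisp 0/1 decisions

print(soft)
print(hard)
```

Applying the step to the same pre-activation gives the same decision as thresholding the sigmoid output at 0.5, so the classification boundaries do not move; only the graded output values are replaced by 0 and 1.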
Illustrate the classification result every four iterations of the training.
In[13]:=

As seen in the plots, at the beginning of the training the network model classifies many samples incorrectly. These incorrectly classified samples appear in the off-diagonal bins. As the training proceeds, more of the samples are classified correctly, and at the end all samples lie on the diagonal, meaning that every sample is correctly classified.
This result can easily be animated as described in Section 5.2.1, Function Approximation in One Dimension.