Problem Set 4b

From 6.034 Wiki


Revision as of 07:57, 7 November 2006

This problem set is due Tuesday, November 14th at 11:59 PM. If you have questions about it, ask the list 6.034-tas@mit.edu. Your response will probably come from Shannon.

To work on this problem set, you will need to get the code, much like you did for earlier problem sets.

Your answers for the problem set belong in the main file ps4b.scm.

Things to remember
  • Avoid using DrScheme's graphical comment boxes. If you do use them, take them out or use Save definitions as text... so that you submit a Scheme file in plain text.
  • If you are going to submit your pset late, see the problem set grading policy.

Neural Nets

In this problem set, you will be experimenting with neural nets. Although you'll only have to do the most superficial of coding, take your time and make sure you really understand what is happening when you run the nets.

Learning XOR

The first function the neural net will be learning is XOR. XOR is a binary function that returns a 1 when exactly one of its two inputs is 1. The XOR table looks like this:

A  B  XOR(A, B)
1  1      0
1  0      1
0  1      1
0  0      0

To do this, we are using a net with 2 input neurons, 2 hidden (internal) neurons, and 1 output neuron. Take a look near the bottom of nnet-train.scm to find the procedures you will use in training the XOR function. To set up the neural net for XOR, call (initialize-xor learning-rate), which readies the neural net to learn XOR at the specified learning rate learning-rate.

Now that the neural net is set up, it can be trained by calling (train-xor epochs target-error). This trains the neural net on the XOR data for epochs epochs (an epoch is a complete cycle through the training data), or until the average error is reduced to less than target-error.
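For example (0.3, 1000, and 0.01 below are just placeholder values to illustrate the argument order; pick your own when you do the trainings that follow):

  (initialize-xor 0.3)     ; ready the net to learn XOR, learning rate 0.3
  (train-xor 1000 0.01)    ; train for up to 1000 epochs, or until the
                           ; average error drops below 0.01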

Training 1

Initialize the XOR net with a learning rate of 0.3. Then train it in increments of 1000 epochs. Stop training when you have completed 10000 epochs, or when the average error has decreased by less than .01 over the last 1000 epochs. Do this a few times (remember to re-initialize to reset the weights) to make sure your results are consistent, then report the average error at the end of training and the average number of epochs you trained for.
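If you would rather automate the stopping rule than issue the calls by hand, here is a rough sketch. It assumes that train-xor returns the average error it reports (check nnet-train.scm; if it only prints the error, just call (train-xor 1000 0) repeatedly and compare the printed values). The same sketch works for Trainings 2 and 3; only the learning rate passed to initialize-xor changes.

  (define (train-xor-in-increments epochs-done previous-error)
    ;; Train for another 1000 epochs; a target-error of 0 means an
    ;; increment never stops early.
    (let ((error (train-xor 1000 0)))
      (if (or (>= (+ epochs-done 1000) 10000)        ; 10000 epochs total, or
              (< (- previous-error error) 0.01))     ; improved by less than .01
          (list 'epochs (+ epochs-done 1000) 'average-error error)
          (train-xor-in-increments (+ epochs-done 1000) error))))

  (initialize-xor 0.3)
  (train-xor-in-increments 0 1e10)   ; huge initial "previous error" so the
                                     ; first increment never ends the loop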

Training 2

Initialize the XOR net with a learning rate of 0.6. Then train it in increments of 1000 epochs. Stop training when you have completed 10000 epochs, or when the average error has decreased by less than .01 over the last 1000 epochs. Do this a few times (remember to re-initialize to reset the weights) to make sure your results are consistent, then report the average error at the end of training and the average number of epochs you trained for.

Training 3

Initialize the XOR net with a learning rate of 1.0. Then train it in increments of 1000 epochs. Stop training when you have completed 10000 epochs, or when the average error has decreased by less than .01 over the last 1000 epochs. Do this a few times (remember to re-initialize to reset the weights) to make sure your results are consistent, then report the average error at the end of training and the average number of epochs you trained for.


A new data set

Now we'll try using a neural net for classification, using the dataset in nnet.data. For this we are using a net with 2 input neurons, some number of hidden (internal) neurons, and 1 output neuron. You will be experimenting with the number of hidden neurons and its effect on the training process. Take a look near the bottom of nnet-train.scm to find the procedures you will use in training the classifier. To set up the neural net, call (initialize-classifier-net n-hidden learning-rate), which readies the neural net for training at the specified learning rate learning-rate, with n-hidden hidden neurons.

Now that the neural net is set up, it can be trained by calling (train-classifier-net epochs target-error). This trains the neural net on the data for epochs epochs (an epoch is a complete cycle through the training data), or until the average error is reduced to less than target-error.
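For example, to set up a net with 4 hidden neurons (an arbitrary choice; you will be varying this) and a learning rate of 0.3, and then train it:

  (initialize-classifier-net 4 0.3)   ; 4 hidden neurons, learning rate 0.3
  (train-classifier-net 1000 0.01)    ; up to 1000 epochs, or average error < 0.01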
