Lab 5 2014

This is the last problem set in 6.034! It's due Friday, December 5th at 11:59 PM.

To work on this problem set, you will need to get the code:

Your answers for the problem set belong in the main file, lab5.py, as well as in boost.py.

Neural Nets

In this problem set, you will be experimenting with neural nets. Although you'll only have to do the most superficial of coding, take your time and make sure you really understand what is happening when you run the nets.

The training process

In this problem set, we'll ask you to train a neural net until it converges. (We've written a function that does this for you; see the lab5.py file.) Here's how we define convergence for the purposes of this problem set (a code sketch of the procedure follows the list):

  • Train the neural net in increments of 100 epochs. (An epoch is a run through the entire set of training data.)
  • After each 100 epochs, check to see if the neural net has converged, which for the purpose of this problem set we define as follows:
    • The mean error of the neural net has changed by less than .0001 since 100 epochs ago.
    • The mean error has changed by more than .0001 in some previous group of 100 epochs. In other words, a neural net hasn't converged yet if it's still trying to get out of its start state.
  • Stop training when the neural net has converged or 20,000 epochs have passed, whichever happens first.
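
The training function we provide already implements this loop, but here is a rough sketch of the convergence test in code. The train_100 callback is a hypothetical stand-in for whatever the real code uses to run 100 more epochs and report the mean error; it is not a name from lab5.py.

  def train_until_convergence(train_100, threshold=0.0001, max_epochs=20000):
      """Sketch of the convergence test described above. train_100() should run
      100 more epochs of training and return the net's mean error afterwards."""
      previous_error = None
      has_moved = False          # has the error ever changed by more than the threshold?
      epochs = 0
      while epochs < max_epochs:
          error = train_100()
          epochs += 100
          if previous_error is not None:
              change = abs(error - previous_error)
              if change > threshold:
                  has_moved = True       # the net has left its start state
              elif has_moved:
                  break                  # a small change after having moved: converged
          previous_error = error
      return error, epochs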

These neural nets are initialized with small random weights, so it is expected that they will have varying behavior.

Learning XOR

The first function the neural net will be learning is XOR. XOR is a binary function that returns a 1 when exactly one of its two inputs is 1. The XOR table looks like this:

A B XOR(A, B)
1 1    0
1 0    1
0 1    1
0 0    0
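
In code, this truth table is just four training examples. The exact data structures in lab5.py may be organized differently, but conceptually the training set is a list of (inputs, target) pairs:

  # The XOR truth table as (inputs, target) pairs.  This is only a conceptual
  # illustration; the real training data in lab5.py may be stored differently.
  xor_examples = [
      ((1, 1), 0),
      ((1, 0), 1),
      ((0, 1), 1),
      ((0, 0), 0),
  ]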

To learn this function, we use a net with 2 input neurons, 2 hidden (internal) neurons, and 1 output neuron. Take a look near the top of lab5.py to find the procedures you will use to train the XOR net.

Training the neural net is achieved by calling train_neural_net(neural_network, examples, learning_rate). This trains the neural net on the XOR data until it converges or 20,000 epochs have passed (as defined above), and returns the average error of the neural net when it finishes.
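
A single training run therefore looks roughly like the snippet below. Here make_xor_net is a stand-in name for whatever constructor lab5.py provides for the 2-2-1 network, and xor_examples is the conceptual data shown earlier; only train_neural_net and its arguments come from the write-up above.

  # make_xor_net and xor_examples are stand-in names, not names from lab5.py.
  network = make_xor_net(num_hidden=2)                  # 2 inputs, 2 hidden neurons, 1 output
  error = train_neural_net(network, xor_examples, 0.5)  # learning rate 0.5
  print("average error after training:", error)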

  1. Initialize the XOR net with a learning rate of 0.5, and train it until it converges. Do this a few times to make sure that your results are consistent. Report the average error at the end of training, and the average number of epochs you trained for, in lab5.py.
  2. Do the same with a learning rate of 0.05, and report your results in lab5.py.
  3. Try several different learning rates (a sketch of a rate sweep appears after this list). What learning rate takes the fewest epochs, on average, to converge?
  4. Change the number of internal nodes to 4. How many epochs on average does it take to converge?
  5. Change the number of internal nodes to 6. How many epochs on average does it take to converge?
  6. What is the minimum number of internal nodes you need for the neural net to learn this function with better than 99% accuracy?
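
For questions 1 through 3, it helps to rerun the same experiment several times per setting and compare. A sweep like the following (reusing the stand-in names from the earlier snippet) prints the mean error for each learning rate; the number of epochs each run takes is whatever the lab5.py trainer reports, which this sketch does not capture.

  # Hypothetical sweep; make_xor_net and xor_examples are stand-in names as before.
  for rate in [0.05, 0.1, 0.5, 1.0, 5.0]:
      errors = [train_neural_net(make_xor_net(num_hidden=2), xor_examples, rate)
                for _ in range(5)]
      print(rate, sum(errors) / len(errors))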

Checking your answers

You can sanity-check your answers with tester.py. Don't worry about being "exactly right"; there will be a wide range of acceptable answers. If you are confident of your answer to within 20%, you are fine.

Boosting

You're still trying to use AI to predict the votes of politicians. You decide that the ID-tree classifier was too rigid and uninformative, so now you try using a boosting classifier instead.

To make sure that you interpret the results without letting your political preconceptions get in the way, you dig up some old data to work with: in particular, the data from the 4th House of Representatives, which met from 1796 to 1797. (According to the records on voteview.com, this is when the two-party system first emerged, with the two parties being designated "Federalists" and "Republicans".)

You experiment with that data before going on to the 2007-2008 data, and you find that Congressmen in 1796 were much clearer about what they were voting on than their 2008 counterparts.

The framework for a boosting classifier can be found in boost.py. You need to finish coding it, and then use it to learn some classifiers and answer a few questions.

The following resources will be helpful:

  • The documentation for the boosting code, which you can find embedded in boost.py in the documentation strings.
  • The course notes on boosting, or the handout that summarizes it.
  • The Lab 4 writeup, if you need to refer to how data_reader.py represents legislators and votes.

A (clever|cheap) trick

The boosting code uses a trick that lets it try only half as many base classifiers.

It turns out that AdaBoost does not really care which side of its base classifier is +1 and which side is -1. Choosing a classifier that is the opposite of the best classifier -- one that returns -1 for most points that should be +1 and +1 for most points that should be -1, and therefore has a high error rate -- works just as well as choosing its negation, which is the best classifier.

If the data reweighting step is implemented correctly, it will produce the same weights given a classifier or its opposite. Also, a classifier with a high error rate will end up with a negative alpha value, so that in the final "vote" of classifiers it will act like its opposite. So the important thing about a classifier isn't that its error rate is low -- it's that the error rate is far from 1/2.

In the boosting code, we take advantage of this. We include only classifiers that output +1 for voting YES on a particular motion, or +1 for voting NO on a particular motion, and as the "best classifier" we choose the classifier whose error rate is farthest from 1/2. If the error rate is high, then the result will act like a classifier that outputs +1 for "voting NO or abstaining", or +1 for "voting YES or abstaining", respectively. This means we don't have to include these classifiers in the base classifier list, speeding up the algorithm by a factor of 2.
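
A quick way to see why "far from 1/2" is what matters is through the voting weight AdaBoost gives each chosen classifier, alpha = 1/2 * ln((1 - error) / error): negating a classifier turns its error rate e into 1 - e, which just flips the sign of alpha.

  import math

  def alpha(error):
      # AdaBoost's voting weight for a base classifier with the given weighted error rate
      return 0.5 * math.log((1 - error) / error)

  e = 0.9              # a classifier that is wrong 90% of the time
  print(alpha(e))      # about -1.10
  print(alpha(1 - e))  # about +1.10: the negated classifier gets the opposite weight
  # In the final vote, -1.10 * h(x) is exactly the same as +1.10 * (-h(x)), so picking
  # the high-error classifier behaves like picking its (good) negation.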

Completing the code

Here are the parts that you need to complete (a rough sketch of the underlying AdaBoost math follows this list):

  • In the BoostClassifier class in boost.py, the update_weights method is undefined. You need to define this method so that it changes the data weights in the way prescribed by the AdaBoost algorithm. There are two ways of implementing this update which happen to be mathematically equivalent.
  • In the BoostClassifier class, the classify method is also undefined. Define it so that you can use a trained BoostClassifier as a classifier, outputting +1 or -1 based on the weighted results of its base classifiers.
  • In lab5.py, the most_misclassified function is undefined. You will need to define it to answer the questions.
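
As a rough guide to the underlying math only (the attribute and helper names in the real boost.py will differ), the two BoostClassifier pieces boil down to the standard AdaBoost weight renormalization and an alpha-weighted vote:

  def adaboost_update_weights(weights, misclassified, error):
      """One of the two equivalent forms of the AdaBoost update: rescale so that
      the points the chosen classifier got wrong carry half of the total weight
      and the points it got right carry the other half.
      weights: dict mapping data point -> current weight
      misclassified: set of points the chosen base classifier got wrong
      error: that classifier's weighted error rate"""
      return {point: w * (0.5 / error if point in misclassified else 0.5 / (1 - error))
              for point, w in weights.items()}

  def adaboost_classify(chosen, point):
      """chosen: list of (classifier_function, alpha) pairs picked during training.
      The boosted prediction is the sign of the alpha-weighted vote."""
      total = sum(a * h(point) for h, a in chosen)
      return 1 if total > 0 else -1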

Questions

Answer the questions in lab5.py.

When you are asked how a particular political party would vote on a particular motion, disregard the possibility of abstaining. If your classifier results indicate that the party wouldn't vote NO, consider that an indication that the party would vote YES.
