Lab 5 2014

__TOC__
=Hints=
https://docs.google.com/View?id=dhqhm2bq_183ndrncvvr
=General=

This is the last problem set in 6.034! It's due Monday, November 21st at 11:59 PM.

To work on this problem set, you will need to get the code:
* Download it as a ZIP file: http://web.mit.edu/6.034/www/labs/lab5/lab5.zip
* Or, on Athena, <tt>add 6.034</tt> and copy it from <tt>/mit/6.034/www/labs/lab5/</tt>.
* You will need to <font size="+1"><b>download and install</b> an additional software package called [http://www.ailab.si/orange/ Orange]</font> for the second part of the lab.  (Windows and Mac versions are also mirrored [http://web.mit.edu/6.034/www/labs/orange/ here] if you have a particularly slow connection.)  Please download Orange first so that you can work out any installation problems early.  Once you have installed it, you should be able to run <tt>orange_for_6034.py</tt> and get the Orange version number.  Once you've filled in the boosting part, you should be able to run <tt>lab5.py</tt> and see the output of several classifiers on the vampire dataset.  If you get errors, email us.
* Orange is available for Linux (Ubuntu), Windows, and OS X.
* For Ubuntu users: Please follow the install instructions [http://www.ailab.si/orange/nightly_builds.html#linux here] (see the section on Linux).  Note: just running <tt>apt-get install orange</tt> on Ubuntu will *NOT* install Orange but a completely different software package!
* ADVICE: If you can't get Orange to work directly on your machine, try working on Athena instead.  It may not be a wise use of your time to keep fighting the Orange installation, as Orange is only needed for half of lab 5.  If it works right out of the box, great; if it doesn't, follow the Athena instructions to do the lab.
* To check that your Orange is properly installed, run:  <pre>python orange_for_6034.py</pre> and you should get a version string and no errors.
To work on this lab on Athena you'll need to:
# For your convenience, we've provided a script, <tt>run-orange-gui.sh</tt>, that will run the Orange GUI on linux.mit.edu.  Note that some Orange widgets (like ROC charting) may not appear in the GUI in the Athena version.  Don't worry, you won't need the GUI to answer the questions for this lab.

<b>Your answers for the problem set belong in the main file <tt>lab5.py</tt>, as well as <tt>neural_net.py</tt> and <tt>boost.py</tt>.</b>

= Neural Nets =

In <tt>neural_net.py</tt>, each of the three network elements implements two functions:
<pre>
def output(self)
def dOutdX(self, elem)
</pre>
Your first task is to fill in all 6 functions to complete the API.

==== Output ====

The function <tt>output(self)</tt> produces the output of each of these elements.

Be sure to use the sigmoid and ds/dz functions as discussed in class:
<pre>
o = s(z) = 1.0 / (1.0 + e**(-z))
ds(o)/dz = s(z) * (1 - s(z)) = o * (1 - o)
</pre>
and also the performance function and its derivative as discussed in class:
<pre>
P(o) = -0.5 (d - o)**2
dP(o)/do = (d - o)
</pre>
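
For concreteness, here is a minimal Python sketch of those four formulas.  The function names (<tt>sigmoid</tt>, <tt>dsigmoid</tt>, <tt>performance</tt>, <tt>dperformance</tt>) are our own illustration, not part of the lab API:
<pre>
import math

def sigmoid(z):
    """o = s(z) = 1 / (1 + e**(-z))"""
    return 1.0 / (1.0 + math.exp(-z))

def dsigmoid(o):
    """ds/dz, written in terms of the output o = s(z)."""
    return o * (1.0 - o)

def performance(d, o):
    """P(o) = -0.5 * (d - o)**2, where d is the desired output."""
    return -0.5 * (d - o) ** 2

def dperformance(d, o):
    """dP/do = (d - o)."""
    return d - o
</pre>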
The output functions for these three elements are the same as defined in class.
Recall that the output for a neuron is computed using the sigmoid decision function:
<pre>
s(z) = 1.0 / (1.0 + e**(-z))
</pre>
where z is the sum of the products of its weights and inputs.
The output of a performance function is similarly standard:
<pre>
P(o) = -0.5 (d - o)**2
</pre>

==== Derivatives ====

The function <tt>dOutdX(self, elem)</tt> generates the value of the partial derivative, given a weight element.

Recall that neural nets update a given weight by computing the partial derivative of the performance function with respect to that weight.  The formula we have used in class is as follows:
<pre>
wi' = wi + rate * dP / dwi
</pre>
In our code this is represented as (see <tt>def train()</tt> -- you don't have to implement this):
<pre>
w.set_next_value( w.get_value() + rate * network.performance.dOutdX(w) )
</pre>
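
To make the update concrete, here is a hypothetical single step for one weight of a one-input neuron (the numbers are made up for illustration):
<pre>
import math

rate, w, x, d = 1.0, 0.5, 1.0, 1.0       # made-up rate, weight, input, desired output
o = 1.0 / (1.0 + math.exp(-(w * x)))     # neuron output o = s(w*x), about 0.622
dP_dw = (d - o) * o * (1.0 - o) * x      # chain rule: dP/do * do/dz * dz/dw
w = w + rate * dP_dw                     # w' = w + rate * dP/dw, about 0.589
</pre>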

If you are confused about how the top-down recursive chaining of derivatives works, first read the [http://courses.csail.mit.edu/6.034f/ai3/netmath.pdf course notes] to review.  If you are still confused, ask 6034tas for hints and clarifications.
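
To illustrate the chaining, here is a deliberately simplified sketch of a neuron's two functions.  The attribute names (<tt>self.weights</tt>, <tt>self.inputs</tt>) are hypothetical; the real classes in <tt>neural_net.py</tt> are structured differently:
<pre>
import math

class SketchNeuron:
    """Simplified illustration only -- not the lab's Neuron class."""
    def __init__(self, weights, inputs):
        self.weights = weights   # Weight-like objects with get_value()
        self.inputs = inputs     # elements with output() and dOutdX()

    def output(self):
        # o = s(z), where z is the weighted sum of the inputs
        z = sum(w.get_value() * i.output()
                for w, i in zip(self.weights, self.inputs))
        return 1.0 / (1.0 + math.exp(-z))

    def dOutdX(self, elem):
        o = self.output()
        if elem in self.weights:
            # elem weights one of my direct inputs: do/dw = o*(1-o)*input
            i = self.inputs[self.weights.index(elem)]
            return o * (1.0 - o) * i.output()
        # otherwise elem lives deeper in the network: recurse through inputs
        return o * (1.0 - o) * sum(w.get_value() * i.dOutdX(elem)
                                   for w, i in zip(self.weights, self.inputs))
</pre>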
=== About the API Classes ===
===== <tt>Weight(ValuedElement)</tt> =====

Represents the update-able weights in the network.  In addition to the ValuedElement functions, it has the following methods, which are used by the training algorithm (you will not need them in your implementation):
* <tt>set_next_value(self,val):</tt> which sets the next weight value in self.next_value
* <tt>update(self):</tt> which sets the current weight to the value stored in self.next_value
===== <tt>Input(DifferentiableElement, ValuedElement)</tt> =====

===== <tt>PerformanceElem(DifferentiableElement)</tt> =====
Represents a Performance Element that allows you to set the desired output.
* <tt>set_desired</tt>, which sets <tt>my_desired_val</tt>.

To better understand back-propagation, you should take a look at the methods <b><tt>train</tt></b> and <b><tt>test</tt></b> in <tt>neural_net.py</tt> to see how everything is put together.
Note: the function <tt>random_weight()</tt> in <tt>neural_net.py</tt> uses the Python function <tt>random.randrange(-1, 2)</tt> to compute initial weights.  This function randomly generates the values -1, 0, and 1.  While this may seem like a mistake, we've found empirically that it actually performs better than <tt>random.uniform(-1, 1)</tt>.  Be our guest and play around with the <tt>random_weight</tt> function; you'll find that neural nets can be quite sensitive to their initial weight settings.  (Recall what happens if you set all the weights to the same value.)
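
If you do experiment, the two initialization schemes being compared look like this (the <tt>uniform</tt> variant is our own illustration):
<pre>
import random

def random_weight():
    return random.randrange(-1, 2)   # randomly one of -1, 0, 1

def random_weight_uniform():
    return random.uniform(-1, 1)     # any real number in [-1, 1]
</pre>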
To test your completed network, run the tests in <tt>neural_net_tester.py</tt>.

We claim that a network architecture containing 5 neuron nodes or fewer can fully learn and classify all three shapes.  In fact, we require it!

Construct a new network in <tt>neural_net.py</tt>.
If everything tests with an accuracy of 1.0, then you've completed the Neural Networks portion of lab5.  Congrats!
Now on to Boosting!

= Boosting =
=== Completing the code ===
Here are the parts that you need to complete:
* In the <tt>BoostClassifier</tt> class, the <tt>classify</tt> method is also undefined. Define it so that you can use a trained BoostClassifier as a classifier, outputting +1 or -1 based on the weighted results of its base classifiers.  Complete the very similar <tt>orange_classify</tt> method as well.
* In the <tt>BoostClassifier</tt> class in <tt>boost.py</tt>, the <tt>update_weights</tt> method is undefined. You need to define this method so that it changes the data weights in the way prescribed by the AdaBoost algorithm.  (Note: there are two ways of implementing this update; they happen to be mathematically equivalent.  A sketch of the arithmetic appears after this list.)
* In <tt>lab5.py</tt>, the <tt>most_misclassified</tt> function is undefined. You will need to define it to answer the questions.   
<B>Remember to use the supplied <tt>legislator_info(datum)</tt> to output your list of the most-misclassified data points!</B>
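
For reference, here is a hedged sketch of the AdaBoost arithmetic referred to in the list above.  The function names and the flat-list representation are our own illustration; the actual <tt>BoostClassifier</tt> API in <tt>boost.py</tt> is structured differently:
<pre>
import math

def update_weights(weights, predictions, labels, error_rate):
    """Reweight points so the next round focuses on the mistakes.

    weights: current point weights; predictions/labels: +1/-1 values;
    error_rate: weighted error of the current base classifier.
    """
    alpha = 0.5 * math.log((1.0 - error_rate) / error_rate)
    new = [w * math.exp(-alpha * h * y)          # shrink if right, grow if wrong
           for w, h, y in zip(weights, predictions, labels)]
    total = sum(new)
    return [w / total for w in new]              # renormalize to sum to 1

def classify(classifiers, alphas, point):
    """Weighted vote of the base classifiers: returns +1 or -1."""
    score = sum(a * h(point) for a, h in zip(alphas, classifiers))
    return 1 if score >= 0 else -1
</pre>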
We have set up a learner that uses the BoostClassifier from the first part, but its underlying classifiers are the Orange classifiers we were just looking at.  When you combine classifiers in this way, you create what's called an "ensemble classifier".  You will notice, as you run this new classifier on the various data sets we've provided, that the ensemble frequently performs worse in cross-validation than some (or most) of its underlying classifiers.   

Your job is to find a set of classifiers for the ensemble that gets <b>at least 74% accuracy on the breast-cancer dataset</b>.  You may use any subset of the classifiers we've provided.  Put the short names of your classifiers into the list <tt>classifiers_for_best_ensemble</tt>.  Classifier performance appears to be architecture-dependent, so you might be able to get to 74% with just one classifier on your machine, but that won't be enough on the server -- in this case, try to get even better performance at home.
=== Bonus! ===
==== Neural Nets ====
If you are having problems getting your network to converge on certain problems, try the following:
# Change your random weight function to use random.randrange(-1,2) instead of random.uniform(-1,1).  We've found that the former produces weights that are more conducive to network convergence.
If you can get your two-layer network to classify perfectly when running:
<pre>
neural_net_tester.py two_layer
</pre>
but can't get any of the problems in the challenging section working, then your implementation is likely correct and the problem is in your network architecture or initial weight settings.