Lab 5: Bayes Nets

From 6.034 Wiki

(Difference between revisions)
Jump to: navigation, search
m (Iterating over a BayesNet)
Current revision (04:29, 21 August 2020) (view source)
 
(26 intermediate revisions not shown.)
Line 1: Line 1:
-
{{Unreleased}}
+
<!-- {{Unreleased}} -->
-
 
+
__TOC__
__TOC__
 +
{{Lab_Due|when=Thursday, October 22}}
-
{{Lab_Due|when=??}}
+
{{Get_Code|lab=5}}
-
 
+
-
{{Get_Code|lab=8}}
+
== The world is an uncertain place ==
== The world is an uncertain place ==
Line 48: Line 46:
In particular, the links in this graph inform us of the following:
In particular, the links in this graph inform us of the following:
-
:Every feature (accent, eats-garlic, and complexion) is conditionally independent of every feature, '''given the classification''' as a vampire or not.
+
:Every feature (accent, eats-garlic, and complexion) is conditionally independent of every other feature, '''given the classification''' as a vampire or not.
In a world with vampires, it would be crazy to claim that the individual features are independent: after all, if someone is a vampire, they are more likely to have particular traits. However, '''given that we know''' whether or not someone is a vampire, the features no longer depend on each other.
In a world with vampires, it would be crazy to claim that the individual features are independent: after all, if someone is a vampire, they are more likely to have particular traits. However, '''given that we know''' whether or not someone is a vampire, the features no longer depend on each other.
-
 
-
== Lab 8 API ==
 
-
 
-
As usual, we have supplied an API for lab 8, which gives you access to a lot of administrative functions for manipulating and interacting with Bayes nets, as well as a few other helper functions. If you're confused about some of the nomenclature, don't worry! You can always read ahead and then come back to reference the API when necessary.
 
-
 
-
=== Getting information from a BayesNet ===
 
-
<dl>
 
-
 
-
* <tt>net.</tt>'''get_variables'''(): Returns a list of all variables in the Bayes net.
 
-
* <tt>net.</tt>'''get_parents'''(<tt>var</tt>): Returns the set of variables that are the parents of <tt>var</tt>.
 
-
* <tt>net.</tt>'''get_children'''(<tt>var</tt>): Returns the set of variables that are the children of <tt>var</tt>.
 
-
* <tt>net.</tt>'''get_domain'''(<tt>var</tt>): Returns the domain of the variable. For example, if the variable is boolean, the function will return <tt>(False, True)</tt>.
 
-
* <tt>net.</tt>'''is_neighbor'''(<tt>var1</tt>, <tt>var2</tt>): Returns True if <tt>var1</tt> and <tt>var2</tt> are directly connected by an edge in the Bayes net, otherwise returns False. In other words, returns True just if <tt>var1</tt> is a parent of <tt>var2</tt>, or <tt>var2</tt> is a parent of <tt>var1</tt>.
 
-
* <tt>net.</tt>'''get_probability'''(<tt>hypothesis</tt>, <tt>parents_vals=None</tt>, <tt>infer_missing=True</tt>): Given the value of a variable and all its parents, looks up and returns a probability in the network's conditional probability tables. For example, if A, B, C are boolean variables in a particular network, and C is the child of A and B; you can look up <i>P(C = True | A=False, B=False)</i> using <tt>net.get_probability({"C" : True}, {"A": False, "B" : False})</tt>.  If <tt>infer_missing</tt> is True, the function will try to infer missing entries in the table using the fact that certain probabilities must sum to one.  Note that <tt>get_probability</tt> may raise a <tt>ValueError</tt> or <tt>LookupError</tt> if it is unable to find the requested probability in the network's conditional probability tables.
 
-
* <tt>net.</tt>'''find_path'''(<tt>start_var</tt>, <tt>goal_var</tt>): Performs breadth-first search (BFS) to find a path from <tt>start_var</tt> to <tt>goal_var</tt>.  Returns path as a list of nodes (variables), or <tt>None</tt> if no path was found.
 
-
 
-
</dl>
 
-
 
-
 
-
To view the structure of a BayesNet, you can print its variables and connections directly using <tt>print net</tt>.
 
-
 
-
To print a conditional probability table:
 
-
<tt>net.</tt>'''CPT_print'''(<tt>var=None</tt>): Pretty-prints the Bayes net's conditional probability table for var. If var is not specified, prints every conditional probability table in the net.
 
-
 
-
=== Iterating over a BayesNet ===
 
-
 
-
<tt>net.</tt>'''topological_sort'''(<!--<tt>variables=None</tt>-->): Returns an ordered list of all variables such that each variable
 
-
comes after all its parents. <!--If the list <tt>variables</tt> isn't specified, uses the list of all variables.-->
 
-
 
-
 
-
<tt>net.</tt>'''combinations'''(<tt>variables</tt>, <tt>constant_bindings=None</tt>):
 
-
Given a list of variables, returns a list of every possible
 
-
binding of those variables. For example, if you have three
 
-
boolean variables "A", "B", "C", this function will return a
 
-
list of eight possible bindings :
 
-
<ul>
 
-
<li><tt>{'A': False, 'B': False, 'C': False}</tt></li>
 
-
<li><tt>{'A': False, 'B': False, 'C': True}</tt></li>
 
-
<li><tt>{'A': False, 'B': True, 'C': False}</tt></li>
 
-
<li><tt>{'A': False, 'B': True, 'C': True}</tt></li>
 
-
<li><tt>{'A': True, 'B': False, 'C': False}</tt></li>
 
-
<li>…</li>
 
-
<li><tt>{'A': True, 'B': True, 'C': True}</tt></li>
 
-
</ul>
 
-
 
-
<br> Each binding in the list will also include the entries
 
-
in <tt>constant_bindings</tt>, if specified. For example,
 
-
passing <tt>constant_bindings = {'D': False, 'E': True}</tt>
 
-
as an argument would include the entries <tt>{'D': False, 'E': True}</tt> in every one of the above eight bindings.
 
-
Passing <tt>constant_bindings = {'C': True}</tt> as an argument would result in only four bindings, each including the entry <tt>{'C': True}</tt>.
 
-
 
-
=== Creating or modifying a BayesNet ===
 
-
 
-
We predict that you will only require the following four methods:
 
-
<dl>
 
-
 
-
* <tt>net.</tt>'''subnet'''(<tt>subnet_variables</tt>): Returns a new BayesNet that is a subnet of this one.  The new net includes the specified variables and any edges that exist between them in the original Bayes net.  Ignores any specified variables that aren't in the original Bayes net.
 
-
* <tt>net.</tt>'''link'''(<tt>var_parent</tt>, <tt>var_child</tt>): Adds a directed edge from <tt>var_parent</tt> to <tt>var_child</tt>, then returns the Bayes net.  (If the edge already exists, this function simply returns the Bayes net.)
 
-
* <tt>net.</tt>'''make_bidirectional'''(): Adds edges so that all original edges effectively become bi-directional.  Returns the Bayes net.
 
-
* <tt>net.</tt>'''remove_variable'''(<tt>var</tt>): Removes <tt>var</tt> from the Bayes net and deletes all edges to/from <tt>var</tt>.  (If <tt>var</tt> is not a variable in the Bayes net, does nothing.)  Returns the Bayes net.
 
-
 
-
</dl>
 
-
 
-
<!--
 
-
However, if you would like to instantiate a new BayesNet or perform other modifications, the following methods also exist:
 
-
TODO
 
-
-->
 
-
 
-
=== Helper functions ===
 
-
'''approx_equal'''(<tt>a, b, epsilon=0.0000000001</tt>): Returns True if two numbers a and b are approximately equal (within some margin epsilon), otherwise False.
 
-
 
-
Sample usage:
 
-
<tt>approx_equal(0.4999999999999999, 0.5)</tt> -> <tt>True</tt>
 
-
 
-
 
-
'''product'''(<tt>factors</tt>): Computes the product of a list of numbers, similar to the Python built-in function <tt>sum</tt>
 
-
 
-
<!--
 
-
'''filter_dict'''(<tt>pred</tt>, <tt>dictionary</tt>):
 
-
Returns a new dictionary containing only the keys
 
-
matching <tt>pred(key)</tt>.
 
-
 
-
For example, <tt>filter_dict({"A"&nbsp;:&nbsp;0, "B"&nbsp;:&nbsp;1, "C"&nbsp;:&nbsp;2, "D"&nbsp;:&nbsp;3}, lambda x : x in ["A", "D"])</tt> will return the map <tt>{"A" : 0, "D" : 3}</tt> -->
 
-
<!--TODO I didn't know filter_dict existed... is it useful?-->
 
-
 
-
 
== Part 1: Ancestors, descendants, and non-descendants ==
== Part 1: Ancestors, descendants, and non-descendants ==
-
As warm up for this lab, you will now implement some basic functions that will help us throughout the remainder of the lab.
+
As warm up for this lab, you will now implement some basic functions that will help us throughout the remainder of the lab. As usual, we have supplied an API for lab 8 below, which gives you access to a lot of administrative functions for manipulating and interacting with Bayes nets, as well as a few other helper functions.  
First, implement a function that takes in <tt>net</tt> (a <tt>BayesNet</tt> object) and <tt>var</tt> (a variable, as a string), and returns a set containing the '''ancestors''' of <tt>var</tt> in the network. This set should include the variable's parents, its parents' parents, etc.
First, implement a function that takes in <tt>net</tt> (a <tt>BayesNet</tt> object) and <tt>var</tt> (a variable, as a string), and returns a set containing the '''ancestors''' of <tt>var</tt> in the network. This set should include the variable's parents, its parents' parents, etc.
Line 177: Line 89:
Before we start calculating probabilities, write a helper function that simplifies a probability expression using the [[#Bayesian_Networks_encode_independence_assumptions | independence assumption encoded in the Bayes net]]. This function will take in
Before we start calculating probabilities, write a helper function that simplifies a probability expression using the [[#Bayesian_Networks_encode_independence_assumptions | independence assumption encoded in the Bayes net]]. This function will take in
* <tt>net</tt>, a <tt>BayesNet</tt> object
* <tt>net</tt>, a <tt>BayesNet</tt> object
-
* <tt>var</tt>, a single bayes net variable as a string
+
* <tt>var</tt>, a single Bayes net variable as a string
* <tt>givens</tt>, a dictionary mapping variables to assigned values in the net
* <tt>givens</tt>, a dictionary mapping variables to assigned values in the net
and it will return another dictionary which is the simplified version of <tt>givens</tt> as allowed by the Bayes net assumption.
and it will return another dictionary which is the simplified version of <tt>givens</tt> as allowed by the Bayes net assumption.
Line 202: Line 114:
If the probability of the hypothesis variable can be looked up directly in the network's probability tables, or if its given variables can be simplified (with <tt>simplify_givens</tt>) to a form that can be looked up directly, return it. Otherwise, raise the exception <tt>LookupError</tt>.
If the probability of the hypothesis variable can be looked up directly in the network's probability tables, or if its given variables can be simplified (with <tt>simplify_givens</tt>) to a form that can be looked up directly, return it. Otherwise, raise the exception <tt>LookupError</tt>.
-
Note that <tt>net.get_probability</tt> will raise a <tt>ValueError</tt> if the hypothesis contains multiple variables, or if the givens do not contain exactly the parents of the hypothesis' variable. You may want to use a try/except block to catch this error.
+
Note that the function <tt>net.get_probability</tt> will raise a <tt>ValueError</tt> if the provided hypothesis contains multiple variables, or if the provided givens do not contain exactly the parents of the hypothesis' variable. You may want to use a try/except block to catch this error. We will handle multiple variable hypotheses in later functions.
(For a refresher on how to handle exceptions, see the [[Lab_5:_Identification_Trees#Appendix:_How_to_handle_exceptions|appendix in Lab 5]].)
(For a refresher on how to handle exceptions, see the [[Lab_5:_Identification_Trees#Appendix:_How_to_handle_exceptions|appendix in Lab 5]].)
Line 218: Line 130:
For example to compute the joint probability P(A=True, B=False, C=False), one could call <tt>probability_joint(net, {"A": True, "B": False, "C": False})</tt>.
For example to compute the joint probability P(A=True, B=False, C=False), one could call <tt>probability_joint(net, {"A": True, "B": False, "C": False})</tt>.
-
To compute the joint probability for a Bayes net, you can use the chain rule to split up the probability into a product of conditional probability values produced by <tt>probability_lookup</tt>.
+
To compute the joint probability for a Bayes net, you can use the [http://web.mit.edu/jmn/www/6.034/probability-flowchart.pdf chain rule] to split up the probability into a product of conditional probability values produced by <tt>probability_lookup</tt>. Hint: We can use the Bayes net Assumption to our advantage. When we expand using the chain rule, how can we order the variables to ensure that we can use <tt>probability_lookup</tt> for each term?
You may assume that the hypothesis represents a valid joint probability (that is, contains every variable in the Bayes net).
You may assume that the hypothesis represents a valid joint probability (that is, contains every variable in the Bayes net).
Line 228: Line 140:
  def probability_marginal(<tt>net</tt>, <tt>hypothesis</tt>):
  def probability_marginal(<tt>net</tt>, <tt>hypothesis</tt>):
-
For example, if there are three variables in the system, P(A=True) = P(A=True, B=True) + P(A=True, B=False).
+
For example, if there are two variables in the system, P(A=True) = P(A=True, B=True) + P(A=True, B=False).
Hint 1: The <tt>BayesNet</tt> method <tt>net.combinations</tt> may be useful.
Hint 1: The <tt>BayesNet</tt> method <tt>net.combinations</tt> may be useful.
-
Hint 2: To combine two dictionaries <tt>d1</tt> and <tt>d2</tt> to form a third dict <tt>d3</tt>, you can use <tt>d3 = dict(d1, **d2)</tt>. (But note that if a key exists in both <tt>d1</tt> and <tt>d2</tt>, the resulting dict <tt>d3</tt> will only include the key's value from <tt>d2</tt>.)
+
Hint 2: If you are not passing every test for this function, consider the order in which you are expanding using the chain rule.
==== Conditional probability ====
==== Conditional probability ====
Line 242: Line 154:
  def probability_conditional(<tt>net</tt>, <tt>hypothesis</tt>, <tt>givens=None</tt>):
  def probability_conditional(<tt>net</tt>, <tt>hypothesis</tt>, <tt>givens=None</tt>):
-
Note that if <tt>givens</tt> is <tt>None</tt>, the "conditional" probability is really just a marginal or joint probability.  
+
Note that if <tt>givens</tt> is <tt>None</tt>, the "conditional" probability is really just a marginal or joint probability.
 +
 
 +
Hint 1: To combine two dictionaries <tt>d1</tt> and <tt>d2</tt> to form a third dict <tt>d3</tt>, you can use <tt>d3 = dict(d1, **d2)</tt>. (But note that if a key exists in both <tt>d1</tt> and <tt>d2</tt>, the resulting dict <tt>d3</tt> will only include the key's value from <tt>d2</tt>.)
 +
 
 +
Hint 2: There are a few nuanced edge cases to consider for a conditional probability. For example, in general, what is P(A=True|A=False)?
==== Tying it all together ====
==== Tying it all together ====
Line 255: Line 171:
Implement a function that takes in a single argument <tt>net</tt> (a <tt>BayesNet</tt> object), and returns the number of parameters in the
Implement a function that takes in a single argument <tt>net</tt> (a <tt>BayesNet</tt> object), and returns the number of parameters in the
-
network: that is, the ''minimum'' number of entries that ''must'' be stored in the network's probability tables in order to fully define all joint probabilities. Do '''not''' assume that the variables are boolean; any variable can take on an arbitrary number of values.
+
network: that is, the ''minimum'' number of entries that ''must'' be stored in the network's probability tables in order to fully define all joint probabilities. '''Do not assume that the variables are boolean; any variable can take on an arbitrary number of values.'''
  def number_of_parameters(net):
  def number_of_parameters(net):
Line 296: Line 212:
** ...''C'' had only one parent?
** ...''C'' had only one parent?
** ...''C'' had a third parent, ''D''?
** ...''C'' had a third parent, ''D''?
 +
 +
If you're still having trouble with this section, we strongly recommend coming to office hours to discuss it with a TA, or other students!
== Part 4: Independence ==
== Part 4: Independence ==
Line 323: Line 241:
You should use [http://web.mit.edu/jmn/www/6.034/d-separation.pdf d-separation] to determine whether <tt>var1</tt> and <tt>var2</tt> are independent, based solely on the structure of the Bayes net. This function should ''not'' consider numerical independence.
You should use [http://web.mit.edu/jmn/www/6.034/d-separation.pdf d-separation] to determine whether <tt>var1</tt> and <tt>var2</tt> are independent, based solely on the structure of the Bayes net. This function should ''not'' consider numerical independence.
 +
 +
== Lab 8 API ==
 +
 +
As usual, we have supplied an API for lab 8, which gives you access to a lot of administrative functions for manipulating and interacting with Bayes nets, as well as a few other helper functions.
 +
 +
=== Getting information from a <tt>BayesNet</tt> ===
 +
 +
The class <tt>BayesNet</tt> defines fields and methods for interacting with a Bayes network. To get some information from a <tt>BayesNet</tt> object, we supply the following methods:
 +
 +
;<tt>get_variables()</tt>
 +
:Returns a list of all variables in the Bayes net.
 +
;<tt>get_parents(var)</tt>
 +
:Returns the set of variables that are the parents of <tt>var</tt>.
 +
;<tt>get_children(var)</tt>
 +
:Returns the set of variables that are the children of <tt>var</tt>.
 +
;<tt>get_domain(var)</tt>
 +
:Returns the domain of the variable. For example, if the variable is boolean, the function will return <tt>(False, True)</tt>.
 +
;<tt>is_neighbor(var1, var2)</tt>
 +
:Returns <tt>True</tt> if <tt>var1</tt> and <tt>var2</tt> are directly connected by an edge in the Bayes net, otherwise returns <tt>False</tt>. In other words, returns <tt>True</tt> exactly when <tt>var1</tt> is a parent of <tt>var2</tt> or <tt>var2</tt> is a parent of <tt>var1</tt>.
 +
;<tt>find_path(start_var, goal_var)</tt>
 +
:Performs breadth-first search (BFS) to find a path from <tt>start_var</tt> to <tt>goal_var</tt>.  Returns path as a list of nodes (variables), or <tt>None</tt> if no path was found.
 +
;<tt>get_probability(hypothesis, parents_vals=None, infer_missing=True)</tt>
 +
:Given <tt>hypothesis</tt> (a singleton dictionary mapping a variable to its value) and <tt>parents_vals</tt> (a dictionary mapping all of the hypothesis variable's parents to values), looks up and returns a probability in the network's conditional probability tables. If <tt>infer_missing</tt> is <tt>True</tt>, the function will try to infer missing entries in the table using the fact that certain probabilities must sum to one.
 +
:Requires that <tt>hypothesis</tt> has exactly one variable and that <tt>parents_vals</tt>'s keys are exactly the parents of the hypothesis variable; if either condition isn't met, raises a <tt>ValueError</tt>.
 +
:If the hypothesis variable does not exist in the network, or the value cannot be appropriately located in the conditional probability tables of the net, raises a <tt>LookupError</tt>.
 +
:Example: Suppose <tt>net</tt> is a <tt>BayesNet</tt> instance with three boolean variables ''A'', ''B'', and ''C'', with ''C'' the child of ''A and ''B''. To look up P(C = True | A=False, B=False), you can call <tt>net.get_probability({"C" : True}, {"A": False, "B" : False})</tt>. 
 +
 +
To view the structure of a <tt>BayesNet</tt>, you can print its variables and connections directly using the <tt>print</tt> statement.
 +
 +
To print a conditional probability table:
 +
;<tt>CPT_print(var=None)</tt>
 +
:Pretty-prints the Bayes net's conditional probability table for <tt>var</tt> (a variable in the net). If <tt>var</tt> is not specified, prints every conditional probability table in the net.
 +
 +
=== Iterating over a <tt>BayesNet</tt> ===
 +
 +
;<tt>topological_sort(<!--variables=None-->)</tt>
 +
:Returns an ordered list of all variables such that each variable comes after all its parents. <!--If the list <tt>variables</tt> isn't specified, uses the list of all variables.-->
 +
 +
;<tt>combinations(variables, constant_bindings=None)</tt>
 +
:Given <tt>variables</tt> (a list of variables), and <tt>constant_bindings</tt> (a dictionary mapping variables to constant values), returns a list of dictionaries, each of which is a possible binding of <tt>variables</tt>, mapping variables to values. Each binding dictionary also includes the entries in <tt>constant_bindings</tt>, if specified.
 +
:Example: If your Bayes net <tt>net</tt> has three boolean variables ''A'', ''B'', and ''C'', calling <tt>net.combinations(['A','B','C'])</tt> will return a list of eight dictionaries indicating possible bindings for ''A'', ''B'', and ''C'':
 +
:* <tt>{'A': False, 'B': False, 'C': False}</tt>
 +
:* <tt>{'A': False, 'B': False, 'C': True}</tt>
 +
:* <tt>{'A': False, 'B': True, 'C': False}</tt>
 +
:* <tt>{'A': False, 'B': True, 'C': True}</tt>
 +
:* <tt>{'A': True, 'B': False, 'C': False}</tt>
 +
:* ...
 +
:* <tt>{'A': True, 'B': True, 'C': True}</tt>
 +
:Example: Passing <tt>constant_bindings = {'D': False, 'E': True}</tt> as an argument in the above example would include the entries <tt>{'D': False, 'E': True}</tt> in every one of the above eight bindings. On the other hand, passing <tt>constant_bindings = {'C': True}</tt> as an argument would result in only four bindings, each including the entry <tt>{'C': True}</tt>.
 +
 +
=== Creating or modifying a <tt>BayesNet</tt> ===
 +
 +
The following methods will allow you to create or modify a <tt>BayesNet</tt> instance.
 +
 +
;<tt>subnet(subnet_variables)</tt>
 +
:Given <tt>subnet_variables</tt> (a list of variables in the net), returns a new <tt>BayesNet</tt> that is a subnet of this one. The new net includes the specified variables and any edges that exist between them in the original Bayes net. Ignores any specified variables that aren't in the original Bayes net.
 +
;<tt>link(var_parent, var_child)</tt>
 +
:Adds a directed edge from <tt>var_parent</tt> to <tt>var_child</tt>, then returns the modified Bayes net. If the edge already exists, this function does nothing, and returns the Bayes net.
 +
;<tt>make_bidirectional()</tt>
 +
:Adds edges so that all original edges effectively become bi-directional. Returns the modified Bayes net.
 +
;<tt>remove_variable(var)</tt>
 +
:Removes <tt>var</tt> from the Bayes net and deletes all edges to and from <tt>var</tt>. If <tt>var</tt> is not a variable in the Bayes net, does nothing. Returns the Bayes net.
 +
 +
=== Helper functions ===
 +
 +
We also provide a couple of helper functions that you may use directly in your <tt>lab8.py</tt> file:
 +
 +
;<tt>approx_equal(a, b, epsilon=0.0000000001)</tt>
 +
:Returns <tt>True</tt> if two numbers <tt>a</tt> and <tt>b</tt> are approximately equal (within some margin <tt>epsilon</tt>), otherwise <tt>False</tt>.
 +
:Example: <tt>approx_equal(0.4999999999999999, 0.5)</tt> --> <tt>True</tt>
 +
 +
;<tt>product(factors)</tt>
 +
:Computes the product of a list of numbers, similar to the Python built-in function <tt>sum</tt>.
 +
:Example: <tt>product([1,2,3,4])</tt> --> <tt>24</tt>
 +
 +
At this point, we strongly recommend reading the API again: there are several methods there that will likely be of use.
The following <tt>set</tt> methods may be useful:
The following <tt>set</tt> methods may be useful:

Current revision

Contents


This lab is due by Thursday, October 22 at 10:00pm.

Before working on the lab, you will need to get the code. You can...

  • Use Git on your computer: git clone username@athena.dialup.mit.edu:/mit/6.034/www/labs/fall2020/lab5
  • Use Git on Athena: git clone /mit/6.034/www/labs/fall2020/lab5


All of your answers belong in the main file lab5.py. To submit your lab to the test server, you will need to download your key.py file and put it in either your lab5 directory or its parent directory. You can also view all of your lab submissions and grades here.


The world is an uncertain place

So far in 6.034, we have discussed various machine learning methods that are used to classify data in different ways. Such methods have included k-nearest neighbors, support vector machines, and neural networks. These machine learning algorithms train on and predict classifications for data points that have definitive features, such as "height", "likes-garlic", or more mathematically abstract features like numbers x and y.

Unfortunately, the world is not so cut-and-dried. Often, we deal with features that may exist, values that fall somewhere within a distribution, or events that happen with some likelihood. As such, in this lab, we will be exploring the world of probability as well as Bayesian networks which encode certain assumptions in probability.

Bayesian Networks encode independence assumptions

As described in lecture and recitation, events might be marginally or conditionally independent.

  • A and B are said to be marginally independent when
  • A is said to be conditionally independent of B given C when

A Bayesian Network (also known as a Bayes Net) is a graphical model that encodes assumptions of conditional independence between certain events (variables). This assumption of conditional independence is often referred to as Bayes net assumption.

The Bayes net assumption
Every variable in a Bayes net is conditionally independent of its non-descendants, given its parents.

For example, consider the Bayes net shown below:

In this net, variable D has

  • one parent (C),
  • three ancestors (A, B, and C),
  • two descendants (F and G), and
  • 4 non-descendants (A, B, C, and E).

Then, we could say (for example) that

  • D is conditionally independent of A given C: i.e. P(D|AC) = P(D|C)
  • D is conditionally independent of E given C: i.e. P(D|EC) = P(D|C)

Conceptually, we see that the arrow relation simply indicates dependency: each variable is only dependent on the value(s) of its parent(s).

As a more concrete example, the following Bayes net encodes assumptions about the features of a subject versus its classification as a vampire or not:

In particular, the links in this graph inform us of the following:

Every feature (accent, eats-garlic, and complexion) is conditionally independent of every other feature, given the classification as a vampire or not.

In a world with vampires, it would be crazy to claim that the individual features are independent: after all, if someone is a vampire, they are more likely to have particular traits. However, given that we know whether or not someone is a vampire, the features no longer depend on each other.

Part 1: Ancestors, descendants, and non-descendants

As warm up for this lab, you will now implement some basic functions that will help us throughout the remainder of the lab. As usual, we have supplied an API for lab 8 below, which gives you access to a lot of administrative functions for manipulating and interacting with Bayes nets, as well as a few other helper functions.

First, implement a function that takes in net (a BayesNet object) and var (a variable, as a string), and returns a set containing the ancestors of var in the network. This set should include the variable's parents, its parents' parents, etc.

def get_ancestors(net, var):

Next, implement a function that returns a set containing the descendants of the variable in the network. This set should include the variable's children, its children's children, etc.

def get_descendants(net, var):

Finally, implement a function that returns a set containing the non-descendants of the variable. Note that a variable is neither a descendant nor a non-descendant of itself.

def get_nondescendants(net, var):

Part 2: Probability

As a brief reminder, recall that there are three basic types of probabilities. Suppose our universe consists of five variables, A, B, C, D, and E. Then, the three types of probability associated with this universe are:

  • Joint probability: Likelihood of a completely specified state of events, e.g. P(ABCDE)
  • Marginal probability: Likelihood of an incompletely specified state of events, e.g. P(ACD) or P(E)
  • Conditional probability: Likelihood of an event given some known information, e.g. P(A|C) or P(DE|AB)

When discussing probability, the hypothesis is the event or set of events whose likelihood we are interested in evaluating, and the givens are the event or events upon which we're conditioning the hypothesis. In other words,

  • for a joint or marginal probability, we are interested in P(hypothesis)
  • for a conditional probability, we are interested in P(hypothesis | givens)

In this lab, hypotheses and givens are expressed as dicts assigning variables to values. For example, P(A=False | B=True, C=False) is represented as the two dicts:

hypothesis = {'A': False}
givens = {'B': True, 'C': False}

The probability flowchart may be useful as a reference for manipulating probabilities.

Simplifying probability expressions

Before we start calculating probabilities, write a helper function that simplifies a probability expression using the independence assumption encoded in the Bayes net. This function will take in

  • net, a BayesNet object
  • var, a single Bayes net variable as a string
  • givens, a dictionary mapping variables to assigned values in the net

and it will return another dictionary which is the simplified version of givens as allowed by the Bayes net assumption.

def simplify_givens(net, var, givens):

In other words,

  • If all parents of var are given, and no descendants of var are given, simplify_givens should return a new dictionary of givens with var's non-descendants (except parents) removed.
  • Otherwise, if not all the parents of var are given, or if the givens include one or more descendants of var, the function should simply return the original givens.

In either case, the function should not modify the original givens.

Hint: The set method .issubset may be useful here.

Looking up probabilities in a Bayes net

Now, we will implement a function that looks up probabilities in the net's conditional probability tables. This function takes in three arguments:

  • net, a BayesNet object
  • hypothesis, a dictionary mapping variables to values.
  • givens, a dictionary mapping variables to assigned values. If this argument is None, the probability to look up is not conditioned on anything.
def probability_lookup(net, hypothesis, givens=None):

If the probability of the hypothesis variable can be looked up directly in the network's probability tables, or if its given variables can be simplified (with simplify_givens) to a form that can be looked up directly, return it. Otherwise, raise the exception LookupError.

Note that the function net.get_probability will raise a ValueError if the provided hypothesis contains multiple variables, or if the provided givens do not contain exactly the parents of the hypothesis' variable. You may want to use a try/except block to catch this error. We will handle multiple variable hypotheses in later functions.

(For a refresher on how to handle exceptions, see the appendix in Lab 5.)

Computing probabilities

Now we will implement functions that actually compute different types of probabilities. We will start by defining functions that can compute joint, marginal, and conditional probabilities; then we will generalize by implementing a single probability function.

Joint probability

Given net (a BayesNet object) and hypothesis (a dictionary mapping variables in the network to values), compute its joint probability:

def probability_joint(net, hypothesis):

For example to compute the joint probability P(A=True, B=False, C=False), one could call probability_joint(net, {"A": True, "B": False, "C": False}).

To compute the joint probability for a Bayes net, you can use the chain rule to split up the probability into a product of conditional probability values produced by probability_lookup. Hint: We can use the Bayes net Assumption to our advantage. When we expand using the chain rule, how can we order the variables to ensure that we can use probability_lookup for each term?

You may assume that the hypothesis represents a valid joint probability (that is, contains every variable in the Bayes net).

Marginal probability

Recall that a marginal probability can be represented as a sum of joint probabilities. You should implement probability_marginal (which takes in the same arguments as probability_joint) to compute the marginal probability as a sum of joint probabilities produced by probability_joint:

def probability_marginal(net, hypothesis):

For example, if there are two variables in the system, P(A=True) = P(A=True, B=True) + P(A=True, B=False).

Hint 1: The BayesNet method net.combinations may be useful.

Hint 2: If you are not passing every test for this function, consider the order in which you are expanding using the chain rule.

Conditional probability

Some conditional probabilities can be looked up directly in the Bayes net using probability_lookup. The rest, however, can be computed as ratios of marginal probabilities.

Implement probability_conditional, which takes in the same arguments as probability_marginal plus the addition of an optional givens dictionary mapping variables to values.

def probability_conditional(net, hypothesis, givens=None):

Note that if givens is None, the "conditional" probability is really just a marginal or joint probability.

Hint 1: To combine two dictionaries d1 and d2 to form a third dict d3, you can use d3 = dict(d1, **d2). (But note that if a key exists in both d1 and d2, the resulting dict d3 will only include the key's value from d2.)

Hint 2: There are a few nuanced edge cases to consider for a conditional probability. For example, in general, what is P(A=True|A=False)?

Tying it all together

Use all of the above types of probability to produce a function that can compute any probability expression in terms of the Bayes net parameters. This function takes in the same arguments as probability_conditional:

def probability(net, hypothesis, givens=None):

Part 3: Counting Parameters

Encoded in every Bayes net is a set of independence assumptions for the variables contained within. In general, knowing information about variables allows us to make more informed choices when computing probabilities. In particular, with these independence assumptions, we are able to drastically decrease the amount of information we need to store while still retaining the ability to compute any joint probability within the net.

Implement a function that takes in a single argument net (a BayesNet object), and returns the number of parameters in the network: that is, the minimum number of entries that must be stored in the network's probability tables in order to fully define all joint probabilities. Do not assume that the variables are boolean; any variable can take on an arbitrary number of values.

def number_of_parameters(net):

Python hint: The helper function product, defined in the API section, may be helpful here.

Conceptual hint: Consider variable C from the Bayes net above. Its two parents are A and B. Suppose each variable has three values in its domain. Then, the conditional probability table for C would look like this:

A B P(C=c1 | A,B) P(C=c2 | A,B)
a1 b1 # #
a1 b2 # #
a1 b3 # #
a2 b1 # #
a2 b2 # #
a2 b3 # #
a3 b1 # #
a3 b2 # #
a3 b3 # #


Questions to consider:

  • How many parameters (represented by "#") are in the table? Why?
  • Why don't we need a column for P(C=c3 | A,B)?
  • How would this table change if...
    • ...A were boolean?
    • ...A had 5 values instead of 3?
    • ...C had 5 values instead of 3?
    • ...C had only one parent?
    • ...C had a third parent, D?

If you're still having trouble with this section, we strongly recommend coming to office hours to discuss it with a TA, or other students!

Part 4: Independence

Lastly, we will write two functions that check for variable independence.

The first such function that you should implement is is_independent, which checks if two variables are numerically independent. This function takes in four arguments,

  • net, a BayesNet object
  • var1, a variable in the Bayes net
  • var2, a variable in the Bayes net
  • givens, a dictionary mapping variables to values; possibly None

returning True if var1 and var2 are independent.

def is_independent(net, var1, var2, givens=None):

If givens is None, this function checks if var1 and var2 are marginally independent. Otherwise, this function checks if var1 and var2 are conditionally independent given givens.

Hint: The helper function approx_equal may be useful for comparing probabilities.

Recall that variables can be independent either because of the topology of the network (structural independence), or because of their conditional probability table entries (numerical independence). In general, to determine variable independence, it is sufficient to check only numerical independence, because variables that are structurally independent are guaranteed to also be numerically independent. In other words, you can implement this function by computing and comparing probabilities, without considering d-separation.


Now, implement a function that checks for structural independence. This function takes in the same arguments as is_independent, and has the same return type.

def is_structurally_independent(net, var1, var2, givens=None):

You should use d-separation to determine whether var1 and var2 are independent, based solely on the structure of the Bayes net. This function should not consider numerical independence.

Lab 8 API

As usual, we have supplied an API for lab 8, which gives you access to a lot of administrative functions for manipulating and interacting with Bayes nets, as well as a few other helper functions.

Getting information from a BayesNet

The class BayesNet defines fields and methods for interacting with a Bayes network. To get some information from a BayesNet object, we supply the following methods:

get_variables()
Returns a list of all variables in the Bayes net.
get_parents(var)
Returns the set of variables that are the parents of var.
get_children(var)
Returns the set of variables that are the children of var.
get_domain(var)
Returns the domain of the variable. For example, if the variable is boolean, the function will return (False, True).
is_neighbor(var1, var2)
Returns True if var1 and var2 are directly connected by an edge in the Bayes net, otherwise returns False. In other words, returns True exactly when var1 is a parent of var2 or var2 is a parent of var1.
find_path(start_var, goal_var)
Performs breadth-first search (BFS) to find a path from start_var to goal_var. Returns path as a list of nodes (variables), or None if no path was found.
get_probability(hypothesis, parents_vals=None, infer_missing=True)
Given hypothesis (a singleton dictionary mapping a variable to its value) and parents_vals (a dictionary mapping all of the hypothesis variable's parents to values), looks up and returns a probability in the network's conditional probability tables. If infer_missing is True, the function will try to infer missing entries in the table using the fact that certain probabilities must sum to one.
Requires that hypothesis has exactly one variable and that parents_vals's keys are exactly the parents of the hypothesis variable; if either condition isn't met, raises a ValueError.
If the hypothesis variable does not exist in the network, or the value cannot be appropriately located in the conditional probability tables of the net, raises a LookupError.
Example: Suppose net is a BayesNet instance with three boolean variables A, B, and C, with C the child of A and B. To look up P(C = True | A=False, B=False), you can call net.get_probability({"C" : True}, {"A": False, "B" : False}).

To view the structure of a BayesNet, you can print its variables and connections directly using the print statement.

To print a conditional probability table:

CPT_print(var=None)
Pretty-prints the Bayes net's conditional probability table for var (a variable in the net). If var is not specified, prints every conditional probability table in the net.

Iterating over a BayesNet

topological_sort()
Returns an ordered list of all variables such that each variable comes after all its parents.
combinations(variables, constant_bindings=None)
Given variables (a list of variables), and constant_bindings (a dictionary mapping variables to constant values), returns a list of dictionaries, each of which is a possible binding of variables, mapping variables to values. Each binding dictionary also includes the entries in constant_bindings, if specified.
Example: If your Bayes net net has three boolean variables A, B, and C, calling net.combinations(['A','B','C']) will return a list of eight dictionaries indicating possible bindings for A, B, and C:
  • {'A': False, 'B': False, 'C': False}
  • {'A': False, 'B': False, 'C': True}
  • {'A': False, 'B': True, 'C': False}
  • {'A': False, 'B': True, 'C': True}
  • {'A': True, 'B': False, 'C': False}
  • ...
  • {'A': True, 'B': True, 'C': True}
Example: Passing constant_bindings = {'D': False, 'E': True} as an argument in the above example would include the entries {'D': False, 'E': True} in every one of the above eight bindings. On the other hand, passing constant_bindings = {'C': True} as an argument would result in only four bindings, each including the entry {'C': True}.

Creating or modifying a BayesNet

The following methods will allow you to create or modify a BayesNet instance.

subnet(subnet_variables)
Given subnet_variables (a list of variables in the net), returns a new BayesNet that is a subnet of this one. The new net includes the specified variables and any edges that exist between them in the original Bayes net. Ignores any specified variables that aren't in the original Bayes net.
link(var_parent, var_child)
Adds a directed edge from var_parent to var_child, then returns the modified Bayes net. If the edge already exists, this function does nothing, and returns the Bayes net.
make_bidirectional()
Adds edges so that all original edges effectively become bi-directional. Returns the modified Bayes net.
remove_variable(var)
Removes var from the Bayes net and deletes all edges to and from var. If var is not a variable in the Bayes net, does nothing. Returns the Bayes net.

Helper functions

We also provide a couple of helper functions that you may use directly in your lab8.py file:

approx_equal(a, b, epsilon=0.0000000001)
Returns True if two numbers a and b are approximately equal (within some margin epsilon), otherwise False.
Example: approx_equal(0.4999999999999999, 0.5) --> True
product(factors)
Computes the product of a list of numbers, similar to the Python built-in function sum.
Example: product([1,2,3,4]) --> 24

At this point, we strongly recommend reading the API again: there are several methods there that will likely be of use.

The following set methods may be useful:

  • set1.update(set2): Update set1 with the union of itself and set2. (This is basically the set equivalent of list.extend.)
  • set1.intersection(set2): Return the intersection of set1 and set2 as a new set.

Survey

Please answer these questions at the bottom of your lab file:

  • NAME: What is your name? (string)
  • COLLABORATORS: Other than 6.034 staff, whom did you work with on this lab? (string, or empty string if you worked alone)
  • HOW_MANY_HOURS_THIS_LAB_TOOK: Approximately how many hours did you spend on this lab? (number or string)
  • WHAT_I_FOUND_INTERESTING: Which parts of this lab, if any, did you find interesting? (string)
  • WHAT_I_FOUND_BORING: Which parts of this lab, if any, did you find boring or tedious? (string)
  • (optional) SUGGESTIONS: What specific changes would you recommend, if any, to improve this lab for future years? (string)


(We'd ask which parts you find confusing, but if you're confused you should really ask a TA.)

When you're done, run the online tester to submit your code.

Personal tools