Lab 5: Bayes Nets

From 6.034 Wiki

(Difference between revisions)
Jump to: navigation, search
m (Probability)
m (Probability)
Line 32: Line 32:
Hypotheses and givens are expressed as dicts assigning variables to
Hypotheses and givens are expressed as dicts assigning variables to
-
values. For example, P(A=False | B=True, C=False) is represented as
+
values. For example, ''P(A=False | B=True, C=False)'' is represented as
the two dicts:
the two dicts:
-
  hypothesis = {A: False}
+
  hypothesis = {'A': False}
-
  givens = {B: True, C: False}
+
  givens = {'B': True, 'C': False}
First, write a helper function:
First, write a helper function:

Revision as of 19:03, 15 November 2016

Contents


This lab is due by Tuesday, November 22 at 10:00pm.

To work on this lab, you will need to get the code, much like you did for the previous labs. You can:

Online tests will be made available by Wednesday morning (11/16). In the meantime, the local tester provides thorough unit tests for each section of the lab.

Your answers for this lab belong in the main file lab8.py.

Problems

Ancestors, descendants, and non-descendants

Implement a function that returns a set containing the ancestors of the variable in the network. This set should include the variable's parents, its parents' parents, etc.

def get_ancestors(net, var):


Implement a function that returns a set containing the descendants of the variable in the network. This set should include the variable's children, its children's children, etc.

def get_descendants(net, var):


Implement a function that returns a set containing the non-descendants of the variable. Note that a variable is neither a descendant nor a non-descendant of itself.

def get_nondescendants(net, var):

Probability

Hypotheses and givens are expressed as dicts assigning variables to values. For example, P(A=False | B=True, C=False) is represented as the two dicts:

hypothesis = {'A': False}
givens = {'B': True, 'C': False}

First, write a helper function:

def simplify_givens(net, var, givens):

If all parents of var are given, and no descendants of var are given, this function should return a new dict of givens with var's non-descendants (except parents) removed -- that is, it should return a new dict consisting of only the parent variables and their values. Otherwise, if not all the parents of var are given, or if the givens include one or more descendants of var, the function should simply return the original givens. In either case, the function should not modify the original givens.

Hint: The set method .issubset may be useful here.


Now, we will implement a function that looks up probabilities in the net's conditional probability tables:

def probability_lookup(net, hypothesis, givens=None):

If the probability can be looked up directly (or simplified (with simplify_givens) to a form that can be looked up directly) in the network's probability tables, return it; otherwise, raise the exception LookupError.

Remember that you can simplify some probability expressions using the independence assumptions of the Bayes net.

Hint: Note that net.get_probability will raise a ValueError if the hypothesis contains multiple variables, or if the givens do not contain exactly the parents of var. You may want to use a try/except block to catch this error.


Given a dict assigning every variable in the network to a value, compute its joint probability (e.g. {"A": True, "B": False, "C": False}):

def probability_joint(net, hypothesis):

Use the chain rule to compute joint probabilities in terms of values produced by probability_lookup. You may assume that the hypothesis represents a valid joint probability (that is, contains every variable in the Bayes net).


A marginal probability can be represented as a sum of joint probabilities:

def probability_marginal(net, hypothesis):

Compute marginal probabilities as sums of joint probabilities produced by probability_joint.

Hint 1: The BayesNet method net.combinations may be useful. (See the API below for details.)

Hint 2: To combine two dictionaries d1 and d2 to form a third dict d3, you can use d3 = dict(d1, **d2).


Some conditional probabilities can be looked up in the Bayes net using probability_lookup. The rest can be computed as ratios of marginal probabilities.

def probability_conditional(net, hypothesis, givens=None):


Use all of the above types of probability to produce a function that can compute any probability expression in terms of the Bayes net parameters:

def probability(net, hypothesis, givens=None):

Parameter-counting and independence

Here are three more things we can do with Bayes nets.


First, implement a function that returns the number of parameters in the network. Note that we are no longer assuming boolean variables, so some variables can take on more than two values.

def number_of_parameters(net):

Hint: The helper function product may be helpful here.


Second, implement a function that checks independence:

def is_independent(network, var1, var2, givens=None):

If givens are supplied, return True if var1 and var2 are conditionally independent given the givens. If givens are not supplied, return True if var1 and var2 are marginally independent. Otherwise, return False.

Hint: The helper function approx_equal may be useful for comparing probabilities, which are floats.

Recall that variables can be independent either because of the topology of the network (structural independence), or because of their conditional probability table entries (numerical independence). It is sufficient to check only numerical independence, because variables that are structurally independent are guaranteed to also be numerically independent. (That is, you can implement by computing and comparing probabilities, without considering d-separation.)


Third, implement a function that checks structural independence:

def is_structurally_independent(net, var1, var2, givens=None):

Use d-separation (or an equivalent method, such as Bayes Ball) to determine whether var1 and var2 are independent, based solely on the structure of the Bayes net. Return True or False accordingly. This function should not consider numerical independence.

The following set methods may be useful:

  • set1.update(set2): Update set1 with the union of itself and set2. (This is basically the set equivalent of list.extend.)
  • set1.intersection(set2): Return the intersection of set1 and set2 as a new set.

Survey

Please answer these questions at the bottom of your lab8.py file:

  • NAME: What is your name? (string)
  • COLLABORATORS: Other than 6.034 staff, whom did you work with on this lab? (string, or empty string if you worked alone)
  • HOW_MANY_HOURS_THIS_LAB_TOOK: Approximately how many hours did you spend on this lab? (number or string)
  • WHAT_I_FOUND_INTERESTING: Which parts of this lab, if any, did you find interesting? (string)
  • WHAT_I_FOUND_BORING: Which parts of this lab, if any, did you find boring or tedious? (string)
  • (optional) SUGGESTIONS: What specific changes would you recommend, if any, to improve this lab for future years? (string)


(We'd ask which parts you find confusing, but if you're confused you should really ask a TA.)

Online tests will be made available by Wednesday morning.

API

The file bayes_api.py defines the BayesNet class, as well as some helper functions, all described below.

Getting information from a BayesNet

  • net.get_variables(): Returns a list of all variables in the Bayes net.
  • net.get_parents(var): Returns the set of variables that are the parents of var.
  • net.get_children(var): Returns the set of variables that are the children of var.
  • net.get_domain(var): Returns the domain of the variable. For example, if the variable is boolean, the function will return (False, True).
  • net.is_neighbor(var1, var2): Returns True if var1 and var2 are directly connected by an edge in the Bayes net, otherwise returns False. In other words, returns True just if var1 is a parent of var2, or var2 is a parent of var1.
  • net.get_probability(hypothesis, parents_vals=None, infer_missing=True): Given the value of a variable and all its parents, looks up and returns a probability in the network's conditional probability tables. For example, if A, B, C are boolean variables in a particular network, and C is the child of A and B; you can look up P(C = True | A=False, B=False) using net.get_probability({"C" : True}, {"A": False, "B" : False}). If infer_missing is True, the function will try to infer missing entries in the table using the fact that certain probabilities must sum to one. Note that get_probability may raise a ValueError or LookupError if it is unable to find the requested probability in the network's conditional probability tables.
  • net.find_path(start_var, goal_var): Performs breadth-first search (BFS) to find a path from start_var to goal_var. Returns path as a list of nodes (variables), or None if no path was found.


To view the structure of a BayesNet, you can print its variables and connections directly using print net.

To print a conditional probability table: net.CPT_print(var=None): Pretty-prints the Bayes net's conditional probability table for var. If var is not specified, prints every conditional probability table in the net.

Iterating over a BayesNet

net.topological_sort(): Returns an ordered list of all variables such that each variable comes after all its parents.


net.combinations(variables, initial_bindings=None): Given a list of variables, returns a list of every possible binding of those variables. For example, if you have three boolean variables "A", "B", "C", this function will return a list of eight possible bindings :

  • {"A" : False, "B" : False, "C" : False}
  • {"A" : True, "B" : False, "C" : False}
  • {"A" : False, "B" : True, "C" : False}
  • {"A" : False, "B" : False, "C" : True}
  • {"A" : True, "B" : True, "C" : True}


Each binding in the list will also include the entries in initial_bindings, if specified. For example, passing initial_bindings = {"D" : False, "E" : True} as an argument would include the entries {"D" : False, "E" : True} in every one of the above eight bindings.


Creating or modifying a BayesNet

We predict that you will only require the following four methods:

  • net.subnet(subnet_variables): Returns a new BayesNet that is a subnet of this one. The new net includes the specified variables and any edges that exist between them in the original Bayes net. Ignores any specified variables that aren't in the original Bayes net.
  • net.link(var_parent, var_child): Adds a directed edge from var_parent to var_child, then returns the Bayes net. (If the edge already exists, this function simply returns the Bayes net.)
  • net.make_bidirectional(): Adds edges so that all original edges effectively become bi-directional. Returns the Bayes net.
  • net.remove_variable(var): Removes var from the Bayes net and deletes all edges to/from var. (If var is not a variable in the Bayes net, does nothing.) Returns the Bayes net.


Helper functions

approx_equal(a, b, epsilon=0.0000000001): Returns True if two numbers a and b are approximately equal (within some margin epsilon), otherwise False.

Sample usage: approx_equal(0.4999999999999999, 0.5) -> True


product(factors): Computes the product of a list of numbers, similar to the Python built-in function sum


Personal tools