Lab 1: Search

From 6.034 Wiki

(Difference between revisions)
Jump to: navigation, search
Line 13: Line 13:
Your answers for the problem set belong in the main file <tt>lab2.py</tt>.  
Your answers for the problem set belong in the main file <tt>lab2.py</tt>.  
-
= Lab 2 =
+
= API =
-
 
+
-
== API ==
+
The file __ contains definitions for graphs and edges.  You probably won't need to read it, because the relevant information is described here.
The file __ contains definitions for graphs and edges.  You probably won't need to read it, because the relevant information is described here.
Line 48: Line 46:
-
== Problems ==
+
= Problems =
-
=== Part 1: Helper Functions ===
+
== Part 1: Helper Functions ==
A path is a list of nodes where each neighboring pair is connected by an edge in the graph. By convention, the start of the path is at index [0] and the end of the path is at index [-1].
A path is a list of nodes where each neighboring pair is connected by an edge in the graph. By convention, the start of the path is at index [0] and the end of the path is at index [-1].
Line 63: Line 61:
-
=== Part 2: Search Algorithms ===
+
== Part 2: Search Algorithms ==
Implement each search algorithm.  Each algorithm takes in an UndirectedGraph graph, a node startNode, and a node goalNode.  Each algorithm should return an appropriate path from startNode to goalNode, or None if no such path exists.
Implement each search algorithm.  Each algorithm takes in an UndirectedGraph graph, a node startNode, and a node goalNode.  Each algorithm should return an appropriate path from startNode to goalNode, or None if no such path exists.
Line 73: Line 71:
-
=== Part 3: Generic Search ===
+
== Part 3: Generic Search ==
generic_search is a generic search algorithm:
generic_search is a generic search algorithm:
Line 122: Line 120:
-
=== Part 4: Counting Extensions ===
+
== Part 4: Counting Extensions ==
generic_search_counter is identical to the generic_search algorithm in ___.  Your task is to slightly modify the search_algorithm code in generic_search_counter so that instead of returning a path to the goal, the search algorithm returns the number of extensions it had to make.
generic_search_counter is identical to the generic_search algorithm in ___.  Your task is to slightly modify the search_algorithm code in generic_search_counter so that instead of returning a path to the goal, the search algorithm returns the number of extensions it had to make.
Line 131: Line 129:
-
=== Part 5: Heuristics ===
+
== Part 5: Heuristics ==
Fill in is_admissible(graph, goalNode) to write a function that returns True if the graph has an admissible heuristic for the given goal node, otherwise False.
Fill in is_admissible(graph, goalNode) to write a function that returns True if the graph has an admissible heuristic for the given goal node, otherwise False.
Line 163: Line 161:
-
== Survey ==
+
= Survey =
Please answer the following questions:
Please answer the following questions:

Revision as of 00:05, 15 August 2015

Contents


Your answers for the problem set belong in the main file lab2.py.

API

The file __ contains definitions for graphs and edges. You probably won't need to read it, because the relevant information is described here.

A graph is an object of type UndirectedGraph.

For an UndirectedGraph g, you can use the following attributes: - You can get a list of all nodes in the graph via the .nodes attribute. - You can get a list of all edges in the graph via the .edges attribute. todo

For an UndirectedGraph g, you can use the following methods: - g.get_neighbors(node1): Given a node node1, returns a list of all nodes that are directly connected to node1. - g.get_neighboring_edges(node1): Given a node node1, returns a list of all edges that have node1 as their startNode. - g.get_edge(node1, node2): Given two nodes, returns the edge that directly connects node1 to node2 (or None if there is no such edge) - g.is_neighbor(node1, node2): Given two nodes, returns True if there is an edge that connects them, or False if there is no such edge - g.get_heuristic_value(node1, goalNode): Given two nodes node1 and goalNode, returns the heuristic estimate of the distance from node1 to goalNode.

An edge is an object of type Edge. It has the following attributes: .startNode: a node .endNode: a node .length: a positive number, or None if the edge has no length specified

For this lab, you may assume that edge weights, unless None, will always be positive numbers.

You can check whether two edges e1 and e2 are the same with e1 == e2.


Python advice: #todo copy from old wiki - sorted / .sort -- https://wiki.python.org/moin/HowTo/Sorting ^ in case of a tie, sort paths lexicographically - using lists


Problems

Part 1: Helper Functions

A path is a list of nodes where each neighboring pair is connected by an edge in the graph. By convention, the start of the path is at index [0] and the end of the path is at index [-1].

Fill in path_length(graph, path) to write a function that returns the length of a path. You may assume that the path is a valid sequence of edge-connected nodes in the graph.

Fill in has_loops(path) to write a function that returns True if the path contains a loop, otherwise false. (Hint: One efficient solution involves using a set.)

Fill in extensions(graph, path) to write a function that takes a path and returns a list of possible one-node extensions of that path. That is, your function should return a list of all possible paths that can be created by adding one more node to the given path.

Fill in sort_by_heuristic(graph, goalNode, nodes) to write a function that takes a list of any nodes in the graph and sorts them based on their heuristic value to the goalNode, from smallest to largest. The function should return the sorted list. (Hint: use sorted with the "key" argument)


Part 2: Search Algorithms

Implement each search algorithm. Each algorithm takes in an UndirectedGraph graph, a node startNode, and a node goalNode. Each algorithm should return an appropriate path from startNode to goalNode, or None if no such path exists.

Alternatively, you may skip to Part 3: Generic Search and complete this section by calling generic_search with the arguments from Part 3.

For beam search, use beam_width = get_beam_width()


Part 3: Generic Search

generic_search is a generic search algorithm:

generic_search(sort_new_paths_fn, add_paths_to_front_of_agenda, sort_agenda_fn, use_extended_set) :

It takes two functions and two boolean values as arguments. By defining appropriate values for the arguments, you can recreate any of the search algorithms we study in 6.034.

For example, to create a search algorithm that does not sort new paths, adds paths to the front of the agenda, does not sort the agenda, and does not use an extended set, you could call generic_search like this:

my_search_fn = generic_search(do_nothing_fn, True, do_nothing_fn, False)

You can then call your search algorithm on a graph:

my_path = my_search_fn(graph, startNode, goalNode)

Here's an explanation of each of the arguments in generic_search:

When you extend a node, you create a list of new paths. Some algorithms will sort these paths before adding them to the agenda. The argument sort_new_paths_fn is a function which takes the list of new paths and performs the appropriate sorting routine; it returns the sorted list of new paths. If you don't want any sorting to happen, you can pass a function which simply returns the list of paths without changing it. The function do_nothing_fn accomplishes this.

Some algorithms will add new paths to the front of the agenda (like a stack). Others will add new paths to the end of the agenda (like a queue). When the argument add_paths_to_front_of_agenda is True, paths will get added to the front of the agenda. Otherwise, paths will get added to the back.

After new paths are added to the agenda, some algorithms will then sort the entire agenda. The argument sort_agenda_fn takes the agenda as an argument, and performs the appropriate sorting routine; it returns a sorted agenda. If you don't want any sorting to happen, you can pass a function which simply returns the list of paths without changing them. The variable do_nothing_fn accomplishes this.

Please note that if you do sort the agenda, then the previous two arguments become irrelevant (it doesn't matter whether you added new paths to the front, or sorted them before adding them to the agenda.)

Some algorithms use an extended set. Set use_extended_set to True to maintain an extended set, else set it to False.


TODO: Provide sample graphs that you can call the algorithm on so as to see visually what it's doing. Show how to call the algorithm with the default arguments: generic_search()(graph, start, goal).

Please implement the following search algorithms (dfs, bfs, hill-climbing, best-first, beam, branch-and-bound, b&b+heuristic, b&b+xset, A*) by designing the right arguments to pass to the generic search algorithm. Your answer should be an ordered list of the appropriate arguments to generic_search.

You will need to write your own helper functions for sorting paths. Each sorting function should take in a graph, goalNode, and list of paths and return a list of paths. (For example, sorted_paths = my_sorting_fn(graph, goalNode, paths)) The only exception is the sort_agenda_fn for beam search, which should take in beam width as a fourth argument. (For example, sorted_beam_agenda = my_beam_sorting_fn(graph, goalNode, paths, beam_width)) Break ties lexicographically. We recommend that you avoid modifying the original list of paths.

Hint for beam search: You can implement beam search in a way almost identical to bfs. The only difference is that beam search will need to limit the number of paths whenever an entire level has been expanded. You can do this by defining a sort_agenda_fn function that checks whether an entire level has been extended. If it has, the function should limit the number of paths in the agenda. If it hasn't, the function should do nothing to the agenda.

To tell whether an entire level has been expanded, check whether all paths in the agenda have the same number of nodes. Use the variable beam_width to refer to the limiting size of the agenda.

As taught in 6.034, dfs and bfs use backtracking but do not use an extended set.


todo: for best-first, hill-climbing, beam: note that quality measure can be something other than est dist to goal; maybe have additional arg for non-generic search

You can call generic_search on a list of arguments (such as generic_dfs) like this: path = generic_search(*generic_dfs)(graph1, 'S', 'G')


Part 4: Counting Extensions

generic_search_counter is identical to the generic_search algorithm in ___. Your task is to slightly modify the search_algorithm code in generic_search_counter so that instead of returning a path to the goal, the search algorithm returns the number of extensions it had to make.

By convention, please include the final extension of the goal node, if any, in your extension count.

Hint: One possible solution involves replacing the extended set with a list.


Part 5: Heuristics

Fill in is_admissible(graph, goalNode) to write a function that returns True if the graph has an admissible heuristic for the given goal node, otherwise False.

Fill in is_consistent(graph, goalNode) to write a function that returns True if the graph's heuristic for the given goal node is consistent, otherwise False. (Hint: it's sufficient to check only neighboring nodes, rather than checking all pairs of nodes in the graph.)

The next questions use the following graph:

   A
(1) (1)

S C-(7)-G

(1) (3)
   B

For heuristic_1, pick heuristic values so that, for the goal node G, the heuristic is both admissible and consistent. Fill in the values by replacing the list of five None's with five numbers.

For heuristic_2, pick heuristic values so that, for the goal node G, the heuristic is admissible but NOT consistent.

For heuristic_3, pick heuristic values so that, for the goal node G, the heuristic is admissible but A* returns a non-optimal (not shortest) path to the goal (S-A-C-G).

For heuristic_4, pick heuristic values so that, for the goal node G, the heuristic is admissible but NOT consistent, yet A* still finds the optimal (shortest) path to the goal (S-B-C-G).


Search can be applied both to abstract problems like finding recipes, and to concrete problems like finding a route on a map. When you are finding a route on a map, you may have nodes that look like the following, where each node has spatial coordinates:

node_F = {'coordinates' : [2.5, 3.5], 'label' : 'F'}

Whenever you are using spatial coordinates, "Euclidean distance to the goal" can be a useful heuristic. Fill in euclidean_heuristic(startNode, goalNode) to write a function that takes two nodes with spatial coordinates as input, and which returns the Euclidean distance between them.

You can test your heuristic function by uncommenting the line [todo].


Survey

Please answer the following questions:

NAME: What is your name? (string) COLLABORATORS: Other than 6.034 staff, who did you work with on this lab? (string, or empty string if you worked alone)

HOW_MANY_HOURS_THIS_LAB_TOOK: Approximately how many hours did you spend on this lab? (number) WHAT_I_FOUND_INTERESTING: Which parts of this lab, if any, did you find interesting? (string, or empty string if none) WHAT_I_FOUND_BORING = Which parts of this lab, if any, did you find boring or tedious? (string, or empty string if none)

(optional) SUGGESTIONS = What concrete changes would you recommend, if any, to improve this lab for future years? (string)


Bonus API

To create a new graph g, you can instantiate it with: g = UndirectedGraph() To add nodes: g.nodes = ["A","B","C","D","E"] To create an edge between two nodes: g.join("A","B",5)

Personal tools