Lab 2: Games

From 6.034 Wiki

(Difference between revisions)
Jump to: navigation, search
(Use progressive deepening to return anytime values: todo)
(tiebreaking)
Line 196: Line 196:
# Evaluate the state using the <tt>heuristic_fn</tt> argument if the state is at the limit of search.
# Evaluate the state using the <tt>heuristic_fn</tt> argument if the state is at the limit of search.
# Or otherwise, generate all subsequent moves, recur on them, and compile the results.
# Or otherwise, generate all subsequent moves, recur on them, and compile the results.
-
 
+
# Break ties by preferring earlier branches to later branches (i.e. if two moves have the same minimax score, prefer the earlier one.)
This function should have the same return type as the other search methods above. When counting evaluations, you should include both the number of endgame score evaluations and the number of heuristic score evaluations.
This function should have the same return type as the other search methods above. When counting evaluations, you should include both the number of endgame score evaluations and the number of heuristic score evaluations.
Line 216: Line 216:
# Because endgame_score is an objective measure of endgame value, it is a method of the AbstractGameState; in contrast, <tt>heuristic_fn</tt> is an argument of the search function, and varies depending on the priorities of the player.
# Because endgame_score is an objective measure of endgame value, it is a method of the AbstractGameState; in contrast, <tt>heuristic_fn</tt> is an argument of the search function, and varies depending on the priorities of the player.
# The method <tt>endgame_score</tt> applies to Abstract Game State objects; in contrast, <em>heuristic functions apply to the underlying snapshot objects</em> . For example, the <tt>heuristic_connectfour</tt> function (defined above) takes a ConnectFourBoard object as its main argument.
# The method <tt>endgame_score</tt> applies to Abstract Game State objects; in contrast, <em>heuristic functions apply to the underlying snapshot objects</em> . For example, the <tt>heuristic_connectfour</tt> function (defined above) takes a ConnectFourBoard object as its main argument.
-
 
-
TODO: Add note to wiki about breaking ties in minimax score by preferring moves that appear earlier in the tree.
 

Revision as of 06:16, 28 September 2015

Contents


This lab is due by Thursday, October 8 at 10:00pm.


To work on this lab, you will need to get the code, much like you did for the first two labs.

Your answers for this lab belong in the main file lab3.py.

Games and Adversarial search

This lab has two parts. In the first part of this lab, you'll work with the game Connect Four, writing methods to represent states of a ConnectFour game as a game tree.

In the second part of this lab, you'll write subroutines for performing search functions on a game tree: depth-first search, ordinary minimax search, minimax search with alpha-beta pruning, and finally progressive deepening.


Part 1 : Representing Connect Four as a game tree

How to play Connect Four

Connect Four is a two-player game where players take turns dropping a piece into a 6x7 grid. The first player to build a consecutive sequence of four pieces wins. The sequence of four pieces can lie either horizontally, vertically, or diagonally. In this way, Connect Four is similar to other games you might know such as tic-tac-toe/noughts and crosses.

From an abstract perspective, each player has seven alternative moves in a turn, corresponding to the seven columns where the player can drop the piece (unless the column is full, in which case the player cannot move there.)

Utility functions for playing Connect Four

For the first part of this lab, we have provided an API for constructing and manipulating Connect Four boards (see ConnectFourBoard below). Your task is to define subroutines for generating a player's set of possible moves, for determining whether the game is over, and for determining whether any player has won. These subroutines will provide the necessary definitions for treating Connect Four boards as states in a game tree, which will be useful in the second half of this lab.

Game over

Return True if the game is over, otherwise False.

def is_game_over_connectfour(board) : 
   returns True if board contains chain of length 4, or all columns are full

Generating all possible moves

Return a list of boards representing the legal game states you can end up in by making a move in the given position.

At most, you can move in any of the 7 columns. But you cannot legally move in a column if it's full, and so should exclude those. Finally, you should check in advance if the game is already over and return the empty list if so.

For each board in the list, board.prev_move_string() notes which column you moved in (for debugging purposes).

Note: iterate over columns in increasing numerical order

def next_boards_connectfour(board) :


Assigning points to the winner

Here, we'll define the final score of a Connect Four game that has been won. We'll return a score of 1000 or -1000 depending on whether the maximizer or the minimizer has won. (In case of a tie, return 0 instead.)

Important subtlety: Regarding the way the endgame scoring works: Suppose the game is already over and the maximizer would be making the next move. Who won the game? It must be that the minimizer has just won. Therefore, although it is slightly counter-intuitive, the endgame scoring function should return -1000 if the current player is the maximizer, and +1000 if the current player is the minimizer.


def endgame_score_connectfour(board, is_current_player_maximizer) :

Win already!

The previous endgame_score_connectfour function has the drawback that it doesn't reward players for winning sooner, since all winning states have the same value. Therefore, it doesn't encourage the computer to win as fast as possible the way a human would. To remedy this, write a score function that rewards players for winning sooner (that assigns a better score for games that have won in fewer moves).

To retain consistency with your other endgame scoring function, make sure you return a value with abs(val) ≥ 1000 to indicate that someone has won

def endgame_score_connectfour_faster(board, is_current_player_maximizer) :

Hint: you can use board.count_pieces() to help determine how long the game has gone on.

A heuristic for ConnectFour boards

In practice, it is infeasible to search all the way to the leaves of the game tree in order to find a score. (Imagine ranking opening chess moves by whether they are automatic wins/losses for perfect play.)

Instead, here we will define a heuristic for estimating the "goodness" of a state --- specifically the goodness of a Connect Four board --- and use that in place of an endgame scoring function when we want to stop evaluation before reaching the end of the game.

Developing a good heuristic is a matter of design: there are many possible strategies and ways of measuring/defining the goodness of a state; for the purposes of this lab, you only need to define a scoring function that gives better scores for obviously stronger Connect Four positions.

As a hint, much of the information about the goodness of a state concerns the number and length of both players' chains. And, just to remain consistent with endgame scores, the absolute value of the heuristic scores should be scaled to be less than 1000. As before, the returned score will be negated depending on whether the current player is the maximizer.

def heuristic_connectfour(board, is_current_player_maximizer) :


On to AbstractGameStates

To define the rules and starting state of the game Connect Four, we can pass some of the functions you wrote above to the AbstractGameState constructor. In lab3.py, you can find two examples of an AbstractGameState object representing a Connect Four game:

# This AbstractGameState represents a new ConnectFourBoard, before the game has started:
state_starting_connectfour = AbstractGameState(snapshot = ConnectFourBoard(),
                                 is_game_over_fn = is_game_over_connectfour,
                                 generate_next_states_fn = next_boards_connectfour,
                                 endgame_score_fn = endgame_score_connectfour_faster)

# This AbstractGameState represents the ConnectFourBoard "NEARLY_OVER" from boards.py:
state_NEARLY_OVER = AbstractGameState(snapshot = NEARLY_OVER,
                                 is_game_over_fn = is_game_over_connectfour,
                                 generate_next_states_fn = next_boards_connectfour,
                                 endgame_score_fn = endgame_score_connectfour_faster)

Part 2: Searching a game tree

This part of the lab involves writing search functions on game trees made out of AbstractGameStates. The utility functions from Part 1 allow you to construct an AbstractGameState out of Connect Four boards, but the methods you write will work on other games as well.

Abstract game states have methods for:

  • Checking whether the game is over (state.is_game_over()).
  • Generating a list of the resulting game state for each legal move (state.generate_next_moves())
  • Returning an object representing a current snapshot of the game.
  • Computing the endgame score of a state, if the game is over.

There are other methods and attributes as well; for a more complete list, see the API.


About sample game states: To help you test out your search algorithms in this section, we provide several sample game states for Connect Four in your lab3.py file. There are also print statements that you can uncomment them to see examples of what each algorithm returns. You can also print the current snapshot of an AbstractGameState object to get intuition for the current state of the game.


Cooperative depth-first search

To start, we will see what non-adversarial search looks like. In particular, all of the search algorithms we've studied so far are non-adversarial because they assume that you have the same goal at each level of the tree (rather than players taking turns to move toward opposing goals). Imagine both players are trying to collaboratively find the path that leads to the highest score. Search the tree in depth-first search order to find the path leading to the highest score.

Please return a tuple containing the best path (a list of AbstractGameStates showing the sequence of moves necessary to achieve the highest score), the score value of the leaf node (a number), and the number of score evaluations you performed during search (a number).

(Although the minimax algorithm we discuss in class only returns the score of the best possible move, in practice it's necessary to keep track of the corresponding path consisting of game states, as well. We also ask you to keep track of the number of score evaluations you performed during search in order to compare the efficiency of different search algorithms.)


def dfs_maximizing(state) :


Hint: Here, and throughout this lab, you can use the constant INF in lab3.py to represent infinity


Minimax search with endgame scores only

Find the minimax path. The boolean argument maximize determines whether the current player is the maximizer or not. This function should have the same return type as dfs; that is, it should return a tuple containing the minimax path, the minimax score, and the number of static evaluations performed during search.

def minimax_endgame_search(state, maximize=True) :


Important clarification: Either the maximizer or the minimizer may make the first move, so to determine which player is maximizing, don't look at the game state itself --- look at the maximize argument to the search function.


Minimax with heuristic and endgame scores

The previous version of minimax which you wrote could only handle endgame states. Now we'll define minimax search which can search to any shallower depth, using a heuristic score instead of an endgame score for states that are not endgame states.

Define minimax search which can search up to any specified depth in the tree. Given a state, your algorithm should:

  1. Evaluate the state using the endgame scoring function if the state is an endgame state.
  2. Evaluate the state using the heuristic_fn argument if the state is at the limit of search.
  3. Or otherwise, generate all subsequent moves, recur on them, and compile the results.
  4. Break ties by preferring earlier branches to later branches (i.e. if two moves have the same minimax score, prefer the earlier one.)

This function should have the same return type as the other search methods above. When counting evaluations, you should include both the number of endgame score evaluations and the number of heuristic score evaluations.

def minimax_search(state, heuristic_fn=always_zero, depth_limit=INF, maximize=True) :


By default, minimax_search uses the heuristic function always_zero, which returns a score of zero for any board.


Important Note: The scoring functions endgame_score and heuristic_fn differ in three key ways:

  1. Endgame scores are used only for leaf nodes of the game tree; heuristic functions are used only for non-leaf nodes.
  2. Because endgame_score is an objective measure of endgame value, it is a method of the AbstractGameState; in contrast, heuristic_fn is an argument of the search function, and varies depending on the priorities of the player.
  3. The method endgame_score applies to Abstract Game State objects; in contrast, heuristic functions apply to the underlying snapshot objects . For example, the heuristic_connectfour function (defined above) takes a ConnectFourBoard object as its main argument.


Minimax with alpha-beta optimization

todo fix argument order (also for progressive deepening, maybe others)

Perform alphabeta pruning. Same return type as minimax_search.

def minimax_search_alphabeta(state, alpha=-INF, beta=INF, depth_limit=INF,
                             heuristic_fn=always_zero, maximize=True) :

Use progressive deepening to return anytime values

In this final section, you'll write a progressive deepening algorithm which uses minimax with alpha-beta pruning to search progressively deeper into the tree until time runs out. The arguments to progressive_deepening and the return type are identical to minimax_search --- with one difference: progressive deepening also takes an "AnytimeValue" object as an argument, which you should update with the latest minimax result (a sequence of moves, minimax score, and number of evaluations).


You can code progressive deepening using a similar technique to the other search algorithms. Here, you'll make iterative calls to minimax_search_alphabeta, increasing the depth each time. But at each level when minimax_search_alphabeta returns a new value, store that most recent result using anytime_value.set_value(new_best_option).

def progressive_deepening(state, depth_limit=INF, heuristic_fn=always_zero, maximize=True, 
                          anytime_value = AnytimeValue()) :
   """Runs minimax with alpha-beta pruning. At each level, updates anytime_value 
   with the tuple returned from minimax_search_alphabeta. Returns anytime_value."""

TODO: Add instructions to the wiki for creating a new AnytimeValue, because the constructor won't work.

AnytimeValue objects are simple; the methods are just:

class AnytimeValue :

  • .set_value(val):
  • .get_value():

API

[in progress]

ConnectFourBoard

ConnectFourBoard is class for probing and manipulating the state of a Connect Four game. A ConnectFourBoard has players (represented by their name strings), and pieces. Players' pieces are represented by numbers: 1 denotes the first player's piece, 2 denotes the second player's piece, and 0 denotes an empty space in the board.

The dimensions of the board are:

  • board.num_rows. The number of rows in the board (= 6).
  • board.num_cols. The number of columns in the board (= 7).


We provide methods for acquiring information about the pieces in the board.

  • board.count_pieces(current_player=None). With no arguments, returns the total number of pieces on the board. Otherwise, returns the number of pieces owned by the current player (current_player = True) or by the other player (current_player = False).
  • board.get_all_chains(current_player=None). With no arguments, returns a list of all maximal contiguous chains of pieces. (A chain is a sequence of pieces belonging to a single player, arranged either horizontally, vertically, or diagonally.) Each chain is a list of 1s (for pieces belonging to the player who moved first) or 2s (for pieces belonging to the player who moved second.) If current_player=True, returns only chains belonging to the current player. If current_player=False, returns only chains belonging to the other player.
A depiction of the NEARLY_OVER board contained in lab3.py

todo: examples of what "chains" are & what get_all_chains returns for simple connectfourboards (add a picture of NEARLY_OVER and list of chains)

  • board.get_piece(col, row). Return the piece in the given column and row. The Connect Four board has seven columns and six rows. Return value is 1 (a piece owned by the first player), 2 (a piece owned by the second player), or 0 (if the space is empty).


You can also investigate or make moves in particular columns:

  • board.is_column_full(col_number). Given an integer between 0 (leftmost column) and 6 (rightmost column), returns a boolean value representing whether the column is full of pieces or not.
  • board.add_piece(col_number). Adds the current player's piece to the given column. This method does not modify the original board, but instead returns a new ConnectFourBoard object with the piece added. In the new board, the current player will have swapped.
  • board.get_column_height(col_number). Return the number of pieces in the specified column. The returned value will be a number between 0 (no pieces in the column) and 6 (the column is full).

There are two players who are represented by their names (as strings). The names are entirely irrelevant, but names may make debugging more intuitive. Note: Player names do not determine which player is the maximizer and which player is the minimizer; those roles are attributes of the search algorithms, not the game itself.

  • board.players. Return a list of the players' names. The order will change so that the current player is always first in the list.
  • board.get_player_name(n). Return the name of the player who moved first (n=1) or second (n=2) in this game.


AbstractGameState

AbstractGameState is a class which encodes the rules of a game and a snapshot of its current position. They are generic enough to encode the rules of any game that can be played with minimax.

AbstractGameState objects can be traversed like trees:

  • state.generate_next_states(). Get the children of this node in the game tree, returning a list of AbstractGameState objects corresponding to the result of each possible legal move. If there are no possible moves, return the empty list.
  • state.is_game_over(). Return True if this is a leaf node in the game tree, otherwise False.
  • state.get_endgame_score(is_current_player_maximizer=True). If this node is a leaf node (i.e. an endgame state), returns its final score. The score will be negated depending on whether the current player is a maximizer (default) or not. If this node is not a leaf node, throws an error.

You can also access an internal snapshot of the current game. This allows you to access game-specific details such as the number of pieces in a Connect Four board. The snapshot will also be helpful because heuristic score functions evaluate not the AbstractGameState object, but its snapshot of the current position. (This is an implementation difference between endgame scoring functions and heuristic scoring functions.)

  • state.get_snapshot(). Return a snapshot of the current state of the game. Depending on the game, the type of the returned object may be different. For example, if the game is Connect Four, the snapshot will be a ConnectFourBoard object.


Finally, you may find the following methods useful when debugging:

  • state.describe_previous_move(). Return a string representing how the current state was generated from its parent. For example, if the game is Connect Four, previous move strings look like "Put Player Two's piece in column 4."
  • state.restart(). Reset the game to the initial state. Returns the reset AbstractGameState object.


Survey

Please answer these questions at the bottom of your lab3.py file:

  • NAME: What is your name? (string)
  • COLLABORATORS: Other than 6.034 staff, whom did you work with on this lab? (string, or empty string if you worked alone)
  • HOW_MANY_HOURS_THIS_LAB_TOOK: Approximately how many hours did you spend on this lab? (number or string)
  • WHAT_I_FOUND_INTERESTING: Which parts of this lab, if any, did you find interesting? (string)
  • WHAT_I_FOUND_BORING: Which parts of this lab, if any, did you find boring or tedious? (string)
  • (optional) SUGGESTIONS: What specific changes would you recommend, if any, to improve this lab for future years? (string)


(We'd ask which parts you find confusing, but if you're confused you should really ask a TA.)

When you're done, run the online tester to submit your code.

Personal tools