Lab 2: Games

From 6.034 Wiki

(Difference between revisions)
Jump to: navigation, search
m
(ConnectFourBoard)
Line 254: Line 254:
* <tt><b>board</b>.add_piece(col_number)</tt>. Adds the current player's piece to the given column. This method does <em>not</em> modify the original board, but instead returns a new ConnectFourBoard object with the piece added. In the new board, the current player will have swapped.
* <tt><b>board</b>.add_piece(col_number)</tt>. Adds the current player's piece to the given column. This method does <em>not</em> modify the original board, but instead returns a new ConnectFourBoard object with the piece added. In the new board, the current player will have swapped.
-
There are two players who are represented by their names (as strings).
+
* <tt><b>board</b>.get_column_height(col_number)</tt>. Return the number of pieces in the specified column. The returned value will be a number between 0 (no pieces in the column) and 6 (the column is full).
 +
 
 +
There are two players who are represented by their names (as strings). The names are entirely irrelevant, but names may make debugging more intuitive. Note: Player names do not determine which player is the maximizer and which player is the minimizer; those roles are attributes of the search algorithms, not the game itself.
 +
 
* <tt><b>board</b>.players</tt>. Return a list of the current players' names. The order will change so that the current player is always first in the list.
* <tt><b>board</b>.players</tt>. Return a list of the current players' names. The order will change so that the current player is always first in the list.
* <tt><b>board</b>.get_current_player()</tt>. Return the current player's name as a string.
* <tt><b>board</b>.get_current_player()</tt>. Return the current player's name as a string.
Line 260: Line 263:
-
 
-
 
-
* <tt><b>state</b>.get_endgame_score(is_current_player_maximizer=True)</tt>. If this node is a leaf node (i.e. an endgame state), returns its final score. The score will be negated depending on whether the current player is a maximizer (default) or not. If this node is not a leaf node, throws an error.
 
-
 
-
AbstractGameState is a class which encodes the rules of a game and a snapshot of its current position. They are generic enough to encode the rules of any game that can be played with minimax.
 
-
 
-
AbstractGameState objects can be traversed like trees:
 
-
* <tt><b>state</b>.generate_next_states()</tt>. Get the children of this node in the game tree, returning a list of AbstractGameState objects corresponding to the result of each possible legal move. If there are no possible moves, return the empty list.
 
-
* <tt><b>state</b>.is_game_over()</tt>. Return True if this is a leaf node in the game tree, otherwise False.
 
-
* <tt><b>state</b>.get_endgame_score(is_current_player_maximizer=True)</tt>. If this node is a leaf node (i.e. an endgame state), returns its final score. The score will be negated depending on whether the current player is a maximizer (default) or not. If this node is not a leaf node, throws an error.
 
-
 
-
 
-
AbstractGameState is a class which encodes the rules of a game and a snapshot of its current position. They are generic enough to encode the rules of any game that can be played with minimax.
 
-
 
-
AbstractGameState objects can be traversed like trees:
 
-
* <tt><b>state</b>.generate_next_states()</tt>. Get the children of this node in the game tree, returning a list of AbstractGameState objects corresponding to the result of each possible legal move. If there are no possible moves, return the empty list.
 
-
* <tt><b>state</b>.is_game_over()</tt>. Return True if this is a leaf node in the game tree, otherwise False.
 
-
* <tt><b>state</b>.get_endgame_score(is_current_player_maximizer=True)</tt>. If this node is a leaf node (i.e. an endgame state), returns its final score. The score will be negated depending on whether the current player is a maximizer (default) or not. If this node is not a leaf node, throws an error.
 
-
todo: use updated docstrings from more recent version of file
 
class ConnectFourBoard :
class ConnectFourBoard :

Revision as of 02:59, 26 September 2015

Contents

Games and Adversarial search

This lab has two parts. In the first part of this lab, you'll work with the game Connect Four, writing methods to represent states of a ConnectFour game as a game tree.

In the second part of this lab, you'll write subroutines for performing search functions on a game tree: depth-first search, ordinary minimax search, minimax search with alpha-beta pruning, and finally progressive deepening.


[in progress]

todo: make sure below is consistent with docstrings from lab3.py

more todo notes... (commented out, edit to view)


Part 1 : Representing Connect Four as a game tree

How to play Connect Four

Connect Four is a two-player game where players take turns dropping a piece into a 6x7 grid. The first player to build a consecutive sequence of four pieces wins. The sequence of four pieces can lie either horizontally, vertically, or diagonally. In this way, Connect Four is similar to other games you might know such as tic-tac-toe/noughts and crosses.

From an abstract perspective, each player has seven alternative moves in a turn, corresponding to the seven columns where the player can drop the piece (unless the column is full, in which case the player cannot move there.)

Utility functions for playing Connect Four

For the first part of this lab, we have provided an API for constructing and manipulating Connect Four boards (see ConnectFourBoard below). Your task is to define subroutines for generating a player's set of possible moves, for determining whether the game is over, and for determining whether any player has won. These subroutines will provide the necessary definitions for treating Connect Four boards as states in a game tree, which will be useful in the second half of this lab.

Game over

Return True if the game is over, otherwise False.

def is_endgame_connectfour(board) :
   returns True if board contains chain of length 4, or all columns are full

Determining the winner

Return 1 or 2 if one of the players has won, otherwise None. Assume that the board is legal, so at most one player has won.

def winner_connectfour(board) :

Generating all possible moves

Return a list of boards representing the legal game states you can end up in by making a move in the given position.

At most, you can move in any of the 7 columns. But you cannot legally move in a column if it's full, and so should exclude those. Finally, you should check in advance if the game is already over and return the empty list if so.

For each board in the list, board.prev_move_string() notes which column you moved in (for debugging purposes).

Note: iterate over columns in increasing numerical order

def next_boards_connectfour(board) :


Assigning points to the winner

Here, we'll define the utility/score of a connect four game that has been won. We'll return a score of 1000 or -1000 depending on whether the maximizer or the minimizer has won.

Important subtlety: Regarding the way the endgame scoring works: Suppose the game is already over and the maximizer would be making the next move. Who won the game? It must be that the minimizer has just won. Therefore, although it is slightly counter-intuitive, the endgame scoring function should return -1000 if the current player is the maximizer, and +1000 if the current player is the minimizer.


def endgame_score_connectfour(board, is_current_player_maximizer) :

Win already!

The previous endgame_score_connectfour function has the drawback that it doesn't reward players for winning sooner, since all winning states have the same value. Therefore, it doesn't encourage the computer to win as fast as possible the way a human would. To remedy this, write a score function that rewards players for winning sooner (that assigns a better score for games that have won in fewer moves).

To retain consistency with your other endgame scoring function, make sure you return a value with abs(val) ≥ 1000 to indicate that someone has won

def endgame_score_connectfour_faster(board) :

Hint: you can use board.count_pieces() to help determine how long the game has gone on.

On to AbstractGameStates

To define the rules and starting state of the game ConnectFour, we can pass the functions you wrote above to the AbstractGameState constructor.

Example (which we won't use for now):

starting_state_connectfour = AbstractGameState(state = ConnectFourBoard(),
                                               who_won_fn = winner_connectfour,
                                               generate_next_states_fn = next_moves_connectfour,
                                               endgame_score_fn = endgame_score_connectfour_faster)

Part 2: Searching a game tree

This part of the lab involves writing search functions on game trees made out of AbstractGameStates. The utility functions from Part 1 allow you to construct an AbstractGameState out of Connect Four boards, but the methods you write will work on other games as well.

Abstract game states have methods for:

  • Checking whether the game has a winner (...).
  • Generating the next possible moves (state.generate_next_states())
  • Returning an object representing a current snapshot of the game.
  • Computing the endgame score of a state, if the game is over.

There are other methods and attributes as well; for a more complete list, see the API. (TODO:link)

Cooperative depth-first search

To start, we will see what non-adversarial search looks like. In particular, all of the search algorithms we've studied so far are non-adversarial because they assume that you have the same goal at each level of the tree (rather than players taking turns to move toward opposing goals). Imagine both players are trying to collaboratively find the path that leads to the highest score. Search the tree in depth-first search order to find the path leading to the highest score.

Please return a tuple containing the best path (a list of AbstractGameStates showing the sequence of moves necessary to achieve the highest score), the score value of the leaf node (a number), and the number of score evaluations you performed during search (a number).

(Although the minimax algorithm we discuss in class only returns the score of the best possible move, in practice it's necessary to keep track of the corresponding states, as well. We also ask you to keep track of the number of score evaluations you performed during search in order to compare the efficiency of different search algorithms.)


def dfs_maximizing(state) :


Hint: Here, and throughout this lab, you can use the constant INF in lab3.py to represent infinity

Minimax search with endgame scores only

Find the minimax path. The boolean argument maximize determines whether the current player is the maximizer or not. This function should have the same return type as dfs; that is, it should return a tuple containing the minimax path, the minimax score, and the number of static evaluations performed during search.

def minimax_endgame_search(state, maximize=True) :


Important clarification: Either the maximizer or the minimizer may make the first move, so to determine which player is maximizing, don't look at the game state itself --- look at the maximize argument to the search function.

You can run your function on the test boards (todo)

A heuristic for ConnectFour boards

In practice, it is infeasible to search all the way to the leaves of the game tree in order to find a score. (Imagine ranking opening chess moves by whether they are automatic wins/losses for perfect play.)

Instead, here we will define a heuristic for estimating the "goodness" of a state --- specifically the goodness of a Connect Four board --- and use that in place of an endgame scoring function when we want to stop evaluation before reaching the end of the game.

Developing a good heuristic is a matter of design: there are many possible strategies and ways of measuring/defining the goodness of a state; for the purposes of this lab, you only need to define a scoring function that gives better scores for obviously stronger Connect Four positions.

As a hint, much of the information about the goodness of a state concerns the number and length of both player's chains. And, just to remain consistent with endgame scores, the absolute value of the heuristic scores should be scaled to be less than 1000.

def connectfour_heuristic(board, is_current_player_maximizer) :

Minimax with heuristic and endgame scores

The previous version of minimax which you wrote could only handle endgame states. Now we'll define minimax search which can search to any shallower depth, using a heuristic score instead of an endgame score for states that are not endgame states.

Define minimax search which can search up to any specified depth in the tree. Given a state, your algorithm should:

  1. Evaluate the state using the endgame scoring function if the state is an endgame state.
  2. Evaluate the state using the heuristic_fn argument if the state is at the limit of search.
  3. Or otherwise, generate all subsequent moves, recur on them, and compile the results.

This function should have the same return type as the other search methods above. When counting evaluations, you should include both the number of endgame score evaluations and the number of heuristic score evaluations.



todo: describe what is always_zero


def minimax_search(state, heuristic_fn=always_zero, maxdepth=INF, maximize=True) :

Minimax with alpha-beta optimization

Perform alphabeta pruning. Same return type as minimax_search.

def minimax_search_alphabeta(state, alpha=-INF, beta=INF, maxdepth=INF,
                            heuristic_fn=always_zero, maximize=True) :

Use progressive deepening to return anytime values

Write a progressive deepening algorithm (using minimax with alpha-beta pruning)

Anytime value api:

class AnytimeValue :

  • .set_value(val):
  • .get_value():


Code progressive deepening as you normally would, with the same return type as minimax_search. But at each level when you call alphabeta and it returns a new value, store that value using anytime_value.set_value(new_best_option).

def progressive_deepening(state, maxdepth=INF, heuristic_fn=always_zero, maximize=True, 
                         anytime_value = AnytimeValue()) :
   """Runs minimax with alpha-beta pruning. At each level, updates anytime_value
   with the tuple returned from minimax_search_alphabeta. Returns anytime_value."""



API

[in progress]

ConnectFourBoard

ConnectFourBoard is class for probing and manipulating the state of a Connect Four game.


We provide methods for acquiring information about the pieces in the board.

  • board.count_pieces(current_player=None). With no arguments, returns the total number of pieces on the board. Otherwise, returns the number of pieces owned by the current player (current_player = True) or by the other player (current_player = False).
  • board.get_all_chains(current_player=None). With no arguments, returns a list of all maximal contiguous chains of pieces. (A chain is a sequence of pieces belonging to a single player, arranged either horizontally, vertically, or diagonally.) Each chain is a list of 1s (for pieces belonging to the player who moved first) or 2s (for pieces belonging to the player who moved second.) If current_player=True, returns only chains belonging to the current player. If current_player=False, returns only chains belonging to the other player.
  • board.get_piece(col, row). Return the piece in the given column and row. The Connect Four board has seven columns and six rows. Return value is 1 (a piece owned by the first player), 2 (a piece owned by the second player), or 0 (if the space is empty).


You can also investigate or make moves in particular columns:

  • board.is_column_full(col_number). Given an integer between 0 (leftmost column) and 6 (rightmost column), returns a boolean value representing whether the column is full of pieces or not.
  • board.add_piece(col_number). Adds the current player's piece to the given column. This method does not modify the original board, but instead returns a new ConnectFourBoard object with the piece added. In the new board, the current player will have swapped.
  • board.get_column_height(col_number). Return the number of pieces in the specified column. The returned value will be a number between 0 (no pieces in the column) and 6 (the column is full).

There are two players who are represented by their names (as strings). The names are entirely irrelevant, but names may make debugging more intuitive. Note: Player names do not determine which player is the maximizer and which player is the minimizer; those roles are attributes of the search algorithms, not the game itself.

  • board.players. Return a list of the current players' names. The order will change so that the current player is always first in the list.
  • board.get_current_player(). Return the current player's name as a string.
  • board.get_other_player(). Return the other player's name as a string.




class ConnectFourBoard :

   max_height = 6
   max_width = 7 #number of columns
  • .get_current_player():
       """Return the current player. By convention, 1 or 2."""
  • .set_current_player(player):
       """Set the current player. By convention, 1 or 2."""
  • .get_other_player():
       """Return the other player (the one whose turn it is NOT). By convention, 1 or 2."""
  • .count_pieces(player=None):
       """Return the total number of pieces on the board. If player is
       supplied, returns only the number of those belonging to that player."""
  • .get_column_height(col_number):
       """Return the number of pieces in the column; e.g., 0 if the column is empty."""
    
  • .is_column_full(col_number):
       "Return True if column is full, False otherwise"
    
  • .add_piece(col_number, player):
       """Adds a piece belonging to the player to the given column.
       Returns new board without modifying original."""
  • .describe_previous_move():
       "Returns a string describing the most recent move leading to current state"
  • .get_all_chains(player=None):
       """Get all maximal contiguous chains of pieces. If player is provided, 
       returns only chains belonging to that player."""

AbstractGameState

AbstractGameState is a class which encodes the rules of a game and a snapshot of its current position. They are generic enough to encode the rules of any game that can be played with minimax.

AbstractGameState objects can be traversed like trees:

  • state.generate_next_states(). Get the children of this node in the game tree, returning a list of AbstractGameState objects corresponding to the result of each possible legal move. If there are no possible moves, return the empty list.
  • state.is_game_over(). Return True if this is a leaf node in the game tree, otherwise False.
  • state.get_endgame_score(is_current_player_maximizer=True). If this node is a leaf node (i.e. an endgame state), returns its final score. The score will be negated depending on whether the current player is a maximizer (default) or not. If this node is not a leaf node, throws an error.

You can also access an internal snapshot of the current game. This allows you to access game-specific details such as the number of pieces in a Connect Four board. The snapshot will also be helpful because heuristic score functions evaluate not the AbstractGameState object, but its snapshot of the current position. (This is an implementation difference between endgame scoring functions and heuristic scoring functions.)

  • state.get_snapshot(). Return a snapshot of the current state of the game. Depending on the game, the type of the returned object may be different. For example, if the game is Connect Four, the snapshot will be a ConnectFourBoard object.


Finally, you may find the following methods useful when debugging:

  • state.describe_previous_move(). Return a string representing how the current state was generated from its parent. For example, if the game is Connect Four, previous move strings look like "Put Player Two's piece in column 4."
  • state.restart(). Reset the game to the initial state. Returns the reset AbstractGameState object.


Personal tools