Lab 2: Games



API


ConnectFourBoard

class ConnectFourBoard :

   max_height = 6 #number of rows
   max_width = 7 #number of columns
  • .get_current_player():
       """Return the current player. By convention, 1 or 2."""
  • .set_current_player(player):
       """Set the current player. By convention, 1 or 2."""
  • .get_other_player():
       """Return the other player (the one whose turn it is NOT). By convention, 1 or 2."""
  • .count_pieces(player=None):
       """Return the total number of pieces on the board. If player is
       supplied, returns only the number of those belonging to that player."""
  • .get_column_height(col_number):
       """Return the number of pieces in the column; e.g., 0 if the column is empty."""
    
  • .is_column_full(col_number):
       "Return True if column is full, False otherwise"
    
  • .add_piece(col_number, player):
       """Return a new board with a piece belonging to the player added to the given
       column. The original board is not modified."""
  • .describe_previous_move():
       """Return a string describing the most recent move leading to the current state."""
  • .get_all_chains(player=None):
       """Get all maximal contiguous chains of pieces. If player is provided, 
       returns only chains belonging to that player."""
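
For illustration, here is a hypothetical snippet (not part of the lab) showing how these methods fit together. Note that add_piece returns a new board rather than modifying the original:

board = ConnectFourBoard()              # empty 6x7 board
player = board.get_current_player()     # by convention, 1 or 2
board2 = board.add_piece(3, player)     # a NEW board; board is unchanged
print(board2.get_column_height(3))      # 1
print(board2.is_column_full(3))         # False
print(board2.count_pieces(player))      # 1
print(board2.describe_previous_move())  # string describing the move just made
print(board2.get_all_chains(player))    # chains belonging to that player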

AbstractGameState


class AbstractGameState :

  • .__init__(self,
                state,
                who_won_fn,
                generate_next_states_fn,
                endgame_score_fn):
  • .state(self):
       return self.state
           
  • .endgame_winner():
       return self.who_won_fn(self.state)
           
  • .is_game_over():
       return self.endgame_winner() is not None
   
  • .generate_next_states():
       return self.generate_next_states_fn(self.state)
       
  • .describe_previous_move():
       return self.state.describe_previous_move()
  • .get_endgame_score():
       # only for leaf nodes
       if not self.is_game_over():
           raise ValueError("Only endgame states have endgame score defined.")
       return self.endgame_score_fn(self.state)
  • .restart():
       self.state = self.starting_state

Problems


Creating a ConnectFour AbstractGameState object

We are going to define a new AbstractGameState object which will encode the rules and starting state of the game ConnectFour.


Return True if the game is over, otherwise False. The game is over when the board contains a chain of length 4, or when every column is full.

def is_endgame_connectfour(board) :
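
For illustration, one possible sketch using only the methods documented above (and assuming each chain returned by get_all_chains behaves like a sequence of pieces, so len() gives its length):

def is_endgame_connectfour(board) :
    """Return True if someone has a chain of length 4 or every column is full."""
    if any(len(chain) >= 4 for chain in board.get_all_chains()):
        return True
    return all(board.is_column_full(col) for col in range(board.max_width))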

Return 1 or 2 if one of the players has won, otherwise None. Assume that the board is legal, so at most one player has won.

def winner_connectfour(board) :
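
One possible sketch, under the same assumption about chains as above:

def winner_connectfour(board) :
    """Return 1 or 2 if that player has a chain of length 4, otherwise None."""
    for player in (1, 2):
        if any(len(chain) >= 4 for chain in board.get_all_chains(player)):
            return player
    return None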

Return a list of boards representing the legal game states you can end up in by making a move in the given position.

At most, you can move in any of the 7 columns, but you cannot legally move in a column that is already full. You should also check in advance whether the game is already over, and return the empty list if so.

For each board in the list, board.describe_previous_move() notes which column you moved in (for debugging purposes).

Note: iterate over the columns in increasing numerical order.

def next_boards_connectfour(board) :
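
A sketch along these lines, reusing is_endgame_connectfour from above:

def next_boards_connectfour(board) :
    """Return a list of legal successor boards, one per non-full column, in column order."""
    if is_endgame_connectfour(board):
        return []
    player = board.get_current_player()
    return [board.add_piece(col, player)
            for col in range(board.max_width)
            if not board.is_column_full(col)]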

Here, we'll define the utility/score of a ConnectFour game that has been won. We'll return a score of 1000 (arbitrarily chosen) if player 1 has won, and a score of -1000 if player 2 has won.

def endgame_score_connectfour(board) :
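
A minimal sketch; returning 0 for a tied (full) board is an assumption, since the problem only specifies scores for games that have been won:

def endgame_score_connectfour(board) :
    """Return 1000 if player 1 has won, -1000 if player 2 has won, 0 otherwise."""
    winner = winner_connectfour(board)
    if winner == 1:
        return 1000
    if winner == 2:
        return -1000
    return 0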

However, the above score function doesn't reward players for winning sooner, since all winning states have the same value. Instead, write a score function that rewards players for winning sooner. (Hint: you can use board.count_pieces())

Make sure you return a value with abs(val) >= 1000 to indicate that someone has won.

def endgame_score_connectfour_faster(board) :
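
One possible scheme (the particular bonus formula is just an illustrative assumption): start from +/-1000 and add a bonus that shrinks as more pieces appear on the board, so earlier wins score higher while abs(val) stays >= 1000:

def endgame_score_connectfour_faster(board) :
    winner = winner_connectfour(board)
    if winner is None:
        return 0
    # 42 = 6 rows * 7 columns; fewer pieces on the board => bigger bonus
    bonus = 2 * (42 - board.count_pieces())
    return (1000 + bonus) if winner == 1 else -(1000 + bonus)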

To define the rules and starting state of the game ConnectFour, we can pass the functions you wrote above to the AbstractGameState constructor.

Example (which we won't use for now):

starting_state_connectfour = AbstractGameState(state = ConnectFourBoard(),
                                               who_won_fn = winner_connectfour,
                                               generate_next_states_fn = next_boards_connectfour,
                                               endgame_score_fn = endgame_score_connectfour_faster)

Searching a game tree

Cooperative depth-first search

Now we can begin writing minimax functions for abstract game states. Note: we will now be using the AbstractGameState API.

First, we'll see what non-adversarial search looks like. Imagine that both players are collaborating and share the same goal: finding the path to the maximum score. Use depth-first search to find that best path.

Use DFS to explore all possible endgame states.

Return a tuple containing the best path (a list of game states), the score of the leaf node at the end of that path (a number), and the number of score evaluations you performed (a number).

Although we gloss over this in recitation and only return the final score, here you must keep track of and return the entire sequence of states, so that you know where the score came from and which move to make to reach it. We also ask you to keep track of the number of evaluations so that you can compare the performance of the different algorithms.

Use the constant INF (defined in lab3.py) to represent infinity.

def dfs_maximizing(state) :
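
A recursive sketch using the AbstractGameState API, where the returned path is the list of states from the given state down to the chosen leaf:

def dfs_maximizing(state) :
    if state.is_game_over():
        return ([state], state.get_endgame_score(), 1)
    best_path, best_score, evals = None, -INF, 0
    for child in state.generate_next_states():
        path, score, child_evals = dfs_maximizing(child)
        evals += child_evals
        if score > best_score:
            best_path, best_score = [state] + path, score
    return (best_path, best_score, evals)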

Adversarial search with minimax on endgame scores

Find the minimax path by alternating between maximizing and minimizing at each level of the tree. Same return type as dfs_maximizing.

def minimax_endgame_search(state, maximize=True) :
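
A sketch with the same structure as dfs_maximizing, alternating between maximizing and minimizing at each level:

def minimax_endgame_search(state, maximize=True) :
    if state.is_game_over():
        return ([state], state.get_endgame_score(), 1)
    best, evals = None, 0
    for child in state.generate_next_states():
        path, score, child_evals = minimax_endgame_search(child, not maximize)
        evals += child_evals
        if best is None or (maximize and score > best[1]) \
                        or (not maximize and score < best[1]):
            best = ([state] + path, score)
    return (best[0], best[1], evals)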


You can run your function on the test boards.

A heuristic for ConnectFour boards

In practice, it is infeasible to search through the entire game tree in order to find a score. (Imagine ranking opening chess moves by whether they are automatic wins/losses for perfect play.)

Instead, define a heuristic that allows you to estimate the goodness of a state.

There are many possible strategies and ways of measuring/defining the goodness of a state; you only need to define something that gives higher scores for obviously stronger positions.

Hint: incorporate information about each player's chains. Because these are not winning states, the value you return must satisfy abs(val) < 1000.


def connectfour_heuristic(board) :
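
One possible heuristic (not the only valid one): weight each player's chains by the square of their length and return the difference from player 1's perspective. In a non-endgame state no chain is longer than 3, so the value stays well inside (-1000, 1000):

def connectfour_heuristic(board) :
    def chain_score(player):
        return sum(len(chain) ** 2 for chain in board.get_all_chains(player))
    return chain_score(1) - chain_score(2)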

Minimax with heuristic and endgame scores

The previous version of minimax that you wrote could only handle endgame states. Now we'll define a minimax search that can stop at any shallower depth, using a heuristic score instead of an endgame score for states that are not endgame states.

Define a minimax search which (1) recurses if the state is neither an endgame state nor at the maxdepth limit, (2) uses the endgame score if the state is an endgame state, and (3) uses heuristic_fn for non-endgame states at the maxdepth limit. Same return type as minimax_endgame_search.

If the game is over and the depth limit has been reached, use the endgame score (not the heuristic). maxdepth is the number of additional levels you may look ahead; i.e., if maxdepth=1 you can look at all the children of the current state but no further, and if maxdepth=0 you are at the depth limit and cannot look at any children.

The evaluation count should include the total number of endgame-score evaluations and heuristic evaluations; you don't need to distinguish between them for the purpose of counting evaluations.

The default heuristic, always_zero, simply returns 0 for any board.


def minimax_search(state, heuristic_fn=always_zero, maxdepth=INF, maximize=True) :
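
A sketch under the conventions above. It assumes heuristic_fn takes the underlying board, read off the state here as state.state (per the AbstractGameState listing); adjust that access if the final API differs:

def minimax_search(state, heuristic_fn=always_zero, maxdepth=INF, maximize=True) :
    if state.is_game_over():
        return ([state], state.get_endgame_score(), 1)
    if maxdepth == 0:
        # At the depth limit: score with the heuristic instead of recursing.
        return ([state], heuristic_fn(state.state), 1)
    best, evals = None, 0
    for child in state.generate_next_states():
        path, score, child_evals = minimax_search(child, heuristic_fn,
                                                  maxdepth - 1, not maximize)
        evals += child_evals
        if best is None or (maximize and score > best[1]) \
                        or (not maximize and score < best[1]):
            best = ([state] + path, score)
    return (best[0], best[1], evals)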

Minimax with alpha-beta optimization

Perform minimax search with alpha-beta pruning. Same return type as minimax_search.

def minimax_search_alphabeta(state, alpha=-INF, beta=INF, maxdepth=INF,
                            heuristic_fn=always_zero, maximize=True) :
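
A sketch with the same structure (and the same state.state assumption) as the minimax_search sketch above, pruning a node's remaining children once alpha >= beta:

def minimax_search_alphabeta(state, alpha=-INF, beta=INF, maxdepth=INF,
                             heuristic_fn=always_zero, maximize=True) :
    if state.is_game_over():
        return ([state], state.get_endgame_score(), 1)
    if maxdepth == 0:
        return ([state], heuristic_fn(state.state), 1)
    best, evals = None, 0
    for child in state.generate_next_states():
        path, score, child_evals = minimax_search_alphabeta(
            child, alpha, beta, maxdepth - 1, heuristic_fn, not maximize)
        evals += child_evals
        if maximize:
            if best is None or score > best[1]:
                best = ([state] + path, score)
            alpha = max(alpha, score)
        else:
            if best is None or score < best[1]:
                best = ([state] + path, score)
            beta = min(beta, score)
        if alpha >= beta:
            break   # remaining children cannot change the result
    return (best[0], best[1], evals)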

Use progressive deepening to return anytime values

Write a progressive deepening algorithm (using minimax with alpha-beta pruning).

AnytimeValue API:

class AnytimeValue :

  • .set_value(val):
  • .get_value():


Code progressive deepening as you normally would. Each time you call minimax_search_alphabeta at a new depth and it returns a result (a tuple in the same format as minimax_search's return value), store that result using anytime_value.set_value(new_best_option). The function itself returns anytime_value, as described in the docstring below.

def progressive_deepening(state, maxdepth=INF, heuristic_fn=always_zero, maximize=True, 
                         anytime_value = AnytimeValue()) :
   """Runs minimax with alpha-beta pruning. At each level, updates anytime_value
   with the tuple returned from minimax_search_alphabeta. Returns anytime_value."""
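
A sketch under the assumptions above (maxdepth should be finite for the loop to terminate):

def progressive_deepening(state, maxdepth=INF, heuristic_fn=always_zero, maximize=True,
                          anytime_value=AnytimeValue()) :
    depth = 1
    while depth <= maxdepth:
        result = minimax_search_alphabeta(state, -INF, INF, depth,
                                          heuristic_fn, maximize)
        anytime_value.set_value(result)   # best answer found so far
        depth += 1
    return anytime_value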