Lab 2: Games

From 6.034 Wiki

Revision as of 03:20, 21 September 2015

API

todo

Problems

[in progress]

We are going to define a new GameState object that will encode the rules and starting state of the game ConnectFour.


Return True if the game is over, otherwise False.

def is_endgame_connectfour(board):
    # Returns True if the board contains a chain of length 4,
    # or if all columns are full.
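The stub above could be fleshed out along these lines. This is a minimal sketch: the methods get_all_chains() and is_column_full(col), and the FakeBoard stand-in, are assumptions made for illustration, not the lab's actual ConnectFourBoard API.

```python
def is_endgame_connectfour(board):
    """Return True if the game is over, otherwise False."""
    # A chain of length 4 or more means somebody has won.
    if any(len(chain) >= 4 for chain in board.get_all_chains()):
        return True
    # If every column is full, the game is a draw and therefore over.
    return all(board.is_column_full(col) for col in range(7))

class FakeBoard:
    """Minimal stand-in for the real ConnectFourBoard (illustration only)."""
    def __init__(self, chains, full_cols=()):
        self.chains = chains          # list of chains; each chain is a list of cells
        self.full_cols = set(full_cols)
    def get_all_chains(self):
        return self.chains
    def is_column_full(self, col):
        return col in self.full_cols

print(is_endgame_connectfour(FakeBoard([[1, 1, 1, 1]])))      # won -> True
print(is_endgame_connectfour(FakeBoard([[1, 1]], range(7))))  # full -> True
print(is_endgame_connectfour(FakeBoard([[2, 2, 2]])))         # ongoing -> False
```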


Return 1 or 2 if one of the players has won, otherwise None. Assume that the board is legal, so at most one player has won.

def winner_connectfour(board):

Return a list of boards representing the legal game states you can end up in by making a move in the given position.

At most, you can move in any of the 7 columns. But you cannot legally move in a column that is full. You should also check in advance whether the game is already over, and return the empty list if so.

For each board in the list, board.prev_move_string() notes which column you moved in (for debugging purposes).

Note: iterate over the columns in increasing numerical order.

def next_moves_connectfour(board):
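One way to sketch this function. The board stand-in and its methods (is_column_full, add_piece, get_all_chains, prev_move_string) are assumed names for illustration; the real ConnectFourBoard API may differ.

```python
class FakeBoard:
    """Minimal stand-in board: columns[i] is the stack of pieces in column i."""
    def __init__(self, columns=None, prev_move=None):
        self.columns = columns if columns is not None else [[] for _ in range(7)]
        self.prev_move = prev_move
    def is_column_full(self, col):
        return len(self.columns[col]) >= 6
    def get_all_chains(self):
        return []  # no chain tracking in this toy version
    def add_piece(self, col, player=1):
        # Return a NEW board with a piece dropped in the given column.
        new_cols = [list(c) for c in self.columns]
        new_cols[col].append(player)
        return FakeBoard(new_cols, prev_move=col)
    def prev_move_string(self):
        return "moved in column %s" % self.prev_move

def is_endgame_connectfour(board):
    return (any(len(chain) >= 4 for chain in board.get_all_chains())
            or all(board.is_column_full(c) for c in range(7)))

def next_moves_connectfour(board):
    """Legal successor boards, iterating columns in increasing order."""
    if is_endgame_connectfour(board):
        return []
    return [board.add_piece(col) for col in range(7)
            if not board.is_column_full(col)]

# Usage: from an empty board, all 7 columns are playable.
moves = next_moves_connectfour(FakeBoard())
print(len(moves), moves[2].prev_move_string())  # 7 moved in column 2
```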


Here, we'll define the utility/score of a ConnectFour game that has been won. We'll return a score of 1000 (chosen arbitrarily) if player 1 has won, and a score of -1000 if player 2 has won.

def endgame_score_connectfour(board):


However, the above score function doesn't reward players for winning sooner, since all winning states have the same value. Instead, write a score function that rewards players for winning sooner. (Hint: you can use board.count_pieces())

Make sure you return a value with abs(val) >= 1000 to indicate that someone has won.

def endgame_score_connectfour_faster(board):
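One possible sketch using the count_pieces() hint. The get_all_chains() method, the chain representation (lists of player ids), and the FakeBoard are assumptions for illustration; a winner_connectfour sketch is included since the faster score needs it.

```python
def winner_connectfour(board):
    """Return 1 or 2 if that player has a 4-chain, otherwise None."""
    for chain in board.get_all_chains():
        if len(chain) >= 4:
            return chain[0]  # cells in a chain carry the player id here
    return None

def endgame_score_connectfour_faster(board):
    """Score a finished game, rewarding faster wins with larger magnitudes."""
    winner = winner_connectfour(board)
    if winner is None:
        return 0  # a full board with no 4-chain is a draw
    # A standard board holds 6 * 7 = 42 pieces. Fewer pieces on the board
    # means a faster win, so each empty cell adds a bonus on top of the
    # base value of 1000, keeping abs(score) >= 1000 for every win.
    magnitude = 1000 + (42 - board.count_pieces())
    return magnitude if winner == 1 else -magnitude

class FakeBoard:
    """Minimal stand-in for the real ConnectFourBoard (illustration only)."""
    def __init__(self, chains, pieces):
        self.chains, self.pieces = chains, pieces
    def get_all_chains(self):
        return self.chains
    def count_pieces(self):
        return self.pieces

print(endgame_score_connectfour_faster(FakeBoard([[1, 1, 1, 1]], 7)))   # 1035
print(endgame_score_connectfour_faster(FakeBoard([[2, 2, 2, 2]], 40)))  # -1002
```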


To define the rules and starting state of the game ConnectFour, we can pass the functions you wrote above to the AbstractGameState constructor.

Example (which we won't use for now):

    starting_state_connectfour = AbstractGameState(state = ConnectFourBoard(),
                                                   who_won_fn = winner_connectfour,
                                                   generate_next_states_fn = next_moves_connectfour,
                                                   endgame_score_fn = endgame_score_connectfour_faster)


Now we can begin writing minimax functions for abstract game states. Note: from here on, we will be using the AbstractGameState API.


First, we'll see what non-adversarial search looks like. Imagine both players are collaboratively trying to find the best path. Use depth-first search to find it, assuming both players share the same goal: finding the maximum score.

Use DFS to explore all possible endgame states.

Please return a tuple containing the best path (a list of Move objects), the score value of the leaf node (a number), and the number of score evaluations you performed (a number).

Although we gloss over this in recitation and only return the final score, it is necessary both to keep track of and to return the entire sequence of states, so that you know where the score came from and which move to make to reach it. We also ask you to keep track of the number of evaluations so that you can compare the performance of the different algorithms.

Use the constant INF in lab3.py to represent infinity.

def dfs_maximizing(state):
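A sketch of dfs_maximizing over a toy game tree. The state methods used here (is_game_over, generate_next_states, get_endgame_score) are assumed names; the real AbstractGameState API may differ.

```python
class ToyState:
    """Tiny stand-in for AbstractGameState: leaves carry endgame scores."""
    def __init__(self, score=None, children=()):
        self.score, self.children = score, children
    def is_game_over(self):
        return not self.children
    def generate_next_states(self):
        return list(self.children)
    def get_endgame_score(self):
        return self.score

def dfs_maximizing(state):
    """Return (best_path, leaf_score, num_evaluations); both players maximize."""
    if state.is_game_over():
        return ([state], state.get_endgame_score(), 1)
    best_path, best_score, evals = None, None, 0
    for child in state.generate_next_states():
        path, score, n = dfs_maximizing(child)
        evals += n
        if best_score is None or score > best_score:
            best_path, best_score = [state] + path, score
    return (best_path, best_score, evals)

# Two-ply tree with leaf scores 3, 5, 4, 9: the best collaborative path
# reaches the 9 leaf, and all 4 leaves are evaluated.
tree = ToyState(children=[ToyState(children=[ToyState(3), ToyState(5)]),
                          ToyState(children=[ToyState(4), ToyState(9)])])
path, score, evals = dfs_maximizing(tree)
print(score, evals, len(path))  # 9 4 3
```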


Find the minimax path by alternating max and min levels. Same return type as dfs_maximizing.

def minimax_endgame_search(state, maximize=True):
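The same toy tree shows how alternating max/min changes the answer relative to dfs_maximizing. As before, the state method names are assumptions standing in for the AbstractGameState API.

```python
class ToyState:
    """Tiny stand-in for AbstractGameState: leaves carry endgame scores."""
    def __init__(self, score=None, children=()):
        self.score, self.children = score, children
    def is_game_over(self):
        return not self.children
    def generate_next_states(self):
        return list(self.children)
    def get_endgame_score(self):
        return self.score

def minimax_endgame_search(state, maximize=True):
    """Return (best_path, leaf_score, num_evaluations), alternating max/min."""
    if state.is_game_over():
        return ([state], state.get_endgame_score(), 1)
    best, evals = None, 0
    for child in state.generate_next_states():
        path, score, n = minimax_endgame_search(child, not maximize)
        evals += n
        if best is None or (maximize and score > best[1]) \
                        or (not maximize and score < best[1]):
            best = ([state] + path, score)
    return (best[0], best[1], evals)

# The maximizer picks the branch whose minimizer-best leaf is largest:
# min(3, 5) = 3 on the left, min(4, 9) = 4 on the right, so the value is 4
# (where plain dfs_maximizing would have found 9).
tree = ToyState(children=[ToyState(children=[ToyState(3), ToyState(5)]),
                          ToyState(children=[ToyState(4), ToyState(9)])])
path, score, evals = minimax_endgame_search(tree)
print(score, evals, len(path))  # 4 4 3
```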


LAB WRITER TODO: provide test boards that are nearly full and therefore have a very small game tree. (If you started with an empty board, the tree would be massive.) Also provide commented-out lines of code that can be uncommented to see how the different minimax variations compare in terms of number of evaluations.


In practice, it is infeasible to search through the entire game tree in order to find a score. (Imagine ranking opening chess moves by whether they are automatic wins/losses for perfect play.)

Instead, define a heuristic that allows you to estimate the goodness of a state.

There are many possible strategies and ways of measuring/defining the goodness of a state; you only need to define something that gives higher scores for obviously stronger positions.

Hint: incorporate information about each player's chains. The returned value must have abs(value) < 1000, because these are not winning states.


def connectfour_heuristic(board):
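One possible heuristic along the lines of the hint. Assumptions for illustration: board.get_all_chains(player) returns that player's chains, and the exponential weights are one arbitrary choice among many that satisfy the assignment.

```python
def chain_score(chains):
    # Longer chains are exponentially more valuable: a 3-chain (100)
    # outweighs many isolated pieces (1 each).
    return sum(10 ** (len(chain) - 1) for chain in chains)

def connectfour_heuristic(board):
    """Estimate how good a non-endgame board is: positive favors player 1,
    negative favors player 2, and the magnitude stays below 1000."""
    raw = (chain_score(board.get_all_chains(1))
           - chain_score(board.get_all_chains(2)))
    # Clamp so abs(value) < 1000, since these are not winning states.
    return max(-999, min(999, raw))

class FakeBoard:
    """Minimal stand-in for the real ConnectFourBoard (illustration only)."""
    def __init__(self, chains_by_player):
        self.chains_by_player = chains_by_player
    def get_all_chains(self, player):
        return self.chains_by_player.get(player, [])

# Player 1 has a 3-chain, player 2 only a 2-chain: player 1 is ahead.
b = FakeBoard({1: [[1, 1, 1]], 2: [[2, 2]]})
print(connectfour_heuristic(b))  # 100 - 10 = 90
```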


The previous version of minimax which you wrote could only handle endgame states. Now we'll define minimax search which can search to any shallower depth, using a heuristic score instead of an endgame score for states that are not endgame states.

Define minimax search which (1) recurses if the state is neither an endgame state nor at the maxdepth limit, (2) uses the endgame score if the state is an endgame state, and (3) uses heuristic_fn for non-endgame states at the maxdepth limit. Same return type as minimax_endgame_search.

If the game is over and the depth limit has been reached, use the endgame score (not the heuristic). maxdepth is the number of additional levels you may look ahead: if maxdepth=1, you can examine all child nodes of the current state but no further; if maxdepth=0, you are at the depth limit and cannot examine any child nodes.

The evaluation count should include the total number of endgame-score evaluations and heuristic-score evaluations; you don't need to distinguish between them when counting.

LAB WRITER TODO: minimax_search could also be minimax_search_anydepth

def minimax_search(state, heuristic_fn=always_zero, maxdepth=INF, maximize=True):
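A depth-limited sketch over the same toy-state idea. The state API, the local INF, and always_zero are stand-ins for the lab's definitions, assumed here for illustration.

```python
INF = float('inf')

def always_zero(state):
    return 0

class ToyState:
    """Tiny stand-in for AbstractGameState, plus a heuristic estimate field."""
    def __init__(self, score=None, children=(), estimate=0):
        self.score, self.children, self.estimate = score, children, estimate
    def is_game_over(self):
        return not self.children
    def generate_next_states(self):
        return list(self.children)
    def get_endgame_score(self):
        return self.score

def minimax_search(state, heuristic_fn=always_zero, maxdepth=INF, maximize=True):
    """Return (best_path, score, num_evaluations)."""
    if state.is_game_over():
        # Endgame score wins over the heuristic, even at the depth limit.
        return ([state], state.get_endgame_score(), 1)
    if maxdepth == 0:
        # Depth limit reached on a non-endgame state: estimate instead.
        return ([state], heuristic_fn(state), 1)
    best, evals = None, 0
    for child in state.generate_next_states():
        path, score, n = minimax_search(child, heuristic_fn, maxdepth - 1,
                                        not maximize)
        evals += n
        if best is None or (maximize and score > best[1]) \
                        or (not maximize and score < best[1]):
            best = ([state] + path, score)
    return (best[0], best[1], evals)

tree = ToyState(children=[
    ToyState(children=[ToyState(3), ToyState(5)], estimate=8),
    ToyState(children=[ToyState(4), ToyState(9)], estimate=1)])
# Unlimited depth agrees with endgame minimax; at maxdepth=1 the
# children's heuristic estimates decide instead.
print(minimax_search(tree)[1])                                    # 4
print(minimax_search(tree, lambda s: s.estimate, maxdepth=1)[1])  # 8
```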


Perform minimax search with alpha-beta pruning. Same return type as minimax_search.

def minimax_search_alphabeta(state, alpha=-INF, beta=INF, maxdepth=INF,
                             heuristic_fn=always_zero, maximize=True):
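The pruning logic can be sketched on the endgame-only case over a toy tree (the depth limit and heuristic are omitted to isolate the alpha-beta mechanics; the state method names are assumptions, as above).

```python
INF = float('inf')

class ToyState:
    """Tiny stand-in for AbstractGameState: leaves carry endgame scores."""
    def __init__(self, score=None, children=()):
        self.score, self.children = score, children
    def is_game_over(self):
        return not self.children
    def generate_next_states(self):
        return list(self.children)
    def get_endgame_score(self):
        return self.score

def minimax_search_alphabeta(state, alpha=-INF, beta=INF, maximize=True):
    """Return (best_path, score, num_evaluations), pruning hopeless branches."""
    if state.is_game_over():
        return ([state], state.get_endgame_score(), 1)
    best, evals = None, 0
    for child in state.generate_next_states():
        path, score, n = minimax_search_alphabeta(child, alpha, beta,
                                                  not maximize)
        evals += n
        if best is None or (maximize and score > best[1]) \
                        or (not maximize and score < best[1]):
            best = ([state] + path, score)
        if maximize:
            alpha = max(alpha, score)
        else:
            beta = min(beta, score)
        if alpha >= beta:
            break  # remaining siblings cannot affect the result
    return (best[0], best[1], evals)

# Left branch: min(3, 5) = 3, so alpha = 3 at the root. In the right
# branch, the first leaf (2) drives beta below alpha, so the 9 leaf is
# never evaluated: 3 evaluations instead of the 4 plain minimax needs.
tree = ToyState(children=[ToyState(children=[ToyState(3), ToyState(5)]),
                          ToyState(children=[ToyState(2), ToyState(9)])])
path, score, evals = minimax_search_alphabeta(tree)
print(score, evals)  # 3 3
```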


Write a progressive deepening algorithm (using minimax with alpha-beta pruning).

Code progressive deepening as you normally would, with the same return type as minimax_search. But each time you call alphabeta at a new depth and it returns a value, store that value using anytime_value.set_value(new_best_option).

def progressive_deepening(state, maxdepth=INF, heuristic_fn=always_zero, maximize=True,
                          anytime_value=AnytimeValue()):
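The shape of the algorithm can be sketched as below. AnytimeValue here is a minimal stand-in for the lab's class, and search_fn stands in for a depth-limited search such as minimax_search_alphabeta; the real signatures differ.

```python
class AnytimeValue:
    """Stand-in for the lab's AnytimeValue: keeps the latest (deepest) result."""
    def __init__(self):
        self.value = None
    def set_value(self, new_value):
        self.value = new_value

def progressive_deepening(state, maxdepth, search_fn, anytime_value=None):
    """Search to depth 1, then 2, ... up to maxdepth, storing each result as
    soon as it arrives, so stopping early still leaves a usable answer."""
    if anytime_value is None:
        anytime_value = AnytimeValue()
    for depth in range(1, maxdepth + 1):
        anytime_value.set_value(search_fn(state, depth))
    return anytime_value

# Fake depth-limited search: deeper searches refine the estimate, so the
# stored value after the loop is the deepest result.
estimates = {1: 8, 2: 5, 3: 4}
result = progressive_deepening("root", 3, lambda state, depth: estimates[depth])
print(result.value)  # 4
```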


LAB WRITER TODO: Have it reorganize the moves each round based on the heuristic value.
