Artificial Intelligence — Alpha-Beta Pruning
A Brief Explanation of an Optimization Technique for the Minimax Algorithm in Game-Playing Artificial Intelligence, and Its Implementation in Java
We are now in a decade focused on expanding the application of Artificial Intelligence (AI) to almost every sector possible. Ever since the field of AI began, game playing has been one of its most important and most interesting applications.
Claude Shannon and Alan Turing wrote the first chess programs in 1950, almost as soon as computers could be programmed. Chess, tic-tac-toe, and Go are intriguing because they provide a pure abstraction of the competition between two players. This abstraction is what makes game playing an appealing area for AI research.
What is a minimax algorithm?
Minimax is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy to minimize the potential loss in the worst-case (highest loss) scenario. When dealing with gains, the term maximin refers to maximizing the minimum gain.
It is a classic depth-first search method for a two-player sequential game. MAX and MIN are the names of the two players. The minimax algorithm finds the best move for MAX, the player at the root node. The search tree is built by recursively expanding all nodes from the root in a depth-first fashion until the game ends or the maximum search depth is reached.
This algorithm provides an optimal move for the player, assuming that the opponent is also playing optimally. To search through the game tree, the minimax algorithm employs recursion. It is widely used in AI for two-player games such as:
- Chess
- Checkers
- Tic-tac-toe
- Go
- Various other two-player games
Let’s take a closer look at this algorithm with a small example, written as a nested max/min expression over the leaf utilities of a game tree:
Minimax Decision = max{ min{ max{1}, max{3, -2, -5} },
                        min{ max{99}, max{2, -99} },
                        min{ max{4} } }
                 = max{ min{1, 3}, min{99, 2}, min{4} }
                 = max{1, 2, 4}
                 = 4
Pseudocode of the minimax algorithm:
function minimax(node, depth, maximizingPlayer)
    if depth = 0 or node is a terminal node
        return the utility value of node
    if maximizingPlayer
        bestValue := −∞
        for each child of node
            v := minimax(child, depth − 1, FALSE)
            bestValue := max(bestValue, v)
        return bestValue
    else  (* minimizing player *)
        bestValue := +∞
        for each child of node
            v := minimax(child, depth − 1, TRUE)
            bestValue := min(bestValue, v)
        return bestValue
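To make this concrete, here is a minimal minimax sketch in Java. It assumes the game tree is handed over explicitly through a small Node class invented for this article (a leaf holds a utility value, an inner node holds children); a real game would generate moves on the fly instead of storing the whole tree.

import java.util.ArrayList;
import java.util.List;

public class Minimax {

    // A hypothetical tree node: a leaf carries a utility value, an inner node carries children.
    static class Node {
        int value;                               // only meaningful for leaves
        List<Node> children = new ArrayList<>();

        Node(int value) { this.value = value; }  // leaf constructor
        Node(Node... kids) { for (Node k : kids) children.add(k); }
        boolean isLeaf() { return children.isEmpty(); }
    }

    // Plain minimax: the best value MAX can force from this node.
    static int minimax(Node node, int depth, boolean maximizingPlayer) {
        if (depth == 0 || node.isLeaf()) {
            return node.value;                   // utility of a terminal (or cutoff) node
        }
        int bestValue;
        if (maximizingPlayer) {
            bestValue = Integer.MIN_VALUE;       // stands in for −∞
            for (Node child : node.children) {
                bestValue = Math.max(bestValue, minimax(child, depth - 1, false));
            }
        } else {
            bestValue = Integer.MAX_VALUE;       // stands in for +∞
            for (Node child : node.children) {
                bestValue = Math.min(bestValue, minimax(child, depth - 1, true));
            }
        }
        return bestValue;
    }

    public static void main(String[] args) {
        // The tree from the worked calculation above: MAX -> MIN -> MAX -> leaves.
        Node root = new Node(
            new Node(new Node(new Node(1)), new Node(new Node(3), new Node(-2), new Node(-5))),
            new Node(new Node(new Node(99)), new Node(new Node(2), new Node(-99))),
            new Node(new Node(new Node(4))));
        System.out.println(minimax(root, 3, true));  // prints 4, matching the calculation above
    }
}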
What is the drawback of this algorithm?
Game trees generally take a long time to construct, and they can only be generated quickly for simple games. If there are k legal moves at each position and the maximum depth of the tree is h, then:
Time complexity is O(kʰ)
Space complexity is O(k·h)
There are some optimizations that can be incorporated into the algorithm to alleviate this. Fortunately, it is possible to compute the exact minimax decision without examining every node of the game tree. As a result, we remove nodes from the tree without evaluating them, a process known as pruning.
Alpha-beta pruning
Alpha-beta pruning is a method of finding the optimal minimax solution while avoiding the search of subtrees of moves that will never be chosen.
There are two types of nodes in the search tree for a two-player game:
Nodes representing your moves (MAX nodes): these are commonly drawn as squares (or upward-pointing triangles).
At a MAX node, the objective is to maximize the value of the subtree rooted at that node. To accomplish this, a MAX node selects the child with the highest value, which becomes the MAX node’s value.
Nodes representing your opponent’s moves (MIN nodes): these are commonly drawn as circles (or downward-pointing triangles).
The objective of a MIN node is to minimize the value of the subtree rooted at a certain node. To accomplish this, a MIN node selects the child with the least (smallest) value, which becomes the MIN node’s value.
When we apply alpha-beta pruning to a regular minimax algorithm, it returns the same move as before, but it removes (prunes) all the nodes that cannot influence the final decision.
- Alpha (α): the best (highest) value found so far for the player MAX; that is, the maximum lower bound on the possible solutions.
- Beta (β): the best (lowest) value found so far for the player MIN; that is, the minimum upper bound on the possible solutions.
Thus, when any new node is being considered as a possible path to the solution, it can only work if α ≤ N ≤ β, where N is the current estimate of the value of the node.
Let’s use a number line to visualize this. At any point in time, alpha and beta are lower and upper bounds on the set of possible solution values, so we can picture them like this:
As the search progresses, we can narrow the range of potential solutions: MIN nodes may lower the upper bound, and MAX nodes may raise the lower bound. These bounds typically get closer and closer together as we move through the search tree.
As long as the alpha and beta range overlap, this convergence is not really a problem. At some point during the evaluation of a node, we may discover that one of the bounds has been moved such that there is no longer any overlap between the ranges of alpha and beta.
We know that this node will never lie on a solution path we would consider, so we can stop processing it. In other words, we stop generating children for it and return to its parent node. As this node’s value, we pass up the bound that was just updated past the other bound.
Pseudocode of the alpha-beta pruning algorithm:
evaluate(node, alpha, beta)
    if node is a leaf
        return the utility value of node
    if node is a minimizing node
        for each child of node
            beta = min(beta, evaluate(child, alpha, beta))
            if beta <= alpha
                return beta
        return beta
    if node is a maximizing node
        for each child of node
            alpha = max(alpha, evaluate(child, alpha, beta))
            if beta <= alpha
                return alpha
        return alpha
Let’s see how it works using the minimax tree given below (ignore the shapes; just look at the labeled levels, and imagine squares for MAX nodes and circles for MIN nodes).
For the rest of this example, I’ll only show the parts of the tree that have already been evaluated or are currently being evaluated. I’ll also describe the behavior as if you were generating the child states rather than simply traversing a tree that was given to you. In that spirit, we’re attempting to find the best move by looking two moves ahead (i.e., two moves each by me and my opponent). As a result, we will traverse the tree to a depth of 4 and then evaluate the state.
You only see the current state at the start of the problem (i.e. the current position of pieces on the game board). All you know about upper and lower bounds is that they are less than infinity and greater than negative infinity. As a result, here is the initial situation.
Since the bounds still contain a valid range, we work through the problem by generating child states and passing the current bounds along until we reach a node with value 3 at depth 4. At that point, our search looks like this.
We return this node to the min node above. Because this is a min node, we know that its minimax value must be less than or equal to 3. In other words, we change beta to 3.
It’s worth noting that the alpha and beta values at higher levels of the tree remained unchanged; they will be updated when processing returns to those nodes. Since those values may still change, there is no real benefit in propagating them up the tree early. Alpha and beta values only ever spread between parent and child nodes.
When alpha and beta are plotted on a number line, they now look like this.
Then, at depth 4, we generate the next child, run our evaluation function, and return a value of 17 to the min node above. Because the parent is a min node and 17 is greater than 3, this value is ignored. We’ve now seen all of this min node’s children, so we return the beta value to the max node above. Because it is a max node, we know its value will be greater than or equal to 3, so we change alpha to 3.
It’s important to note that beta hasn’t changed. This is because max nodes can only impose restrictions on the lower bound. Furthermore, alpha and beta values are passed down the tree as-is, but they are not passed back up unchanged. Instead, the final beta value of a min node is passed up to possibly raise the alpha value of its parent. Similarly, the final alpha value of a max node is passed up to possibly lower the beta value of its parent.
When alpha and beta are plotted on a number line, they now look like this.
We generate the next child and pass along the bounds. Because this node is not at the target depth, we generate its first child, run the evaluation function on that child, and return its value. Since the parent is a min node, we now know its value is less than or equal to 2, so we change beta to 2.
When alpha and beta are plotted on a number line, they now look like this.
The number line shows that there is no longer any overlap between the regions bounded by alpha and beta. In essence, we have discovered that the only way to find a solution path through this node would be to find a child with a value greater than 3 and less than 2. Because that is impossible, we can stop evaluating this node’s children and return the beta value (2) as the node’s value.
To be sure, we don’t know what the node is worth. There could be a 1 or a 0 or a -100 somewhere in this node’s other children. Even if such a value existed, searching for it would not help us find the best solution in the search tree. The number 2 is sufficient to render this subtree useless, so we can prune any other children and return it.
That’s all there is to beta pruning!
Back at the parent max node, our alpha value is already 3, which is more restrictive than 2, so we leave it alone. We’ve now seen all of this max node’s children, so we set its value to the final alpha value. Now we’ll look at the parent min node. Because its first child evaluated to 3, we know the min node’s value must be less than or equal to 3, so we set beta to 3.
When alpha and beta are plotted on a number line, they now look like this.
We continue with the next child because we still have a valid range. We generate the max node that is the first child of this min node, and eventually reach a max node at the target depth, simply passing the alpha and beta bounds along the path.
At this point, we’ve seen all of the min node’s children and haven’t changed the beta bound. Since we haven’t exceeded the bound, we return the node’s actual min value. This is distinct from the case where we pruned, in which we returned the beta value; the reason for this will become clear in a moment. The value is now returned to the parent max node. Based on this value, we know that the max node will be worth 15 or more, so we set alpha to 15.
When alpha and beta are plotted on a number line, they now look like this.
Because the alpha and beta bands have crossed yet again, we can prune the remaining children of this node and return the value that exceeded the bound (i.e. 15). We wouldn’t have been able to prune here if we had returned the beta value of the child min node (3) instead of the actual value (15).
Now since the parent min node has seen all of its children, it can choose the lowest value of its children (3) and return. Finally, we have completed the first child of the root max node. We now know that our solution will be at least three, so we set the alpha value to three and proceed to the second child.
Passing the alpha and beta values along, let’s generate the second child of the root node. We keep passing the values down and generating child nodes until we reach a node at depth 4 with no further children. There, we call the evaluation function and obtain 2. The parent min node uses this value to set its beta to 2.
When alpha and beta are plotted on a number line, they now look like this.
Once again we are able to prune the other children of this node and return the value that exceeded the bound. We don’t change the bounds because this value is less than the alpha bound of the parent max node.
As a result, we will generate the max node’s next child. Then, at the target depth, we generate its child. We invoke the evaluation function and obtain a value of 3. This value is used by the parent min node to set its upper bound (beta) to 3.
When alpha and beta are plotted on a number line, they now look like this.
In other words, at this point, alpha = beta. Should we prune around here? We haven’t actually exceeded the bounds, but since alpha and beta are equal, we know we won’t be able to do any better in this subtree.
Yes, we should prune. The reason is that even if we cannot do better, we may still do worse. Remember that minimax’s task is to find the best move to make at the state represented by the top-level max node. Since alpha equals beta, we treat this min node as finished, prune its remaining children, and return the min value 3.
Since this max node has now seen all of its children, it returns the maximum value of those seen, which is 3. This value is handed back to its parent min node, which now has a new upper bound of 3, causing beta to be set to 3.
When alpha and beta are plotted on a number line, they now look like this.
We’ve again reached a point where the alpha and beta bounds have met, so we prune. It’s worth noting that a complete solution indicates not only a value but also the move that produced it.
So the final output will be 3.
How to implement this in Java?
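As one possible answer, here is a sketch that translates the evaluate() pseudocode above into Java. It reuses the same made-up Node class as the minimax sketch earlier (a leaf holds a utility value, an inner node holds children) and replaces the “is a minimizing/maximizing node” test with a boolean flag; treat it as an illustration of the technique rather than a complete game engine.

import java.util.ArrayList;
import java.util.List;

public class AlphaBeta {

    // Same hypothetical node shape as in the minimax sketch.
    static class Node {
        int value;                               // only meaningful for leaves
        List<Node> children = new ArrayList<>();
        Node(int value) { this.value = value; }  // leaf constructor
        Node(Node... kids) { for (Node k : kids) children.add(k); }
        boolean isLeaf() { return children.isEmpty(); }
    }

    // Minimax with alpha-beta pruning, following the evaluate() pseudocode above.
    static int evaluate(Node node, int alpha, int beta, boolean maximizingPlayer) {
        if (node.isLeaf()) {
            return node.value;                   // utility value of a leaf
        }
        if (maximizingPlayer) {
            for (Node child : node.children) {
                alpha = Math.max(alpha, evaluate(child, alpha, beta, false));
                if (beta <= alpha) {
                    return alpha;                // cut-off: MIN above will never let the game reach here
                }
            }
            return alpha;
        } else {
            for (Node child : node.children) {
                beta = Math.min(beta, evaluate(child, alpha, beta, true));
                if (beta <= alpha) {
                    return beta;                 // cut-off: MAX above already has a better option
                }
            }
            return beta;
        }
    }

    public static void main(String[] args) {
        // The same example tree as in the minimax sketch; pruning changes the work done, not the answer.
        Node root = new Node(
            new Node(new Node(new Node(1)), new Node(new Node(3), new Node(-2), new Node(-5))),
            new Node(new Node(new Node(99)), new Node(new Node(2), new Node(-99))),
            new Node(new Node(new Node(4))));
        System.out.println(evaluate(root, Integer.MIN_VALUE, Integer.MAX_VALUE, true));  // prints 4
    }
}

Here Integer.MIN_VALUE and Integer.MAX_VALUE stand in for −∞ and +∞. A real engine would also pass a depth limit (as the walkthrough above assumed) and remember which move produced the best value, not just the value itself.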
Beyond translating the pseudocode into Java (as sketched above), the effectiveness of alpha-beta pruning is determined by the order in which nodes are examined; move ordering is therefore critical in any implementation.
In Alpha-beta pruning, there are two types of move order:
- Worst ordering: in some cases, the algorithm prunes none of the nodes and behaves just like the standard minimax algorithm, so the alpha and beta bookkeeping costs time without producing any savings. This is known as the worst ordering, and it occurs when the best move lies on the right side of the tree (examined last).
- Ideal ordering: in other cases, the algorithm prunes a large number of nodes. This is known as ideal ordering, and it occurs when the best move lies on the left side of the tree: since the search is depth-first, the left side is explored first, and with this ordering alpha-beta can search roughly twice as deep as plain minimax in the same amount of time.
Rules for finding a good ordering
- Try to have the best move come from the shallowest node possible.
- Use your domain knowledge when deciding which moves are likely to be best.
- Order the child nodes so that the most promising ones are examined first (a minimal sketch of this idea follows below).
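As a rough sketch of how this could look in Java, and assuming the hypothetical AlphaBeta.Node class from the implementation sketch above, each node’s children could be sorted with a cheap heuristic guess just before the loop in evaluate() recurses into them. scoreGuess here is a placeholder for whatever domain knowledge you have (captures first in chess, centre squares first in tic-tac-toe, and so on).

import java.util.Comparator;
import java.util.List;

class MoveOrdering {

    // Placeholder heuristic: any cheap estimate of how promising a child looks.
    static int scoreGuess(AlphaBeta.Node child) {
        return child.isLeaf() ? child.value : 0;
    }

    // Call this on node.children just before looping over them in evaluate().
    static void orderChildren(List<AlphaBeta.Node> children, boolean maximizingPlayer) {
        Comparator<AlphaBeta.Node> byGuess = Comparator.comparingInt(MoveOrdering::scoreGuess);
        // MAX examines its most promising children first, MIN its least promising,
        // so that cut-offs are triggered as early as possible.
        children.sort(maximizingPlayer ? byGuess.reversed() : byGuess);
    }
}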
Time and Space Complexity
In the best case, each node only needs to examine 2b − 1 grandchildren to decide its value; in the worst case, it examines all b² grandchildren. This essentially means that, with good ordering, the overall algorithm examines O(b^(d/2)) nodes, the same as a plain minimax search whose depth cutoff is half of d. For example, with a branching factor b = 35 (roughly chess) and depth d = 8, that is about 1.5 × 10⁶ nodes instead of roughly 2.3 × 10¹². In practice this is significant.
Games are appealing, and writing game-playing AI programs is possibly even more so. Just as we wouldn’t expect a racing car to perform flawlessly on a bumpy road, we shouldn’t expect game-playing algorithms to perform flawlessly in every situation. However, with proper implementation, it has the potential to create a formidable competitor.
Try to implement the algorithm and have fun with AI; trust me, it won’t disappoint you!
Hope this helps. Share your thoughts too.