Expectimax Algorithm in Game Theory
Last Updated: 25 Oct, 2021
The Expectimax search algorithm is a game theory algorithm used to maximize the expected utility. It is a variation of the Minimax algorithm. While Minimax assumes that the adversary (the minimizer) plays optimally, Expectimax does not. This makes it useful for modelling environments where adversary agents are not optimal, or where their actions are based on chance.
Expectimax vs Minimax
Consider a Minimax tree in which the maximizer chooses between two minimizer nodes: the left one has leaves 10 and 10, and the right one has leaves 9 and 100.
Because the adversary agent (the minimizer) is assumed to play optimally, it makes sense to go left: the left sub-tree guarantees 10, while the right one guarantees only 9. But what if the minimizer might make a mistake, that is, not play optimally? In that case going right may be more appealing and may lead to a better outcome.
In the corresponding Expectimax tree, the minimizer nodes are replaced by chance nodes.
The chance nodes take the average of all available utilities, giving us the ‘expected utility’. Thus the expected utilities for the left and right sub-trees are (10+10)/2=10 and (100+9)/2=54.5. The maximizer node chooses the right sub-tree to maximize the expected utility.
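The comparison above can be checked with a few lines of Python. This is only a minimal sketch: the dictionary `subtrees` and the variable names are illustrative, and the chance nodes are assumed to pick each leaf with equal probability.

```python
# Leaf utilities of the example tree: left sub-tree (10, 10), right sub-tree (9, 100).
subtrees = {"left": [10, 10], "right": [9, 100]}

# Minimax: each minimizer node returns the smallest of its leaves.
minimax_values = {name: min(leaves) for name, leaves in subtrees.items()}

# Expectimax: each chance node returns the average of its leaves
# (assumption: all outcomes are equally likely).
expectimax_values = {name: sum(leaves) / len(leaves)
                     for name, leaves in subtrees.items()}

# The maximizer at the root picks the sub-tree with the largest value.
minimax_choice = max(minimax_values, key=minimax_values.get)
expectimax_choice = max(expectimax_values, key=expectimax_values.get)

print(minimax_values, minimax_choice)        # {'left': 10, 'right': 9} left
print(expectimax_values, expectimax_choice)  # {'left': 10.0, 'right': 54.5} right
```

The same leaves lead to opposite decisions: Minimax goes left to secure 10, while Expectimax goes right because the average of 9 and 100 is higher.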
Advantages of Expectimax over Minimax:
- The Expectimax algorithm helps take advantage of non-optimal opponents.
- Unlike Minimax, Expectimax can ‘take a risk’ and end up in a state with a higher utility, since opponents are random (not optimal).
Disadvantages:
- Expectimax is not optimal against an optimal adversary. It may lead to the agent losing (ending up in a state with lower utility).
- Expectimax generally requires the full search tree to be explored. Without known bounds on the utility values, no pruning can be done, because the value of a single unexplored leaf can change the expectimax value drastically. Therefore it can be slow.
- It is sensitive to monotonic transformations of the utility values.
For minimax, if we have two states S1 and S2 and S1 is better than S2, the magnitudes of the evaluation function values f(S1) and f(S2) don’t matter as long as f(S1)>f(S2).
For expectimax, the magnitudes of the evaluation function values matter, because they are averaged at chance nodes.
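To illustrate this sensitivity, here is a small sketch that applies a monotonic transformation (a square root) to the leaves of a two-level tree. The leaf values and the helper names `minimax_choice`, `expectimax_choice` and `transform` are made up for this example, and chance nodes are assumed to be uniform.

```python
import math

def minimax_choice(subtrees):
    # Index of the sub-tree whose worst-case (minimum) leaf is largest.
    return max(range(len(subtrees)), key=lambda i: min(subtrees[i]))

def expectimax_choice(subtrees):
    # Index of the sub-tree whose average leaf value is largest.
    return max(range(len(subtrees)),
               key=lambda i: sum(subtrees[i]) / len(subtrees[i]))

def transform(subtrees, f):
    # Apply f to every leaf utility.
    return [[f(x) for x in leaves] for leaves in subtrees]

original = [[40, 40], [0, 100]]             # hypothetical leaf utilities
rescaled = transform(original, math.sqrt)   # sqrt preserves the ordering of the leaves

print(minimax_choice(original), minimax_choice(rescaled))        # 0 0
print(expectimax_choice(original), expectimax_choice(rescaled))  # 1 0
```

The ordering of the leaves is unchanged by the square root, so Minimax picks the same sub-tree both times. Expectimax switches from the right sub-tree (average 50 versus 40) to the left one (average 5 versus about 6.32).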
Algorithm: Expectimax can be implemented using a recursive algorithm as follows:
- If the current call is a maximizer node, return the maximum of the state values of the node’s successors.
- If the current call is a chance node, return the average of the state values of the node’s successors (assuming all successors have equal probability). If different successors have different probabilities, the expected utility is given by $\sum_i x_i p_i$, where $x_i$ is the value of the i-th successor and $p_i$ is its probability.
- We call the function recursively until we reach a terminal node (a state with no successors), and then return the utility of that state.
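The steps above can also be sketched for trees with any number of successors and non-uniform probabilities. The `Node` layout below (fields `kind`, `utility`, `children`, `probs`) and the helper `leaf` are assumptions made for this illustration, not part of the implementations that follow.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    kind: str                                  # "max", "chance" or "leaf"
    utility: float = 0.0                       # used only by leaves
    children: List["Node"] = field(default_factory=list)
    probs: Optional[List[float]] = None        # chance nodes only; None means uniform

def expectimax(node: Node) -> float:
    # Terminal node: return its utility.
    if not node.children:
        return node.utility
    values = [expectimax(child) for child in node.children]
    # Maximizer node: best successor value.
    if node.kind == "max":
        return max(values)
    # Chance node: probability-weighted sum of the successor values.
    probs = node.probs or [1.0 / len(values)] * len(values)
    return sum(p * v for p, v in zip(probs, values))

def leaf(u: float) -> Node:
    return Node("leaf", utility=u)

root = Node("max", children=[
    Node("chance", children=[leaf(10), leaf(10)]),
    # Hypothetical probabilities: the outcome 9 is three times as likely as 100.
    Node("chance", children=[leaf(9), leaf(100)], probs=[0.75, 0.25]),
])
print(expectimax(root))  # 31.75
```

With these weights the right chance node is worth 0.75*9 + 0.25*100 = 31.75 instead of 54.5, which is still better than the left node's 10.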
Implementation:
C++
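#include <algorithm>
#include <iostream>
using namespace std;

// A node of the binary game tree; only leaf values are used as utilities
struct Node {
    int value;
    Node *left, *right;
};

// Creates a new node with the given value and no children
Node* newNode(int v)
{
    Node* temp = new Node;
    temp->value = v;
    temp->left = NULL;
    temp->right = NULL;
    return temp;
}

// Returns the expectimax value of the subtree rooted at node
float expectimax(Node* node, bool is_max)
{
    // Terminal node: return its utility
    if (node->left == NULL && node->right == NULL) {
        return node->value;
    }

    // Maximizer node: choose the child with the larger value
    if (is_max) {
        return max(expectimax(node->left, false),
                   expectimax(node->right, false));
    }

    // Chance node: average of the children (equal probabilities)
    return (expectimax(node->left, true)
            + expectimax(node->right, true))
           / 2.0;
}

int main()
{
    // Build the example tree: non-leaf values are placeholders
    Node* root = newNode(0);
    root->left = newNode(0);
    root->right = newNode(0);

    // Leaf utilities
    root->left->left = newNode(10);
    root->left->right = newNode(10);
    root->right->left = newNode(9);
    root->right->right = newNode(100);

    float res = expectimax(root, true);
    cout << "Expectimax value is " << res << endl;
    return 0;
}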
Java
class GFG {

    // A node of the binary game tree; only leaf values are used as utilities
    static class Node {
        int value;
        Node left, right;
    }

    // Creates a new node with the given value and no children
    static Node newNode(int v)
    {
        Node temp = new Node();
        temp.value = v;
        temp.left = null;
        temp.right = null;
        return temp;
    }

    // Returns the expectimax value of the subtree rooted at node
    static float expectimax(Node node, boolean is_max)
    {
        // Terminal node: return its utility
        if (node.left == null && node.right == null) {
            return node.value;
        }

        // Maximizer node: choose the child with the larger value
        if (is_max) {
            return Math.max(expectimax(node.left, false),
                            expectimax(node.right, false));
        }

        // Chance node: average of the children (equal probabilities)
        return (float) ((expectimax(node.left, true)
                         + expectimax(node.right, true))
                        / 2.0);
    }

    public static void main(String[] args)
    {
        // Build the example tree: non-leaf values are placeholders
        Node root = newNode(0);
        root.left = newNode(0);
        root.right = newNode(0);

        // Leaf utilities
        root.left.left = newNode(10);
        root.left.right = newNode(10);
        root.right.left = newNode(9);
        root.right.right = newNode(100);

        float res = expectimax(root, true);
        System.out.print("Expectimax value is " + res + "\n");
    }
}
Python3
# A node of the binary game tree; only leaf values are used as utilities
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


# Creates a new node with the given value and no children
def newNode(v):
    return Node(v)


# Returns the expectimax value of the subtree rooted at node
def expectimax(node, is_max):
    # Terminal node: return its utility
    if node.left is None and node.right is None:
        return node.value

    # Maximizer node: choose the child with the larger value
    if is_max:
        return max(expectimax(node.left, False),
                   expectimax(node.right, False))

    # Chance node: average of the children (equal probabilities)
    return (expectimax(node.left, True) + expectimax(node.right, True)) / 2


if __name__ == '__main__':
    # Build the example tree: non-leaf values are placeholders
    root = newNode(0)
    root.left = newNode(0)
    root.right = newNode(0)

    # Leaf utilities
    root.left.left = newNode(10)
    root.left.right = newNode(10)
    root.right.left = newNode(9)
    root.right.right = newNode(100)

    res = expectimax(root, True)
    print("Expectimax value is " + str(res))
C#
using System;

class GFG {

    // A node of the binary game tree; only leaf values are used as utilities
    class Node {
        public int value;
        public Node left, right;
    }

    // Creates a new node with the given value and no children
    static Node newNode(int v)
    {
        Node temp = new Node();
        temp.value = v;
        temp.left = null;
        temp.right = null;
        return temp;
    }

    // Returns the expectimax value of the subtree rooted at node
    static float expectimax(Node node, bool is_max)
    {
        // Terminal node: return its utility
        if (node.left == null && node.right == null) {
            return node.value;
        }

        // Maximizer node: choose the child with the larger value
        if (is_max) {
            return Math.Max(expectimax(node.left, false),
                            expectimax(node.right, false));
        }

        // Chance node: average of the children (equal probabilities)
        return (float) ((expectimax(node.left, true)
                         + expectimax(node.right, true))
                        / 2.0);
    }

    public static void Main(String[] args)
    {
        // Build the example tree: non-leaf values are placeholders
        Node root = newNode(0);
        root.left = newNode(0);
        root.right = newNode(0);

        // Leaf utilities
        root.left.left = newNode(10);
        root.left.right = newNode(10);
        root.right.left = newNode(9);
        root.right.right = newNode(100);

        float res = expectimax(root, true);
        Console.Write("Expectimax value is " + res + "\n");
    }
}
Javascript
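<script>
// A node of the binary game tree; only leaf values are used as utilities
class Node
{
    constructor(v) {
        this.left = null;
        this.right = null;
        this.value = v;
    }
}

// Creates a new node with the given value and no children
function newNode(v)
{
    return new Node(v);
}

// Returns the expectimax value of the subtree rooted at node
function expectimax(node, is_max)
{
    // Terminal node: return its utility
    if (node.left == null && node.right == null) {
        return node.value;
    }

    // Maximizer node: choose the child with the larger value
    if (is_max) {
        return Math.max(expectimax(node.left, false),
                        expectimax(node.right, false));
    }

    // Chance node: average of the children (equal probabilities)
    return (expectimax(node.left, true)
            + expectimax(node.right, true))
           / 2.0;
}

// Build the example tree: non-leaf values are placeholders
let root = newNode(0);
root.left = newNode(0);
root.right = newNode(0);

// Leaf utilities
root.left.left = newNode(10);
root.left.right = newNode(10);
root.right.left = newNode(9);
root.right.right = newNode(100);

let res = expectimax(root, true);
document.write("Expectimax value is " + res);
</script>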
Output:
Expectimax value is 54.5
Time complexity: O(b^m)
Space complexity: O(b*m), where b is the branching factor and m is the maximum depth of the tree.
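As a rough check of how the work grows, the sketch below counts the leaf evaluations of a depth-first traversal over a complete tree with branching factor b and depth m, and records the deepest recursion level reached. The function `count_calls` is a hypothetical helper written for this illustration; it does not evaluate any utilities.

```python
def count_calls(b, m):
    """Return (leaf_evaluations, deepest_level) for a complete tree with
    branching factor b and depth m, visited depth-first."""
    def visit(depth):
        # A node at depth m is a leaf and costs one evaluation.
        if depth == m:
            return 1, depth
        leaves, deepest = 0, depth
        for _ in range(b):
            child_leaves, child_deepest = visit(depth + 1)
            leaves += child_leaves
            deepest = max(deepest, child_deepest)
        return leaves, deepest
    return visit(0)

for m in range(1, 5):
    print(m, count_calls(2, m))
```

For b = 2 the number of leaf evaluations doubles with every extra level (2, 4, 8, 16), while the recursion never goes deeper than m frames at a time.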
Applications: Expectimax can be used in environments where the actions of one of the agents are random. Following are a few examples:
- In Pacman, if we have random ghosts, we can model Pacman as the maximizer and the ghosts as chance nodes. The utility values will be the values of the terminal states (win, lose or draw) or, when the search is cut off at a given depth, the evaluation function values of the states at that depth.
- We can create a Minesweeper AI by modelling the player agent as the maximizer and the mines as chance nodes.
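For games like these the full tree is usually too large to search, so the recursion is cut off at a fixed depth and an evaluation function scores the states reached. The sketch below shows that idea on a made-up game: `ToyGame` and its methods (`actions`, `result`, `is_terminal`, `evaluate`) are hypothetical stand-ins for a real game interface, and the random opponent is assumed to choose uniformly.

```python
def expectimax(state, depth, is_max, game):
    # Cut-off or terminal state: fall back to the evaluation function.
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)
    values = [
        expectimax(game.result(state, action), depth - 1, not is_max, game)
        for action in game.actions(state, is_max)
    ]
    # Maximizer picks the best successor; chance node averages uniformly.
    return max(values) if is_max else sum(values) / len(values)


class ToyGame:
    """Hypothetical game: the state is an integer score. The agent adds
    1 or 3, then a random opponent subtracts 0 or 2."""

    def actions(self, state, is_max):
        return [1, 3] if is_max else [0, -2]

    def result(self, state, action):
        return state + action

    def is_terminal(self, state):
        return state >= 10

    def evaluate(self, state):
        return float(state)


print(expectimax(0, 2, True, ToyGame()))  # 2.0
```

With a depth of 2 the agent compares adding 1 (expected score 0) with adding 3 (expected score 2) and prefers the latter. A larger depth looks further ahead at the cost of exponentially more evaluations.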