The Minimax method



We saw that it was necessary to have good pruning tools, to correctly use a decision tree which becomes more complicated and denser than a forest.

Let us start with the method, which will avoid us finding us with billion figures which we will be unable to use.



The method universally used (checkers, chess, go, othello, bridge...) is called Minimax.

Minimax schematizes the decision-making process which should follow any player : it supposes that each player takes at each turn the decision which is the best for him (this is the basic axiom of all the game theory).

This means that the computer, in the opposite of what is sometimes believed, will not try to let traps in the hope that you fall inside. On the other hand, it will take and maintain positions strong for him, which can seem like traps for you.

Let us examine a precise case, to understand how "thinks" the computer.



Diagram n° 3

In this flow chart which stops on the level 1, we suppose that the initial position of the computer is identified by letter A, and that the computer has 3 possibilities of moves, which respectively bring it to the positions B1, B2, B3. We associated with these positions the value which the computer finds by a static evaluation. You see that the computer can choose between the values 5, 10, 3, found in B1, B2 and B3. It will naturally take the most advantageous solution for it, i.e. that corresponding to the highest value (here 10 in B2).

That's easy for now !



Let us examine now a flow chart of level 2:

Diagram n° 4

There, the evaluations in B1, B2, B3 do not have significance, since, after the following moves (leading to the positions C1 with C 9) other values are found, the piece having moved. Your vision in the future thus goes further, and it is this future that it is necessary to examine.

The computer has to choose between the moves leading to B1, B2 and B3. If it plays the move that leads to B1, it id its opponent who will choose the answer, leading to position C1, C2 or C3.

In the same way, if the computer plays the move leading to B2, the opponent chooses between C4, C5 and C6,and if the computer chooses B3, the opponent chooses between C7, C8 and C9. As the opponent is not there to make some gift, and if it have the same criterions of evaluation than the computer (it is that we are obliged to suppose), it will choose the solution which is the better for him, and the worse for the computer. So, the computer will never reach inviting positions like C1 or C9.



Thus, if the computer plays the move leading to the position B1, the opponent will answer by the move leading to C3 which gives a value of 3 ; if the computer chooses B2, the adversary will answer by the position C4 which gives a value of 5, if the computer chooses B3, he will answer by C8, which gives a value of 2. So we have a flow chart like that one :



Diagram n° 5

Between the 3 values corresponding to B1, B2 and B3, the computer will choose the one which gives him the most points, and it will be the way B2, which will bring back 5 to him, its adversary being brought to answer him by the move leading to the position C4.

In the same way, if we continue our reasoning on level 3, we would see that the choice of the following move would be given to the computer :



 

Diagram n° 6

Let us go to the position C1. The computer has the choice between 3 possibilities, bringing it to 3 positions (which I

could have called D1, D2, D3), with respective values 2, 1, 5. It deduces from it that, if it is brought to find itself in the position C1, the best move than it will be able to play will lead it to a position which has a value of 5 points. It can now assign this value of 5 to the position C1. In the same way, it can assign 11 to C2, and 8 to C3.

Now, in the position B1, it will be the opponent which will choose between C1, C2 and C3.

And the opponent will choose the least advantageous solution for the computer, i.e. The one leading to C1. So, we can affect 5 to B1. Then, in the same way, we will affect value 2 in B2 (from C6), and 3 in B3 (from C7). The best choice for the computer is thus the branch B1, which gives him a score of 5, which comes from C1 (choice probably imposed by the opponent if it plays well).

Thus, the process used by the program corresponds to the alternation of the decisions (computer / opponent), and alternates the levels known as " maximizing ", where the choice of the move to be played belongs to the computer, and levels known as " minimizing ", where the choice of the move belongs to its opponent. The solution going up to the top is, with certainty, the best choice initial, taking into account the best possible answers of the opponent in each case.

You see that only the evaluations of end of branch (leaves) are necessary, the results going up then at the top by alternation of maximum and minimum.



Minimax gives the possibility of examining with method the complex tree of the decisions, but it does not prune any branch. We will approach the methods of pruning further.



However, to prepare the next lesson effectively, I invite you to a small exercise which we will correct together, and which will make you better understand the following concept: the splendid one, the essential algorithm alpha-beta.



From the diagram n° 6:



1) First, check that, if you assign any values to the 3 branches resulting from C5, it modifies nothing in the choice of the computer.

2) Try to say the number of branches which can thus take any value without changing the final choice of the computer (value 5, by B1 and C1).



When we will have determined that there are branches which can take any value without changing the solution, we will be able to think that it would have been better to not examine these branches, by using an adequate pruning.

* The alpha-beta method


* Back to first page