The alpha-beta method
We saw that the method of Minimax enabled us to go up with certainty the best choice, but we did not solve the problem of the gigantic number of possibilities to study.
Must we study all the solutions? I reassure you immediately, no, there is a miracle method which makes possible to undoubtedly prune in the tree : it is the alpha-beta method.
The principle is the following :
if, on one of your moves, the opponent has a possibility of answer which makes this move bad for you, it is useless to investigate the other possibilities of answer to this same move, because all the better choices that you can find for the opponent will not be chosen by him.
This is perhaps not very clear, so, I will illustrate this with the example which you prepared for this lesson (after the previous lesson).
Let us look at the diagram n° 6 (a) (which is the same one than the 6, for those who would not have noticed it, which would mean that they would perhaps not have been as studious as I hoped).
Diagram n° 6 a
The computer investigates, like we taught to him, the first end of branch. It finds 2, that it puts in C1. The following value is 1, the computer considers it useless to go up it since C1 is a maximizing level and that the current value, 2, is higher. The following value is 5, that it puts in C1 and which replaces the value 2 that it had memorized just before.
The investigation of the branch resulting from C1 is finished, we can temporarily put the value 5 in B1, and studies the branch resulting from C2.
The first value that we find is 6, which temporarily goes up in C2. It is at this point that intervenes the alpha-beta method.
Indeed, C2 is a maximizing " level " (the computer chooses, and takes the best solution). Therefore the value which will be retained will be at least equal to the 6 we just found.
On the other hand, the level B1 is a minimizing level (the adversary chooses, and takes the worst solution for the computer). The value resulting from C2 (6 or more) will never be retained since the value resulting from C1 (5) is lower. So, it is useless to continue the investigation of the other branches resulting from C2.
In the same way, the first branch resulting from C3 gives 8, which implies that the opponent will never go in this direction (he will prefer the branch C1). That eliminates this branch too.
We thus finished the investigation of the branches resulting from B1. And 5 is the value of evaluation which we temporarily put in A.
Now, let us study the branches resulting from B2. We start with C4. The investigation of the 3 values is necessary, and we put in C4 the best value, 4.
B2 is a minimizing level, so the final value will be lower or equal to 4.
On the other hand, A is a maximizing level, which has already a temporarily value equal to 5. So, the value resulting from B2 (4 or less) will be uninteresting, because if the computer goes in this direction, it knows that the opponent can lead it to a position of value 4 (or less), which is worse than if it goes in B1. The branch B2 can be eliminated right now, because less interesting than B1, which, if the opponent answers as well as possible, will lead to a worst position than 5.
In the same way, values 2, 1, 3 coming from C7 make possible to go up 3 in B3, and to eliminate B3 for the same reasons without continuing the investigation of the other parts of this branch.
At the end of the investigation, we have studied 11 positions, out of 27 possible, that is to say a profit of 60%, and we found 16 positions useless to study, their values not changing the final choice. For those which await the result of the exercise of the preceding chapter, I can thus say that 16 is the good answer, and I congratulate the happy winners.
The example that we chose was particularly favorable, since " the good " values were found from the very start (in fact, we must arrive at a number of evaluations between 11 and 27). Alpha-beta always allows a saving of more significant time when the best solutions are examined initially.
This property is used by other algorithms, which try to order the solutions according to their probable value before really evaluating them. Several methods are used for that, but I will quote only two of them :
Method of killer, which consists in studying first the moves that have already, in other branches, authorized good alpha-beta pruning . They are often departures of shots, that we will find several times, except if "the hole" is stopped, and it is natural to try to know if a new move maintains or eliminates the threat which existed before ; if it maintains it, it possible to eliminate the branch quickly.
Iterative search, on which I will come back later, because it has another advantages.
The final profit in the case of an ordered tree is appreciable.
Various tests carried out on DAMES 2020, then on QUIDAM, up to levels 20, make me consider the profit average is about 80% of the computing time.
I advise you, for understanding this fundamental method, to study by yourself several positions with different values in the end (the leaves of the tree), and to compare with evaluations without using alpha-beta, to check that pruning does not change the result.
Other types of pruning are possible, but they are " less logically pure " than alpha-beta.
Some method can nevertheless improve alpha beta, and we will further examine those that we can use before the static evaluation (thus during the dynamic evaluation). We call them preliminary prunings (or cut off).