Principal Variation Search

See also minimax with α/β pruning.

Description

The term “principal variation search” is something of a misnomer because pvs is not a distinct search algorithm but simply α/β minimax with an additional pruning rule:

Before searching a successor, attempt to prove that the search will fail low (return a value beneath alpha), and if successful, skip the successor.

We’ll be using the notation described on the page for α/β minimax and extending it slightly.

A node can be an inhabitant of multiple search trees in the sense that, if node n is an inhabitant of tree S and S is a subtree of tree T, then n is an inhabitant of T as well. In our formulation, a search tree is entirely specified by its root, since each node is a structure containing alpha, beta, &c, and the list of its successors.
(In other words, a tree is merely a collection of nodes with a particular property that it is closed under the has-successor relation rather than a pair of collections, viz labelled positions and directed edges.) So although a node and the tree rooted at that node are not quite the same thing, the set of nodes and the set of trees are isomorphic.
We’ll generally prefer to think of there being only a single search tree, however (the transitive closure of the has-predecessor and has-successor relations), and write node.tree to refer to the largest tree that contains node. For a tree U, we’ll write U.root to refer to the node that contains no predecessors in U.

But besides this except in the sense above each node belongs to only one tree; in other words, we will consider the nodes of different searches to be distinct. For example: given some node, the search tree X that is created by search(node.position, −1, 0) and the search tree Y that is created by search(node.position, −1, +1) have no nodes in common, and more pointedly, neither X.root nor Y.root are node; the three are different, even though X.root.position, Y.root.position, and node.position are all identical.

Proving that a search will fail low

The proof that the search of a successor will fail low

node.succ[k].ret < node.alpha[k]

is established by a zero-window search, which has the property of failing low exactly when the search of node.succ[k], a full-window search, will fail low. This implies node.alpha[k+1] = node.alpha[k], and knowing node.alpha[k+1] allows the search of node.succ[k+1] to begin.

The root of this zero-window search is Z(node.succ[k]), defined like so:

Recall that

so for the zero-window search, we’ve effectively lowered node.beta to node.alpha[k].

It’s common for the criterion for cutoff to be ret beta rather than ret > beta, in which case node.beta is effectively lowered to node.alpha[k] + ε rather than node.alpha[k], where ε is the smallest non­zero difference in evaluation, and so you will almost exclusively see
  • Z(node.succ[k]).alpha[0] = node.alpha[k]ε
in implementations instead.

The property above can be stated more precisely as

Z(node.succ[k]).ret < node.alpha[k] implies node.succ[k].ret < node.alpha[k] .

Typically, the same routine is used for full-window and zero-window searches, and this requires only a single bit of bookkeeping. A flag is necessary because you can’t tell just from looking at alpha and beta, because they may differ by zero ( or ε ) only incidentally.

When to attempt a proof

Zero-window pruning is not applicable in a zero-window search because full-window and zero-window searches are equivalent in zero-window trees.

Typically, zero-window pruning is not attempted for the first successor of a node because the first successor is expected to meet or exceed alpha, in which case the proof will fail and the full-window search will have to be performed anyway. Since the zero-window search is expected to be useless, implementations simply begin the full-window search right away.

More correctly, the first successor meets or exceeds alpha in pv and cut nodes with perfect ordering, but not in all nodes. It’d be reasonable, then, to attempt zero-window pruning for every successor of an expected-all node (including the first), but as far as I’m aware, this does not improve playing strength in practice. (We might expect it to be fairly useless since there shouldn’t be many all nodes in a search tree if move ordering is accurate; they would only appear in zero-window trees, as we’ll see below.) If the first successor of an expected-pv node fails to meet alpha, the next successor is predicted to meet or raise alpha instead. Then a search of Z(node.succ[1]) is expected to be useless (just as a search of Z(node.succ[0]) was expected to be useless), which suggests that an implementation oughtn’t attempt zero-window pruning yet but instead simply begin the (full-window) search of node.succ[1] right away (and so on until a successor meets or exceeds alpha). As far as I’m aware, however, this does not improve playing strength in practice.