relation LCA parameters

An LCA parameter is a type of relation parameter that which matches a source/target pair depending on the sequence of moves that lead to the source and to the target. More precisely, given any two positions in the PGN file, there is a unique position, called their latest common ancestor, or LCA. The LCA is always an ancestor of both the source and the target

To determine if an LCA parameter matches a source and target, first the LCA of the source and target is computed. Next, two sequences of moves are computed: from the LCA to the source; and from the LCA to the target. Depending on the length or composition of these sequences of moves, the LCA parameter matches (or fails to match) the target and source). The LCA is defined below, and the syntax and meaning of the LCA parameters are defined in the following table:

Syntax of LCA parameterMeaning
(lcamax range) maximum of the distances of source and target to LCA is in range
(lcasum range) sum of distances of source and target to LCA is in range
(lcasource range) distance between source and LCA is in range
(lcatarget range) distance between target and LCA is in range
(ancestor) target is an ancestor of source
(descendant) target is a descendant of source
(lcasubstring range) length of the longest common sequence of consecutive moves from LCA to source, and from LCA to target, is in range

Latest Common Ancestor

The latest common ancestor of two positions in a PGN file is the latest position in the game that is an ancestor of both of the positions.

In this definition, a position X is an ancestor of a position Y if there is a sequence of one or more consecutive moves in the PGN file beginning at X and reaching Y. We say Y is a descendant of X if X is an ancestor of Y.

A position X is thus the latest common ancestor of positions Y and Z if:

  • X is an ancestor of Y
  • X is an ancestor of Z
  • X is the latest such common ancestor: if some other position X' is also an ancestor of both Y and Z, then X' is also an ancestor of X.

As a note, by "position" here we don't just mean the board position; we mean the exact position at a particular location within the PGN file. The same board position can recur at multiple places in the same PGN file; each of these occurrences is considered a distinct "position".

For example, consider this simple game from the start position :

1. e4 e5 
2. Nf3 Nc6 
3. Bb5 (3. Bc4 Bc5) a6 
4. Ba4 Nf6
Suppose the source is the position after 3. Bc4 in the variation, and suppose the target is the position after 4...Nf6 in the mainline. The LCA of the source and the target is the position after 2...Nc6, because that position leads to both the source and to the target, and no later position also does.

Now suppose the source is the position after 2. Nf3 and the target is as before - the position after 4...Nf6.

Then the LCA is the same as the source, because the source is an ancestor of the target (equivalently, the target is a descendant of the source).

LCA Distance Parameters

All of the LCA parameters except for lcasubstring depend only on the distances between the LCA and the source/target. By distance between two positions is meant the number of moves required to get from one to the other (in the PGN file).

For example, as can be seen from the above table the relation parameter (ancestor) matches a source and target if the target is an ancestor of the source. This is equivalent to requiring that the distance from the LCA to the target equals 0, or (lcatarget 0). This is because:

  • if the target is an ancestor of the source, then the target is the LCA of the source and the target;
  • The distance between the two identical positions is 0
  • (lcatarget 0) matches if the distance between the LCA and the target is 0.

As another example, the lca parameter (lcasum range) will match a source and target if the sum of the distances between the LCA and the target and between the LCA and the source lie within range.

For example, consider the relation filter in the CQL file underpromotion-relation.cql. This example looks for studies that arise from a White underpromotion. The square parameters in that file were discussed here, but the full relation filter also includes an lcasum parameter:

relation 
  variation
    (tomove match)
    (sourcesquares [RNBQ] targetsquares [RNBQ] mismatch 1)
    (mismatch 1)
    (lcasum 20 1000)
The lcasum parameter limits the results to targets for which the sum of the distances between the LCA and the target, and between the LCA and the source, to be at least 20. If we removed this line, we would get technically correct but uninteresting promotions studies in which, for example, the differently promoted pieces were shown just after promotion, on the promotion square. By restricting lcasum to be at least 20, we limit the results to be deeper studies, in which the point of the underpromotion only becomes clear long after the promotion. Thus, the The lcasum parameter ensures that the source and the target are fairly far away in the game tree. The length of the path from the source to the target in this case must be at least 20 (lie within the range 20 1000 , where we tacitly assume that the distance is less than 1000 ply).

The lcasubstring parameter

The lcasubstring parameter is used to find sequences of moves from the LCA to the target and from the LCA to the source that are similar in some sense. Like the other LCA parameters, it takes a single argument, a range: (lcasubstring range)

Let P1 be the sequence of moves that were used to reach the source from the LCA. Let P2 be the sequence of moves that were used to reach the target from the LCA. The longest common substring of these sequences is the length of the longest sequence of consecutive moves that is identical in P1 and P2.

An example of the use of lcasubstring is in zugzwang2.cql. This simple CQL file looks for zugzwangs in which the same position recurs in the mainline and in the variations but with btm in the mainline and wtm in the variation. The relation filter is:

 relation wtm variation
  (mismatch 0)
  (lcasubstring 15 1000)
The two target filters wtm variation of the relation filter ensure that the target is wtm and in a variation. The square parameter (mismatch 0) ensures that the positions are identical. The (lcasubstring 15 1000) parameter requires that the paths of moves from the LCA to the source and to the target share a contiguous sequence of at least 15 identical moves.

The filter therefore finds studies like the following: V. Vlasenko
sp.p Uralsky Problemist - 20 JT
2013
try: after 8..Ke3 solution: after 9.c4

sorting by relation

A relation filter cannot be sorted directly, but you can sort by the value of any LCA parameter by prefacing the name of the LCA parameter with sort or with sort documentation-string. In this case, the output is sorted by the maximum value of that LCA parameter in any matching target.

For example, the example file zugzwang2.cql looks for positions that are black-to-move in the main line, with the same position recurring with white-to-move in the variation. The output is sorted by the length of the longest common contiguous subsequence of moves from the latest common ancestor of the source and the target, if this length is at least 5. The relevant relation filter is:

    relation wtm variation
    (mismatch 0)
    (sort "Common consecutive move sequence" lcasubstring 5 1000)
The other lca parameters: lcasum, lcatarget, lcasource, lcamax can all be sorted in this fashion.