sorting

The sort filter sorts games in the PGN output file. Normally, games appear in the PGN output file in the order they appear in the PGN input file. When the cql file contains a sort filter, however, the output games appear in sorted order.

The basic syntax of sort is just

      sort value

where value is a numeric filter or string filter.

The sort can be followed by a documentation string, a quoted string that documents the meaning of the value on which the sort is performed:

      sort documentation_string value

For example:

      sort "white material" power A
      sort "material difference" abs power A - power a
      sort "move number" movenumber

For simplicity, the examples below mostly omit documentation strings, but we recommend always using them when sorting inside a real CQL file, as they add clarity to the output.

The effect of the sort filter is that its value filter is executed as usual. However, the maximum value that the value filter ever obtained during the game is remembered.

After all the games are matched using CQL, the games in the output PGN file are sorted by these maximum values. A comment before the first move of the game indicates the maximum value of that sort filter. If there is a documentation string, that documentation will be used in the comment.
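
The two steps above (remember the maximum per game, then order the games by it) can be modeled in a few lines of Python. This is only a sketch of the described behavior, not CQL's implementation; the function name sort_by_max and the sample values are made up for illustration.

```python
def sort_by_max(games):
    """Model of 'sort value': keep each game's maximum, then order descending.

    games is a list of (name, values) pairs, where values holds the value
    of the sorted filter at each position in which it matched.
    (Hypothetical data layout, chosen for this illustration only.)
    """
    best = [(name, max(values)) for name, values in games]
    # Larger maxima come first, as in the PGN output file.
    return sorted(best, key=lambda pair: pair[1], reverse=True)


games = [
    ("game A", [1, 40, 12]),
    ("game B", [3, 95]),
    ("game C", [60]),
]
result = sort_by_max(games)
```

Here result lists game B first, since 95 is the largest value any game reached.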

For example, to sort all the games by decreasing length, you can use

      cql (input foo.pgn)
       sort ply

The sort filter will be evaluated at each position. Only the position with maximum ply in each game will contribute to the final value of the sort filter. This is the ply value of the terminal position in a game, that is, the number of ply in the game. By sorting this way, the longest games are output first.

Of course, only the games that actually match are output. For instance, to sort all games that end in white stalemate by the material power of the white side at the time of the stalemate, you could use:

      cql(input foo.pgn)
       stalemate
       sort power A

The sort filter itself is a numeric or string filter whose value is the value of its value filter in a given position. If the value filter does not match the position, neither does the sort filter.

Suppose you want to sort by the maximum ply value but you only want to consider games with at least 200 ply. Then you could use

     200 <= sort ply

Whenever the ply of a position is smaller than 200, the <= filter will fail and the position will not be considered in the sort. You could also write

     sort ply >= 200

This is parsed as

     {sort ply} >= 200

which has the same effect.

A sort of a set filter is converted to a sort on the cardinality of that set. That is,

      sort setfilter

is equivalent to

      sort #setfilter

where setfilter is a set filter.

For example, the following CQL file will sort games with at least four queens in the same position by the maximum number of queens in such a position:

   cql (input i.pgn)
   sort
     [Qq] >= 4

Because sort itself is numeric, the above code is equivalent to:

   cql (input i.pgn)
   {sort #[Qq]} >= 4

sort with strings

New in CQL 6.1 is the ability to sort by string values. For example, to sort alphabetically by the white player, use
  sort "White player" player white

sort can also be used to sort by date, since the value of the date filter is a string that can be sorted:

  sort "Date" date

Where available, one can sort by the UTCDate:

  sort "UTCDate" tag "UTCDate"

This offers finer-grained resolution than date. It can be used, for example, to sort games from sites like lichess, where many games by the same player are played on the same day and one wants to sort these within a day by the time of day at which they were played.

Value of unmatched filter

Occasionally a sort filter will not match a position even though the game as a whole matches the .cql file. CQL uses a special value for unmatched sort filters. This value is always sorted after any values corresponding to matched filters; it is considered "worse" than any matched value.

Suppose, for example, we want to sort all games in which Kasparov was a player by descending date:

  cql(input data.pgn)
  player "Kasparov"
  sort "Date" date

In this example, every matched game will have Kasparov as a player. But suppose we now want to include all the other games as well, just not sorted. We can do:

  cql(input data.pgn)
  if player "Kasparov"
     sort "Date" date

This will match every game in the database (because if the test clause of an if filter fails, and there is no else, the if filter will match). But only games in which Kasparov was a player will actually have a corresponding sort value. The other games will appear after all the Kasparov games in the output, but those other games will not be sorted by date. They will just appear in the order they appeared in the input .pgn file.

(Note that the same effect would be achieved by:

  cql(input data.pgn)
  {sort "Date" 
    {player "Kasparov"
      date}}
  or true
).

Now suppose the sort was by increasing date:

  sort min "Date" date

Here, the Kasparov games would appear by increasing date, but again all the non-Kasparov games would appear afterwards, in their original database order. It is as if each of these unmatched filters were supplied a special date value that was worse than any other date, no matter whether the sort was increasing or decreasing.
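
The ordering described in this section can be modeled in Python as follows. This is a sketch of the documented behavior only, not of how CQL represents the special unmatched value internally. The function order_games, the use of None for an unmatched sort filter, and the game labels and dates are all assumptions made for the illustration.

```python
def order_games(games, ascending=False):
    """Order (name, value) pairs; value is None if the sort filter never matched.

    Matched games are sorted by value in the requested direction.
    Unmatched games always follow them, in their original input order.
    """
    matched = [g for g in games if g[1] is not None]
    unmatched = [g for g in games if g[1] is None]
    matched.sort(key=lambda g: g[1], reverse=not ascending)
    return [name for name, _ in matched + unmatched]


# Made-up input order: two games with a date value, two without.
games = [
    ("other1", None),
    ("K1", "1985.11.09"),
    ("other2", None),
    ("K2", "1993.09.07"),
]
descending = order_games(games)                # like: sort "Date" date
ascending = order_games(games, ascending=True) # like: sort min "Date" date
```

In both directions the two games without a value come last and keep their input order; only the order of K1 and K2 changes.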

quiet parameter

If sort is followed by the keyword quiet, the comment before the first move that sort normally outputs is suppressed:
  sort quiet A attacks a

Multiple sorts

You can sort by multiple filters.

When multiple filters are sorted, the output is sorted by each filter in the order the filters occur in the CQL file. For example, the following CQL file sorts by the maximum number of White pins in a position, and then by year. That is, if two games have the same number of pins, then they will be output together with the most recent first:

    cql(input i.pgn)
     sort "Number of pins" pin
     1800 < sort "Year" year
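
Sorting by several filters in file order amounts to a lexicographic comparison of the per-game values. A minimal Python model, assuming each game has already been reduced to its maximum pin count and its year (the tuples below are invented sample data):

```python
# (name, max number of pins, year): hypothetical per-game sort values.
games = [
    ("g1", 2, 1995),
    ("g2", 3, 1960),
    ("g3", 2, 2010),
    ("g4", 3, 1999),
]

# Compare by pins first, then by year; both in descending order.
ordered = sorted(games, key=lambda g: (g[1], g[2]), reverse=True)
names = [g[0] for g in ordered]
```

The games with three pins come first, and within each pin count the more recent game precedes the older one.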

Sorting under transforms/multiple sorts with the same documentation string

All sorts with the same (nonempty) documentation strings are combined in the sense that the maximum from any of them is used as a sort.

Thus, to sort on the maximum of the numbers of queens and rooks in a position, you could use

     sort "maxqr" [Qq]
     sort "maxqr" [Rr]

As a consequence, sort behaves as expected when it is transformed. For example, to sort on the maximum number of either white or black queens in a position, use

     flipcolor sort "maxqs" Q
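
The combining rule for equal documentation strings can be pictured as keeping one running maximum per string. The Python sketch below models that rule under this assumption; the function combine and the observed values are hypothetical and are not taken from CQL itself.

```python
def combine(observations):
    """Keep the largest value seen for each documentation string.

    observations is a list of (documentation_string, value) pairs, one for
    each time some sort filter matched during a game.
    """
    combined = {}
    for doc, value in observations:
        combined[doc] = max(value, combined.get(doc, value))
    return combined


# Two sort filters share the string "maxqr"; a third uses "Year".
observations = [("maxqr", 2), ("maxqr", 4), ("maxqr", 3), ("Year", 1999)]
result = combine(observations)
```

The game ends up with a single "maxqr" value, the largest reported by either filter, alongside its separate "Year" value.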

Sorting by matchcount

To sort by the matchcount (the number of positions in the game that matched, assuming this number lies within the range of the matchcount), prefix the matchcount CQL parameter with sort, as in:
  cql(input i.pgn sort matchcount 10 1000)
   check

The above code finds games in which the check filter matches at least 10 positions, that is, in which there are at least 10 checks, and outputs these in order of decreasing number of checks.
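
As a rough model of this behavior in Python: games whose match count falls outside the range are dropped, and the rest are ordered by decreasing count. The function name and sample counts are invented, and treating the upper bound as inclusive is an assumption of this sketch.

```python
def sort_by_matchcount(counts, low, high):
    """Keep games with low <= count <= high, ordered by decreasing count.

    counts is a list of (name, number_of_matching_positions) pairs.
    The inclusive upper bound is an assumption of this sketch.
    """
    kept = [(name, n) for name, n in counts if low <= n <= high]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


counts = [("g1", 4), ("g2", 25), ("g3", 10), ("g4", 1200)]
result = sort_by_matchcount(counts, 10, 1000)
```

Only g2 and g3 fall within the range 10 to 1000, and g2 is listed first because it has more matching positions.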

A sort matchcount can only appear in the CQL header, cannot have a documentation string, and cannot be sorted by min.

sorting by minimum instead of the maximum

By default, sort sorts games by the maximum attained value of its argument value filter. However, if the keyword sort is immediately followed by the word min, then games are sorted by the minimum attained value of its argument filter. In this case, the games are sorted in ascending order instead of descending order.

(Note: because min can be either a parameter of sort or a min filter, to sort by a min filter enclose that filter in braces: sort {min(x y)})

An example of sorting by the minimum is in the file pinstalemate.cql. The line:

sort min "material" power .
has the effect of sorting the games in order of increasing material. (Actually, in this example there is another sort by number of pins that comes first, so the games will be sorted first in descending order of the number of pins, and then in ascending order of material.)