# The ~~ operator

The `~~` binary operator is used to determine whether a string matches a regular expression.

The left-hand side of the `~~` filter is a string filter whose value is the string to search within, the `target`.

The right-hand side of the `~~` filter is a quoted regular expression, the `pattern`.

The `~~` operator matches the position if the target matches the pattern. For example, each of the following filters matches the current position:

```
"football" ~~ "f"
"football" ~~ "f.*l"
"football" ~~ "[otba]+ll"
```

Suppose the player playing White in the current game is Kasparov. Then:

```
player white ~~ "Kasparov"
player white ~~ "K.*ov"
```

To check whether either Kotov or Kasparov is playing White or Black, one could use:

`  flipcolor player white ~~ "K(ot|aspar)ov"`

or more simply

`  player ~~ "K(ot|aspar)ov"`

Regexes can be used in this way to query the result of any filter returning a string:

```
event ~~ "Wijk .* Zee"
date ~~ "2004\.03\."
site ~~ "Bel.*m"
```

## Value of ~~ operator

The value of the ~~ operator is the matched string, that is, the sequence of characters in the target that matched the regular expression:
```
Result = "football" ~~ ".*"
Result == "football"
Result2 = "football" ~~ "otb"
Result2 == "otb"
Result3 = "football" ~~ "[otba]+"
Result3 == "ootba"
```

Note that a value can be an empty string, which is different from failing to match.

Thus,

```
RR = "hello" ~~ "z*" // this matches
RR == "" // the value of RR is the empty string
"hello" ~~ "z+" // this filter fails to match
```
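
The difference between an empty value and a failed match can also be checked outside CQL. The sketch below models it with Python's `re` module, used here purely as an analogy: `match_value` is a hypothetical helper, not a CQL function, and CQL's regex engine may differ from Python's in details that these simple patterns do not exercise.

```python
import re

def match_value(target, pattern):
    """Rough model of `target ~~ pattern`: the matched text, or None on failure."""
    m = re.search(pattern, target)
    return m.group(0) if m else None

print(match_value("football", "[otba]+"))  # ootba
print(repr(match_value("hello", "z*")))    # '' (a match whose value is empty)
print(match_value("hello", "z+"))          # None (no match at all)
```

In this model `None` plays the role of "the filter fails to match", while `''` is a successful match whose value is the empty string.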

## Group captures

The `~~` filter sets the values `\0`, `\1`, `\2` and so on to the values of the regex capturing groups, if any. `\0` is the matched string, `\1` is the first capturing group, and so on:

```
"football" ~~ "(o+)tba(l+)"
\0 == "ootball"
\1 == "oo"
\2 == "ll"
```

### Index of a capturing group

If `\i` is a capturing group, then `\-i` is the index (zero-based) within the target string at which this capturing group is located:

```
"football" ~~ "(o+)tba(l+)"
\-0 == 1
\-1 == 1
\-2 == 6
```
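
As a cross-check of the group values and their zero-based indices, the same pattern can be run through Python's `re` module. This is an analogy only: `group(i)` and `start(i)` are Python's counterparts of `\i` and `\-i`, not part of CQL.

```python
import re

m = re.search(r"(o+)tba(l+)", "football")

# Python counterparts of \0, \1, \2
groups = [m.group(i) for i in range(3)]
# Python counterparts of \-0, \-1, \-2 (zero-based start offsets)
starts = [m.start(i) for i in range(3)]

print(groups)  # ['ootball', 'oo', 'll']
print(starts)  # [1, 1, 6]
```

The whole match and the first group both begin at index 1, because the first group `(o+)` is the leftmost part of the match.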

For getting the index of a string inside another string more generally, use `indexof`.

## Extracting numbers using ~~

You can use `~~` to extract numbers from strings using the `int` filter. For example, suppose you have a string that contains among other things a substring "Eval: 43" where 43 is any number. You can get that value as follows:

```
Target = "Blunder: Eval: 43"
Target ~~ "Eval: (\d+)"
Val = int \1
```
The variable `Val` will have value 43. If `Target` had no such matching substring, the `~~` would not have matched and `Val` would not be changed.
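
The same extraction pattern, sketched in Python for comparison. The helper `extract_eval` and its `default` parameter are hypothetical names introduced for illustration; returning `default` stands in for "`Val` is not changed" when nothing matches.

```python
import re

def extract_eval(target, default=None):
    """Return the integer after 'Eval: ', or `default` if there is none."""
    m = re.search(r"Eval: (\d+)", target)
    # group(1) is the text captured by (\d+); int() converts it to a number
    return int(m.group(1)) if m else default

print(extract_eval("Blunder: Eval: 43"))          # 43
print(extract_eval("No score here", default=0))   # 0
```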

## Using ~~ with while

`~~` is treated specially when used as the test of a `while` filter (using a syntax borrowed from Perl):

`  while (lhs ~~ regex) body`

Here, `lhs` is a string filter; `regex` is a quoted string; `body` is any filter.

Initially, `lhs` will be evaluated to get a string, the `target`. The regular expression `regex` will be successively matched from left to right across the string, with `body` being evaluated after each match.

This kind of `while` filter will match any position, unless the `lhs` failed to match.

Let's call a "square string" a two-character string denoting a square, like `"a4"`.

For example, this function counts the number of square strings in a string:

```
function CountSquares(Arg){
  NumSquares=0
  while(Arg~~"[a-h][1-8]")
    NumSquares+=1
  NumSquares //return number of square strings
}
```

We could apply this function to different strings:

```
CountSquares("No squares")==0
CountSquares("One c6 square")==1
CountSquares("Foura1d3squae8c7")==4
```
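
For readers more familiar with other regex tools, the left-to-right scan performed by this form of `while` resembles iterating over successive non-overlapping matches. A rough Python analogue, assuming that resemblance holds for this pattern (`count_squares` is a hypothetical name, not a CQL function):

```python
import re

def count_squares(text):
    """Count successive left-to-right matches of a square string."""
    count = 0
    for _ in re.finditer(r"[a-h][1-8]", text):
        count += 1  # plays the role of the `while` body
    return count

print(count_squares("No squares"))        # 0
print(count_squares("One c6 square"))     # 1
print(count_squares("Foura1d3squae8c7"))  # 4
```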

Suppose we wanted to count the number of distinct square strings in a string.

The `makesquare` filter can take a single string as an argument and return a square. If we `|` all these squares together and count the number of squares in the result, we will get the number of distinct squares:

```
function CountDistinctSquares(Arg){
  Squares=~. //the empty set

  while(Arg~~"[a-h][1-8]")
    Squares |= makesquare \0

  #Squares
}
```

Note how `\0` above refers to the currently matched string, in this case, the two-character string denoting a single square.

`  CountDistinctSquares("Two: a2a1a1a2") == 2`
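
The idea of accumulating each match into a set and then counting the set can be sketched in Python as follows. This is only an analogy: a Python set of strings stands in for the CQL set of squares, and `count_distinct_squares` is a hypothetical name.

```python
import re

def count_distinct_squares(text):
    """Collect each matched square string into a set, then count the set."""
    squares = set()                      # starts empty, like Squares=~.
    for m in re.finditer(r"[a-h][1-8]", text):
        squares.add(m.group(0))          # group(0) corresponds to \0
    return len(squares)

print(count_distinct_squares("Two: a2a1a1a2"))  # 2
```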

For another example of the use of `while` with `~~`, see the `~~` form of `while`.

## Precedence

The `~~` filter has lower precedence than `+`, so `X+Y ~~ "tba"` is parsed as `(X+Y) ~~ "tba"`:
```
X="foot"
Y="ball"
X+ (Y ~~ "tba") == "tba" // false
(X+Y) ~~ "tba" == "tba" //true
X+Y ~~ "tba" == "tba" //true, same as above
```

(As usual, we recommend using parentheses or braces to clarify the meaning when in doubt about precedence.)

## Matching multiline targets

There are a few special considerations involved in matching multiline strings.

If the `target` does not contain the newline character, then `^` matches the beginning of the target and `$` matches the end of the target. If the `target` contains the newline character, then on some platforms `^` matches the beginning of the line while `$` matches the end of the line. Unfortunately, we do not know when this inconsistency will be fixed.

Note that `.` in the pattern never matches a newline. Generally, to match a line of characters in a platform-independent way, one can use something like:

```
Lines="pin" + \n + "mate" + \n + "1-0" + \n
while (Lines~~".*"){
  CurrentLine=\0
  // Now the variable CurrentLine holds the current line,
  // without the trailing \n
}
```

Also note that on Windows, ends of lines are typically indicated by the two characters `\r` and `\n` (on Linux and Mac, just `\n` is used). This is unlikely to cause much confusion in practice, but Windows users should be aware of the issue when parsing multiline strings.

## Matching quotation mark characters

To search for a regular expression which contains the character `"`, use `\x22`:
```
Target = "Tal said: " + \" + "mate" + \"
Target ~~ "\x22mate\x22"
```
In the above example, the string `Target` has the value
`  Tal said: "mate"`

This is because, standing alone, the two-character sequence `\"` stands for a quotation mark in CQL. However, that sequence cannot currently be embedded inside a longer string literal. Therefore, the hexadecimal value of the quotation mark must be used to search for it as a regular expression.
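
The `\x22` escape is ordinary regex syntax for the character with hexadecimal code 22, which is the quotation mark. A small Python check of that claim, offered as an analogy rather than as CQL behavior (Python has no restriction on embedding quotation marks, so the target is built directly):

```python
import re

# Build the same target text as in the CQL example: Tal said: "mate"
target = "Tal said: " + '"' + "mate" + '"'

# \x22 in the pattern matches the quotation mark character
m = re.search(r"\x22mate\x22", target)
print(m.group(0))  # "mate"
```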

## Example

The fen filter documentation shows how to use the `~~` filter to parse FEN strings.