chainl composite

Signature

type Fn<L, R> = (left: L, right: R) => L

function chainl<T, L extends T, R>(
  parser: Parser<L>,
  op: Parser<R>,
  fn: Fn<T, R>
): Parser<T>

Description

chainl combinator parses zero or more occurrences of parser, separated by op (in EBNF notation: parser (op parser)*). Returns a value obtained by a recursive left-associative application of fn to the values returned by op and parser. This combinator is particularly useful for eliminating left recursion, which typically occurs in expression grammars.

Examples

Simple calculator

Combinators and parsers used in this section

Combinators: chainl, choice, sequence
Parsers: integer, string

The code below showcases an implementation of a simple calculator that supports addition and subtraction. Read on to see how to make it more useful by adding new operators, grouping, and operator precedence.

function mapBinary(left: number, [op, right]: [string, number]) {
  switch (op) {
    case '+': return left + right
    case '-': return left - right
    default: throw `Unknown operator '${op}'.`
  }
}

const Parser = chainl(
  integer(),
  sequence(
    choice(
      string('+'),
      string('-')
    ),
    integer()
  ),
  mapBinary
)

When you run Parser and feed it with input:

run(Parser).with('10+10-5+15')

You will get the following result:

Success

{
  isOk: true,
  span: [ 0, 10 ],
  pos: 10,
  value: 30
}

So what happens here? Let's unpack, step-by-step.

Consume 10 ['+', 10], eagerly evaluate by applying mapBinary, yield 20.
Consume 20 ['-', 5], eagerly evaluate by applying mapBinary, yield 15.
Consume 15 ['+', 15], eagerly evaluate by applying mapBinary, and finally yield 30.

As you can see, it directly maps to the EBNF notation given above: parser (op parser)*.

Eliminating left recursion

Combinators and parsers used in this section

Combinators: chainl, choice, map, sequence, takeMid, takeRight
Parsers: defer, integer, string

This example builds upon what we had earlier, but this time we will add notion of grouping (with parentheses), a couple of new operators (like multiplication (*) and division (/)), and operator precedence.

Before we see the actual code, let's first come up with some formal representation, a grammar (I'll use the slightly simplified EBNF):

haskell

term
  = number
  = ('+' | '-') term
  = '(' expression ')'

factor
  = term
  = factor ('*' | '/') term

expression
  = factor
  = expression ('+' | '-') factor

Problem

If you look at the highlighted lines, you will notice that factor and expression reference themselves on the left-hand side of the whole production rule, creating left recursion.

Why is this a problem? Well, this is a problem for some parsers, namely recursive descent parsers. Recursive descent parsers will go into an infinite loop if the non-terminal keeps on expanding into itself, as it happens with factor and expression in our grammar. Parser combinators belong to this class of parsers, so you have to take ambiguity, which left recursion brings, into account.

Note

You may have noticed already, that the term production rule refers to itself like the factor and expression production rules do, but it happens on the right-hand side, so it's fine (at least for our grammar), and we don't need to do anything special to deal with it.

Solution

One way to avoid ambiguity is to transform the grammar into an equivalent one that has no left recursion. The other is to use specialized combinators like chainl.

To implement our grammar, we first need some means to represent mutual recursion. We can use the defer parser, that is designed exactly for that.

const Term = defer<number>()
const Factor = defer<number>()
const Expression = defer<number>()

Now we are ready to define parsers for the production rules. Let's start with the parser for the term production rule.

Term.with(
  choice(
    integer(),
    takeRight(choice(string('+'), string('-')), Term),
    takeMid(string('('), Expression, string(')'))
  )
)

That was easy, wasn't it? If you look closely, you will see that the parser definition directly maps to the grammar. This is almost true for factor and expression, except we don't encode these non-terminals in ambiguous fashion.

Factor.with(
  chainl(
    Term,
    sequence(choice(string('*'), string('/')), Term),
    mapBinary
  )
)

Expression.with(
  chainl(
    Factor,
    sequence(choice(string('+'), string('-')), Factor),
    mapBinary
  )
)

As you can see, we have added the multiplication and division operators, so we need to change the mapBinary function from the previous example a little bit.

function mapBinary(left: number, [op, right]: [string, number]) {
  switch (op) {
    case '+': return left + right
    case '-': return left - right
    case '*': return left * right
    case '/': return left / right
    default: throw `Unknown operator '${op}'.`
  }
}

And this is it. Now, when we run our parser and feed it with input:

run(Expression).with('10+10+(2*30)')

We will get the following result:

Success

{
  isOk: true,
  span: [ 0, 12 ],
  pos: 12,
  value: 80
}

Complete example

import { chainl, choice, sequence, takeMid, takeRight } from '@nrsk/sigma/combinators'
import { defer, integer, string, run } from '@nrsk/sigma/parsers'

function mapBinary(left: number, [op, right]: [string, number]) {
  switch (op) {
    case '+': return left + right
    case '-': return left - right
    case '*': return left * right
    case '/': return left / right
    default: throw `Unknown operator '${op}'.`
  }
}

const Term = defer<number>()
const Factor = defer<number>()
const Expression = defer<number>()

Term.with(
  choice(
    integer(),
    takeRight(choice(string('+'), string('-')), Term),
    takeMid(string('('), Expression, string(')'))
  )
)

Factor.with(
  chainl(
    Term,
    sequence(choice(string('*'), string('/')), Term),
    mapBinary
  )
)

Expression.with(
  chainl(
    Factor,
    sequence(choice(string('+'), string('-')), Factor),
    mapBinary
  )
)

console.log(
  run(Expression).with('10+10+(2*30)')
)

chainl composite ​

Signature ​

Description ​

Examples ​

Simple calculator ​

Eliminating left recursion ​

Problem ​

Solution ​