ustring primitive
Signature
ts
function ustring(match: string): Parser<string>
Description
For parsing ASCII-only strings, consider using string.
ustring
parses a Unicode string. Returns the parsed string.
Implementation notes
This parser is very similar to the string parser, except it takes a bit hacky (though performant) approach, that is based on counting length of the given match
string in bytes. It then subslices and compares string slice with that match
string.
It was tested on code points from the Basic Multilingual Plane, but various tests showed that other planes are consumable as well, but that is not guaranteed. If you need guaranteed parsing of code points outside of the BMP, consider using regexp with u
flag.
Usage
ts
const Parser = ustring('语言处理')
Success
Note that the index is 12, which is correct, since every hieroglyph here takes 3 bytes.
ts
run(Parser).with('语言处理')
{
isOk: true,
span: [ 0, 12 ],
pos: 12,
value: '语言处理'
}
Failure
ts
run(Parser).with('语言')
{
isOk: false,
span: [ 0, 0 ],
pos: 0,
expected: '语言处理'
}