Documentation

Lexer

A Lexer is a stateful stream generator in that every time it is advanced, it returns the next token in the Source. Assuming the source lexes, the final Token emitted by the lexer will be of kind EOF, after which the lexer will repeatedly return the same EOF token whenever called.

The algorithm is O(N) in both memory and time.
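
The behaviour described above (each call to advance yields the next token, and the end of input yields the same EOF token again and again) can be illustrated with a toy. This is a minimal sketch and not this class's implementation: `ToyLexer`, its whitespace-splitting rule, and the array-shaped tokens are all hypothetical.

```php
<?php
// Hypothetical toy lexer: NOT the documented Lexer, only an illustration of
// "a stateful stream that ends in a sticky EOF token".
final class ToyLexer
{
    /** @var string[] Remaining words, split on whitespace (assumption). */
    private array $words;

    /** @var array|null Cached EOF token, created on first use. */
    private ?array $eof = null;

    public function __construct(string $source)
    {
        $parts = preg_split('/\s+/', $source, -1, PREG_SPLIT_NO_EMPTY);
        $this->words = $parts === false ? [] : $parts;
    }

    /** Returns the next token; after the input ends, always the EOF token. */
    public function advance(): array
    {
        if ($this->words !== []) {
            return ['kind' => 'WORD', 'value' => array_shift($this->words)];
        }

        // Input exhausted: keep handing back the same EOF token.
        return $this->eof ??= ['kind' => 'EOF', 'value' => null];
    }
}

$lexer = new ToyLexer('query { field }');
$kinds = [];
for ($i = 0; $i < 6; $i++) {
    $kinds[] = $lexer->advance()['kind'];
}
echo implode(' ', $kinds), PHP_EOL;
```

Each token is produced once and in order, which is consistent with a single linear pass over the input.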

Table of Contents

TOKEN_AMP  = 38
TOKEN_AT  = 64
TOKEN_BANG  = 33
TOKEN_BRACE_L  = 123
TOKEN_BRACE_R  = 125
TOKEN_BRACKET_L  = 91
TOKEN_BRACKET_R  = 93
TOKEN_COLON  = 58
TOKEN_DOLLAR  = 36
TOKEN_DOT  = 46
TOKEN_EQUALS  = 61
TOKEN_HASH  = 35
TOKEN_PAREN_L  = 40
TOKEN_PAREN_R  = 41
TOKEN_PIPE  = 124
$lastToken  : Token
The previously focused non-ignored token.
$line  : int
The (1-indexed) line containing the current token.
$lineStart  : int
The character offset at which the current line begins.
$options  : array<string|int, bool>
$source  : Source
$token  : Token
The currently focused non-ignored token.
$byteStreamPosition  : int
Current cursor position in the source, counted in bytes (the source read as a single-byte stream).
$position  : int
Current cursor position in the source, counted in UTF-8 characters.
__construct()  : mixed
advance()  : Token
lookahead()  : mixed
assertValidBlockStringCharacterCode()  : mixed
assertValidStringCharacterCode()  : mixed
moveStringCursor()  : self
Moves the internal string cursor position.
positionAfterWhitespace()  : mixed
Reads from the source body, starting at the current cursor position, until it finds a character that is neither whitespace nor part of a comment, then moves the cursor to that character.
readBlockString()  : mixed
Reads a block string token from the source file.
readChar()  : array<string|int, string|int>
Reads the next UTF-8 character from the byte stream, starting from $byteStreamPosition.
readChars()  : array<string|int, string|int>
Reads the next $charCount UTF-8 characters from the byte stream, starting from $byteStreamPosition.
readComment()  : Token
Reads a comment token from the source file.
readDigits()  : mixed
Returns a string containing all consecutive digits and moves the string cursor to the first character after them.
readName()  : Token
Reads an alphanumeric + underscore name from the source.
readNumber()  : Token
Reads a number token from the source file, either a float or an int depending on whether a decimal point appears.
readString()  : Token
readToken()  : Token
unexpectedCharacterMessage()  : mixed

Constants

TOKEN_AMP

private mixed TOKEN_AMP = 38

TOKEN_AT

private mixed TOKEN_AT = 64

TOKEN_BANG

private mixed TOKEN_BANG = 33

TOKEN_BRACE_L

private mixed TOKEN_BRACE_L = 123

TOKEN_BRACE_R

private mixed TOKEN_BRACE_R = 125

TOKEN_BRACKET_L

private mixed TOKEN_BRACKET_L = 91

TOKEN_BRACKET_R

private mixed TOKEN_BRACKET_R = 93

TOKEN_COLON

private mixed TOKEN_COLON = 58

TOKEN_DOLLAR

private mixed TOKEN_DOLLAR = 36

TOKEN_DOT

private mixed TOKEN_DOT = 46

TOKEN_EQUALS

private mixed TOKEN_EQUALS = 61

TOKEN_HASH

private mixed TOKEN_HASH = 35

TOKEN_PAREN_L

private mixed TOKEN_PAREN_L = 40

TOKEN_PAREN_R

private mixed TOKEN_PAREN_R = 41

TOKEN_PIPE

private mixed TOKEN_PIPE = 124
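
The constant values line up with the ASCII codes of the punctuation characters their names suggest, which can be checked with `ord()`. The pairing of each name with a character below is inferred from the names and is an assumption; only the numeric values come from the list above.

```php
<?php
// Assumed pairing of punctuation characters with the listed values
// (TOKEN_AMP => '&', TOKEN_AT => '@', and so on, inferred from the names).
$expected = [
    '&' => 38, '@' => 64, '!' => 33, '{' => 123, '}' => 125,
    '[' => 91, ']' => 93, ':' => 58, '$' => 36, '.' => 46,
    '=' => 61, '#' => 35, '(' => 40, ')' => 41, '|' => 124,
];

// Collect every character whose ASCII code differs from the listed value.
$mismatches = [];
foreach ($expected as $char => $code) {
    if (ord((string) $char) !== $code) {
        $mismatches[] = $char;
    }
}

echo $mismatches === []
    ? 'all codes match' . PHP_EOL
    : 'mismatch: ' . implode(' ', $mismatches) . PHP_EOL;
```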

Properties

$lastToken

The previously focused non-ignored token.

public Token $lastToken

$line

The (1-indexed) line containing the current token.

public int $line

$lineStart

The character offset at which the current line begins.

public int $lineStart
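
A line number together with the offset at which that line begins is enough to derive a column for any character offset. The sketch below shows the arithmetic; `locate()` is a hypothetical helper, it assumes "\n" line endings only, and whether this class computes columns in exactly this way is an assumption.

```php
<?php
// Hypothetical helper: for a character offset, return the 1-indexed line,
// the character offset at which that line starts, and the 1-indexed column.
function locate(string $body, int $offset): array
{
    // Split into UTF-8 characters so offsets count characters, not bytes.
    $chars = preg_split('//u', $body, -1, PREG_SPLIT_NO_EMPTY);
    $count = $chars === false ? 0 : count($chars);

    $line = 1;
    $lineStart = 0;
    for ($i = 0; $i < $offset && $i < $count; $i++) {
        if ($chars[$i] === "\n") {
            $line++;
            $lineStart = $i + 1; // the next line begins right after "\n"
        }
    }

    return [
        'line' => $line,
        'lineStart' => $lineStart,
        'column' => $offset - $lineStart + 1,
    ];
}

// Offset 6 is the first "l" of "héllo" on the second line.
print_r(locate("{\n  héllo\n}", 6));
```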

$options

public array<string|int, bool> $options

$token

The currently focused non-ignored token.

public Token $token

$byteStreamPosition

Current cursor position in the source, counted in bytes (the source read as a single-byte stream).

private int $byteStreamPosition

$position

Current cursor position in the source, counted in UTF-8 characters.

private int $position
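
The two cursors only differ once the source contains multi-byte characters. The following sketch walks a short UTF-8 string and tracks both counters side by side; the variable names mirror the two properties, but the loop is purely illustrative and not taken from this class.

```php
<?php
// Walk a UTF-8 string while tracking a character cursor and a byte cursor.
$body = 'aé€b';
$position = 0;            // counted in UTF-8 characters
$byteStreamPosition = 0;  // counted in bytes
$rows = [];

foreach (preg_split('//u', $body, -1, PREG_SPLIT_NO_EMPTY) as $char) {
    $rows[] = [$char, $position, $byteStreamPosition];
    $position += 1;
    $byteStreamPosition += strlen($char); // strlen() counts bytes
}

foreach ($rows as [$c, $p, $b]) {
    echo "$c  position=$p  byteStreamPosition=$b", PHP_EOL;
}
```

In this input "é" takes two bytes and "€" takes three, so by the time the cursor reaches "b" the character offset is 3 while the byte offset is 6.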

Methods

__construct()

public __construct(Source $source[, array<string|int, bool> $options = [] ]) : mixed
Parameters
$source : Source
$options : array<string|int, bool> = []
Return values
mixed

lookahead()

public lookahead() : mixed
Return values
mixed

assertValidBlockStringCharacterCode()

private assertValidBlockStringCharacterCode(mixed $code, mixed $position) : mixed
Parameters
$code : mixed
$position : mixed
Return values
mixed

assertValidStringCharacterCode()

private assertValidStringCharacterCode(mixed $code, mixed $position) : mixed
Parameters
$code : mixed
$position : mixed
Return values
mixed

moveStringCursor()

Moves the internal string cursor position.

private moveStringCursor(int $positionOffset, int $byteStreamOffset) : self
Parameters
$positionOffset : int
$byteStreamOffset : int
Return values
self

positionAfterWhitespace()

Reads from the source body, starting at the current cursor position, until it finds a character that is neither whitespace nor part of a comment, then moves the cursor to that character.

private positionAfterWhitespace() : mixed
Return values
mixed
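
The core of such a skip step can be sketched in one call to `strspn()`. `skipIgnored()` is a hypothetical helper, and the set of skipped characters (space, tab, line terminators and comma, which the GraphQL specification treats as insignificant) is an assumption about what this method ignores; comment handling and line tracking are left out.

```php
<?php
// Hypothetical helper: return the offset of the first byte at or after
// $cursor that is not a space, tab, line terminator or comma.
function skipIgnored(string $body, int $cursor): int
{
    // strspn() measures the run of listed characters starting at $cursor.
    return $cursor + strspn($body, " \t\r\n,", $cursor);
}

echo skipIgnored("  \n\t, name", 0), PHP_EOL;
```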

readBlockString()

Reads a block string token from the source file.

private readBlockString(mixed $line, mixed $col, Token $prev) : mixed

"""("?"?(\\"""|\\(?!=""")|[^"\\]))*"""

Parameters
$line : mixed
$col : mixed
$prev : Token
Return values
mixed

readChar()

Reads the next UTF-8 character from the byte stream, starting from $byteStreamPosition.

private readChar([bool $advance = false ][, int $byteStreamPosition = null ]) : array<string|int, string|int>
Parameters
$advance : bool = false
$byteStreamPosition : int = null
Return values
array<string|int, string|int>
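
Reading one UTF-8 character from a byte offset comes down to inspecting the lead byte to learn the sequence length and then folding in the continuation bytes. The sketch below shows that technique in isolation. `decodeUtf8At()` is a hypothetical function, it assumes well-formed UTF-8, and its return shape of [character, code point, byte length] is an assumption, since the documented return type is only array<string|int, string|int>.

```php
<?php
// Hypothetical stand-alone decoder: returns [character, code point, byte length]
// for the UTF-8 character that starts at byte offset $bytePos.
// Assumes well-formed UTF-8; continuation bytes are not validated.
function decodeUtf8At(string $body, int $bytePos): array
{
    if (!isset($body[$bytePos])) {
        return ['', 0, 0]; // past the end of the input
    }

    $lead = ord($body[$bytePos]);
    if ($lead < 0x80) {                    // 0xxxxxxx: single byte (ASCII)
        $length = 1;
        $code = $lead;
    } elseif (($lead & 0xE0) === 0xC0) {   // 110xxxxx: two-byte sequence
        $length = 2;
        $code = $lead & 0x1F;
    } elseif (($lead & 0xF0) === 0xE0) {   // 1110xxxx: three-byte sequence
        $length = 3;
        $code = $lead & 0x0F;
    } else {                               // 11110xxx: four-byte sequence
        $length = 4;
        $code = $lead & 0x07;
    }

    for ($i = 1; $i < $length; $i++) {
        // Each continuation byte contributes its low six bits.
        $code = ($code << 6) | (ord($body[$bytePos + $i]) & 0x3F);
    }

    return [substr($body, $bytePos, $length), $code, $length];
}

[$char, $code, $bytes] = decodeUtf8At('aé€', 1);
echo "$char U+", strtoupper(dechex($code)), " ($bytes bytes)", PHP_EOL;
```

Returning the byte length alongside the character lets a caller advance a byte cursor and a character cursor by different amounts.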

readChars()

Reads the next $charCount UTF-8 characters from the byte stream, starting from $byteStreamPosition.

private readChars(int $charCount[, bool $advance = false ][, null $byteStreamPosition = null ]) : array<string|int, string|int>
Parameters
$charCount : int
$advance : bool = false
$byteStreamPosition : null = null
Return values
array<string|int, string|int>

readComment()

Reads a comment token from the source file.

private readComment(int $line, int $col, Token $prev) : Token

#[\u0009\u0020-\uFFFF]*

Parameters
$line : int
$col : int
$prev : Token
Return values
Token
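
The comment grammar above translates directly into a PCRE pattern. This is a sketch of the grammar only and not of how this method scans the input: the `\G` anchor, the `u` modifier and the sample text are choices made for the illustration.

```php
<?php
// The comment grammar written as a PCRE pattern. \G anchors the match at the
// given start offset; the "u" modifier makes \x{...} refer to code points.
$commentPattern = '/\G#[\x{0009}\x{0020}-\x{FFFF}]*/u';

$body = "# fetch the hero\nhero { name }";
preg_match($commentPattern, $body, $m, 0, 0);

// The match stops at the line feed, which is outside the character class.
echo $m[0], PHP_EOL;
```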

readDigits()

Returns a string containing all consecutive digits and moves the string cursor to the first character after them.

private readDigits() : mixed
Return values
mixed
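
A digit-reading step of this kind can be sketched as a loop that advances while the current byte is an ASCII digit. `takeDigits()` is a hypothetical helper that returns the new cursor instead of mutating internal state, and it does not model whatever error handling the real method performs when no digit is present.

```php
<?php
// Hypothetical helper: collect consecutive ASCII digits starting at $cursor
// and return [digits, cursor after the last digit].
function takeDigits(string $body, int $cursor): array
{
    $start = $cursor;
    $length = strlen($body);

    while ($cursor < $length) {
        $code = ord($body[$cursor]);
        if ($code < 48 || $code > 57) { // outside "0".."9"
            break;
        }
        $cursor++;
    }

    return [substr($body, $start, $cursor - $start), $cursor];
}

[$digits, $next] = takeDigits('2024-01', 0);
echo "$digits (next offset: $next)", PHP_EOL;
```

Digits are single-byte in UTF-8, so a byte cursor and a character cursor advance by the same amount here.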

readName()

Reads an alphanumeric + underscore name from the source.

private readName(int $line, int $col, Token $prev) : Token

[_A-Za-z][_0-9A-Za-z]*

Parameters
$line : int
$col : int
$prev : Token
Return values
Token
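
The name grammar above can be tried out directly with `preg_match()`. `matchName()` is a hypothetical helper written for this illustration; it only demonstrates the pattern and says nothing about how this method is implemented.

```php
<?php
// Hypothetical helper: return the name that starts exactly at $offset,
// or null if the character there cannot begin a name.
function matchName(string $body, int $offset = 0): ?string
{
    // \G anchors the match at $offset instead of searching forward.
    $found = preg_match('/\G[_A-Za-z][_0-9A-Za-z]*/', $body, $m, 0, $offset);

    return $found === 1 ? $m[0] : null;
}

var_dump(matchName('_user1 { id }')); // a name may start with an underscore
var_dump(matchName('1abc'));          // a name may not start with a digit
```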

readNumber()

Reads a number token from the source file, either a float or an int depending on whether a decimal point appears.

private readNumber(int $line, int $col, Token $prev) : Token

Int: -?(0|[1-9][0-9]*)
Float: -?(0|[1-9][0-9]*)(\.[0-9]+)?((E|e)(+|-)?[0-9]+)?

Parameters
$line : int
$col : int
$prev : Token
Tags
throws
SyntaxError
Return values
Token
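
The Int and Float grammars can be exercised with anchored PCRE patterns. In the sketch below the repetition of digits is written out with `*` and the decimal point and plus sign are escaped, as PCRE requires. `classifyNumber()` is a hypothetical helper that classifies a complete lexeme; it does not reproduce this method's scanning or its SyntaxError handling.

```php
<?php
// Hypothetical classifier: returns 'Int', 'Float' or null for a whole lexeme.
function classifyNumber(string $lexeme): ?string
{
    // Optional minus sign, then either a lone zero or a number that does
    // not start with zero.
    $int = '-?(0|[1-9][0-9]*)';
    if (preg_match('/^' . $int . '$/D', $lexeme) === 1) {
        return 'Int';
    }

    // Integer part followed by an optional fraction and an optional exponent.
    $float = $int . '(\.[0-9]+)?((E|e)(\+|-)?[0-9]+)?';
    if (preg_match('/^' . $float . '$/D', $lexeme) === 1) {
        return 'Float';
    }

    return null;
}

foreach (['0', '-12', '1.5', '1e10', '01', '1.'] as $lexeme) {
    echo $lexeme, ' => ', classifyNumber($lexeme) ?? 'no match', PHP_EOL;
}
```

Under this grammar an exponent alone, as in 1e10, is enough to make a lexeme a Float even without a decimal point.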

unexpectedCharacterMessage()

private unexpectedCharacterMessage(mixed $code) : mixed
Parameters
$code : mixed
Return values
mixed
